Sully.ai

87 posts

@sullyai

Autonomous OS for healthcare

San Francisco Bay Area · Joined December 2017

10 Following · 943 Followers

Pinned Tweet

Sully.ai@sullyai·17 Feb

Excited to partner deeply with @nvidia & @baseten

NVIDIA AI@NVIDIAAI

🩺 @sullyai has returned over 30 million minutes to physicians — more time with patients, less on paperwork. @baseten powers this with their optimized inference stack built using NVIDIA Blackwell, NVFP4, TensorRT LLM, and NVIDIA Dynamo, to run frontier open models like gpt oss 120b. The result: 10x cost reduction and 65% faster responses for workflows like clinical note generation. 🔗 Read the blog: nvda.ws/468smA3

3.8K

Sully.ai@sullyai·3d

Our @DrataHQ trust center 👉 trust.sully.ai

3.6K

Sully.ai@sullyai·3d

For folks claiming we're using @getdelve for compliance & our SOC 2, we moved to @DrataHQ almost 6 months ago. The link to our Drata trust center is in the comments.

57.1K

Sully.ai@sullyai·5d

We are all patients at some point.

Ahmed Omar.@omar_or_ahmed

Every healthcare AI co is building for clinicians. Nobody's building for patients. That's the $50B mistake.

2.2K

Sully.ai retweeted

Speechmatics@Speechmatics·12 Mar

@sullyai That's the world's first Arabic-English bilingual medical model in production. 📊 Powered by @nvidiahealth. speechmatics.com/company/articl…

574

Sully.ai@sullyai·14 Mar

Excited to partner deeply with @Speechmatics to launch the world's first Arabic-English speech-to-text model.

Speechmatics@Speechmatics

@sullyai tested the major STT providers on real MENA clinical audio. 👀 Code-switching, dialect-heavy consultations, the conditions generic models fail on. 🏥 Patrick Nguyen, Head of Engineering MENA: ours was the only one that hit the performance thresholds needed for clinical documentation at regional scale.

454

Sully.ai retweeted

Ahmed Omar.@omar_or_ahmed·14 Mar

Healthcare AI cos are solving the wrong problem. Clinicians don't need better diagnoses. They need time back. Here's the $100B insight:

160

19.4K

Sully.ai retweeted

Muratcan Koylan@koylanai·12 Mar

We benchmarked NVIDIA’s new Nemotron 3 Super in two modes, Thinking Off and High Thinking, across three medical evaluation sets: MedMCQA, MedCaseReasoning, and MedXpertQA.
Thinking Off outperformed High Thinking: 26.4% vs. 25.2% accuracy. The cost gap was much larger than the accuracy gap. High Thinking increased mean latency from 1.13s to 4.43s and mean completion length from 109 tokens to 1,089 tokens. In our setup, the higher-reasoning mode was much slower and more verbose, without improving aggregate results.
The benchmark-level split was more revealing than the overall average. On MedMCQA, accuracy dropped from 56.6% to 49.1% with High Thinking. On MedCaseReasoning, it also declined, from 24.4% to 20.2%. The only clear gain was on MedXpertQA, where High Thinking improved accuracy from 9.2% to 15.0%. That pattern fits the benchmark design: MedMCQA rewards concise answer selection on constrained multiple-choice questions, while MedXpertQA is harder and more reasoning-intensive, so extra inference budget appears to help more there than on exam-style MCQs.
Across the overlap set, High Thinking improved 166 questions but flipped 182 previously correct answers into incorrect ones, explaining the net regression. Many of these looked like classic overthinking on structured medical multiple-choice items: the non-thinking run selected the correct answer directly, while High Thinking often chose a plausible distractor after longer deliberation.
Our main takeaway: Nemotron Super’s High Thinking mode should not be treated as a universal default. In this experiment, it looked more like a specialized mode for harder expert synthesis than a general-purpose accuracy booster. For structured medical multiple-choice tasks, Thinking Off was both faster and more accurate. For harder expert-level reasoning tasks, especially those closer to MedXpertQA, additional reasoning showed some benefit. The practical implication is that the reasoning depth should likely be routed by task type rather than enabled globally.
We used the @baseten Model API for these runs, and we’re grateful for their support from day one. We’re also thankful to @NVIDIAAI for its commitment to open source. As a research team that transitioned fully to open-source models this year, we deeply appreciate this level of openness: weights, data, and recipes. We also expect this model to be especially strong for orchestration and agent-style tasks, which is an area we’re excited to explore further.
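
The routing idea in the post above can be sketched in a few lines. This is only an illustration built from the per-benchmark accuracies and the 166/182 flip counts quoted in the post; the names `REPORTED_ACCURACY`, `best_mode`, and `net_changed_answers` are hypothetical and do not come from the authors' code or from any Baseten or NVIDIA API.

```python
# Accuracy figures (percent) exactly as reported in the post above.
REPORTED_ACCURACY = {
    "MedMCQA": {"thinking_off": 56.6, "high_thinking": 49.1},
    "MedCaseReasoning": {"thinking_off": 24.4, "high_thinking": 20.2},
    "MedXpertQA": {"thinking_off": 9.2, "high_thinking": 15.0},
}


def best_mode(benchmark: str) -> str:
    """Return the mode with the higher reported accuracy for a benchmark."""
    scores = REPORTED_ACCURACY[benchmark]
    return max(scores, key=scores.get)


def net_changed_answers(improved: int, regressed: int) -> int:
    """Net change in correct answers when switching modes (negative = regression)."""
    return improved - regressed


# Hypothetical routing table: pick the reasoning mode per task type
# instead of enabling High Thinking globally.
routing = {name: best_mode(name) for name in REPORTED_ACCURACY}

# 166 questions improved and 182 flipped from correct to incorrect (from the post).
net = net_changed_answers(improved=166, regressed=182)

print(routing)
print(net)
```

With the reported numbers, only MedXpertQA routes to High Thinking, and the net change across the overlap set is negative, which matches the regression described in the post.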

Bryan Catanzaro@ctnzr

Announcing NVIDIA Nemotron 3 Super! 💚120B-12A Hybrid SSM Latent MoE, designed for Blackwell 💚36 on AAIndex v4 💚up to 2.2X faster than GPT-OSS-120B in FP4 💚Open data, open recipe, open weights Models, Tech report, etc. here: research.nvidia.com/labs/nemotron/… And yes, Ultra is coming!

3.7K

Sully.ai@sullyai·12 Mar

🎯

Ahmed Omar.@omar_or_ahmed

Healthcare AI has a 0B regulatory arbitrage window. Investors are missing it. Most AI healthcare cos are racing to FDA approval. That's the wrong strategy. Here's the hidden advantage: 🧵

247

Sully.ai retweeted

Ahmed Omar.@omar_or_ahmed·9 Mar

Healthcare AI will be bigger than legal AI, bigger than coding AI. But most investors still don't see it. Here's the math that changes everything 🧵

214

285.5K

Sully.ai@sullyai·9 Mar

💯

Ahmed Omar.@omar_or_ahmed

Healthcare AI will be bigger than legal AI, bigger than coding AI. But most investors still don't see it. Here's the math that changes everything 🧵

612

Sully.ai retweeted

Ahmed Omar.@omar_or_ahmed·4 Mar

Proud of the Sully team to get to this! We were late to the party, but now we’re #1! cc: @amitskumthekar, @koylanai and the research team

Ahmed Omar.@omar_or_ahmed

We made an early bet on a full team of autonomous agents. We're #1 on speed!
If you walk into any hospital today, you will see them using an average of 50-100 software tools, with some having over 800 SaaS subscriptions! Integrating one solution at a time into your health system is a nightmare. On top of that, getting all those AI tools to talk to each other is nearly impossible.
The biggest objection we get before showing people our demo is how good each agent can really be, how we manage entire suites of agents, and how we can claim to be better. But we don't like to talk; we show them what we built, and their jaws drop.
Why does that happen? Because we're driven by UX (user experience). If something is a better UX, our research and engineering team figures out how to do it. If physicians don't want to wait for something, they shouldn't have to; if the technology isn't there yet, we'll figure it out.
Thanks to the @sullyai team and our partners for making this happen!
PS. A lot of people ask us about accuracy at this speed. For us, this is clinical information, so quality is non-negotiable! Read our paper about our clinical accuracy here: arxiv.org/abs/2505.23075

2.3K

Sully.ai retweeted

Ahmed Omar.@omar_or_ahmed·4 Mar

@ns123abc wild. the talent wars in AI are just getting started. the best researchers are leaving big companies to either start their own thing or join vertical AI companies where they can actually ship to production. research for research's sake won't cut it anymore

5.6K

Sully.ai retweeted

Ahmed Omar.@omar_or_ahmed·2 Mar

💯

Muratcan Koylan@koylanai

Browser use agents have so many use cases but managing Slack is not one of them for me. MCPs are way faster and token-efficient.

632

Sully.ai retweeted

Ahmed Omar.@omar_or_ahmed·17 Feb

we built a SNOMED coding judge that actually works. old one scored bad agents 90% "optimal." good ones? also 90%. completely useless. new semantic judge: 70 point discrimination gap. r ≈ 0.99 with ground truth. here's how we did it 👇
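
The post does not say how the two judge metrics were computed. One plausible reading is sketched here: the "discrimination gap" is assumed to be the mean judge score for good agents minus the mean score for bad agents, and r is assumed to be the Pearson correlation between judge scores and ground-truth quality. All numbers are toy values chosen for illustration, not Sully.ai data, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def discrimination_gap(good_scores, bad_scores):
    """Assumed definition: mean score of good agents minus mean score of bad agents."""
    return mean(good_scores) - mean(bad_scores)


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy judge scores (0-100). A judge that rates everything ~90 cannot separate agents.
old_judge_good, old_judge_bad = [90, 91, 89], [90, 89, 91]
# A judge that separates the two groups shows a large gap.
new_judge_good, new_judge_bad = [92, 88, 90], [20, 18, 22]

# Toy ground-truth quality for the same six agents (good agents first, then bad).
ground_truth = [0.90, 0.85, 0.88, 0.20, 0.15, 0.22]

old_gap = discrimination_gap(old_judge_good, old_judge_bad)
new_gap = discrimination_gap(new_judge_good, new_judge_bad)
new_r = pearson_r(new_judge_good + new_judge_bad, ground_truth)

print(old_gap, new_gap, round(new_r, 3))
```

A gap near zero means the judge is uninformative no matter how high its scores are, which is the failure mode the post describes for the old judge.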

356

Sully.ai@sullyai·13 Feb

@omar_or_ahmed @nvidia and @baseten for this partnership!

Ahmed Omar.@omar_or_ahmed·13 Feb

Thanks @nvidia for the mention and the compute ⚡️

NVIDIA@nvidia

🧵 How @baseten, @DeepInfra, @FireworksAI_HQ, and @togethercompute are cutting AI Costs by up to 10x With Open Source Models on NVIDIA Blackwell Across industries, tokens power every AI interaction: 💉 Medical insights 🎮 Game character dialogue 🧙‍♂️ Autonomous customer support 🤖 Agentic chat

821

Sully.ai@sullyai·13 Feb

🤝 @nvidia

Ahmed Omar.@omar_or_ahmed

Thanks @nvidia for the mention and the compute ⚡️

608

Sully.ai retweeted

Ahmed Omar.@omar_or_ahmed·12 Feb

People think this is a Reddit shit post. I actually think this is a great presentation of our intense culture without advertising it.