Sameed Khan

314 posts

@sameedmed

M3.5 (Research Year) @CleClinicLCM. CS '21 @michiganstateu. Interested in artificial intelligence, health tech, and entrepreneurship. Always a student.

Cleveland, OH Joined December 2022

143 Following 70 Followers

Sameed Khan@sameedmed·2d

@MGalvosas No fault of the authors, to be fair - it’s a systemic incompatibility b/t innovation pace and speed of publication but this is a common flaw with these studies. Hard to derive any meaningful conclusions when the capabilities we’re discussing now are in a different realm

Sameed Khan@sameedmed·2d

@MGalvosas Skimmed abstract, scrolled straight to appendix - commit / API column is misleading; most of these are late 2024 release models. Only one that is slightly current is Qwen3-30B-A3B-Instruct-2507 (July 2025). This is a completely unrepresentative view of current AI capabilities.

Mindaugas Galvosas, MD@MGalvosas·3d

Largest stress tests on medical misinformation in LLMs. Most models still struggle to separate fact from fiction: thelancet.com/journals/landi…

Sameed Khan retweeted

Inna Vishik@InnaVishik·5 Mar

Scrap the journals and formal peer review. A better system could be a dynamic arXiv-like document (including a verification system similar to arXiv) where new versions are assigned new DOIs, coupled with a comment section. Comments get DOIs. Helpful comments are incorporated into the living manuscript as edits or citations. Substantive additions are given co-authorship in updated version. Substantive critiques or re-analysis of data by another group get cited because they have DOIs and become part of the record of knowledge in this topic. Comments can be anonymous, but if you want credit and H-index growth, it has to be under your name.

Ryan Briggs@ryancbriggs

It does seem like now is a very good time to think about what we want the “journal” system of the future to look like

Sameed Khan retweeted

Lisan al Gaib@scaling01·2 Mar

x.com/i/article/2028…

Sameed Khan@sameedmed·6 Feb

Wake up babe new Claude just dropped

Claude@claudeai

Introducing Claude Opus 4.6. Our smartest model got an upgrade. Opus 4.6 plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It’s also our first Opus-class model with 1M token context in beta.

Sameed Khan@sameedmed·18 Jan

This also greatly explains why I don’t see the 100x multiplier effect for science code but still more like a 5-10x speed up

Chase Saunders@MaineFrameworks

"Bottom-up programming as the root of LLM dev skepticism" klio.org/theory-of-llm-…

Sameed Khan@sameedmed·17 Jan

Deploying cognition at scale means you get to do much more ambitious projects but also journals can review many more submissions far more rigorously than we are currently able to.

Jay Bhattacharya@DrJBhattacharya

Update! My brilliant colleague and frequent coauthor, @MikkoPackalen writes with a different take about AI use in science and scholarship. His take is persuasive but contrary to mine. Perhaps he's right that I'm not fully appreciating the culture change that AI portends for science. Here's what he wrote to me: Non-Dinosaur NIH AI Policy: "AI is an important opportunity for advancing and accelerating science. Applicants are encouraged to use AI as they best see fit. NIH understands that AI is deeply integrated in the workflows of many researchers, and NIH does not want to discourage the use of AI in any way. Of course, every researcher continues to be responsible for every aspect of their grant application submission, whether developed and written with AI or not."

Sameed Khan retweeted

Ed Livingston@ehlJAMA·13 Jan

This is a big deal. Frequentist statistics have done a lot of damage to medical science. People rely on P values without really understanding what they mean. Many faulty conclusions have been made because of them. Once you understand Bayes methods, you’ll never rely on frequentist tests again. Bayes methods are completely aligned with clinical thinking and should be the standard methodology used for statistical analysis in clinical research.

Frank Harrell@f2harrell

This is a big step forward in improving the efficiency of clinical trials of drugs and biologics, and a big day for @US_FDA which I've been dreaming of for decades : fda.gov/news-events/pr… #bayes #RCT #clinicaltrial #pharma

Sameed Khan@sameedmed·13 Jan

Med school curricula are going to need a big update lol

Dr. Marty Makary@DrMakaryFDA

FDA is now open to Bayesian statistical approaches. A leap forward! Bayesian statistics can help: ✅ Clinical trial design ✅ Finding the optimal dose ✅ Extrapolation to children ✅ Leveraging phase 2 results in phase 3

Sameed Khan retweeted

Andrew Curran@AndrewCurran_·7 Jan

Utah has become the first state to allow AI to renew medical prescriptions with no doctor involved. The company, Doctronic, also secured a malpractice insurance policy for their AI. Their data also shows that their system matches doctors treatment plans 99.2% of the time.

Andrew Curran@AndrewCurran_

Quiet revolution taking place in healthcare. I use it as well. I can say from personal experience that the healthcare systems in Canada and the UK are suffering from crippling staffing shortages, as well as a crisis in competence. The cure for this will eventually be ChatMD.

Sameed Khan@sameedmed·7 Jan

Sample hundreds of arxiv papers, get 20 different approaches, think carefully to filter down to 10, spin them up and try them out on your problem, get 1-2 to run with. This would have been a weeks-long project and a publishable literature review in and of itself. Now it’s just your typical morning.

Kieran Klaassen@kieranklaassen

The unlock for me was realizing I could delegate the boring parts: planning research, security review, architecture checks. Sub-agents run in parallel. They report findings. I make decisions. That's it. That's the job now. And yeah, it feels great.

Sameed Khan@sameedmed·6 Jan

I arrived at a much inferior, hacky version of this, mainly by putting CLAUDE.md in subdirectories for project components and modularizing the repo, plus adding slash commands to automatically checkpoint and update context files after long-running tasks finish. Give the agent only the context it needs when it works on that thing; file system abstraction / call stack abstraction seems to make sense for context engineering.

Rohan Paul@rohanpaul_ai

The paper says the best way to manage AI context is to treat everything like a file system. Today, a model's knowledge sits in separate prompts, databases, tools, and logs, so context engineering pulls this into a coherent system. The paper proposes an agentic file system where every memory, tool, external source, and human note appears as a file in a shared space. A persistent context repository separates raw history, long term memory, and short lived scratchpads, so the model's prompt holds only the slice needed right now. Every access and transformation is logged with timestamps and provenance, giving a trail for how information, tools, and human feedback shaped an answer. Because large language models see only limited context each call and forget past ones, the architecture adds a constructor to shrink context, an updater to swap pieces, and an evaluator to check answers and update memory. All of this is implemented in the AIGNE framework, where agents remember past conversations and call services like GitHub through the same file style interface, turning scattered prompts into a reusable context layer. ---- Paper Link – arxiv. org/abs/2512.05470 Paper Title: "Everything is Context: Agentic File System Abstraction for Context Engineering"
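The per-directory CLAUDE.md idea in the post above can be sketched in a few lines. This is only an illustration of "give the agent only the context it needs": the `collect_context` helper and its root-to-leaf ordering are assumptions made for this example, not a description of how Claude Code or the AIGNE framework actually load context.

```python
from pathlib import Path

# File name taken from the post; the lookup logic below is an assumption.
CONTEXT_FILENAME = "CLAUDE.md"


def collect_context(repo_root: Path, target: Path) -> list[Path]:
    """Return context files on the path from repo_root down to target's directory."""
    repo_root = repo_root.resolve()
    directory = target.resolve()
    if not directory.is_dir():
        directory = directory.parent

    # Walk upward from the target's directory until the repo root is reached.
    chain = []
    current = directory
    while True:
        chain.append(current)
        if current == repo_root:
            break
        if current.parent == current:
            raise ValueError(f"{target} is not inside {repo_root}")
        current = current.parent

    # Reverse so general (root) context precedes component-specific context.
    return [
        d / CONTEXT_FILENAME
        for d in reversed(chain)
        if (d / CONTEXT_FILENAME).is_file()
    ]
```

Context files in sibling components are never returned, so an agent working on one component would not see another component's notes.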

Sameed Khan@sameedmed·6 Jan

Medicine -> AI def easier; if you’re in medicine you already know the constraints of that world re: regulatory, patient safety, data governance, etc. AI feels like an exciting frontier. The other way around you’re probably just thinking of the 20 other things you might be doing if you weren’t fighting with an IRB, getting stakeholders on board, etc. That being said, esp in radiology and cardiology there’s definitely high levels of interest and ppl are coming around. RSNA 2025 this year had a ton of presentations that could also have been at ACL, EMNLP, NeurIPS, etc. For other specialties I think this space is much farther removed.

will brown@willccbb·6 Jan

@iScienceLuvr find people into medicine and get them into ai, or vice versa not sure which is easier haha

Tanishq Mathew Abraham, Ph.D.@iScienceLuvr·6 Jan

how can we get more people interested in medical ai?

Sameed Khan retweeted

Geoffrey Litt@geoffreylitt·5 Jan

We need a shorthand way of saying: "An AI did the work, but I vouch for the result" Saying "I did it" feels slightly sketchy, but saying "Claude did it" feels like avoiding responsibility

Sameed Khan retweeted

Michelle Fang@MichelleFangCS·12 Dec

We just published a model that does automated data extraction from Cardiac MRI reports. Let AI help make chart reviewing easier and the data collection process faster. Thank you Drs. @DebbieKwonMD & Chen for your mentorship on this project! doi.org/10.1016/j.jocm…

Sameed Khan@sameedmed·5 Jan

Using Claude Code for data science and research has made me far more terminal-coded. Before, I stored data in RDS files and Parquet with polars; now I’m only using JSONL and jq. Python scripts >> notebooks, etc. Will be very interesting to see how the ecosystem adapts. Most interested to see where @marimo_io goes; @akshaykagrawal
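For readers who have not worked this way, here is a minimal sketch of the JSONL pattern mentioned above: one JSON object per line, streamed and filtered. The records, the `run` and `auc` fields, and the helper names are all hypothetical, invented for this illustration.

```python
import json
from typing import Callable, Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def select(records: Iterable[dict], keep: Callable[[dict], bool]) -> Iterator[dict]:
    """Lazily keep only the records for which keep(record) is true."""
    return (record for record in records if keep(record))


# Hypothetical experiment log, one JSON object per line.
SAMPLE = """\
{"run": "baseline", "auc": 0.71}
{"run": "augmented", "auc": 0.84}
{"run": "ensemble", "auc": 0.88}
"""

good_runs = [
    record["run"]
    for record in select(read_jsonl(SAMPLE.splitlines()), lambda r: r["auc"] > 0.8)
]
print(good_runs)  # ['augmented', 'ensemble']
```

Assuming the same records were saved in a file called `runs.jsonl`, the roughly equivalent jq one-liner would be `jq -r 'select(.auc > 0.8) | .run' runs.jsonl`.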

Sameed Khan retweeted

Jake@JustJake·4 Jan

Today I handed Claude a document that I've been growing for...years on building an orchestrator/distributed runtime that I had only purely theorized possible. One we've been working towards. It would have taken me probably months to code by hand. Building on 5 years of work and 10 years of experience. Claude wrote all the code in Golang in 4 hours. I'd always actually wanted it in Rust cause I thought it would be easier to express, so I threw it in a loop with a "Rewrite it in Rust and make it as succinct as possible" I went and ate a burrito. I came back and it was done. That's the world we live in now.

Sameed Khan@sameedmed·2 Jan

This is now possible, it really is; witnessed it myself. Not as simple as one-click-and-go, but a little nudging here and there and it’s a done deal.

caiden@pipelineabuser

someone stop me. seriously. i am going to CLONE your shtty enterprise backend in ONE AFTERNOON. then i am going to SCRAPE your entire customer list. then i am going to COLD EMAIL every single one of them offering the same product but BETTER and for like 80% LESS because i built it in a DAY with CLAUDE and MODAFINIL and ZERO VENTURE CAPITAL OVERHEAD. your entire engineering team? 47 people. me? ONE GUY who is VISIBLY UNWELL. your dev timeline? 18 months. mine? i started after breakfast and i'm already writing the sales copy. i WILL steal your customers. i WILL undercut your pricing. i WILL tweet about it the entire time. there is NO MOAT. there is NO DEFENSIBILITY. there is only ME and i am LOCKED IN and i have not slept properly in 3 days and that is YOUR PROBLEM NOW. your roadmap is my tuesday. your product is my template. your customers are my lead list. i cannot be stopped. i cannot be reasoned with. someone should genuinely intervene but they WON'T because this is SHIPPING CULTURE and we are SO BACK GLHF :>>>>>

Sameed Khan retweeted

ken@aquariusacquah·31 Dec

This thread is great and we can only expect this frontier to accelerate further. a few other tips you'll need to understand to stay ahead of the curve. with these + a few other best practices our 4 person engineering team regularly merges ~1M lines a week.
- Computer use has become exceptionally powerful, especially for modern frontend apps when agents can access source code. every surface of your product interface must be easily testable. You'll need scripts to put agents in every scenario accessible by your end users.
- any time any engineer opens an ide is an unacceptable devex failure. I maybe open mine once a week. your agents should be able to test every change end-to-end with high confidence across all important dimensions (UI, performance, resource use, etc)
- as code accelerates, slop compounds. meticulous systems level review of every change becomes the most important job of any serious IC. github's code review isn't nearly high level enough so you'll need to prompt agents to write detailed and referenced reports with every change ready for your review.
- Everything must be framework-ed to hell. Agents must be able to easily make meaningful changes to complex systems without guessing what code organization approach you prefer.
- for fellow compatriots in the b2b saas mines. hook your customer ticketing system directly to coding agents. build a 1 step triage agent that takes customer queries and formats them into a coding agent prompt.
to continue to win. the activation energy for your company to improve your end user product must round to 0

rahul@rahulgs

yes things are changing fast, but also I see companies (even faang) way behind the frontier for no reason. you are guaranteed to lose if you fall behind. the no unforced-errors ai leader playbook:
For your team:
- use coding agents. give all engineers their pick of harnesses, models, background agents: Claude code, Cursor, Devin, with closed/open models. Hearing Meta engineers are forced to use Llama 4. Opus 4.5 is the baseline now.
- give your agents tools to ALL dev tooling: Linear, GitHub, Datadog, Sentry, any Internal tooling. If agents are being held back because of lack of context that’s your fault.
- invest in your codebase specific agent docs. stop saying “doesn’t do X well”. If that’s an issue, try better prompting, agents.md, linting, and code rules. Tell it how you want things. Every manual edit you make is an opportunity for agent.md improvement
- invest in robust background agent infra - get a full development stack working on VM/sandboxes. yes it’s hard to set up but it will be worth it, your engineers can run multiple in parallel. Code review will be the bottleneck soon.
- figure out security issues. stop being risk averse and do what is needed to unblock access to tools.
in your product:
- always use the latest generation models in your features (move things off of last gen models asap, unless robust evals indicate otherwise). Requires changes every 1-2 weeks - eg: GitHub copilot mobile still offers code review with gpt 4.1 and Sonnet 3.5 @jaredpalmer. You are leaving money on the table by being on Sonnet 4, or gpt 4o
- Use embedding semantic search instead of fuzzy search. Any general embedding model will do better than Levenshtein / fuzzy heuristics.
- leave no form unfilled. use structured outputs and whatever context you have on the user to do a best-effort pre-fill
- allow unstructured inputs on all product surfaces - must accept freeform text and documents. Forms are dead.
- custom finetuning is dead. Stop wasting time on it. Frontier is moving too fast to invest 8 weeks into finetuning. Costs are dropping too quickly for price to matter. Better prompting will take you very far and this will only become more true as instruction following improves
- build evals to make quick model-upgrade decisions. they don’t need to be perfect but at least need to allow you to compare models relative to each other. most decisions become clear on a Pareto cost vs benchmark perf plot
- encourage all engineers to build with ai: build primitives to call models from all code bases / models: structured output, semantic similarity endpoints, sandbox code execution. etc
What else am I missing?
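The "Pareto cost vs benchmark perf" comparison mentioned in the playbook can be sketched as below. The model names, costs, and scores are made up purely for illustration, and `pareto_frontier` is a hypothetical helper, not part of any eval library.

```python
from typing import NamedTuple


class ModelEval(NamedTuple):
    name: str
    cost: float   # hypothetical unit, e.g. dollars per 1M tokens
    score: float  # benchmark score, higher is better


def pareto_frontier(evals: list[ModelEval]) -> list[ModelEval]:
    """Keep models that no cheaper-or-equal model matches or beats on score."""
    # Cheapest first; among equal costs, the best score first.
    ordered = sorted(evals, key=lambda e: (e.cost, -e.score))
    frontier = []
    best_score = float("-inf")
    for e in ordered:
        # A model earns its place only by beating every cheaper option.
        if e.score > best_score:
            frontier.append(e)
            best_score = e.score
    return frontier


# Made-up numbers purely for illustration.
candidates = [
    ModelEval("model-a", 1.0, 60.0),
    ModelEval("model-b", 2.0, 55.0),   # dominated by model-a
    ModelEval("model-c", 3.0, 80.0),
    ModelEval("model-d", 3.0, 70.0),   # dominated by model-c
    ModelEval("model-e", 10.0, 82.0),
]
print([e.name for e in pareto_frontier(candidates)])
# ['model-a', 'model-c', 'model-e']
```

Anything off the frontier costs at least as much as some alternative without scoring higher, which is why an upgrade decision often becomes obvious once the points are plotted.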
