Sameed Khan
@sameedmed
314 posts
M3.5 (Research Year) @CleClinicLCM. CS '21 @michiganstateu. Interested in artificial intelligence, health tech, and entrepreneurship. Always a student.

Cleveland, OH · Joined December 2022
143 Following · 70 Followers
Sameed Khan@sameedmed·
@MGalvosas No fault of the authors, to be fair - it’s a systemic incompatibility b/t innovation pace and speed of publication, but this is a common flaw with these studies. Hard to derive any meaningful conclusions when the capabilities we’re discussing now are in a different realm.
Sameed Khan@sameedmed·
@MGalvosas Skimmed the abstract, scrolled straight to the appendix - the commit / API column is misleading; most of these are late-2024 release models. The only one that is even slightly current is Qwen3-30B-A3B-Instruct-2507 (July 2025). This is a completely unrepresentative view of current AI capabilities.
Sameed Khan retweeted
Inna Vishik@InnaVishik·
Scrap the journals and formal peer review. A better system could be a dynamic arXiv-like document (including a verification system similar to arXiv) where new versions are assigned new DOIs, coupled with a comment section. Comments get DOIs. Helpful comments are incorporated into the living manuscript as edits or citations. Substantive additions are given co-authorship in updated version. Substantive critiques or re-analysis of data by another group get cited because they have DOIs and become part of the record of knowledge in this topic. Comments can be anonymous, but if you want credit and H-index growth, it has to be under your name.
Ryan Briggs@ryancbriggs

It does seem like now is a very good time to think about what we want the “journal” system of the future to look like

Sameed Khan@sameedmed·
Deploying cognition at scale means you get to do much more ambitious projects, but it also means journals can review many more submissions far more rigorously than we are currently able to.
Jay Bhattacharya@DrJBhattacharya

Update! My brilliant colleague and frequent coauthor, @MikkoPackalen writes with a different take about AI use in science and scholarship. His take is persuasive but contrary to mine. Perhaps he's right that I'm not fully appreciating the culture change that AI portends for science. Here's what he wrote to me: Non-Dinosaur NIH AI Policy: "AI is an important opportunity for advancing and accelerating science. Applicants are encouraged to use AI as they best see fit. NIH understands that AI is deeply integrated in the workflows of many researchers, and NIH does not want to discourage the use of AI in any way. Of course, every researcher continues to be responsible for every aspect of their grant application submission, whether developed and written with AI or not."

Sameed Khan retweeted
Ed Livingston@ehlJAMA·
This is a big deal. Frequentist statistics have done a lot of damage to medical science. People rely on P values without really understanding what they mean. Many faulty conclusions have been made because of them. Once you understand Bayes methods, you’ll never rely on frequentist tests again. Bayes methods are completely aligned with clinical thinking and should be the standard methodology used for statistical analysis in clinical research.
Frank Harrell@f2harrell

This is a big step forward in improving the efficiency of clinical trials of drugs and biologics, and a big day for @US_FDA which I've been dreaming of for decades: fda.gov/news-events/pr… #bayes #RCT #clinicaltrial #pharma

Sameed Khan retweeted
Andrew Curran@AndrewCurran_·
Utah has become the first state to allow AI to renew medical prescriptions with no doctor involved. The company, Doctronic, also secured a malpractice insurance policy for their AI. Their data shows that their system matches doctors' treatment plans 99.2% of the time.
Andrew Curran@AndrewCurran_

Quiet revolution taking place in healthcare. I use it as well. I can say from personal experience that the healthcare systems in Canada and the UK are suffering from crippling staffing shortages, as well as a crisis in competence. The cure for this will eventually be ChatMD.

Sameed Khan@sameedmed·
Sample hundreds of arXiv papers, get 20 different approaches, think thoughtfully to filter down to 10, spin them up and try them out on your problem, and get 1-2 to run with. This would be a weeks-long project and a publishable literature review in and of itself. Now it’s just your typical morning.
Kieran Klaassen@kieranklaassen

The unlock for me was realizing I could delegate the boring parts: planning research, security review, architecture checks. Sub-agents run in parallel. They report findings. I make decisions. That's it. That's the job now. And yeah, it feels great.

Sameed Khan@sameedmed·
I arrived at a much inferior, hacky version of this: mainly putting CLAUDE.md files in subdirectories for project components, modularizing the repo, and adding slash commands to automatically checkpoint and update context files after long-running tasks finish. Give the agent only the context it needs when it works on that thing; a file-system / call-stack abstraction seems to make sense for context engineering.
Rohan Paul@rohanpaul_ai

The paper says the best way to manage AI context is to treat everything like a file system. Today, a model's knowledge sits in separate prompts, databases, tools, and logs, so context engineering pulls this into a coherent system.

The paper proposes an agentic file system where every memory, tool, external source, and human note appears as a file in a shared space. A persistent context repository separates raw history, long-term memory, and short-lived scratchpads, so the model's prompt holds only the slice needed right now. Every access and transformation is logged with timestamps and provenance, giving a trail for how information, tools, and human feedback shaped an answer.

Because large language models see only limited context each call and forget past ones, the architecture adds a constructor to shrink context, an updater to swap pieces, and an evaluator to check answers and update memory. All of this is implemented in the AIGNE framework, where agents remember past conversations and call services like GitHub through the same file-style interface, turning scattered prompts into a reusable context layer.

Paper link: arxiv.org/abs/2512.05470
Paper title: "Everything is Context: Agentic File System Abstraction for Context Engineering"

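The file-as-context framing described above can be sketched in a few lines. This is a toy illustration only, not the paper's AIGNE implementation; the names (`ContextFS`, `put`, `slice`) and the path layout are invented here for the sketch.

```python
# Toy sketch: memories, tool outputs, and notes live in one flat "file system",
# every write is logged with provenance, and only a slice enters the prompt.
import time

class ContextFS:
    def __init__(self):
        self.files = {}   # path -> content
        self.log = []     # (timestamp, op, path, provenance)

    def put(self, path, content, provenance):
        """Write a 'file' and record who produced it (human, model, tool)."""
        self.files[path] = content
        self.log.append((time.time(), "write", path, provenance))

    def slice(self, prefix):
        """Constructor step: shrink context to only the files the current
        task needs (here, selected by path prefix)."""
        return {p: c for p, c in self.files.items() if p.startswith(prefix)}

ctx = ContextFS()
ctx.put("memory/long_term/style.md", "prefer concise answers", "human")
ctx.put("scratch/session-1/draft.txt", "partial answer", "model")
ctx.put("tools/github/issue-42.json", '{"state": "open"}', "tool")

# Only the slice relevant to the current call goes into the prompt.
prompt_context = ctx.slice("tools/github/")
print(sorted(prompt_context))   # ['tools/github/issue-42.json']
print(len(ctx.log))             # 3 logged writes, each with provenance
```

The point of the sketch is the separation the paper describes: a persistent store with provenance logging, plus a constructor that hands the model only the slice it needs per call.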
Sameed Khan@sameedmed·
Medicine -> AI is def easier; if you’re in medicine you already know the constraints of that world re: regulatory, patient safety, data governance, etc. AI feels like an exciting frontier. The other way around, you’re probably just thinking of the 20 other things you might be doing if you weren’t fighting with an IRB, getting stakeholders on board, etc. That being said, esp in radiology and cardiology there are definitely high levels of interest and ppl are coming around. RSNA 2025 this year had a ton of presentations that could also have been at ACL, EMNLP, NeurIPS, etc. For other specialties I think this space is much farther removed.
will brown@willccbb·
@iScienceLuvr find people into medicine and get them into ai, or vice versa not sure which is easier haha
Sameed Khan retweeted
Geoffrey Litt@geoffreylitt·
We need a shorthand way of saying: "An AI did the work, but I vouch for the result" Saying "I did it" feels slightly sketchy, but saying "Claude did it" feels like avoiding responsibility
Sameed Khan retweeted
Michelle Fang@MichelleFangCS·
We just published a model that does automated data extraction from Cardiac MRI reports. Let AI help make chart reviewing easier and the data collection process faster. Thank you Drs. @DebbieKwonMD & Chen for your mentorship on this project! doi.org/10.1016/j.jocm…
Sameed Khan@sameedmed·
Using Claude Code for data science and research has made me far more terminal-coded. Before, I stored data in RDS files and Parquet with polars; now I’m only using JSONL and jq. Python scripts >> notebooks, etc. Will be very interesting to see how the ecosystem adapts. Most interested to see where @marimo_io goes; @akshaykagrawal
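The JSONL-centric workflow mentioned above can be sketched with the standard library alone: stream records line by line, filter, and write results back out. The field names (`cohort`, `ef`) are hypothetical, and the jq equivalent of this filter would be roughly `jq -c 'select(.ef < 40)' records.jsonl`.

```python
# Minimal JSONL filter sketch: each line is one JSON record, so streaming
# tools (jq, grep, head) and plain Python loops compose naturally.
import json
import io

# Stand-in for a records.jsonl file on disk.
raw = io.StringIO(
    '{"id": 1, "cohort": "A", "ef": 55}\n'
    '{"id": 2, "cohort": "B", "ef": 35}\n'
)

out = io.StringIO()
for line in raw:
    rec = json.loads(line)
    if rec["ef"] < 40:               # keep reduced-EF records only
        out.write(json.dumps(rec) + "\n")

print(out.getvalue().strip())        # {"id": 2, "cohort": "B", "ef": 35}
```

Because every record is a self-contained line, the same file works unchanged in jq pipelines and in Python scripts, which is much of the appeal over binary formats for agent-driven workflows.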
Sameed Khan retweeted
Jake@JustJake·
Today I handed Claude a document that I've been growing for...years on building an orchestrator/distributed runtime that I had only purely theorized possible. One we've been working towards. It would have taken me probably months to code by hand. Building on 5 years of work and 10 years of experience. Claude wrote all the code in Golang in 4 hours. I'd always actually wanted it in Rust cause I thought it would be easier to express, so I threw it in a loop with a "Rewrite it in Rust and make it as succinct as possible" I went and ate a burrito. I came back and it was done. That's the world we live in now.
Sameed Khan retweeted
ken@aquariusacquah·
This thread is great and we can only expect this frontier to accelerate further. A few other tips you'll need to understand to stay ahead of the curve. With these + a few other best practices our 4 person engineering team regularly merges ~1M lines a week.
- Computer use has become exceptionally powerful, especially for modern frontend apps when agents can access source code. Every surface of your product interface must be easily testable. You'll need scripts to put agents in every scenario accessible by your end users.
- Any time any engineer opens an IDE is an unacceptable devex failure. I maybe open mine once a week. Your agents should be able to test every change end-to-end with high confidence across all important dimensions (UI, performance, resource use, etc).
- As code accelerates, slop compounds. Meticulous systems-level review of every change becomes the most important job of any serious IC. GitHub's code review isn't nearly high level enough, so you'll need to prompt agents to write detailed and referenced reports with every change ready for your review.
- Everything must be framework-ed to hell. Agents must be able to easily make meaningful changes to complex systems without guessing what code organization approach you prefer.
- For fellow compatriots in the b2b saas mines: hook your customer ticketing system directly to coding agents. Build a 1-step triage agent that takes customer queries and formats them into a coding agent prompt.
To continue to win, the activation energy for your company to improve your end user product must round to 0.
rahul@rahulgs

yes things are changing fast, but also I see companies (even faang) way behind the frontier for no reason. you are guaranteed to lose if you fall behind. the no unforced-errors ai leader playbook:

For your team:
- use coding agents. give all engineers their pick of harnesses, models, background agents: Claude Code, Cursor, Devin, with closed/open models. Hearing Meta engineers are forced to use Llama 4. Opus 4.5 is the baseline now.
- give your agents tools to ALL dev tooling: Linear, GitHub, Datadog, Sentry, any internal tooling. If agents are being held back because of lack of context that’s your fault.
- invest in your codebase-specific agent docs. stop saying “doesn’t do X well”. If that’s an issue, try better prompting, agents.md, linting, and code rules. Tell it how you want things. Every manual edit you make is an opportunity for agents.md improvement.
- invest in robust background agent infra - get a full development stack working on VMs/sandboxes. yes it’s hard to set up but it will be worth it; your engineers can run multiple in parallel. Code review will be the bottleneck soon.
- figure out security issues. stop being risk averse and do what is needed to unblock access to tools.

In your product:
- always use the latest generation models in your features (move things off of last-gen models asap, unless robust evals indicate otherwise). Requires changes every 1-2 weeks - eg: GitHub Copilot mobile still offers code review with gpt 4.1 and Sonnet 3.5 @jaredpalmer. You are leaving money on the table by being on Sonnet 4, or gpt 4o.
- use embedding semantic search instead of fuzzy search. Any general embedding model will do better than Levenshtein / fuzzy heuristics.
- leave no form unfilled. use structured outputs and whatever context you have on the user to do a best-effort pre-fill.
- allow unstructured inputs on all product surfaces - must accept freeform text and documents. Forms are dead.
- custom finetuning is dead. Stop wasting time on it. Frontier is moving too fast to invest 8 weeks into finetuning. Costs are dropping too quickly for price to matter. Better prompting will take you very far and this will only become more true as instruction following improves.
- build evals to make quick model-upgrade decisions. they don’t need to be perfect but at least need to allow you to compare models relative to each other. most decisions become clear on a Pareto cost vs benchmark perf plot.
- encourage all engineers to build with ai: build primitives to call models from all code bases / models: structured output, semantic similarity endpoints, sandbox code execution, etc.

What else am I missing?

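The "embedding search instead of fuzzy search" point above boils down to ranking by vector similarity rather than edit distance. A minimal sketch, with one loud assumption: a real system would call an embedding model, and the bag-of-words `embed` below is only a runnable stand-in so the example needs no external service.

```python
# Sketch: rank documents by cosine similarity of vectors instead of
# fuzzy string matching. embed() is a stand-in for a real embedding model.
import math
from collections import Counter

def embed(text):
    # Stand-in "embedding": token counts. Swap for a real model in practice.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = ["reset user password", "export billing report", "change account password"]
query = "how do I change my password"

qv = embed(query)
ranked = sorted(docs, key=lambda d: cosine(embed(d), qv), reverse=True)
print(ranked[0])   # change account password
```

Even this crude version surfaces the right document because it scores shared meaning-bearing tokens, where Levenshtein distance between the full query and document strings would mostly measure length differences; a real embedding model extends the same ranking loop to paraphrases with no shared words.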