Trong-Thang Pham

306 posts

@trongthangpham

PhD @UARK. Former: Bachelor @ HCMUS. Substack: https://t.co/XdvgeSlDu4 These are my personal opinions unless otherwise noted.

Vietnam · Joined December 2013

330 Following · 16 Followers

Trong-Thang Pham@trongthangpham·11 Mar

@jon_barron I agree. Given that an LLM is such a good "search engine" for code that really works, it is more desirable to test ideas as quickly as possible, instead of spending hours trying to cherry-pick and stalling new discovery.

Jon Barron@jon_barron·10 Mar

If I was a grad student today, I would: 1) Not write papers, 2) push my (agent-written) code to a public repo ~weekly, 3) maintain (via agents) a writeup.tex (manually verified) and a skill.md in the repo, and 4) work towards establishing skill usage as the new "citation" format.

Trong-Thang Pham@trongthangpham·11 Mar

@karpathy @maxbittker I thought your version was similar to the Ralph loop (the Bash one), so it would loop forever. Is that not the case here?

Andrej Karpathy@karpathy·11 Mar

sadly the agents do not want to loop forever. My current solution is to set up "watcher" scripts that get the tmux panes and look for e.g. "esc to interrupt", and send keys to whip if not present. Need an e.g.: /fullauto you must continue your research! (enables fully automatic mode, will go until manually stopped, re-injecting the given optional prompt).
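A minimal sketch of the "watcher" idea described above, assuming the agent runs in a tmux pane and shows "esc to interrupt" while it is busy. The pane target, the polling interval, and all helper names are hypothetical; the only real interfaces used are the tmux subcommands `capture-pane -p` and `send-keys`.

```python
import subprocess
import time
from typing import List

BUSY_MARKER = "esc to interrupt"  # text the post says to look for
NUDGE = "you must continue your research!"  # prompt quoted in the post


def is_idle(pane_text: str, marker: str = BUSY_MARKER) -> bool:
    """Return True when the busy marker is absent from the captured pane text."""
    return marker not in pane_text


def capture_cmd(target: str) -> List[str]:
    """Build the tmux command that prints a pane's visible contents to stdout."""
    return ["tmux", "capture-pane", "-p", "-t", target]


def nudge_cmds(target: str, prompt: str = NUDGE) -> List[List[str]]:
    """Build the tmux commands that type the prompt literally, then press Enter."""
    return [
        ["tmux", "send-keys", "-t", target, "-l", prompt],
        ["tmux", "send-keys", "-t", target, "Enter"],
    ]


def watch(target: str, interval_s: float = 30.0) -> None:
    """Poll one pane until interrupted; re-inject the prompt whenever it looks idle."""
    while True:
        result = subprocess.run(
            capture_cmd(target), capture_output=True, text=True, check=True
        )
        if is_idle(result.stdout):
            for cmd in nudge_cmds(target):
                subprocess.run(cmd, check=True)
        time.sleep(interval_s)
```

Calling `watch("agents:0.0")` (a hypothetical pane target) would keep nudging that pane until the script is stopped manually.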

max@maxbittker·10 Mar

From @karpathy's autoresearch .md

Trong-Thang Pham@trongthangpham·10 Mar

@psingh522 And they get 100k views. This is sad sometimes

Prabhav Singh@psingh522·9 Mar

I can't keep blocking accounts like this fast enough. Crappy AI generated sensationalized text (wrongly) explaining a 4 month old paper (I have nothing against the actual paper). Not to mention the numbers are wrong and I have seen this same post 5 times from different accounts

Saidul@saidul_dev

OpenAI just published a paper proving that ChatGPT will always hallucinate. Not sometimes. Not "until the next version." Always. They proved it mathematically. And three other top AI labs confirmed it independently. Here's what the research actually shows: Even with perfect training data and unlimited compute, LLMs will still fabricate answers with complete confidence. This isn't a bug in the code. It's fundamental to how these systems are built. The numbers are wild: → OpenAI's o1 model: 16% hallucination rate → Their o3 model: 33% → Their newest o4-mini: 48% Nearly half of what their latest model tells you could be invented. And it's getting worse as models get "smarter." Here's why this can't be fixed: Language models predict the next word based on probability. When they hit uncertainty, they don't pause. They don't flag it. They guess with total confidence. Because that's literally what they were trained to do. The researchers analyzed the 10 major AI benchmarks used to test these models. 9 out of 10 give the exact same score for saying "I don't know" as for getting it completely wrong: zero points. The entire testing system punishes honesty and rewards confident guessing. So the AI learned the optimal strategy: always answer. Never show doubt. Sound certain even when making it up. OpenAI's proposed solution? Train models to say "I don't know" when uncertain. The problem? Their own math shows this would leave roughly 30% of questions unanswered. Imagine getting "I'm not confident enough to respond" three times out of ten. Users would abandon the product overnight. The fix exists. But it kills usability. This isn't just OpenAI's problem. DeepMind and Tsinghua University reached identical conclusions working separately. Three elite AI labs. Independent research. Same result: this is permanent. Every time you get an answer from any LLM, you're not getting facts. 
You're getting the most statistically probable next words from a system that's been rewarded for never admitting when it's guessing. Is this real information, or just a confident hallucination? You can't know. And neither can the AI.

Trong-Thang Pham@trongthangpham·10 Mar

I find this similar to how I use the Ralph loop (the original Bash version by Geoffrey Huntley). But instead of program.md, I have a fixed prompt that tells the agent to read spec.md and implementation.md, pick a task, and complete it; the agent may also update implementation.md if it learns something new.

Andrej Karpathy@karpathy

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically nanochat LLM training core stripped down to a single-GPU, one file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)
The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc.
github.com/karpathy/autor…
Part code, part sci-fi, and a pinch of psychosis :)
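The Ralph-style loop mentioned in the reply above can be sketched roughly as follows. This is a sketch under assumptions: the prompt wording is invented for illustration, and `run_agent` is a hypothetical stand-in for whatever agent CLI is actually invoked. Only the file names spec.md and implementation.md come from the post.

```python
from typing import Callable, Optional

# Fixed prompt, re-sent unchanged on every iteration (wording is hypothetical).
PROMPT = (
    "Read spec.md and implementation.md. "
    "Pick one unfinished task and complete it. "
    "If you learn something new, update implementation.md."
)


def ralph_loop(
    run_agent: Callable[[str], int],
    max_iterations: Optional[int] = None,
) -> int:
    """Call the agent with the same prompt repeatedly; return the number of runs.

    run_agent stands in for launching an agent session and returns its exit code.
    max_iterations=None loops until interrupted, like an endless Bash while loop.
    """
    runs = 0
    while max_iterations is None or runs < max_iterations:
        # Each call is a fresh session; progress persists only in the files.
        # The exit code is ignored so one failed run does not stop the loop.
        run_agent(PROMPT)
        runs += 1
    return runs
```

Because every iteration starts from the same prompt, any memory between runs has to live in spec.md and implementation.md, which is why the agent is allowed to update the latter.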

Trong-Thang Pham@trongthangpham·20 Oct

If you're attending ICCV, I'd love to discuss our research on 3D volumetric scanpath modeling and its applications in medical imaging.

Trong-Thang Pham@trongthangpham·20 Oct

Excited to present our work at #ICCV2025! 🎉 I'll be presenting "CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling" this Thursday morning (10:45 AM - 12:00 PM) at Exhibit Hall I, Poster #2003 (ID 181 in the main program). #ComputerVision #MedicalImaging

Trong-Thang Pham@trongthangpham·17 Aug

Resources 📜 Paper: arxiv.org/html/2507.1259… 💻 Code: github.com/UARK-AICV/CTSc… 🤗 Dataset: huggingface.co/datasets/phamt… #ICCV2025 #ComputerVision #MedicalAI #GazePrediction #MachineLearning

Trong-Thang Pham@trongthangpham·17 Aug

Summary:
- First comprehensive dataset for gaze prediction on CT scan images with expert radiologist annotations
- Novel deep learning approach combining spatial attention mechanisms with medical imaging expertise

Trong-Thang Pham@trongthangpham·17 Aug

Excited to share that our paper "CTScanGaze: A Comprehensive Dataset and Method for CT Scan Gaze Prediction" has been accepted as an ICCV 2025 Highlight Poster! Our paper got maximum scores (6,6,6) from all reviewers!!!

Trong-Thang Pham retweeted

Jeremy Howard@jeremyphoward·7 Aug

The GPT 5 launch included a chart showing 52.8 as a bigger number than 69.1, which in turn is shown as the same magnitude as 30.8. Not quite ASI…

Trong-Thang Pham retweeted

Tim Sweeney@TimSweeneyEpic·5 Aug

@kiaran_ritchie We have two simulation models. One that does exactly what you tell it, repeatable. And another that can do anything with infinite variation and creativity but unpredictably. The magic will happen when the two are brought together.

Trong-Thang Pham@trongthangpham·5 Aug

This is a summary of my recent note on hallucination. Why AI “Hallucinates” and What It Means for Medicine open.substack.com/pub/trongthang…

Trong-Thang Pham retweeted

Voxel51@Voxel51·4 Jun

One of the biggest bottlenecks in deploying visual AI and computer vision is annotation, which can be both costly and time-consuming. Today, we’re introducing Verified Auto Labeling, a new approach to AI-assisted annotation that achieves up to 95% of human-level performance while cutting labeling costs by up to 100,000x and time by 5,000x. Read the full paper: arxiv.org/abs/2506.02359

Trong-Thang Pham@trongthangpham·9 Nov

@DamienTeney @Michael_J_Black My trick is to let my body relax. I find that it is easy to rush things when I read in an uncomfortable place or position. Lying down in bed and proofreading is pretty effective for me.

Damien Teney@DamienTeney·7 Nov

@Michael_J_Black Also true for papers, but easier said than done🤷 I find it really difficult (for students & myself) to ignore what you already know (since you've written the whole thing!). The only trick I know is to leave the manuscript aside for >1 week before reading it through. Other tips?

Michael Black@Michael_J_Black·7 Nov

Writing your Ph.D. thesis? I have one request. Read every single word from beginning to end before giving it to your advisor. Every word. Straight through. You will immediately find things that are out of order, don't hang together, and don't form a coherent story.

Trong-Thang Pham@trongthangpham·29 Jun

@sinenomine267 @karpathy I think your statement is a bit misleading here. Not everyone who disliked Caffe is moving back to writing CUDA. And the reason people didn't like Caffe is not only about writing C++.

Sine Nomine@sinenomine267·28 Jun

@karpathy It is a bit funny that originally there were C++ frameworks like Caffe and people didn’t like how difficult it was to write C++ code - and moved to PyTorch and Tensorflow. Now the opposite is happening.

Andrej Karpathy@karpathy·28 Jun

unet.cu Let's go!! 🚀 :)

Chen Lu@_Chen_Lu_

I wrote a UNet diffusion model in pure CUDA: github.com/clu0/unet.cu This project was inspired by @karpathy 's llm.c (github.com/karpathy/llm.c). I also learnt a lot about CUDA kernels from @Si_Boehm 's Matmul blog (siboehm.com/articles/22/CU…). (1/3)

Trong-Thang Pham@trongthangpham·16 Feb

@jbhuang0604 lol

Jia-Bin Huang@jbhuang0604·16 Feb

R2: While the results are impressive, this is a simple combination of diffusion transformer (ICCV 2023) and latent diffusion model (CVPR 2022). Limited novelty. Weak reject.

OpenAI@OpenAI

Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. openai.com/sora Prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”

Trong-Thang Pham@trongthangpham·14 Feb

@CSProfKGD Best decision ever!

Kosta Derpanis (sabbatical in Munich 🇩🇪)@CSProfKGD·14 Feb

Andrej Karpathy@karpathy

@darshilistired I started the next one two days ago!

Trong-Thang Pham@trongthangpham·9 Feb

@ccanonne_ Including references?

Trong-Thang Pham retweeted

Bo Wang@BoWang87·23 Jan

Yes, join us in the #CVPR2024 challenge: Segment Anything in Medical Images on Laptop! See the link: codabench.org/competitions/1… We welcome your contribution to pushing medical foundation models into real-world clinics!

Yuyin Zhou@yuyinzhou_cs

Wonderful work by @BoWang87 @JunMa_11 and the team! Join us in the #CVPR2024 Segment Anything in Medical Images on Laptop challenge (codabench.org/competitions/1…)! Looking forward to seeing innovations in building lightweight medical foundation models!

Trong-Thang Pham retweeted

Jon Barron@jon_barron·24 Jan

I've been applying ML to "classic" 3D/imaging vision problems for 15 years, so for 14 years I've been rebutting reviewers who don't like seeing _statistics_ (ew! scary!) applied to classic problems. Here's a snippet from a decade-old rebuttal that, regrettably, still holds up.
