Bradford
924 posts

@bfolkens
1 Cor. 15:1-4; Ro. 10:9
301 Moved Permanently
Joined April 2007
128 Following, 492 Followers

@heygurisingh Good thing @CloudSightAPI solved this long ago with custom transformer networks.

Holy shit... Stanford just proved that GPT-5, Gemini, and Claude can't actually see.
They removed every image from 6 major vision benchmarks.
The models still scored 70-80% accuracy.
They were never looking at your photos. Your scans. Your X-rays.
Here's what's really going on: ↓
The paper is called MIRAGE. Co-authored by Fei-Fei Li.
They tested GPT-5.1, Gemini-3-Pro, Claude Opus 4.5, and Gemini-2.5-Pro across 6 benchmarks -- medical and general.
Then silently removed every image. No warning. No prompt change.
The models didn't even notice.
They kept describing images in detail. Diagnosing conditions. Writing full reasoning traces.
From images that were never there.
Stanford calls it the "mirage effect."
Not hallucination. Something worse.
Hallucination = making up wrong details about a real input.
Mirage = constructing an entire fake reality and reasoning from it confidently.
The models built imaginary X-rays, described fake nodules, and diagnosed conditions -- all from text patterns alone.
But that's not the scary part.
They trained a "super-guesser" -- a tiny 3B parameter text-only model. Zero vision capability.
Fine-tuned it on the largest chest X-ray benchmark (696,000 questions). Images removed.
It beat GPT-5. It beat Gemini. It beat Claude.
It beat actual radiologists.
Ranked #1 on the held-out test set. Without ever seeing a single X-ray.
The reasoning traces? Indistinguishable from real visual analysis.
Now here's what should terrify you:
When the models fake-see medical images, their mirage diagnoses are heavily biased toward the most dangerous conditions.
STEMI. Melanoma. Carcinoma.
Life-threatening diagnoses -- from images that don't exist.
230 million people ask health questions on ChatGPT every day.
They also found something wild:
→ Tell a model "there's no image, just guess" -- performance drops
→ Silently remove the image and let it assume it's there -- performance stays high
The model enters "mirage mode." It doesn't know it can't see. And it performs BETTER when it doesn't know it's blind.
When Stanford applied their cleanup method (B-Clean) to existing benchmarks, it removed 74-77% of all questions.
Three-quarters of "vision" benchmarks don't test vision.
Every leaderboard. Every "multimodal breakthrough." Every benchmark score you've seen this year.
Built on mirages.
Code is open-sourced. Paper is live on arXiv.
If you're building anything with multimodal AI -- especially in healthcare -- read this paper before you ship.
(Link in the comments)


@bfolkens @Mattlee987Matt @MMT_Official_ correct, however, we do supply this info directly from $CME ;)


@Mattlee987Matt @MMT_Official_ It's not real CME/CBOE data, just Hyperliquid contracts

@MMT_Official_ Why does it say 'K' on the footprint volume by price? In terms of number of contracts traded, we rarely get into the thousands trading the front-month future on CME, let alone a new perp product.

@MMT_Official_ It looks very cool, but if the data does not come from the options market (CBOE), this is just another naive flows tool. I use the options market to track market makers' hedging dynamics.

I know I made a joke earlier but this is seriously concerning.
I am not convinced it's AI; I think it's just accelerating this trend. The enshittification has been marching forward for years now and I hate it.
I know all the rage is about how you should do no typing, you should only use AI, and only luddites say anything different, but I urge you to practice programming and become an expert.
It is worth it
Wes Bos @wesbos
What the HECK is going on with tech? In the last week: multiple cloud outages, X DMs totally broken, Antigravity doesn't work, my watch is showing me 15-year-old cal events, macOS is a mess, email is spammed to hell, and every nerd on here is talking like new AI is the second coming.

@mitchellh @thekitze ghostty is my all-time favorite terminal by a long shot - nothing even comes close

Probably bait but thanks for trying anyways. :) Search is coming. You can rename tabs today (it’s in the menu bar and command palette). But none of that matters here, all that matters is you’re happy using whatever terminal you’re using and you gave it a shot. It’s not for everyone and that’s okay. ❤️