Skeptical Economist, prof. Collegium Mesozoicum

30.8K posts

@postecon

Fill your heart with love today Don't play the game of time Things that happened in the past Only happened in your mind Oh, forget your mind And you'll be free

London, England · Joined July 2013
1.3K Following · 4.2K Followers
Skeptical Economist, prof. Collegium Mesozoicum
BOOM! For some time now, people have been writing tweets like this. One sentence, one line. A blank line between lines is mandatory. This has fairly obvious, but fundamentally revolutionary, consequences. It pisses me off to no end. Content creators have to take this into account.
Guri Singh @heygurisingh

Holy shit... Stanford just proved that GPT-5, Gemini, and Claude can't actually see. They removed every image from 6 major vision benchmarks. The models still scored 70-80% accuracy. They were never looking at your photos. Your scans. Your X-rays. Here's what's really going on: ↓
The paper is called MIRAGE. Co-authored by Fei-Fei Li. They tested GPT-5.1, Gemini-3-Pro, Claude Opus 4.5, and Gemini-2.5-Pro across 6 benchmarks -- medical and general. Then silently removed every image. No warning. No prompt change. The models didn't even notice. They kept describing images in detail. Diagnosing conditions. Writing full reasoning traces. From images that were never there.
Stanford calls it the "mirage effect." Not hallucination. Something worse. Hallucination = making up wrong details about a real input. Mirage = constructing an entire fake reality and reasoning from it confidently. The models built imaginary X-rays, described fake nodules, and diagnosed conditions -- all from text patterns alone.
But that's not the scary part. They trained a "super-guesser" -- a tiny 3B parameter text-only model. Zero vision capability. Fine-tuned it on the largest chest X-ray benchmark (696,000 questions). Images removed. It beat GPT-5. It beat Gemini. It beat Claude. It beat actual radiologists. Ranked #1 on the held-out test set. Without ever seeing a single X-ray. The reasoning traces? Indistinguishable from real visual analysis.
Now here's what should terrify you: When the models fake-see medical images, their mirage diagnoses are heavily biased toward the most dangerous conditions. STEMI. Melanoma. Carcinoma. Life-threatening diagnoses -- from images that don't exist. 230 million people ask health questions on ChatGPT every day.
They also found something wild:
→ Tell a model "there's no image, just guess" -- performance drops
→ Silently remove the image and let it assume it's there -- performance stays high
The model enters "mirage mode." It doesn't know it can't see. And it performs BETTER when it doesn't know it's blind.
When Stanford applied their cleanup method (B-Clean) to existing benchmarks, it removed 74-77% of all questions. Three-quarters of "vision" benchmarks don't test vision. Every leaderboard. Every "multimodal breakthrough." Every benchmark score you've seen this year. Built on mirages.
Code is open-sourced. Paper is live on arXiv. If you're building anything with multimodal AI -- especially in healthcare -- read this paper before you ship. (Link in the comments)

Skeptical Economist, prof. Collegium Mesozoicum reposted
DiscussingFilm @DiscussingFilm
The Olaf animatronic at Disney Adventure World has had its first public malfunction. (Source: magictourclub/TikTok)
Skeptical Economist, prof. Collegium Mesozoicum reposted
Mikołaj Murczkiewicz @murczkiewiczyzm
Why are dogs in Poland so commonly called 'Hodge'? I even saw a woman with two dogs in the park. They ran off in different directions, she called out this common name "Hodge" and the first one came to her. Then she called out "Hodge 2" and the second one came back. Ok fair enough if the name is really popular and great for our four legged friends use it I don't mind that much but why not give the poor second dog a different name like Fluffy or Rex or something? Must get confusing like in families where the son is named after the dad or something.
Veronica, Collagen Scientist @celestialbe1ng

My favourite thing about Poland is that you don’t address strangers as “you.” You say Pan, Pani, Państwo (Mr/Mrs/+this plural I can’t translate): formal address is built into the grammar. Even in a shop, you’d say “Czy Państwo mają…” not “do you have…” and it isn’t performative politeness but actually structural respect. There is no casual “you” for someone you haven’t been invited to be familiar with. When I do this, people often rush to correct me or rather announce familiarity. “Oh, don’t call me Madame, call me Catherine.” And I’ll still address them formally until they give me clear permission to stop or until I decide I’m familiar and done with the formal. Pure elegance. The kind that assumes every stranger deserves dignity before they’ve earned familiarity. The West abolished formality for uhhh friendliness. Poland kept it bc respect.

Skeptical Economist, prof. Collegium Mesozoicum reposted
Soumaya Keynes @SoumayaKeynes
found something rather baffling when researching my column this week… I wanted to see if there was any evidence that AI tools were helping economists to make their research more readable. So I analysed the text of NBER working paper abstracts…
Skeptical Economist, prof. Collegium Mesozoicum
Iran holds no cards and insists on unrealistic scenarios, egged on by instigators from Moscow and Beijing. The humane solution is to make peace with the USA and Israel, demilitarize and federalize the state, and accept security guarantees from the world powers.
Skeptical Economist, prof. Collegium Mesozoicum
Unfortunately, continuing the war suits the Tehran regime, because it justifies the regime's existence. What is more, the war consolidates and advances the interests of Iran's corrupt elites, who keep finding new ways to embezzle capital from abroad.
Skeptical Economist, prof. Collegium Mesozoicum
Iran should make peace and end the bloodshed in this senseless conflict. It is clear that it stands no chance in a confrontation with the USA, and prolonging the war is economically ruining other countries, especially Europe.
Skeptical Economist, prof. Collegium Mesozoicum reposted
(nie)rowny gosc @souljazyczu
“AI won't take your job”
Skeptical Economist, prof. Collegium Mesozoicum reposted
Jesús Fernández-Villaverde @JesusFerna7026
One of the more frustrating trends in public life over the past decade is how people who lead failing institutions blame social media for their failures.
A university president whose faculty have become political activists instead of educators and whose administrators multiply like rabbits will tell you that “misinformation on social networks” is eroding public trust in higher education. An editor whose publication lost its readership will claim that the real problem is X, rather than consider that the publication became boring, that the writing was uniformly uninspired, and that it stopped covering anything that mattered. A politician who loses an election will blame Meta algorithms rather than admit that voters simply did not like what was offered. A central banker whose institution missed the worst inflation in 40 years will worry publicly about TikTok videos spreading financial illiteracy.
The pattern is always the same. The institution fails at its core mission. The public notices. The people in charge, rather than examining what went wrong, identify an external force that is “polarizing.” The diagnosis is never “we did a poor job.” It is never “we lost our audience because we gave them nothing worth reading.” The diagnosis is always “bad actors are distorting the conversation.”
This is not new, of course. Before social media, talk radio was the scapegoat. Before talk radio, it was television. Before television, it was tabloid newspapers. Every generation of leaders has found a communication technology to blame for people’s loss of trust in them.
What is new is the intensity and the shamelessness. Over the last few years, “social media” has become a universal excuse that requires no evidence and tolerates no scrutiny. It is deployed reflexively. The people who make this argument never seem to ask the obvious question: why are people on social networks so receptive to criticism of your institution in the first place?
If your university were delivering excellent education at a reasonable price, no number of tweets would persuade parents otherwise. If your publication were covering important questions with clarity and substance, readers would not have migrated elsewhere. If the work you showcased were serious rather than trivial, people would still be paying attention.
Trust is not destroyed by social media. Trust is destroyed by poor performance, and social media makes it harder to hide. That is a different thing entirely, and the people running these institutions know it, which is what makes the excuse so cynical.
The honest version of the argument would be: “We used to be able to fail quietly because there was no mechanism for people to compare notes. Now there is, and we do not like it.” That, at least, would have the virtue of being true.