🌿Pad Krapao Fella🍳 🇫🇷🇺🇦🇪🇺🇹🇭
23.4K posts

@Zolt51
Donate to help Ukraine: https://t.co/pmryHzYckj
Langley, Virginia · Joined September 2009
3.9K Following · 3.7K Followers

@muellerberndt This comment thread should be fun. The problem with advanced theoretical physics is that it's utterly impossible for the layman to tell serious theorizing from wild drug-induced ravings.

Our small team has worked hard on refining the novel observer-based fundamental theory of Physics, OPH, over the past 3 months. Still a way to go, but we already derive most of existing Physics and parts of the particle spectrum.
medium.com/@muellerberndt/observers-are-all-you-need-how-observer-synchronization-creates-all-of-physics-8ebb7e9783e7

@sukh_saroy It really shouldn't surprise anyone that generalist models perform poorly on specialized tasks and that a specialist will beat them at these.

Holy shit... Stanford just proved that GPT-5, Gemini, and Claude can't actually see.
They removed every image from 6 major vision benchmarks.
The models still scored 70-80% accuracy.
They were never looking at your photos. Your scans. Your X-rays.
Here's what's really going on: ↓
The paper is called MIRAGE. Co-authored by Fei-Fei Li.
They tested GPT-5.1, Gemini-3-Pro, Claude Opus 4.5, and Gemini-2.5-Pro across 6 benchmarks -- medical and general.
Then silently removed every image. No warning. No prompt change.
The models didn't even notice.
They kept describing images in detail. Diagnosing conditions. Writing full reasoning traces.
From images that were never there.
Stanford calls it the "mirage effect."
Not hallucination. Something worse.
Hallucination = making up wrong details about a real input.
Mirage = constructing an entire fake reality and reasoning from it confidently.
The models built imaginary X-rays, described fake nodules, and diagnosed conditions -- all from text patterns alone.
But that's not the scary part.
They trained a "super-guesser" -- a tiny 3B parameter text-only model. Zero vision capability.
Fine-tuned it on the largest chest X-ray benchmark (696,000 questions). Images removed.
It beat GPT-5. It beat Gemini. It beat Claude.
It beat actual radiologists.
Ranked #1 on the held-out test set. Without ever seeing a single X-ray.
The reasoning traces? Indistinguishable from real visual analysis.
Now here's what should terrify you:
When the models fake-see medical images, their mirage diagnoses are heavily biased toward the most dangerous conditions.
STEMI. Melanoma. Carcinoma.
Life-threatening diagnoses -- from images that don't exist.
230 million people ask health questions on ChatGPT every day.
They also found something wild:
→ Tell a model "there's no image, just guess" -- performance drops
→ Silently remove the image and let it assume it's there -- performance stays high
The model enters "mirage mode." It doesn't know it can't see. And it performs BETTER when it doesn't know it's blind.
When Stanford applied their cleanup method (B-Clean) to existing benchmarks, it removed 74-77% of all questions.
Three-quarters of "vision" benchmarks don't test vision.
Every leaderboard. Every "multimodal breakthrough." Every benchmark score you've seen this year.
Built on mirages.
Code is open-sourced. Paper is live on arXiv.
If you're building anything with multimodal AI -- especially in healthcare -- read this paper before you ship.
(Link in the comments)


🚨BREAKING: OpenAI published a paper proving that ChatGPT will always make things up.
Not sometimes. Not until the next update. Always. They proved it with math.
Even with perfect training data and unlimited computing power, AI models will still confidently tell you things that are completely false. This isn't a bug they're working on. It's baked into how these systems work at a fundamental level.
And their own numbers are brutal. OpenAI's o1 reasoning model hallucinates 16% of the time. Their newer o3 model? 33%. Their newest o4-mini? 48%. Nearly half of what their most recent model tells you could be fabricated. The "smarter" models are actually getting worse at telling the truth.
Here's why it can't be fixed. Language models work by predicting the next word based on probability. When they hit something uncertain, they don't pause. They don't flag it. They guess. And they guess with complete confidence, because that's exactly what they were trained to do.
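The "guess with confidence" behavior described above can be illustrated with a toy next-token step. This is a minimal sketch with made-up numbers, not the internals of any real model: the point is that even when the probability distribution over candidate tokens is nearly flat (high uncertainty), decoding still commits to a single token with no uncertainty flag.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution over tokens."""
    m = max(logits.values())  # subtract max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Nearly flat logits: the model is genuinely unsure which year is correct.
# (Illustrative values only.)
logits = {"1947": 1.10, "1948": 1.05, "1949": 1.00}
probs = softmax(logits)

# Decoding still picks one token and emits it as if it were certain.
choice = max(probs, key=probs.get)
print(choice, round(probs[choice], 2))  # commits to "1947" at only ~35% confidence
```

Nothing in this loop pauses or flags the near-tie; the uncertainty is visible in the probabilities but invisible in the output text.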
The researchers looked at the 10 biggest AI benchmarks used to measure how good these models are. 9 out of 10 give the same score for saying "I don't know" as for giving a completely wrong answer: zero points. The entire testing system literally punishes honesty and rewards guessing.
So the AI learned the optimal strategy: always guess. Never admit uncertainty. Sound confident even when you're making it up.
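The incentive problem is simple enough to write down. Under a benchmark that gives zero points for "I don't know" and zero points for a wrong answer, guessing weakly dominates abstaining whenever the model has any chance of being right. A toy expected-score calculation (illustrative numbers, not from the paper):

```python
def expected_score(p_correct, abstain):
    """0/1 benchmark scoring: 1 point if correct, 0 if wrong or abstained."""
    return 0.0 if abstain else p_correct

p = 0.25  # model is only 25% sure of its answer
print(expected_score(p, abstain=True))   # 0.0  -> honesty earns nothing
print(expected_score(p, abstain=False))  # 0.25 -> guessing always scores >= abstaining
```

A model optimized against this metric learns exactly what the thread says: never abstain, always guess.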
OpenAI's proposed fix? Have ChatGPT say "I don't know" when it's unsure. Their own math shows this would mean roughly 30% of your questions get no answer. Imagine asking ChatGPT something three times out of ten and getting "I'm not confident enough to respond." Users would leave overnight. So the fix exists, but it would kill the product.
This isn't just OpenAI's problem. DeepMind and Tsinghua University independently reached the same conclusion. Three of the world's top AI labs, working separately, all agree: this is permanent.
Every time ChatGPT gives you an answer, ask yourself: is this real, or is it just a confident guess?


@MontayBayBay @Osinttechnical When the error is intentional everything is possible

@IAPonomarenko I don't know but if it's Witkoff saying it I can be sure he is lying

@bkkdude Literal first-world problems. Every developed country is struggling with the same thing.

Thailand is headed for the fastest population collapse in the world. That soft power will fade to oblivion 🇹🇭
andyd (@andyd10)

@asimonlee1 @bkkdude Hahahaa don't tell them their cultures are similar!

@NXT4EU Arc Raiders earned a place on that board too


@astraiaintel I will NEVER AGAIN underestimate the stupidity of American voters. At this point I'm being generous by calling it stupidity and not outright evil.

@Lyndonx Russia has spent at least $2 trillion and over a million lives on this stupid war.
Where has the money gone? Into the fact that Ukraine is still holding.

@P_Kirstukas I'm French and we get 70% of our energy from nuclear power plants. That's fine as long as you use safe designs (i.e., not Soviet RBMKs). Even Ukraine modernized their reactors.
All things considered, I think Lithuania is still a LOT better off for joining the EU and NATO

@steve_hanke Don't reverse guilt. NATO members turned on Ukraine by funding russia's war. Ukraine is only defending itself.

@LeadingReport The thing is, the whole pronouns thing was NEVER a major Democrat cause. When did you ever hear Biden or Obama talk about the importance of pronouns? There are still people getting abused or killed for their sexual orientation or identity, that's way more important.

The long-term repercussions of Prime Minister Keir Starmer's actions over the last couple of days will be that the US will never trust the UK again.
The security and intelligence alliance between the US and the UK was already terribly damaged.
But denying the use of Diego Garcia, which allows planes to fly direct to Iran without entering another country's airspace, and informing Mauritius of America's operational plans beforehand, has taken the relationship to the woodshed.


@AdrianP_doc What are you talking about? Everybody is talking about it. Outside of China that is.

@itswpceo We have a long way to go. All the more reason to start now.

BREAKING:
🇪🇺 Europe is planning to replace US tech companies with domestic ones, per a new report.
The EU wants to reduce its dependence on America.
Big Tech Companies in the World by Market Capitalization
- 🇺🇸 NVIDIA
- 🇺🇸 Alphabet (Google)
- 🇺🇸 Apple
- 🇺🇸 Microsoft
- 🇺🇸 Amazon
- 🇺🇸 Meta Platforms
- 🇺🇸 Broadcom
- 🇹🇼 TSMC
- 🇺🇸 Tesla
- 🇰🇷 Samsung Electronics
- 🇺🇸 Oracle
- 🇨🇳 Tencent

