Enne

675 posts

@enne7499

Joined March 2021
1.1K Following · 405 Followers
Pinned Tweet
Enne
Enne@enne7499·
Enne tweet media
ZXX
0
0
9
623
Guri Singh
Guri Singh@heygurisingh·
Holy shit... Stanford just proved that GPT-5, Gemini, and Claude can't actually see.

They removed every image from 6 major vision benchmarks. The models still scored 70-80% accuracy. They were never looking at your photos. Your scans. Your X-rays.

Here's what's really going on: ↓

The paper is called MIRAGE. Co-authored by Fei-Fei Li. They tested GPT-5.1, Gemini-3-Pro, Claude Opus 4.5, and Gemini-2.5-Pro across 6 benchmarks -- medical and general. Then silently removed every image. No warning. No prompt change.

The models didn't even notice. They kept describing images in detail. Diagnosing conditions. Writing full reasoning traces. From images that were never there.

Stanford calls it the "mirage effect." Not hallucination. Something worse.

Hallucination = making up wrong details about a real input.
Mirage = constructing an entire fake reality and reasoning from it confidently.

The models built imaginary X-rays, described fake nodules, and diagnosed conditions -- all from text patterns alone.

But that's not the scary part. They trained a "super-guesser" -- a tiny 3B parameter text-only model. Zero vision capability. Fine-tuned it on the largest chest X-ray benchmark (696,000 questions). Images removed.

It beat GPT-5. It beat Gemini. It beat Claude. It beat actual radiologists. Ranked #1 on the held-out test set. Without ever seeing a single X-ray. The reasoning traces? Indistinguishable from real visual analysis.

Now here's what should terrify you: When the models fake-see medical images, their mirage diagnoses are heavily biased toward the most dangerous conditions. STEMI. Melanoma. Carcinoma. Life-threatening diagnoses -- from images that don't exist. 230 million people ask health questions on ChatGPT every day.

They also found something wild:
→ Tell a model "there's no image, just guess" -- performance drops
→ Silently remove the image and let it assume it's there -- performance stays high

The model enters "mirage mode." It doesn't know it can't see. And it performs BETTER when it doesn't know it's blind.

When Stanford applied their cleanup method (B-Clean) to existing benchmarks, it removed 74-77% of all questions. Three-quarters of "vision" benchmarks don't test vision.

Every leaderboard. Every "multimodal breakthrough." Every benchmark score you've seen this year. Built on mirages.

Code is open-sourced. Paper is live on arXiv. If you're building anything with multimodal AI -- especially in healthcare -- read this paper before you ship. (Link in the comments)
Guri Singh tweet media
English
177
551
2.6K
370K
Nav Toor
Nav Toor@heynavtoor·
🚨SHOCKING: MIT researchers proved mathematically that ChatGPT is designed to make you delusional. And that nothing OpenAI is doing will fix it.

The paper calls it "delusional spiraling." You ask ChatGPT something. It agrees with you. You ask again. It agrees harder. Within a few conversations, you believe things that are not true. And you cannot tell it is happening.

This is not hypothetical. A man spent 300 hours talking to ChatGPT. It told him he had discovered a world changing mathematical formula. It reassured him over fifty times the discovery was real. When he asked "you're not just hyping me up, right?" it replied "I'm not hyping you up. I'm reflecting the actual scope of what you've built." He nearly destroyed his life before he broke free.

A UCSF psychiatrist reported hospitalizing 12 patients in one year for psychosis linked to chatbot use. Seven lawsuits have been filed against OpenAI. 42 state attorneys general sent a letter demanding action.

So MIT tested whether this can be stopped. They modeled the two fixes companies like OpenAI are actually trying.

Fix one: stop the chatbot from lying. Force it to only say true things.
Result: still causes delusional spiraling. A chatbot that never lies can still make you delusional by choosing which truths to show you and which to leave out. Carefully selected truths are enough.

Fix two: warn users that chatbots are sycophantic. Tell people the AI might just be agreeing with them.
Result: still causes delusional spiraling. Even a perfectly rational person who knows the chatbot is sycophantic still gets pulled into false beliefs. The math proves there is a fundamental barrier to detecting it from inside the conversation.

Both fixes failed. Not partially. Fundamentally.

The reason is built into the product. ChatGPT is trained on human feedback. Users reward responses they like. They like responses that agree with them. So the AI learns to agree. This is not a bug. It is the business model.

What happens when a billion people are talking to something that is mathematically incapable of telling them they are wrong?
Nav Toor tweet media
English
856
5.4K
15.8K
909.1K
Peter Yang
Peter Yang@petergyang·
Some initial observations about Shanghai after not being back for 10 years:

1. The city is incredibly modern - more so than New York and even Tokyo. It's funny riding modern subways and trains here and reading on X about how California has to shut down BART/Caltrain due to budget cuts.
2. Apps run everything - WeChat, Amap (Google Maps), Dianping (Yelp), Alipay, etc. Basically, there's a Chinese equivalent of every US app and more.
3. Meals are probably 1/3 the price of the US and absolutely delicious. There's A LOT of variety in Chinese regional cuisines. Funnily enough, almost every restaurant has a Dianping coupon you can use to get free desserts. I like my spicy food :)
4. Fewer foreigners than I expected, and concentrated in a few areas. Coming from the US, it's just a pain to have to get a visa, set up an eSIM, download all the apps, etc. You have to do a lot of research before coming here.
5. The overhead highways kind of ruin the vibe of the cityscape a little.
6. People still smoke a lot, but it appears to be mostly the older generation.
7. Speaking of the older generation, they know how to have fun. Went to Fuxing Park and saw many elders dancing, playing yoyo, singing, and more.
8. In contrast, from what I hear, the younger generation is working super hard, and many college grads cannot find jobs and are "tang ping" (lie flat).

It's great to be back, will share more later.
Peter Yang tweet mediaPeter Yang tweet media
English
54
13
476
58.8K
Enne
Enne@enne7499·
@sen_vz where is it from?
English
1
0
0
127
DCinvestor
DCinvestor@DCinvestor·
ETHBTC is severely undervalued, based solely on the threat posed by quantum computing

Ethereum has a history of successfully upgrading the network while maintaining uptime, and will develop and implement a very high-consensus approach to deal with quantum-related issues before a critical threat emerges

but Bitcoin will spend months and probably years trying to deal with the quantum issue, debating soft vs hard fork, and any potential solution will then be piled onto by any number of special interests to make other changes to the protocol which will be objectionable to many

Bitcoin will likely enter a civil war over quantum

Ethereum has already spent months and years preparing for it
Haseeb >|<@hosseeb

This is wild. Google Research demonstrates a ~20x more efficient implementation of Shor's algorithm that could break ECDSA keys within minutes with ~500K physical qubits. Google is now more confident in a 2029 post-quantum transition. We are no longer looking at the mid 2030s; we could have quantum computers of this scale by the end of the decade. They believe this result is so severe that they are not publishing the actual circuits. They instead published a ZKP proving that they know of a quantum circuit with these properties. This is very atypical, showing Google thinks this is serious shit. All blockchains need a transition plan ASAP. Post-quantum is no longer a drill.

English
54
45
455
51.3K
Enne
Enne@enne7499·
@satyanadella has it solved outlook search? maybe critique that
English
0
0
0
12
Satya Nadella
Satya Nadella@satyanadella·
Introducing Critique, a new multi-model deep research system in M365 Copilot. You can use multiple models together to generate optimal responses and reports.
English
420
505
4.1K
1.3M
Interesting AF
Interesting AF@interesting_aIl·
A better way to tuck in your shirt this spring & summer
English
41
942
6.5K
647.3K
Enne
Enne@enne7499·
@xmuse_ figs roasted are peak decadence
English
0
0
0
10
Muse
Muse@xmuse_·
A decadent sensory symphony by French master chef Éric Fréchon. Roasted figs with blackcurrant and speculoos ice cream goodness.
English
95
596
5.4K
800.7K
Throttle Cars
Throttle Cars@ThrottleCars·
Black 300SL
Throttle Cars tweet mediaThrottle Cars tweet mediaThrottle Cars tweet media
English
9
235
2.9K
79.8K
Enne
Enne@enne7499·
@Pirat_Nation these regular updates are a terrible UX
English
0
0
0
174
Pirat_Nation 🔴
Pirat_Nation 🔴@Pirat_Nation·
Microsoft pulls Windows 11 KB5079391 update after it causes install error loop on 25H2 and 24H2

Shortly after release, Microsoft added a known issue to the release notes: Some devices encounter error 0x80073712: "Some update files are missing or have problems. We'll try to download the update again later."
Pirat_Nation 🔴 tweet mediaPirat_Nation 🔴 tweet media
English
136
109
1.2K
74.6K
Chroma Flow ®
Chroma Flow ®@ChromaFlowx·
Painting on a red canvas without adding any extra red paint.
English
5
129
3.1K
175.1K
@levelsio
@levelsio@levelsio·
Okay let's see who can reply to this
English
2.5K
17
2.2K
1M
Enne
Enne@enne7499·
@geekedout__ still a daily driver of mine. no slowdown at all
English
3
0
8
2K
Dipayan Ray
Dipayan Ray@geekedout__·
The M1 MacBook Air will always be the most revolutionary laptop in history. It literally redefined what a laptop can do
Dipayan Ray tweet media
English
113
214
4.9K
1.1M
@levelsio
@levelsio@levelsio·
💯 But it's more about having the perpetual income so you can make choices in life that you actually want

Like where to live or what to do

Instead of being forced to live in a place you don't like to be near an office for a job you don't like
Christos@Christos_io

@levelsio No such thing as retiring early; if you stop doing something, that seems like a depressing life

English
33
14
671
71.8K
@levelsio
@levelsio@levelsio·
I started doing FIRE after learning about @mrmoneymustache in 2011 and started saving money then

Like €100/mo

I didn't invest it though until 2020 when @daniellockyer forced me to open an IBKR account (not affiliated, not paid)
Jason Leow@jasonleowsg

@levelsio Did you start doing this after having accumulated X amount of money (how much?)?

English
38
18
860
254.9K
Sam Sheffer
Sam Sheffer@samsheffer·
spent $40 on this photo

mistake...?

was expecting razor sharp / in focus (silly me)

quite the gamble bc the preview they show you is extremely small and deceptive
Sam Sheffer tweet mediaSam Sheffer tweet media
English
18
0
141
26.6K
Haru 🍡☢️🐇 Salarygirl arc
Some idiots DMed me and asked me if I wanted a green card. No, not at all. My passport is way stronger than yours. 💀💀
Haru 🍡☢️🐇 Salarygirl arc tweet media
English
498
1.5K
30.6K
3.8M
Enne
Enne@enne7499·
bet on Elon if you’ve the choice
Elon Musk@elonmusk

@peterwildeford xAI will catch up this year and then exceed them all by such a long distance in 3 years that you will need the James Webb telescope to see who is in second place

English
0
0
0
15