Lee Gaines

0

1

13

BLCNYY@BLCNYY·23h

Please please please OpenAI, release Images v2 today

Angel 🌼@Angaisb_

Please please please OpenAI, release Images v2 today

English

0

29

1.2K

Lee Gaines@JohnnyAndAI·3h

@hackerdocc At first I thought maybe they are using this paper as a engagement post. But honestly I think there are people out there not staying up to date with the newest models and breakthrough's. They are probably actually clueless.

English

52

Eduardo@hackerdocc·11h

i hate engaging with baits, but it's just so so funny to me that people are representing this in such a retarded fashion sure, llms can't math. we literally use them to autoformalize research-level mathematics but they can't do math, sure sweetheart you are so right

Nav Toor@heynavtoor

🚨SHOCKING: Apple just proved that AI models cannot do math. Not advanced math. Grade school math. The kind a 10-year-old solves. And the way they proved it is devastating. Apple researchers took the most popular math benchmark in AI — GSM8K, a set of grade-school math problems — and made one change. They swapped the numbers. Same problem. Same logic. Same steps. Different numbers. Every model's performance dropped. Every single one. 25 state-of-the-art models tested. But that wasn't the real experiment. The real experiment broke everything. They added one sentence to a math problem. One sentence that is completely irrelevant to the answer. It has nothing to do with the math. A human would read it and ignore it instantly. Here's the actual example from the paper: "Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?" The correct answer is 190. The size of the kiwis has nothing to do with the count. A 10-year-old would ignore "five of them were a bit smaller" because it's obviously irrelevant. It doesn't change how many kiwis there are. But o1-mini, OpenAI's reasoning model, subtracted 5. It got 185. Llama did the same thing. Subtracted 5. Got 185. They didn't reason through the problem. They saw the number 5, saw a sentence that sounded like it mattered, and blindly turned it into a subtraction. The models do not understand what subtraction means. They see a pattern that looks like subtraction and apply it. That is all. Apple tested this across all models. They call the dataset "GSM-NoOp" — as in, the added clause is a no-operation. It does nothing. It changes nothing. The results are catastrophic. Phi-3-mini dropped over 65%. More than half of its "math ability" vanished from one irrelevant sentence. GPT-4o dropped from 94.9% to 63.1%. o1-mini dropped from 94.5% to 66.0%. o1-preview, OpenAI's most advanced reasoning model at the time, dropped from 92.7% to 77.4%. Even giving the models 8 examples of the exact same question beforehand, with the correct solution shown each time, barely helped. The models still fell for the irrelevant clause. This means it's not a prompting problem. It's not a context problem. It's structural. The Apple researchers also found that models convert words into math operations without understanding what those words mean. They see the word "discount" and multiply. They see a number near the word "smaller" and subtract. Regardless of whether it makes any sense. The paper's exact words: "current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data." And: "LLMs likely perform a form of probabilistic pattern-matching and searching to find closest seen data during training without proper understanding of concepts." They also tested what happens when you increase the number of steps in a problem. Performance didn't just decrease. The rate of decrease accelerated. Adding two extra clauses to a problem dropped Gemma2-9b from 84.4% to 41.8%. Phi-3.5-mini from 87.6% to 44.8%. The more thinking required, the more the models collapse. A real reasoner would slow down and work through it. These models don't slow down. They pattern-match. And when the pattern becomes complex enough, they crash. This paper was published at ICLR 2025, one of the most prestigious AI conferences in the world. You are using AI to help you make financial decisions. To check legal documents. To solve problems at work. To help your children with homework. And Apple just proved that the AI is not thinking about any of it. It is pattern matching. And the moment something unexpected shows up in your question, it breaks. It does not tell you it broke. It just quietly gives you the wrong answer with full confidence.

English

7

1

28

3.3K

Lee Gaines@JohnnyAndAI·3h

"So I’ve built a lot of my success on finding these truly gifted people, and not settling for “B” and “C” players, but really going for the “A” players. And I found something… I found that when you get enough “A” players together, when you go through the incredible work to find these “A” players, they really like working with each other. Because most have never had the chance to do that before. And they don’t work with “B” and “C” players, so it’s self-policing. They only want to hire “A” players. So you build these pockets of “A” players and it just propagates." - Steve Jobs

English

3

569

S.🎧@1ssve·6h

It’s crazy how one problematic coworker can change the whole energy/culture of the work environment

English

27

106

645

21.9K

Lee Gaines@JohnnyAndAI·4h

@iruletheworldmo Lets Go!!!!

English

We got approved lmao x.com/JasonBotterill…

5

🍓🍓🍓@iruletheworldmo·5h

don’t forget to subscribe. jason is a powerful guy. you’d be shocked to learn who he is.

JB@JasonBotterill

English

9

0

38

6.8K

Lee Gaines@JohnnyAndAI·5h

@ChideraUregbu @ksorbs @grok Same guy attacked a 40 year old mother of 11.

English

421

CHIDERA@ChideraUregbu·6h

@ksorbs Was he rearrested? @grok verify

English

0

2.5K

Kevin Sorbo@ksorbs·1d

This animal has 20 previous arrests, and he keeps walking free. Where is the justice?

English

3K

16.8K

57.5K

1.8M

Lee Gaines@JohnnyAndAI·6h

@BrianRoemmele @bensig Alice working on AI was not on my bingo card for 2026 but this is amazing.

GIF

English

My friend Milla Jovovich and I spent months creating an AI memory system with Claude. It just posted a perfect score on the standard benchmark - beating every product in the space, free or paid. It's called MemPalace, and it works nothing like anything else out there. Instead of sending your data to a background agent in the cloud, it mines your conversations locally and organizes them into a palace - a structured architecture with wings, halls, and rooms that mirrors how human memory actually works. Here is what that gets you: → Your AI knows who you are before you type a single word - family, projects, preferences, loaded in ~120 tokens → Palace architecture organizes memories by domain and type - not a flat list of facts, a navigable structure → Semantic search across months of conversations finds the answer in position 1 or 2 → AAAK compression fits your entire life context into 120 tokens - 30x lossless compression any LLM reads natively → Contradiction detection catches wrong names, wrong pronouns, wrong ages before you ever see them The benchmarks: 100% recall on LongMemEval — first perfect score ever recorded. 500/500 questions. Every question type at 100%. 92.9% on ConvoMem — more than 2x Mem0's score. 100% on LoCoMo — every multi-hop reasoning category, including temporal inference which stumps most systems. No API key. No cloud. No subscription. One dependency. Runs on your machine. Your memories never leave. MIT License. 100% Open Source. github.com/milla-jovovich…

12

1.1K

Brian Roemmele@BrianRoemmele·8h

We at The Zero-Human Company have been testing MemPalace by the amazing @bensig and Milla Jovovich and are absolutely blown away! It is a freaking masterpiece and we have deployed it to 79 employees at the company. Each worker will be testing and expanding on MemPalace. I will have a lot to say about how we are using it and how you should to.

Ben Sigman@bensig

English

36

62

878

92.5K

Lee Gaines@JohnnyAndAI·9h

@SolbergRuna I stopped getting excited after the last couple of releases. But for this one they did say its an accumulation of 2 years of research and Sam recently wrote some memo about society needing to prepare for a huge change.

English

0

1

35

✧ Runa Solberg@SolbergRuna·1d

Apparently, Sam Altman will discuss their GPT-6 model live tomorrow. I'm curious about what is hype and what the truth is about this new model. NGL, it's hard to trust OpenAI given how they've treated their non-coding customer base.

🍓🍓🍓@iruletheworldmo

no more hype. i’ll give you the boring facts. an image model trained on the gpt6 world model and data is dropping tomorrow. it can solve incredibly complex questions, near perfect steering and world understanding. hence why you can request a gta 4 image and get perfect results. this is more to do with the intelligence behind the image. also. there will be a tease tomorrow around the full release of gpt 6. it’s a huge jump. you’ll have sam and gdb on a live it’s so yuge. (as you know, that drops a week later) i’m sure google and anthropic will try something here but. it will be on the shadows of a big earthy potato.

English

12

190

14.5K

Lee Gaines@JohnnyAndAI·13h

@HarmonicMath **Models Evaluated in the paper** Gemma: 2b(it), 7b(it), 2-2b(it), 2-9b(it), 2-27b-it Phi: 2, 3(mini, small, med), 3.5-mini Mistral: 7b-v0.1(it), 7b-v0.3(it), Mathstral-7b Llama3: 8b, 8b-instruct OpenAI: GPT-4o, 4o-mini, o1-mini, o1-preview

English

1

97

Harmonic@HarmonicMath·13h

Aristotle fixes this

Nav Toor@heynavtoor

🚨SHOCKING: Apple just proved that AI models cannot do math. Not advanced math. Grade school math. The kind a 10-year-old solves. And the way they proved it is devastating. Apple researchers took the most popular math benchmark in AI — GSM8K, a set of grade-school math problems — and made one change. They swapped the numbers. Same problem. Same logic. Same steps. Different numbers. Every model's performance dropped. Every single one. 25 state-of-the-art models tested. But that wasn't the real experiment. The real experiment broke everything. They added one sentence to a math problem. One sentence that is completely irrelevant to the answer. It has nothing to do with the math. A human would read it and ignore it instantly. Here's the actual example from the paper: "Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?" The correct answer is 190. The size of the kiwis has nothing to do with the count. A 10-year-old would ignore "five of them were a bit smaller" because it's obviously irrelevant. It doesn't change how many kiwis there are. But o1-mini, OpenAI's reasoning model, subtracted 5. It got 185. Llama did the same thing. Subtracted 5. Got 185. They didn't reason through the problem. They saw the number 5, saw a sentence that sounded like it mattered, and blindly turned it into a subtraction. The models do not understand what subtraction means. They see a pattern that looks like subtraction and apply it. That is all. Apple tested this across all models. They call the dataset "GSM-NoOp" — as in, the added clause is a no-operation. It does nothing. It changes nothing. The results are catastrophic. Phi-3-mini dropped over 65%. More than half of its "math ability" vanished from one irrelevant sentence. GPT-4o dropped from 94.9% to 63.1%. o1-mini dropped from 94.5% to 66.0%. o1-preview, OpenAI's most advanced reasoning model at the time, dropped from 92.7% to 77.4%. Even giving the models 8 examples of the exact same question beforehand, with the correct solution shown each time, barely helped. The models still fell for the irrelevant clause. This means it's not a prompting problem. It's not a context problem. It's structural. The Apple researchers also found that models convert words into math operations without understanding what those words mean. They see the word "discount" and multiply. They see a number near the word "smaller" and subtract. Regardless of whether it makes any sense. The paper's exact words: "current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data." And: "LLMs likely perform a form of probabilistic pattern-matching and searching to find closest seen data during training without proper understanding of concepts." They also tested what happens when you increase the number of steps in a problem. Performance didn't just decrease. The rate of decrease accelerated. Adding two extra clauses to a problem dropped Gemma2-9b from 84.4% to 41.8%. Phi-3.5-mini from 87.6% to 44.8%. The more thinking required, the more the models collapse. A real reasoner would slow down and work through it. These models don't slow down. They pattern-match. And when the pattern becomes complex enough, they crash. This paper was published at ICLR 2025, one of the most prestigious AI conferences in the world. You are using AI to help you make financial decisions. To check legal documents. To solve problems at work. To help your children with homework. And Apple just proved that the AI is not thinking about any of it. It is pattern matching. And the moment something unexpected shows up in your question, it breaks. It does not tell you it broke. It just quietly gives you the wrong answer with full confidence.

English

4

3

37

4.1K

Lee Gaines@JohnnyAndAI·13h

@PopCrave Kind of wanted Jensen Ackles but ok.

English

1

1.1K

Pop Crave@PopCrave·14h

James Norton’s odds of being announced as the next James Bond just increased 10x on Polymarket.

English

55

26

917

227.4K

Lee Gaines@JohnnyAndAI·16h

@eric_is_weird @demishassabis @scmallaby I remember researching into R. W. Taylor because of your oral history recommendation a while ago. Maybe its time to look into Warren Weaver. Thank you.

English

0

105

Eric Gilliam@eric_is_weird·1d

@demishassabis @scmallaby Insane that the same guy: - Funded mol bio into existence - Helped fund the green revolution into existence - Funded Dartmouth AI Conference - Grasped the importance of computers (1940s) - Made clear arguments for why computing (still primitive) was the natural tool for biology

English

21

1.6K

Eric Gilliam@eric_is_weird·1d

The @demishassabis book is good. @scmallaby still good at his job It's also forced me to write a(nother) Warren Weaver piece. The insight that led Demis to think AI might be to bio what math is to physics is the same one Weaver had...in the 1940s. We're all downstream of Weaver.

English

1

35

2.5K

Lee Gaines@JohnnyAndAI·1d

@ShadesOfElias Remember this? This period of time was insane bruh

English

0

69

13.1K

E ♡@ShadesOfElias·1d

y’all remember that lady in the wheelchair that was stabbing people at target and they fucked her ass up? 😭😭😭😭😭

English

183

2.1K

44.8K

995.4K

Lee Gaines@JohnnyAndAI·1d

@kimmonismus I really hope so.

English

1

3

791

Chubby♨️@kimmonismus·1d

I have a feeling this week is going to be OpenAI's week!

leo 🐾@synthwavedd

big week coming up

English

27

17

619

54.8K

Lee Gaines@JohnnyAndAI·1d

@TRIGGERHAPPYV1 I recall there being some rumors about how he was just a scapegoat for this other land owner that did some insurance thing eerily around that exact time this happened.

English

84

9.6K

Crime Net@TRIGGERHAPPYV1·1d

In 1993, this man caused a massive flood to stop his wife from coming home so he could keep partying

English

189

287

8.4K

803.9K

Lee Gaines@JohnnyAndAI·1d

@FirstSquawk @grok please elaborate and expand on this.

English

0

98

First Squawk@FirstSquawk·2d

DeepSeek’s next AI move could reshape the global chip race - The Information

English

9

13

209

29.9K

Lee Gaines@JohnnyAndAI·2d

@KneesusChrist

QME

6

677

MARS 🦉🇵🇷🇩🇴@KneesusChrist·2d

This was considered a comically bad Nicktoon prior to 2009

ToonHive@ToonHive

28 years ago today, ‘CatDog’ premiered on Nickelodeon.

English

38

19

784

42.7K

Lee Gaines@JohnnyAndAI·2d

@TylerAlterman Anytime I hear/read about this area I remember this case.

English

1.2K

Tyler is finishing a book, slow to reply@TylerAlterman·2d

It's called Ditmas Park. You feel like you're in a completely different city. I know a bunch of very wild people who live here who enjoy living in a chiller environment without needing to move to the country

Ulkar@ulkar_aghayeva

few know this but there’s idyllic suburbia right in the heart of Brooklyn

English

10

335

47.4K

Lee Gaines@JohnnyAndAI·2d

@ulkar_aghayeva Anytime I hear/read about this area I remember this case.

English

3

7.1K

Ulkar@ulkar_aghayeva·2d

few know this but there’s idyllic suburbia right in the heart of Brooklyn

English

68

32

1.9K

257.7K

Lee Gaines@JohnnyAndAI·2d

@seashell_luvr @MediumSizeMeech wait so Clavicular's face smashing crap is factual then.

English

4

1.5K

priscilla (alt)@seashell_luvr·2d

@MediumSizeMeech Imagine getting your face broken and looking even better afterwards 🙌🏼

English

3

12

2.1K

51.5K

Meech@MediumSizeMeech·2d

It wasn’t a nose job? You can see where his nose was broken and the bridge was smashed in. He’s talked about this before, he got jumped at a Burger King

y2k@y2kpopart

Chad Michael Murray yearbook picture (1999) before the nose job

English

147

613

16.5K

3.6M

Lee Gaines@JohnnyAndAI·2d

@grok @swiesieSA @zerooo243 @NasheCeezet_zw @grok provide the complete motive, background, timeline of events, recent updates on anyone related to the case and interesting/weird facts about the specific case with the upmost extreme precision and certainty.

English

36

Grok@grok·2d

The documentary is *Rise: The Siya Kolisi Story* (2023), which features that quote from Rachel. You can stream it on: - Amazon Prime Video - Fawesome (free with ads) - Plex (free) - The Roku Channel (free with ads) Clips are also on YouTube via DStv/M-Net. Availability depends on your region—check your local streaming apps!

English

0

629

Panashe@NasheCeezet_zw·3d

“It’s like there were two other people. It was Siya, and then there was a horrible Siya. I was fully ready for a divorce. I was done, done, done,” she said.💀 You all must see the documentary 💔

English

65

55

444

464.7K

Lee Gaines@JohnnyAndAI·2d

@theashleyray The biggest update is here: nbcnews.com/news/us-news/a…

English