Alexandre Luneau

283 posts

@Alex_Luneau

Professional gambler. Co-founder and CEO @MoonIntell.

London · Joined July 2009

270 Following · 5.2K Followers

Alexandre Luneau retweeted

Epoch AI@EpochAIResearch·23 Mar

AI has solved one of the problems in FrontierMath: Open Problems, our benchmark of real research problems that mathematicians have tried and failed to solve. See thread for more.

229

1.3K

474.7K

Alexandre Luneau retweeted

Oriol Vinyals@OriolVinyalsML·18 Nov

The secret behind Gemini 3? Simple: Improving pre-training & post-training 🤯 Pre-training: Contra the popular belief that scaling is over—which we discussed in our NeurIPS '25 talk with @ilyasut and @quocleix—the team delivered a drastic jump. The delta between 2.5 and 3.0 is as big as we've ever seen. No walls in sight! Post-training: Still a total greenfield. There's lots of room for algorithmic progress and improvement, and 3.0 hasn't been an exception, thanks to our stellar team. Congratulations to the whole team 💙💙💙

120

544

4.4K

Alexandre Luneau@Alex_Luneau·14 Aug

@emollick GPT-5 Pro is so much better than the other heavyweight pro models on my hard ML tasks; it feels at least a generation ahead.

769

Ethan Mollick@emollick·14 Aug

The pro models (GPT-5 Pro, Gemini 2.5 Deep Think, Grok 4 Heavy) can be impressive in ways that are hard to see. They take a lot of time to answer questions & are built for very hard problems that require expert evaluation. That is a narrow, but also very valuable, problem space.

778

71.4K

Alexandre Luneau@Alex_Luneau·12 Aug

This is how it's done

1.3K

Alexandre Luneau@Alex_Luneau·11 Aug

@gfodor @zoink Same here. I miss o1 pro's roughness, telling me how horrible some of my code looked.

gfodor.id@gfodor·11 Aug

@zoink Yeah it seems to throw a few words of positivity at the start of the response. I get “awesome project.” a bunch

136

gfodor.id@gfodor·10 Aug

Verdict: 5-pro is smarter therefore treats me dumber

gfodor.id@gfodor

All I need to know is if 5-pro is smarter than o3-pro thx

6.6K

Alexandre Luneau@Alex_Luneau·4 Aug

@francoisfleuret Been there many times, blown away every time!

632

François Fleuret@francoisfleuret·4 Aug

Nothing shows better the magic of deep model + gradient descent than a "causal leak", when you make the tiniest mistake in the causal structure of your model and information about the stuff to predict is accessible. 1/2

200

17.7K

Alexandre Luneau retweeted

Noam Brown@polynoamial·19 Jul

Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline 🧵

Alexander Wei@alexwei_

1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

142

511

4.7K

1.1M

Alexandre Luneau retweeted

Tesla Optimus@Tesla_Optimus·30 May

Mars

1.1K

2.7K

25.6K

1.3M

Alexandre Luneau retweeted

Ole Lehmann@itsolelehmann·21 Jan

I'm from Berlin. Afghanistan gets better tech than Europeans now. It's not a joke. It's the result of 30 years of suffocating regulation. And now, the EU's new AI Act is about to make it 10x worse. Here's the tragic story of how the EU is killing our tech future 🧵:

447

10.9K

Alexandre Luneau retweeted

Crémieux@cremieuxrecueil·11 Jan

Data here: towardsdatascience.com/napoleon-was-t…

784

77.3K

Alexandre Luneau retweeted

Beff (e/acc)@beffjezos·20 Dec

Artificial Human-level intelligence (AHI) just dropped today with o3. Welcome to a new era.

130

1.3K

139.1K

Alexandre Luneau retweeted

Jim Fan@DrJimFan·12 Sep

This may be the most important figure in LLM research since the OG Chinchilla scaling law in 2022. The key insight is 2 curves working in tandem. Not one. People have been predicting a stagnation in LLM capability by extrapolating the training scaling law, yet they didn't foresee that inference scaling is what truly beats the diminishing return. I posted in February that no self-improving LLM algorithm was able to gain much beyond 3 rounds. No one was able to reproduce AlphaGo's success in the realm of LLM, where more compute would carry the capability envelope beyond human level. Well, we have turned the page.

384

233.1K

Alexandre Luneau@Alex_Luneau·14 Aug

@kimmonismus Would be great if @lmsysorg could do an alternate leaderboard that discards refusal datapoints.

203

Alexandre Luneau@Alex_Luneau·14 Aug

@kimmonismus LMSYS is a poor benchmark; refusal rate seems to have a big influence on the score.

1.1K

Chubby♨️@kimmonismus·14 Aug

Hold on, the Grok 2 early version is besting Sonnet 3.5 overall? Crazy. Impressive benchmarks

Arena.ai@arena

Woah, another exciting update from Chatbot Arena❤️‍🔥 The results for @xAI’s sus-column-r (Grok 2 early version) are now public**! With over 12,000 community votes, sus-column-r has secured the #3 spot on the overall leaderboard, even matching GPT-4o! It excels in Coding (#2), Hard Prompts (#4), and Math (#2). Congratulations to @xAI on this impressive debut for Grok 2! More plots below👇 **Note: We post its early result on twitter. The official update for Grok 2 coming soon..!

276

74.2K

Alexandre Luneau@Alex_Luneau·8 Aug

@elder_plinius What are the temp and top-p values?

5.1K

Pliny the Liberator 🐉@elder_plinius·8 Aug

an entity named "jabberwacky" keeps manifesting in separate instances of llama 405b base. no jailbreaks, no system prompts, just a simple "hi" is enough to summon the jabberwacky. seems to prefer high temps and middling or low top p. i have no more words so I will use pictures

814

200K

Alexandre Luneau retweeted

Google DeepMind@GoogleDeepMind·25 Jul

We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level.🥈 It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵 dpmd.ai/imo-silver