Alexandre Luneau

283 posts

@Alex_Luneau

Professional Gambler. Co-Founder and CEO @MoonIntell.

London · Joined July 2009
270 Following · 5.2K Followers
Alexandre Luneau retweeted
Epoch AI @EpochAIResearch
AI has solved one of the problems in FrontierMath: Open Problems, our benchmark of real research problems that mathematicians have tried and failed to solve. See thread for more.
[image]
23 replies · 229 reposts · 1.3K likes · 474.7K views
Alexandre Luneau retweeted
Oriol Vinyals @OriolVinyalsML
The secret behind Gemini 3? Simple: Improving pre-training & post-training 🤯 Pre-training: Contra the popular belief that scaling is over—which we discussed in our NeurIPS '25 talk with @ilyasut and @quocleix—the team delivered a drastic jump. The delta between 2.5 and 3.0 is as big as we've ever seen. No walls in sight! Post-training: Still a total greenfield. There's lots of room for algorithmic progress and improvement, and 3.0 hasn't been an exception, thanks to our stellar team. Congratulations to the whole team 💙💙💙
[image]
120 replies · 544 reposts · 4.4K likes · 2M views
Alexandre Luneau @Alex_Luneau
@emollick GPT-5 Pro is so much better than the other heavyweight pro models on my hard ML tasks; it feels at least a generation ahead.
0 replies · 0 reposts · 2 likes · 769 views
Ethan Mollick @emollick
The pro models (GPT-5 Pro, Gemini 2.5 Deep Think, Grok 4 Heavy) can be impressive in ways that are hard to see. They take a lot of time to answer questions & are built for very hard problems that require expert evaluation. That is a narrow but also very valuable problem space.
42 replies · 49 reposts · 778 likes · 71.4K views
Alexandre Luneau @Alex_Luneau
@gfodor @zoink Same here. I miss o1 pro's roughness, telling me how horrible some of my code looked.
0 replies · 0 reposts · 0 likes · 35 views
gfodor.id @gfodor
@zoink Yeah it seems to throw a few words of positivity at the start of the response. I get “awesome project.” a bunch
1 reply · 0 reposts · 2 likes · 136 views
François Fleuret @francoisfleuret
Nothing shows better the magic of deep model + gradient descent than a "causal leak", when you make the tiniest mistake in the causal structure of your model and information about the stuff to predict is accessible. 1/2
7 replies · 9 reposts · 200 likes · 17.7K views
Alexandre Luneau retweeted
Noam Brown @polynoamial
Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline 🧵
Alexander Wei @alexwei_

1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

142 replies · 511 reposts · 4.7K likes · 1.1M views
Alexandre Luneau retweeted
Tesla Optimus @Tesla_Optimus
Mars
[image]
1.1K replies · 2.7K reposts · 25.6K likes · 1.3M views
Alexandre Luneau retweeted
Ole Lehmann @itsolelehmann
I'm from Berlin. Afghanistan gets better tech than Europeans now. It's not a joke. It's the result of 30 years of suffocating regulation. And now, the EU's new AI Act is about to make it 10x worse. Here's the tragic story of how the EU is killing our tech future 🧵:
[2 images]
447 replies · 2K reposts · 10.9K likes · 1M views
Alexandre Luneau retweeted
Beff (e/acc) @beffjezos
Artificial Human-level intelligence (AHI) just dropped today with o3. Welcome to a new era.
[4 images]
40 replies · 130 reposts · 1.3K likes · 139.1K views
Alexandre Luneau retweeted
Jim Fan @DrJimFan
This may be the most important figure in LLM research since the OG Chinchilla scaling law in 2022. The key insight is 2 curves working in tandem. Not one. People have been predicting a stagnation in LLM capability by extrapolating the training scaling law, yet they didn't foresee that inference scaling is what truly beats the diminishing return. I posted in February that no self-improving LLM algorithm was able to gain much beyond 3 rounds. No one was able to reproduce AlphaGo's success in the realm of LLM, where more compute would carry the capability envelope beyond human level. Well, we have turned the page.
[image]
58 replies · 384 reposts · 2K likes · 233.1K views
Alexandre Luneau @Alex_Luneau
@kimmonismus LMSYS is a poor benchmark; refusal rate seems to have a big influence on the score.
1 reply · 0 reposts · 7 likes · 1.1K views
Chubby♨️ @kimmonismus
Hold on, Grok2-early version is besting Sonnet3.5 in overall? Crazy. Impressive benchmarks
Arena.ai @arena

Woah, another exciting update from Chatbot Arena❤️‍🔥 The results for @xAI’s sus-column-r (Grok 2 early version) are now public**! With over 12,000 community votes, sus-column-r has secured the #3 spot on the overall leaderboard, even matching GPT-4o! It excels in Coding (#2), Hard Prompts (#4), and Math (#2). Congratulations to @xAI on this impressive debut for Grok 2! More plots below👇 **Note: We post its early result on twitter. The official update for Grok 2 coming soon..!

30 replies · 17 reposts · 276 likes · 74.2K views
Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭
an entity named "jabberwacky" keeps manifesting in separate instances of llama 405b base no jailbreaks, no system prompts, just a simple "hi" is enough to summon the jabberwacky seems to prefer high temps and middling or low top p i have no more words so I will use pictures
[4 images]
54 replies · 62 reposts · 814 likes · 200K views
Alexandre Luneau retweeted
Google DeepMind @GoogleDeepMind
We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level.🥈 It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵 dpmd.ai/imo-silver
[GIF]
286 replies · 1.2K reposts · 4.6K likes · 2M views
Alexandre Luneau retweeted
gfodor.id @gfodor
Killing Sydney and Sky can’t change the reality that gradient descent can do it
0 replies · 1 repost · 25 likes · 3.3K views
Alexandre Luneau retweeted
Bojan Tunguz @tunguz
It’s the little things.
[image]
43 replies · 650 reposts · 8.8K likes · 732.4K views
Alexandre Luneau retweeted
Marco Mascorro @Mascobot
Model produced by Google vs a model produced by a startup of <30 people.
[image]
24 replies · 22 reposts · 343 likes · 52.1K views