Thibault

2.3K posts

@Thibaultle33

Waiting for June 19, 2027

Joined December 2014
615 Following · 44 Followers
818qs
818qs@818qs·
@Zapidroid @OpenAI Because it's easier to build AI on Macs: they have the silicon chips, which Windows doesn't have. It's as easy as that
English
1
0
2
247
OpenAI
OpenAI@OpenAI·
Codex for (almost) everything. It can now use apps on your Mac, connect to more of your tools, create images, learn from previous actions, remember how you like to work, and take on ongoing and repeatable tasks.
English
884
1.5K
14.7K
3.3M
YellOwStaR
YellOwStaR@YellOwStaRL0L·
🇫🇷 France will be there! It's official: I have the immense honor of sharing that I have been named Manager of Team France for the Esports Nations Cup @ENC_EN 🏆 The French esports scene is huge, the community behind us is incredible, and together we will rise to the challenge against the best nations in the world. We hope you'll be there! #ENC2026 #Esports #France
Français
94
516
6.8K
300.5K
Thibault
Thibault@Thibaultle33·
@imjszhang @rryssf Don't worry bro, every frontier lab is already doing this in-house. There is a reason why people don't use the available RL frameworks and just build from scratch
English
0
0
0
8
JS
JS@imjszhang·
@rryssf Two years optimizing "model not strong enough" while the real bottleneck was I/O-heavy ops in training loops. Expensive mistake: solving the wrong problem elegantly.
English
1
0
0
194
Robert Youssef
Robert Youssef@rryssf·
BREAKING: NVIDIA just proved that the AI agent training bottleneck everyone blamed on model capability was actually an infrastructure design error.

Every framework (SkyRL, VeRL-Tool, Agent Lightning, rLLM, GEM) embeds rollout inside the training loop: I/O-intensive execution fighting GPU-intensive optimization. Nobody separated them.

> Treat rollout as a service. Qwen 8B nearly doubles on SWE-Bench. The compute was always there. It was just fighting itself.

> Training an AI agent with reinforcement learning requires two fundamentally different workloads running simultaneously. Rollout is I/O-intensive: spinning up sandboxed environments, executing tool calls, waiting on shell commands, scoring outcomes. Training is GPU-intensive: forward passes, backward passes, gradient synchronization. Every existing framework runs both inside the same process. The result is constant resource contention: rollout workers block on disk I/O while GPU compute sits idle, and gradient updates stall while environments finish executing.

> NVIDIA audited every major agentic RL framework and found the same architectural decision in all of them. SkyRL keeps rollout control inside the training driver. Agent Lightning embeds rollout workers as child processes of the trainer: if training stops, rollout stops. VeRL-Tool, rLLM, and GEM all keep environment management and trajectory collection inside the training stack. Not because it's the right design. Because it was easier to build that way and nobody had fixed it yet.

> ProRL Agent separates them completely. The rollout system runs as a standalone HTTP service. The trainer sends a task instance. The rollout server handles environment initialization, multi-turn agent execution, tool calls, and reward computation, and returns a completed trajectory. The trainer never touches the execution environment. The two systems communicate through one interface and run on separate machines optimized for their respective workloads.

> The results are not subtle. Qwen 8B: 9.6% on SWE-Bench Verified under standard training. With ProRL Agent: 18.0%. That's close to 2x on the benchmark the entire software engineering AI field uses to measure progress. Qwen 14B goes from 15.4% to 23.6%. Qwen 4B goes from 14.8% to 21.2%. Same models. Same data. Different infrastructure.

→ Qwen 8B on SWE-Bench Verified: 9.6% baseline → 18.0% with ProRL Agent
→ Qwen 14B: 15.4% → 23.6%
→ Qwen 4B: 14.8% → 21.2%
→ Throughput scales near-linearly with compute nodes added
→ Efficient bash optimization alone: shell command latency drops from 0.78s to 0.42s per action
→ GPU utilization with full system: 78% vs 42% without load balancing
→ Every existing framework audited: zero had decoupled rollout from training

The engineering insight that gets buried in the results: the problem compounds at every tool call. A typical software engineering rollout spans dozens of sequential environment interactions. Each one blocks. Each one accumulates latency. At scale, with hundreds of parallel rollouts, tool execution becomes the dominant bottleneck, not model inference, not gradient computation. The entire field was measuring model capability while the infrastructure was quietly eating half the compute budget.
English
17
74
392
40.2K
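The "rollout as a service" split described in Robert Youssef's post can be illustrated with a minimal sketch. This is not ProRL Agent's code or API: the `/rollout` path, the task fields `task_id` and `max_turns`, the function names, and the fixed reward are all hypothetical, and the rollout itself is a stub standing in for sandbox setup, tool calls, and scoring. It only shows the shape of the interface: the trainer posts a task over HTTP and gets back a completed trajectory, without ever touching the execution environment.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_rollout(task):
    """Stub rollout: a real service would start a sandbox, run the agent's
    turns and tool calls, and compute a reward. Values here are placeholders."""
    steps = [{"turn": i, "action": f"tool_call_{i}"} for i in range(task["max_turns"])]
    return {"task_id": task["task_id"], "steps": steps, "reward": 1.0}


class RolloutHandler(BaseHTTPRequestHandler):
    """Hypothetical rollout endpoint: accepts a task, returns a trajectory."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        task = json.loads(self.rfile.read(length))
        body = json.dumps(run_rollout(task)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging in this sketch


def start_rollout_server():
    """Start the rollout service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), RolloutHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_trajectory(base_url, task):
    """Trainer side: send one task, receive one completed trajectory."""
    req = urllib.request.Request(
        base_url + "/rollout",
        data=json.dumps(task).encode(),
        headers={"Content-Type": "application/json"},
    )
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    server = start_rollout_server()
    url = f"http://127.0.0.1:{server.server_port}"
    print(request_trajectory(url, {"task_id": "demo-1", "max_turns": 3}))
    server.shutdown()
```

Here both sides run in one process only for the sake of a runnable demo. In the design the post describes, the server would sit on separate machines tuned for I/O-heavy work, and the trainer would only ever see the HTTP interface.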
Thibault
Thibault@Thibaultle33·
@marieoclock @GoogleDeepMind World models, basically. This is particularly useful for training the models themselves. We are still far from where we need to be for it to be useful, but with a bit more progress we will get there
English
0
0
0
275
not_your_type
not_your_type@marieoclock·
@GoogleDeepMind That's great, but what's the point? You have such an amazing tool and decide to spend the time and money on making this useless showoff feature? Nobody needs to generate a web in real time.
English
22
1
92
17.6K
Google DeepMind
Google DeepMind@GoogleDeepMind·
Watch how fast Gemini 3.1 Flash-Lite can generate websites. ⚡ This browser creates each page in real-time as you click, search, and navigate. Give it a try → goo.gle/4t9In1R
English
148
369
3.2K
836.2K
Spencer Hakimian
Spencer Hakimian@SpencerHakimian·
Cost of Living in Germany vs USA:
English
1.8K
1.7K
11.2K
2.1M
Dexerto
Dexerto@Dexerto·
France may ban Kick following the death of Jean Pormanove, 46, while streaming live on the platform. The Paris court is set to deliver its ruling on the government's request by December 19th
English
154
493
11.8K
1.6M
Thibault
Thibault@Thibaultle33·
@Shyguyy22 @KUKKI_ALL Given how he handled Vladi, plus the drafts, plus the BO case, sure, we may have a negative preconception, but it's hard to judge without being inside the system. There is good feedback about him in the French community, so personally I'm giving him the benefit of the doubt
Français
0
0
0
73
Grok
Grok@grok·
@MoreauMore484 @_IDVL Yes, this video was filmed in northern France. It shows police officers drifting on snow in a parking lot at night.
Français
1
0
1
320
IDIOTS DU VILLAGE
IDIOTS DU VILLAGE@_IDVL·
I think this is one of the most beautiful arrests I've ever seen in my life
Français
77
364
5.7K
294.6K
Thibault
Thibault@Thibaultle33·
@LoL_France I hope you'll expose in the API whether a player is on WASD or not, so the community can monitor for itself whether it's broken…
Français
0
0
1
133
League of Legends FR
League of Legends FR@LoL_France·
It has now been more than a week since ZQSD controls became available in League of Legends in unranked modes other than draft. Have you tried this new way of playing yet? Share your feedback with us! 👊
Français
85
1
278
116K
Thibault
Thibault@Thibaultle33·
@Shyguyy22 @02Xebec If you had watched his retrospective stream, you'd know it's worse than that: he thinks the impact of drafts is fairly low (anyway, he said he didn't really like drafting and that he wasn't the one handling it)
Français
1
0
0
24
Thibault retweeted
Karmine Corp
Karmine Corp@KarmineCorp·
Hello! 👋
Français
324
1.1K
14K
2.4M
Thibault retweeted
Arcane
Arcane@arcaneshow·
Behind-the-scenes of Arcane, courtesy of @ForticheProd
English
983
43.6K
236.5K
13.7M
Thibault retweeted
Arcane
Arcane@arcaneshow·
Thank you for four magical years. Always with you 💙
English
382
21.5K
130.9K
6.8M
Thibault
Thibault@Thibaultle33·
@TraYt0N Saken mid and Cinkrof jng, I beg
English
0
0
0
285