Amine Benhalloum

141 posts

@amine_benh

Agents and Post Training @Meta Superintelligence Labs

Joined December 2013
531 Following · 237 Followers
Amine Benhalloum @amine_benh
I’m at #NeurIPS2025! Our @Meta MSL Agents team is hiring interns in Paris. DM me if you’re excited to build the next wave of agents, environments, and everything in between.
Replies: 1 · Retweets: 1 · Likes: 9 · Views: 837
Amine Benhalloum retweeted
Romain Froger @froger_romain
I'll be at @NeurIPSConf in San Diego this week, together with the co-authors of ARE/Gaia2, @mialon_gregoire & @amine_benh. Would love to connect: let’s talk about what’s next for agents!
Grégoire Mialon @mialon_gregoire

I am at #NeurIPS2025! I am hiring an intern for our Paris team to succeed @MekalaDheeraj and @ulyanapiterbarg. DM me if you want to work on what's next for agents. I will also look back on Gaia and introduce Gaia2 at the Scaling Environments for Agents workshop on Sunday!

Replies: 0 · Retweets: 2 · Likes: 8 · Views: 1.1K
Amine Benhalloum retweeted
Gabriel Synnaeve @syhw
(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models. ai.meta.com/research/publi…
Replies: 60 · Retweets: 312 · Likes: 1.8K · Views: 918.3K
Amine Benhalloum retweeted
Rohan Paul @rohanpaul_ai
🧠 Great research from @Meta Superintelligence Labs. It proposes Meta Agents Research Environments (ARE) for scaling up agent environments and evaluations. ARE lets researchers build realistic agent environments, run agents asynchronously, and verify them cleanly. On top of it they release Gaia2, a 1,120-scenario benchmark that stresses search, execution, ambiguity, time pressure, collaboration, and noise. The results show sharp tradeoffs between raw reasoning and speed or cost.

⚙️ The Core Concepts

ARE treats the world as a clocked simulation where everything is an event, the agent runs separately, and interactions flow through tools and notifications. Apps are the tools, environments bundle the apps plus rules, and scenarios package starting state, scheduled events, and a verifier.

Older benchmarks froze the world while a model was “thinking.” That made results look clean but ignored the real costs of inference time. In ARE, the world keeps ticking asynchronously. Time passes even while the model is generating, apps can trigger notifications, and other actors may act. So if a model is slow, its slowness shows up directly as missed deadlines in the benchmark.

That is why GPT-5 (high) scored 79.6 on Search but 0 on Time in default mode. The reasoning quality was excellent, but ARE exposed its inference slowness as a concrete failure mode. When ARE switched to instant mode, stripping out the latency, the model suddenly performed well, which indicates the bottleneck was raw response time rather than reasoning. @AIatMeta

🧵 Read on 👇
Replies: 2 · Retweets: 11 · Likes: 26 · Views: 5K
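The timing idea in the post above (the world clock keeps running while the model generates, so a slow model misses deadlines) can be illustrated with a toy simulation. This is only a sketch under stated assumptions and not ARE's actual API: `Task`, `run_scenario`, the latencies, and the deadlines are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    arrives_at: float  # simulated time at which the task is announced
    deadline: float    # simulated time by which it must be completed


def run_scenario(tasks, latency, instant=False):
    """Return the names of tasks finished before their deadlines.

    Hypothetical toy model: each task costs one agent step, and each
    step takes `latency` units of simulated time.
    """
    clock = 0.0
    completed = []
    for task in sorted(tasks, key=lambda t: t.arrives_at):
        # The agent cannot start before the task's event fires.
        clock = max(clock, task.arrives_at)
        # Default mode: the world clock advances while the model generates.
        # Instant mode: generation is treated as taking zero time.
        clock += 0.0 if instant else latency
        if clock <= task.deadline:
            completed.append(task.name)
    return completed


tasks = [
    Task("reply_to_message", arrives_at=0.0, deadline=5.0),
    Task("book_cab", arrives_at=10.0, deadline=15.0),
]

slow_default = run_scenario(tasks, latency=8.0)
slow_instant = run_scenario(tasks, latency=8.0, instant=True)
fast_default = run_scenario(tasks, latency=2.0)
```

With these made-up numbers, the slow agent misses both deadlines in default mode but completes both tasks in instant mode, while the fast agent completes both in default mode. That mirrors the qualitative pattern the post describes, without reproducing any real benchmark score.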
Amine Benhalloum retweeted
echen @echen
@ThomasScialom amazing!! 🚀🚀 excited ARE is finally out :)
Replies: 0 · Retweets: 3 · Likes: 6 · Views: 745
Amine Benhalloum retweeted
Clémentine Fourrier 🍊 is off till Dec 2026 hiking
Did you see that the Agent Research Environment is MCP compatible? -> using any MCP tools with any agent is now completely trivial! Check it out! We've used an LLM agent to 1) move a robot arm remotely 2) depending on real time web search results! :D How to in thread ^^
Clémentine Fourrier 🍊 is off till Dec 2026 hiking @clefourrier

Wanna upgrade your agent game? With @AIatMeta , we're releasing 2 incredibly cool artefacts: - GAIA 2: assistant evaluation with a twist (new: adaptability, robustness to failure & time sensitivity) - ARE, an agent research environment to empower all! huggingface.co/blog/gaia2

Replies: 1 · Retweets: 8 · Likes: 32 · Views: 3.2K
Amine Benhalloum retweeted
elvis @omarsar0
Very cool work from Meta Superintelligence Labs. They are open-sourcing Meta Agents Research Environments (ARE), the platform they use to create and scale agent environments. Great resource to stress-test agents in environments closer to real apps. Read on for more:
Replies: 39 · Retweets: 175 · Likes: 990 · Views: 151.2K
Artsiom Sanakoyeu @artsiom_s
Staff Research Scientist: Personal Update

I have some exciting news to share! On Monday, I was promoted to E6, which means I am now a Staff Research Scientist at Meta GenAI. This was made possible by the significant impact and scope of a Generative AI project that I proposed, led, and completed last year. The project is not yet public, so I can't share details about it right now.

Before this, I was at the terminal level, Senior Research Scientist, a position many get stuck in forever. It takes extra effort and personal qualities to break out of this limbo and become a Staff member. Now I've unlocked a new ladder, E6+, where leveling up is significantly more challenging than between the Junior (E3) and Senior (E5) levels. However, this also presents a challenge and an opportunity for further development! Exciting stuff!
Replies: 20 · Retweets: 3 · Likes: 222 · Views: 46.9K
NVIDIA GeForce @NVIDIAGeForce
We’re giving away a NVIDIA GeForce RTX 4090 with a one-of-a-kind @AlanWake custom backplate, launching 10/27 with full ray tracing + NVIDIA DLSS 3.5 ⚡ Entering is easy: 🟢 Like this post 🟢 Comment #RTXON
Replies: 32.4K · Retweets: 2.7K · Likes: 36.4K · Views: 1.7M
Gael Varoquaux 🦋 @GaelVaroquaux
✨ Updated version: book chapter on machine-learning model evaluation. To me, this text is very important: it introduces readers to important and underrated concepts, though most are neither new nor complicated. hal.science/hal-03682454/ 1/6
Replies: 3 · Retweets: 14 · Likes: 73 · Views: 10.5K
Delip Rao e/σ @deliprao
Yesterday, @upennnlp invited @gail_w to share her work with @yoavgo & @yahave on “Thinking Like Transformers” at our long-running Wed. speaker series “clunch”, and it is one of the most interesting transformer-related works I’ve listened to. Plus, Gail is superbly engaging!
Replies: 9 · Retweets: 31 · Likes: 247 · Views: 37.4K
Thiago Ghisi @thiagoghisi
I've read thousands of articles over the last 20 years in Tech. Agile, Career, Distributed Systems, Engineering Management, Metrics, Programming, Testing, Types... Below are The 22 Articles that Impacted my Career the most ➕ my main highlights from each one. 🧵🧵🧵
Replies: 51 · Retweets: 299 · Likes: 1.7K · Views: 268.4K
George E. Dahl @GeorgeEDahl
We've just released the first version of our Deep Learning Tuning Playbook! This is our attempt to distill our process for actually getting good results with deep learning. We emphasize hyperparameter tuning since it has been a large pain point. github.com/google-researc…
Replies: 44 · Retweets: 794 · Likes: 3.6K · Views: 670.8K
Pranav Rajpurkar @pranavrajpurkar
We're halfway through #HarvardCS197: AI Research Experiences! We've covered language models, Python practices, reading AI papers, PyTorch, Lightning, Weights & Biases, Hydra, and research idea generation. My lecture notes (~13 hrs) are publicly available: cs197.seas.harvard.edu
Replies: 15 · Retweets: 240 · Likes: 992 · Views: 0
Ankur Handa @ankurhandos
This is by far the best video I have seen explaining ray tracing, the BRDF, and the rendering equation, with great visualisations 😍 Highly recommended. youtube.com/watch?v=gsZiJe…
Replies: 4 · Retweets: 69 · Likes: 463 · Views: 0