Amine Benhalloum

141 posts

@amine_benh

Agents and Post Training @Meta Superintelligence Labs

Joined December 2013

531 Following · 237 Followers

Amine Benhalloum @amine_benh · Dec 4

I’m at #NeurIPS2025! Our @Meta MSL Agents team is hiring interns in Paris. DM me if you’re excited to build the next wave of agents, environments, and everything in between.

Amine Benhalloum reposted

Romain Froger @froger_romain · Dec 2

I'll be at @NeurIPSConf in San Diego this week, together with the co-authors of ARE/Gaia2 @mialon_gregoire & @amine_benh. Would love to connect: let’s talk about what’s next for agents!

Grégoire Mialon @mialon_gregoire

I am at #NeurIPS2025! I am hiring an intern for our Paris team to succeed @MekalaDheeraj and @ulyanapiterbarg, DM if you want to work on what's next for agents. I will also look back on Gaia and introduce Gaia2 at the Scaling Environments for Agents workshop on Sunday!

Amine Benhalloum reposted

Grégoire Mialon @mialon_gregoire · Oct 2

We released ARE and Gaia2 one week ago, time to share some observations and add new models to the leaderboard! huggingface.co/blog/meta-agen…

Amine Benhalloum @amine_benh · Sep 25

@xianjun_agi Thank you @xianjun_agi! It's only the beginning ;)

Xianjun Yang @xianjun_agi · Sep 24

Glad to be an early user of ARE! Congrats @amine_benh on the release!

Thomas Scialom @ThomasScialom

🚀 ARE: scaling up agent environments and evaluations. Everyone talks about RL envs, so we built one we actually use. In the second half of AI, evals & envs are the bottleneck. Today we OSS it all: Meta Agent Research Environment + GAIA-2 (code, demo, evals). 🔗 Links 👇

Amine Benhalloum reposted

Gabriel Synnaeve @syhw · Sep 25

(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models. ai.meta.com/research/publi…

Amine Benhalloum reposted

Rohan Paul @rohanpaul_ai · Sep 23

🧠 Great research from @Meta Superintelligence Labs. It proposes Meta Agents Research Environments (ARE) for scaling up agent environments and evaluations. ARE lets researchers build realistic agent environments, run agents asynchronously, and verify them cleanly. On top of it they release Gaia2, a 1,120-scenario benchmark that stresses search, execution, ambiguity, time pressure, collaboration, and noise, and the results show sharp tradeoffs between raw reasoning and speed or cost.

⚙️ The Core Concepts

ARE treats the world as a clocked simulation where everything is an event, the agent runs separately, and interactions flow through tools and notifications. Apps are the tools, environments bundle the apps plus rules, and scenarios package starting state, scheduled events, and a verifier.

Older benchmarks froze the world while a model was “thinking.” That made results look clean but ignored the real costs of inference time. In ARE, the world keeps ticking asynchronously. Time passes even while the model is generating, apps can trigger notifications, and other actors may act. So if a model is slow, it directly shows up as missed deadlines in the benchmark.

That is exactly why GPT-5 (high) got 79.6 on Search but 0 on Time in default mode. The reasoning quality was excellent, but ARE exposed its inference slowness as a concrete failure mode. When ARE switched to instant mode, stripping out the latency, the model suddenly performed well, proving the bottleneck wasn’t reasoning but raw response time. @AIatMeta 🧵 Read on 👇
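The "clocked simulation" idea described in the post above can be illustrated with a toy example. This is only a sketch under stated assumptions: `Event`, `ClockedWorld`, and `run_scenario` are hypothetical names invented here and are not ARE's actual API, and the latency and deadline numbers are made up. It shows why a slow agent can fail a time-sensitive scenario in default mode yet pass in an "instant" mode where generation costs no simulated time.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    time: float                        # simulated seconds since scenario start
    name: str = field(compare=False)   # events are ordered by time only


class ClockedWorld:
    """Toy world: a clock plus a queue of scheduled events (hypothetical, not ARE's API)."""

    def __init__(self, events):
        self.now = 0.0
        self.queue = list(events)
        heapq.heapify(self.queue)
        self.notifications = []

    def advance(self, seconds):
        """Move the clock forward and deliver every event that has become due."""
        self.now += seconds
        while self.queue and self.queue[0].time <= self.now:
            self.notifications.append(heapq.heappop(self.queue).name)


def run_scenario(latency_per_step, steps_needed, deadline, instant=False):
    """Return True if the agent completes all steps before the deadline event fires."""
    world = ClockedWorld([Event(deadline, "deadline")])
    for _ in range(steps_needed):
        # The world keeps ticking while the agent "thinks";
        # in instant mode, generation costs no simulated time.
        world.advance(0.0 if instant else latency_per_step)
        if "deadline" in world.notifications:
            return False
    return True


# A slow agent misses the deadline in default mode but passes in instant mode.
print(run_scenario(30.0, 5, 60.0))                # False
print(run_scenario(30.0, 5, 60.0, instant=True))  # True
print(run_scenario(5.0, 5, 60.0))                 # True
```

The only difference between the first two calls is whether generation time is charged to the simulated clock, which mirrors the default versus instant contrast the post describes.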

Amine Benhalloum reposted

echen @echen · Sep 22

@ThomasScialom amazing!! 🚀🚀 excited ARE is finally out :)

Amine Benhalloum reposted

Clémentine Fourrier 🍊 is off till Dec 2026 hiking @clefourrier · Sep 23

Did you see that the Agent Research Environment is MCP compatible? -> using any MCP tools with any agent is now completely trivial! Check it out! We've used an LLM agent to 1) move a robot arm remotely 2) depending on real time web search results! :D How to in thread ^^

Clémentine Fourrier 🍊 is off till Dec 2026 hiking @clefourrier

Wanna upgrade your agent game? With @AIatMeta, we're releasing 2 incredibly cool artefacts:
- GAIA 2: assistant evaluation with a twist (new: adaptability, robustness to failure & time sensitivity)
- ARE, an agent research environment to empower all!
huggingface.co/blog/gaia2

Amine Benhalloum reposted

elvis @omarsar0 · Sep 22

Very cool work from Meta Superintelligence Labs. They are open-sourcing Meta Agents Research Environments (ARE), the platform they use to create and scale agent environments. Great resource to stress-test agents in environments closer to real apps. Read on for more:

Amine Benhalloum @amine_benh · Sep 22

Our contribution to the second half of AI 🚀 This has been a joy to build.

Grégoire Mialon @mialon_gregoire

🏗️ ARE: scaling up agent environments and evaluations. In the LLM+RL era, evals and envs are the bottleneck. Happy to release Gaia2, an extensible benchmark for agents aiming to reduce the sim2real gap, plus ARE, the platform on which Gaia2 is built. Enjoy evaluating your agents! 👇

Amine Benhalloum @amine_benh · Feb 29

@artsiom_s Congrats man!

Artsiom Sanakoyeu @artsiom_s · Feb 29

Staff Research Scientist: Personal Update

I have some exciting news that I'd like to share with you! On Monday, I was promoted to E6, which means I am now a Staff Research Scientist at Meta GenAI. This was made possible thanks to the significant impact and scope of a Generative AI project that I proposed, led, and completed last year. The project is not yet public, so I can't share details about it right now.

Before this, I was at the terminal level - Senior Research Scientist, a position many get stuck in forever. It takes extra effort and personal qualities to break out of this limbo and become a Staff member.

But now, I've unlocked a new ladder, E6+, where leveling up is significantly more challenging than between Junior (E3) and Senior (E5) levels. However, this also presents a challenge and an opportunity for further development! Exciting stuff!

Amine Benhalloum @amine_benh · Aug 26

@NVIDIAGeForce @alanwake #RTXON

NVIDIA GeForce @NVIDIAGeForce · Aug 23

We’re giving away an NVIDIA GeForce RTX 4090 with a one-of-a-kind @AlanWake custom backplate, launching 10/27 with full ray tracing + NVIDIA DLSS 3.5 ⚡ Entering is easy: 🟢 Like this post 🟢 Comment #RTXON

Amine Benhalloum @amine_benh · Jan 27

@GaelVaroquaux @SaveToNotion #thread

Gael Varoquaux 🦋 @GaelVaroquaux · Jan 27

✨ Updated version: book chapter on machine-learning model evaluation. To me, this text is very important, introducing readers to important and underrated concepts, though most are neither new nor complicated. hal.science/hal-03682454/ 1/6

Amine Benhalloum @amine_benh · Jan 27

@deliprao @upennnlp @gail_w @yoavgo @yahave @SaveToNotion #thread

Delip Rao e/σ @deliprao · Jan 26

Yesterday, @upennnlp invited @gail_w to share her work with @yoavgo & @yahave on “Thinking Like Transformers” at our long-running Wed. speaker series “clunch”, and it is one of the most interesting transformer-related works I’ve listened to. Plus, Gail is superbly engaging!

Amine Benhalloum @amine_benh · Jan 23

@thiagoghisi @SaveToNotion #thread

Thiago Ghisi @thiagoghisi · Jan 22

I've read thousands of articles over the last 20 years in Tech. Agile, Career, Distributed Systems, Engineering Management, Metrics, Programming, Testing, Types... Below are The 22 Articles that Impacted my Career the most ➕ my main highlights from each one. 🧵🧵🧵

Amine Benhalloum @amine_benh · Jan 19

@GeorgeEDahl @SaveToNotion #tweet

George E. Dahl @GeorgeEDahl · Jan 19

We've just released the first version of our Deep Learning Tuning Playbook! This is our attempt to distill our process for actually getting good results with deep learning. We emphasize hyperparameter tuning since it has been a large pain point. github.com/google-researc…

Amine Benhalloum @amine_benh · Oct 21

@Jeande_d @SaveToNotion #tweet

Jean de Dieu Nyandwi @Jeande_d · Oct 20

Designing, Visualizing and Understanding Deep Neural Networks - UC Berkeley. A great DL course that covers all sorts of neural network architectures, techniques for training them, and visualizing their representations. Videos: youtube.com/playlist?list=… Web: cs182sp21.github.io

Amine Benhalloum @amine_benh · Oct 12

@pranavrajpurkar @SaveToNotion #tweet

Pranav Rajpurkar @pranavrajpurkar · Oct 12

We're halfway through #HarvardCS197: AI Research Experiences! We've covered language models, Python practices, reading AI papers, PyTorch, Lightning, Weights & Biases, Hydra, and research idea generation. My lecture notes (~13hrs) are publicly available: cs197.seas.harvard.edu

Amine Benhalloum @amine_benh · Oct 9

@ankurhandos @SaveToNotion #tweet

Ankur Handa @ankurhandos · Oct 8

This is by far the best video I have seen explaining ray tracing, brdf, and rendering equation with great visualisations 😍 highly recommended. youtube.com/watch?v=gsZiJe…
