

Alexandre Ramé
829 posts

@ramealexandre
Research scientist @GoogleDeepMind. Previously PhD @Sorbonne_Univ_. Post-training Gemma LLMs: distillation, RL and merging.

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵


Demis and Sam, now is the time to stand with Dario. You can pick up the race next week.

A statement from Anthropic CEO, Dario Amodei, on our discussions with the Department of War. anthropic.com/news/statement…



📣 Announcing a second, different job posting from yesterday's. The @GoogleDeepMind Autonomous Agents team is seeking to hire a Research Scientist to work on established projects on sample efficient learning and robust self-improvement. Details and link below. [1/5]

The GLM-5 technical report is an impressive read. In the terminal environment section, their methodology is very similar to our SETA project: starting from seed tasks to draft terminal-task specifications, then building Docker environments and validating them with test scripts. They’ve also scaled this pipeline to generate thousands of environments. Thanks @Zai_org for sharing such a detailed report. If you are interested in building open-source terminal environments, also check out our 1376 environments and blog here: GitHub: github.com/camel-ai/seta-… Blog: camel-ai.org/blogs/seta-sca…

For decades, we’ve trained AI to chase rewards. But humans don’t just optimize outcomes. We experience, reflect, then learn. Can AI do the same? Introducing 𝐄𝐱𝐩𝐞𝐫𝐢𝐞𝐧𝐭𝐢𝐚𝐥 𝐑𝐞𝐢𝐧𝐟𝐨𝐫𝐜𝐞𝐦𝐞𝐧𝐭 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠, a step toward AI that truly learns from experience.
