
Benjamin Ricaud
528 posts

@GBR_Data
Data, science, research, graphs, artificial intelligence.





Peter Steinberger is joining OpenAI to drive the next generation of personal agents. He is a genius with a lot of amazing ideas about the future of very smart agents interacting with each other to do very useful things for people. We expect this will quickly become core to our product offerings. OpenClaw will live in a foundation as an open source project that OpenAI will continue to support. The future is going to be extremely multi-agent and it's important to us to support open source as part of that.

Happy to share my latest pre-print with @GBR_Data. We investigate reasoning efficiency in LLMs and how to decompose it into different factors for a series of LLMs, depending on what you know about the reasoning task. arxiv.org/abs/2602.09805

% of individuals using generative AI tools (aged 16-74), OECD latest data:
🇳🇴 56% Norway
🇩🇰 48% Denmark
🇨🇭 47% Switzerland
🇪🇪 46% Estonia
🇫🇮 46% Finland
🇮🇪 45% Ireland
🇳🇱 44% Netherlands
🇬🇷 44% Greece
🇱🇺 42% Luxembourg
🇧🇪 42% Belgium
🇸🇪 42% Sweden
🇦🇹 39% Austria
🇵🇹 38% Portugal
🇪🇸 38% Spain
🇸🇮 37% Slovenia
🇫🇷 37% France
🇱🇹 36% Lithuania
🇨🇿 35% Czechia
🇰🇷 34% Korea
🇱🇻 33% Latvia
🇪🇺 33% EU27
🇩🇪 32% Germany
🇸🇰 31% Slovak Republic
🇭🇺 30% Hungary
🇭🇷 27% Croatia
🇯🇵 27% Japan
🇵🇱 23% Poland
🇧🇬 22% Bulgaria
🇮🇹 20% Italy
🇷🇴 18% Romania
🇹🇷 17% Türkiye
Source: @OECD ICT Access and Usage Database, January 2026.

The conference may be over, but the LOG community never slows down 🧑💼🌍 Join us at upcoming meetups worldwide: 🇳🇴 Tromsø 🇮🇳 Gandhinagar 🇮🇹 Pisa 🇫🇷 Paris 🇧🇷 São Carlos 🇮🇳 New Delhi




Hey twitter! I'm releasing the LLM Evaluation Guidebook v2! Updated, nicer to read, interactive graphics, etc! huggingface.co/spaces/OpenEva… After this, I'm off: I'm taking a sabbatical to go hike with my dogs :D (back @huggingface in Dec *2026*) See you all next year!



My new work with @GBR_Data is on arXiv now. arxiv.org/abs/2509.18458 🧵 We introduce a reasoning benchmark for LLMs where you can vary difficulty, length, and noise truly independently. It's also the first benchmark that grounds these dimensions in Cognitive Load Theory.


Don’t worry, our jobs are safe.






