Yingchen Xu

148 posts

Yingchen Xu

@YingchenX

CS PhD at @ucl_dark 👩‍💻 interning at @SakanaAILabs 🐠 | previously at @MetaAI deep reinforcement learning | world models | reasoning & planning 🤖️🎨⛰️

London, England · Joined July 2020
399 Following · 604 Followers
Pinned Tweet
Yingchen Xu @YingchenX
🌟We’re excited to announce H-GAP, a generalist model for humanoid control. Trained on large MoCap-derived data, it can generate diverse, natural motions & transfer skills to new tasks without fine-tuning! Paper: arxiv.org/abs/2312.02682 Website: ycxuyingchen.github.io/hgap/ [1/N]
4 replies · 31 reposts · 154 likes · 91K views
Yingchen Xu reposted
hardmaru @hardmaru
Excited to announce our MIT Press book “Neuroevolution: Harnessing Creativity in AI Agent Design” by Sebastian Risi (@risi1979), Yujin Tang (@yujin_tang), Risto Miikkulainen, and myself. We explore decades of work on evolving intelligent agents and show how neuroevolution can drive creativity in deep learning, RL, LLMs, and AI agents!

📖 Free open-access edition: neuroevolutionbook.com

In addition to our own works, this video features work by Jürgen Schmidhuber (@SchmidhuberAI), Seth Bling (@SethBling), Igor Karpov, Jacob Schrum, Yulu Gan (@yule_gan), Ken Stanley (@kenneth0stanley), Joel Lehman (@joelbot3000), Jeff Clune (@jeffclune), Nick Cheney (@CheneyLab), Richard Song (@XingyouSong), Chelsea Finn (@chelseabfinn), Julian Togelius (@togelius), Sam Earle (@Smearle_RH), Hod Lipson (@hodlipson), and Jean-Baptiste Mouret (@jb_mouret).
16 replies · 219 reposts · 1K likes · 161.8K views
Yingchen Xu reposted
Sebastian Risi @risi1979
I’m beyond excited to announce our MIT Press book on Neuroevolution! An HTML version is now available for free on neuroevolutionbook.com, with a print edition coming out later in 2026.

Real intelligence is not static; it evolves. For decades, the field of neuroevolution has pursued this necessary adaptability. Our book chronicles its development, from early concepts to its modern integration with deep learning and reinforcement learning, exploring its potential for understanding the origins of intelligence and its real-world applications.

And the companion webpage is more than just a book site! It comes equipped with interactive demos, videos, exercises, and tutorials to allow everyone to experience neuroevolution in action. Check it out and let us know what you think!

It was a pleasure to work on this book over the last 4+ years with David (@hardmaru), Yujin (@yujin_tang), and Risto. We are incredibly proud of the result and look forward to celebrating! We hope to connect with many of you at NeurIPS.

We are very grateful to Melanie Mitchell (@MelMitchell1), who provided a fantastic foreword. To quote her: “The next big thing in AI is coming, and I suspect that neuroevolution will be a major part of it”. We think so too!
Sebastian Risi tweet media
24 replies · 167 reposts · 647 likes · 96.4K views
Yingchen Xu reposted
Luke Darlow @LearningLukeD
I had to share this stunning gif! Do Continuous Thought Machines dream of electric sheep...? This is a UMAP projection showing the neurons of a CTM firing while generating text (5 tokens, with time to think between). Do you see the emergence of FAST and SLOW thoughts?
GIF
3 replies · 8 reposts · 71 likes · 6.7K views
Yingchen Xu reposted
A. H. Guzel @ahguzelUK
🎮 How can agents learn to generalize from limited offline data? We introduce iMac (Imagined Autocurricula) - training agents entirely in world models with emergent curricula!
GIF
1 reply · 19 reposts · 77 likes · 14.9K views
Yingchen Xu reposted
Sakana AI @SakanaAILabs
We are excited to share that “Continuous Thought Machines” has been accepted as a Spotlight at #NeurIPS2025! 🧠✨ The CTM is an AI that mimics biological brains by using neural dynamics & synchronization to think over time. It can solve complex mazes by building internal maps, gaze around images to classify them, and learn algorithms—all emergent from its core design. This is just the beginning. A hint of what we're exploring next… (video attached!) The team: @LearningLukeD @ciaran_regan_ @risi1979 @jeffreyseely @YesThisIsLion
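As a rough picture of what “neural dynamics & synchronization” could mean mechanically, here is a toy PyTorch sketch: neurons accumulate activation histories over internal ticks, and a representation is read out from how pairs of neurons co-vary over time. All shapes, names, and the shared per-neuron model here are illustrative assumptions, not the CTM release.

```python
import torch

# Toy sketch only: the tweet describes "thinking over time" via neural
# dynamics and synchronization. Shapes and names are illustrative
# assumptions, not the actual CTM implementation.

num_neurons, history, ticks = 64, 16, 50

# Each neuron carries a model over its own recent activation history,
# so computation unfolds across internal ticks rather than one forward pass.
neuron_model = torch.nn.Sequential(
    torch.nn.Linear(history, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)

traces = torch.zeros(num_neurons, 0)          # activation history, grown per tick
state = torch.randn(num_neurons, history)     # each neuron's sliding history window
for _ in range(ticks):
    activation = neuron_model(state)                      # (num_neurons, 1)
    traces = torch.cat([traces, activation], dim=1)       # append this tick
    state = torch.cat([state[:, 1:], activation], dim=1)  # slide the window

# Read out a representation from how pairs of neurons synchronize over time
# (here, an inner product of their activation traces).
i, j = torch.randint(num_neurons, (2, 128))
sync_features = (traces[i] * traces[j]).sum(dim=-1)  # (128,)
```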
11 replies · 74 reposts · 623 likes · 167.6K views
Yingchen Xu reposted
Tim Rocktäschel @_rockt
Proud to announce that Dr @LauraRuis defended her PhD thesis titled "Understanding and Evaluating Reasoning in Large Language Models" last week 🥳. Massive thanks to Noah Goodman and Emine Yilmaz for examining! As is customary, Laura received a personal mortarboard from @UCL_DARK. Details 👇
Tim Rocktäschel tweet media
6 replies · 12 reposts · 90 likes · 8.4K views
Yingchen Xu reposted
Sakana AI @SakanaAILabs
We’re excited to introduce ShinkaEvolve: an open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency.

Blog: sakana.ai/shinka-evolve/
Code: github.com/SakanaAI/Shink…

Like AlphaEvolve and its variants, our framework leverages LLMs to find state-of-the-art solutions to complex problems, but using orders of magnitude fewer resources!

Many evolutionary AI systems are powerful but act like brute-force engines, burning thousands of samples to find good solutions. This makes discovery slow and expensive. We took inspiration from the efficiency of nature. ‘Shinka’ (進化) is Japanese for evolution, and we designed our system to be just as resourceful.

On the classic circle packing optimization problem, ShinkaEvolve discovered a new state-of-the-art solution using only 150 samples. This is a big leap in efficiency compared to previous methods that required thousands of evaluations.

We applied ShinkaEvolve to a diverse set of hard problems with real-world applications:

1/ AIME Math Reasoning: It evolved sophisticated agentic scaffolds that significantly outperform strong baselines, discovering an entire Pareto frontier of solutions trading performance for efficiency.

2/ Competitive Programming: On ALE-Bench (a benchmark for NP-hard optimization problems), ShinkaEvolve took the best existing agent's solutions and improved them, turning a 5th-place solution on one task into a 2nd-place leaderboard rank in a competitive programming competition.

3/ LLM Training: We even turned ShinkaEvolve inward to improve LLMs themselves. It tackled the open challenge of designing load-balancing losses for Mixture-of-Experts (MoE) models. It discovered a novel loss function that leads to better expert specialization and consistently improves model performance and perplexity.

ShinkaEvolve achieves its remarkable sample-efficiency through three key innovations that work together: (1) an adaptive parent sampling strategy to balance exploration and exploitation, (2) novelty-based rejection filtering to avoid redundant work, and (3) a bandit-based LLM ensemble that dynamically picks the best model for the job.

By making ShinkaEvolve open-source and highly sample-efficient, our goal is to democratize access to advanced, open-ended discovery tools. Our vision for ShinkaEvolve is to be an easy-to-use companion tool to help scientists and engineers with their daily work. We believe that building more efficient, nature-inspired systems is key to unlocking the future of AI-driven scientific research. We are excited to see what the community builds with it!

Learn more in our technical report: arxiv.org/abs/2509.19349
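The three mechanisms named in that list are concrete enough to sketch. Below is a minimal, hypothetical Python rendering of each: adaptive parent sampling, novelty-based rejection filtering, and a UCB-style bandit over mutation LLMs. Function names, the program records, and the distance/weighting choices are illustrative assumptions, not ShinkaEvolve's actual API.

```python
import math
import random

def sample_parent(archive):
    """(1) Adaptive parent sampling: weight each archived program by fitness
    (exploitation) plus a bonus for rarely-expanded entries (exploration)."""
    weights = [max(p["fitness"], 0.0) + 1.0 / (1 + p["expansions"]) for p in archive]
    return random.choices(archive, weights=weights, k=1)[0]

def is_novel(candidate_code, archive, min_distance=0.1):
    """(2) Novelty-based rejection filtering: reject a mutation whose code is
    too similar to something already evaluated (toy token-overlap distance)."""
    cand = set(candidate_code.split())
    for p in archive:
        seen = set(p["code"].split())
        similarity = len(cand & seen) / max(1, len(cand | seen))
        if 1.0 - similarity < min_distance:
            return False  # redundant work: skip the expensive evaluation
    return True

def pick_llm(stats, c=1.0):
    """(3) Bandit-based LLM ensemble: pick the mutation model with the best
    UCB1 score on the improvement it has delivered so far."""
    total_pulls = sum(s["pulls"] for s in stats.values()) or 1
    def ucb(s):
        if s["pulls"] == 0:
            return float("inf")  # try every model at least once
        return s["reward"] / s["pulls"] + c * math.sqrt(math.log(total_pulls) / s["pulls"])
    return max(stats, key=lambda name: ucb(stats[name]))
```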
30 replies · 252 reposts · 1.4K likes · 357.2K views
Yingchen Xu reposted
Minqi Jiang @MinqiJiang
What if you kept asking an LLM to "make it better"? In some recent work at FAIR, we investigate how we can efficiently use RL to fine-tune LLMs to iteratively self-improve on their previous solutions at inference time.

Training for iterated self-improvement can be costly: the naive approach to training for K self-improvement steps leads to K times the number of rollout steps per episode.

We introduce Exploratory Iteration (ExIt), an RL-based automatic curriculum method that bootstraps diverse training distributions of self-improvement tasks by upcycling the LLM's own responses at previous turns as the starting points for both self-improvement and *self-divergence.*

To decide what task to train on next, the curriculum prioritizes sampling of partial turn histories that led to higher return variance in their GRPO group (a learnability score that comes for free). This automatic curriculum over the bootstrapped task space teaches the model how to perform iterated self-improvement while only ever training on single-step self-improvement tasks.

We look at ExIt's impact in both single-turn (contest math problems) and multi-turn (BFCLv3 multi-turn tasks) settings, as well as MLE-bench, where the LLM is run in a search scaffold to produce solutions to real Kaggle competitions.

Across these eval settings, we find ExIt produces models with greater capacity for inference-time self-improvement compared to GRPO. Notably, ExIt models can self-improve on test tasks for many more steps than the typical solution depth encountered during training, including a 22% improvement in MLE-bench performance compared to GRPO.
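The curriculum step described above, prioritizing partial histories by the return variance of their GRPO group, is simple enough to sketch. This is a minimal hypothetical Python buffer, not FAIR's code; the class name, capacity handling, and sampling rule are assumptions.

```python
import random

class SelfImprovementCurriculum:
    """Illustrative ExIt-style task buffer: stores upcycled partial turn
    histories, prioritized by a free learnability signal."""

    def __init__(self, capacity=10_000):
        self.buffer = []  # (partial_history, learnability) pairs
        self.capacity = capacity

    def add(self, partial_history, group_returns):
        """Upcycle a previous model response as a new starting point, scored
        by the return variance of its GRPO group (comes for free in training)."""
        mean = sum(group_returns) / len(group_returns)
        variance = sum((r - mean) ** 2 for r in group_returns) / len(group_returns)
        self.buffer.append((partial_history, variance))
        self.buffer.sort(key=lambda item: item[1], reverse=True)
        del self.buffer[self.capacity:]  # keep only the most learnable entries

    def sample(self):
        """Sample the next single-step self-improvement task, favoring
        high-variance (neither solved nor hopeless) starting points."""
        histories, scores = zip(*self.buffer)  # assumes add() was called first
        weights = [s + 1e-8 for s in scores]   # avoid an all-zero weight vector
        return random.choices(histories, weights=weights, k=1)[0]
```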
16 replies · 72 reposts · 405 likes · 40.9K views
Yingchen Xu reposted
Davide Paglieri @PaglieriDavide
"Always reasoning" (ReAct) isn't optimal for LLM agents! 🧠 Our new paper identifies a "Goldilocks" effect: planning too frequently or not enough degrades performance. We show how to train agents to learn to dynamically allocate test-time compute when needed for best results. 👇
Bartłomiej Cupiał @CupiaBart

Almost all agentic pipelines prompt LLMs to explicitly plan before every action (ReAct), but turns out this isn't optimal for Multi-Step RL 🤔 Why? In our new work we highlight a crucial issue with ReAct and show that we should make and follow plans instead🧵

2 replies · 19 reposts · 90 likes · 11.8K views
Yingchen Xu reposted
Jack Parker-Holder @jparkerholder
Genie 3 feels like a watershed moment for world models 🌐: we can now generate multi-minute, real-time interactive simulations of any imaginable world. This could be the key missing piece for embodied AGI… and it can also create beautiful beaches with my dog, playable real time
264 replies · 526 reposts · 4.8K likes · 2.1M views
Yingchen Xu reposted
Tim Rocktäschel @_rockt
Harder, Better, Faster, Stronger, Real-time! We are excited to reveal Genie 3, our most capable real-time foundational world model. Fantastic cross-team effort led by @jparkerholder and @shlomifruchter. Below some interactive worlds and capabilities that were highlights for me 🌎👇
53 replies · 183 reposts · 1.4K likes · 168.4K views
Yingchen Xu reposted
Harshit Sikchi @harshit_sikchi
We are hosting a social again this year at #RLC2025 (@RL_Conference) on August 5. Come meet people before the conference and find friends and collaborators. RSVP below if you can make it:
Harshit Sikchi tweet media
1 reply · 6 reposts · 39 likes · 3.7K views
Yingchen Xu reposted
Harshit Sikchi @harshit_sikchi
I will be at @RL_Conference presenting the work below on Fast Adaptation on Wednesday, August 6 at 10:20 am, and some works on unsupervised RL and imitation at the RLBrew workshop on August 5.
Harshit Sikchi @harshit_sikchi

Behavioral Foundation Models (BFMs) trained with RL are secretly more powerful than we think. BFMs directly output a policy believed to be near-optimal given any reward function. Our new work shows that they can actually do much better:

3 replies · 9 reposts · 105 likes · 6.6K views
Yingchen Xu reposted
Roberta Raileanu @robertarail
Excited to be in Edmonton for the @RL_Conference this week! Today I’ll be at @ibrlworkshop and @rlvg2025 giving two workshop talks and participating in the panels. Stop by to say hello!
📍 Inductive Biases in RL Workshop ⏰ 9:15am 🤖 LLM Whispers: Injecting Human Priors into RL Agents
📍 RL and Video Games Workshop ⏰ 2pm 🕹️ NetHack: A Grand Challenge for RL and LLM Agents Alike
Inductive Biases in RL @ibrlworkshop

🗓️ The IBRL Workshop kicks off tomorrow! 🎉 Join us at @RL_Conference @UAlberta to explore how Inductive Biases can boost 🚀 the performance of RL agents. 📄 Accepted papers: sites.google.com/view/ibrl-work… 📅 Full schedule: sites.google.com/view/ibrl-work… #ReinforcementLearning #RLC2025

1 reply · 6 reposts · 57 likes · 4.5K views
Yingchen Xu reposted
Reinforcement Learning & Video Games Workshop @RLC
We’re excited to announce our next speaker: Roberta Raileanu (@robertarail) from @GoogleDeepMind! Roberta will discuss NetHack: A Grand Challenge for RL and LLM Agents Alike. ⚔️ Join us on August 5th to learn how to develop agents capable of tackling open-ended environments!
Reinforcement Learning & Video Games Workshop @RLC tweet media
3 replies · 8 reposts · 106 likes · 6.4K views
Yingchen Xu reposted
Roberta Raileanu @robertarail
I’m building a new team at @GoogleDeepMind to work on Open-Ended Discovery! We’re looking for strong Research Scientists and Research Engineers to help us push the frontier of autonomously discovering novel artifacts such as new knowledge, capabilities, or algorithms, in an open-ended self-improving loop. We aim to work on ambitious research projects in a fast-paced manner. If this sounds appealing to you, apply using the link below by Friday, August 1st EOD: job-boards.greenhouse.io/deepmind/jobs/…
90 replies · 254 reposts · 2.5K likes · 344.7K views
Yingchen Xu reposted
Edward Grefenstette @egrefen
Do you have a PhD (or equivalent), or will you have one in the coming months (i.e. 2-3 months away from graduating)? Do you want to help build open-ended agents that help humans do human things better, rather than replace them? We're hiring 1-2 Research Scientists! Check the 🧵👇
9 replies · 38 reposts · 356 likes · 76.7K views
Yingchen Xu reposted
Nathan Herr @naitherr
Excited to introduce LLM-First Search (LFS) - a new paradigm where the language model takes the lead in reasoning and search! LFS is a self-directed search method that empowers LLMs to guide the exploration process themselves, without relying on predefined heuristics or fixed exploration schedules.
Why it matters:
🔍🤖 Self-directed: The model strikes a balance between exploration and exploitation using its own internal scoring system.
🔄🧠 Adaptive: Automatically adjusts to task difficulty with no tuning required.
📈🏆 Stronger performance: Achieves higher success rates on reasoning tasks.
⏱️📉 Efficient: Outperforms other LLM-augmented strategies in compute usage.
⚖️🚀 Scalable: Gains amplify with stronger models and more compute.
A big thanks to both @_rockt and @robertarail for their exceptional supervision!
Nathan Herr tweet media
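Mechanically, "the model guides the exploration itself" can be pictured as a best-first search whose priority function is the LLM. Here is a minimal hypothetical sketch; the score_and_expand callback (ask the model to rate a partial solution and propose scored successors) is an assumed interface, not the paper's actual one.

```python
import heapq

def llm_first_search(root, score_and_expand, is_solution, budget=100):
    """Best-first search where the LLM supplies the node scores."""
    frontier = [(-1.0, 0, root)]  # max-heap via negated LLM scores
    tie = 1                       # tiebreaker so nodes are never compared directly
    while frontier and budget > 0:
        _, _, node = heapq.heappop(frontier)
        if is_solution(node):
            return node
        # The LLM both scores children and decides how widely to branch,
        # replacing hand-tuned heuristics and fixed exploration schedules.
        for child, score in score_and_expand(node):
            heapq.heappush(frontier, (-score, tie, child))
            tie += 1
        budget -= 1
    return None
```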
2 replies · 23 reposts · 142 likes · 21.9K views