Yingchen Xu

148 posts

Yingchen Xu

@YingchenX

CS PhD at @ucl_dark 👩‍💻 interning at @SakanaAILabs 🐠 | previously at @MetaAI deep reinforcement learning | world models | reasoning & planning 🤖️🎨⛰️

London, England · Joined July 2020
399 Following · 604 Followers
Pinned Tweet
Yingchen Xu @YingchenX
🌟We’re excited to announce H-GAP, a generalist model for humanoid control. Trained on large MoCap-derived data, it can generate diverse, natural motions & transfer skills to new tasks without fine-tuning! Paper: arxiv.org/abs/2312.02682 Website: ycxuyingchen.github.io/hgap/ [1/N]
4 replies · 31 reposts · 154 likes · 91K views
Yingchen Xu reposted
hardmaru @hardmaru
Excited to announce our MIT Press book “Neuroevolution: Harnessing Creativity in AI Agent Design” by Sebastian Risi (@risi1979), Yujin Tang (@yujin_tang), Risto Miikkulainen, and myself. We explore decades of work on evolving intelligent agents and show how neuroevolution can drive creativity in deep learning, RL, LLMs, and AI agents!

📖 Free open-access edition: neuroevolutionbook.com

In addition to our own works, this video features work by Jürgen Schmidhuber (@SchmidhuberAI), Seth Bling (@SethBling), Igor Karpov, Jacob Schrum, Yulu Gan (@yule_gan), Ken Stanley (@kenneth0stanley), Joel Lehman (@joelbot3000), Jeff Clune (@jeffclune), Nick Cheney (@CheneyLab), Richard Song (@XingyouSong), Chelsea Finn (@chelseabfinn), Julian Togelius (@togelius), Sam Earle (@Smearle_RH), Hod Lipson (@hodlipson), and Jean-Baptiste Mouret (@jb_mouret).
16 replies · 219 reposts · 1K likes · 161.8K views
Yingchen Xu reposted
Sebastian Risi @risi1979
I’m beyond excited to announce our MIT Press book on Neuroevolution! An HTML version is now available for free on neuroevolutionbook.com, with a print edition coming out later in 2026.

Real intelligence is not static; it evolves. For decades, the field of neuroevolution has pursued this necessary adaptability. Our book chronicles its development, from early concepts to its modern integration with deep learning and reinforcement learning, exploring its potential for understanding the origins of intelligence and its real-world applications.

And the companion webpage is more than just a book site! It comes equipped with interactive demos, videos, exercises, and tutorials to allow everyone to experience neuroevolution in action. Check it out and let us know what you think!

It was a pleasure to work on this book over the last 4+ years with David (@hardmaru), Yujin (@yujin_tang), and Risto. We are incredibly proud of the result and look forward to celebrating! We hope to connect with many of you at NeurIPS.

We are very grateful to Melanie Mitchell (@MelMitchell1), who provided a fantastic foreword. To quote her: “The next big thing in AI is coming, and I suspect that neuroevolution will be a major part of it”. We think so too!
Sebastian Risi tweet media
24 replies · 167 reposts · 647 likes · 96.4K views
Yingchen Xu reposted
Luke Darlow @LearningLukeD
I had to share this stunning gif! Do Continuous Thought Machines dream of electric sheep...? This is a UMAP projection showing the neurons of a CTM firing while generating text (5 tokens, with time to think between). Do you see the emergence of FAST and SLOW thoughts?
GIF
3 replies · 8 reposts · 71 likes · 6.7K views
Yingchen Xu reposted
A. H. Guzel @ahguzelUK
🎮 How can agents learn to generalize from limited offline data? We introduce iMac (Imagined Autocurricula) - training agents entirely in world models with emergent curricula!
GIF
1 reply · 19 reposts · 77 likes · 14.9K views
Yingchen Xu reposted
Sakana AI @SakanaAILabs
We are excited to share that “Continuous Thought Machines” has been accepted as a Spotlight at #NeurIPS2025! 🧠✨ The CTM is an AI that mimics biological brains by using neural dynamics & synchronization to think over time. It can solve complex mazes by building internal maps, gaze around images to classify them, and learn algorithms—all emergent from its core design. This is just the beginning. A hint of what we're exploring next… (video attached!) The team: @LearningLukeD @ciaran_regan_ @risi1979 @jeffreyseely @YesThisIsLion
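As a rough picture of what “neural dynamics & synchronization” could mean mechanically, here is a toy PyTorch sketch: neurons accumulate activation histories over internal ticks, and a representation is read out from how pairs of neurons co-vary over time. All shapes, names, and the shared per-neuron model here are illustrative assumptions, not the CTM release.

```python
import torch

# Toy sketch only: the tweet describes "thinking over time" via neural
# dynamics and synchronization. Shapes and names are illustrative
# assumptions, not the actual CTM implementation.

num_neurons, history, ticks = 64, 16, 50

# Each neuron carries a model over its own recent activation history,
# so computation unfolds across internal ticks rather than one forward pass.
neuron_model = torch.nn.Sequential(
    torch.nn.Linear(history, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)

traces = torch.zeros(num_neurons, 0)          # activation history, grown per tick
state = torch.randn(num_neurons, history)     # each neuron's sliding history window
for _ in range(ticks):
    activation = neuron_model(state)                      # (num_neurons, 1)
    traces = torch.cat([traces, activation], dim=1)       # append this tick
    state = torch.cat([state[:, 1:], activation], dim=1)  # slide the window

# Read out a representation from how pairs of neurons synchronize over time
# (here, an inner product of their activation traces).
i, j = torch.randint(num_neurons, (2, 128))
sync_features = (traces[i] * traces[j]).sum(dim=-1)  # (128,)
```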
11 replies · 74 reposts · 623 likes · 167.6K views
Yingchen Xu reposted
Tim Rocktäschel @_rockt
Proud to announce that Dr @LauraRuis defended her PhD thesis titled "Understanding and Evaluating Reasoning in Large Language Models" last week 🥳. Massive thanks to Noah Goodman and Emine Yilmaz for examining! As is customary, Laura received a personal mortarboard from @UCL_DARK. Details 👇
Tim Rocktäschel tweet media
6 replies · 12 reposts · 90 likes · 8.4K views
Yingchen Xu reposted
Sakana AI @SakanaAILabs
We’re excited to introduce ShinkaEvolve: an open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency.

Blog: sakana.ai/shinka-evolve/
Code: github.com/SakanaAI/Shink…

Like AlphaEvolve and its variants, our framework leverages LLMs to find state-of-the-art solutions to complex problems, but using orders of magnitude fewer resources!

Many evolutionary AI systems are powerful but act like brute-force engines, burning thousands of samples to find good solutions. This makes discovery slow and expensive. We took inspiration from the efficiency of nature. ‘Shinka’ (進化) is Japanese for evolution, and we designed our system to be just as resourceful.

On the classic circle packing optimization problem, ShinkaEvolve discovered a new state-of-the-art solution using only 150 samples. This is a big leap in efficiency compared to previous methods that required thousands of evaluations.

We applied ShinkaEvolve to a diverse set of hard problems with real-world applications:

1/ AIME Math Reasoning: It evolved sophisticated agentic scaffolds that significantly outperform strong baselines, discovering an entire Pareto frontier of solutions trading performance for efficiency.

2/ Competitive Programming: On ALE-Bench (a benchmark for NP-hard optimization problems), ShinkaEvolve took the best existing agent's solutions and improved them, turning a 5th-place solution on one task into a 2nd-place leaderboard rank in a competitive programming competition.

3/ LLM Training: We even turned ShinkaEvolve inward to improve LLMs themselves. It tackled the open challenge of designing load-balancing losses for Mixture-of-Experts (MoE) models. It discovered a novel loss function that leads to better expert specialization and consistently improves model performance and perplexity.

ShinkaEvolve achieves its remarkable sample-efficiency through three key innovations that work together: (1) an adaptive parent sampling strategy to balance exploration and exploitation, (2) novelty-based rejection filtering to avoid redundant work, and (3) a bandit-based LLM ensemble that dynamically picks the best model for the job.

By making ShinkaEvolve open-source and highly sample-efficient, our goal is to democratize access to advanced, open-ended discovery tools. Our vision for ShinkaEvolve is to be an easy-to-use companion tool to help scientists and engineers with their daily work. We believe that building more efficient, nature-inspired systems is key to unlocking the future of AI-driven scientific research. We are excited to see what the community builds with it!

Learn more in our technical report: arxiv.org/abs/2509.19349
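The three mechanisms named in that list are concrete enough to sketch. Below is a minimal, hypothetical Python rendering of each: adaptive parent sampling, novelty-based rejection filtering, and a UCB-style bandit over mutation LLMs. Function names, the program records, and the distance/weighting choices are illustrative assumptions, not ShinkaEvolve's actual API.

```python
import math
import random

def sample_parent(archive):
    """(1) Adaptive parent sampling: weight each archived program by fitness
    (exploitation) plus a bonus for rarely-expanded entries (exploration)."""
    weights = [max(p["fitness"], 0.0) + 1.0 / (1 + p["expansions"]) for p in archive]
    return random.choices(archive, weights=weights, k=1)[0]

def is_novel(candidate_code, archive, min_distance=0.1):
    """(2) Novelty-based rejection filtering: reject a mutation whose code is
    too similar to something already evaluated (toy token-overlap distance)."""
    cand = set(candidate_code.split())
    for p in archive:
        seen = set(p["code"].split())
        similarity = len(cand & seen) / max(1, len(cand | seen))
        if 1.0 - similarity < min_distance:
            return False  # redundant work: skip the expensive evaluation
    return True

def pick_llm(stats, c=1.0):
    """(3) Bandit-based LLM ensemble: pick the mutation model with the best
    UCB1 score on the improvement it has delivered so far."""
    total_pulls = sum(s["pulls"] for s in stats.values()) or 1
    def ucb(s):
        if s["pulls"] == 0:
            return float("inf")  # try every model at least once
        return s["reward"] / s["pulls"] + c * math.sqrt(math.log(total_pulls) / s["pulls"])
    return max(stats, key=lambda name: ucb(stats[name]))
```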
30 replies · 252 reposts · 1.4K likes · 357.2K views
Yingchen Xu reposted
Minqi Jiang @MinqiJiang
What if you kept asking an LLM to "make it better"? In some recent work at FAIR, we investigate how we can efficiently use RL to fine-tune LLMs to iteratively self-improve on their previous solutions at inference time.

Training for iterated self-improvement can be costly: the naive approach to training for K self-improvement steps leads to K times the number of rollout steps per episode.

We introduce Exploratory Iteration (ExIt), an RL-based automatic curriculum method that bootstraps diverse training distributions of self-improvement tasks by upcycling the LLM's own responses at previous turns as the starting points for both self-improvement and *self-divergence.*

To decide what task to train on next, the curriculum prioritizes sampling of partial turn histories that led to higher return variance in their GRPO group (a learnability score that comes for free). This automatic curriculum over the bootstrapped task space teaches the model how to perform iterated self-improvement while only ever training on single-step self-improvement tasks.

We look at ExIt's impact in both single-turn (contest math problems) and multi-turn (BFCLv3 multi-turn tasks) settings, as well as MLE-bench, where the LLM is run in a search scaffold to produce solutions to real Kaggle competitions.

Across these eval settings, we find ExIt produces models with greater capacity for inference-time self-improvement compared to GRPO. Notably, ExIt models can self-improve on test tasks for many more steps than the typical solution depth encountered during training, including a 22% improvement in MLE-bench performance compared to GRPO.
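The curriculum step described above, prioritizing partial histories by the return variance of their GRPO group, is simple enough to sketch. This is a minimal hypothetical Python buffer, not FAIR's code; the class name, capacity handling, and sampling rule are assumptions.

```python
import random

class SelfImprovementCurriculum:
    """Illustrative ExIt-style task buffer: stores upcycled partial turn
    histories, prioritized by a free learnability signal."""

    def __init__(self, capacity=10_000):
        self.buffer = []  # (partial_history, learnability) pairs
        self.capacity = capacity

    def add(self, partial_history, group_returns):
        """Upcycle a previous model response as a new starting point, scored
        by the return variance of its GRPO group (comes for free in training)."""
        mean = sum(group_returns) / len(group_returns)
        variance = sum((r - mean) ** 2 for r in group_returns) / len(group_returns)
        self.buffer.append((partial_history, variance))
        self.buffer.sort(key=lambda item: item[1], reverse=True)
        del self.buffer[self.capacity:]  # keep only the most learnable entries

    def sample(self):
        """Sample the next single-step self-improvement task, favoring
        high-variance (neither solved nor hopeless) starting points."""
        histories, scores = zip(*self.buffer)  # assumes add() was called first
        weights = [s + 1e-8 for s in scores]   # avoid an all-zero weight vector
        return random.choices(histories, weights=weights, k=1)[0]
```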
16 replies · 72 reposts · 405 likes · 40.9K views
Yingchen Xu reposted
Davide Paglieri @PaglieriDavide
"Always reasoning" (ReAct) isn't optimal for LLM agents! 🧠 Our new paper identifies a "Goldilocks" effect: planning too frequently or not enough degrades performance. We show how to train agents to learn to dynamically allocate test-time compute when needed for best results. 👇
Bartłomiej Cupiał @CupiaBart

Almost all agentic pipelines prompt LLMs to explicitly plan before every action (ReAct), but turns out this isn't optimal for Multi-Step RL 🤔 Why? In our new work we highlight a crucial issue with ReAct and show that we should make and follow plans instead🧵

2 replies · 19 reposts · 90 likes · 11.8K views
Yingchen Xu reposted
Jack Parker-Holder @jparkerholder
Genie 3 feels like a watershed moment for world models 🌐: we can now generate multi-minute, real-time interactive simulations of any imaginable world. This could be the key missing piece for embodied AGI… and it can also create beautiful beaches with my dog, playable real time
264 replies · 526 reposts · 4.8K likes · 2.1M views
Yingchen Xu reposted
Tim Rocktäschel @_rockt
Harder, Better, Faster, Stronger, Real-time! We are excited to reveal Genie 3, our most capable real-time foundational world model. Fantastic cross-team effort led by @jparkerholder and @shlomifruchter. Below some interactive worlds and capabilities that were highlights for me 🌎👇
53 replies · 183 reposts · 1.4K likes · 168.4K views
Yingchen Xu reposted
Harshit Sikchi @harshit_sikchi
We are hosting a social again this year at #RLC2025 (@RL_Conference) on August 5. Come meet people before the conference and find friends and collaborators. RSVP below if you can make it:
Harshit Sikchi tweet media
1 reply · 6 reposts · 39 likes · 3.7K views
Yingchen Xu reposted
Harshit Sikchi @harshit_sikchi
I will be at @RL_Conference presenting the work below on Fast Adaptation on Wednesday, August 6 at 10:20 am, and some works on unsupervised RL and imitation at the RLBrew workshop on August 5.
Harshit Sikchi @harshit_sikchi

Behavioral Foundation Models (BFMs) trained with RL are secretly more powerful than we think. BFMs directly output a policy believed to be near-optimal given any reward function. Our new work shows that they can actually do much better:

3 replies · 9 reposts · 105 likes · 6.6K views
Yingchen Xu reposted
Roberta Raileanu @robertarail
Excited to be in Edmonton for the @RL_Conference this week! Today I’ll be at @ibrlworkshop and @rlvg2025 giving two workshop talks and participating in the panels. Stop by to say hello!
📍 Inductive Biases in RL Workshop ⏰ 9:15am 🤖 LLM Whispers: Injecting Human Priors into RL Agents
📍 RL and Video Games Workshop ⏰ 2pm 🕹️ NetHack: A Grand Challenge for RL and LLM Agents Alike
Inductive Biases in RL @ibrlworkshop

🗓️ The IBRL Workshop kicks off tomorrow! 🎉 Join us at @RL_Conference @UAlberta to explore how Inductive Biases can boost 🚀 the performance of RL agents. 📄 Accepted papers: sites.google.com/view/ibrl-work… 📅 Full schedule: sites.google.com/view/ibrl-work… #ReinforcementLearning #RLC2025

1 reply · 6 reposts · 57 likes · 4.5K views
Yingchen Xu reposted
Reinforcement Learning & Video Games Workshop @RLC
We’re excited to announce our next speaker: Roberta Raileanu (@robertarail) from @GoogleDeepMind! Roberta will discuss NetHack: A Grand Challenge for RL and LLM Agents Alike. ⚔️ Join us on August 5th to learn how to develop agents capable of tackling open-ended environments!
Reinforcement Learning & Video Games Workshop @RLC tweet media
3 replies · 8 reposts · 106 likes · 6.4K views
Yingchen Xu reposted
Roberta Raileanu @robertarail
I’m building a new team at @GoogleDeepMind to work on Open-Ended Discovery! We’re looking for strong Research Scientists and Research Engineers to help us push the frontier of autonomously discovering novel artifacts such as new knowledge, capabilities, or algorithms, in an open-ended self-improving loop. We aim to work on ambitious research projects in a fast-paced manner. If this sounds appealing to you, apply using the link below by Friday, August 1st EOD: job-boards.greenhouse.io/deepmind/jobs/…
90 replies · 254 reposts · 2.5K likes · 344.7K views
Yingchen Xu reposted
Edward Grefenstette @egrefen
Do you have a PhD (or equivalent), or will you have one in the coming months (i.e. 2-3 months away from graduating)? Do you want to help build open-ended agents that help humans do human things better, rather than replace them? We're hiring 1-2 Research Scientists! Check the 🧵👇
9 replies · 38 reposts · 356 likes · 76.7K views
Yingchen Xu reposted
Nathan Herr @naitherr
Excited to introduce LLM-First Search (LFS) - a new paradigm where the language model takes the lead in reasoning and search! LFS is a self-directed search method that empowers LLMs to guide the exploration process themselves, without relying on predefined heuristics or fixed exploration schedules.
Why it matters:
🔍🤖 Self-directed: The model strikes a balance between exploration and exploitation using its own internal scoring system.
🔄🧠 Adaptive: Automatically adjusts to task difficulty with no tuning required.
📈🏆 Stronger performance: Achieves higher success rates on reasoning tasks.
⏱️📉 Efficient: Outperforms other LLM-augmented strategies in compute usage.
⚖️🚀 Scalable: Gains amplify with stronger models and more compute.
A big thanks to both @_rockt and @robertarail for their exceptional supervision!
Nathan Herr tweet media
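Mechanically, "the model guides the exploration itself" can be pictured as a best-first search whose priority function is the LLM. Here is a minimal hypothetical sketch; the score_and_expand callback (ask the model to rate a partial solution and propose scored successors) is an assumed interface, not the paper's actual one.

```python
import heapq

def llm_first_search(root, score_and_expand, is_solution, budget=100):
    """Best-first search where the LLM supplies the node scores."""
    frontier = [(-1.0, 0, root)]  # max-heap via negated LLM scores
    tie = 1                       # tiebreaker so nodes are never compared directly
    while frontier and budget > 0:
        _, _, node = heapq.heappop(frontier)
        if is_solution(node):
            return node
        # The LLM both scores children and decides how widely to branch,
        # replacing hand-tuned heuristics and fixed exploration schedules.
        for child, score in score_and_expand(node):
            heapq.heappush(frontier, (-score, tie, child))
            tie += 1
        budget -= 1
    return None
```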
2 replies · 23 reposts · 142 likes · 21.9K views