Xiaohan Fu retweeted

🌀Agent Learning via Early Experience🌀
📝: arxiv.org/abs/2510.08558
- SFT for agents is sparse; RL on long horizons is hard
We provide new mid-training signals that work:
1) Implicit next-state world modeling task
2) Self-reflection on alternate states
- Strong improvements across 8 environments and multiple model families
- Works well for subsequent RL!
🧵1/5
