Yang Li

@LeYangco
PhD candidate @sjtu1896, research intern @AlibabaGroup, researching machine learning with a focus on generative models and optimization.

insane sequence of statements buried in an Alibaba tech report


🌶️ Some (perhaps) spicy thoughts. It’s been a while since my last tweet, but I wanted to write about how disorienting it has been to move from academia to an LLM lab 😅
The kind of research I was trained to do during my PhD almost doesn’t exist here. The obsession with mathematical elegance and novelty is mostly gone. Everything is about scaling data and compute.
For a while, that really got to me. At my lowest point, I felt like I’d lost interest in building LLMs altogether. I didn’t feel intellectually challenged anymore. What made this even stranger was that, at a technical level, things worked. If there was a capability I wanted to teach a model, scaling the right data and compute always got me there, no exception (so far).
But recently, I found a way to reconcile this with myself. I realized the real competition isn’t in the ML recipe anymore. Most teams do roughly the same thing. What actually matters is how fast you can iterate, test ideas, and recover from mistakes. And that speed is mostly backed by infrastructure 🏗️ Faster loops, fewer bugs, better tooling.
Seeing this made me excited again! Infra is its own deep, hard, and intellectually fun problem space. In 2026, I want to become an ML researcher who’s really good at infra. Then I’ll come back to ML problems with that edge, and I’ll be excited to share what I find 😌


🚨 Chinese researchers just published a paper that destroys every AI agent startup pitch deck. It's called ROME + ALE, and it exposes why every "AI agent company" you've heard of is building on quicksand. Here's what nobody's talking about:


🚀 Excited to share our latest work on RL4LLM systems. 🎉 ROLL Flash enables fully asynchronous overlap of generation, interaction, rewards, and training through Fine-grained Parallelism and Rollout–Train Decoupling.
1) 2.24× faster on RLVR; 2.72× faster on agentic tasks
2) Near-linear scaling: 8× GPUs → 7.6× throughput
3) Asynchronous Ratio balances utilization and sample freshness with minimal staleness cost
4) Supports off-policy algorithms (Decoupled PPO, TOPR, CISPO) with no performance loss
Join us. Star, try, contribute: let's scale LLM RL together! 🌟
🔗 Paper: arxiv.org/abs/2510.11345
💻 Code: github.com/alibaba/ROLL
#LLMs #ReinforcementLearning #RL4LLM #SystemOptimization #AgenticAI