slime
@slime_framework
The LLM post-training framework for RL Scaling. https://t.co/4ILpx8hfKN

40 posts · Joined September 2025
8 Following · 827 Followers
Pinned Tweet
slime @slime_framework
slime v0.2.3 is here! 🚀 The biggest update in this release is the YAML-based --sglang-config. It enables much more flexible SGLang configuration for advanced rollout setups, including:
- PD disaggregation with different parallelism for prefill / decode
- EPD
- serving multiple different models
- launching multiple routers in one deployment
We hope v0.2.3 gives you much more freedom in building efficient rollout systems.
Release: github.com/THUDM/slime/re…
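To make the PD-disaggregation idea concrete, here is a minimal Python sketch of what "different parallelism for prefill / decode" means for GPU planning. The config keys (`prefill`, `decode`, `num_servers`, `tp_size`) are assumptions for illustration only, not slime's actual `--sglang-config` YAML schema — consult the slime release notes for the real format.

```python
# Hypothetical config mirroring a YAML --sglang-config file.
# Key names here are illustrative assumptions, NOT slime's real schema.
SGLANG_CONFIG = {
    "prefill": {"num_servers": 2, "tp_size": 4},  # wider TP for long prompts
    "decode":  {"num_servers": 4, "tp_size": 2},  # more replicas, smaller TP
}

def gpus_required(config: dict) -> int:
    """Total GPUs needed for a PD-disaggregated deployment."""
    return sum(grp["num_servers"] * grp["tp_size"] for grp in config.values())

def assign_gpus(config: dict) -> dict:
    """Assign contiguous GPU-id ranges to each server, prefill group first."""
    plan, next_gpu = {}, 0
    for role, grp in config.items():
        plan[role] = []
        for _ in range(grp["num_servers"]):
            plan[role].append(list(range(next_gpu, next_gpu + grp["tp_size"])))
            next_gpu += grp["tp_size"]
    return plan

print(gpus_required(SGLANG_CONFIG))  # 2*4 + 4*2 = 16
```

The point of the split: prefill and decode have different compute profiles, so giving each role its own server count and TP size lets you size the two pools independently instead of compromising on one shared layout.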
slime retweeted
Ling Yang @LingYang_PU
What if your AI agent got better just by talking to you? Introducing OpenClaw-RL — a fully async RL framework that turns your everyday conversations into training signals. Your agent learns your habits, your workflows, your preferences. Privately. Continuously. #Clawdbot #openclaw
🔑 Two learning modes:
• Binary RL — likes/dislikes become rewards
• On-Policy Distillation — your textual feedback becomes token-level guidance
Self-hosted. Zero API keys. Your data never leaves your machine.
👉 github.com/Gen-Verse/Open…
slime @slime_framework
GLM-5 support just landed in slime — and the RL infra optimizations mentioned in the GLM-5 tech report are all here. Give it a try: github.com/THUDM/slime/pu…
Z.ai @Zai_org

Presenting the GLM-5 Technical Report! arxiv.org/abs/2602.15763
After the launch of GLM-5, we’re pulling back the curtain on how it was built. Key innovations include:
- DSA Adoption: Significantly reduces training and inference costs while preserving long-context fidelity
- Asynchronous RL Infrastructure: Drastically improves post-training efficiency by decoupling generation from training
- Agent RL Algorithms: Enables the model to learn from complex, long-horizon interactions more effectively
Through these innovations, GLM-5 achieves SOTA performance among open-source models, with particularly strong results in real-world software engineering tasks.

slime @slime_framework
It’s finally here — GLM-5 just dropped! slime ended up doing a lot more heavy lifting for GLM-5 than before, and we’re super happy about it. We’ll add GLM-5 support in slime next… after a short recharge from the rush :p Happy Chinese New Year!
Z.ai @Zai_org

Introducing GLM-5: From Vibe Coding to Agentic Engineering
GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5, it scales from 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens.
Try it now: chat.z.ai
Weights: huggingface.co/zai-org/GLM-5
Tech Blog: z.ai/blog/glm-5
OpenRouter (Previously Pony Alpha): openrouter.ai/z-ai/glm-5
Rolling out from Coding Plan Max users: z.ai/subscribe

slime @slime_framework
slime v0.2.2 is out! This release brings multiple memory + performance optimizations, plus major new capabilities:
• Int4-QAT training
• Full R3 (Rollout Routing Replay) support with DeepEP + MTP
• Upgraded to SGLang v0.5.7 and the Megatron dev branch
Huge thanks to everyone who contributed! github.com/THUDM/slime/re…
slime @slime_framework
slime v0.2.1 is out 🎉 Highlights:
- VLM + FSDP: true on-policy training on Qwen3-VL (dense)
- Enable PD-disaggregation during rollout
- Add DP-attention to rollout routing replay (R3)
- Upgrade SGLang → v0.5.6
github.com/THUDM/slime/re…
slime @slime_framework
Congrats! 🎉🎉🎉
Ying Sheng @ying11231

We've been running @radixark for a few months, started by many core developers in SGLang @lmsysorg and its extended ecosystem (slime @slime_framework, AReaL @jxwuyi).

I left @xai in August — a place where I built deep emotions and countless beautiful memories. It was the best place I’ve ever worked, the place I watched grow from a few dozen people to hundreds, and it truly felt like home. What pushed me to make such a hard decision is the momentum of building SGLang open source and the mission of creating an ambitious future, within an open spirit that I learnt from my first job at @databricks after my PhD.

We started SGLang in the summer of 2023 and made it public in January 2024. Over the past 2 years, hundreds of people have made great efforts to get to where we are today. We experienced several waves of growth after its first release. I still remember the many dark nights in the summer of 2024 that I spent debugging with @lm_zheng, @lsyincs, and @zhyncs42, while @ispobaoke single-handedly took on DeepSeek inference optimizations and @GenAI_is_real and the community strike team tag-teamed on-call shifts non-stop. There are so many more who have joined that I'm out of space to call out, but they're recorded on the GitHub contributor list forever.

The demands grow exponentially, and we have been pushed to make it a dedicated effort supported by RadixArk. It’s the step-by-step journey of a thousand miles that has carried us here today, and the same relentless Long March that will lead us into the tens of thousands of miles yet to come. The story never stops growing.

Over the past year, we’ve seen something very clear: the world is full of people eager to build AI, but the infrastructure that makes it possible is not shared. The most advanced inference and training stacks live inside a few companies. Everyone else is forced to rebuild the same schedulers, compilers, serving engines, and training pipelines again and again — often under enormous pressure, with lots of duplicated effort and wasted insight.

RadixArk was born to change that. Today, we’re building an infrastructure-first, deep-tech company with a simple and ambitious mission: "Make frontier-level AI infrastructure open and accessible to everyone." If the two values below resonate with you, come talk to us:
(1) Engineering as an art. Infrastructure is a first-class citizen at RadixArk. We care about elegant design and code that lasts. Beneath every line of code lies the soul of the engineer who wrote it.
(2) A belief in openness. We share what we build. We bet on long-term compounding through community, contribution, and giving more than we take. A product is defined by its users, yet it truly comes alive the moment functionality transcends mere utility and begins to embody aesthetics.

Thanks to all the miles (the name of our first released RL framework; see below). radixark.ai

slime @slime_framework
We’ve added SGLang PD disaggregation to slime! Use --prefill-num-servers to split prefill and decode servers, making multi-turn RL rollouts more controllable under heavy prefill load. github.com/THUDM/slime/pu…
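The `--prefill-num-servers` flag is real per the post above; the routing sketch below is a hedged illustration of the idea, not slime's implementation. Assuming a flat pool of rollout server URLs (hypothetical names), it splits the first N off as dedicated prefill servers and round-robins each phase within its own group, so heavy prefill load can't starve in-flight multi-turn decodes.

```python
# Illustrative sketch of prefill/decode server splitting, in the spirit of
# slime's --prefill-num-servers. Routing logic is a guess for illustration;
# slime's actual dispatch lives in its router.
from itertools import cycle

def split_servers(urls: list[str], prefill_num_servers: int):
    """Split a server pool: first N handle prefill, the rest handle decode."""
    if not 0 < prefill_num_servers < len(urls):
        raise ValueError("need at least one prefill and one decode server")
    return urls[:prefill_num_servers], urls[prefill_num_servers:]

# Hypothetical endpoints for a 6-server deployment.
servers = [f"http://rollout-{i}:30000" for i in range(6)]
prefill, decode = split_servers(servers, 2)

prefill_rr, decode_rr = cycle(prefill), cycle(decode)

def route(phase: str) -> str:
    """Pick the next server for a request phase, round-robin per group."""
    return next(prefill_rr if phase == "prefill" else decode_rr)

print(route("prefill"))  # http://rollout-0:30000
print(route("decode"))   # http://rollout-2:30000
```

The design point is isolation: a burst of long prompts only queues on the prefill group, keeping decode latency for ongoing multi-turn rollouts predictable.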
slime retweeted
Yiping Wang @ypwang61
8B model can outperform AlphaEvolve on open optimization problems by scaling compute for inference or test-time RL 🚀!
⭕ Circle packing:
AlphaEvolve (Gemini-2.0-Flash/Pro): 2.63586276
Ours (DeepSeek-R1-0528-Qwen3-8B): 2.63598308
🔗 in 🧵 [1/n]
slime @slime_framework
slime v0.2.0 is here 🎉 Huge thanks to all contributors & users who pushed this release forward ❤️
Highlights:
• New FSDP training backend
• Full-stack FP8 (train + infer) & MTP training during RL
• Tools to reduce train–infer mismatch: custom IS, routing replay (R2/R3), true on-policy on FSDP
• Performance improvements: amem + CUDA Graphs offload, faster FP8 weight updates
• New examples: fully async, multi-agent, on-policy distillation, retool, ...
🔗 github.com/THUDM/slime/re…