slime
@slime_framework
The LLM post-training framework for RL Scaling. https://t.co/4ILpx8hfKN

40 posts · Joined September 2025
8 Following · 827 Followers
Pinned Tweet
slime @slime_framework
slime v0.2.3 is here! 🚀 The biggest update in this release is the YAML-based --sglang-config. It enables much more flexible SGLang configuration for advanced rollout setups, including:
- PD disaggregation with different parallelism for prefill / decode
- EPD
- serving multiple different models
- launching multiple routers in one deployment
We hope v0.2.3 gives you much more freedom in building efficient rollout systems.
Release: github.com/THUDM/slime/re…
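To make the PD-disaggregation idea concrete, here is a minimal Python sketch of what "different parallelism for prefill / decode" means for GPU planning. The config keys (`prefill`, `decode`, `num_servers`, `tp_size`) are assumptions for illustration only, not slime's actual `--sglang-config` YAML schema — consult the slime release notes for the real format.

```python
# Hypothetical config mirroring a YAML --sglang-config file.
# Key names here are illustrative assumptions, NOT slime's real schema.
SGLANG_CONFIG = {
    "prefill": {"num_servers": 2, "tp_size": 4},  # wider TP for long prompts
    "decode":  {"num_servers": 4, "tp_size": 2},  # more replicas, smaller TP
}

def gpus_required(config: dict) -> int:
    """Total GPUs needed for a PD-disaggregated deployment."""
    return sum(grp["num_servers"] * grp["tp_size"] for grp in config.values())

def assign_gpus(config: dict) -> dict:
    """Assign contiguous GPU-id ranges to each server, prefill group first."""
    plan, next_gpu = {}, 0
    for role, grp in config.items():
        plan[role] = []
        for _ in range(grp["num_servers"]):
            plan[role].append(list(range(next_gpu, next_gpu + grp["tp_size"])))
            next_gpu += grp["tp_size"]
    return plan

print(gpus_required(SGLANG_CONFIG))  # 2*4 + 4*2 = 16
```

The point of the split: prefill and decode have different compute profiles, so giving each role its own server count and TP size lets you size the two pools independently instead of compromising on one shared layout.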
slime retweeted
Ling Yang @LingYang_PU
What if your AI agent got better just by talking to you? Introducing OpenClaw-RL — a fully async RL framework that turns your everyday conversations into training signals. Your agent learns your habits, your workflows, your preferences. Privately. Continuously. #Clawdbot #openclaw
🔑 Two learning modes:
• Binary RL — likes/dislikes become rewards
• On-Policy Distillation — your textual feedback becomes token-level guidance
Self-hosted. Zero API keys. Your data never leaves your machine.
👉 github.com/Gen-Verse/Open…
slime @slime_framework
GLM-5 support just landed in slime — and the RL infra optimizations mentioned in the GLM-5 tech report are all here. Give it a try: github.com/THUDM/slime/pu…
Z.ai @Zai_org

Presenting the GLM-5 Technical Report! arxiv.org/abs/2602.15763
After the launch of GLM-5, we’re pulling back the curtain on how it was built. Key innovations include:
- DSA Adoption: Significantly reduces training and inference costs while preserving long-context fidelity
- Asynchronous RL Infrastructure: Drastically improves post-training efficiency by decoupling generation from training
- Agent RL Algorithms: Enables the model to learn from complex, long-horizon interactions more effectively
Through these innovations, GLM-5 achieves SOTA performance among open-source models, with particularly strong results in real-world software engineering tasks.

slime @slime_framework
It’s finally here — GLM-5 just dropped! slime ended up doing a lot more heavy lifting for GLM-5 than before, and we’re super happy about it. We’ll add GLM-5 support in slime next… after a short recharge from the rush :p Happy Chinese New Year!
Z.ai @Zai_org

Introducing GLM-5: From Vibe Coding to Agentic Engineering
GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5, it scales from 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens.
Try it now: chat.z.ai
Weights: huggingface.co/zai-org/GLM-5
Tech Blog: z.ai/blog/glm-5
OpenRouter (Previously Pony Alpha): openrouter.ai/z-ai/glm-5
Rolling out from Coding Plan Max users: z.ai/subscribe

slime @slime_framework
slime v0.2.2 is out! This release brings multiple memory + performance optimizations, plus major new capabilities:
• Int4-QAT training
• Full R3 (Rollout Routing Replay) support with DeepEP + MTP
• Upgraded to SGLang v0.5.7 and the Megatron dev branch
Huge thanks to everyone who contributed! github.com/THUDM/slime/re…
slime @slime_framework
slime v0.2.1 is out 🎉 Highlights:
- VLM + FSDP: true on-policy training on Qwen3-VL (dense)
- Enable PD-disaggregation during rollout
- Add DP-attention to rollout routing replay (R3)
- Upgrade SGLang → v0.5.6
github.com/THUDM/slime/re…
slime @slime_framework
Congrats! 🎉🎉🎉
Ying Sheng @ying11231

We've been running @radixark for a few months, started by many core developers in SGLang @lmsysorg and its extended ecosystem (slime @slime_framework, AReaL @jxwuyi).

I left @xai in August — a place where I built deep emotions and countless beautiful memories. It was the best place I’ve ever worked, the place I watched grow from a few dozen people to hundreds, and it truly felt like home. What pushed me to make such a hard decision is the momentum of building SGLang open source and the mission of creating an ambitious future, within an open spirit that I learnt from my first job at @databricks after my PhD.

We started SGLang in the summer of 2023 and made it public in January 2024. Over the past 2 years, hundreds of people have made great efforts to get to where we are today. We experienced several waves of growth after its first release. I still remember the many dark nights in the summer of 2024 that I spent debugging with @lm_zheng, @lsyincs, and @zhyncs42, while @ispobaoke single-handedly took on DeepSeek inference optimizations and @GenAI_is_real and the community strike team tag-teamed on-call shifts non-stop. There are so many more who have joined that I'm out of space to call out, but they're recorded on the GitHub contributor list forever.

The demands grow exponentially, and we have been pushed to make it a dedicated effort supported by RadixArk. It’s the step-by-step journey of a thousand miles that has carried us here today, and the same relentless Long March that will lead us into the tens of thousands of miles yet to come. The story never stops growing.

Over the past year, we’ve seen something very clear: the world is full of people eager to build AI, but the infrastructure that makes it possible is not shared. The most advanced inference and training stacks live inside a few companies. Everyone else is forced to rebuild the same schedulers, compilers, serving engines, and training pipelines again and again — often under enormous pressure, with lots of duplicated effort and wasted insight.

RadixArk was born to change that. Today, we’re building an infrastructure-first, deep-tech company with a simple and ambitious mission: "Make frontier-level AI infrastructure open and accessible to everyone." If the two values below resonate with you, come talk to us:
(1) Engineering as an art. Infrastructure is a first-class citizen at RadixArk. We care about elegant design and code that lasts. Beneath every line of code lies the soul of the engineer who wrote it.
(2) A belief in openness. We share what we build. We bet on long-term compounding through community, contribution, and giving more than we take. A product is defined by its users, yet it truly comes alive the moment functionality transcends mere utility and begins to embody aesthetics.

Thanks to all the miles (the name of our first released RL framework; see below). radixark.ai

slime @slime_framework
We’ve added SGLang PD disaggregation to slime! Use --prefill-num-servers to split prefill and decode servers, making multi-turn RL rollouts more controllable under heavy prefill load. github.com/THUDM/slime/pu…
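The `--prefill-num-servers` flag is real per the post above; the routing sketch below is a hedged illustration of the idea, not slime's implementation. Assuming a flat pool of rollout server URLs (hypothetical names), it splits the first N off as dedicated prefill servers and round-robins each phase within its own group, so heavy prefill load can't starve in-flight multi-turn decodes.

```python
# Illustrative sketch of prefill/decode server splitting, in the spirit of
# slime's --prefill-num-servers. Routing logic is a guess for illustration;
# slime's actual dispatch lives in its router.
from itertools import cycle

def split_servers(urls: list[str], prefill_num_servers: int):
    """Split a server pool: first N handle prefill, the rest handle decode."""
    if not 0 < prefill_num_servers < len(urls):
        raise ValueError("need at least one prefill and one decode server")
    return urls[:prefill_num_servers], urls[prefill_num_servers:]

# Hypothetical endpoints for a 6-server deployment.
servers = [f"http://rollout-{i}:30000" for i in range(6)]
prefill, decode = split_servers(servers, 2)

prefill_rr, decode_rr = cycle(prefill), cycle(decode)

def route(phase: str) -> str:
    """Pick the next server for a request phase, round-robin per group."""
    return next(prefill_rr if phase == "prefill" else decode_rr)

print(route("prefill"))  # http://rollout-0:30000
print(route("decode"))   # http://rollout-2:30000
```

The design point is isolation: a burst of long prompts only queues on the prefill group, keeping decode latency for ongoing multi-turn rollouts predictable.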
slime retweeted
Yiping Wang @ypwang61
8B model can outperform AlphaEvolve on open optimization problems by scaling compute for inference or test-time RL 🚀!
⭕ Circle packing:
AlphaEvolve (Gemini-2.0-Flash/Pro): 2.63586276
Ours (DeepSeek-R1-0528-Qwen3-8B): 2.63598308
🔗 in 🧵 [1/n]
slime @slime_framework
slime v0.2.0 is here 🎉 Huge thanks to all contributors & users who pushed this release forward ❤️
Highlights:
• New FSDP training backend
• Full-stack FP8 (train + infer) & MTP training during RL
• Tools to reduce train–infer mismatch: custom IS, routing replay (R2/R3), true on-policy on FSDP
• Performance improvements: amem + CUDA Graphs offload, faster FP8 weight updates
• New examples: fully async, multi-agent, on-policy distillation, retool, ...
🔗 github.com/THUDM/slime/re…