We built a fork of @NovaSkyAI's SkyRL that makes SGLang by @lmsysorg a fully supported rollout engine and integrates @_xjdr's nmoe as a training backend, providing full support for MoE RL on B200
github.com/ai-blaise/nmoe
github.com/ai-blaise/sgla…
github.com/ai-blaise/SkyRL

High-performance forward and backward G1 attention gate kernels, based on @Alibaba_Qwen's Gated Attention research
github.com/ai-blaise/nmoe…
github.com/ai-blaise/nmoe…
github.com/ai-blaise/nmoe…
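
For readers unfamiliar with the G1 gate, here is a minimal pure-Python reference sketch of the idea as we understand it from the Gated Attention research: a sigmoid gate computed from the layer input is multiplied elementwise into the scaled dot-product attention output. This is not the code in the linked kernels; it is single-head and unmasked, and the helper names and the `w_gate` parameter are hypothetical, chosen only for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def sdpa(q, k, v):
    """Single-head scaled dot-product attention, no mask."""
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s * scale for s in row]) for row in scores]
    return matmul(weights, v)

def g1_gated_attention(x, w_q, w_k, w_v, w_gate):
    """Assumed G1 form: sdpa(...) * sigmoid(x @ w_gate), elementwise."""
    attn = sdpa(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
    gate = [[sigmoid(g) for g in row] for row in matmul(x, w_gate)]
    return [[a * g for a, g in zip(a_row, g_row)] for a_row, g_row in zip(attn, gate)]
```

With `w_gate` set to all zeros, every gate value is sigmoid(0) = 0.5, so the gated output is exactly half the plain attention output, which makes a quick sanity check. A fused kernel would compute the same product in one pass and also provide the backward pass.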

Two large-scale synthetic rerollout datasets based on @NVIDIAAI's Nemotron-Agentic-Tool-Use-v1, generated with @deepseek_ai V3.2
huggingface.co/datasets/Blais…
huggingface.co/datasets/Blais…

The parser we used to generate the rerollout datasets, with a built-in TUI data viewer
github.com/ai-blaise/data…