RadixArk

16 posts

@radixark

SHIP AI FOR ALL.

Joined November 2025
11 Following · 2.2K Followers
Pinned Tweet
RadixArk @radixark
Hello, world!
7 replies · 1 repost · 110 likes · 17.5K views
RadixArk reposted
Andrew Ng @AndrewYNg
New course: Efficient Inference with SGLang: Text and Image Generation, built in partnership with LMSys @lmsysorg and RadixArk @radixark, and taught by Richard Chen @richardczl, a Member of Technical Staff at RadixArk.

Running LLMs in production is expensive, and much of that cost comes from redundant computation. This short course teaches you to eliminate that waste using SGLang, an open-source inference framework that caches computation already done and reuses it across future requests. When ten users share the same system prompt, SGLang processes it once, not ten times. The speedups compound quickly, especially when there's a lot of shared context across requests.

Skills you'll gain:
- Implement a KV cache from scratch to eliminate redundant computation within a single request
- Scale caching across users and requests with RadixAttention, so shared context is only processed once
- Accelerate image generation with diffusion models using SGLang's caching and multi-GPU parallelism

Join and learn to make LLM inference faster and more cost-efficient at scale! deeplearning.ai/short-courses/…
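A toy operation count can show why the first skill above (a KV cache within a single request) matters. This is my own sketch, not course material, and it counts only key/value projection work:

```python
# Toy model of decoding cost (illustration only, not SGLang code).
# Without a KV cache, every decode step re-projects keys/values for the
# whole prefix; with a cache, each step projects only the newest token.

def decode_ops(prompt_len, new_tokens, use_cache):
    """Count key/value projections performed while decoding."""
    ops = 0
    seen = prompt_len  # tokens visible before this step
    for _ in range(new_tokens):
        ops += 1 if use_cache else seen
        seen += 1
    return ops

print(decode_ops(100, 50, use_cache=False))  # 6225 projections
print(decode_ops(100, 50, use_cache=True))   # 50 projections
```

The gap grows quadratically with sequence length; that is the redundant computation the course targets.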
35 replies · 66 reposts · 421 likes · 60.6K views
RadixArk reposted
LMSYS Org @lmsysorg
We built a course with @DeepLearningAI on how to run LLM and image generation faster and more efficiently.

LLM inference has a redundancy problem:
🤖 Your chatbot re-reads the same system prompt on every request.
💻 Your coding agent re-processes the full context before every tool call.
Imagine doing that across billions of requests from millions of users.

SGLang was built to solve this with RadixAttention: every response is unique, but most of the work to get there isn't.

Our course takes just over an hour and goes all the way in on modern AI inference:
1️⃣ How inference works with SGLang, from a single request to serving at production scale
2️⃣ How caching works with diffusion models
3️⃣ Where AI inference is headed and what comes next

The most advanced infrastructure for serving modern AI is open. And we are making it easier than ever for you to learn and get hands-on with it.

Thanks to @richardczl, our amazing instructor, and @radixark for making this course possible 🧡

👇 Check out "Efficient Inference with SGLang: Text and Image Generation": deeplearning.ai/short-courses/…
[media attachment]
DeepLearning.AI @DeepLearningAI

New course available! Efficient Inference with SGLang: Text and Image Generation is live. LLM inference gets expensive fast—mostly due to redundant computation. This course shows how to reduce that using SGLang, with KV cache and RadixAttention, and how to apply the same ideas to faster image generation. Built with @lmsysorg and @radixark, taught by Richard Chen. Enroll for free: hubs.la/Q04b0F1J0
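The cross-request sharing that RadixAttention provides can be sketched as a prefix trie over tokens, where each node's attention state is computed once and reused by every request that shares the prefix. This is a toy illustration of the idea only, not SGLang's implementation; every name in it is made up:

```python
# Toy prefix trie (illustration of the RadixAttention idea, not SGLang
# code): requests walk the trie, reusing nodes for any shared prefix and
# "computing" state only for tokens no earlier request has seen.

class Node:
    def __init__(self):
        self.children = {}  # token -> Node

root = Node()
computed = 0  # tokens whose state we actually had to compute

def serve(tokens):
    """Walk/extend the trie; return how many tokens were reused."""
    global computed
    node, matched = root, 0
    for tok in tokens:
        if tok in node.children:        # shared prefix: free
            node = node.children[tok]
            matched += 1
        else:                           # new suffix: compute once
            node.children[tok] = Node()
            node = node.children[tok]
            computed += 1
    return matched

sys_prompt = list("You are a helpful assistant. ")
serve(sys_prompt + list("Hi"))    # computes prompt + 2 new tokens
m = serve(sys_prompt + list("Bye"))
print(m == len(sys_prompt))       # True: the system prompt was reused
```

The second request pays only for its unique suffix, which is why shared system prompts stop costing per-user compute.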

1 reply · 9 reposts · 32 likes · 3.1K views
RadixArk reposted
LMSYS Org @lmsysorg
🚀 New blog: ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct™ GPUs

Together with @AMD, Miles brings end-to-end RL pipelines to MI300/350-class clusters:
⚡️ Rollout generation dominates RL compute, and AMD's HBM bandwidth directly addresses this bottleneck
🧠 AIME accuracy improved from 0.665 → 0.729 across training on Qwen3-30B-A3B with GRPO
💾 MI300X delivers ~1.1–1.3k tok/GPU/s rollout throughput
⏱️ Mean step time 388.5 s on a single 8-GPU MI300X node (32×8 sampling, 8k response cap)
🔧 Multi-turn agentic training validated
... and more optimizations to come 🔥
[media attachment]
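A rough back-of-the-envelope check of the quoted figures. The assumptions here are mine, not the blog's: 32×8 sampling means 256 rollouts per step, all 8 GPUs generate concurrently, and every response hits the 8k-token cap:

```python
# Sanity-check arithmetic on the quoted Miles/MI300X numbers.
# Assumptions (mine, not the blog's): 32x8 sampling = 256 rollouts per
# step, 8k-token response cap, all 8 GPUs generating concurrently.
rollouts = 32 * 8
max_tokens = rollouts * 8_000             # worst-case tokens per step

node_tput_low = 8 * 1_100                 # tok/s node-wide, low end
node_tput_high = 8 * 1_300                # tok/s node-wide, high end

secs_best = max_tokens / node_tput_high
secs_worst = max_tokens / node_tput_low
print(round(secs_best), round(secs_worst))  # 197 233
```

Worst-case rollout time of roughly 200-230 s fits inside the 388.5 s mean step time, consistent with rollout dominating the step while leaving room for training and weight updates (and real responses are often shorter than the cap).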
3 replies · 18 reposts · 66 likes · 12K views
RadixArk reposted
LMSYS Org @lmsysorg
👀 Live from #GTC2026: SGLang is featured on the @nvidia AI ecosystem slide during the keynote! Honored to be part of the infrastructure stack behind AI-native apps. ⚡
[media attachment]
0 replies · 6 reposts · 35 likes · 14K views
RadixArk @radixark
Building with open-source LLMs? We’re giving 20 builders a chance to attend GTC — on us 🎟️ Proud to support the SGLang community and the engineers making LLM serving faster and more scalable!
LMSYS Org @lmsysorg

🎁 SGLang GTC Giveaway — 20 FREE Passes!

SGLang is an open-source LLM serving engine that helps models like DeepSeek, Qwen, Kimi, Minimax, GLM, and Llama run efficiently at production scale. Thanks to our sponsor @radixark, we're giving away 20 NVIDIA GTC 4-day exhibit passes (worth $930 each)! 🎟️

To enter the lottery:
1️⃣ Follow us → @lmsysorg
2️⃣ ⭐ Star SGLang on GitHub → github.com/sgl-project/sg…
3️⃣ Reply with your favorite open-source model and what you use it for
4️⃣ Repost this for extra visibility

How we pick winners:
🏆 Top 5 most engaging comments win directly
🎲 Remaining 15 drawn randomly via xpickr

We'll verify your GitHub star before sending tickets, so make sure you've starred the repo! Let's go 👇

0 replies · 3 reposts · 14 likes · 4.4K views
RadixArk @radixark
This matches exactly what we’re seeing internally. That’s why at RadixArk, unlimited enterprise subscriptions to Claude and Codex are standard issue for every engineer. We removed the friction so our team can focus entirely on what matters. Building the future of LLM inference at RadixArk. Open roles available on our website.
Greg Brockman @gdb

Software development is undergoing a renaissance in front of our eyes. If you haven't used the tools recently, you likely are underestimating what you're missing. Since December, there's been a step function improvement in what tools like Codex can do. Some great engineers at OpenAI yesterday told me that their job has fundamentally changed since December. Prior to then, they could use Codex for unit tests; now it writes essentially all the code and does a great deal of their operations and debugging. Not everyone has yet made that leap, but it's usually because of factors besides the capability of the model.

Every company faces the same opportunity now, and navigating it well — just like with cloud computing or the Internet — requires careful thought. This post shares how OpenAI is currently approaching retooling our teams towards agentic software development. We're still learning and iterating, but here's how we're thinking about it right now.

As a first step, by March 31st, we're aiming that: (1) for any technical task, the tool of first resort for humans is interacting with an agent rather than using an editor or terminal; (2) the default way humans utilize agents is explicitly evaluated as safe, but also productive enough that most workflows do not need additional permissions.

In order to get there, here's what we recommended to the team a few weeks ago:

1. Take the time to try out the tools. The tools do sell themselves — many people have had amazing experiences with 5.2 in Codex, after having churned from codex web a few months ago. But many people are also so busy they haven't had a chance to try Codex yet, or got stuck thinking "is there any way it could do X" rather than just trying.
- Designate an "agents captain" for your team — the primary person responsible for thinking about how agents can be brought into the team's workflow.
- Share experiences or questions in a few designated internal channels.
- Take a day for a company-wide Codex hackathon.

2. Create skills and AGENTS.md.
- Create and maintain an AGENTS.md for any project you work on; update the AGENTS.md whenever the agent does something wrong or struggles with a task.
- Write skills for anything that you get Codex to do, and commit them to the skills directory in a shared repository.

3. Inventory and make accessible any internal tools.
- Maintain a list of tools that your team relies on, and make sure someone takes point on making each one agent-accessible (such as via a CLI or MCP server).

4. Structure codebases to be agent-first. With the models changing so fast, this is still somewhat untrodden ground and will require some exploration.
- Write tests which are quick to run, and create high-quality interfaces between components.

5. Say no to slop. Managing AI-generated code at scale is an emerging problem, and will require new processes and conventions to keep code quality high.
- Ensure that some human is accountable for any code that gets merged. As a code reviewer, maintain at least the same bar as you would for human-written code, and make sure the author understands what they're submitting.

6. Work on basic infra. There's a lot of room for everyone to build basic infrastructure, which can be guided by internal user feedback. The core tools are getting a lot better and more usable, but there's a lot of infrastructure that currently goes around the tools, such as observability, tracking not just the committed code but the agent trajectories that led to them, and central management of the tools that agents are able to use.

Overall, adopting tools like Codex is not just a technical but also a deep cultural change, with a lot of downstream implications to figure out. We encourage every manager to drive this with their team, and to think through other action items — for example, per item 5 above, what else can prevent a lot of "functionally-correct but poorly-maintainable code" from creeping into codebases.
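For concreteness, an AGENTS.md of the kind item 2 recommends might look like the sketch below. Every project path, command, and rule here is hypothetical, invented purely for illustration:

```markdown
# AGENTS.md (hypothetical example)

## Build & test
- Run `make test` before proposing any change; all tests must pass.
- Use `make lint` to apply the project formatter; do not hand-format.

## Conventions
- New Python modules need type annotations and a module docstring.
- Public APIs live in `src/api/`; internal helpers stay in `src/internal/`.

## Known pitfalls (update this section whenever the agent struggles)
- Do not reformat `scripts/deploy.sh`; CI consumes it verbatim.
- Integration tests require the `TEST_DB_URL` env var to be set.
```

The "known pitfalls" section is the living part: each time the agent gets something wrong, a one-line rule here prevents the repeat.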

2 replies · 6 reposts · 57 likes · 30K views
RadixArk @radixark
🔥Join us🔥
Mingyi Lu @mingyilu123

Want a front-row seat to the evolution of frontier models? 🤖 I'm building the AI Product team at @radixark. We're scaling SGLang @lmsysorg @sgl_project and defining the future of AI training & inference infrastructure. Open roles in PM, Product Ops, and DevRel. If you want to own products from strategy to GTM, join us! Apply through: job-boards.greenhouse.io/radixark/jobs/… #AI #TechJobs #SGLang #ProductManagement

0 replies · 3 reposts · 8 likes · 4.4K views
RadixArk @radixark
We’re excited to announce RadixArk, built to give the entire AI community infrastructure that was once only available inside frontier labs. We invite you to join our journey.

1. We’re hiring. RadixArk is growing our core team. If you care about first-principles design, open infrastructure in training and inference, and building systems that make AI faster and more accessible, please send your resume and info to hiring@radixark.ai

2. We’re opening our LLM testing platform to early users. Built on top of SGLang and Miles, it helps any AI team run training and inference at scale. Join the waitlist: 👉 platform.radixark.ai/waitlist

Let’s build it — together.
Ying Sheng @ying11231

We've been running @radixark for a few months, started by many core developers of SGLang @lmsysorg and its extended ecosystem (slime @slime_framework, AReaL @jxwuyi).

I left @xai in August — a place where I built deep emotional ties and countless beautiful memories. It was the best place I’ve ever worked, the place I watched grow from a few dozen people to hundreds, and it truly felt like home. What pushed me to make such a hard decision is the momentum of building SGLang open source and the mission of creating an ambitious future, in the open spirit that I learned from my first job at @databricks after my PhD.

We started SGLang in the summer of 2023 and made it public in January 2024. Over the past two years, hundreds of people have made great efforts to get it to where it is today. We experienced several waves of growth after its first release. I still remember the many dark nights in the summer of 2024 that I spent debugging with @lm_zheng, @lsyincs, and @zhyncs42, while @ispobaoke single-handedly took on DeepSeek inference optimizations and @GenAI_is_real and the community strike team tag-teamed on-call shifts non-stop. There are so many more who have joined that I'm out of space to call out, but they're recorded on the GitHub contributor list forever.

The demands have grown exponentially, and we have been pushed to make this a dedicated effort supported by RadixArk. It’s the step-by-step journey of a thousand miles that has carried us here today, and the same relentless Long March that will lead us into the tens of thousands of miles yet to come. The story never stops growing.

Over the past year, we’ve seen something very clear: the world is full of people eager to build AI, but the infrastructure that makes it possible is not shared. The most advanced inference and training stacks live inside a few companies. Everyone else is forced to rebuild the same schedulers, compilers, serving engines, and training pipelines again and again — often under enormous pressure, with lots of duplicated effort and wasted insight.

RadixArk was born to change that. Today, we’re building an infrastructure-first, deep-tech company with a simple and ambitious mission: "Make frontier-level AI infrastructure open and accessible to everyone."

If the two values below resonate with you, come talk to us:
(1) Engineering as an art. Infrastructure is a first-class citizen at RadixArk. We care about elegant design and code that lasts. Beneath every line of code lies the soul of the engineer who wrote it.
(2) A belief in openness. We share what we build. We bet on long-term compounding through community, contribution, and giving more than we take.

A product is defined by its users, yet it truly comes alive the moment functionality transcends mere utility and begins to embody aesthetics. Thanks to all the miles (the name of our first released RL framework; see below). radixark.ai

4 replies · 12 reposts · 140 likes · 19.3K views