Ryan Ng

24 posts

@aftermultiply

Reasoning @xAI | ex-@OpenAI | TL at Ray / Anyscale | K8s | DynamoDB

Joined February 2026

21 Following 33 Followers

Ryan Ng@aftermultiply·20h

@yunta_tsai Not too many places offer such an opportunity: felt lucky to have been in a few of them

English

Yun-Ta Tsai@yunta_tsai·22h

Pursuit of happiness, builder edition:
- Find a problem you love spending your life solving.
- Spend your life solving it.

English

362

16.3K

Ryan Ng@aftermultiply·20h

@niloofar_mire Did you join the convo?

English

Niloofar@niloofar_mire·1d

I’ve been feeling a bit burnt out, so I decided very last minute a few days ago to fly to Scandinavia and detach a bit. As I was chilling in a random park in Copenhagen and taking this photo, I overheard the couple next to me talking about world models and grounded video gen LOL

English

227

12.6K

Ryan Ng@aftermultiply·1d

@boyuan__zheng More usage + rsi => agi

English

Boyuan Zheng@boyuan__zheng·2d

Excited to see people try Grok Build for web dev. Our team has put a lot of effort into improving its aesthetics and functionality, and more exciting features are expected with the recursive self-improvement loop. It’s still early beta, and feedback is very welcome. Please try it out and let us know where we can improve.

Kilo@kilocode

Grok Build 0.1 might be one of the most underestimated AI models right now. We tested it in Kilo Code by asking it to build 5 websites from scratch. Here are the results:

English

244

113

1.5K

31.4M

Ryan Ng@aftermultiply·6d

@joannejang Now it’s just a tool call away

English

106

Joanne Jang@joannejang·6d

learned this quote from 2023 is making rounds -- i actually don't think this is true anymore in 2026! The model should be invisible. i expect us to flip back to ux in the form of agent behavior + continual learning loops; and the alpha is in making models feel natural and as invisible as possible.

English

410

123.2K

Ryan Ng@aftermultiply·18 May

@CoreAutoAI “The discipline to focus on what matters before the training run, especially things like data quality and systems readiness”

English

Core Automation@CoreAutoAI·17 May

What is pretraining? Asking for a friend

English

119

14.1K

Ryan Ng@aftermultiply·18 May

Never understood the magic of @Tailscale until I started using it for myself

English

Ryan Ng@aftermultiply·18 May

@MillionInt Agents make that much more tractable

English

114

Jerry Tworek@MillionInt·17 May

Best productivity hack I know is organizing your work so that you enjoy it the most

English

514

25.5K

Ryan Ng@aftermultiply·17 May

Agents coming online

English

Ryan Ng@aftermultiply·16 May

@adeelzaman_ maybe this is the way

avi@avizurlo

x.com/i/article/2055…

English

Ryan Ng@aftermultiply·14 May

Interesting results

Nous Research@NousResearch

Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data. During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining. Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE. The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.
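The bag-of-tokens phase described in the post can be illustrated with a minimal sketch. This is not the Nous Research implementation: the function names `bag_inputs` and `bag_cross_entropy` are hypothetical, and the loss shown (cross-entropy against a uniform target over the tokens of the next bag) is only one assumed reading of "modified cross-entropy", since the post does not give the exact form.

```python
import math

def bag_inputs(token_ids, embedding, bag_size):
    """Split token_ids into contiguous bags and average each bag's embeddings.

    Trailing tokens that do not fill a whole bag are dropped.
    """
    n_bags = len(token_ids) // bag_size
    bags = [token_ids[i * bag_size:(i + 1) * bag_size] for i in range(n_bags)]
    dim = len(embedding[0])
    averaged = [
        [sum(embedding[t][d] for t in bag) / bag_size for d in range(dim)]
        for bag in bags
    ]
    return averaged, bags

def bag_cross_entropy(logits, next_bag):
    """ASSUMPTION: cross-entropy against a uniform target over next_bag's tokens."""
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return -sum(logits[t] - log_z for t in next_bag) / len(next_bag)
```

In this sketch, the averaged vector for bag `i` would be fed to the model, and the model's logits at that position would be scored against bag `i + 1`. After the bag phase, training would switch back to ordinary next-token prediction, as the post states.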

English

Ryan Ng@aftermultiply·13 May

Great blog, reminds me of the Bitter Lesson from the good old days during Macrohard @endernewton

Thinking Machines@thinkymachines

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/interacti…

English

334

Ryan Ng@aftermultiply·6 May

Big deal

xAI@xai

SpaceXAI will provide @AnthropicAI with access to Colossus 1, one of the world’s largest and fastest-deployed AI supercomputers, to provide additional capacity for Claude → x.ai/news/anthropic…

English

Ryan Ng reposted

Zhuohan Li@zhuohan123·24 Apr

Try out deepseek v4 on vLLM!

vLLM@vllm_project

🎉 Day-0 support for @deepseek_ai V4 Pro and Flash on vLLM: a new generation of DeepSeek model, purpose-built for tasks up to 1M tokens. Alongside the release, we're publishing a first-principles walkthrough of the new long-context attention and how we implemented it in vLLM.
The new attention mechanism, in four moves:
• Shared K/V + inverse RoPE → 2× memory savings
• c4a / c128a KV compression → 4×–128× savings
• DeepSeek Sparse Attention over compressed tokens
• Short sliding window for locality across compression boundaries
At 1M context, per-layer KV state is ~8.7× smaller than a DeepSeek V3.2-style 61-layer stack (9.62 GiB vs 83.9 GiB, bf16). fp8 attention cache + fp4 indexer cache shrink it further.
vLLM side:
• Unified hybrid KV cache: single logical block size (256 native positions) across all compression rates; compressor state folded into the SWA KV cache spec so prefix caching, disagg prefill, CUDA graphs and MTP reuse the same abstraction
• Three page-size buckets for the full 5-way cache stack → no cross-kind fragmentation
• Fused kernels: compressor + RMSNorm + RoPE + cache insert (1.4–3×), inverse RoPE + fp8 quant (2–3×), Q-norm + KV RoPE + K insert (10–20×)
• Multi-stream overlap of indexer vs main-KV compression vs SWA insertion
Disaggregated serving is supported out of the box and strongly recommended for best performance. Follow our recipes site for verified commands for @nvidia Blackwell (B200, B300, GB200, GB300) and Hopper (H100/H200/H20) systems.
Thanks to the @deepseek_ai team for open-sourcing DeepSeek V4, and to @inferact for landing day-0 support 🤝
📝 Blog: vllm.ai/blog/deepseek-…
📖 Recipes: recipes.vllm.ai/deepseek-ai/De…
🤗 huggingface.co/deepseek-ai/De…
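The ~8.7× figure in the post follows directly from the two sizes it quotes. A minimal arithmetic check, using only those two numbers (the helper name `reduction_factor` is hypothetical, and nothing here models how vLLM actually sizes its cache):

```python
# Figures quoted in the post (bf16, 1M-token context).
V4_KV_GIB = 9.62
V32_STYLE_KV_GIB = 83.9

def reduction_factor(baseline_gib, new_gib):
    """How many times smaller the new KV state is than the baseline."""
    return baseline_gib / new_gib

factor = reduction_factor(V32_STYLE_KV_GIB, V4_KV_GIB)
saved_gib = V32_STYLE_KV_GIB - V4_KV_GIB
print(f"{factor:.1f}x smaller, {saved_gib:.2f} GiB saved")
```

The fp8 attention cache and fp4 indexer cache mentioned in the post would reduce the footprint further, but the post gives no sizes for them, so they are left out of this check.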

English

2.6K

Ryan Ng reposted

Boyuan Zheng@boyuan__zheng·19 Apr

Recursive self improvement🚀🚀🚀

Shen Zhuoran@CMS_Flash

Cool work of the team is finally out in the world. This is a very early preview, and much more is to come around:
- One-shotting complex web apps;
- Pure vibe coding;
- Self-improvement between browser use and web app development.
The last point is critical and unique to web dev, because there is an intelligence gap between using a web app and building a web app that we can exploit, in theory leading to indefinite scaling of self-improvement.

English

147

8.4K

Ryan Ng reposted

mimic@mimicrobotics·14 Apr

With mimic-video, we were among the very first to propose Video-Action Models for robotics. Today, we are open-sourcing the recipe.

English

275

41.2K

Ryan Ng@aftermultiply·13 Apr

@arshdeep @xai Godspeed!

English

702

Arshdeep Singh@arshdeep·13 Apr

After an unforgettable ride, this was my last week at @xai. When I joined in 2024, xAI was a small yet extraordinary team. I got an amazing opportunity to build our core ML Platforms from scratch and form an ambitious, amazing team. Together we built the core systems that power frontier-model research, evaluations, human data collection, agent training, and daily productivity for every member of technical staff at xAI. It was a true privilege. I’m deeply grateful to @elonmusk and every single person I’ve had the chance to work with across research, engineering, compute, infra, and beyond. The pace, collaboration, and shared mission to understand the universe and be truth seeking have been unmatched. Being able to contribute to data, training, and infra behind Grok-2-1212, Grok 3, Grok 4, 4.2, Aurora, Imagine and many other efforts has been the highlight of my career so far. Special thank you to my incredible team and all the brilliant people who made this ride so special. I leave more inspired than ever. Excited for what’s next. Ad astra 🚀

English

857

83.2K

Ryan Ng@aftermultiply·11 Apr

Keerthana Gopalakrishnan@keerthanpg

There are really few (< 15) people in the world who know both frontier modeling AND modern robotics very well. A lot of strong roboticists are still working off of ideas from the pre-2023, pre-Gemini era of robotics and know very little about frontier AI techniques. A lot of strong frontier modeling people do not care yet / have little expertise in robotics. The latter group is increasing as more of the big AI labs are foraying into robotics but still the world needs a lot more :)

134

Ryan Ng@aftermultiply·9 Apr

@reggitales Nice, would love to try

English

193

Regina Lin@reggitales·9 Apr

Introducing Dex: the self-driving workspace for operators. Dex is the first agent system with full operational context and a self-updating knowledge base. Every datapoint from your workspace is ingested, synced, and structured into compounding context for agents to take action. Comment "DEX" or tag @dexbythirdlayer for access. First 1,000 sign-ups get 7 days free. After, join our rolling waitlist. Sign up at joindex [dot] com for a fun surprise. How Dex works (threads)

English

279

64.8K

Ryan Ng@aftermultiply·9 Apr

@yzeng58 Does the flushed KV introduce gaps in pos embeddings?

English

102

Yuchen Zeng@yzeng58·8 Apr

Reasoning models think hard — but all that thinking fills up your KV cache fast. Memento fixes this: the model compresses its own chain-of-thought mid-generation, flushing old KV entries after each block. 2-3× less peak KV cache, ~2× throughput — accuracy largely preserved. The cool part: deciding what to remember and what to forget is a capability the model acquires through training — not something you bolt on. Excited about where this goes — especially for agents.

Dimitris Papailiopoulos@DimitrisPapail

x.com/i/article/2041…

English

105

13.9K

Ryan Ng@aftermultiply·7 Apr

@peteflorence Congrats, looking forward to what’s next!

English

1.2K