mei

177 posts

@multiply_matrix

AGI forecaster. In 2022 I predicted an AGI timeline of 2027. MIT dropout.

Joined February 2021

211 Following · 2.4K Followers

mei@multiply_matrix·16 Jun

Excited to support @RadicalNumerics 🔥

Eric Nguyen@exnx

Together with my co-founders Michael @MichaelPoli6, Stefano @Massastrello and Armin @athmsx, I am excited to announce @RadicalNumerics is emerging from stealth with a $50M seed round to build general biological intelligence. We’re also sharing an early preview of our new model Omnii, the most powerful genome language model to date.
Omnii preview link: radicalnumerics.ai/blog/radical-n…
At Radical Numerics, our mission is to master the code of life, and to drive the frontier of biological AI for both design and defense. This is our dual mandate, which comes from something our own team helped make possible.
Our founding team trained Evo and Evo 2, the largest biological AI models (40B params) trained on DNA sequences. Trillions of tokens across all of life, from microbes to mammals. It’s fully open source, and created the field now known as generative genomics.
Last year, scientists used Evo to generate the world’s first complete genome from scratch using AI. Turns out it was a bacteriophage—a type of virus. It functioned in the real world, and in this case it was harmless. But for us, it was a clear turning point. It showed that AI is no longer just analyzing biology. It is on the cusp of generating functional lifeforms.
Eventually, AI will have the power to design and control life itself. That should make all of us incredibly excited, and incredibly uneasy. (Anyone can design DNA with a new function, and have it synthesized and delivered, like something from Amazon Prime). The same technology that will help us cure cancer is the very technology that might create the next global pandemic, or worse, allow the creation of bioweapons that can wipe out populations.
We believe these forces are inseparable. If you work on the frontier of biology, you have to build technology to safeguard it from its misuse. Existing biosecurity tools are sorely losing the arms race, relying on outdated “have I seen this exact thing before?” style algorithms.
We founded Radical Numerics to turn the tide. And we can’t do that by training on textbooks and natural language. We must understand the language of biology from the raw physical data itself, to reason across every molecule and modality, from DNA to proteins. The next frontier for AI goes far beyond chatbots or video generators to models that can understand and engineer life.
Today, we’re previewing Omnii, which is already far surpassing Evo 2, and will continue improving as we scale and add new modalities (training now).
1. For human health, Omnii can read and write whole genomes (more on writing later). It’s state of the art (SOTA) on detecting causal variants for disease, and can rank Alzheimer's mutations zero-shot. We’re partnering with a diagnostics company to use Omnii for early cancer detection (pancreatic and multi-cancer).
2. For defense, Omnii is SOTA at detecting AI-generated pathogens. We benchmarked existing detection tools, and they simply can’t detect the AI-generated ones (“deepfake viruses”). We’re partnering with a US national lab to pilot Omnii for detecting the next pandemic, both natural and AI-generated.
We have a data center full of Blackwells in construction now to build the most powerful biological AI models ever.
This mission takes a new kind of AI lab that can actually scale on physical, biological data: new alignment research (mid/post training), scaling long context, building out mech interp teams to dissect what these models learn, new architectures and systems designs, all from the ground up.
Our team is made up of AI researchers and scientists from top labs and institutions (e.g. Stanford, MIT, Google DeepMind), but more importantly, we all share the belief that this is the most important challenge of our lifetime. If you feel similarly, we are hiring. We aim to bring the brightest minds in AI and science together to save lives.
Thanks to our partners on this journey, led by Emergence Capital @emergencecap, with Obvious Ventures @obviousvc, Triatomic @TriatomicCap , and Patrick Collison @patrickc. Our advisors include Eric Horvitz @erichorvitz, CSO of Microsoft, Chris Re @HazyResearch of Stanford, George Church @geochurch of Harvard, and Andrew Weber @AndyWeberNCB, former Assistant Secretary of Defense for Nuclear, Chemical and Biological Defense Programs. Fortune article: fortune.com/2026/06/15/exc… Jobs: radicalnumerics.ai/join-us

2.4K

mei reposted

Akshat Bubna@akshat_b·21 May

Raising $ is cool. What’s even cooler is getting to work every day with this incredible group of humans. We like solving hard problems and building things we can be proud of. If this is you, come join us! We’re just getting started :)

Modal@modal

x.com/i/article/2057…

285

51.9K

mei reposted

LMSYS Org@lmsysorg·16 May

🐋 DeepSeek V4 is now merged into SGLang main with v0.5.12.
What we shipped at launch:
🔹 ShadowRadix: native prefix caching for V4's hybrid attention
🔹 HiSparse: CPU-extended KV for sparse attention (up to 3× long-context throughput)
🔹 MTP speculative decoding with in-graph metadata preparation
🔹 W4A8 MegaMoE kernel
🔹 Flash Compressor + Lightning TopK kernels
🔹 Multiple parallelism methods: Tensor Parallelism/Expert Parallelism/Context Parallelism/Data Parallelism Attention
🔹 Prefill Decode Disaggregation
🔹 Hardware: H100, H200, B200, B300, GB200, GB300, MI35X
And what we added since:
🔹 HiCache for V4 under UnifiedRadixTree
🔹 W4A4 MegaMoE kernels for faster MegaMoE
🔹 Marlin/FlashInfer MXFP4 (W4A16) MoE on Hopper
🔹 Hierarchical multi-stream overlap for small-batch decode
🔹 Optimized mHC pipeline: DeepGemm + fused norm + fused hc_head
🔹 Faster KV Compression V2 kernel
🔹 Fused SiLU+clamp+FP8 quantization kernel
🔹 Support TP16 on H100/H20
🔹 Support Multiple Detokenizers
🔹 Pipeline Parallelism
🔹 One docker image for all supported Nvidia hardware
Thanks to @NVIDIAAI, @AMD, @ant_oss, @alibaba_cloud, ByteDance, @iFLYTEKLab, @radixark, and @pranjalssh for the work we shipped together on V4 🙌
More in 0.5.12 👇

201

14.8K

mei reposted

Beff (e/acc)@beffjezos·15 May

@bubbleboi Yeah they cooked actually

10K

mei reposted

Zyphra@ZyphraAI·15 May

We present ZAYA1-8B-Diffusion-Preview, the first diffusion language model trained on @AMD. Autoregressive LLMs generate one token at a time; diffusion generates a block in parallel, speeding up inference. We show a 4.6-7.7x decoding speedup with minimal quality degradation 🧵
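To make the one-token-at-a-time versus block-parallel contrast concrete, here is a rough counting sketch. The block size and the number of denoising passes per block are hypothetical values chosen for illustration, not ZAYA1-8B-Diffusion-Preview settings, and fewer forward passes do not translate one-to-one into wall-clock speedup.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """An autoregressive decoder needs one forward pass per generated token."""
    return num_tokens


def block_diffusion_passes(num_tokens: int, block_size: int, denoise_steps: int) -> int:
    """Count passes when each block of tokens is refined jointly.

    `block_size` and `denoise_steps` are illustrative assumptions,
    not published settings for any specific model.
    """
    num_blocks = math.ceil(num_tokens / block_size)
    return num_blocks * denoise_steps


if __name__ == "__main__":
    n = 1024
    ar = autoregressive_passes(n)
    bd = block_diffusion_passes(n, block_size=16, denoise_steps=4)
    print(f"autoregressive: {ar} passes, block diffusion: {bd} passes")
```

The pass count only drops when the number of denoising passes per block is smaller than the block size, which is the regime a block-parallel decoder aims for.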

684

1.1M

mei reposted

Eric Alcaide@eric_alcaide·14 May

SGLang team is cracked. Respect 🫡

LMSYS Org@lmsysorg

🌊 SGLang now supports @poolsideai's Laguna-XS.2, a 33.4B-A3B hybrid SWA + MoE model purpose-built for agentic coding and long-horizon SWE work
☑️ SWE-bench Verified 68.2%; Multilingual 62.4%; Pro 44.5%; Terminal-Bench 2.0 30.1%
☑️ 131K-token context for long agent traces
☑️ Native poolside_v1 reasoning + tool-call parsers (OpenAI-compatible)
☑️ BF16, FP8, and NVFP4 quantizations
👉 Cookbook: docs.sglang.io/cookbook/autor…

3.6K

mei reposted

Soumith Chintala@soumithchintala·12 May

@lmsysorg @thinkymachines SGLang has been great. Thanks for all the great work @radixark !

6.3K

mei reposted

elie@eliebakouch·12 May

thinking machines is using SGLang btw

elie@eliebakouch

the "small" model behind this demo is a 276B total 12B active MoE (larger pretrains are cooking), sparsity ratio looks pretty standard compared to open models of the same size
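For readers who want the sparsity ratio as a number, a minimal sketch of the arithmetic follows. The helper name `active_fraction` is made up for illustration; the parameter counts are the ones quoted in this feed (276B total / 12B active here, and 8.4B total / 760M active for ZAYA1-8B).

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of an MoE's parameters that are active per token.

    Both arguments are in billions of parameters.
    """
    if total_params_b <= 0 or not 0 < active_params_b <= total_params_b:
        raise ValueError("expected 0 < active <= total")
    return active_params_b / total_params_b


if __name__ == "__main__":
    for name, total, active in [
        ("276B-A12B demo model", 276.0, 12.0),
        ("ZAYA1-8B", 8.4, 0.76),
    ]:
        frac = active_fraction(total, active)
        print(f"{name}: {frac:.1%} active, 1 in {1 / frac:.0f} parameters used per token")
```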

335

29.3K

mei reposted

Zyphra@ZyphraAI·11 May

Today we’re announcing 15MW of AMD Instinct MI355 GPU capacity through Zyphra Cloud, our full-stack neocloud powered by @AMD.

358

863.1K

mei reposted

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·11 May

I've been saying Zyphra is an exceptional neolab, on all levels from moral to technical to financial. They have done what Geohot has failed to do: made AMD relevant again. They'll reap the rewards for it. Truly, DeepSeek of the West

Zyphra@ZyphraAI

Today we’re announcing 15MW of AMD Instinct MI355 GPU capacity through Zyphra Cloud, our full-stack neocloud powered by @AMD.

322

22.8K

mei reposted

Lucas Atkins@latkins·9 May

I’ve been consistently impressed by Zyphra, and have always felt a kinship with their cause. Beautiful work across the board, and what a slate of releases this week. Western open weights is going to have a hell of a year.

Zyphra@ZyphraAI

Today we're releasing ZAYA1-VL-8B, our first vision-language model. ZAYA1-VL-8B is a 700M active / 8B total MoE built on our ZAYA1-8B base trained on @AMD. We achieve strong performance for our size resulting in leading intelligence density and inference efficiency.

4.4K

mei reposted

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·9 May

They keep going!

Zyphra@ZyphraAI

5.7K

mei reposted

Beren Millidge@BerenMillidge·9 May

With this release we have rounded out our full suite of core modalities: Language, Vision, Audio, and Thought. This is the first step on our path to ubiquitous and efficient open visual understanding, and we have an exciting roadmap ahead. Congrats to the team. Amazing work!

Zyphra@ZyphraAI

4.9K

mei reposted

SemiAnalysis@SemiAnalysis_·9 May

Amazing work from the @sgl_project and @radixark team optimizing DeepSeek V4 inference on B200 and B300, and on the recent 4x iso-interactivity throughput improvements on GB300 by @ChengWan17! As @elonmusk said, the GB300 is the best AI computer, and software optimizations like this show its true potential!

261

36.2K

mei@multiply_matrix·8 May

The world has infinite demand for inference @sgl_project @radixark @BanghuaZ @ying11231

4.8K

mei reposted

davinci@leothecurious·7 May

okay this looks like something

Zyphra@ZyphraAI

We introduce Markovian RSA: recursive candidate aggregation with bounded carryover. Each round passes only the last τ tokens of each candidate forward, so no matter how long the model reasons for, the context length always remains bounded.
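The bounded-carryover idea can be sketched as a toy loop. This is a simplified illustration under stated assumptions, not Zyphra's implementation: `markovian_rsa`, `generate`, and the way tails are concatenated into the next prompt are all hypothetical, candidates are produced sequentially rather than in parallel, and tokens are plain strings.

```python
from typing import Callable, List, Tuple

Tokens = List[str]


def markovian_rsa(
    generate: Callable[[Tokens], Tokens],  # hypothetical model call: prompt -> trace
    question: Tokens,
    rounds: int,
    num_candidates: int,
    tail_tokens: int,
) -> Tuple[List[Tokens], int]:
    """Run several aggregation rounds, forwarding only each candidate's tail.

    Returns the final candidates and the longest prompt seen, so the
    bound on context length can be checked directly.
    """
    if tail_tokens <= 0:
        raise ValueError("tail_tokens must be positive")
    tails: List[Tokens] = []
    candidates: List[Tokens] = []
    max_prompt_len = 0
    for _ in range(rounds):
        # The next prompt is the question plus the carried-over tails only.
        prompt = list(question)
        for tail in tails:
            prompt.extend(tail)
        max_prompt_len = max(max_prompt_len, len(prompt))
        candidates = [generate(prompt) for _ in range(num_candidates)]
        # Keep just the last `tail_tokens` tokens of every candidate.
        tails = [c[-tail_tokens:] for c in candidates]
    return candidates, max_prompt_len


if __name__ == "__main__":
    def toy_generate(prompt: Tokens) -> Tokens:
        return ["step"] * 50  # stand-in for a long reasoning trace

    _, longest = markovian_rsa(toy_generate, ["what", "is", "x"], 5, 4, 8)
    print(longest)  # never exceeds len(question) + num_candidates * tail_tokens
```

However many rounds run, the prompt never grows beyond the question plus `num_candidates * tail_tokens` tokens, which is the bounded-context property the post describes.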

4.7K

mei reposted

Robert Washbourne@rawsh0·6 May

new model! strong <1B active MoE. led data and posttraining for this release. cca goat @rishiiyer01 and the pretraining squad cooked x.com/ZyphraAI/statu…

Zyphra@ZyphraAI

Today we're releasing ZAYA1-8B, a reasoning MoE trained on @AMD and optimized for intelligence density. With <1B active params, it outperforms open-weight models many times its size on math and reasoning, closing in on DeepSeek-V3.2 and GPT-5-High with test-time compute. 🧵

6.4K

mei reposted

Beren Millidge@BerenMillidge·6 May

Incredible work from the entire Zyphra team for this one! We never expected that our small ZAYA1 would be able to compete (at least in math) with the frontier giants. Our post-training and pre-training stacks are strong. More general thoughts on the ZAYA release, a 🧵

Zyphra@ZyphraAI

6.4K

mei reposted

𝚐𝔪𝟾𝚡𝚡𝟾@gm8xx8·7 May

Zyphra remains one of my favorite teams in the game because the releases all point in the same direction: capable AI that is cheaper to train, cheaper to run, and easier to deploy across modalities.
ZAYA1-8B is the latest proof point for that pattern, extending Zyphra’s all-AMD ZAYA1 stack into post-training. The base model showed AMD Instinct MI300 hardware could train a competitive MoE. ZAYA1-8B was pretrained, midtrained, and SFT’d on a 1,024-node MI300X cluster with AMD Pensando Pollara interconnect built with IBM.
This release shows the reasoning side: 8.4B total parameters, only 760M active, trained end-to-end by Zyphra, then pushed into math/code-heavy reasoning where it competes with much larger open reasoning models.
The key is active-parameter efficiency. ZAYA1-8B is not a dense 8B; it is a small MoE with sub-1B active compute per token. Architecturally, Zyphra changed three pieces versus a standard MoE: Compressed Convolutional Attention for sequence mixing in a compressed latent space with 8× KV-cache compression, an MLP-based router with PID-controller bias balancing, and learned residual scaling.
The training recipe is the other major piece: ZAYA1-8B was trained from scratch for reasoning, with long-CoT data included from pretraining onward using answer-preserving trimming. Post-training then runs SFT followed by a four-stage RL cascade: reasoning warmup on math and puzzles, a 400-task RLVE-Gym adaptive curriculum, math/code RL with TTC traces and synthetic code environments, then behavioral RL for chat and instruction following.
The test-time-compute piece is Markovian RSA: multiple traces are generated in parallel, fixed-length tail segments are carried forward, and recursive aggregation prompts seed the next round. The point is bounded context during extended reasoning: with the 40K/4K configuration, ZAYA1-8B reaches 91.9 on AIME’25 and 89.6 on HMMT’25 while forwarding only a 4K-token tail.
Outside the TTC setup, what stands out is the reasoning density: AIME’26 89.1, HMMT Feb.’26 71.6, IMO-AnswerBench 59.3, LiveCodeBench-v6 65.8, GPQA-Diamond 71.0, and MMLU-Pro 74.2 from a 760M-active / 8.4B-total MoE.
ZAYA1-8B is the small-active MoE reasoning recipe in practice: sparse active compute, efficient inference, and enough reasoning density to make local and test-time-compute deployments interesting.

mei reposted

stochasm@stochasticchasm·6 May

what a release jesus

Zyphra@ZyphraAI

940

161.3K
