Radical Numerics
@RadicalNumerics

10 posts

Systems, scaling and architecture for next-gen scientific world models.

San Francisco & Tokyo · Joined May 2025
4 Following · 2K Followers

Radical Numerics reposted
Michael Poli @MichaelPoli6
We're growing rapidly at @RadicalNumerics and scaling our core teams. Join us in building the next generation of scientific world models. We're hiring across a few roles, each with significant ownership and cross-functional scope:

- Member of Technical Staff, Post-Training
- Member of Technical Staff, Infrastructure and Training Systems
- Member of Technical Staff, Pretraining Science
- Member of Technical Staff, AI Bio
- Member of Technical Staff, Biosecurity

Our technology brings together numerics, systems engineering, and architecture design to tackle large-scale pretraining on scientific data. Our blogs (see below) give a flavor of the work.

We believe that advancing capabilities must go hand in hand with advancing safety and biosecurity. The same systems that design biology must also help defend against it.

Ping me or others on the team if you'd like to learn more. job-boards.greenhouse.io/radicalnumerics
[image attached]
6 replies · 14 reposts · 113 likes · 29.8K views
Radical Numerics @RadicalNumerics
Scaling scientific world models requires co-designing architectures, training objectives, and numerics. Today, we share the first posts in our series on low-precision pretraining, starting with NVIDIA's NVFP4 recipe for stable 4-bit training.

Part 1: radicalnumerics.ai/blog/nvfp4-par…
Part 2: radicalnumerics.ai/blog/nvfp4-par…

We cover floating point fundamentals, heuristics, custom CUDA kernels, and stabilization techniques. Future entries will cover custom recipes and results on hybrid architectures.
[image attached]
9 replies · 93 reposts · 525 likes · 66.7K views
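To make the "4-bit training" idea concrete, here is a minimal sketch of block-scaled 4-bit quantization in the spirit of NVFP4. This is not Radical Numerics' or NVIDIA's actual recipe: the real format stores FP4 (E2M1) values with per-block FP8 scales plus a per-tensor scale; this simplified version uses per-block FP32 absmax scales over the E2M1 grid.

```python
import numpy as np

# E2M1 (FP4) representable magnitudes: 1 sign bit, 2 exponent bits, 1 mantissa bit.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(x, block=16):
    """Quantize-dequantize a 1-D tensor to 4-bit E2M1 values, one scale per block.

    Simplified sketch: the actual NVFP4 recipe stores per-block scales in
    FP8 (E4M3) with an additional per-tensor FP32 scale; here scales stay FP32.
    """
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % block
    xp = np.pad(x, (0, pad)).reshape(-1, block)
    # Per-block scale maps the largest magnitude onto the top of the E2M1 grid.
    scales = np.abs(xp).max(axis=1, keepdims=True) / E2M1_GRID[-1]
    scales[scales == 0] = 1.0
    # Round each scaled magnitude to the nearest representable E2M1 value.
    mags = np.abs(xp) / scales
    idx = np.abs(mags[..., None] - E2M1_GRID).argmin(axis=-1)
    deq = np.sign(xp) * E2M1_GRID[idx] * scales
    return deq.reshape(-1)[:len(x)]

x = np.random.randn(64).astype(np.float32)
xq = quantize_nvfp4_block(x)
```

The per-block rescaling is what keeps relative error bounded despite only 8 representable magnitudes: outliers in one block do not crush the resolution of every other block.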
Radical Numerics reposted
Radical Numerics @RadicalNumerics
Sliding window attention (SWA) powers frontier hybrid models for efficiency. Is there something better?

Introducing Phalanx, a faster, higher-quality drop-in replacement for SWA. Phalanx is a new family of hardware- and numerics-aware windowed layers designed for data locality, with jagged, block-aligned windows that map directly to GPUs.

In training, Phalanx delivers 10–40% higher end-to-end throughput at 4K–32K context lengths over optimized SWA hybrids and Transformers by reducing costly inter-warp communication.

Today, we are releasing the technical report, a blog post, and the Phalanx kernels in spear, our research kernel library. We are hiring.
[image attached]
12 replies · 49 reposts · 206 likes · 38.6K views
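The "jagged, block-aligned windows" idea can be illustrated with a toy attention mask. This is not the Phalanx kernel (which is not reproduced in this thread); it only shows the principle that if each query block attends to whole key blocks, every GPU tile of the mask is fully on or fully off, so no tile does partially wasted work.

```python
import numpy as np

def block_aligned_window_mask(seq_len, window, block=128):
    """Boolean mask for a causal, block-aligned (jagged) sliding window.

    Illustrative sketch only. Instead of a per-query window start
    (which cuts diagonally through GPU tiles), the window start is
    rounded out to a key-block boundary, so each (query block, key block)
    tile of the mask is either entirely allowed or entirely skipped,
    apart from causality on the diagonal tiles.
    """
    q = np.arange(seq_len)[:, None]   # query positions
    k = np.arange(seq_len)[None, :]   # key positions
    wb = -(-window // block)          # window size in key blocks, rounded up
    # A query block attends to its own key block plus the previous wb blocks,
    # so every query still sees at least `window` past positions.
    allowed = (k // block) >= (q // block) - wb
    return allowed & (k <= q)         # keep causality

mask = block_aligned_window_mask(seq_len=512, window=256, block=128)
```

Rounding the window out to block edges slightly over-includes keys rather than under-including them, which trades a little extra compute for tiles that map cleanly onto warps.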
Radical Numerics reposted
Michael Poli @MichaelPoli6
We just released the largest open-source diffusion language model (RND1). RND1 is important to me on a personal level: it symbolizes our commitment to open-source exploration of radically different designs for AI at scale, spanning training objectives, architectures, and domains.

There is still so much to do, and together we can move faster. We are not done yet. More on Monday (the next one is for model architecture enthusiasts!).
Radical Numerics @RadicalNumerics

Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B parameters (3B active) and a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to catalyze further research on DLM inference and post-training. We are researchers and engineers (DeepMind, Meta, Liquid, Stanford) building the engine for recursive self-improvement (RSI) and using it to accelerate our own work. Our goal is to let AI design AI. We are hiring.
9 replies · 40 reposts · 331 likes · 41.7K views
Radical Numerics @RadicalNumerics
We're also hiring aggressively. Reach out if you're interested in building automated research environments and agents (AI researchers and SWEs: pre-/mid-/post-training, architecture design, kernels, lots of backend system design, and automation).

Our team is behind the tech for hybrid architectures, Hyena, and Evo. For us, recursive intelligence is a practical system that can be engineered. Our goal is to build full-stack, self-improving AI systems aimed at new applications in science & engineering.
[GIF attached]
16 replies · 5 reposts · 111 likes · 8.7K views
Radical Numerics @RadicalNumerics
Introducing RND1, the most powerful base diffusion language model (DLM) to date.

RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B parameters (3B active) and a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to catalyze further research on DLM inference and post-training.

We are researchers and engineers (DeepMind, Meta, Liquid, Stanford) building the engine for recursive self-improvement (RSI) and using it to accelerate our own work. Our goal is to let AI design AI. We are hiring.
[GIF attached]
103 replies · 255 reposts · 1.4K likes · 845.7K views
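The "30B parameters, 3B active" figure comes from sparse mixture-of-experts routing: each token is processed by only a few of the experts, so only a fraction of the weights participate per forward pass. RND1's exact architecture is not detailed in this thread; the toy top-k MoE layer below (all names and sizes are illustrative) just shows the mechanism.

```python
import numpy as np

def moe_forward(x, experts_w, gate_w, top_k=2):
    """Toy top-k mixture-of-experts layer (illustrative only).

    Each token's router scores pick `top_k` of the experts; only those
    experts' weight matrices are multiplied through, which is why the
    "active" parameter count is a small fraction of the total.
    """
    logits = x @ gate_w                               # (tokens, n_experts) router scores
    top = np.argsort(logits, axis=-1)[:, -top_k:]     # chosen experts per token
    # Softmax over only the selected experts' scores.
    sel = np.take_along_axis(logits, top, axis=-1)
    gates = np.exp(sel - sel.max(-1, keepdims=True))
    gates /= gates.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j, e in enumerate(top[t]):
            out[t] += gates[t, j] * (x[t] @ experts_w[e])  # only top_k experts run
    return out

rng = np.random.default_rng(0)
d, n_experts = 16, 8
x = rng.normal(size=(4, d))
experts_w = rng.normal(size=(n_experts, d, d)) / np.sqrt(d)
gate_w = rng.normal(size=(d, n_experts))
y = moe_forward(x, experts_w, gate_w, top_k=2)
```

With top_k=2 of 8 experts here, each token touches a quarter of the expert weights, mirroring (at toy scale) how a 30B-parameter model can run with roughly 3B active parameters per token.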