Radical Numerics (@RadicalNumerics) - Twitter Profile

Radical Numerics retweeted

We're growing rapidly at @RadicalNumerics and scaling our core teams. Join us in building the next generation of scientific world models.

We're hiring across a few roles, each with significant ownership and cross-functional scope:
- Member of Technical Staff, Post-Training
- Member of Technical Staff, Infrastructure and Training Systems
- Member of Technical Staff, Pretraining Science
- Member of Technical Staff, AI Bio
- Member of Technical Staff, Biosecurity

Our technology brings together numerics, systems engineering, and architecture design to tackle large-scale pretraining on scientific data. Our blogs (see below) give a flavor of the work.

We believe that advancing capabilities must go hand-in-hand with advancing safety and biosecurity. The same systems that design biology must also help defend against it.

Ping me or others in the team if you'd like to learn more. job-boards.greenhouse.io/radicalnumerics

Radical Numerics @RadicalNumerics · Jan 12

Scaling scientific world models requires co-designing architectures, training objectives, and numerics. Today, we share the first posts in our series on low-precision pretraining, starting with NVIDIA's NVFP4 recipe for stable 4-bit training.

Part 1: radicalnumerics.ai/blog/nvfp4-par…
Part 2: radicalnumerics.ai/blog/nvfp4-par…

We cover floating point fundamentals, heuristics, custom CUDA kernels, and stabilization techniques. Future entries will cover custom recipes and results on hybrid architectures.
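For readers new to 4-bit floating point, here is a minimal sketch of block-scaled quantization onto the E2M1 value grid (magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6). It is an illustration only, not the recipe from the linked posts: the function names are hypothetical, the per-block scale is kept in full precision rather than stored in a narrow format, and ties are broken toward the smaller magnitude rather than with hardware rounding rules.

```python
# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block(block):
    """Quantize one block of floats to E2M1 values plus a shared scale.

    Hypothetical helper: the scale maps the block's largest magnitude onto 6.0,
    the largest E2M1 value. The scale stays a Python float here (an assumption
    made for simplicity).
    """
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    scale = amax / 6.0
    codes = []
    for x in block:
        # Nearest grid magnitude; min() resolves ties toward the smaller value.
        mag = min(E2M1_VALUES, key=lambda v: abs(abs(x) / scale - v))
        codes.append(-mag if x < 0 else mag)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from the shared scale and E2M1 codes."""
    return [scale * c for c in codes]


if __name__ == "__main__":
    scale, codes = quantize_block([6.0, -3.0, 0.4, 0.0])
    print(scale, codes, dequantize_block(scale, codes))
```

Small entries in a block with one large outlier lose most of their resolution, which is one reason block-wise scaling and stabilization techniques matter at this precision.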

Radical Numerics retweeted

Garyk Brixi @garykbrixi · Oct 14

Excited to share Phalanx, our new layer for sequence modeling! Each block communicates with its neighbor, like the shield cover of a neighboring hoplite. Phalanx can replace sliding window attention and trains faster than optimized baselines while maintaining quality.

Radical Numerics @RadicalNumerics · Oct 14

More on Phalanx and our research kernel library:
Blog: radicalnumerics.ai/blog/phalanx
Code: github.com/RadicalNumeric…
Report: radicalnumerics.ai/assets/phalanx…

Radical Numerics @RadicalNumerics · Oct 14

Sliding window attention (SWA) is powering frontier hybrid models for efficiency. Is there something better? Introducing Phalanx, a faster, higher-quality drop-in replacement for SWA. Phalanx is a new family of hardware- and numerics-aware windowed layers designed with a focus on data locality and jagged, block-aligned windows that map directly to GPUs. In training, Phalanx delivers 10–40% higher end-to-end throughput at 4K–32K context lengths over optimized SWA-hybrids and Transformers by reducing costly inter-warp communication. Today, we are releasing the technical report, a blog post, and Phalanx kernels in spear, our research kernel library. We are hiring.
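For context on the baseline being replaced, here is a minimal sketch of the mask used by plain causal sliding window attention. It does not show Phalanx itself, and the function name and parameters are hypothetical: each query position attends only to itself and the `window - 1` positions before it.

```python
def sliding_window_mask(seq_len, window):
    """Boolean mask for causal sliding window attention.

    mask[i][j] is True when query position i may attend to key position j,
    i.e. when j is one of the last `window` positions up to and including i.
    """
    return [
        [(i - window < j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    for row in sliding_window_mask(seq_len=4, window=2):
        print("".join("x" if allowed else "." for allowed in row))
```

With a fixed window, the number of attended pairs grows linearly with sequence length instead of quadratically, which is where the efficiency of windowed layers comes from.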

Radical Numerics retweeted

Michael Poli @MichaelPoli6 · Oct 11

We just released the largest open-source diffusion language model (RND1). RND1 is important to me on a personal level: it symbolizes our commitment to open-source exploration of radically different designs for AI at scale — training objectives, architectures, domains. There is still so much to do, and together we can move faster. We are not done yet. More on Monday (the next one is for model architecture enthusiasts!).

Radical Numerics @RadicalNumerics · Oct 9

Thank you @nebiusai, @PrimeIntellect, @LambdaAPI for the compute resources and support.

Radical Numerics @RadicalNumerics · Oct 9

We’re also hiring aggressively. Reach out if you’re interested in building automated research environments and agents (AI researchers and SWEs; pre/mid/post training, architecture design, kernels, lots of backend system design, and automation). Our team is behind the tech for hybrid architectures, Hyena, and Evo. For us, recursive intelligence is a practical system that can be engineered. Our goal is to build full-stack, self-improving AI systems aimed at new applications in science & engineering.

Radical Numerics @RadicalNumerics · Oct 9

Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) and a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to catalyze further research on DLM inference and post-training. We are researchers and engineers (DeepMind, Meta, Liquid, Stanford) building the engine for recursive self-improvement (RSI) and using it to accelerate our own work. Our goal is to let AI design AI. We are hiring.
