Benjamin F Spector
@bfspector
155 posts

stanford cs phd student. i make ml go brr.

Joined October 2020
281 Following · 10.6K Followers
Benjamin F Spector retweeted
will depue @willdepue
megakernels remain underrated. if you haven’t dug into them before, go look them up! flappy seems to be hinting at some really powerful training megakernel stuff, which is sick. ex: a fully contained training megakernel could be great for automated research
Flapping Airplanes @flappyairplanes

(4/5) One thing we’ve built is a “kittens” virtual machine that takes over the whole GPU and allows new kinds of co-optimization. We can go past the traditional sequential kernel model – for example, fusing entire training runs into a single kernel and even weirder stuff.

8 replies · 19 reposts · 514 likes · 52.1K views
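The "traditional sequential kernel model" above means launching one GPU kernel per op. A rough Python sketch of the contrast, using PyTorch's torch.compile as a stand-in for kernel fusion (this is an illustration of the idea, not the "kittens" VM itself):

```python
# Illustration only: eager PyTorch runs each op below (matmul, sub, pow,
# mean) as its own kernel launch -- the sequential kernel model.
# torch.compile fuses the pointwise ops into fewer kernels; a megakernel
# goes further, fusing the entire training loop into a single persistent
# kernel instead of relaunching kernels every step.
import torch

model = torch.nn.Linear(1024, 1024)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

@torch.compile
def loss_fn(x, y):
    return (model(x) - y).pow(2).mean()

x, y = torch.randn(32, 1024), torch.randn(32, 1024)
for _ in range(3):        # a fused training megakernel would absorb this
    loss = loss_fn(x, y)  # loop too, keeping the whole run on the GPU
    loss.backward()
    opt.step()
    opt.zero_grad()
print(loss.item())
```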
Benjamin F Spector retweeted
Hayden Prairie @hayden_prairie
We’ve been thinking a lot about scaling laws, wondering if there is a more effective way to scale FLOPs without increasing parameters. Turns out the answer is YES – by looping blocks of layers during training. We find that predictable scaling laws exist for layer looping, allowing us to use looping to achieve the quality of a Transformer twice the size. Our scaling laws suggest that for a fixed parameter budget, data and looping should be increased in tandem! 🧵👇
[image]
41 replies · 179 reposts · 1.3K likes · 291.3K views
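A minimal sketch of the looping idea in PyTorch (not the authors' code; the block and hyperparameters here are illustrative): one set of transformer-layer weights applied n_loops times per forward pass, so FLOPs scale with the loop count while the parameter count stays fixed.

```python
# A minimal sketch, not the paper's implementation: "layer looping" reuses
# one block of layers several times per forward pass, scaling FLOPs with
# the loop count while the parameter count stays fixed.
import torch
import torch.nn as nn

class LoopedEncoder(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_loops=4):
        super().__init__()
        # One set of weights, applied n_loops times per forward pass.
        self.block = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.n_loops = n_loops

    def forward(self, x):
        for _ in range(self.n_loops):  # ~n_loops x the FLOPs, 1x the params
            x = self.block(x)
        return x

m1, m4 = LoopedEncoder(n_loops=1), LoopedEncoder(n_loops=4)
n_params = lambda m: sum(p.numel() for p in m.parameters())
assert n_params(m1) == n_params(m4)       # same parameter budget
print(m4(torch.randn(2, 16, 256)).shape)  # torch.Size([2, 16, 256])
```

Per the thread, for a fixed parameter budget the scaling laws suggest increasing the loop count and the training data in tandem.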
Benjamin F Spector @bfspector
@dylan522p @SemiAnalysis_ @Kurnalsalts Ah I meant they'd get physically larger to allow for lower power consumption. The whole thing has to run on like 100W or something? Including memory and whatnot. But this was an offhand comment, I am far from confident I am right.
0 replies · 0 reposts · 2 likes · 126 views
SemiAnalysis @SemiAnalysis_
NVIDIA's first GPU on TSMC 3nm shows an unexpected area increase compared to 4nm Blackwell! Thanks to @Kurnalsalts, the NVIDIA GB10 dieshot shows that GPC (12SM) area increased 12.5%, TPC area increased 16.7%, and SM area increased by 13.5%. NVIDIA has confirmed multiple times that both dies on GB10 are on TSMC's more expensive 3nm process, so this significant scaling regression is shocking. What was the Physical Design Team doing when porting Blackwell from 4nm to 3nm?
[image]
29 replies · 47 reposts · 542 likes · 66.7K views
Benjamin F Spector @bfspector
@elliotarledge @mttrdmnd Yeah you want interaction for computation so you need non-integer spin to get Pauli, but you want non-interaction for communication so you can stack all of your nats, and then integer spins are better.
0 replies · 0 reposts · 5 likes · 97 views
Elliot Arledge @elliotarledge
why not just do matmuls in light and keep ops like topk and exp to electrons?
4 replies · 1 repost · 16 likes · 2.3K views
Benjamin F Spector @bfspector
@mcxfrank Definitely interested! Will be down at Stanford tomorrow, any time after 3:30pm PT work?
2 replies · 0 reposts · 24 likes · 2.2K views
Benjamin F Spector @bfspector
@karpathy has been so incredibly generous with his advice, time, and support, and @amspector100 and I are incredibly grateful! Getting to work with great people is, as always, the best part of the job.
Andrej Karpathy @karpathy

A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole other round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today. Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for. The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with (rare) full-stack understanding of LLMs top (math/algorithms) to bottom (megakernels/related), they have a great eye for talent and I think will be able to build something very special. Congrats on the launch and I look forward to what you come up with!

4 replies · 1 repost · 66 likes · 4.6K views
Benjamin F Spector @bfspector
Very grateful and excited to be working together every week!
mark xu @marklxu1

The best part of my job is the privilege to partner with extraordinary founders. It’s especially rewarding when those founders are people you’ve long admired and respected. I’ve known @amspector100 since our college days, and I’ve gotten to know his brother, @bfspector, through the Prod community, where Ben helped shape a generation of founders and young talent. So when Asher mentioned to me on a walk that he and Ben were thinking about starting something together, I could barely contain my excitement. It felt like a moment that had been building for an eternity. Today, that conviction has taken shape as Flapping Airplanes, a new foundational AI research lab led by Ben, Asher, and @aidanmantine, exploring radically more data-efficient approaches to learning. We’re thrilled to co-lead this investment alongside @GVteam and @sequoia, and to partner with my dear friends Ben, Asher, Aidan and the rest of their all-star team.

3 replies · 3 reposts · 68 likes · 41.3K views
levi @levidiamode
@bfspector @HazyResearch If you ever find the time to teach more, it'd be amazing if you could do your own version of MIT 6.S894. The CS336 sections that @tatsu_hashimoto did on GPUs etc. were great, but it feels like a standalone course could go much deeper, especially on current research like ThunderKittens
1 reply · 0 reposts · 2 likes · 167 views
levi @levidiamode
Day 13/365 of GPU Programming: watched this super underrated talk by @bfspector on AI hardware from various levels of abstraction. One of the best GPU resources I've come across since starting my journey. Ben really has a knack for teaching and cuts through all the bs out there
[four images]
levi @levidiamode

Day 12/365 of GPU Programming: studied GPU hierarchy in terms of GPCs, TPCs, SMs, etc. on various Nvidia architectures. Also pretty interesting to see what's on the hardware level vs pure software abstractions

2 replies · 1 repost · 7 likes · 2.2K views
Benjamin F Spector retweeted
Ricursive Intelligence @RicursiveAI
Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at ricursive.com
49 replies · 151 reposts · 1.1K likes · 490.1K views