Benjamin F Spector
@bfspector

stanford cs phd student. i make ml go brr.

Joined October 2020 · 261 Following · 10.5K Followers · 152 posts
Benjamin F Spector @bfspector
@dylan522p @SemiAnalysis_ @Kurnalsalts Ah I meant they'd get physically larger to allow for lower power consumption. The whole thing has to run on like 100W or something? Including memory and whatnot. But this was an offhand comment, I am far from confident I am right.
SemiAnalysis @SemiAnalysis_
NVIDIA's first GPU on TSMC 3nm shows an unexpected area increase compared to 4nm Blackwell! Thanks to @Kurnalsalts, the NVIDIA GB10 dieshot shows that GPC (12 SM) area increased 12.5%, TPC area increased 16.7%, and SM area increased 13.5%. NVIDIA has confirmed multiple times that both dies on GB10 are on TSMC's more expensive 3nm process, so this significant scaling regression is shocking. What was the Physical Design Team doing when porting Blackwell from 4nm to 3nm?
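For scale, a quick back-of-envelope check of how far these figures sit from an ideal node shrink (a rough sketch; the ~1.3x N4-to-N3 density gain is an assumed ballpark, not from the tweet):

```python
# Back-of-envelope: compare GB10's observed 3nm block-area growth against
# a naive ideal shrink. Area deltas come from the dieshot analysis above;
# the nominal 4nm -> 3nm density gain (~1.3x) is an assumed ballpark.
NOMINAL_AREA_SCALE = 1 / 1.3  # assumed ideal N4 -> N3 area ratio

observed_area_growth = {
    "GPC (12 SM)": 1.125,  # +12.5%
    "TPC": 1.167,          # +16.7%
    "SM": 1.135,           # +13.5%
}

for block, growth in observed_area_growth.items():
    gap = growth / NOMINAL_AREA_SCALE
    print(f"{block}: {gap:.2f}x the area a naive shrink would predict")
```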
Benjamin F Spector @bfspector
@elliotarledge @mttrdmnd Yeah, you want interaction for computation, so you need non-integer spin to get Pauli; but you want non-interaction for communication, so you can stack all of your nats, and then integer spins are better.
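The "stack all of your nats" point, sketched with made-up occupancies: Pauli exclusion caps a fermionic mode at one excitation (at most ln 2, about 0.69 nats), while a bosonic mode with mean occupation n can reach the thermal-state entropy (n+1)ln(n+1) - n ln(n), which grows without bound:

```python
import math

def bosonic_mode_entropy_nats(nbar: float) -> float:
    # Max entropy of a bosonic mode with mean photon number nbar
    # (the thermal state): (n+1)ln(n+1) - n*ln(n), in nats.
    if nbar == 0:
        return 0.0
    return (nbar + 1) * math.log(nbar + 1) - nbar * math.log(nbar)

FERMIONIC_CAP = math.log(2)  # occupancy limited to {0, 1} by Pauli exclusion

for nbar in [0.1, 1, 10, 100]:
    print(f"nbar={nbar:>5}: boson {bosonic_mode_entropy_nats(nbar):.2f} nats "
          f"vs fermion cap {FERMIONIC_CAP:.2f} nats")
```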
Elliot Arledge @elliotarledge
why not just do matmuls in light and keep ops like topk and exp to electrons?
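Concretely, that split would route the dense linear algebra to the optical path and keep the nonlinearities electronic; a toy sketch (photonic_matmul is a hypothetical accelerator call, stubbed with NumPy so the sketch runs):

```python
import numpy as np

def photonic_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Hypothetical offload to an optical matrix-multiply unit;
    # stubbed with NumPy so the sketch runs end to end.
    return a @ b

def hybrid_attention_scores(x, w_q, w_k, k):
    # Dense linear algebra (the bulk of the FLOPs) goes to the "optical"
    # path; exp and top-k stay on the electronic side.
    q = photonic_matmul(x, w_q)
    keys = photonic_matmul(x, w_k)
    scores = photonic_matmul(q, keys.T)
    topk_idx = np.argsort(scores, axis=-1)[:, -k:]             # electronic top-k
    topk = np.take_along_axis(scores, topk_idx, axis=-1)
    weights = np.exp(topk - topk.max(axis=-1, keepdims=True))  # electronic exp
    return topk_idx, weights / weights.sum(axis=-1, keepdims=True)

x = np.random.randn(8, 16)
idx, w = hybrid_attention_scores(x, np.random.randn(16, 16), np.random.randn(16, 16), k=4)
```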
Benjamin F Spector @bfspector
@mcxfrank Definitely interested! I'll be down at Stanford tomorrow; does any time after 3:30pm PT work?
Benjamin F Spector @bfspector
@karpathy has been so incredibly generous with his advice, time, and support, and @amspector100 and I are incredibly grateful! Getting to work with great people is, as always, the best part of the job.
Andrej Karpathy @karpathy

A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole new round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today.

Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for.

The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with (rare) full-stack understanding of LLMs from top (math/algorithms) to bottom (megakernels/related); they have a great eye for talent and I think will be able to build something very special. Congrats on the launch and I look forward to what you come up with!

Benjamin F Spector @bfspector
Very grateful and excited to be working together every week!
mark xu @marklxu1

The best part of my job is the privilege to partner with extraordinary founders. It’s especially rewarding when those founders are people you’ve long admired and respected. I’ve known @amspector100 since our college days, and I’ve gotten to know his brother, @bfspector, through the Prod community, where Ben helped shape a generation of founders and young talent. So when Asher mentioned to me on a walk that he and Ben were thinking about starting something together, I could barely contain my excitement. It felt like a moment that had been building for an eternity. Today, that conviction has taken shape as Flapping Airplanes, a new foundational AI research lab led by Ben, Asher, and @aidanmantine, exploring radically more data-efficient approaches to learning. We’re thrilled to co-lead this investment alongside @GVteam and @sequoia, and to partner with my dear friends Ben, Asher, Aidan and the rest of their all-star team.

levi @levidiamode
@bfspector @HazyResearch If you ever find the time to teach more, it'd be amazing if you could do your own version of MIT 6.S894. The CS336 sections that @tatsu_hashimoto did on GPUs etc. were great, but it feels like a standalone course could go much deeper, especially on current research like ThunderKittens.
levi @levidiamode
Day 13/365 of GPU Programming. Watched this super underrated talk by @bfspector on AI hardware from various levels of abstraction. One of the best GPU resources I've come across since starting my journey. Ben really has a knack for teaching and cuts through all the bs out there.
levi @levidiamode

Day 12/365 of GPU Programming. Studied GPU hierarchy in terms of GPCs, TPCs, SMs, etc. on various Nvidia architectures. Also pretty interesting to see what's on the hardware level vs pure software abstractions.

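For the Day 12 material: the flat SM count (though not the GPC/TPC grouping, which you have to look up per architecture) is queryable at runtime; a quick sketch using PyTorch's device-properties API (requires a CUDA-capable GPU):

```python
import torch

# SMs are the unit the CUDA runtime actually reports; GPCs and TPCs are
# higher-level groupings that are not exposed here.
props = torch.cuda.get_device_properties(0)
print(f"GPU:        {props.name}")
print(f"SM count:   {props.multi_processor_count}")
print(f"Memory:     {props.total_memory / 2**30:.1f} GiB")
print(f"Capability: sm_{props.major}{props.minor}")
```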
Benjamin F Spector retweeted
Ricursive Intelligence @RicursiveAI
Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at ricursive.com
Benjamin F Spector retweeted
Azalia Mirhoseini @Azaliamirh
Thrilled to share that @annadgoldie and I are launching @RicursiveAI, a frontier lab enabling recursive self-improvement through AIs that design their own chips. Our vision for transforming chip design began with AlphaChip, an AI for layout optimization used to design four generations of TPUs, data center CPUs, and smartphones. AlphaChip offered a glimpse into a future where AI designs the silicon that fuels it. Ricursive extends this vision to the entire chip stack, building AI that architects, verifies, and implements silicon, enabling models and chips to co-evolve in a tight loop. We sat down with WSJ’s @berber_jin1 to discuss Ricursive: wsj.com/tech/this-ai-s…
Ricursive Intelligence @RicursiveAI

Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at ricursive.com

Benjamin F Spector retweeted
Owen Dugan @OwenDugan
Part 2 of our MLPs blog post is out! 👀 This time, we’re here to tell you the story 📖 of our quest for a construction that: ✅ Handles general embeddings 🌐 ✅ Asymptotically matches the information-theoretic limit 📊📈 ✅ Is usable within transformers 🤖✨
Owen Dugan @OwenDugan

Happy 🦃 Thanksgiving weekend! 🍂 This year, we cooked up a new recipe for juicy fact-storing MLPs. Instead of picking apart trained models, we asked: Can we construct fact-storing MLPs from scratch? 🤔 Spoiler: we can & we figured out how to slot these hand-crafted MLPs into Transformer blocks as modular fact stores! 🧩 New work with @garctrob @ronnygjunkins @jerrywliu @dylan_zinsley @EyubogluSabri Atri Rudra @HazyResearch! 🧵👇

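The flavor of constructing (rather than training) a fact-storing MLP, under the strong simplifying assumption of near-orthogonal keys (the post's actual construction handles general embeddings; this toy sketch does not):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_facts = 256, 100

# Random unit-norm keys are near-orthogonal in high dimension.
keys = rng.standard_normal((n_facts, d))
keys /= np.linalg.norm(keys, axis=1, keepdims=True)
values = rng.standard_normal((n_facts, d))  # the "facts" to store

# Hand-built one-hidden-layer MLP: W1 matches the query against each key,
# the ReLU threshold suppresses non-matching keys, W2 emits the value.
W1, b1, W2 = keys, -0.5 * np.ones(n_facts), values

def mlp(x):
    h = np.maximum(x @ W1.T + b1, 0.0)  # ~one active unit per stored key
    return h @ W2

recovered = mlp(keys[7])
cos = recovered @ values[7] / (np.linalg.norm(recovered) * np.linalg.norm(values[7]))
print(f"cosine(recovered, values[7]) = {cos:.3f}")  # ~1.0: fact retrieved
```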
Benjamin F Spector retweeted
Avanika Narayan @Avanika15
The U.S.–China AI race won’t be decided by who builds the most datacenters, but by who deploys the most intelligence. We call this Gross Domestic Intelligence (GDI): intelligence per watt × usable power. If the U.S. activates its dense installed base of local AI accelerators in a hybrid local–cloud system, it could add ~30–40% inference capacity and ≈2-4× GDI for single-turn chat and reasoning queries without building any new datacenters or grid infrastructure. Winning the GDI race means treating local compute as critical infrastructure and making hybrid inference the default. (1/N)
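The GDI definition is just a product, so the thread's headline multiplier can be sanity-checked with toy numbers (everything below is an illustrative assumption except the formula itself and the 2-4x target range):

```python
def gdi(intelligence_per_watt: float, usable_power_watts: float) -> float:
    # Gross Domestic Intelligence as defined in the thread:
    # intelligence per watt x usable power.
    return intelligence_per_watt * usable_power_watts

# Normalized baseline: all cloud.
cloud = gdi(intelligence_per_watt=1.0, usable_power_watts=1.0)

# Hybrid: keep the cloud, activate the installed base of local accelerators.
# Assumed: local inference on single-turn chat/reasoning queries delivers
# ~5x the intelligence per watt, drawing ~0.35x the cloud's usable power.
hybrid = cloud + gdi(intelligence_per_watt=5.0, usable_power_watts=0.35)

print(f"hybrid GDI uplift: {hybrid / cloud:.2f}x")  # 2.75x, inside the 2-4x range
```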