Pengming Wang
@PengmingWang

328 posts

Founding team @poolsideai | prev @DeepMind, PhD @Cambridge_Uni, FunSearch co-author

London, England · Joined July 2021
253 Following · 355 Followers
Pengming Wang@PengmingWang·
This is simple, but effective. Cool to see the ~13% speed gain in the backward pass just by offloading activations to host memory, thanks to NVLink-C2C
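The technique the tweet describes can be sketched with PyTorch's saved-tensor hooks: tensors stashed for the backward pass are parked in host memory during forward and copied back on demand during backward. This is a minimal CPU-runnable sketch using the real `torch.autograd.graph.save_on_cpu` API; the ~13% speedup in the tweet additionally depends on NVLink-C2C hardware bandwidth, which this toy example does not reproduce.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
x = torch.randn(8, 64, requires_grad=True)

# Every tensor saved for backward is moved to host memory as soon as it is
# produced, and transferred back only when backward needs it. On GPU, pass
# pin_memory=True so the host<->device copies can overlap with compute.
with torch.autograd.graph.save_on_cpu():
    loss = model(x).pow(2).mean()

loss.backward()  # gradients match the non-offloaded computation exactly
```

Offloading is exact (unlike recomputation, nothing is re-run); the cost is transfer time, which fast host links like NVLink-C2C largely hide.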
Pengming Wang retweeted
poolside@poolsideai·
Training AI models requires storing temporary data mid-process. That data sits in GPU memory taking up space until it's needed. The standard fix has always been to delete it and redo the work later. It works, but it's wasteful.
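The "delete it and redo the work later" fix the post describes is activation (gradient) checkpointing. A minimal sketch with PyTorch's `torch.utils.checkpoint`: the wrapped block frees its intermediate activations after the forward pass and re-runs its forward during backward to rebuild them, trading extra compute for memory.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

block = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 128))
x = torch.randn(4, 128, requires_grad=True)

# Intermediate activations inside `block` are not kept in memory;
# use_reentrant=False selects the recommended modern implementation.
y = checkpoint(block, x, use_reentrant=False)

# During backward, `block`'s forward runs a second time to regenerate the
# activations needed for its gradients.
y.sum().backward()
```

The gradients are identical to an uncheckpointed run; only the memory/compute trade-off changes, which is why offloading (as in the post above) can be an attractive alternative when a fast host link is available.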
Pengming Wang@PengmingWang·
Corollary: There is an optimal amount of drama you want, and it's >0
Pengming Wang@PengmingWang·
LR is a slider for how much drama you want
Pengming Wang@PengmingWang·
I'll be at NeurIPS this year. Keen to meet old and new friends, looking forward to catching up!
Elana Simon@ElanaPearl·
elanapearl.github.io/blog/2025/the-… It's a debugging detective story where you follow along the reasoning behind each step and solve it as we go. It also explains ML & PyTorch concepts as they become necessary to understand what's breaking, why, and how to fix it 🔎
Elana Simon@ElanaPearl·
New blog post: the bug that taught me more about PyTorch than years of using it. It started with a simple training loss plateau... and ended with digging through optimizer states, memory layouts, kernel dispatch, and finally understanding how PyTorch works!
Pengming Wang retweeted
Eiso Kant@eisokant·
We believe that to compete at the frontier, you have to own the full stack: from dirt to intelligence. Today we’re announcing two major unlocks for our mission to AGI:
1. We're partnering with @CoreWeave and have 40,000+ NVIDIA GB300s secured. First capacity comes online starting Dec ’25.
2. Project Horizon: Poolside is developing a vertically integrated 2GW AI campus in West Texas to secure our medium-term scale. On this site @CoreWeave will be our anchor tenant for the first 250MW phase.
Learn more about it on our blog: poolside.ai/blog/announcin… Or on WSJ: wsj.com/tech/ai/a-gian…
Pengming Wang@PengmingWang·
In the limit, evaluations are the ~only thing that matters. When models are self-improving, and every metric can be hill climbed, picking the metric becomes the most important thing. Evals will shift from being "writing unit tests" for research to being the *main thing*
Pengming Wang@PengmingWang·
We've not been very public about our progress on model building, but I fully believe poolside will be the next lab joining the frontier. We're now sharing a bit more about how we're doing this, and about the systems-first approach we're taking with our model factory.
Pengming Wang@PengmingWang·
We've spent quite some time at poolside thinking about this, and recently put down some words on how we're approaching this: poolside.ai/vision/research
Pengming Wang@PengmingWang·
Test-time compute is powerful, but in its current form there is a lack of "harmony" with pre-training. Models feel split-brained: they're either deeply overthinking, with no trust in their own "common sense", or they latch onto the nearest neighbour of meaning without deliberation