Daniel Norberg

1.3K posts


@dnorberg

Software engineer at https://t.co/FFQzyWhc2V. Previously at https://t.co/EjBbnKeOPg. Performance and scalability fanatic. [email protected]

Stockholm, Sweden · Joined November 2008
204 Following · 396 Followers
Daniel Norberg retweeted
Charles 🎉 Frye @ ICLR '26@charles_irl·
Alex Strick van Linschoten@strickvl

GPU scheduling looks like a technical problem. It's actually an org design problem wearing infrastructure clothes. The queue is just where the politics become visible.

Here's what I mean:
→ Demand side (what's waiting, who's next) is owned by ML platform teams optimising for experiment velocity
→ Supply side (what capacity exists) is owned by infra/FinOps optimising for utilisation and spend

When these collide, the "scheduling problem" becomes: who gets to define the rules? The formality of your answer scales with org size:

At 5-10 people: Slack messages work. "Please don't use GPU 0 today."

At 20-50 people: You need basic queuing. Jobs should wait, not crash. (Adept learned this the hard way. Their workflows would crash instead of queue when Slurm couldn't satisfy a 240-GPU request.)

At 100+ people: You need explicit quotas. But quotas reveal waste. One bank had GPUs "allocated" at <35% utilisation because teams reserved capacity "just in case."

At scale: You need borrowing + preemption contracts. Uber's approach: guarantee a baseline, allow opportunistic borrowing, reclaim with preemption. But this only works if users understand when they're running on borrowed time.

The pattern: each governance transition happens when the previous approach's failure modes become intolerable.

The winning move isn't a smarter algorithm. It's making allocation legible: baselines everyone can plan around, borrowing rules that don't surprise people, and preemption contracts that feel fair.

Priority is a product decision wearing an infrastructure costume.

1
2
47
8.9K
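The baseline / borrowing / preemption contract the thread above describes can be sketched in a few lines. This is an illustrative toy under assumed semantics, not Uber's (or anyone's) real scheduler: the `Team`/`Cluster` classes, quota numbers, and the "baseline"/"borrowed"/"queued" return values are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    baseline: int          # GPUs this team is guaranteed
    in_use: int = 0        # GPUs currently allocated to it

@dataclass
class Cluster:
    capacity: int
    teams: dict = field(default_factory=dict)

    def add_team(self, team):
        self.teams[team.name] = team

    def free(self):
        return self.capacity - sum(t.in_use for t in self.teams.values())

    def request(self, name, gpus):
        """Grant GPUs from baseline or from idle capacity.
        Returns 'baseline', 'borrowed', or 'queued', so callers always
        know whether they are running on borrowed time."""
        team = self.teams[name]
        if gpus > self.free():
            return "queued"            # wait, don't crash
        team.in_use += gpus
        return "baseline" if team.in_use <= team.baseline else "borrowed"

    def reclaim(self, name, gpus_needed):
        """A team claiming its guaranteed baseline may preempt GPUs that
        other teams hold above *their* baselines. Returns what was taken."""
        preempted = []
        for other in self.teams.values():
            if other.name == name or gpus_needed <= 0:
                continue
            surplus = other.in_use - other.baseline
            if surplus > 0:
                take = min(surplus, gpus_needed)
                other.in_use -= take
                gpus_needed -= take
                preempted.append((other.name, take))
        return preempted
```

With two 8-GPU-baseline teams on a 16-GPU cluster, a 12-GPU request from one team comes back "borrowed"; the other team's 8-GPU request queues rather than crashing, and it can reclaim its baseline by preempting only the first team's surplus. The point of the sketch is the legibility, not the algorithm: every answer tells the caller which regime they are in.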
Petter Weiderholm@pweiderholm·
Got all our stuff from Singapore - current status = unpacking 203(!) boxes 😅
[tweet media]
3
0
1
124
Maxime Peabody@MaximePeabody·
I spent the past week testing Modal out on a couple of projects - it's nice! I'm curious, what's the target market? For side projects it seems great, because it's way easier to deploy code to GPUs on Modal than anything else I've tried, especially AWS/GCP. But for scale, I noticed that a Volume, for example, only handles ~50k files, which seems pretty low?
1
0
0
384
Erik Bernhardsson@bernhardsson·
It always felt unsatisfactory to me that the best way to catch a large class of bugs in software is to annotate lots of expressions with types. Like, it works, but what’s the platonic ideal of automated bug catching?
24
0
29
13.9K
Erik Bernhardsson@bernhardsson·
We just spent an engineer-week or two at @modal_labs working on a performance optimization that basically boiled down to a few lines making all file operations 2x faster! (Deployed on Friday evening so we get a full weekend of data to analyze!)
[tweet media]
4
5
117
21.6K
Adam Laiacano@adamlaiacano·
The ironic thing is that I was giving an impromptu talk about my days of designing atomic clocks approx 19 hours ago, as this email was sent
1
0
1
0
Rohan Singh@rohansingh·
Sentences I did not expect to use in the year 2022: "The Russians have captured Chernobyl."
1
0
5
0