Dwrakseh Petal

112 posts


@phildunphag

to disconnect ~

Joined November 2023
66 Following · 24 Followers
Dwarkesh Patel @dwarkesh_sp
Why @michael_nielsen disagrees with the view that science will keep getting harder and harder as low-hanging fruit is picked:
4 replies · 7 reposts · 65 likes · 57.1K views
Dwarkesh Patel @dwarkesh_sp
Wrote up some flashcards and practice problems to help myself retain what @reinerpope taught. Hope it's helpful to you too! Suggest more below and I'll add them. reiner-flashcards.vercel.app
Quoted tweet: Dwarkesh Patel @dwarkesh_sp – the blackboard-lecture announcement with @reinerpope, reproduced in full below.

35 replies · 149 reposts · 2.1K likes · 237.9K views
Dwrakseh Petal @phildunphag
@dwarkesh_sp @reinerpope I share my live trading alerts (entry and exit points) on WhatsApp. Join for free! ✅ ➡️ Copy the search query and reply with "555" to WhatsApp: + 13026663796 👉 🔗: api.whatsapp.com/send/?phone=13… 🎥 - Daily Live Trading 📖 - Trading Recap ☢️ - Personal Strategy
0 replies · 0 reposts · 0 likes · 15 views
Dwarkesh Patel @dwarkesh_sp
Did a very different format with @reinerpope – a blackboard lecture where he walks through how frontier LLMs are trained and served. It's shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk.

It's a bit technical, but I encourage you to hang in there – it's really worth it. There are fewer than a handful of people who understand the full stack of AI, from chip design to model architecture, as well as Reiner does. It was a real delight to learn from him. Recommend watching this one on YouTube so you can see the chalkboard.

0:00:00 – How batch size affects token cost and speed
0:31:59 – How MoE models are laid out across GPU racks
0:47:02 – How pipeline parallelism spreads model layers across racks
1:03:27 – Why Ilya said, “As we now know, pipelining is not wise.”
1:18:49 – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
1:32:52 – Deducing long-context memory costs from API pricing
2:03:52 – Convergent evolution between neural nets and cryptography
152 replies · 601 reposts · 6.6K likes · 1.3M views
Dwarkesh Patel @dwarkesh_sp
.@AdamMarblestone thinks it might cost ~low billions of dollars to map the human brain connectome. The benefit is getting answers about the brain’s secret sauce: especially, why are humans so much more sample- and energy-efficient? If labs are going to be spending trillions of dollars on compute by the end of the decade, Adam's pitch is: give him 1/100th of that to actually figure out these big questions about intelligence.
17 replies · 27 reposts · 193 likes · 23.7K views
Dwrakseh Petal @phildunphag
@dwarkesh_sp I share my live trading alerts (entry and exit points) on WhatsApp. Join for free! ✅ ➡️ Copy the search query and reply with "555" to WhatsApp: + 13026663796 👉 🔗: api.whatsapp.com/send/?phone=13… 🎥 - Daily Live Trading 📖 - Trading Recap ☢️ - Personal Strategy
0 replies · 0 reposts · 0 likes · 14 views
Dwarkesh Patel @dwarkesh_sp
There's a quadrillion-dollar question at the heart of AI: why are humans so much more sample-efficient than LLMs?

There are three possible answers:
1. Architecture and hyperparameters (aka transformer vs whatever ‘algo’ cortical columns are implementing)
2. Learning rule (backprop vs whatever the brain is doing)
3. Reward function

@AdamMarblestone believes the answer is the reward function. ML likes to use pretty simple loss functions, like cross-entropy, because they are easy to work with. But they might be too simple for sample-efficient learning. Adam thinks that, in humans, the large number of highly specialized cells in the ‘lizard brain’ might actually be encoding information for sophisticated loss functions, used for ‘training’ the more sophisticated areas like the cortex and amygdala.

Like: the human genome is barely 3 gigabytes (compare that to the TBs of parameters that encode frontier LLM weights). So how can it include all the information necessary to build highly intelligent learners? Well, if the key to sample-efficient learning resides in the loss function, even very complicated loss functions can still be expressed in a couple hundred lines of Python code.
189 replies · 168 reposts · 1.9K likes · 942.4K views
Dwrakseh Petal @phildunphag
Sipping iced coffee, flipping through a book, and letting the world slow down—this is my kind of perfect afternoon.
[image attached]
0 replies · 1 repost · 0 likes · 16 views
Dwrakseh Petal retweeted
Derek Quick-Assistant @MadiTalbot
"Grad caps up, hearts full—#ClassOf2024, we did the thing! Grateful for every lecture, late-night study sesh, and inside joke. On to the next adventure! "
0 replies · 1 repost · 0 likes · 5 views
Dwrakseh Petal retweeted
謙謙 @qianqia43048051
"Neighborly win! Shared grill, laughs, and grilled corn with Mr. Li next door—small moments make the block feel like family. Who’s your favorite neighbor "
[image attached]
0 replies · 1 repost · 1 like · 7 views
Dwrakseh Petal @phildunphag
"10 mins daily = 1 language win Learn 3 high-frequency phrases (e.g., 'I’m tied up', 'It’s a steal') → use them tonight! Small steps = big progress. #LingQTips #LanguageHacks"
[image attached]
0 replies · 0 reposts · 2 likes · 3 views