Just🧊

3.6K posts

Just🧊

@kwekujunior_

Doing what excites |👨🏽‍💻

Manifold · Joined July 2014
772 Following · 359 Followers
Andrej Karpathy @karpathy
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
7.6K
10.7K
142.6K
24.5M
Just🧊 @kwekujunior_
Driving from the East Coast to the West Coast of the United States. A little crazy, but worth it.
1
0
0
37
Just🧊 @kwekujunior_
You’re supposed to be celebrating your birthday, I’m told. 🥂
0
0
0
72
Just🧊 retweeted
Kyle Chan @kyleichan
Fully autonomous car racing at speeds of up to 250 km/h (155 mph) at A2RL in Abu Dhabi. $2.25 million prize.
17
56
405
191.8K
Just🧊 retweeted
Naval @naval
There’s no point in learning custom tools, workflows, or languages anymore.
954
994
15.7K
1.5M
Just🧊 retweeted
Paul Azunre @pazunre
Announcing Speech Recognition and Generation from @KhayaAI for 32 African Langs covering ~540 million people!! Live demo in comments. See video for demo of Speech Recognition for Southern Ghanaian Langs. @KhayaAI is the only AI covering all government sponsored Ghana langs 🔥
114
1.1K
3.1K
158.7K
Just🧊 retweeted
Khurram Javed @kjaved_
The Dwarkesh/Andrej interview is worth watching. Like many others in the field, my introduction to deep learning was Andrej’s CS231n. In this era when many are involved in wishful thinking driven by simple pattern matching (e.g., extrapolating scaling laws without nuance), it’s refreshing to hear an influential voice that is tethered to reality.

One clarification for the podcast is that when Andrej says humans don’t use reinforcement learning, he is really saying humans don’t use returns as learning targets. His example of LLMs struggling to learn to solve math problems from outcome-based rewards also elucidates the problem with learning directly from returns.

Fortunately for RL, this exact problem is solved by temporal difference (TD) learning. All sample-efficient RL algorithms that show human-like learning (e.g., sample-efficient learning on Atari, and our work on learning from experience directly on a robot) rely on TD learning.

Now Andrej is not primarily an RL person; he is looking at RL through the lens of LLMs these days, and all RL done in LLMs uses returns as targets, so it’s understandable that he is assuming that RL is all about learning from observed returns. But this assumption leads him to the incorrect conclusion that we need process-based dense rewards for RL to work.

If you embrace TD learning, then you don’t necessarily need a dense reward. Once you have learned a value function that encodes useful knowledge about the world, you can learn on the fly in the absence of rewards, just like humans and animals. This is possible because in TD learning there is no difference between learning from an unexpected reward and learning from an unexpected change in perceived value.
Dwarkesh Patel @dwarkesh_sp

The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self driving took so long
1:57:08 – Future of education
Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!

14
45
447
196.5K
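
The distinction Khurram Javed's post draws between using observed returns as learning targets and using TD targets can be sketched in a few lines. This is a minimal illustration, not code from the post or the podcast: the deterministic chain environment, the function names (`run_episode`, `monte_carlo_update`, `td0_update`, `td_error`), and the values of `alpha` and `gamma` are all assumptions chosen for the example.

```python
def run_episode(n_states):
    """Hypothetical chain: visit states 0..n-1 in order.
    Reward is sparse: 1.0 on the final transition, 0.0 elsewhere."""
    return [(s, 1.0 if s == n_states - 1 else 0.0) for s in range(n_states)]


def monte_carlo_update(values, episode, alpha, gamma):
    """Return-as-target learning: wait for the episode to end, then move
    each state's value toward the discounted return observed from it."""
    g = 0.0
    for state, reward in reversed(episode):
        g = reward + gamma * g
        values[state] += alpha * (g - values[state])


def td_error(values, state, reward, next_state, gamma):
    """TD(0) error: reward plus discounted next-state value, minus the
    current estimate. next_state=None marks the end of the episode."""
    next_value = 0.0 if next_state is None else values[next_state]
    return reward + gamma * next_value - values[state]


def td0_update(values, episode, alpha, gamma):
    """TD(0) learning: update after every transition, bootstrapping from
    the current value estimate of the next state."""
    for i, (state, reward) in enumerate(episode):
        next_state = episode[i + 1][0] if i + 1 < len(episode) else None
        values[state] += alpha * td_error(values, state, reward, next_state, gamma)


if __name__ == "__main__":
    n, alpha, gamma = 4, 0.5, 0.9
    mc_values, td_values = [0.0] * n, [0.0] * n
    for _ in range(200):
        episode = run_episode(n)
        monte_carlo_update(mc_values, episode, alpha, gamma)
        td0_update(td_values, episode, alpha, gamma)
    print("Monte Carlo:", mc_values)
    print("TD(0):      ", td_values)

    # A zero-reward transition still yields a learning signal when the
    # next state's value is already known to be high.
    known = [0.0, 0.9, 1.0]
    print("TD error with no reward:", td_error(known, 0, 0.0, 1, gamma))
```

In this toy setting both methods settle on the same values, so the sketch does not show the sample-efficiency gap the post describes. What it does show is the last point of the post: the final call produces a nonzero TD error on a transition with zero reward, because the error comes from the change in estimated value alone.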