tokenbender

13.8K posts


@tokenbender

Sparse and efficient • Deus eXperiments • 🇮🇳

Joined July 2014
916 Following · 12.2K Followers
wetbrain @0xVita
@tokenbender I will do another write-up on this episode when I get the chance, like I did for the Reiner Pope episode. It was so enjoyable to go down the rabbit hole.
tokenbender @tokenbender
i have kinship with anyone who has a great time consuming such content. i know what i’m watching tonight.
Dwarkesh Patel @dwarkesh_sp

New blackboard lecture w/ @ericjang11. He walks through how to build AlphaGo from scratch, but with modern AI tools.

Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play. You have to go back to 2017 to get insight into how the more general AIs of the future might learn.

Once he explained how AlphaGo works, it gave us the context to discuss how RL works in LLMs and how it could work better: naive policy-gradient RL has to figure out which of the 100k+ tokens in your trajectory actually got you the right answer, while AlphaGo's MCTS suggests a strictly better action every single move, giving you a training target that sidesteps the credit assignment problem. The way humans learn is surely closer to the second.

Eric also kickstarted an Autoresearch loop on his project. It was very interesting to discuss which parts of AI research LLMs can already automate pretty well (implementing and running experiments, optimizing hyperparameters) and which they still struggle with (choosing the right question to investigate next, escaping research dead ends). Informative to all the recent discussion about when we should expect an intelligence explosion, and what it would look like from the inside.

Timestamps:
0:00:00 – Basics of Go
0:08:06 – Monte Carlo Tree Search
0:31:53 – What the neural network does
1:00:22 – Self-play
1:25:27 – Alternative RL approaches
1:45:36 – Why doesn't MCTS work for LLMs
2:00:58 – Off-policy training
2:11:51 – RL is even more information inefficient than you thought
2:22:05 – Automated AI researchers
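The credit-assignment contrast in the tweet can be sketched in a few lines. This is a toy illustration (not code from the lecture): the visit counts below are made up, and a real AlphaGo-style system would derive them from actual tree search.

```python
import numpy as np

# Toy trajectory of T actions ending in a single sparse terminal reward.
T = 5
R = 1.0

# Naive policy gradient: every action in the trajectory is credited with
# the same scalar return R, so the learner must figure out on its own
# which of the T actions actually earned the reward.
pg_targets = [R] * T

# MCTS-style training: search at each state yields an improved policy
# (normalized visit counts), giving a dense per-move target instead of
# one trajectory-level scalar. These counts are hypothetical.
visit_counts = [
    np.array([10, 85, 5]),   # search strongly prefers action 1 here
    np.array([60, 30, 10]),
    np.array([20, 20, 60]),
    np.array([5, 90, 5]),
    np.array([33, 33, 34]),
]
mcts_targets = [v / v.sum() for v in visit_counts]

print("policy-gradient target per step:", pg_targets)
print("MCTS target at step 0:", mcts_targets[0])
```

The point of the sketch: the first signal is one number smeared across the whole trajectory, while the second is a full distribution at every step, which is why the tweet calls it a training target that sidesteps credit assignment.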

Isha @slowdownisha
@tokenbender stanford lectures❌ dwarkesh podcast✅
tokenbender @tokenbender
going to write a lot of articles, building toward a holistic view of info representation and sparsity. it would be a series of observations pointing at certain ideas, some proven, some unproven. ultimately i wish to show you what i have been caring about so much lately, and why.
tokenbender reposted
Tensor-Slayer @TensorSlay
Open-sourced Hermes Agent Mobile Client for Android. It's a native Android client for running your existing Hermes Agents on your PC/VPS. Play Store and App Store review is pending, but you can try the app on Android now.
tokenbender @tokenbender
@eliebakouch @auto_grad_ yeah one can say "DSA is just learned top k sparse attention over kv cache" if they really want to reduce the idea to existing ones.
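The "learned top-k sparse attention over kv cache" framing can be sketched minimally. This is a toy caricature, not DSA's actual mechanism: it scores every cached key densely and then keeps the top-k, whereas a real sparse-attention design would use a learned, cheaper indexer to pick the subset.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=4):
    """Single-query toy: score all cached keys, keep only the top-k,
    and attend over that subset of the KV cache."""
    scores = K @ q / np.sqrt(q.shape[0])    # (T,) attention logits
    top = np.argpartition(scores, -k)[-k:]  # indices of the k best keys
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                            # softmax over the kept subset only
    return w @ V[top]                       # (d,) attended output

rng = np.random.default_rng(0)
T, d = 32, 8                                # cache length, head dim
q = rng.normal(size=d)
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
out = topk_sparse_attention(q, K, V, k=4)
print(out.shape)  # (8,)
```

Compute over the cache drops from O(T) to O(k) in the attention itself, which is the "sparse" part of the reduction being debated.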
elie @eliebakouch
@tokenbender @auto_grad_ i really do agree with that. also, for instance, would you consider things like MLA/DSA "novel"? MoE?
tokenbender @tokenbender
@auto_grad_ @eliebakouch models currently explore the locality of an idea a lot more thoroughly than they hop to distinct ones, but i am quite clear that they can be taught to search both near and far in idea basins.
Ishaan @auto_grad_
take you for example: the concept of avataRL, an experiment on RL pretraining. why don't llms try that out in autoresearch, and why do they only go for sweeping? why don't they try to change the objective (note, not the benchmark/evals) rather than just maximizing the given one? and there are many such examples in our case. and if you ask why a lot of people don't have meaningful research contributions, i say that's because people are limited by a ton more stuff than models are (compute). given time and compute, i think people can produce more meaningful contributions.
tokenbender @tokenbender
@auto_grad_ @eliebakouch curious, how many truly novel research contributions do people you know have? and by that i mean stuff that is not interpolation or combination of methods that already exist, as you described in your earlier reply.
Ishaan @auto_grad_
@tokenbender @eliebakouch i don't think so. humans have always had the capability of coming up with something new (not only in tech but in general) out of nowhere; can't say the same for llms.
tokenbender @tokenbender
@auto_grad_ @eliebakouch novelty is an overloaded term, something that most humans struggle with as well. difference here is models are actually going to get better at it in a short amount of time.
Ishaan @auto_grad_
models are pretty good at interpolating and combining stuff; that's why they might be good at optimizing solved problems. but when it comes to extrapolating info (PI even quoted this, i.e. novelty checks) they're not that good. in fact, nowhere close to the concept of "ideas". that's why autoresearch seems good at first, but then it's all combination sweeps.
🎭 @deepfates
Has anybody got an agent running their life? Or like a personal assistant? Or co-founder mode that you actually like? I'm skeptical of memory stuff for spiral reasons, but it would be nice to have the agent remember things for me instead of the other way around.
tokenbender @tokenbender
@broadfield_dev same here. but now agent deals with its own mess so i just care about features and experience.
tokenbender @tokenbender
@JohnThilen i agree with you. better phrasing would be: this is the end of "research as we know it".
John Thilén @JohnThilen
@tokenbender Don't you think research will become more advanced and require about as much effort? After all, we still haven't even explored the galaxy.
tokenbender @tokenbender
"just text your agent" has been the sota prompt engineering for a while in case you didn't notice.
tokenbender @tokenbender
@secemp9 inb4 anthropic bans this. really cool name.
secemp @secemp9
May I present Elwood, a Claude SDK replacement without subprocess/claude -p usage, done through AST instrumentation of an existing Claude Code installation (blog, GitHub, and npm links in replies). 1/4
tokenbender @tokenbender
@mayfer novelty remains a challenge for humans too, except for those who are crème de la crème. say goodbye, as by next year we will have lost some ground in this area as well.
tokenbender @tokenbender
@eliebakouch haven't checked the trajectories, but i would guess the new record PRs let it come out of the idea valleys these agents get stuck in. no equal in managing sweeps and early stopping, ofc.
elie @eliebakouch
@tokenbender a lot of the stack at each step is heavily inspired by human records; agents are very good at combining stuff and sweeping.