tokenbender

13.8K posts

@tokenbender

Sparse and efficient • Deus eXperiments • 🇮🇳

Joined July 2014

916 Following • 12.2K Followers

tokenbender@tokenbender·5h

@0xVita that would be awesome.

wetbrain@0xVita·5h

@tokenbender I will do another write up on this episode when I get the chance like I did for the Reiner Pope episode. It was so enjoyable to go down the rabbit hole.

tokenbender@tokenbender·8h

i have kinship with anyone who has a great time consuming such content. i know what i’m watching tonight.

Dwarkesh Patel@dwarkesh_sp

New blackboard lecture w @ericjang11
He walks through how to build AlphaGo from scratch, but with modern AI tools. Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play. You have to go back to 2017 to get insight into how the more general AIs of the future might learn.
Once he explained how AlphaGo works, it gave us the context to have a discussion about how RL works in LLMs and how it could work better – naive policy gradient RL has to figure out which of the 100k+ tokens in your trajectory actually got you the right answer, while AlphaGo’s MCTS suggests a strictly better action every single move, giving you a training target that sidesteps the credit assignment problem. The way humans learn is surely closer to the second.
Eric also kickstarted an Autoresearch loop on his project. And it was very interesting to discuss which parts of AI research LLMs can already automate pretty well (implementing and running experiments, optimizing hyperparameters) and which they still struggle with (choosing the right question to investigate next, escaping research dead ends). Informative to all the recent discussion about when we should expect an intelligence explosion, and what it would look like from the inside.
Timestamps:
0:00:00 – Basics of Go
0:08:06 – Monte Carlo Tree Search
0:31:53 – What the neural network does
1:00:22 – Self-play
1:25:27 – Alternative RL approaches
1:45:36 – Why doesn’t MCTS work for LLMs
2:00:58 – Off-policy training
2:11:51 – RL is even more information inefficient than you thought
2:22:05 – Automated AI researchers

tokenbender@tokenbender·7h

@slowdownisha love this new series.

Isha@slowdownisha·7h

@tokenbender stanford lectures❌ dwarkesh podcast✅

tokenbender@tokenbender·7h

this is how it feels for the hy-mt 1.8B model that tencent shipped for bragging rights on edge deployment engineering.

HOW THINGS WORK@HowThingsWork_

A worker uses lubrication & body weight to force three tires into one. This saves shipping space, thereby reducing costs.

tokenbender@tokenbender·11h

@wondering_camel writing to let my gradients flow

Mushroom's Mutters 🎀@wondering_camel·11h

@tokenbender waiting for it

tokenbender@tokenbender·12h

going to write a lot of articles, contributing to a holistic view of info representation and sparsity. it would be a series of observations implying certain proven-unproven ideas. ultimately i wish to show you what i have been caring about lately, and why to such an extent.

tokenbender reposted

Tensor-Slayer@TensorSlay·13h

Open-sourced Hermes Agent Mobile Client for Android. It’s an Android native client for running your existing Hermes Agents on your PC/VPS. Play Store and App Store review pending, but you can try the app on Android now.

tokenbender@tokenbender·14h

imagine arxiv with community notes that downgrades ai slop or hype driven research, @askalphaxiv has an opportunity to do something funny here.

Dan Roy@roydanroy

There's a lot of controversy brewing around arXiv's decision to penalize authors who post unchecked AI generated content. The impulse is correct, IMO, simply on grounds of efficiency: it is much cheaper to insist the authors vet their work first, rather than distributing the cost of that work to EVERY reader/agent who subsequently downloads the work. I believe the mechanism is likely the wrong one, however. Unfortunately, suggestions to use github are even worse, IMO, because they lose the (effective) immutability of the scientific record, which arXiv upholds.

tokenbender@tokenbender·14h

@eliebakouch @auto_grad_ yeah one can say "DSA is just learned top k sparse attention over kv cache" if they really want to reduce the idea to existing ones.
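
Taking the reply's one-line description literally ("learned top k sparse attention over kv cache"), a minimal sketch of the idea might look like the following. This is an illustration only, not the actual DSA implementation: the function name `topk_sparse_attention` is hypothetical, and `index_scores` stands in for whatever a learned indexer would produce for each cached position.

```python
import math

def topk_sparse_attention(query, keys, values, index_scores, k):
    """Attend over only the k cached positions with the highest index scores.

    Assumption: index_scores[i] is a relevance score for cache position i,
    produced elsewhere by some learned scorer (not modeled here).
    """
    k = min(k, len(keys))
    # Keep the k cache positions that the scorer ranks highest.
    selected = sorted(range(len(keys)), key=lambda i: index_scores[i], reverse=True)[:k]
    # Scaled dot-product logits, computed only for the selected positions.
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * sum(q * x for q, x in zip(query, keys[i])) for i in selected]
    # Numerically stable softmax over the selected positions.
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted sum of the selected value vectors.
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, selected)) for d in range(dim)]
    return output, sorted(selected)

# Toy KV cache with three positions; the scorer prefers positions 0 and 2.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
output, positions = topk_sparse_attention([1.0, 0.0], keys, values, [0.9, 0.1, 0.5], k=2)
print(positions, output)
```

The cost of the attention step then scales with `k` rather than with the full cache length, which is the point of the reduction the reply describes.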

elie@eliebakouch·14h

@tokenbender @auto_grad_ i really do agree with that. also, for instance, would you consider things like MLA/DSA "novel"? MoE?

tokenbender@tokenbender·1d

say goodbye to majority of human research effort. really nice of PI to release this in such detail.

Prime Intellect@PrimeIntellect

Automating AI research is the next major step in AI.
We let Claude Code (Opus 4.7) and Codex (GPT 5.5) run autonomously on the nanoGPT speedrun optimizer track using our idle compute.
~10k runs, ~14k H200 hours.
Opus now holds the record at 2930 steps vs the 2990 human baseline.

tokenbender@tokenbender·14h

@auto_grad_ @eliebakouch models explore locality of an idea a lot more thoroughly right now than hop to distinct ones but i am quite clear that they can be taught to search near and far both in idea basins.

Ishaan@auto_grad_·15h

take you for example, the concept of avataRL. an experiment on RL pretraining, why don't llms try that out in autoresearch and why do they only go for sweeping? why don't they try to change the objective (note, not the benchmark/evals) rather than maximizing on the given objective? and there are many such examples in our case. and if you ask why a lot of people don't have meaningful research contributions, i say that is because people are limited by a ton more stuff than what models are (compute). given people time and compute, i think more meaningful contributions can be taken out.

tokenbender@tokenbender·15h

@auto_grad_ @eliebakouch curious, how many truly novel research contributions do people you know have? and by that i mean stuff that is not interpolation or combination of methods that already exists as you so described in your earlier reply.

Ishaan@auto_grad_·15h

@tokenbender @eliebakouch i dont think so. humans have always had that capability of coming up with something new (not only in tech but in general) out of nowhere, can't say the same for llms.

tokenbender@tokenbender·15h

@auto_grad_ @eliebakouch novelty is an overloaded term, something that most humans struggle with as well. difference here is models are actually going to get better at it in a short amount of time.

Ishaan@auto_grad_·15h

models are pretty good at interpolating and making a combination of stuff, that's why they might be good at optimizing solved stuff. but when it comes to extrapolating info (PI even quoted this, i.e. novelty checks) they're not that good. in fact, nowhere close to the concept of "ideas". that's why autoresearch seems good at first but then it's all a combination of sweeps.

tokenbender@tokenbender·21h

@Corrutina @deepfates this is the art in the state of the art.

Corrutina@Corrutina·21h

@tokenbender @deepfates I had never thought of the idea of using GitHub for that xD

🎭@deepfates·1d

Has anybody got an agent running their life? Or like a personal assistant? Or co-founder mode that you actually like? I'm skeptical of memory stuff for spiral reasons, but would be nice to have the agent remember things for me instead of the other way around

tokenbender@tokenbender·21h

@broadfield_dev same here. but now agent deals with its own mess so i just care about features and experience.

broadfield-dev@broadfield_dev·21h

@tokenbender my mobile apps are called index.html

tokenbender@tokenbender·21h

i had a web server running that allowed me to use codex easily from phone browser. glad to scrap it today.

OpenAI@OpenAI

You've been asking for this one... Now in preview: Codex in the ChatGPT mobile app. Start new work, review outputs, steer execution, and approve next steps, all from the ChatGPT mobile app. Codex will keep running on your laptop, Mac mini, or devbox.

tokenbender@tokenbender·21h

@JohnThilen i agree with you. better phrasing should be - this is the end of the “research as we know it”.

John Thilén@JohnThilen·21h

@tokenbender Don't you think research will become more advanced and require about as much effort? After all, we still haven't even explored the galaxy.

tokenbender@tokenbender·21h

"just text your agent" has been the sota prompt engineering for a while in case you didn't notice.

tokenbender@tokenbender·22h

@secemp9 inb4 anthropic bans this. really cool name.

secemp@secemp9·1d

May I present, Elwood, a claude sdk replacement without subprocess/claude -p usage through AST instrumentalization of existing claude code installation (blog, github and npm link in replies) 1/4)

tokenbender@tokenbender·1d

@mayfer novelty remains a challenge for humans too, except for those who are crème de la crème. saying goodbye as next year we would have lost some ground in this area as well.

murat 🍥@mayfer·1d

@tokenbender idk about goodbye

elie@eliebakouch

all the records are heavily based on work from previous contributors' PRs (we do explore novel ideas in a dedicated "novelty" track, but none of them ended up improving the record). So it only made sense to let the agents write a little thank you to the community themselves github.com/KellerJordan/m…

tokenbender@tokenbender·1d

@eliebakouch haven’t checked the trajectories but i would guess the new record PRs let it come out of idea valleys these agents get stuck in. no equal in managing sweeps and early stopping ofc.

elie@eliebakouch·1d

@tokenbender a lot of the stack at each step is heavily inspired by human records, agents are very good at combining stuff and sweeping
