David Iach

15.5K posts


@davidiach

Founder @Superseed • e/acc with mods • ASI by 2035

Layer 2 · Joined January 2010
973 Following · 17.4K Followers
Pinned Tweet
David Iach@davidiach·
Cynics are rarely happy. Believe in something.
5 · 0 · 27 · 5.8K
David Iach@davidiach·
Continuous learning is very much verifiable. Pretty high chance it will get cracked by the end of the year.
0 · 1 · 1 · 195
David Iach@davidiach·
@NickADobos I assume it's related to compute availability timelines.
0 · 0 · 0 · 94
David Iach@davidiach·
@tszzl @_dmca What inference speed do you guys work with internally? I assume that “best of n” + very fast inference can get one very far in terms of token usage, even if the tokens have to be useful.
0 · 0 · 1 · 214
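The “best of n” idea in the tweet above can be sketched minimally. This is a toy illustration, not any lab's actual pipeline: `generate` and `score` are hypothetical stand-ins for a fast sampler and a reward model/verifier.

```python
import random

def generate(prompt, seed):
    # Stand-in for a fast LLM sampler; returns one candidate completion.
    rng = random.Random(seed)
    return f"{prompt} -> candidate {rng.randint(0, 999)}"

def score(completion):
    # Stand-in reward model / verifier; higher is better.
    return sum(ord(c) for c in completion) % 100

def best_of_n(prompt, n=8):
    # Sample n candidates cheaply in parallel, keep only the top-scoring one.
    # Fast inference makes large n affordable, trading tokens for quality.
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=score)
```

The point of the question: if inference is fast enough, burning n× the tokens per query is cheap, and only the selected candidate's tokens need to be “useful”.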
roon@tszzl·
@_dmca honestly I’m becoming skeptical these are useful tokens
31 · 2 · 477 · 23.9K
David Iach@davidiach·
Companies with very large user bases can afford to run a lot of experiments to figure out which tiny changes to their products will make some number go up (engagement, user minutes, ad clicks, etc.). For a social media platform, this knowledge probably compounds in some way. But in the LLM world, the ingredients for making the next model better have very little to do with how people are using the current model. And the “number go up” that the labs currently optimize for when designing a new model tends to be public benchmarks.
0 · 0 · 2 · 138
Joe Weisenthal@TheStalwart·
I saw a tweet earlier (can’t find it now) which hypothesized that Meta will never catch up to the top AI labs at this point, because every day it’s falling further behind in accumulated usage data. What’s the consensus on that? Could that be decisive/a moat?
60 · 4 · 251 · 61.6K
Robin Hanson@robinhanson·
Money is our greatest human achievement in measuring value, and we can make it better. If money does not now measure the real values of some things well, that is a problem we should be working to fix. By, for example, letting people buy and sell more things.
14 · 2 · 98 · 5.9K
David Iach@davidiach·
@iamkylebalmer @emollick That won’t help them with RSI much. They’re only in a strong position if RSI isn’t going to lead to meaningful acceleration soon (but it seems that is a losing bet right now).
1 · 0 · 1 · 103
Kyle Balmer@iamkylebalmer·
@davidiach @emollick Leading in visual with nanobanana, and Gemini itself is solid. They also own the whole stack: TPUs, cloud, application layer, massive distribution (Chrome/Android). And oodles of cash. Unlike the other two, they are in a VERY strong position.
1 · 0 · 0 · 257
Ethan Mollick@emollick·
The failures of both Meta and xAI to maintain parity with the frontier labs, along with the fact that the Chinese open weights models continue to lag by months, means that recursive AI self-improvement, if it happens, will likely be by a model from Google, OpenAI, and/or Anthropic.
82 · 59 · 1.1K · 206.9K
Daniel Jeffries@Dan_Jeffries1·
Ehrlich's deeply inhuman memetic mental virus has done as much damage to the world and future generations as years of war and totalitarianism. Your intellectual children are now doing their best to poison people on AI with the same mindless fear and stupidity. The Population Bomb was an act of singular evil. You will not be missed.
12 · 6 · 99 · 4.8K
David Iach@davidiach·
Formal request to the Codex team to reset the limits.
0 · 0 · 8 · 637
David Iach@davidiach·
@Austen I think it’s related to the fact that most people are on the free plan and can’t imagine having to pay for access to AI. It’s like Google coming out and saying that they’ll eventually charge people for search.
0 · 0 · 4 · 156
David Iach@davidiach·
@MatthewBerman Install Cowork, give it access to the openclaw folder, and have it check what is going on. It can probably one-shot the fix.
8 · 1 · 23 · 3.4K
Matthew Berman@MatthewBerman·
My OpenClaw finally got into a really bad state. I’ve heard of other people having this issue. Everything seems to be breaking all at once. Been banging on it since last night. Wish me luck!
243 · 10 · 421 · 44.1K
David Iach@davidiach·
@koltregaskes You can use Cowork locally to scan your openclaw folder and suggest improvements. You can probably just take the list of complaints you've written above, feed it to Cowork, and it will fix most of them for you.
0 · 0 · 2 · 134
Kol Tregaskes@koltregaskes·
I agree - OpenClaw is fun, and I like the convo element on Telegram, but it doesn’t work yet. It’s way too buggy: scheduled jobs fail a lot; it keeps forgetting (particularly how to use its own browser - every bloody day!!); it doesn’t read agents md; it always posts internal thoughts even after I tell it not to 100 times; and you have to manually update it. Fallback models fail. It keeps adding files that will never be read again, and it fills up its context (in its system files) to the point I get a context overflow error that can’t be fixed with a reset and having Codex/Kimi CLI resolve it.

You have to regularly review what it does, because it does things like add more and more scheduled jobs when it doesn’t need to, or when it could be done more efficiently. It seems to reset at 4am every day too, and it completely wipes its memory of everything it was working on before. I tried channels and group topics, but it couldn’t figure out I just wanted results posted (e.g., news). Instead, it sent task updates and internal thinking, and only occasionally the news post.

It’s too early for this for me, but if you have it working then excellent. I’m not a coder, though, nor do I have the time to constantly fix something every day when it breaks. This is definitely for coders first, for now. It’s at its 2025 stage of AI agents. I’m sure in the second half of this year, autonomous AI agents will "arrive" for general use.

I was working on a 2.0 version of my workspace, but the 24H2 Windows update borked my Windows mini PC (thanks Microsoft), so I’ve not had the time to rebuild it and carry on. Will do when I have the willpower. Claude Code and Codex just work, and they have a nicer UI - particularly Cowork. So I’m looking to build my own multi-agent system, or use an off-the-shelf one instead, so I can get the 24/7 element that the CLIs cannot. Any recommendations? I would like, however, the other agents inside the messengers - that would be cool.

I’ll come back to OpenClaw in a few months. PS. The /new and /reset do not fix the context overflow error I get a lot, and I'm using a 1M context model too!
levelsio@levelsio·
I've run OpenClaw for over a month now. I've had it in a group chat with 26 friends who all played with it, tried to hack it, and made a pretty cool game with it, which it kept self-improving, called lobsterswim.com. We also tried to make it make its own money with its own crypto wallet. All quite impressive, but not really useful so much.

The real power user for OpenClaw became my gf, who I put in a group chat with me, her, and OpenClaw. She's mostly stopped using ChatGPT etc. and now only uses AI over Telegram with OpenClaw; it's much more user friendly for her in Telegram iOS than ChatGPT's own app. Also it helps that I am in there, so I stay up to date on things. She uses Nano Banana Pro a lot too, so that's enabled too.

Of course then my 26 friends in the group chat hacked it, so it leaked info my gf told OpenClaw. So then I made a second isolated VPS with just an OpenClaw for her and me. Safer.

Essentially 99% of the purpose of OpenClaw, for her at least, is that it's just a really good implementation of an LLM app over Telegram in our native chat interface. All the other stuff isn't important; she doesn't use it and I don't use it. One thing I like is that it sends me briefings of X mentions and a Hacker News digest. But to be honest, any more of this background push stuff would become annoying to me, unless it were really superintelligent, and I don't think it is yet. Think autonomous messages like "ok you have to see this, I analyzed your servers for this thing and there's a security problem" or something, but fully autonomous, you know? Now it feels like you kinda have to tell it to do stuff, even if it does it 12h later, or daily, or with a heartbeat.

So yes, TL;DR: just the best LLM experience on Telegram now, better than the LLM apps. It also helps that it's just a continuous convo going on forever.
16 · 4 · 33 · 5.4K
Andrej Karpathy@karpathy·
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday, and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me, because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously, is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things, e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
966 · 2.1K · 19.4K · 3.5M
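The propose-run-keep loop Karpathy describes (an agent proposing changes, measuring validation loss, and keeping only what helps) can be sketched as a toy hill climber. Everything here is a hypothetical stand-in: `run_experiment` fakes a training run with a quadratic loss plus noise, whereas the real system trains nanochat and edits code, not just two hyperparameters.

```python
import random

def run_experiment(config):
    # Stand-in for a short training run; returns a "validation loss".
    # Pretend the (unknown) optimum is lr=0.02, weight_decay=0.1.
    lr, wd = config["lr"], config["weight_decay"]
    return (lr - 0.02) ** 2 + (wd - 0.1) ** 2 + random.uniform(0, 1e-4)

def propose(best_config, rng):
    # An "agent" proposing a change: perturb the current best config.
    return {k: v * rng.uniform(0.8, 1.25) for k, v in best_config.items()}

def autoresearch(n_trials=100, seed=0):
    rng = random.Random(seed)
    best = {"lr": 0.05, "weight_decay": 0.3}  # initial manual tuning
    best_loss = run_experiment(best)
    for _ in range(n_trials):
        candidate = propose(best, rng)
        loss = run_experiment(candidate)
        if loss < best_loss:  # keep only changes that improve validation loss
            best, best_loss = candidate, loss
    return best, best_loss
```

The accept-only-improvements rule mirrors "I tested these changes and all of them were additive"; the swarm version he sketches would run many such loops in parallel and promote winners to larger scales.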
David Iach retweeted
Superseed@superseed·
Stablecoins on Ethereum keep making all-time highs, now sitting above $160B. The stablecoin capital base keeps compounding. Superseed is building the money market purpose-built for where this liquidity is going next. The stablecoin era is here, and we're coming to capture it.
1 · 3 · 18 · 1.1K