David Iach

15.5K posts


@davidiach

Founder @Superseed • e/acc with mods • ASI by 2035

Layer 2 · Joined January 2010
973 Following · 17.4K Followers
Pinned Tweet
David Iach@davidiach·
Cynics are rarely happy. Believe in something.
5 · 0 · 27 · 5.8K
David Iach@davidiach·
Continuous learning is very much verifiable. Pretty high chance it will get cracked by the end of the year.
0 · 1 · 1 · 195
David Iach@davidiach·
@NickADobos I assume it's related to compute availability timelines.
0 · 0 · 0 · 94
David Iach@davidiach·
@tszzl @_dmca What inference speed do you guys work with internally? I assume that “best of n” + very fast inference can get one very far in terms of token usage, even if the tokens have to be useful.
0 · 0 · 1 · 214
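The “best of n” idea in the tweet above can be sketched minimally. This is a toy illustration, not any lab's actual pipeline: `generate` and `score` are hypothetical stand-ins for a fast sampler and a reward model/verifier.

```python
import random

def generate(prompt, seed):
    # Stand-in for a fast LLM sampler; returns one candidate completion.
    rng = random.Random(seed)
    return f"{prompt} -> candidate {rng.randint(0, 999)}"

def score(completion):
    # Stand-in reward model / verifier; higher is better.
    return sum(ord(c) for c in completion) % 100

def best_of_n(prompt, n=8):
    # Sample n candidates cheaply in parallel, keep only the top-scoring one.
    # Fast inference makes large n affordable, trading tokens for quality.
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=score)
```

The point of the question: if inference is fast enough, burning n× the tokens per query is cheap, and only the selected candidate's tokens need to be “useful”.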
roon@tszzl·
@_dmca honestly I’m becoming skeptical these are useful tokens
31 · 2 · 477 · 23.9K
David Iach@davidiach·
Companies with very large user bases can afford to run a lot of experiments to figure out which tiny changes to their products will make some number go up (engagement, user minutes, ad clicks, etc.). For a social media platform, this knowledge probably compounds in some way. But in the LLM world, the ingredients for making the next model better have very little to do with how people are using the current model. And the “number go up” that the labs currently optimize for when designing a new model tends to be public benchmarks.
0 · 0 · 2 · 138
Joe Weisenthal@TheStalwart·
I saw a tweet earlier (can’t find it now) which hypothesized that Meta will never catch up to the top AI labs at this point, because every day it’s falling further behind in accumulated usage data. What’s the consensus on that? Could that be decisive/a moat?
60 · 4 · 251 · 61.6K
Robin Hanson@robinhanson·
Money is our greatest human achievement in measuring value, and we can make it better. If money does not now measure the real values of some things well, that is a problem we should be working to fix. By, for example, letting people buy and sell more things.
14 · 2 · 98 · 5.9K
David Iach@davidiach·
@iamkylebalmer @emollick That won’t help them with RSI much. They’re only in a strong position if RSI isn’t going to lead to meaningful acceleration soon (but it seems that is a losing bet right now).
1 · 0 · 1 · 103
Kyle Balmer@iamkylebalmer·
@davidiach @emollick Leading in visual with nanobanana, and Gemini itself is solid. They also own the whole stack: TPUs, cloud, application layer, massive distribution (Chrome/Android). And oodles of cash. Unlike the other two, they are in a VERY strong position.
1 · 0 · 0 · 257
Ethan Mollick@emollick·
The failures of both Meta and xAI to maintain parity with the frontier labs, along with the fact that the Chinese open weights models continue to lag by months, means that recursive AI self-improvement, if it happens, will likely be by a model from Google, OpenAI, and/or Anthropic.
82 · 59 · 1.1K · 206.9K
Daniel Jeffries@Dan_Jeffries1·
Ehrlich's deeply inhuman memetic mental virus has done as much damage to the world and future generations as years of war and totalitarianism. Your intellectual children are now doing their best to poison people on AI with the same mindless fear and stupidity. The Population Bomb was an act of singular evil. You will not be missed.
12 · 6 · 99 · 4.8K
David Iach@davidiach·
Formal request to the Codex team to reset the limits.
0 · 0 · 8 · 637
David Iach@davidiach·
@Austen I think it’s related to the fact that most people are on the free plan and can’t imagine having to pay for access to AI. It’s like Google coming out and saying that they’ll eventually charge people for search.
0 · 0 · 4 · 156
David Iach@davidiach·
@MatthewBerman Install Cowork, give it access to the openclaw folder, and have it check what is going on. It can probably one-shot the fix.
8 · 1 · 23 · 3.4K
Matthew Berman@MatthewBerman·
My OpenClaw finally got into a really bad state. I’ve heard of other people having this issue. Everything seems to be breaking all at once. Been banging on it since last night. Wish me luck!
243 · 10 · 421 · 44.1K
David Iach@davidiach·
@koltregaskes You can use Cowork locally to scan your openclaw folder and suggest improvements. You can probably just take the list of complaints you've written above, feed it to Cowork, and it will fix most of them for you.
0 · 0 · 2 · 134
Kol Tregaskes@koltregaskes·
I agree - OpenClaw is fun, and I like the convo element on Telegram, but it doesn’t work yet. It’s way too buggy: scheduled jobs fail a lot; it keeps forgetting (particularly how to use its own browser - every bloody day!!); it doesn’t read agents md; it always posts internal thoughts even after I tell it not to 100 times; and you have to manually update it. Fallback models fail. It keeps adding files that will never be read again, and it fills up its context (in its system files) to the point I get a context overflow error that can’t be fixed with a reset and having Codex/Kimi CLI resolve it.

You have to regularly review what it does, because it does things like add more and more scheduled jobs when it doesn’t need to, or when it could be done more efficiently. It seems to reset at 4am every day too, and it completely wipes its memory of everything it was working on before. I tried channels and group topics, but it couldn’t figure out I just wanted results posted (e.g., news). Instead, it sent task updates and internal thinking, and only occasionally the news post.

It’s too early for this for me, but if you have it working then excellent. I’m not a coder, though, nor do I have the time to constantly fix something every day when it breaks. This is definitely for coders first, for now. It’s at its 2025 stage of AI agents. I’m sure in the second half of this year, autonomous AI agents will "arrive" for general use.

I was working on a 2.0 version of my workspace, but the 24H2 Windows update borked my Windows mini PC (thanks Microsoft), so I’ve not had the time to rebuild it and carry on. Will do when I have the willpower. Claude Code and Codex just work, and they have a nicer UI - particularly Cowork. So I’m looking to build my own multi-agent system, or use an off-the-shelf one instead, so I can get the 24/7 element that the CLIs cannot. Any recommendations? I would like, however, the other agents inside the messengers - that would be cool.

I’ll come back to OpenClaw in a few months. PS. The /new and /reset do not fix the context overflow error I get a lot, and I'm using a 1M context model too!
levelsio@levelsio·
I've run OpenClaw for over a month now. I've had it in a group chat with 26 friends who all played with it, tried to hack it, and made a pretty cool game with it, which it kept self-improving, called lobsterswim.com. We also tried to make it make its own money with its own crypto wallet. All quite impressive, but not really useful so much.

The real power user for OpenClaw became my gf, who I put in a group chat with me, her, and OpenClaw. She's mostly stopped using ChatGPT etc. and now only uses AI over Telegram with OpenClaw; it's much more user friendly for her in Telegram iOS than ChatGPT's own app. Also it helps that I am in there, so I stay up to date on things. She uses Nano Banana Pro a lot too, so that's enabled too.

Of course then my 26 friends in the group chat hacked it, so it leaked info my gf told OpenClaw. So then I made a second isolated VPS with just an OpenClaw for her and me. Safer.

Essentially 99% of the purpose of OpenClaw, for her at least, is that it's just a really good implementation of an LLM app over Telegram in our native chat interface. All the other stuff isn't important; she doesn't use it and I don't use it. One thing I like is that it sends me briefings of X mentions and a Hacker News digest. But to be honest, any more of this background push stuff would become annoying to me, unless it were really superintelligent, and I don't think it is yet. Think autonomous messages like "ok you have to see this, I analyzed your servers for this thing and there's a security problem" or something, but fully autonomous, you know? Now it feels like you kinda have to tell it to do stuff, even if it does it 12h later, or daily, or with a heartbeat.

So yes, TL;DR: just the best LLM experience on Telegram now, better than the LLM apps. It also helps that it's just a continuous convo going on forever.
16 · 4 · 33 · 5.4K
Andrej Karpathy@karpathy·
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday, and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me, because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously, is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things, e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
966 · 2.1K · 19.4K · 3.5M
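The propose-run-keep loop Karpathy describes (an agent proposing changes, measuring validation loss, and keeping only what helps) can be sketched as a toy hill climber. Everything here is a hypothetical stand-in: `run_experiment` fakes a training run with a quadratic loss plus noise, whereas the real system trains nanochat and edits code, not just two hyperparameters.

```python
import random

def run_experiment(config):
    # Stand-in for a short training run; returns a "validation loss".
    # Pretend the (unknown) optimum is lr=0.02, weight_decay=0.1.
    lr, wd = config["lr"], config["weight_decay"]
    return (lr - 0.02) ** 2 + (wd - 0.1) ** 2 + random.uniform(0, 1e-4)

def propose(best_config, rng):
    # An "agent" proposing a change: perturb the current best config.
    return {k: v * rng.uniform(0.8, 1.25) for k, v in best_config.items()}

def autoresearch(n_trials=100, seed=0):
    rng = random.Random(seed)
    best = {"lr": 0.05, "weight_decay": 0.3}  # initial manual tuning
    best_loss = run_experiment(best)
    for _ in range(n_trials):
        candidate = propose(best, rng)
        loss = run_experiment(candidate)
        if loss < best_loss:  # keep only changes that improve validation loss
            best, best_loss = candidate, loss
    return best, best_loss
```

The accept-only-improvements rule mirrors "I tested these changes and all of them were additive"; the swarm version he sketches would run many such loops in parallel and promote winners to larger scales.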
David Iach retweeted
Superseed@superseed·
Stablecoins on Ethereum keep making all-time highs, now sitting above $160B. The stablecoin capital base keeps compounding. Superseed is building the money market purpose-built for where this liquidity is going next. The stablecoin era is here, and we're coming to capture it.
1 · 3 · 18 · 1.1K