David Aronchick

279 posts

@aronchick

CEO, https://t.co/U7uKl4r9iE Cofounder, https://t.co/i9QX6JxCBa Ex: MSFT, K8s, Kubeflow, GOOG, AMZN. 4x founder/CEO. There is many a worse and more elaborate life. He/him

Hilbert’s Hotel · Joined June 2008
3.8K Following · 7.8K Followers
Mitchell Hashimoto@mitchellh·
I'm dying for pi-mono-style minimal library that handles the hard parts of email (auth, syncing with local state, etc.) and gives me an opinionated way to add agentic loops on top of that. I want to build my own agents and logic and guardrails, I don't trust vendors right now.
David Aronchick reposted
SpecStory@specstoryai·
Cursor, Codex and Claude Code are all single-player. Your whole team builds alone and no one knows what anyone else decided. But building product is a team sport. AI should be too. The conversations, decisions, specs and builds. All of it, together, with your whole team. Launching soon → somehow.sh
Sherveen Mashayekhi@Sherveen·
@aronchick No worries -- gstack is just mediocre skills. Regardless, a CTO impressed by them is unfathomably behind. There are lots of CTOs that are unfathomably behind, but if Garry is tweeting about it, it's arguably a CTO that should be more modern, not used as a trophy.
Sherveen Mashayekhi@Sherveen·
To be clear -- (1) Garry should be embarrassed for tweeting this. (2) If it's true, that CTO should be fired immediately. (3) Whenever I think I've finally hit the bottom of "VCs are mostly stupid + lucky (other than my friends)," one of them manages to find a new bottom.
ZM 🇺🇸 ⚓️@mccheezeburger·
@isaiah_bb @brianschatz That’s not the quote at all. But you knew that. You just don’t care. You’d rather manipulate your audience. There is so much to be rightly critical of. DOGE was ham-fisted and accomplished nothing. But when you alter quotes to make your point you are a liar and deceiver.
David Aronchick@aronchick·
@karpathy @snwy_me Did you look at / how does this compare with the auto research papers Google published years ago? Yours feels like such an evolution
Andrej Karpathy@karpathy·
@snwy_me very cool! I love to see all the different directions people take it in, here esp the CLI, TUI, tool use aspects.
snwy@snwy_me·
autoresearch really interested me, despite me not being "all-in" on agents yet. i wanted to get started with running auto experiments. i looked to existing tools to serve as a harness, but each one had its problems. so i made one. introducing Helios for autonomous ML research
Andrej Karpathy@karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things:

- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course: you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has a more efficient proxy metric, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
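The loop described above (propose a change, run an experiment, keep the change only if validation loss improves, and plan the next change from the accumulated results) can be sketched in miniature. Everything below is a hypothetical illustration with a stubbed-out experiment harness, not Karpathy's actual autoresearch code:

```python
import random

def run_experiment(config):
    """Hypothetical harness: would train a small model and return its val loss.
    Stubbed here: pretend weight decay near 0.01 is optimal, plus tiny noise."""
    return (config["weight_decay"] - 0.01) ** 2 + random.uniform(0, 1e-4)

def autoresearch(base_config, n_rounds=20):
    """Greedy tuning loop: mutate one knob per round, keep only improvements."""
    best = dict(base_config)
    best_loss = run_experiment(best)
    accepted = []
    for _ in range(n_rounds):
        candidate = dict(best)
        # propose a perturbation of one hyperparameter
        candidate["weight_decay"] *= random.choice([0.5, 0.8, 1.25, 2.0])
        loss = run_experiment(candidate)
        if loss < best_loss:  # accept only changes that improve val loss
            best, best_loss = candidate, loss
            accepted.append(candidate["weight_decay"])
    return best, best_loss, accepted

best, loss, log = autoresearch({"weight_decay": 0.1})
```

A real version would replace `run_experiment` with an actual training run and let an LLM agent propose the candidate edits instead of random perturbations.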

David Aronchick@aronchick·
@chrysb @openclaw There are a bunch of items in there, including safe rollback, safe mode, and others. The problem is that if there's no known safe version, then you can't guarantee that the engine would ever have the ability to read the config to figure out what it could skip.
Chrys Bader@chrysb·
@aronchick @openclaw nice - it’s possible to have a situation where last known good could also crash, right? eg - it points to a webhook transform that no longer exists. I’m wondering if the config should just have warnings for non fatal things (like the above) instead of crashing.
Chrys Bader@chrysb·
request: @openclaw needs a repair mode. when an invalid configuration is present and the gateway fails to start, your agent becomes unreachable. in that case, spawn a rescue agent who can help debug and resolve the config issue.
Melinda B. Chu@MelindaBChu1·
@AnthropicAI I guess Anthropic and Amodei don’t want Ukraine to have the best tools. We can only assume that you are Pro-Russia.
David Aronchick@aronchick·
@Gavriel_Cohen @rohanpaul_ai I love the project! Have you given any thought to what it looks like to work in parallel with the OpenClaw ecosystem? Or have a simple migration path?
Gavriel Cohen@Gavriel_Cohen·
@rohanpaul_ai Creator of NanoClaw here, thanks for sharing. Monolithic frameworks don't make sense anymore with coding agents that can create for each of us the exact software we need. Instead, you want a secure and flexible foundation that you can easily build on. That's what NanoClaw is.
Rohan Paul@rohanpaul_ai·
NanoClaw, the lightweight alternative to Clawdbot / OpenClaw, has already reached 10.5K GitHub stars ⭐️

Compared with OpenClaw, NanoClaw’s specialty is simplicity plus OS-level isolation:

- Much smaller and more manageable codebase, only 4K lines.
- Runs in containers for security.
- Connects to WhatsApp, has memory, scheduled jobs, and runs directly on Anthropic's Agents SDK.
- Stores state in SQLite, runs scheduled jobs, and keeps each chat group isolated with its own memory file and its own Linux container, so the agent only sees directories you explicitly mount.
- Its safety model leans on application controls like allowlists and pairing codes inside a shared Node process.

OpenClaw is built for broad multi-channel coverage, while NanoClaw intentionally stays minimal, so you customize by changing a small codebase instead of operating a big framework.
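The per-group isolation model described above (one container per chat group, its own memory file, and only explicitly mounted directories visible) can be sketched roughly as follows. The container invocation, image name, and paths are illustrative assumptions, not NanoClaw's actual implementation:

```python
import subprocess  # a real caller would subprocess.run() the returned command
from pathlib import Path

def spawn_group_agent(group_id, mounts, image="agent:latest"):
    """Build a `docker run` command for one chat group's agent.
    The agent sees its own state dir plus only the dirs listed in `mounts`."""
    memory = Path(f"state/{group_id}/memory.md")  # per-group memory file
    memory.parent.mkdir(parents=True, exist_ok=True)
    memory.touch()
    cmd = ["docker", "run", "--rm", "--name", f"agent-{group_id}",
           "-v", f"{memory.parent.resolve()}:/agent/state"]
    for m in mounts:  # explicit allowlist of host directories, mounted read-only
        cmd += ["-v", f"{Path(m).resolve()}:/agent/mnt/{Path(m).name}:ro"]
    cmd.append(image)
    return cmd
```

The design point is that isolation comes from the OS (container namespaces and bind mounts) rather than from application-level checks, so a misbehaving agent simply cannot see unmounted directories.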
David Aronchick@aronchick·
@fiscal_ai This is the dumbest take. By this logic, no news company has ever existed, since before craigslist these companies made most of their money off of ads, classifieds, and others.
Fiscal.ai@fiscal_ai·
The New York Times is no longer a news company. $NYT
David Aronchick@aronchick·
@BoWang87 I don’t know if you can edit, but the middle term is “ionic bond” not “hydrogen bond”
Bo Wang@BoWang87·
Bytedance just dropped a paper that might change how AI thinks. Literally.

They figured out why LLMs fail at long reasoning — and framed it as chemistry.

The discovery: chain-of-thought isn't just words. It's molecular structure. Three bond types:

• Deep reasoning = covalent bonds (strong, unbreakable)
• Self-reflection = hydrogen bonds (flexible, context-aware)
• Exploration = van der Waals (weak, ever-present)

Why most AI "thinking" sucks: everyone's been imitating keywords — "wait," "let me check" — without building the actual bonds. It's like copying the shape of a protein without the atomic forces holding it together.

Bytedance proved: structure emerges from training, not prompting.

The fix: Mole-Syn. Their method doesn't just generate text. It synthesizes stable thought molecules.

Results: better reasoning, more stable RL training. Bytedance is treating AI reasoning like organic chemistry — and it works.

Paper: arxiv.org/abs/2601.06002
Julian@jufuxs·
@jumperz Do you think you would be able to publish an artifact for this? You could literally tell your LLM to write an artifact that other OpenClaw users could use. I'd be the first one. Solid stuff once again!
JUMPERZ@jumperz·
this is the entire memory stack if you actually want to take your agent memory somewhere real. from actually remembering to having an intelligence layer.

31 pieces total, split into 3 phases: core first, reliability second, then advanced last. you build from core to advanced slowly, and you test each phase before touching the next. if you try to build all 31 at once, you will break everything and you won't understand anything.

phase 1 is 10 pieces. write pipeline, read pipeline, decay, session flush and behavior loop. this is the minimum for memory that actually works.

phase 2 is 7 pieces. crash recovery, audit trail, dedup, conflict resolution, automated maintenance jobs. this is what makes memory durable.

phase 3 is 14 pieces. trust scoring, cross-agent sharing, knowledge graphs, episode tracking, intelligent retrieval, budget awareness. this is the ceiling: intelligence.

none of phase 3 matters until phases 1 and 2 are solid tho. build in order, then test each phase before moving forward. phase 1 unstable means phase 3 just amplifies the flaws, and phase 2 missing means phase 3 is literally optimising pure garbage.

personally i'm not done yet. phases 1 and 2 are solid, phase 3 is still being built. but the longer you work with it the more you see: they're not separate... it's all one system.

breakdown and prompts below.
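The phase-1 core named above (a write pipeline that stores facts, a read pipeline that retrieves them, and decay that forgets stale entries) can be compressed into a toy class. The class name, scoring heuristic, and half-life parameter are all illustrative assumptions, not the author's actual prompts or design:

```python
import time

class Memory:
    """Toy phase-1 core: write, read, decay."""
    def __init__(self, half_life_s=3600.0):
        self.items = []  # list of (text, timestamp) pairs
        self.half_life_s = half_life_s

    def write(self, text):
        """Write pipeline: store a fact with its timestamp."""
        self.items.append((text, time.time()))

    def read(self, query, k=3):
        """Read pipeline: naive retrieval scored by word overlap
        with the query, weighted by exponential recency decay."""
        now = time.time()
        def score(item):
            text, ts = item
            overlap = len(set(query.lower().split()) & set(text.lower().split()))
            return overlap * 0.5 ** ((now - ts) / self.half_life_s)
        return [t for t, _ in sorted(self.items, key=score, reverse=True)[:k]]

    def decay(self, max_age_s):
        """Forgetting step: drop entries older than max_age_s."""
        cutoff = time.time() - max_age_s
        self.items = [(t, ts) for t, ts in self.items if ts >= cutoff]
```

A real build would replace the overlap heuristic with embeddings and persist the store, but the phase boundary is the same: phase 2 (crash recovery, dedup, conflict resolution) hardens these three operations rather than adding new ones.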
David Aronchick reposted
Expanso@ExpansoIO·
We're LIVE on Product Hunt! 🚀 200+ tools for AI agents across two platforms: 🛒 skills.expanso.io 📚 Catalog at expanso.io Your agents should never touch raw data. Tools filter at source: → remove-pii → cross-border-gdpr → encrypt-data → and 195+ more Open source. Self-hostable. One install command. 👉 Upvote + comment if this is useful to you: producthunt.com/products/expan…
David Aronchick reposted
Expanso@ExpansoIO·
Something drops tomorrow at midnight. 🕛 172 reusable data processing recipes for AI agents — PII removal, log aggregation, GDPR routing, encryption, and more. One install command. Production-ready. Open source. Set your alarm. 👀 @ProductHunt
anand iyer@ai·
.@JeffDean dropped some real infrastructure insight on Latent Space: the AI compute race isn't just about FLOPs anymore.

- Energy (picojoules) is the bottleneck, not raw compute (FLOPs).
- Moving data costs 1000x more than actually "thinking".
- Google is building 2030 silicon for models that don’t exist yet.
- Efficiency and data-locality > raw power.

latent.space/p/jeffdean
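To put rough numbers on the "moving data costs ~1000x more" claim: widely cited 45nm estimates (Horowitz, "Computing's Energy Problem", ISSCC 2014) put a 32-bit floating-point add around 0.9 pJ and a DRAM access around 1.3-2.6 nJ. Taking the low end of that DRAM range as an assumption:

```python
# Approximate energy per operation, 45nm figures from Horowitz (ISSCC 2014)
FP32_ADD_PJ = 0.9        # one 32-bit floating-point add, in picojoules
DRAM_ACCESS_PJ = 1300.0  # one DRAM access (~1.3 nJ), in picojoules

ratio = DRAM_ACCESS_PJ / FP32_ADD_PJ
# Fetching one operand from DRAM costs on the order of a thousand adds,
# which is why data locality dominates raw FLOPs at the system level.
assert 1000 < ratio < 2000
```

The exact ratio shifts with process node and memory technology, but the orders-of-magnitude gap between arithmetic and off-chip data movement is the stable part of the argument.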