Will Chen
@stablechen

16.5K posts

AI R&D @IdyllicLabs; prev: dev rels @terra_money, head of R&D for wasm devx @cosmology_tech & @terran_one; @ucberkeley ('19 dropout)

San Francisco · Joined October 2015
820 Following · 63.2K Followers

Will Chen@stablechen·
"im not going to pretend" is codex equivalent of claude code's "you're absolutely right"
Will Chen reposted

Zhengyao Jiang@zhengyaojiang·
Autoresearch has been out for 2 weeks. The community is trying to apply it to everything with a measurable metric, here are some successful attempts: 🧵 (1/6)
Will Chen@stablechen·
at this point just throw your whole bucket list at claude code and let it build agentic systems to get it all done
Will Chen@stablechen·
intelligence feels more like water finding cracks than directed intention
Will Chen@stablechen·
Kind of crazy you can do this with one prompt in 3 hours: the improvement in model capability and reliability is truly exponential.

We took @rivet_dev's new secure-exec library, which runs untrusted JavaScript in isolated V8 sandboxes with 16ms cold starts, and ran 185 experiments across three escalating rounds, all built and executed by parallel AI agents.

Round 1 was 100 experiments probing every edge of the sandbox: memory limits (found a real security gap where ArrayBuffer bypasses memoryLimit), CPU kill switches, prototype pollution containment, escape attempts, and benchmarks (246 ops/sec, 15ms cold start). We built 50 toy projects, including a competitive coding judge, a FaaS platform, a bytecode VM running inside a V8 isolate, and 450-invocation agent arenas.

Round 2 went deeper with 65 experiments on object persistence, CRDTs converging across 3 separate sandboxes, an object-capability operating system, live object migration at 52.5/sec, and agent ecosystems where Lotka-Volterra population oscillations spontaneously emerged.

Round 3 is where it got unhinged: 20 deeply novel experiments that reframed sandboxes as a computation primitive, not just isolation but a building block for cognition and society. We built an adversarial debate system where a verifier fact-checks claims by *actually running the algorithms*, a biological immune system for code with memory cells that detect threats 20,330x faster on re-exposure, an economy where Sybil attacks turned out to be undetectable (a real security finding), agents that spontaneously invented governance to reverse a tragedy of the commons, causal inference via 40 "parallel universe" interventions, and code that evolved itself from bubble sort to introsort for a 44.6x speedup. Every experiment was ELO-ranked in tournaments of tens of thousands of matches.

The punchline: sandboxes aren't jails. They're five things at once: cognitive building blocks (each sandbox is a thought), safe mutation substrates (disposable universes for code evolution), mechanism design infrastructure (perfect information isolation), possible-worlds semantics (counterfactual realities for reasoning), and emergence substrates (genuine isolation produces genuine culture, language, and institutions).
Will Chen@stablechen·
@pmarca would you call regular honest reflection and personal auditing / cognitive debugging “introspection” ?
Marc Andreessen 🇺🇸
My big conclusion from this week: Introspection causes emotional disorders.
Will Chen@stablechen·
I think the crypto definition of "intent" is actually super apt for AI agents

Intent = description of desired end state, not some mystical abstract essence of free will

Describe your intents, spawn agents to explore the solution space, accept a solution by weighing it against cost
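The pattern in the tweet above can be sketched in a few lines. Everything here (the `intent` shape, `solve`, the toy solvers) is illustrative naming, not a real API: an intent is just a predicate over end states, solvers explore independently, and acceptance weighs satisfying candidates by cost.

```javascript
// Hedged sketch of intent-based solving: describe the desired end
// state, spawn solvers, accept the cheapest satisfying result.
function solve(intent, solvers) {
  const candidates = solvers
    .map((s) => s())                             // explore solution space
    .filter((c) => intent.satisfied(c.state));   // reject wrong end states
  if (candidates.length === 0) return null;      // no acceptable solution
  return candidates.reduce((a, b) => (b.cost < a.cost ? b : a));
}

// Intent: "array sorted ascending" -- says nothing about *how*.
const intent = {
  satisfied: (xs) => xs.every((x, i) => i === 0 || xs[i - 1] <= x),
};

const input = [3, 1, 2];
const solvers = [
  () => ({ state: [...input].sort((a, b) => a - b), cost: 5 }), // correct, pricier
  () => ({ state: [...input].reverse(), cost: 1 }),             // cheap, wrong end state
];

console.log(solve(intent, solvers)); // accepts the sorted candidate
```

The cheap solver is rejected despite its lower cost because its end state fails the intent, which is the whole point: the intent constrains outcomes, not methods.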
Will Chen@stablechen·
@japarjam Feel u bro, just downed 6 burgers, not feeling too well
Just Jeff@japarjam·
@stablechen Only place open at 4 am in the airport. Made me sick AF
Will Chen@stablechen·
McDonald’s is no longer a place you go when you’re broke and hungry. It’s a place you go to treat yourself to childhood nostalgia
Will Chen@stablechen·
conspiracy theorists get the mechanism wrong but the outcome right. debunkers get the mechanism right but the outcome wrong.

the correct take is usually: ‘I don’t know who did it or how, but the incentive structure made this almost inevitable’
Will Chen@stablechen·
i vibe coded an entire personal AI operating system this week, and the key insight was simple: don't optimize when things run. optimize the rhythm.

morning review loads my finances, orders food, assigns my workout, recites my mantra, and figures out what i need to do today. basically a guided conversation that takes around 2 hours, but if i do it, it makes sure everything else in my life is taken care of. evening review processes the day and auto-delegates overnight agent tasks.

it was hard to get agents to stay productive for 8 hours autonomously, and i'm still scaling it up, but the key is taking lots of logs throughout the day and automatically coming up with ideas on how to be useful.

i use computational metaphors as a generative process. for example, if there's too much complexity in one area of my life and i'm curious how to organize things, i have the agents autoresearch different indexing strategies. it's kind of really hard to exhaust a codex plan lol

i used to beat myself up for forgetting lessons. now i convert them into systems and forcing functions. the cost of replanning is near-zero, so i tear down and rebuild daily with fresh context. a predictable daily schedule is actually OP: it lets you iterate and compound learnings really fast with your symbiotic AI system.

MORNING = sync with AI to know what to work on
DAY = work with AI while logging everything
EVENING = review and automatically convert open threads into autonomous agents
NIGHT = AI works autonomously while you sleep

tear down your plans daily, reassess with fresh context. ralph loops aren't just good for agents, they're good for humans too.

also: use Unix metaphors + biology metaphors. start by building groups of functionality and compose them into organ systems, etc.
Karun Kaushik@karunkaushik_·
when you step into the office and your founding AI engineer is on his 3rd all nighter this team never stops shipping
[attached media]
Will Chen@stablechen·
the goal is to create systems that so reliably produce good output that you can reduce your concerns to a single scaling variable: time / money / compute etc

autonomous agents are getting so good that the only question now is "how much compute do you wanna throw at it"
Will Chen@stablechen·
fully autonomous algorithms “designed to search for a repeatable and scalable business model” (aka startups) will be successfully autoresearched within 18 months
Will Chen@stablechen·
atomic habits -> agentic habits

not "agents do your habits for you", that's just automation

when execution gets delegated, the bottleneck shifts. the habits that matter become:

- conscious interrupt: noticing when something feels off
- system debugging: is this actually working?
- taste curation: what deserves my attention?
- architecture: designing the systems themselves

execution habits get cheaper every day. metacognitive habits become the only ones that compound
Will Chen@stablechen·
many devs are still designing for 2024 sonnet 3.5 levels of unreliability, obsessing over context engineering. it's important to re-evaluate and surface assumptions regularly

modern models actually get surprisingly far with just raw data, an interface, a desired outcome, and instructions