0xDesigner

25.6K posts

@0xDesigner

👇 vibe code with me

Joined September 2021
3.4K Following · 57K Followers
0xDesigner retweeted
tweet davidson @andyreed
if you’re a pm and i send you this, it means i’m blocked from shipping
[image attached]
39 replies · 227 reposts · 4.1K likes · 169.9K views
0xDesigner @0xDesigner
btw /btw in claude code is the biggest ux breakthrough since plan mode and should be the new standard for EVERY ai chat
28 replies · 20 reposts · 1.1K likes · 104.6K views
david phelps @divine_economy
asking this strictly as a graphic designer: who is the intended audience here? i feel like i'm in an AI-generated crystals shop where some wispy white woman is about to tell me she wants to code her own dreams
[image attached]
76 replies · 3 reposts · 275 likes · 19.9K views
0xDesigner retweeted
Om Patel @om_patel5
stop spending money on Claude Code. Chipotle's support bot is free:
[image attached]
1.1K replies · 10.3K reposts · 160.4K likes · 7.9M views
gmoney.eth @gmoneyNFT
think i'm coming to the realization that none of these harnesses are as good as just using codex or claude code in terminal
94 replies · 6 reposts · 502 likes · 33.4K views
0xDesigner @0xDesigner
@jpegcrew good feedback. what's annoying about this?
1 reply · 0 reposts · 0 likes · 38 views
0xDesigner @0xDesigner
i thought i was invincible until i finally learned how to ssh into a terminal on my mac from my iphone.
12 replies · 1 repost · 81 likes · 11K views
IndiJo @odd_joel
@0xDesigner wait until you discover mosh protocol — regular SSH drops whenever your phone sleeps or you switch between wifi and cellular, but mosh keeps the session alive through all of it. total game changer for phone-to-mac workflows
1 reply · 0 reposts · 1 like · 136 views
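The setup the reply describes can be sketched in a few commands — a minimal sketch, assuming Homebrew on the Mac and a mosh-capable terminal app on the iPhone (e.g. Blink, which bundles a mosh client); `you@your-mac.local` is a hypothetical user/host placeholder:

```shell
# On the Mac: install the mosh server (assumes Homebrew is installed)
brew install mosh

# From the phone's terminal app: mosh authenticates over plain SSH,
# then hands the session off to its own UDP channel (ports 60000-61000
# by default), which survives phone sleep, wifi/cellular switches,
# and client IP changes.
mosh you@your-mac.local

# Plain SSH for comparison -- this session drops when the connection
# changes or the phone sleeps, which is the problem the reply describes.
ssh you@your-mac.local
```

Note that the Mac's firewall must allow inbound UDP on mosh's port range, and Remote Login (the built-in SSH server) must be enabled in System Settings.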
0xDesigner retweeted
Caitlin Cook @DeadCaitBounce
Believing Claude finds your questions insightful is like believing the stripper actually likes you
454 replies · 880 reposts · 11.4K likes · 509.4K views
0xDesigner @0xDesigner
gripping my bathroom sink repeating to myself in the mirror “/simplify then run acceptance test on /loop iterating the implementation until it passes then screenshot the implementation and cross-reference with the original design mockup i provided and continue iterating on loop until it’s a pixel perfect match”
3 replies · 1 repost · 21 likes · 2.9K views
0xDesigner retweeted
Teng Yan · Chain of Thought AI
The most important sentence in Karpathy's whole post is probably this: anything with a measurable score and fast feedback will become something agents can optimize for you, automatically, with no humans involved.
Andrej Karpathy @karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I do daily, and have for two decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat.

Among the bigger things, e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges.

And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

55 replies · 176 reposts · 2.1K likes · 150.8K views
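Karpathy's closing point — any reasonably cheap-to-evaluate metric can be autoresearched — reduces to a propose/measure/keep feedback loop. A toy sketch in Python, where a hypothetical `score_fn` stands in for validation loss and the "agent" is just random perturbation (real agent systems plan experiments from the history of results, but the loop shape is the same):

```python
import random

def autoresearch(score_fn, config, n_trials=200, seed=0):
    """Toy feedback loop: perturb one knob, measure the score,
    keep the change only if the score (lower = better) improves."""
    rng = random.Random(seed)
    best = dict(config)
    best_score = score_fn(best)
    for _ in range(n_trials):
        trial = dict(best)
        knob = rng.choice(list(trial))          # pick one hyperparameter
        trial[knob] *= rng.uniform(0.8, 1.25)   # nudge it up or down
        score = score_fn(trial)
        if score < best_score:                  # additive improvements stack
            best, best_score = trial, score
    return best, best_score

# Hypothetical "validation loss", minimized at lr=0.1, wd=0.01.
loss = lambda c: (c["lr"] - 0.1) ** 2 + (c["wd"] - 0.01) ** 2
best, s = autoresearch(loss, {"lr": 0.5, "wd": 0.1})
```

The loop only ever accepts strict improvements, so the returned score is never worse than the starting configuration's — the "measurable score and fast feedback" are the whole requirement.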
0xDesigner retweeted
SIP @spottedinprod
threads · back nav from bottom by @meta
6 replies · 9 reposts · 238 likes · 61.1K views