JBaba
@JBabaTalks
1.9K posts

Let's build, https://t.co/vy1e38LSoY about me. https://t.co/ZvN4aVUywF Building. https://t.co/H51kcr3F0P Failed.

Joined September 2018
163 Following · 62 Followers

Pinned Tweet
JBaba @JBabaTalks
Just made a math factory game for my kid, and here is a demo. Making games just got easy. Used @Zai_org @MiniMax_AI @OpenAI Codex. Planning and implementation split. #gamedev
JBaba @JBabaTalks
@shl This means everyone should code.
Nick Huber @sweatystartup
AI is about to get 20x more expensive. These $200/month Claude subscriptions are burning $5,000 worth of credits. The bubble is going to pop, and it will pop soon.
JBaba @JBabaTalks
@big_duca Isn't it analogous to how creativity leads to stupid decisions? Like Newton believing in an intelligent designer. The guy invented a couple of math disciplines over a weekend.
Duca @big_duca
We have AGI (for coding). And yet so much software is still so damn buggy. (including my own startup) Why?
JBaba @JBabaTalks
@naval One piece of software eats other, smaller software.
Naval @naval
Software was eaten by AI.
JBaba @JBabaTalks
@adamdotdev We have invisible hands from the top pushing for more work as well.
Adam @adamdotdev
Man, I felt this video so hard. It feels like all of us (devs) that are using AI are trying to tiptoe around these feelings, or hang onto "but I use it correctly! I review the code!". Idk, I think the drug analogy is apt. Once you have the button, it feels impossible to not use the button.

On the one hand, there is something so tempting about throwing it all away and going back to life as it was before all of this. There's even a little part of me that hopes it's all unsustainable and crumbles down around me so I don't have to make the impossible decision.

On the other hand, I've spent most of my career unable to sleep at night because I couldn't wait to wake up and implement the solution that came to me before bed. I don't have that problem anymore. I sleep like a baby, and I'm only now realizing it's because there's no mystery anymore, no burning desire to wake up and shape the world. I just put the prompts in, and that doesn't get me out of bed. Rock and a hard place for sure.

Anyway, use OpenCode, our new subscription (Go) is the best way to buy your drugs I mean tokens! 🥲
Mo @atmoio
I was a 10x engineer. Now I'm useless.
gabe @allgarbled
Every single software engineer I know has told me their plan if the profession gets automated is to become an electrician.
JBaba @JBabaTalks
@jamwt Could it be bad instructions, though?
Jamie Turner @jamwt
Man, if you don't check carefully, LLMs generate some pretty bad code.
JBaba @JBabaTalks
@karpathy I'm not very familiar with AI, but I am a senior engineer. Any examples you can give of ways I could use this for traditional software engineering?
Andrej Karpathy @karpathy
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I do daily, for 2 decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things, e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale, of course: you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
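The workflow described above (propose a change, run the experiment, keep the change only if validation loss improves, plan the next one from the results) can be sketched as a greedy propose-and-evaluate loop. This is purely illustrative: both the agent and the training run are stubs here (a toy loss over two hypothetical hyperparameters), since the real versions would be an LLM and a full training job.

```python
import random

def run_experiment(config):
    """Stand-in for a real training run: returns a validation loss.
    Toy convex loss over two made-up hyperparameters."""
    wd, beta2 = config["weight_decay"], config["adam_beta2"]
    return (wd - 0.1) ** 2 + (beta2 - 0.95) ** 2

def propose(best_config, rng):
    """Stand-in for the LLM agent: perturb one knob of the current best.
    A real agent would read the experiment log and reason about it."""
    candidate = dict(best_config)
    key = rng.choice(list(candidate))
    candidate[key] += rng.uniform(-0.05, 0.05)
    return candidate

def autoresearch(initial, rounds=200, seed=0):
    """Greedy loop: accept a proposed change only if it lowers val loss."""
    rng = random.Random(seed)
    best, best_loss = initial, run_experiment(initial)
    accepted = []
    for _ in range(rounds):
        candidate = propose(best, rng)
        loss = run_experiment(candidate)
        if loss < best_loss:  # keep only changes that improve validation loss
            best, best_loss = candidate, loss
            accepted.append(candidate)
    return best, best_loss, accepted

best, loss, log = autoresearch({"weight_decay": 0.0, "adam_beta2": 0.9})
```

The real version differs mainly in cost: each `run_experiment` is hours of GPU time, which is why promoting findings from small models (depth=12) to larger ones (depth=24) matters.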
JBaba @JBabaTalks
Congratulations, India, on the World Cup.
Wes Winder @weswinder
Traditional devs hating on vibe coding is the most predictable reaction in tech history. The people who memorized the old rules always resent the people who skip them.
JBaba @JBabaTalks
@thdxr The first statement is false. The second is true only by comparison with the first.
dax @thdxr
What if the models haven't actually improved in months? What if we're all just getting dumber?
JBaba @JBabaTalks
@iannuttall Really, or just more bait...
JBaba @JBabaTalks
Evolution of coding:
> Me coding
> Me using Copilot copy-paste
> Using a single Claude session
> Agentic Claude
> Running a Ralph loop
> Running parallel sessions
> Me planning while my agent orchestrates multiple parallel agents
What's next?
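The last step in that list (a planner handing work to multiple parallel agent sessions) can be sketched with a thread pool. Everything here is a stand-in: `agent_session` is a stub for whatever drives one real coding-agent session (which would typically shell out to an agent CLI and wait on it), and the task strings are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def agent_session(task):
    """Stand-in for one coding-agent session. A real version would
    launch an agent process for the task and return its result."""
    return f"done: {task}"

def orchestrate(plan, max_parallel=4):
    """The human (or a planner agent) supplies the plan; worker
    sessions execute the tasks concurrently, results in plan order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(agent_session, plan))

results = orchestrate(["add login page", "write tests", "update docs"])
```

Since the real bottleneck is waiting on agent I/O rather than CPU, threads (not processes) are the natural fit for this fan-out.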
JBaba @JBabaTalks
@anvanvan Hm... in that mode, what will you do?
An Van @anvanvan
@JBabaTalks No, the difference is the agents are doing the planning and not you anymore
JBaba @JBabaTalks
@anvanvan That's the last point I made.
An Van @anvanvan
@JBabaTalks > Me having an idea → my agents planning and then orchestrating multiple parallel agents
JBaba @JBabaTalks
@brankopetric00 A mistake every new senior engineer building an app makes.
Branko @brankopetric00
Kubernetes was built to solve Google-scale problems. You are not Google. You're a SaaS with 300 customers. Docker Compose would've been fine.
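For a SaaS at that scale, the alternative amounts to a handful of lines of config. A minimal, hypothetical `docker-compose.yml` for one app container plus Postgres; the service names, port, image tag, and credentials are all made-up placeholders:

```yaml
services:
  app:
    build: .                      # the SaaS application itself
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - pgdata:/var/lib/postgresql/data   # persist data across restarts
volumes:
  pgdata:
```

One `docker compose up -d` on a single VM covers this; the orchestration features Kubernetes adds (autoscaling, multi-node scheduling) only pay off well past this point.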
JBaba @JBabaTalks
@asaio87 It's easy for an experienced person, at least.
andrei saioc @asaio87
I think AI creates the illusion, for so many people, that developing production apps is easy.