JBaba
@JBabaTalks
1.9K posts

Let's build, https://t.co/vy1e38LSoY about me. https://t.co/ZvN4aVUywF Building. https://t.co/H51kcr3F0P Failed.

Joined September 2018
163 Following · 62 Followers

Pinned Tweet
JBaba @JBabaTalks
Just made a math factory game for my kid, and here is a demo. Making games just got easy. Used @Zai_org @MiniMax_AI @OpenAI Codex. Planning and implementation split. #gamedev
JBaba @JBabaTalks
@shl This means everyone should code.
Nick Huber @sweatystartup
AI is about to get 20x more expensive. These $200/month Claude subscriptions are burning $5,000 worth of credits. The bubble is going to pop, and it will pop soon.
JBaba @JBabaTalks
@big_duca Isn't it analogous to how creativity leads to stupid decisions? Like Newton believing in an intelligent designer. The guy invented a couple of math disciplines over a weekend.
Duca @big_duca
We have AGI (for coding). And yet so much software is still so damn buggy. (including my own startup) Why?
JBaba @JBabaTalks
@naval One piece of software eats other, smaller software.
Naval @naval
Software was eaten by AI.
JBaba @JBabaTalks
@adamdotdev We have invisible hands from the top pushing for more work as well.
Adam @adamdotdev
Man, I felt this video so hard. It feels like all of us (devs) that are using AI are trying to tiptoe around these feelings, or hang onto "but I use it correctly! I review the code!". Idk, I think the drug analogy is apt. Once you have the button, it feels impossible to not use the button.

On the one hand, there is something so tempting about throwing it all away and going back to life as it was before all of this. There's even a little part of me that hopes it's all unsustainable and crumbles down around me so I don't have to make the impossible decision.

On the other hand, I've spent most of my career unable to sleep at night because I couldn't wait to wake up and implement the solution that came to me before bed. I don't have that problem anymore. I sleep like a baby, and I'm only now realizing it's because there's no mystery anymore, no burning desire to wake up and shape the world. I just put the prompts in, and that doesn't get me out of bed. Rock and a hard place for sure.

Anyway, use OpenCode, our new subscription (Go) is the best way to buy your drugs I mean tokens! 🥲
Mo @atmoio
I was a 10x engineer. Now I'm useless.
gabe @allgarbled
Every single software engineer I know has told me their plan if the profession gets automated is to become an electrician.
JBaba @JBabaTalks
@jamwt Could it be bad instructions, though?
Jamie Turner @jamwt
Man, if you don't check carefully, LLMs generate some pretty bad code.
JBaba @JBabaTalks
@karpathy I'm not very familiar with AI, but I am a senior engineer. Any examples you can give of ways I could use this for traditional software engineering?
Andrej Karpathy @karpathy
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I do daily, for 2 decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things, e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale, of course: you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
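The workflow described above (propose a change, run the experiment, keep the change only if validation loss improves, plan the next one from the results) can be sketched as a greedy propose-and-evaluate loop. This is purely illustrative: both the agent and the training run are stubs here (a toy loss over two hypothetical hyperparameters), since the real versions would be an LLM and a full training job.

```python
import random

def run_experiment(config):
    """Stand-in for a real training run: returns a validation loss.
    Toy convex loss over two made-up hyperparameters."""
    wd, beta2 = config["weight_decay"], config["adam_beta2"]
    return (wd - 0.1) ** 2 + (beta2 - 0.95) ** 2

def propose(best_config, rng):
    """Stand-in for the LLM agent: perturb one knob of the current best.
    A real agent would read the experiment log and reason about it."""
    candidate = dict(best_config)
    key = rng.choice(list(candidate))
    candidate[key] += rng.uniform(-0.05, 0.05)
    return candidate

def autoresearch(initial, rounds=200, seed=0):
    """Greedy loop: accept a proposed change only if it lowers val loss."""
    rng = random.Random(seed)
    best, best_loss = initial, run_experiment(initial)
    accepted = []
    for _ in range(rounds):
        candidate = propose(best, rng)
        loss = run_experiment(candidate)
        if loss < best_loss:  # keep only changes that improve validation loss
            best, best_loss = candidate, loss
            accepted.append(candidate)
    return best, best_loss, accepted

best, loss, log = autoresearch({"weight_decay": 0.0, "adam_beta2": 0.9})
```

The real version differs mainly in cost: each `run_experiment` is hours of GPU time, which is why promoting findings from small models (depth=12) to larger ones (depth=24) matters.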
JBaba @JBabaTalks
Congratulations, India, on the World Cup.
Wes Winder @weswinder
Traditional devs hating on vibe coding is the most predictable reaction in tech history. The people who memorized the old rules always resent the people who skip them.
JBaba @JBabaTalks
@thdxr The first statement is false. The second is true only by comparison with the first.
dax @thdxr
What if the models haven't actually improved in months? What if we're all just getting dumber?
JBaba @JBabaTalks
@iannuttall Really, or just more bait...
JBaba @JBabaTalks
Evolution of coding:
> Me coding
> Me using Copilot copy-paste
> Using a single Claude session
> Agentic Claude
> Running a Ralph loop
> Running parallel sessions
> Me planning while my agent orchestrates multiple parallel agents
What's next?
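The last step in that list (a planner handing work to multiple parallel agent sessions) can be sketched with a thread pool. Everything here is a stand-in: `agent_session` is a stub for whatever drives one real coding-agent session (which would typically shell out to an agent CLI and wait on it), and the task strings are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def agent_session(task):
    """Stand-in for one coding-agent session. A real version would
    launch an agent process for the task and return its result."""
    return f"done: {task}"

def orchestrate(plan, max_parallel=4):
    """The human (or a planner agent) supplies the plan; worker
    sessions execute the tasks concurrently, results in plan order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(agent_session, plan))

results = orchestrate(["add login page", "write tests", "update docs"])
```

Since the real bottleneck is waiting on agent I/O rather than CPU, threads (not processes) are the natural fit for this fan-out.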
JBaba @JBabaTalks
@anvanvan Hm... in that mode, what will you do?
An Van @anvanvan
@JBabaTalks No, the difference is the agents are doing the planning and not you anymore
JBaba @JBabaTalks
@anvanvan That's the last point I made.
An Van @anvanvan
@JBabaTalks > Me having an idea → my agents planning and then orchestrating multiple parallel agents
JBaba @JBabaTalks
@brankopetric00 A mistake every new senior engineer building an app makes.
Branko @brankopetric00
Kubernetes was built to solve Google-scale problems. You are not Google. You're a SaaS with 300 customers. Docker Compose would've been fine.
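For a SaaS at that scale, the alternative amounts to a handful of lines of config. A minimal, hypothetical `docker-compose.yml` for one app container plus Postgres; the service names, port, image tag, and credentials are all made-up placeholders:

```yaml
services:
  app:
    build: .                      # the SaaS application itself
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - pgdata:/var/lib/postgresql/data   # persist data across restarts
volumes:
  pgdata:
```

One `docker compose up -d` on a single VM covers this; the orchestration features Kubernetes adds (autoscaling, multi-node scheduling) only pay off well past this point.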
JBaba @JBabaTalks
@asaio87 It's easy for an experienced person, at least.
andrei saioc @asaio87
I think AI creates the illusion, for so many people, that developing production apps is easy.