Abhi

400 posts
@bufoverflow

cto & entropy weaver @polychain. long (🇺🇸, crypto). grew up with the monks #!/bin/bash # opinions are mine, not my employer's chown $USER:opinions

software x finance interface · Joined October 2012
869 Following · 279 Followers
Abhi @bufoverflow
Biggest unlock since GPT-4 is generating the features and the attacks at the same time. The shift-left movement is structurally built for adversarial AI: during codegen the model has full context (intent, deps, IaC topology, and implied threat models). That attack-surface density decays as you move right, where you're reduced to pattern matching. CTOs/CSOs can finally turn software and security engineering into a dialectical process. Red teaming isn't a separate team or a tool that runs after you deploy; it is literally a second pass at codegen through the same model with an adversarial dev.prompt
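A minimal sketch of that second adversarial pass, assuming nothing beyond a generic chat-completion call. `complete()` below is a hypothetical placeholder, and the prompts and pipeline shape are illustrative, not a shipped tool:

```python
def complete(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion call to any provider."""
    return f"<model output for: {prompt[:40]}...>"

def generate_feature(intent: str, context: str) -> str:
    # Pass 1: codegen with full context (intent, deps, IaC topology).
    return complete(
        "You are a software engineer. Implement this feature.\n"
        f"Intent: {intent}\nRepo/infra context: {context}"
    )

def red_team(code: str, intent: str, context: str) -> str:
    # Pass 2: the same model, re-prompted as an adversary against its own output.
    return complete(
        "You are an attacker with full knowledge of this codebase.\n"
        f"Intent: {intent}\nContext: {context}\nCode:\n{code}\n"
        "List concrete exploits and the diffs that close them."
    )

if __name__ == "__main__":
    intent = "expose a signed-URL upload endpoint"
    context = "FastAPI service, S3 bucket via Terraform, JWT auth"
    code = generate_feature(intent, context)
    findings = red_team(code, intent, context)
```

The point of the design is that both passes share the same context window, so the adversarial pass sees everything the feature pass saw.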
Marc Andreessen 🇺🇸 @pmarca
Every security engineer knows "security through obscurity" doesn't work. But that's how we've actually been operating for the entire existence of computers, until now. AI can finally fix that.
Abhi @bufoverflow
nope. i’m eaglemaxxing. 🦅🇺🇸
[image]
Abhi @bufoverflow
So what does a Homebrew Gradient Club mean for us mortals? Compute or harness isn't the only moat. Autoresearch-accelerated proprietary gradient and mechanism design is your most durable moat.
Abhi @bufoverflow
People are overthinking and underestimating @karpathy's autoresearch at the same time. It isn't RSI or just autonomous hyperparam search, but something weirder. It feels like the Homebrew Computer Club moment for model, optimizer, and training-loop experimentation.
Andrej Karpathy @karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I've done daily for 2 decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of experiment results and used them to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually before, and they stack up and actually improved nanochat. Among the bigger ones:

- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course: you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges.

And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

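A stripped-down sketch of the loop described above: propose a change, score it with a cheap proxy run, keep it only if validation loss improves, and plan the next experiment from the log. `propose_change` and `train_and_eval` are hypothetical hooks, not part of nanochat:

```python
import random

def propose_change(history):
    """Hypothetical hook: the agent plans the next experiment from past results."""
    return f"change-{len(history)}"

def train_and_eval(changes):
    """Hypothetical proxy run: train a small model, return its validation loss."""
    return random.random()  # stand-in for a real short training run

def autoresearch(baseline_loss: float, rounds: int = 20) -> list:
    """Greedy experiment loop: keep a change only if it improves val loss."""
    best_loss, accepted, history = baseline_loss, [], []
    for _ in range(rounds):
        change = propose_change(history)            # agent reads past results
        loss = train_and_eval(accepted + [change])  # small-scale proxy metric
        history.append((change, loss))
        if loss < best_loss:                        # additive improvement: keep
            best_loss = loss
            accepted.append(change)                 # stack it onto the config
    return accepted

if __name__ == "__main__":
    kept = autoresearch(baseline_loss=1.0)
```

The promote-to-larger-scale step from the tweet would sit outside this loop: rerun `accepted` at depth=24 and keep only what transfers.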
Abhi retweeted
Nivi @nivi
You can make a significant contribution to practically any field by applying Deutsch’s theories of knowledge generation.
Abhi @bufoverflow
Ask your fav model: "Can you estimate how many tokens I have used across all of my interactions?" You have my permission to mog me with the rookie-numbers meme.
[image]
Abhi @bufoverflow
Following up on the agent-defined-x slope. Software-defined-x (networking, finance) was the last major infrastructure abstraction shift, where the control plane decoupled from the execution plane. Agent-defined-x is the same move one layer up: the control plane isn't code anymore, it's intent. Agent-defined security, agent-defined compliance, agent-defined trading. The architecture that makes it work: agents in the loop at machine speed, experts on the loop for oversight.
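A hedged sketch of that in-the-loop/on-the-loop split. The event shape, `decide()` policy, and confidence threshold are all illustrative assumptions standing in for whatever escalation policy you'd actually run:

```python
from queue import Queue

expert_queue: Queue = Queue()  # experts ON the loop drain this asynchronously

def decide(event: dict) -> tuple[str, float]:
    """Hypothetical agent policy; swap in a real model call."""
    return "allow", 0.55

def handle(event: dict, threshold: float = 0.9) -> str:
    """Agent IN the loop: act at machine speed, escalate outliers."""
    action, confidence = decide(event)
    if confidence >= threshold:
        return action                   # hot path, no human involved
    expert_queue.put((event, action))   # outlier: route to a human expert
    return "hold"

if __name__ == "__main__":
    print(handle({"type": "login", "risk": "geo-anomaly"}))
```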
Abhi @bufoverflow
@garrytan's yc pod opens with a lobster costume. Tech twitter saw the thumbnail and 30s of the video and chose violence. If you get past the joke bit and read between the lines, you'll realize the signal isn't costumes or the hype cycle but a new computing primitive: an abstraction that plans, reasons about tools, and runs adversarial simulations based only on your intents, not prior code. We have never had a computer like this before. The flywheel isn't human-in-the-loop but agent-in-the-loop (hot path, machine speed) escalating to expert-ON-the-loop (oversight, catching outliers). Every tech wave passes through a hype cycle where interest ebbs. The "path of least resistance" dunkers always confuse intercept for slope. This has the same energy as 2018 crypto twitter's "cRyPtO iS pOnZi". Ignore the costume and focus on the slope (agent-defined-x).
Y Combinator @ycombinator

With the takeoff of OpenClaw and MoltBook, a new agent-driven economy is taking shape. On the @LightconePod, we took a look at the explosive growth of AI dev tools and whether the time has come for builders to make something agents want.

00:00 - Intro
02:12 - No human involvement is changing the experience
04:55 - Does YC need to change its motto?
07:48 - Email tools and agent infrastructure
09:36 - Agent-driven documentation
13:00 - Swarm intelligence
15:36 - Content generation and dead Internet theory
18:12 - Growth, rules, and founder insights

Abhi @bufoverflow
@martin_casado The "execution eats strategy for breakfast" folks are in for whiplash.
martin_casado @martin_casado
We're very much in the electric scissors phase of AI software development ...
[image]
Abhi @bufoverflow
@agupta I predict token budget (the non-crypto kind) becomes your new employee perk.
Abhi @bufoverflow
Friends in the gc: "Abhi, join this time or else you'll have FOMO."
My wife: "He doesn't have FOMO, he has FOBI - Fear of Being Included."
Knew I snagged the right one.
Abhi @bufoverflow
This one lived in drafts for too long. @sama The whole unwrap experience was so rewarding. Mad props for shipping this!
[four images]
Abhi @bufoverflow

Santa season = recap season. Spotify Wrapped, GitHub, YT.. where is my LLM Wrapped? @sama I'd love a '25 GPT Recap. 🎁 Feels like a missed opportunity. I love the easter egg, but give me the real Wrapped: how I used LLMs, how it improved me, and how I improved it. Would love a recap of:
- Total "thought minutes"
- % of knowledge work
- Which chats were flagged as salient
- Best takes
- Emergent insights

Abhi @bufoverflow
@hamptonism Daily turmeric, sea salt and hot water gargle. Passed down through generations in my family.
Abhi @bufoverflow
Everyone appreciates the chad penguin's grind (tenacity, rebellion, existential quest). But the real hero is the mountain itself. Without the abyss and that towering pull there is no yearning. The void calls; the waddle answers 🐧🏔️
Marc Andreessen 🇺🇸 @pmarca

This is a test.
