Tjerk

4.2K posts

Tjerk

Tjerk

@tjerkw

Utrecht, Netherlands Katılım Mayıs 2009
986 Takip Edilen433 Takipçiler
Tjerk retweetledi
Reid Wiseman
Reid Wiseman@astro_reid·
Only one chance in this lifetime… Like watching sunset at the beach from the most foreign seat in the cosmos, I couldn’t resist a cell phone video of Earthset. You can hear the shutter on the Nikon as @Astro_Christina is hammering away on 3-shot brackets and capturing those exceptional Earthset photos through the 400mm lens. @AstroVicGlover was in window 3 watching with @Astro_Jeremy next to him. I could barely see the Moon through the docking hatch window but the iPhone was the perfect size to catch the view…this is uncropped, uncut with 8x zoom which is quite comparable to the view of the human eye. Enjoy.
English
4.1K
45.6K
265.5K
18.8M
Tjerk retweetledi
NASA
NASA@NASA·
We see our home planet as a whole, lit up in spectacular blues and browns. A green aurora even lights up the atmosphere. That's us, together, watching as our astronauts make their journey to the Moon.
NASA tweet media
English
4.7K
66.3K
313.4K
76.9M
Tjerk retweetledi
ThePrimeagen
ThePrimeagen@ThePrimeagen·
guys, i honestly do not like clowning on Gary. I don't find being the butt of a joke funny, so I imagine he does not either. But, this is what worries me about where we are going. We are actively encouraging an entire generation that the tech is there when its not, and a couple of silly mistakes made on a website isn't the end of the world, but people's data and breaches are serious. We are entering a very VERY hackable world, and I do not like it one bit.
gregorein@Gregorein

so... I audited Garry's website after he bragged about 37K LOC/day and a 72-day shipping streak. here's what 78,400 lines of AI slop code actually looks like in production. a single homepage load of garryslist.org downloads 6.42 MB across 169 requests. for a newsletter-blog-thingy. 1/9🧵

English
293
265
5.3K
558.9K
Tjerk
Tjerk@tjerkw·
@getsome_air No support for Claude code subscriptions make this useless
English
0
0
0
83
Air by JetBrains
Air by JetBrains@getsome_air·
Run agents simultaneously in separate Git worktrees. Each gets its own working directory and branch, so their changes stay independent and don't conflict. Review, apply, and commit to main when ready.
English
5
3
41
4.1K
Tjerk retweetledi
Paul Graham
Paul Graham@paulg·
Whether the next president is a Republican or a Democrat, can we have someone with at least a little class next time? Because this is embarrassing.
Paul Graham tweet media
English
1.3K
1.4K
19.3K
1.4M
Tjerk retweetledi
Andrej Karpathy
Andrej Karpathy@karpathy·
I'm not very happy with the code quality and I think agents bloat abstractions, have poor code aesthetics, are very prone to copy pasting code blocks and it's a mess, but at this point I stopped fighting it too hard and just moved on. The agents do not listen to my instructions in the AGENTS.md files. E.g. just as one example, no matter how many times I say something like: "Every line of code should do exactly one thing and use intermediate variables as a form of documentation" They will still "multitask" and create complex constructs where one line of code calls 2 functions and then indexes an array with the result. I think in principle I could use hooks or slash commands to clean this up but at some point just a shrug is easier. Yes I think LLM as a judge for soft rewards is in principle and long term slightly problematic (due to goodharting concerns), but in practice and for now I don't think we've picked the low hanging fruit yet here.
English
253
331
4.3K
818.7K
Noah Zweben
Noah Zweben@noahzweben·
You can now schedule recurring cloud-based tasks on Claude Code. Set a repo (or repos), a schedule, and a prompt. Claude runs it via cloud infra on your schedule, so you don’t need to keep Claude Code running on your local machine.
English
298
563
7.5K
2M
Tjerk retweetledi
Stitch by Google
Stitch by Google@stitchbygoogle·
Meet the new Stitch, your vibe design partner. Here are 5 major upgrades to help you create, iterate and collaborate: 🎨 AI-Native Canvas 🧠 Smarter Design Agent 🎙️ Voice ⚡️ Instant Prototypes 📐 Design Systems and DESIGN.md Rolling out now. Details and product walkthrough video in 🧵
English
988
4.8K
41.7K
18.6M
Tjerk retweetledi
Joe Kent
Joe Kent@joekent16jan19·
After much reflection, I have decided to resign from my position as Director of the National Counterterrorism Center, effective today. I cannot in good conscience support the ongoing war in Iran. Iran posed no imminent threat to our nation, and it is clear that we started this war due to pressure from Israel and its powerful American lobby. It has been an honor serving under @POTUS and @DNIGabbard and leading the professionals at NCTC. May God bless America.
Joe Kent tweet media
English
73K
218.4K
847.1K
103M
Tjerk
Tjerk@tjerkw·
@getsome_air It doesnt support claude code on a normal plan, basically making it useless if you want claude code
English
1
0
1
133
Air by JetBrains
Air by JetBrains@getsome_air·
Air now supports Gemini CLI and Junie – alongside Codex and Claude Agent. Choose the agent that fits your task. All agents are included with JetBrains AI Pro or AI Ultimate. Prefer your own API keys? That works too. Enjoy maximum flexibility, all in one place. Download Air for free: jb.gg/7auq36
English
26
17
102
56.3K
Tjerk
Tjerk@tjerkw·
@karpathy This is a similar loop like “have all the unit tests pass” or “increase startup performance” right? Just a metric that the agent will optimize in a loop? It just that this requires “llm research”. Or am I missing something here?
English
0
0
0
72
Andrej Karpathy
Andrej Karpathy@karpathy·
Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.: - It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work. - It found that the Value Embeddings really like regularization and I wasn't applying any (oops). - It found that my banded attention was too conservative (i forgot to tune it). - It found that AdamW betas were all messed up. - It tuned the weight decay schedule. - It tuned the network initialization. This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc… All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train. py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
Andrej Karpathy tweet media
English
962
2.1K
19.5K
3.6M
Boris Cherny
Boris Cherny@bcherny·
New in Claude Code: Code Review. A team of agents runs a deep review on every PR. We built it for ourselves first. Code output per Anthropic engineer is up 200% this year and reviews were the bottleneck Personally, I’ve been using it for a few weeks and have found it catches many real bugs that I would not have noticed otherwise
Claude@claudeai

Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.

English
463
499
7.4K
1.2M
Tjerk
Tjerk@tjerkw·
@Plinz @AnamarijaML @garrytan That 40% stat is for federal income tax only. Include all taxes and it drops to ~24%. Plus the truly rich don’t even show up in income stats — they grow wealth through unrealized gains and borrow against it tax-free. The high earners paying that 40% are surgeons and lawyers…
English
0
0
0
25
Joscha Bach
Joscha Bach@Plinz·
@AnamarijaML @garrytan The top 1% pay 40% of all taxes, but that‘s fine and perhaps not enough but really not the point here. CA is planning a tax un unrealized gains, which is basically money that only exists on paper, not in someone‘s bank account
English
7
0
161
8.3K
Tjerk
Tjerk@tjerkw·
@Plinz @AnamarijaML @garrytan It’s not “unrealized gains” it’s just a wealth tax. You pay a percentage on what you own. See @garyseconomics thinking. We can basically drop all other taxes (like sales tax). The problem is however that investments (illiquid stock) is also taxed.
English
0
0
0
146
Joscha Bach
Joscha Bach@Plinz·
@AnamarijaML @garrytan The point is that this money is not castles, banquets and courtesans but the machines that produce goods, services, innovation, employment and tax revenue. Dismantling these machines to sell them to pay taxes makes everyone worse off
English
29
48
833
134.4K
Tjerk retweetledi
Seth Bannon
Seth Bannon@sethbannon·
This account was suspended immediately after this tweet.
Seth Bannon tweet media
English
366
2.1K
19.2K
1M
Tjerk retweetledi
Andrej Karpathy
Andrej Karpathy@karpathy·
A lot of people quote tweeted this as 1 year anniversary of vibe coding. Some retrospective - I've had a Twitter account for 17 years now (omg) and I still can't predict my tweet engagement basically at all. This was a shower of thoughts throwaway tweet that I just fired off without thinking but somehow it minted a fitting name at the right moment for something that a lot of people were feeling at the same time, so here we are: vibe coding is now mentioned on my Wikipedia as a major memetic "contribution" and even its article is longer. lol The one thing I'd add is that at the time, LLM capability was low enough that you'd mostly use vibe coding for fun throwaway projects, demos and explorations. It was good fun and it almost worked. Today (1 year later), programming via LLM agents is increasingly becoming a default workflow for professionals, except with more oversight and scrutiny. The goal is to claim the leverage from the use of agents but without any compromise on the quality of the software. Many people have tried to come up with a better name for this to differentiate it from vibe coding, personally my current favorite "agentic engineering": - "agentic" because the new default is that you are not writing the code directly 99% of the time, you are orchestrating agents who do and acting as oversight. - "engineering" to emphasize that there is an art & science and expertise to it. It's something you can learn and become better at, with its own depth of a different kind. In 2026, we're likely to see continued improvements on both the model layer and the new agent layer. I feel excited about the product of the two and another year of progress.
Andrej Karpathy@karpathy

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

English
644
820
8.8K
1.3M
Tjerk retweetledi
Andrej Karpathy
Andrej Karpathy@karpathy·
A few random notes from claude coding quite a bit last few weeks. Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent. IDEs/agent swarms/fallability. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE . md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits. Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased. Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion. Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage. Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building. Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it. Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements. Questions. A few of the questions on my mind: - What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*. - Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro). - What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music? - How much of society is bottlenecked by digital knowledge work? TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
English
1.6K
5.5K
40.1K
7.7M
Tjerk retweetledi
Paul Graham
Paul Graham@paulg·
@elonmusk No she didn't, and here's the evidence. She's turning the front wheels of her car to the right, away from the ICE officer on her left. If she was trying to run him over, why would she turn in the other direction?
Paul Graham tweet media
English
2.3K
1.9K
46.4K
1.3M