Tyler Fox

9.2K posts

@smileyborg

Building things with software.

San Diego, CA · Joined August 2013

238 Following · 7.8K Followers

Tyler Fox@smileyborg·17h

2026: coding agents can one-shot a function to invert a binary tree in any programming language known to humankind
also 2026: god forbid the agent inserted new section 3 and now has to renumber all 12 sections below it in a markdown document

418
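The renumbering chore described in the post above is mechanical, so it can be handed to a small script instead of being done by hand. What follows is a minimal sketch, assuming sections are written as level-2 headings of the form `## 3. Title`; that heading format and the function name `renumber_sections` are illustrative assumptions, not something the post specifies.

```python
import re

# Assumed heading format: "## <number>. <title>" (hypothetical convention).
HEADING = re.compile(r"^(##) (\d+)\. (.*)$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def renumber_sections(markdown: str) -> str:
    """Renumber numbered level-2 headings sequentially, starting at 1."""
    counter = 0
    in_fence = False
    out = []
    for line in markdown.split("\n"):
        # Skip anything inside fenced code so sample headings are left alone.
        if line.startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            counter += 1
            line = f"{match.group(1)} {counter}. {match.group(3)}"
        out.append(line)
    return "\n".join(out)


doc = "## 1. Intro\ntext\n## 2. Setup\n## 2. New section\n## 3. Usage"
print(renumber_sections(doc))
```

The sketch only rewrites the headings themselves. Cross-references in the body text ("see section 4") would still need a separate pass.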

Tyler Fox@smileyborg·4 May

@lossybrain 🎯

lossybrain@lossybrain·4 May

@smileyborg LLMs don’t need to be deterministic, because humans aren’t either.

Tyler Fox@smileyborg·4 May

While it may be true today, this opinion is not going to age well. LLMs don’t need to be deterministic to produce consistent and reliable results, especially at the level of the individual tasks they will be expected to perform relative to their intelligence capability.

Ben Dickson@bendee983

Compilers are deterministic. Give them the same code with the same compiler settings, and you'll always receive the same binary. You can take responsibility for your software at the code level. LLMs, on the other hand, are stochastic. Even if you set the temperature to zero, you're likely to get different responses on the same prompt. Therefore, you need to understand the code it produces if you want to take ownership and responsibility for it.

3.5K

Tyler Fox@smileyborg·4 May

Code review today is often also intent, architecture, and approach review. Those aspects will surely remain. But I am confident that we can and will move on to better ways to increase confidence in the correctness of the code, without relying on humans to read and review it all.

631

Tyler Fox@smileyborg·4 May

Note that this doesn’t mean that all software engineers will just be vibe coding. The distinction between that and agentic engineering is stark: the latter implies high standards for architecture, quality, & correctness. Code still matters, but it’s mostly implementation detail.

447

Tyler Fox@smileyborg·4 May

Jeffrey is doing incredible open source work with agents — dcg is probably the most important project that you should be using with Claude Code or Codex

Jeffrey Emanuel@doodlestein

It's now been around 4 months since my open-source dcg tool was first released, and I know from hearing from tons of users that it has saved countless people from disaster at the hands of overeager Claude Code agents. I've continued to make various performance improvements and added additional preset packs to the project, most recently for the Railway API after the recent and infamous incident where someone blamed Claude for wiping their production database.

Because of the way dcg is implemented as a "pre-tool-use hook" in Claude Code, there was no way to use it in Codex, since Codex didn't support that kind of hook at all. Until a week or so ago, when they finally added it. So I'm now pleased to say that the latest version of dcg has full support for Codex (plus it also works for gemini-cli if anyone is really using that outside of the 'Plex!).

If you're not familiar with dcg yet, I highly recommend checking it out. It's unthinkable to me now to use any coding agent that doesn't support it; it feels like speeding on the highway without a seatbelt on (or more accurately, with a sharp knife strapped to the steering wheel pointed at your heart). Agents just can't be trusted to not occasionally do crazy things that seem sensible to them at the moment, but which are wildly destructive and often irreversible. These bouts of temporary madness often occur soon after compactions, or as a result of context rot caused by excessively long sessions.

Not only does dcg mechanically prevent the agents from being able to do that, it explains to them why it did that specifically, and offers them safe alternatives custom-tailored to the specific commands they tried to run. The more agents you have running at the same time on the same project, the more dcg goes from a nice thing to have to being totally indispensable if you don't want to constantly worry about one rogue agent wiping out the work of the other agents with a misguided "git reset --hard HEAD" command.

The dcg utility itself is written in hyper-optimized, memory-safe Rust and uses minimal system resources. Because it's totally mechanical (unlike the auto-approve feature in Claude Code, which uses an AI model that adds latency), you can't even notice any delay from it running on every command. dcg is NOT just a cookbook of canned forbidden commands; frontier models are too smart and resourceful to actually be constrained by such a simplistic approach. When they're prevented from running a command one way, they'll try another way; if that also doesn't work, they'll whip up an ad-hoc Bash script or Python program to do what they want. But dcg can detect that as well using its advanced ast-grep mode (which only kicks in when dealing with such heredoc scripts, so that the faster regex-only path can be used when applicable).

It's also very quick and easy to expand and customize dcg by creating your own custom preset packs to add to the 50 or so included packs. Just ask Codex to study the existing presets and explain what you want to protect against in your own custom API or tooling, or in a third-party project that's not currently included by default in dcg.

So, remember: Friends don't let friends vibe code without dcg. Protect yourself from your agents, and protect them from themselves.

You can get it here: github.com/Dicklesworthst… It installs in under a minute on Linux or Mac using the curl-bash one-liner command shown in the README, and automatically detects any supported agent harnesses installed on your machine and configures them for you to use dcg. And if you decide it's not for you, it can be fully uninstalled in seconds using the provided command.

2.1K
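To make the idea of a purely mechanical command guard concrete, here is a minimal sketch of a regex-only check that blocks a command, explains why, and suggests an alternative. It is not dcg's implementation or API: the rule list, the messages, and the function name `check_command` are all hypothetical and chosen only for illustration.

```python
import re
from typing import Optional

# Hypothetical rules, not taken from dcg: (pattern, reason, suggested alternative).
RULES = [
    (
        re.compile(r"\bgit\s+reset\s+--hard\b"),
        "discards uncommitted changes to tracked files",
        "git stash",
    ),
    (
        re.compile(r"\brm\s+-(?:[a-z]*r[a-z]*f|[a-z]*f[a-z]*r)[a-z]*\b"),
        "deletes files recursively without confirmation",
        "move the directory aside with mv and review it first",
    ),
]


def check_command(command: str) -> Optional[str]:
    """Return None if the command is allowed, else an explanatory message."""
    for pattern, reason, alternative in RULES:
        if pattern.search(command):
            return f"Blocked: this command {reason}. Consider instead: {alternative}"
    return None


for cmd in ["git status", "git reset --hard HEAD", "rm -rf build"]:
    print(cmd, "->", check_command(cmd) or "allowed")
```

A check like this only sees the literal command string. As the post notes, an agent can route around it by writing a script, which is why a regex-only path is not sufficient on its own.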

Tyler Fox@smileyborg·3 May

NYC is a fantastic city, it’s a shame nothing like it exists in CA

1.4K

Tyler Fox@smileyborg·15 Apr

Can’t wait to see their SOLE.md

Tracy Alloway@tracyalloway

Allbirds, the shoe brand, now says it's an AI compute company.

1.3K

Tyler Fox retweeted

Chubby♨️@kimmonismus·25 Feb

I love Karpathy's posts because they're so on point. He's not only a leading expert in his field, but he also manages to capture the zeitgeist with his statements. But this post is particularly impactful. Since December, (agentic) coding has undergone a significant transformation, one could even say a qualitative leap. Before, it was a matter of iterative improvements, but since the end of last year, it has demonstrated its true value in a completely different way.

Or, in Karpathy's words: "It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since. (...) As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel."

Two points on this that need to be repeated again and again because they are often still misunderstood.

1) The very basic truth: this is the worst it will ever be. From here on out, things will get better. Even if the status quo were to remain as it is, it would be serious. But what we have today is the worst it will ever be.

2) The pace of progress is constantly increasing. It is exponential. And that's the crucial point: from December to February, more happened than in a very long time. And this trajectory will likely (almost certainly) continue.

If points 1) and 2) are true, it is simply impossible to foresee and predict how this will affect society and all essential areas. As much as I welcome and approve of this, the near future is unpredictable. That's all I wanted to say.

Andrej Karpathy@karpathy

It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.

Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.

As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.

It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.

1.1K

223.1K

Tyler Fox@smileyborg·25 Feb

We’re gonna reach AGI before Tesla solves auto windshield wipers, aren’t we

681

Tyler Fox@smileyborg·11 Feb

Software engineering in 2026

184

1.3K

14.3K

1.2M

Tyler Fox@smileyborg·6 Feb

There’s nothing like being a software engineer today, using frontier models and agents firsthand, to see how the world is being revolutionized in real-time.

2.2K

Tyler Fox@smileyborg·3 Feb

Shell shocked seeing this in the Epstein files 🫨 justice.gov/epstein/files/…

12.7K

Tyler Fox retweeted

Alex Albert@alexalbert__·2 Feb

It's only been one year since vibe coding was coined...

Andrej Karpathy@karpathy

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

109

173

278.5K

Tyler Fox retweeted

Obie Fernandez@obie·12 Dec

Even if current LLM progress hits a brick wall at Opus 4.5 level (and I doubt that will happen) the next 12 months are still going to be a staggering time of change in this industry as decision makers start truly understanding the new reality we live in. obie.medium.com/what-happens-w…

567

319.5K

Tyler Fox@smileyborg·22 Jul

@SOTSPodcast 💀

164