Matt Oswalt

309 posts

@Mierdin

Principal Systems Engineer @Cloudflare, O’Reilly Author, Perpetual Learner. In pursuit of Essentialism. Links: https://t.co/La8UW0iFoO

Joined December 2009
35 Following · 5.9K Followers
Pinned Tweet
Matt Oswalt @Mierdin
I'm feeling compelled to get back into making regular video content. What would you find most useful? Could be a prerecorded video or a livestream, or both.
0
0
0
625
Matt Oswalt @Mierdin
💯 - From the beginning of this year (and even stronger now) I have held the position that the two killer use cases for my line of work are:
- Troubleshooting/debugging - especially with MCP or CLI access to real systems, or even screenshots, I have had exactly the same experience @thorstenball is describing. Point an agent at a codebase and also a Prometheus instance where that service is reporting metrics and ask "explain this behavior in this timeframe" - crazy how well it can do even cross-service event correlation
- Learning - exploring a new codebase or even just breaking down a technical topic and asking dumb questions in my own words. The agent won't get annoyed at my questions, or insulted when I just say "explain it better".
There's still both an art and a science to doing these "right" - and like always you still need to keep your brain turned on, but these have really moved the needle for me much more than having them actually "produce" something like markdown or code (though these use cases have their place too obv, just sharing the relative impact for me personally)
Thorsten Ball @thorstenball

Under-discussed: how good agents are at debugging issues in production. Everybody's constantly talking about how well they write code, but give an agent access to the gcloud CLI and a screenshot of a graph and, good god, will it go.

0
1
5
471
Matt Oswalt @Mierdin
One tip I would have for working with coding agents - especially for long sessions or big prompts: Periodically ask before implementation for the agent to surface any assumptions it may be making. Often forcing these to become explicit in a summarized list and then agreeing with or correcting them as needed can help prevent a lot of headaches later. Combined with an approach of being generally pretty detailed in the initial guidance, this provides a much better set of guardrails for the model to operate within, which improves the outcome greatly.
2
0
2
171
Matt Oswalt retweeted
Hidden in Frame @HiddenFrame_
In the movie Eternal Sunshine of the Spotless Mind (2004), the female's hair colour is a reference to the various stages of her relationship with Joel - 1. Green - The stage of their relationship when her hair is dyed green 2. Red and Orange - The stage of their relationship when her hair is dyed red and then subsequently orange 3. Blue - Blue stage
9
10
187
25.3K
Matt Oswalt retweeted
Mitchell Hashimoto @mitchellh
We've gone really quickly from "local models are dogshit" to "local models are good actually" (like, a 12 month window from A to B). I don't think they're actually good ENOUGH yet. We need an Opus 4.5 quality local model. When that happens, I think the world will spill over. Opus 4.5 is/was amazing, and is more than good enough for almost all tasks still as long as you pair with a frontier-level planner/judge. It'll still require a hugely expensive machine to run it, I'm sure, like a $5K or more laptop or mac studio. But, that's going to be pennies compared to the API costs plus all the benefits of guaranteed privacy and so on.
177
202
4K
251.3K
Matt Oswalt @Mierdin
I understand this is all very new but it seems like every few weeks there’s some version of “oh ___bench is dead, here’s the ‘real’ new benchmark”. Open to being wrong - excited in fact, I’m still learning how it’s all done myself.
2
0
0
141
Matt Oswalt @Mierdin
Not to diminish the work anyone is doing in this space at all, but it seems to me that we as an industry still have no idea how to benchmark frontier models’ software engineering capabilities. The apparent rampant gamification and the variance in results don’t inspire confidence.
1
0
1
208
Matt Oswalt @Mierdin
Have been switching back to GPT5.5 for personal projects for a bit.
- Anthropic models are clearly trained to act like real people and it's just creeping me out. The novelty is gone and I just want the thing to be a tool again.
- I hate having to maintain configs for multiple harnesses (forced Claude Code usage)
- Opus 4.8 is way better than 4.7 IME but still makes some bizarre choices even when I'm pretty clear/constrained with my instructions.
0
0
4
338
Matt Oswalt retweeted
Mitchell Hashimoto @mitchellh
@davidcrawshaw That’s because people like you are interested in the results. For every one of you there’s more that are just tabbing over to YouTube or scrolling their phone and turning their brains off
13
18
812
26.1K
Matt Oswalt @Mierdin
I wonder if this personification may (ironically) actually slow down adoption of coding agents, as it causes newcomers to misunderstand their capabilities and have to learn the rough edges via a much more circuitous path.
0
0
0
372
Matt Oswalt @Mierdin
Love @dhh's characterization of coding agents as a mech suit youtube.com/shorts/IeOZtj0… In addition to often being associated with anti-human marketing, the whole personification of AI/agents has never made sense to me. This feels much better.
1
0
2
819
Matt Oswalt @Mierdin
It has been enough justification for me to tolerate using a separate agent for personal projects (we use opencode at work) too, but re-consolidating is looking more and more attractive as time goes on.
0
0
0
111
Matt Oswalt @Mierdin
It's a shame, too - until recently, the economics of even the basic Claude subscription were really pretty good. And Opus 4.6 has generally been great for me, for a while (I'm currently hard-pinned back to that for my default)
1
0
0
184
Matt Oswalt @Mierdin
Opus 4.7 has been so bad. Not only for programming tasks but even just having it help me with research, articulating tradeoffs, even validating my outdoor wifi / IP camera design (and I mean basic stuff like power). Can't count all the "my mistake!" and "you're totally right!"
1
0
0
420
Matt Oswalt retweeted
DHH @dhh
Oh I love it. Not because I can imagine anything useful off the top of my head, but holy smokes it looks FUN! We need computers to be more like this more of the time. A little GeoCities. A little more crazy. A little more TempleOS.
Orhun Parmaksız 👾 @orhundev

I'm so excited to announce a new terminal emulator! 🎉 Meet "Ratty"🐀 🧀 A GPU-rendered terminal emulator with inline 3D graphics. 🪤 Try it out: ratty-term.org ⭐ Source: github.com/orhun/ratty #rustlang #terminal #ratty #ratatui #opensource

57
73
1.7K
135K
Matt Oswalt @Mierdin
I've had it on my list to do some basic hello world stuff with CUDA for a while, just never made the time. This may be the excuse that gets me there: nvlabs.github.io/cuda-oxide/
0
0
0
177
Matt Oswalt @Mierdin
Trying something new: doing a blog read-through with commentary. Had this in an open tab for a while so figured it would be a good way to get some thoughts out on the AI topic. Source: jasonrobert.dev/blog/2026-04-1…
0
0
0
203