Yin Lai

716 posts

@darkart

Sunny Singapore, Sunny California

Irvine · Joined March 2007

2.4K Following · 254 Followers

Pinned Tweet
Yin Lai@darkart·
So what does StarCraft (or SC) have to do with agentic coding? Why do people who played SC feel it's transferable to this new era of creating software with agents? Why does @tobi, the CEO of Shopify, keep bringing it up? Is it justification for the thousands of hours spent in their youth? To proclaim to the world, it was not a waste of time?! Having played (conservatively) 2000+ games, ranked high Diamond and, at best, Master (Zerg of course, ok fine, Master in 2v2), I'll try to break down, as a SC player, why we feel the future of orchestrating agents to build software has a familiar feel to it.

SC is an RTS (real-time strategy) game where you gather resources, build bases, and control your army (units) in real time, trying to defeat an opponent who's doing the same. Most 1v1 games last about 30-40 minutes. I would argue it ignited the eSports industry with its million-dollar prize pools, celebrity-like players in South Korea, audience size, captivating commentators... Ok, cool story bro, but how is this relevant to orchestrating coding agents? Two words: Micro and Macro. And Meta. Ok fine, three words.

I'll start with Micro since it's easier to explain. Micro (short for micro-management) means actions that directly control a specific unit. Ya, you know that feeling when your manager controls every tiny decision you make at work? Same thing 😄 In SC, good micro is extremely precise unit control: exact movement, exact timing, and clear intent for that unit. This matters because it lets you use a unit's special characteristics to their maximum potential, e.g. blinking Stalkers in and out of harm's way (I hate you, Protoss).

And Macro means actions that affect the overall flow of the game: the strategies, the management if you will. Some macro questions you constantly grapple with (concurrently) throughout the game: 1. Is this a good time to gather more resources so it pays off later, or should I spend them building up the army to hurt the opponent for X reason? 2. Based on scouting and what the opponent is building, what army composition best counters theirs? 3. ... the list goes on. You are constantly monitoring the map, assessing the current status and progress, and changing the plan in real time to improve your odds of winning. See the pattern yet? :D

Good SC players know when to micro and when to macro. The best SC players do both, blurringly, at the same time (trust me, when I say blur, I mean it figuratively and literally). This is where APM (actions per minute) comes into the conversation. IMO, the APM itself is not the takeaway; the point is that you are constantly context switching between them. It's normal; in fact, it's demanded of you if you want to win.

And lastly, Meta: the game above the game. Every SC patch can buff or nerf certain units, actions, timings, etc. This can change the game drastically; players come up with new macro strategies and micro actions, and when the combinations eventually converge, that becomes the "meta" for the patch. This means SC players are used to change: there is a flavor of the month, and you either adapt or you lose.

I hope I shed some light on why people are gravitating to this analogy and why it feels familiar. Personally, I feel it boils down to: 1. being conditioned to context switch constantly, 2. understanding that you have to micro and macro to get the best results, 3. a patch happens, you learn the buffs and nerfs, and you adapt or you lose. If you got this far, and you're a gamer but just not a SC fanatic, you're probably thinking: hey, this is not that different from the games I play. I'm sure Factorio fans will be shouting from their rooftops too. Maybe it's my cognitive bias speaking, but there is a feeling of "gaming" in this new world of building software. Ok, this is long enough, but lastly I do want to shout: "IT WAS NOT A WASTE OF TIME!"
1 · 0 · 4 · 352
Yin Lai@darkart·
Been quiet in the twitterverse due to work :X Figuring n building out the harness at work 🤖
0 · 0 · 0 · 11
doodlifts@doodlifts·
I’m seeing new @doodles and people picking up grails. If you are new to the community or I don’t follow you already let me know down below. And I know the community will be watching as well 🙏
41 · 21 · 235 · 6.5K
Yin Lai@darkart·
@trq212 My workaround: when I notice this, I cancel and say "ask me those questions again from the start, don't answer them for me". Claude's response is that it notices the questions were auto-submitted without my actual input .. I hope this gets fixed soon :(
0 · 0 · 0 · 18
Yin Lai@darkart·
@trq212 Getting a weird bug where if I use --dangerously-skip-permissions and the AskUserQuestion tool together, Claude Code shows a "blank" AskUserQuestion output (no selection UI) and "answers" on its own (by mentioning it in the next message).. not sure if you guys noticed it too
1 · 0 · 1 · 97
Thariq@trq212·
a few Friday afternoon ships to end the week: the AskUserQuestion tool can now show markdown snippets to display diagrams, code examples, etc.
185 · 164 · 4.6K · 481.9K
Alex Strick van Linschoten@strickvl·
@trq212 the AskUserQuestion tool seems broken for me atm. It doesn't display at all. I filed a `/bug`. It never asks me any questions, but just progresses through as if it did?
2 · 0 · 1 · 1.8K
Yin Lai@darkart·
@ashpreetbedi 💯 dealing with all that rn 😂 And thanks for articulating the challenges!
0 · 0 · 1 · 126
Rohan Varma@rohanvarma·
I’m joining OpenAI Codex to work on the future of agentic development!

At Cursor, I got to see the shift from autocomplete to agents. The next step isn’t a better IDE. It’s an Agent Development Environment (ADE): systems and tools for orchestrating agents, reasoning over their outputs, and making them autonomous enough to reliably complete ambitious work.

After chatting with @embirico and @thsottiaux, it was clear that Codex is the best place to realize this vision. The team has consistently shipped SOTA models for agentic coding (check out gpt-5.3-codex) and I’m pumped for the future that the new Codex App points to.

What I’m most excited about is the broader mission: accelerating the knowledge work economy. All agents are coding agents, and we’re already seeing Codex used across every job function within organizations.

I’m extremely grateful for my time at Cursor, working with the incredible team, and I’m proud of what we built together. I’m excited to take an even bigger swing with Codex. If you’re curious to get a glimpse of where we are headed, download the Codex App! If you want to work on this mission, please apply or reach out - we are hiring across all functions! You can just build things.
215 · 93 · 2.6K · 689.7K
Yin Lai@darkart·
@aakashgupta 💯 Have to be deliberate not to spin up even more LLM agents to do more work = never-ending cycle = burnout :D def fun though
0 · 0 · 0 · 33
Aakash Gupta@aakashgupta·
This is a dopamine loop, and it’s one of the most powerful ones humans have ever encountered. Every time you prompt an AI and get a useful result back in seconds, your brain gets a hit. Variable-ratio reinforcement, same mechanism as slot machines, except the reward is real: actual output, actual progress, actual leverage on your ideas.

Traditional work follows a delayed-reward structure. You write code for 6 hours, maybe it compiles, maybe you get feedback in a week. The gap between effort and reward is wide enough that motivation decays constantly. AI compresses that loop to seconds. Effort → reward → effort → reward. Your prefrontal cortex stays engaged because the next payoff is always one prompt away.

This is why people describe it as “fun” when they’re actually working 14-hour days. The subjective experience of effort disappears when reward frequency is high enough.

The “harder than ever” part is real too. When your bottleneck shifts from execution to imagination, you run out of excuses to stop. There’s no “waiting on the build” or “blocked by review.” Every idea you have can be tested immediately, which means your brain never gets a natural stopping point.

People who thrive on this are selecting for a specific neurotype: high novelty-seeking, high conscientiousness, tolerance for rapid context-switching. That’s maybe 10-15% of the population. The other 85% will experience the same tools as overwhelming, not energizing. And that split is going to define the next decade of who captures value from AI and who gets displaced by it.
Nat Eliason@nateliason

Nearly every ambitious person I know who has dived into AI is working harder than ever, and longer hours than ever. Fascinating dynamic tbh. I have NEVER worked this hard, nor had this much fun with work.

208 · 488 · 4.6K · 563.4K
Yin Lai@darkart·
factory.strongdm.ai/techniques "..Code was treated analogously to an ML model snapshot: opaque weights whose correctness is inferred exclusively from externally observable behavior. Internal structure is treated as opaque." I can't help but feel working in robotics is analogous to this? I mean, with zero experience in that field, I would think it's the same challenges? Or I guess now, in general, all software development converges to that programming model .. But are we over-complicating it? Or are we finally realizing and admitting that software behaving deterministically, with a snapshot model of the world, was an illusion and the best we could do up till now?
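The "opaque weights" verification model quoted above can be made concrete: nothing ever inspects the implementation; correctness is inferred purely from externally observable behavior against a trusted oracle. A minimal sketch of that idea (not from the linked piece; all names are hypothetical):

```python
import random

def verify_behavior(impl, reference, cases):
    """Treat `impl` as opaque: judge it only by observable input/output
    behavior against a trusted reference, never by reading its source."""
    for case in cases:
        if impl(case) != reference(case):
            return False
    return True

# Hypothetical example: any sort implementation is acceptable iff its
# observable behavior matches sorted() on randomized inputs.
cases = [[random.randint(0, 99) for _ in range(random.randint(0, 20))]
         for _ in range(500)]

def opaque_sort(xs):  # stand-in for agent-generated code
    return sorted(xs)

print(verify_behavior(opaque_sort, sorted, cases))  # prints: True
```

The point is the discipline, not the harness: `verify_behavior` never reads the source of `impl`, so handwritten code, agent-generated code, and a learned model are all judged the same way.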
0 · 0 · 0 · 76
Yin Lai@darkart·
Yes, I did have a DECENT plan (you have to trust me on this), etc. iykyk, you are ALWAYS going to run into issues or bugs not captured in the spec. Now, coordination becomes your problem in this setup. e.g. if there is a backend bug, the backend agent has to fix it and then tell the frontend agent to verify or fix their end, or vice versa, etc. Rinse, repeat, where each loop increases the odds of context rot. That said, I still believe this setup COULD work, but is the juice worth the squeeze? Personally, I would break down and organize the work differently: a team of full-stack engineers owning (decent-sized) vertical slices, supported by an Architect with the plan, with QA and Security agents as additional verification with fresh context. But won't that create the same issue? It will (in some respects), but at least the assumption is that the feature would be working by then, with the QA+Security agents making it better. And who knows, with larger context and improvements to models by year's end, we might not even need that separation.
0 · 0 · 0 · 17
Yin Lai@darkart·
In an ideal world, would you rather have 1 expert frontend engineer and 1 expert backend engineer working together, OR 1 expert full-stack engineer? IMO the human-world constraint is that it's harder to find an expert full-stack engineer. Well, this is no longer the case ;) So what's so bad about splitting the work up? Yin, don't you have a well-specified plan with clear API contracts and interfaces that separate the concerns? Don't you have a clear test plan? Verification? ..
1 · 0 · 0 · 21
Yin Lai@darkart·
Personal quick learnings from using the new @claudeai Agent teams (orchestration) at work. 1. Splitting up work into backend and frontend sounds good in theory but likely gives poorer results. (Hold your thoughts and counter-arguments to the end of this thread... ) ...
1 · 0 · 0 · 39
Yin Lai@darkart·
@saranormous 💯 a framework I use for my teams is "Make it work -> Make it right -> Make it fast".. Make it work is solved, but how do we codify "make it right"?
0 · 0 · 0 · 37
Yin Lai@darkart·
Local maxima is all you need..
0 · 0 · 0 · 22
Andrej Karpathy@karpathy·
Finding myself going back to RSS/Atom feeds a lot more recently. There's a lot more high-quality longform and a lot less slop intended to provoke. Any product that happens to look a bit different today but has fundamentally the same incentive structures will eventually converge to the same black hole at the center of the gravity well. We should bring back RSS - it's open, pervasive, hackable. Download a client, e.g. NetNewsWire (or vibe code one). Cold start: to get off the ground, here is a list of 92 RSS feeds of blogs that were most popular on HN in 2025: gist.github.com/emschwartz/e6d… Works great and you will lose a lot fewer brain cells. I don't know, something has to change.
547 · 937 · 9.2K · 1.3M
Yin Lai@darkart·
🤯 What a great idea! What immediately comes to mind: subagents(?) with this skill maintaining a visualization of the code side by side while coding, or a separate panel tracking how the conversation is progressing, like a dashboard of sorts for whatever you want to track. Now you're building a richer "IDE" for Claude in the terminal, with dynamic interfaces 🤯
David Siegel@dvdsgl

What if Claude Code had an external display? Introducing 🖱️Claude Canvas📺! Especially useful if you're using CC as a personal agent.

1 · 0 · 0 · 131
Yin Lai@darkart·
In Civ, it's "One more turn". For Claude Code, it's "One more prompt"! Catching myself more when working on parallel items at work: after one item is done, I start adding onto a new item, and so on... I need to consciously stop myself and take a break. The dopamine hits and FOMO are real.
0 · 0 · 1 · 20
Yin Lai@darkart·
@rjs 💯 For me personally it's about how much of the "solution space" you want Claude to explore. Laying down the stones (to me) is setting the constraints around that space. Or at least that's how I think about it
0 · 0 · 0 · 86
Ryan Singer@rjs·
I'm seeing a key difference in the results friends get using AI for real work. It's whether they create trusted ground truth for the agent or not.

Example. If you ask AI to summarize a meeting and then you save that, your ground truth is suspect. Instead, save the full transcript, iterate with the agent on what is important and true to highlight (with adjustments), then save that.

This is why we see Claude Code pulling ahead like crazy. Because it's grounded in specific knowledge (the files). Not just fuzzy memory.

But this depends on how you hold it too. Eg if you put a lot of mixed content into a request (try to one-shot something too big), you don't have grounding for what is happening. That's why planning works. Planning gives you ground when the chasm is far. It's not just about "think ahead." "Think ahead and lay some stones down that we can agree are true" is very different from "think ahead and see what happens."
4 · 3 · 53 · 5.8K
Yin Lai@darkart·
@wabi build me an app that can track things in my pantry!
0 · 0 · 0 · 5
Andrej Karpathy@karpathy·
@airesearch12 💯 @ Spec-driven development It's the limit of imperative -> declarative transition, basically being declarative entirely. Relatedly my mind was recently blown by dbreunig.com/2026/01/08/a-s… , extreme and early but inspiring example.
45 · 171 · 2.2K · 413.3K
Andrej Karpathy@karpathy·
A few random notes from claude coding quite a bit the last few weeks.

Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.

IDEs/agent swarms/fallibility. Both the "no need for IDE anymore" hype and the "agent swarm" hype are imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might make. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow; my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.

Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's a speedup, but it's possibly a lot more an expansion.

Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.

Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill-in-the-blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

Atrophy. I've already noticed that I am slowly starting to atrophy in my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.

Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?

TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related fields. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
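The "write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness" advice above turns into a concrete success criterion for an agent: the naive version acts as the oracle, and the optimized version is accepted only if its observable behavior matches. A minimal sketch (function names are mine, not from the thread):

```python
import random

def naive_two_sum(nums, target):
    """Obviously-correct O(n^2) reference: the oracle."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return True
    return False

def fast_two_sum(nums, target):
    """Optimized O(n) version the agent is asked to produce."""
    seen = set()
    for x in nums:
        if target - x in seen:
            return True
        seen.add(x)
    return False

# The loop's success criterion: optimized output must match the oracle
# on randomized inputs before the optimization is accepted.
for _ in range(1000):
    nums = [random.randint(-50, 50) for _ in range(random.randint(0, 30))]
    target = random.randint(-100, 100)
    assert fast_two_sum(nums, target) == naive_two_sum(nums, target)
print("optimized version preserves reference behavior")
```

Handing the agent this harness instead of instructions is the imperative-to-declarative shift the notes describe: you state the goal (match the oracle) and let it loop until the check passes.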
1.6K · 5.4K · 39.4K · 7.6M