dex

4.7K posts

@dexhorthy

building the post-IDE IDE at https://t.co/hDpglja33W - @aitinkerers sf lead, prev @replicatedhq @SproutSocial @nasa - ai that works pod @ https://t.co/69BhaNtWfd

San Francisco, CA · Joined January 2017
1.6K Following · 14.3K Followers
Pinned Tweet
dex@dexhorthy·
👇 a curated superthread of resources to get the most out of coding agents, advanced context engineering, research/plan/implement, and more
dex@dexhorthy·
@0xblacklight Ok but no weapons my mom will be sad
dex@dexhorthy·
I don’t disagree wholesale but there’s some shaky points in here. You don’t need to run agents in the cloud to publish their progress to the team or make them collaborative. You don’t need agents running overnight to reap the benefits of sandboxes. There are a lot of cloud things that are not containers.
Sergey Karayev@sergeykarayev·
Running agents locally is a dead end. The future of software development is hundreds of agents running at all times of the day: in response to bug alerts, emails, Slack messages, meetings, and because they were launched by other agents. The only sane way to support this is with cloud containers. Local agents hit a wall quickly:
• No scale. You can only run as many agents (and copies of your app) as your hardware allows.
• No isolation. Local agents share your filesystem, network, and credentials. One rogue agent can affect everything else.
• No team visibility. Teammates can't see what your agents are doing, review their work, or interact with them.
• No always-on capability. Agents can't respond to signals (alerts, messages, other agents) when your machine is off or asleep.
Cloud agents solve all of these problems. Each agent runs in its own isolated container with its own environment, and they can run 24/7 without depending on any single machine. This year, every software company will have to make the transition from work happening on developers' local machines from 9am-6pm to work happening in the cloud 24/7, or get left behind by companies who do.
David Cramer@zeeg·
1) not surprising whatsoever
2) this is exactly what I keep saying about models not being powerful enough today
the fact that they can do so much with lossy compression is amazing, but there's no magic here
imo (for transformers) context windows need to be 1-2 orders of magnitude larger for the future people keep saying is reality, and even then the compute is probably not worth it
Lossfunk@lossfunk

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

dex@dexhorthy·
the post states “no explanation but this works better than other things we’ve tried”. So on the model side, mostly vibes. I also do think it’s more ergonomic for developers. The xml tags give you a little bit more clarity over what’s important to a specific trigger compared to markdown headers - it’s very explicit where things begin and end
Rodrigo Elias@rodrigoelias·
@dexhorthy Can you share how you identified/validated this approach? Not doubting, but I want to understand how these evals work in practice
dex@dexhorthy·
we've been trying a bunch of stuff. this one kinda works.
Ian Livingstone@ianlivingstone·
Incredibly excited to announce Keycard for Coding Agents - no more copy & pasting credentials or approving individual tool calls. Agents get task-scoped access, so you can stay in flow and actually build. You’re only pulled in when it matters. Yolo mode, without compromise.
Keycard@KeycardLabs

Your coding agents inherit your credentials and your permissions. No identity system in the stack can tell the difference between you and the agent acting in your name. Today: Keycard for Coding Agents 🧵

dex@dexhorthy·
this guy gets it
Kent C. Dodds ⚡@kentcdodds

@dexhorthy This is a great talk. Grounded in reality. I'm doing a lot of what you're suggesting naturally (when it matters) and expect my preferred harness (Cursor) to build in some of these features in the workflow (like their Plan mode for example). Thanks for sharing!

dex@dexhorthy·
@nayshins what should we put in his hand instead of a lighter
dex@dexhorthy·
@thdxr onecli on gh?
dax@thdxr·
ai password manager is that anything?
dex@dexhorthy·
@shanraisshan Any developer on your team should be able to launch Claude in a repo, say “run the tests” and it knows how to run them on the first try, and they pass (and knows how to do any required first time setup on the first try too)
dex@dexhorthy·
@refactorfiend there are product reasons why it behaves this way that I understand and respect. Understeering is better than oversteering in almost all cases. This is an advanced technique for people who know what they're doing.
rasiim kyan bey@refactorfiend·
@dexhorthy this feels like it should be a bug ticket that should be submitted to anthropic. or maybe it has and will be addressed two minutes from now in their next update
dex@dexhorthy·
@kentcdodds welcome to x dot com the everything app where everyone is nice and humble and there are no jerks anywhere
Kent C. Dodds ⚡@kentcdodds·
@dexhorthy Man it's refreshing to talk with someone who has a healthy balance of wisdom and humility (something I sometimes struggle to do myself 😆).
Kent C. Dodds ⚡@kentcdodds·
Everyone knows that the last 40% of the context window of AI models starts to get pretty unreliable... Except I don't experience this at all 🤔 My two primary AI tools are Cursor (mostly GPT 5.4) and ChatGPT with very long conversations. Cursor compacts and it still does fine.
dex@dexhorthy·
@kentcdodds (and i think those people who do that tend to feel the most "left behind" by the coding agent madness)
dex@dexhorthy·
that's fair, i did not mean to dismiss at all, but i see how it could come off that way. I more want to qualify that i'm not an expert on every kind of problem and you should ignore my advice if you're not working in the same areas that we're working in. trust me i'm sure nobody prefers spelunking across 6 ten-year-old repos to get things done 🤣