
Controversial take: Cursor cloud agents are the best agentic development experience for serious products:
1. Cursor cloud agent environments have the best setup experience, because setup is fully agentic. No one wants to (or even can, even with AI 😅) build a Docker image or a complete build script.
2. The new Cursor 3 desktop and web apps have by far the best, most professional and robust UX, esp. for teams, compared with conductor.build, Codex, claude.ai/code, and Claude Code
3. Cursor gives you a separate machine, incl. VNC that works really well, where I test the new app in an isolated environment w/ a desktop and browser
4. Last but not least, and this is big for me: only Cursor cloud agents come primed and equipped out of the box with the tools to open a browser and test their implementation end to end, producing a video walkthrough as a by-product
During Easter vacation I built a generic SaaS app. Some highlights:
- 0% of the code written manually
- Browser use works so well that it logs into my Microsoft 365 account, incl. the MFA token from 1Password
- The average agent runs ~30 min before it stops or asks for feedback - out of the box (!)
- 90% of prompts and iterations done on my phone --> important, because the mobile UI is so good that I had even more time with my kids than usual 😅
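
For the curious, here is a rough sketch of how an MFA code can be pulled from 1Password in a script. This is not my actual setup, just an illustration: it assumes the `op` CLI is installed and already signed in, and the item and vault names are made up. The `runner` parameter only exists so the function can be tried without a real vault.

```python
import subprocess
from typing import Callable, List, Optional


def build_otp_command(item: str, vault: Optional[str] = None) -> List[str]:
    """Build the 1Password CLI call that prints an item's current one-time password."""
    cmd = ["op", "item", "get", item, "--otp"]
    if vault:
        cmd += ["--vault", vault]
    return cmd


def fetch_otp(
    item: str,
    vault: Optional[str] = None,
    runner: Callable = subprocess.run,
) -> str:
    """Run the command and return the code with surrounding whitespace stripped.

    Raises subprocess.CalledProcessError if `op` exits with a non-zero status.
    """
    result = runner(
        build_otp_command(item, vault),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical item and vault names; replace with your own.
    print(fetch_otp("M365 test user", vault="Agents"))
```

The agent (or a test script) can then type the returned code into the login form.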
I love the feeling and vibe of Claude Code and I still use it for my side project. But for a team that builds a production-grade app, where reviews and UATs are critically important, it fails to deliver an equally seamless experience.
Key unlocks for a great agentic coding experience with longer running agents:
- Ensure you can easily boot up an isolated clean version of your app for each cloud environment (or worktree, if you're a local guy)
- Give the system the ability to test itself, e.g. we now have a CLI for our apps as a token-efficient way to test end-to-end flows; our Cursor cloud env also has the 1Password CLI to log in to Microsoft 365 accounts that require MFA and run real integration tests with actual inboxes
- Spend a lot of time on specifications. Your prompt should be 20 or even 50 lines long, incl. a test plan, UI guidance, etc. (I think paired coding is dead. Do paired spec'ing and UAT'ing!)
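
To make the first point concrete, here is a minimal sketch of one way to give every cloud environment or worktree its own clean instance: derive a port and a database name from the worktree name, so parallel agents do not step on each other. The names (`InstanceConfig`, `instance_config`, the `app_` prefix, the port range) are all made up for illustration, not something Cursor or any other tool prescribes.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceConfig:
    name: str       # filesystem- and database-safe slug of the worktree name
    port: int       # port the app instance listens on
    database: str   # dedicated database (or schema) for this instance


def instance_config(
    worktree: str, base_port: int = 3000, port_range: int = 1000
) -> InstanceConfig:
    """Derive a stable, per-worktree config from the worktree or branch name."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", worktree.lower()).strip("_") or "default"
    # Hash the name so the same worktree always maps to the same port.
    digest = hashlib.sha256(worktree.encode("utf-8")).hexdigest()
    offset = int(digest[:8], 16) % port_range
    return InstanceConfig(name=slug, port=base_port + offset, database=f"app_{slug}")


if __name__ == "__main__":
    cfg = instance_config("feature/login-fix")
    print(cfg.name, cfg.port, cfg.database)
```

Two worktrees can still hash to the same port, so a real setup should check that the port is free before booting.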


