Haven Vu

410 posts

@havenvu

building something new, 2x founder, ex gen ai lead ai/ml @ucberkeley project 0: https://t.co/2Nb60qCFSn sleep tracker for Night Shift Nurses

Joined August 2024
338 Following · 113 Followers
Pinned Tweet
Haven Vu
Haven Vu@havenvu·
I’m a top 1% Codex and Claude Code power user. 8-10 terminal tabs always running simultaneously. Hit the $200 weekly limit in 1 day. Here are some of my biggest tips. Seriously, plug this into your Codex or Claude Code and ask how you can begin doing this.
1. Stop caring so much about managing context windows. It’s always better to have large agents.md or Claude.md files and burn tokens than to have your agent forget details and implement incorrectly. You’ll end up burning way more tokens and wasting way more time if you try to token optimize. Models typically get better and their context windows go up over time anyway, so worrying about the perfect context length is short sighted. Instead, keep a memory log, a decision log and a large instruction file so that literally every session has full context on what you’re trying to accomplish.
2. To make faster decisions, tell your agent to ask YOU questions about what you believe the ideal user experience is, and work backwards. This will help you understand the tradeoffs of complex architecture without having to understand all of the nuances in architectural decisions. It’s always better to work backward from the user experience because you’ll likely end up refactoring any architecture to cater to a user experience anyway. I’ve refactored my architecture so many times because of some UX issue as opposed to some security issue/logic issue.
3. To work in parallel, spawn multiple work trees and use docker. However, shipping and integrating with main should always be done sequentially. Shipping and integrating into main will likely take about the same amount of time as building all the features out in parallel. But if you have many agents writing to main in parallel, your code will break.
4. Build harnesses and headless testing for EVERYTHING. The faster your AGENT is able to test its work, the faster you can ship, so spend time building tools for your agent to close its own loops without you needing to verify manually.
5. Start barebones with vanilla agents: I’ve uninstalled almost all MCP connections. Almost all of my skills and tools came from workflows I found myself using repeatedly out of vanilla use. Just give your agent knowledge that certain tools exist and it can call them on demand. Otherwise just build your own skills.
6. To prevent your agent from lying about being “done” with a task, always pair program with another model. The way you do that is to give your agent access to Claude Code CLI, Codex CLI, Cursor CLI and Devin CLI as tools/skills. These CLIs have the best unit economics for calling coding agents. You may end up burning 2x the tokens, but you’ll save a ton of time and that will let you ship so much faster (for me 5x faster), because my agents can run longer loops when they work with a pair programming agent. While it burns tokens, I can go ship another feature or work on something else.
7. Build your own tutor and spin up small internal tools and web apps to help you read through your codebase simply. Use excalidraw for diagrams and just have your agent teach you the codebase and update its own documentation as the codebase grows. When I was building out my mascots I literally had my agent build out a webpage for me to see all 150 iterations of my mascot. Why would I click through a complex file system when I can literally one shot internal tools for myself? Make yourself work more efficiently with the agent.
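Tip 3 (build in parallel, integrate into main one at a time) can be sketched roughly as below. This is a minimal illustration of the scheduling pattern only, not the author's actual setup: `build_feature`, `integrate`, `ship` and the `Result` type are hypothetical names, and the agent work and the merge are stubbed out. In a real project the build stub would stand in for an agent running inside an isolated checkout (for example one created with `git worktree add`), and the integrate stub for a merge followed by a test run.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Result:
    branch: str
    patch: str  # stand-in for the commits an agent produced in its sandbox


def build_feature(branch: str) -> Result:
    # Hypothetical stub: a real version would run an agent inside its own
    # worktree or container so parallel builds never share files.
    return Result(branch=branch, patch=f"changes for {branch}")


def integrate(main: list[str], result: Result) -> None:
    # Hypothetical stub for "merge into main, then run the test harness".
    # It is only ever called from the sequential loop in ship().
    main.append(result.patch)


def ship(branches: list[str]) -> list[str]:
    main: list[str] = []
    # Phase 1: build every feature in parallel, one worker per branch.
    with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
        results = list(pool.map(build_feature, branches))  # keeps input order
    # Phase 2: integrate strictly one branch at a time, in a fixed order.
    for result in results:
        integrate(main, result)
    return main


if __name__ == "__main__":
    print(ship(["feature-a", "feature-b", "feature-c"]))
```

The point of the split is that only phase 2 ever touches the shared `main` state, so a failed merge or test run can be traced to a single branch.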
0
0
3
190
Popcorn Post
Popcorn Post@PopcornPost_·
Name a movie you had no idea was gonna be THAT good.
741
20
436
60.5K
Haven Vu
Haven Vu@havenvu·
@robinebers Bro just get an independent model to run as an adversarial judge
0
0
0
32
Robin Ebers · AI for Non-Coders
Composer 2.5 in a nutshell: it's fantastic, until it isn't. You can cruise smoothly for an hour, and then a silly thing trips it up (like some nested CSS that doesn't render correctly). It's when a lot of dots connect that these cheaper models still struggle. The good news is that this is exactly where Cursor shines - literally switch a model mid-session, fix it, and move back to Composer 2.5.
22
2
178
11.9K
Haven Vu
Haven Vu@havenvu·
People look at this and think "what a failure," only to realize that until 10 years ago we had no choice but to plunge every single rocket into the ocean with 0 recoverable parts. This is what real progress looks like.
Elon Musk@elonmusk

0
0
1
56
Haven Vu
Haven Vu@havenvu·
I have:
2x Codex $200 subscriptions
1x Claude $200 subscription
1x Cursor $20 subscription
1x Devin $20 subscription
Anyone else in a similar boat?
0
0
2
47
Haven Vu
Haven Vu@havenvu·
1. Switching costs: I want to be able to take convos over from Codex CLI to my Codex app. Right now I’m so used to the CLI, and when I open the Codex app, I see nothing.
2. Software shape and work tree shape: as someone who runs multiple agents in parallel, I'd find it really helpful to understand where each agent is at and how safe its work is to merge/integrate into main, with clear documentation/indexing.
0
0
0
34
jason
jason@jxnlco·
If you're using codex desktop app today, what features do you feel like are still missing? Let me know and I’ll summarize all the feedback and share internally.
933
13
572
71.8K
Jaeger Media
Jaeger Media@jaegermedia1·
Christopher Nolan has mastered the art of making his films appear to be deep at first glance, but on the second watch revealing just how superficial and pretentious they really are. Has anyone actually been able to enjoy a movie of his on the second viewing?
666
34
699
448.5K
Haven Vu retweeted
arvo färt
arvo färt@arvofart·
It’s curious how often Hereditary gets talked about as the film that started the current trend in horror cinema when that trend had already been in full swing for years by that point. The real trendsetter was arguably The Babadook, but the trend started even earlier than that
68
122
3.3K
81.2K
Haven Vu
Haven Vu@havenvu·
@kunchenguid @NedNguyen Not true, they also provide CUDA. If OpenAI had never touched harnesses, there would never have been a ChatGPT moment.
0
0
0
41
Kun Chen
Kun Chen@kunchenguid·
@NedNguyen nvidia is at that size. nvidia doesn’t try to own everything - they partner with the ecosystem: “do as much as needed. as little as possible”
1
0
6
1.3K
Kun Chen
Kun Chen@kunchenguid·
i'm strongly against model companies focusing too much on harness, but i would love to hear if anyone has a strong argument for it

my reason against it:
if openai didn't build GPT 5.5, no one else can. this is their core competence
if openai didn't build codex cli and app, we have opencode and t3code. building harness is NOT their core competence

this is not saying products like claude code, codex aren't good - i genuinely think these are top tier products built by really talented people

my point is - the world might be a better place if model companies focus more on their core capability and give us better, faster, safer and cheaper models, rather than competing with the ecosystem in the application layer

what do you think?
Greg Brockman@gdb

the model alone is no longer the product

250
19
521
119.8K
Haven Vu
Haven Vu@havenvu·
@mark_k I basically never clear context anymore cause compaction is so good
0
0
0
12
Mark Kretschmann
Mark Kretschmann@mark_k·
Codex really needs a simple context meter. Just show the currently used context window percentage somewhere in the UI. When you’re deep into a long coding session, it would be extremely useful to know whether you’re at 30%, 70%, or about to hit the wall.
32
2
118
8.8K
Haven Vu
Haven Vu@havenvu·
@yulo_tech Unit economics won’t allow them to win the market unless they own their own foundational model.
1
0
1
121
Yulo
Yulo@yulo_tech·
PostHog will destroy Claude Code and Codex.
The moat they'll have from user behavior data and error logs will for the first time give AI tasks that are actually useful and not slop features or things that don't matter.
Can't wait to try it.
PostHog@posthog

Introducing PostHog Code, the product editor that:
- Understands your product
- Identifies usage patterns
- Triages bugs and errors for you
- Creates PRs to fix them
- Continuously monitors and improves your product
Join the waitlist: posthog.com/code

56
11
716
272.2K
Haven Vu
Haven Vu@havenvu·
Is there any benefit to using the Codex/Claude apps compared to their CLIs?
0
0
0
22