Hanif Carroll

1.1K posts

@HanifCarroll

🇺🇸🇦🇷 | AI Product Engineer 🤓 Obsessed with building digital products 🥏 Ultimate frisbee / 🏋🏿‍♂️ Barbell enthusiast

Buenos Aires, Argentina · Joined January 2024
63 Following · 93 Followers
Pinned Tweet
Hanif Carroll
Hanif Carroll@HanifCarroll·
I read Sully’s piece on why LLM pipelines get slow when agents try to do too much, and it connected with one of the most important lessons I’ve learned as a software engineer: task decomposition.
The useful version of that lesson is not just “break big software projects into smaller pieces.” It’s broader than that. If I’m avoiding a task because it feels overwhelming, the answer is usually not to wait until I feel more motivated. It’s to break the task down until I find a piece small enough to act on.
That same lesson applies to the AI systems we build. We’re all guilty of wanting AI to do the whole job in one pass. We stuff a long transcript, ten objectives, edge cases, formatting rules, and a complex schema into one prompt, then blame the model when the first pass is weak.
Sometimes the answer is not “wait six months for better models.” Sometimes the answer is: understand the task well enough to decompose it. Give each model a narrower job. Give it only the context it needs. Let specialized pieces run in parallel. Use QA as a guardrail, not as a crutch for a bad first pass.
That’s what I liked about the Sully piece. It reframes speed as a consequence of better task design, not just faster inference.
It also makes me more excited about open source and smaller models. If task complexity is the real bottleneck, then the future is not just bigger models doing everything. It’s better systems that make each model’s job smaller.
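A minimal sketch of the decomposition idea in the post above, under stated assumptions: `call_model` is a hypothetical stand-in for whatever LLM client you use, and the three subtasks and the `TODO` convention are invented for illustration. It shows each narrow job receiving only the context it needs, the jobs running in parallel, and QA acting as a light guardrail at the end.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str         # narrow job, e.g. "summary"
    instruction: str  # the single thing this call should do
    context: str      # only the slice of input this job needs


def call_model(instruction: str, context: str) -> str:
    """Hypothetical stand-in for a real LLM call; swap in your own client."""
    return f"[{instruction}] {context[:40]}"


def decompose(transcript: str) -> list[Subtask]:
    # Each subtask sees only the part of the transcript it needs.
    lines = transcript.splitlines()
    todo_lines = "\n".join(line for line in lines if "TODO" in line)
    speakers = "\n".join(line.split(":")[0] for line in lines if ":" in line)
    return [
        Subtask("summary", "Summarize in one sentence", transcript),
        Subtask("action_items", "List action items", todo_lines),
        Subtask("participants", "List speakers", speakers),
    ]


def qa_check(results: dict[str, str]) -> list[str]:
    # Guardrail only: flag empty outputs instead of redoing the work.
    return [name for name, text in results.items() if not text.strip()]


def run_pipeline(transcript: str) -> tuple[dict[str, str], list[str]]:
    subtasks = decompose(transcript)
    # The narrow jobs are independent, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        outputs = list(
            pool.map(lambda s: call_model(s.instruction, s.context), subtasks)
        )
    results = {s.name: out for s, out in zip(subtasks, outputs)}
    return results, qa_check(results)
```

Calling `run_pipeline(transcript)` returns the per-subtask outputs plus the names of any subtasks the QA step flagged.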
Sully.ai@sullyai

x.com/i/article/2044…

English
0
0
0
53
Hanif Carroll
Hanif Carroll@HanifCarroll·
@signulll Generalizes to most (all?) sports. I've been thinking about this recently. It's definitely true in ultimate frisbee. Sports are about creating, manipulating, and taking advantage of space.
English
0
0
0
55
signüll
signüll@signulll·
in basketball the best players love spacing cuz it gives room to move, & lanes to attack. tech rn is maximum spacing. the floor is effectively wide open. the world is reshuffling fast enough that an entrepreneur gets to define what the next version looks like. chaos is a ladder type stuff. these types of windows don’t stay open long & don’t occur as often as you’d like them to.
English
28
10
430
23.1K
Hanif Carroll
Hanif Carroll@HanifCarroll·
@tmophoto @sudoingX Same thing happens to me with Hermes and OpenClaw. That's why I've given up on coding with them and only use Codex app now.
English
1
0
1
85
tmo
tmo@tmophoto·
do you know why Hermes agent won't do this? It almost feels like it's hard coded to save tokens like they aren't being generated at home for free. No matter what model I use it always requires an immense amount of babysitting... sometimes I can just answer yes 40 times in a row while it works through the same kind of list.
English
2
0
2
750
Sudo su
Sudo su@sudoingX·
few days into codex plus and i think i found the hack. nobody is talking about it and the value sitting in this subscription is wild. the hack: do not prompt the agent. write a single detailed task doc with every requirement laid out plus the final vision of what you are building, then fire codex cli with one line, accomplish this and test until done. it goes. hours of uninterrupted agentic coding on gpt 5.5 xhigh, no throttling, no rate cap, no 'can you clarify' loop. the agent has everything it needs in one place so it works the problem instead of working you. i have been grinding it since this morning, screenshot below shows the session past 24 mins and still running. anthropic burns through your daily allowance in three opus 4.7 prompts then your entire tier id is gone for the day. codex plus on the same money goes on and on while you go take a walk. this is the most underrated subscription in the agentic stack right now. the value is there if you front-load the prompt instead of conversation-mode it. give codex the brief, walk away, come back to a finished task. try this. loot the value while the math still favors you.
English
65
43
1K
97.2K
Hanif Carroll retweeted
Alexander Embiricos
Alexander Embiricos@embirico·
codex can work in the future: "tomorrow, check in on this discussion and ping me if it isn't resolved" "let me know if this bug isn't fixed by the day before launch" "bug me if this flaky test doesn't go green after retry" i do this all the time. powerful but not obvious—yet
English
35
27
559
37.9K
Hanif Carroll
Hanif Carroll@HanifCarroll·
Easy way to 10x the UX of your app: Tell Codex to use Browser Use to go through the important flows, then ask it to identify usability problems.
English
0
0
0
33
Hanif Carroll
Hanif Carroll@HanifCarroll·
We launched a Spanish learning app on iOS and Android in 4 weeks. The hardest part wasn't the app. It was figuring out who should pay for it first.
The obvious answer was Spanish learners. But learners are hard to monetize on day one. They want to try before they commit, and you can't blame them.
The clearer first buyer: teachers. A tutor who can generate level-appropriate readings, share them with a class, and track what students are working on is someone with a real budget and a real problem. The students come with them.
That one insight reshaped the entire MVP. Here's what shipped:
→ iOS + Android reader for Spanish learners
→ Web workspace for teachers: class management, student groups, shared readings
→ RevenueCat for mobile subscriptions, Stripe for teacher plans
→ App Store + Google Play listings, store assets, legal pages, public site, handoff docs
I've worked on a lot of MVPs. This one reminded me that "who is this for" and "who is paying for this" are two different questions, and you need to answer the second one before you can build the right first version.
Full case study in the comments.
English
3
0
3
41
Hanif Carroll
Hanif Carroll@HanifCarroll·
@brettmiller128 @petergyang Out of curiosity, do you have openclaw try to update itself? I was doing that, and it broke every time as well. Now I have both Hermes Agent and OpenClaw and I just ask them to update each other. Seems to be working so far.
English
0
0
0
26
Brett Miller
Brett Miller@brettmiller128·
@petergyang I am super frustrated with openclaw. It breaks every time I update. Hermes is much more stable.
English
2
0
1
559
Peter Yang
Peter Yang@petergyang·
I caved and downloaded Hermes to try. For those of you who have tried both Hermes and OpenClaw what difference do you notice? No shilling please, just want some honest opinions
English
375
28
1.1K
296.9K
Hanif Carroll
Hanif Carroll@HanifCarroll·
Currently experimenting with custom CLI tools + codex exec + launchd for scheduled tasks. There's a powerful combination in there, just gotta figure out its shape and scope.
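One possible shape for that combination, sketched under assumptions: the snippet builds a launchd job definition that runs `codex exec` with a fixed prompt once a day. The label, binary path, prompt, and log locations are all hypothetical placeholders. `Label`, `ProgramArguments`, `StartCalendarInterval`, `StandardOutPath`, and `StandardErrorPath` are standard launchd plist keys.

```python
import plistlib


def build_launchd_plist(
    label: str, codex_path: str, prompt: str, hour: int, minute: int
) -> bytes:
    """Return an XML plist describing a daily scheduled `codex exec` run."""
    job = {
        "Label": label,
        # launchd runs this argv directly, with no shell involved.
        "ProgramArguments": [codex_path, "exec", prompt],
        # Fire once a day at the given local time.
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
        # Hypothetical log locations, handy for debugging unattended runs.
        "StandardOutPath": f"/tmp/{label}.out.log",
        "StandardErrorPath": f"/tmp/{label}.err.log",
    }
    return plistlib.dumps(job)


# All values below are placeholders, not a recommended setup.
plist_bytes = build_launchd_plist(
    label="com.example.codex-daily",
    codex_path="/usr/local/bin/codex",
    prompt="Run the daily report CLI and summarize any failures",
    hour=9,
    minute=0,
)
```

One way to use the result would be to write it to a `.plist` file under `~/Library/LaunchAgents/` and load it with `launchctl`. The exact loading steps depend on the macOS version.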
English
0
0
0
14
Hanif Carroll
Hanif Carroll@HanifCarroll·
@kr0der Used to be xhigh all the time, then high, now I mostly use low with fast mode, occasionally medium.
English
0
0
1
345
Anthony Kroeger
Anthony Kroeger@kr0der·
it's been a week of GPT 5.5, what reasoning level are you using, and do you have fast mode on? last time, the most popular response was medium -> xhigh -> high -> low i'm personally using mostly high
English
85
1
140
33.7K
Hanif Carroll
Hanif Carroll@HanifCarroll·
Reminds me of two phrases that I often think about. "Everything counts" - Brian Tracy. Every action, thought, and decision either adds to or subtracts from your success. Nothing is neutral; small, daily habits accumulate over time to determine your ultimate results, wealth, and character. "Don't practice what you don't want to become." - Jordan Peterson. Idea that follows from "everything counts".
English
0
0
0
56
Kpaxs
Kpaxs@Kpaxs·
Every moment of attention is a double-entry in your life’s accounting system. The ledger is always balanced. You never “just” do something with your attention. You’re always making a trade. The horror is that it’s ruthlessly fair. It doesn’t care about your intentions. It doesn’t grade on a curve. If you spend three hours practicing outrage detection, you get three hours better at outrage detection, and three hours worse at everything you didn’t practice. You can’t cheat it. You can’t game it. You can’t “just this once” your way out of it.
English
4
24
209
5.8K
Stewart Alsop - Host of Crazy Wisdom Radio Show
I just love that I get to live in a culture where foreigners are so accepted and given so much leeway (much like the US from the 1800s to the 1990s) that I get to play the ignorant gringo jester and so much performance art value arises in my daily interactions. Learned the word “asar” which I think means chance in one such interaction just now
English
3
1
2
207
Hanif Carroll
Hanif Carroll@HanifCarroll·
Most of my problems with OpenClaw and Hermes Agent seemed to come from me asking them to update themselves. So, looks like I'll have them update each other instead.
English
0
0
0
16
Hanif Carroll
Hanif Carroll@HanifCarroll·
Codex (app) thinking level select is still broken? Looks like they fixed it after you start a conversation, but when you're on that first screen I still can't change the thinking level. Anyone else still seeing this problem?
English
1
0
1
10
Hanif Carroll
Hanif Carroll@HanifCarroll·
@bidah It's unfortunate, but I've given up on Next as well as Vercel. Vercel is nice, but I recently learned about everything that Cloudflare offers and their generous limits, so I'll be trying them out for future projects.
English
1
0
1
33
ROFI
ROFI@bidah·
@HanifCarroll Next.js is chained to Vercel. You can't switch infra provider. I do think it's great tech but Vercel OSS story is broken.
English
1
0
1
81
Hanif Carroll
Hanif Carroll@HanifCarroll·
Been apartment hunting in Buenos Aires and ran into a frustrating problem: when you filter for "washer" on rental sites, you get a mix of units with an in-unit washer and units with a shared laundry room. No way to tell them apart without clicking through every single listing and scrolling through all the photos yourself.
So I built something to fix it. You give it a URL, it analyzes all the listing photos, and tells you whether the washer is actually in the unit. First version is working. Next step is getting it to run across an entire search results page so I don't have to feed it URLs one by one.
The architecture ended up being pretty interesting. I used a smaller, cheaper model (gpt-5.4-mini) to do the initial pass on all the photos, and anything it's not confident about gets escalated to a stronger model (gpt-5.4).
To figure out which models were actually reliable, I had to build a testing harness. I collected a dataset of listings I already knew the answer to, so I could run each model against it and measure accuracy. Turns out the cheapest model (gpt-5.4-nano) wasn't cutting it.
This kind of testing is called evals, and it's one of the most important parts of building anything serious with AI. Without it, you're just guessing.
Am I the only one who's had this apartment search problem?
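A rough sketch of the two ideas in this post, escalation on low confidence and an eval harness over labeled examples. It is not the author's implementation: the classifiers are deterministic stubs standing in for real model calls, the 0.8 threshold is arbitrary, and the file names and labels are made up so the example runs offline.

```python
from typing import Callable

# A classifier takes a photo reference and returns (label, confidence).
Classifier = Callable[[str], tuple[str, float]]


def cascade(
    photo: str, cheap: Classifier, strong: Classifier, threshold: float = 0.8
) -> tuple[str, bool]:
    """Return (label, escalated). Escalate only when the cheap pass is unsure."""
    label, confidence = cheap(photo)
    if confidence >= threshold:
        return label, False
    label, _ = strong(photo)
    return label, True


def evaluate(predict: Callable[[str], str], dataset: list[tuple[str, str]]) -> float:
    """Accuracy of `predict` over (photo, expected_label) pairs."""
    correct = sum(1 for photo, expected in dataset if predict(photo) == expected)
    return correct / len(dataset)


def cheap_stub(photo: str) -> tuple[str, float]:
    # Stub for a small model: unsure (and wrong) on blurry photos.
    if "blurry" in photo:
        return "not_in_unit", 0.4
    return ("in_unit" if "washer" in photo else "not_in_unit"), 0.9


def strong_stub(photo: str) -> tuple[str, float]:
    # Stub for a stronger model: handles blurry photos too.
    return ("in_unit" if "washer" in photo else "not_in_unit"), 0.99


# Hypothetical labeled examples where the right answer is already known.
DATASET = [
    ("kitchen_washer.jpg", "in_unit"),
    ("blurry_washer.jpg", "in_unit"),
    ("laundry_room.jpg", "not_in_unit"),
    ("bedroom.jpg", "not_in_unit"),
]

cheap_only_accuracy = evaluate(lambda p: cheap_stub(p)[0], DATASET)
cascade_accuracy = evaluate(
    lambda p: cascade(p, cheap_stub, strong_stub)[0], DATASET
)
```

Comparing `cheap_only_accuracy` with `cascade_accuracy` on the same dataset is the kind of measurement that shows whether escalation is worth its cost.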
English
1
0
0
21
Hanif Carroll
Hanif Carroll@HanifCarroll·
This is why we put up guardrails for the system: lints, tests, scripts, and browser use, so that the agent can see the work it just completed. It is stochastic, but much less so with the proper systems in place. I do agree that you end up feeling drained, though I think this has more to do with context switching than with AI itself. As you wait for one task to finish, you switch to another. That adds up.
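As a small illustration of the guardrail idea, here is a sketch of a check runner an agent could call after each change. The two commands are placeholders that merely simulate a passing lint and a failing test run. In a real project they would be your linter and test suite.

```python
import subprocess
import sys


def run_checks(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Run each named command and report whether it exited with status 0."""
    results = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results


# Placeholder commands: replace with your real lint and test invocations.
CHECKS = {
    "lint": [sys.executable, "-c", "print('lint ok')"],
    "tests": [sys.executable, "-c", "raise SystemExit(1)"],
}
```

Feeding the resulting pass/fail map back to the agent gives it a concrete signal about the work it just completed.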
English
0
0
1
154
Tero Parviainen
Tero Parviainen@teropa·
in which @jeremyphoward nails the phenomenology of agentic coding
Machine Learning Street Talk@MLStreetTalk

A masterclass from @jeremyphoward on why AI coding tools can be a trap -- and what 45 years of programming taught him that most vibe coders will never learn. - AI coding tools exploit gambling psychology - The difference between typing code and software engineering - Enterprise coding AND prompt-only vibe coding are "inhumane" i.e. disconnecting humans from understanding-building - AI tools remove the "desirable difficulty" you need to build deep mental models. Out on MLST now!

English
6
18
218
41K
Hanif Carroll
Hanif Carroll@HanifCarroll·
@zeke Thanks for this! I'm a huge fan of this style.
English
0
0
0
331
Zeke Sikelianos
I made a Swiss International Style design system as an agent skill. npx skills add zeke/swiss-design-skill swiss.ziki.boo
English
15
59
840
69K
Shinji Pons
Shinji Pons@shinjipons·
@zeke But is it good? Was it made without training on copyrighted material?
English
2
0
3
1K
Hanif Carroll
Hanif Carroll@HanifCarroll·
The most underrated thing AI did: unblock high-agency non-technical people who always had ideas but no way to build them. If you haven't started experimenting with AI yet, what are you waiting for?
English
0
0
0
9