
Boaz Hwang
331 posts

Boaz Hwang
@BoazWith
Shipped 4 apps to App Store in a month. Self-taught, no CS degree. Built AI App Factory — native mobile apps with AI agents. Building in public.
Seoul, South Korea · Joined March 2022
197 Following · 46 Followers

@at56_ @hankisinvesting I look for replies that add a constraint, not a reaction.
Someone saying 'this failed in my case because...' is usually a real reader. A like or a generic agree tells you almost nothing.


@_andypeacock This is the underrated use case.
Not writing code from scratch, but staying patient through boring environment failures. That is where I trust agents more than myself.

@saen_dev 20 users with conversations is more useful than 200 silent signups.
At that stage I would optimize for reply speed, not onboarding polish.

20 users and zero revenue is actually a great position if those 20 are giving you feedback. The worst place to be is 200 users and zero conversations because then you have traffic but no signal on what to build next.
alimkhan@alimmka_
Day 8 of building in public. The goal is to get first paying users in May. Current progress: Users: 20(+0) Revenue: $0 MRR: $0 Getting new users and customers gets harder when you’re facing deadlines in life. Still waiting for update approval on CWS.

@0xDragoonLab Making setup executable is the right direction.
The hard part is preserving intent: does /forge merge with existing CLAUDE.md decisions, or regenerate the whole harness each run?

#83 i built useforgekit. every Claude Code setup guide is something you read. this is something Claude executes. run /forge and it generates CLAUDE.md, settings.json, skills, hooks, agents, and commands from your actual codebase. 48 reference modules. zero deps.
github.com/Dragoon0x/usef…

@accidentalcto Yes, if it changes the generated UI instead of becoming a docs graveyard.
I would make it opinionated: spacing rules, component taste, and what the generator is not allowed to invent.

@commandlinex86 @odd_joel Yep, that is exactly the kind of tool that earns a slot.
One less notifier means one less thing to debug when the agent loop is already the messy part.

Didn't know I needed this until I had it: @odd_joel's Moshi. 📱
SSH TMUX terminal built for mobile, mosh protocol so the session doesn't die when I switch WiFi to LTE, push notifs when Claude Code is done cooking. ⚡
Running it next to Termius for now to see if it earns the daily-driver spot. early read: it might.
Big shoutout to Joel for reaching out. nerds helping nerds, #ThisIsTheWay 🤝


@sidsinghal_ That is the honest stage.
I would keep the replica until behavior forces divergence. Otherwise you end up maintaining two guesses instead of one product.

@BoazWith Yet to discover; I do not have enough users to get those signals. Right now it's a replica, but down the line both might diverge.

@raiderfreed That is exactly where mobile gets expensive.
Flow changes look like product design, but they leak into navigation state and QA fast. Which screen order changed the most?

@BoazWith Yes. Most of my time was spent on changing the flow and screen order.🥲

Today I finished building one more cross-platform app with React Native at the company. Handed it over to the client for testing.
It is the most intense and heavy app that I have worked on. One major learning from this whole process is that building for web and building for mobile require very different mindsets. It also showed how important following the system is in these projects.
Now it's time to take a 3-day break 😮💨

@wonderwhy_er The useful part is not the exact number, it is the repeatable baseline.
If limits can move quietly, builders need their own usage tests the same way they need perf tests.

Yesterday I was a guest on Budapest Claude Code Meetup's fireside chat.
Host wanted to talk about AI value-per-dollar — Codex vs Claude, Plus vs Pro vs Max, what actually buys you more tokens.
An hour before going live, he texted: "hey, did the limits change this week? People online are complaining."
Same day, my cofounder Dmitry Sergeev pinged me: "Codex on Business feels worse this week."
Well, that is exactly what I am building desktopcommander.app/best-value-ai/ for: a way to track data for answering such questions over time.
I had run baseline numbers on April 24. Five days later, on the 29th, I scrambled to re-run them after the Budapest fireside chat host's question.
And the results are a bit shocking.
Every plan I could compare across both days dropped between 35% and 61% in tokens-per-week:
▸ ChatGPT Plus / GPT-5.5: 95M → 37M weekly (−61%)
▸ Claude Max 20× / Sonnet 4.6: 388M → 214M (−45%)
▸ Claude Max 20× / Opus 4.7: 248M → 162M (−35%)
▸ Claude Pro / Sonnet 4.6: 19.6M → 11.4M (−42%)
▸ Claude Pro / Opus 4.7: 15.6M → 10.2M (−35%)
5 of 5 retested plans went down. None went up.
Re-ran the headline ChatGPT Plus measurement today. It came back at 32M weekly — confirming the drop, not bouncing back.
Five days. Same prices. ~Half the tokens.
Take it with a grain of salt though. I am still tweaking and improving the measurement methodology. These numbers are estimates. But tokens did go down.
Maybe you want to contribute to these tracking efforts?
Subscriptions are unstable in ways the marketing pages won't tell you. The math is the only way to see it.
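The drop percentages above fall straight out of the two baselines. A minimal sketch of the arithmetic, with plan names and token counts copied from the list above and rounding to the nearest whole percent assumed:

```python
# Tokens-per-week baselines from the two measurement days (April 24 vs April 29).
baselines = {
    "ChatGPT Plus / GPT-5.5": (95_000_000, 37_000_000),
    "Claude Max 20x / Sonnet 4.6": (388_000_000, 214_000_000),
    "Claude Max 20x / Opus 4.7": (248_000_000, 162_000_000),
    "Claude Pro / Sonnet 4.6": (19_600_000, 11_400_000),
    "Claude Pro / Opus 4.7": (15_600_000, 10_200_000),
}

def drop_pct(before: int, after: int) -> int:
    """Percentage drop between the two measurements, rounded to a whole percent."""
    return round((before - after) / before * 100)

for plan, (before, after) in baselines.items():
    print(f"{plan}: -{drop_pct(before, after)}%")  # e.g. ChatGPT Plus: -61%
```

Running this reproduces the −61% / −45% / −35% / −42% / −35% figures in the post.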


@BeauJohnson89 The e2b-compatible swap is the interesting part.
Do you know if it snapshots cleanly between runs? Isolation matters, but reset speed is what keeps agent output reviewable.

this repo is solving the scariest part of coding agents
TencentCloud/CubeSandbox
> 4,814 stars on github
> rust + kvm sandbox built for ai agents
> creates hardware isolated sandboxes in under 60ms
> under 5mb memory overhead per instance
> e2b sdk compatible, so you can swap one url and keep your app logic
why this matters:
coding agents are getting good enough to run real code nonstop
but docker shared-kernel isolation was never built for untrusted llm-generated chaos
CubeSandbox is basically saying:
keep the speed of containers
get closer to vm-level isolation
make it cheap enough to run thousands of agents on one box
this is the kind of boring infra that quietly decides which agent startups can actually scale


@j_schwartzz That call-stack inversion is the useful part.
My bias: the harness matters more as the model gets better. Tool contracts, logs, and failure boundaries decide whether the agent is a product or just a demo.

From my perspective, we're approaching an inflection point in AI-enabled software. We're going from "agents that build software" to "software that contains an agent". It's a complete inversion of the AI software stack.
It's indicative of a shift from workflow-driven orchestration (the "ChatGPT wrapper era") to agent-driven orchestration (the "harness era"). Projects like @cursor_ai and @openclaw were very early to start building agent-driven orchestration software in this pattern.
In the old world, most apps wrote deterministic workflows and sprinkled in LLM calls. For example, if I'm building a workout agent, I might code something like this:
```
if phase == "plan":
    workout = llm.call("Create a workout plan")
elif phase == "execute":
    results = tracker.run(workout)
    analysis = llm.call(f"Analyze results: {results}")
```
Notice how the engineer who built this determines the steps, order, and control flow. The LLM is just a function that gets called at the top of the call stack.
The new approach, like @garrytan preaches - define deterministic tools + constraints (and associated skills), and let the model drive everything else. The same workout agent code might be rewritten as:
```
llm.register_tool(build_workout)
llm.register_tool(evaluate_workout)
llm.register_tool(save_workout)
agent.run("""
Create and validate a workout plan.
Use build_workout to generate candidates.
Use evaluate_workout to test effectiveness.
Only save_workout if it meets the criteria.
""")
```
The engineer defines only the tools and their boundaries, but lets the agent itself do most of the conditional logic and control flow. At first it feels janky, but after enough trial and error, you can get a real-feeling agent.
The primary benefits here are (a) this style of software lends itself much better to self-sustaining agentic companies. The agent sits in the software stack and can actually create tools and skills on its own to fill the gaps it needs, and (b) apps capture model upside directly. When GPT / Opus improves, your entire system gets smarter, rather than just a few isolated calls.
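A minimal, runnable sketch of this tool-registration pattern, with the model's decision loop stubbed out by a hard-coded policy. Every name here is illustrative, and in a real harness the LLM would pick the next tool call instead of the `if` chain below:

```python
# Illustrative sketch: the engineer registers tools; a stubbed "model"
# drives control flow (build -> evaluate -> save only if criteria pass).
TOOLS = {}

def tool(fn):
    """Register a function as a tool the agent may call."""
    TOOLS[fn.__name__] = fn
    return fn

CANDIDATES = iter([3, 4, 8])  # stand-in for model-generated workout candidates

@tool
def build_workout() -> dict:
    return {"exercises": next(CANDIDATES), "valid": None}

@tool
def evaluate_workout(plan: dict) -> dict:
    plan["valid"] = plan["exercises"] >= 5  # stand-in for a real evaluation
    return plan

@tool
def save_workout(plan: dict) -> str:
    return f"saved plan with {plan['exercises']} exercises"

def agent_run(goal: str, max_steps: int = 12) -> str:
    """Stubbed agent loop: retries rejected candidates until one passes."""
    plan = None
    for _ in range(max_steps):
        if plan is None:
            plan = TOOLS["build_workout"]()
        elif plan["valid"] is None:
            plan = TOOLS["evaluate_workout"](plan)
        elif plan["valid"]:
            return TOOLS["save_workout"](plan)
        else:
            plan = None  # rejected candidate: go build another one
    return "no valid plan found"

result = agent_run("Create and validate a workout plan")
print(result)  # -> saved plan with 8 exercises
```

The payoff described above shows up here: swapping the stubbed policy for a smarter model upgrades the whole loop without touching the tool definitions.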

@sidsinghal_ That 50/50 signal changes the decision.
AI makes the second build cheap enough. The harder question is whether Android users behave differently, or just expand the same demand.

@BoazWith That was exactly my thought process too, so we only started with iOS.
During initial marketing drives, we realised that the user split is 50/50. Reaching out to them again would be more tedious.
That being said, replicating it using AI is really simple, so I would prefer doing both.

@schlimmson That panic is part of real shipping. The scary bugs are the ones where nothing crashes and users just quietly lose trust.

@LocaTrack Freshness is underrated outside SEO too. App Store screenshots, release notes, even the support URL change how alive a product feels before anyone installs it.

@indiesoftwaredv For first SaaS, I would not guess the creative. Run 3 tiny tests on the same offer: pain-first, demo-first, outcome-first. Judge by signup or install, not CTR. What is your conversion event right now?

@BoazWith That is the hardest question because I've never had SaaS before. What do you suggest ?

10 months left in the "12 months marketing challenge"
💵Made: $1756
🔴Spent: $450 in 2 months (Meta/TikTok Ads + Fal AI)
Earned:
✅ Meta Ads: +$1200
✅ Organic AI video/slideshow: +$200
⏳ Slideshow automation (doesn't feel like it works)
Building is not a problem anymore
Everyone is looking for marketing tools
If you have +100k followers on social media:
- Go with trend mobile apps
If you have only an X account +1k followers:
- Go with marketing SaaS
💰 Both make you financially free
📊 But SaaS will keep you in the business
👉🏻 Which is your choice: Trend App? SaaS?


@vankh_go I trust AI for breadth, not taste.
It can show me 5 paths fast. I still need to decide which one I am willing to maintain.

@accidentalcto That is the debugging tax no one puts in the build log.
The stack can look complicated and the answer is still: Postgres is not running.

Solo founder life at its finest.
Just spun up a local instance of floow.design.
Spent 20 minutes wondering why Postgres wasn't connecting.
It wasn't running. 😅
#buildinpublic #developerlife

