Joe Hsu

9.6K posts

@jhsu

UI 🤝 AI 🦋@getwhys_ (current) let's chat: https://t.co/UTol0SLz5V prev: Waybridge, AppNexus, EngineYard

new york, ny · Joined December 2006

2.6K Following · 801 Followers

Joe Hsu@jhsu·1d

built my own daily note taking app, without having to think beyond putting it on a page. chunked and tagged overnight. easily search or aggregate by tags. blog: atjhsu.dev/blog/dodaily-a… #builtformyself

Joe Hsu@jhsu·1d

Created a zo-proxy for using Zo agents inside of opencode. Built using a simple proxy to the Zo api and opencode's openai-compatible provider settings. Zo inside of @opencode! (or any other ai tools) not super fast, but it works! @zocomputer #BuildWithZo

Zo Computer@zocomputer

We wanna see how you've been using Zo 👀 We're giving away MERCH and $500 in credits for 3 winners! All you need to do to submit:
1. Quote this post with a pic/vid of something you've been building with Zo. Can be anything, from a fun site to a cool automation.
2. Tag us in the post @zocomputer and #BuildWithZo
3. Winners will be tagged so make sure to follow us!
Happy Zo-ing :D

Joe Hsu@jhsu·1d

@YoavCodes what is this, native apps for ants?

Yoav@YoavCodes·1d

Electrobun lets you ship 16MB app bundles. But what if I told you you will soon be able to ship a native system tray app, written entirely in Typescript that is less than 5KB unpacked, 2KB zipped.
- 3,943 bytes of bun Typescript
- 625 bytes metadata
electrobunny.ai

Joe Hsu@jhsu·2d

depending on how long the agent task is (implementation), the review phase can grow pretty large. this is probably because of a lack of trust, or because it's unclear what was used to validate. I agree that supercharging the "plan" phase would help shorten both review and implement, though it might just be shifting the time/effort from implement and review (maybe not a bad thing). some of the visuals also make me think maybe there's a more collaborative/async way to work

Amelia Wattenberger 🪷@Wattenberger·2d

here's my rough logic around why devs need a new tool focused on planning. what do you think? going to write it up as a blog post soon, would love any generative reactions 👀

Joe Hsu@jhsu·2d

Introducing MINIZO

ben guo 🪽@0thernet

hilarious how many people are copying @zocomputer
> June 20, 2025 – Zo beta
> Nov 19, 2025 – Zo launch
> 6 days later – first OpenClaw commit
> Feb 2026 – Every YC company pivots to cloning Zo

Joe Hsu@jhsu·4d

@spencerc99 at a coffeeshop to get the wifi hungry ppl

spencer chang@spencerc99·4d

made a device to write messages & poems using Wi-Fi networks

Joe Hsu@jhsu·4d

@aisdk @neural_avb ai-rlm github: github.com/jhsu/ai-rlm npm: npmjs.com/package/ai-rlm

Joe Hsu@jhsu·4d

still kind of WIP, but 2.0 of `ai-rlm`, RLM for @aisdk, uses quickjs for repl, or plug in your own sandbox provider. pretty customizable with plenty of hooks. i've been trying to build some sort of inspect/monitor app like what @neural_avb built x.com/neural_avb/sta…

AVB@neural_avb

Just open sourced my RLM repo on github! 💙 A minimalist sandbox with a python REPL, executes LLM generated code, maintains context, supports early stopping. Also an OpenTUI app to view logs in the terminal. Star it, fork it, go crazy with it. github.com/avbiswas/fast-…

Joe Hsu@jhsu·4d

@aisdk @neural_avb or something like collab-ai by @yiliush would be super cool x.com/yiliush/status…

Yiliu@yiliush

seeing if/how graphs come back in

Joe Hsu@jhsu·4d

Gossiper - slack bot that sends users commentary on other users that is visible only to them

Joe Hsu@jhsu·4d

mini OS that lets you chat to build apps within the OS that other users can use. here's a calendar to keep track of tasks. each app has per-user state and also a shared state you can toggle between.

Joe Hsu@jhsu·4d

been building some random apps, here's one where I crawl some inspo sites and generate a sort of newsletter collection of some designs along with a summary and short descriptions.

Joe Hsu@jhsu·12 Mar

@Replit it's also really fast. branching on tasks, merging back to main, working on the canvas

Joe Hsu@jhsu·12 Mar

ok @Replit agents v4 is really good

Joe Hsu@jhsu

is that @tldraw + @Replit in agent 4?

Joe Hsu@jhsu·11 Mar

is that @tldraw + @Replit in agent 4?

Joe Hsu@jhsu·7 Mar

@neural_avb Really cool, I need to setup an eval like this for a rag application.

AVB@neural_avb·7 Mar

I am SOOOO glad I ran this experiment! I have so many actionable insights it is crazy. Highly recommend yall to set up similar evals for your projects/SaaS.

Context: I have been evaluating different models on the current Paper Breakdown retrieval subagents. Goal is to find cheaper models that get the job done quicker.

Dataset: huggingface.co/datasets/paper…

I have been comparing smaller model outputs against Sonnet-4.6 (results shown below) and gpt-5-mini (current subagent model running in prod).

Some insights:
- gemini-3-flash thinks a lot, it returns too many chunks, and explores the paper way too much.
- gemini-3-flash-lite is actually better than 3-flash at this, it even caches additional queries for fast "future retrieval". Very cool!
- grok-fast-non-reasoning outperforms grok-fast-reasoning. And is the CLOSEST to sonnet-4.6 <- this was my biggest surprise.
- gpt-5-mini is very fast, it thinks less, fetches quickly. I have empirically felt it's pretty good and reliable
- gpt-5-nano pretty bad at this
- minimax-m2.5 has high precision (it returns more info than needed) but the problem is the vercel ai gateway provider has been slow :(
- for some reason glm-5 and glm-4.7 have a high failure rate on my task, I am yet to understand why.

Next steps:
- My goal now is to pick some of the best models here, and run either a larger expt with more test cases, or use an LLM-as-a-judge.
- In the near future, I may go into harness optimizations (i.e. better prompts, better tool descriptions)

I am seeing a ton of free users using the website lately, if I am able to switch to grok-fast-non-reasoning and minimax-m2.5 it will save me actual money.

AVB@neural_avb

I really like the Prime RL school of thinking - "environments & evals are two sides of the same coin" So today I'll convert Paper Breakdown into an RL env. I'll run evals with smaller models to check if I can cut my inference bill without sacrificing rewards.

Joe Hsu@jhsu·7 Mar

little OS to build apps inside