Jonathan Low

353 posts

@jonathanlowhy

Building Mistle - Open source background agents

Joined April 2010

481 Following, 161 Followers

Jonathan Low@jonathanlowhy·3h

@scottmcpherson @steipete Codex computer use is just an MCP. If you have Codex Desktop installed and computer use is supported, Codex CLI can call it too.

Scott McPherson@scottmcpherson·4h

@steipete Have you tried cmux with codex computer use? I haven't had any luck.

Peter Steinberger 🦞@steipete·6h

I'm late to the party, but cmux is great. github.com/manaflow-ai/cm…
current split:
codex mac app: knowledge work, learning, reading
cmux + codex cli: coding

Jonathan Low@jonathanlowhy·1d

100%. I often run multiple related PRs back to back with a single chat session. No need to manage the context window.

Siqi Chen@blader

honestly claude code / opus 4.7 feels barely usable compared to codex / 5.5 at this point for any kind of long running / complex engineering work

Jonathan Low@jonathanlowhy·1d

@lubinho_k I get peace of mind with GPT 5.5. What do you think can be done well with Composer 2.5 where the difference doesn’t matter?

Luckforest@lubinho_k·1d

I am on the $200 Claude, $100 Codex, $20 Cursor Plan. After using Composer 2.5 for 8 hours straight while only using 8% of my $20 plan, I should reconsider my entire subscription stack. Maybe $100 Codex for complex stuff, and $60 Cursor for UI & Copy?

Jonathan Low@jonathanlowhy·1d

@kunchenguid GPT has its own quirks and is heavily RL-ed in a specific manner, so it makes sense for there to be a Codex harness. E.g. apply_patch is unique to GPT models because of its RL.

Kun Chen@kunchenguid·1d

i'm strongly against model companies focusing too much on harness, but i would love to hear if anyone has a strong argument for it
my reason against it:
if openai didn't build GPT 5.5, no one else can. this is their core competence
if openai didn't build codex cli and app, we have opencode and t3code. building harness is NOT their core competence
this is not saying products like claude code, codex aren't good - i genuinely think these are top tier products built by really talented people
my point is - the world might be a better place if model companies focus more on their core capability and give us better, faster, safer and cheaper models, rather than competing with the ecosystem in the application layer
what do you think?

Greg Brockman@gdb

the model alone is no longer the product

Jonathan Low@jonathanlowhy·1d

@mattlam_ Curious. What do you use that Codex Cloud cannot provide? The remote stuff?

Matthew Lam@mattlam_·1d

@jonathanlowhy Codex is great too, but codex cloud is behind, and gpt5.5 is behind on fe/design still imo

Matthew Lam@mattlam_·2d

agreed. People are sleeping on Cursor and Composer 2.5 is just the start. If you actually try Cursor 3 you'll see:
- ux is superb, as good as any coding app
- harness tied with codex imo at the top
- allows any models, I mostly use Composer 2.5 and Opus
- cursor cli, I still need to try, but if it makes the cloud agent handoff seamless like Cursor 3 i'll for sure use
There's definitely improvements I'm looking for but they're way ahead of the cloud agents game, and their models are only just starting.

am.will@LLMJunky

People hate on Cursor, or even go as far as to laugh at it.
"$60B for an IDE LOL"
"It's just a VSCode fork"
Yeah, and Tesla is just a car. Youtube is just a video player. and the iPhone is just a phone
It's honestly hysterical how wrong they are.
GPT 5.5 High Fast and Cursor 2.5 Fast feel unbelievably good in this harness. I have been building non-stop for the last week. Between Cursor and my Codex sub, I want for nothing.

Jonathan Low@jonathanlowhy·1d

@apeatling @OpenAIDevs I think I managed to get it to work for me by having multiple sessions at the same time. The fatigue from context switching is real though. I can only maintain max-multi-sessioning for 5 hours or so.

Andy Peatling@apeatling·1d

I find myself drifting back to Claude Code from Codex. Primarily for speed, it just feels faster to me. Even if I have Codex on fast mode. @OpenAIDevs

Jonathan Low@jonathanlowhy·1d

@hunvreus Code correctness is a verifiable loop that can be RL-ed on. Hard to do that with creative work where it is taste that matters.

Ronan Berder@hunvreus·1d

2 reasons why AI may struggle to generate good design and copy, but not code:
- The generated artifact is what people consume. With code, they consume the executed version. If it works, they don't care what it looks like.
- Constraints are good for creative tasks. Programming has a LOT of constraints. In design, unless you're working within a design system, you have almost none. Same goes for writing.

Jonathan Low@jonathanlowhy·1d

3k+ lines of frontend code in a file. Is it a problem if Codex doesn't have an issue with it?

Jonathan Low@jonathanlowhy·1d

@josevalim With Codex, once implementation is done, I will get Codex to focus on cleaning up the abstraction smells. It helps to have a doc in the repo to explain what the smells are and then point Codex to it.

José Valim@josevalim·2d

Here is a simple but good example of how Codex tends to handle tasks better than Claude Code.
A user reported that some actions in tidewave.ai had keyboard shortcuts but were not displaying them on mouseover. I asked Codex to find and fix the missing cases.
Codex found all of them and additionally introduced a small helper called shortcutLabel that maps a ShortcutAction to its label (see code screenshot). The benefits are two:
* It uses the ShortcutAction type to ensure we don't accidentally forget the label of any shortcut
* The helper was added to shortcut.ts, colocating labels with the shortcuts themselves, making future additions more foolproof
Codex also updated the other places where we listed shortcuts to use the new helper.
I didn't ask Codex to create the helper but it was the right call. Maybe we would have suggested it during code review, maybe we would not. But I'd say Codex left the codebase in a better state than it found it.
I gave Claude Code the exact same prompt three different times. In every case, it just inlined the shortcuts in the templates, duplicating information across multiple files. I rarely feel Claude Code improves the codebase unless I explicitly tell it to do so.
Now the flipside is that sometimes Codex is going to go ahead and create abstractions when they are not needed, but so far I have seen more hits than misses.
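
The code screenshot is not part of this capture, so here is a rough sketch of what a helper like the one described could look like. Only the names shortcutLabel and ShortcutAction come from the post; the action names, key labels, SHORTCUT_LABELS table, and tooltip function are made up for illustration and are not tidewave.ai's actual code.

```typescript
// Hypothetical action names; the real ShortcutAction union is not shown in the post.
type ShortcutAction = "save" | "openSearch" | "toggleSidebar";

// Record<ShortcutAction, string> makes the compiler reject this table
// if any action is missing a label or an unknown key is added.
const SHORTCUT_LABELS: Record<ShortcutAction, string> = {
  save: "Ctrl+S",
  openSearch: "Ctrl+K",
  toggleSidebar: "Ctrl+B",
};

// Maps a ShortcutAction to the label shown on mouseover.
function shortcutLabel(action: ShortcutAction): string {
  return SHORTCUT_LABELS[action];
}

// Example caller: templates read the label from one place instead of inlining it.
function tooltip(text: string, action: ShortcutAction): string {
  return `${text} (${shortcutLabel(action)})`;
}
```

With this shape, adding a new member to ShortcutAction without a matching entry in the table is a compile error, which is the "don't accidentally forget the label" property the post describes.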

Jonathan Low@jonathanlowhy·2d

What’s better? Using PR review tools like CodeRabbit/Greptile, or building a custom one? Currently experimenting with a custom PR review agent on @mistledev. I like how I get full control over the review scope rather than working with a black box.

Jonathan Low@jonathanlowhy·2d

@brianchew Depends on the use case. Some things benefit from more cores (e.g. building Rust) or more memory (local LLMs). If you think you will be doing those, then you should definitely go for a better machine.

Brian Chew@brianchew·2d

on the fence about buying a maxed out MacBook for the start of my career - do I get it?
I currently have a 16GB RAM M1 MBP
thoughts?
PS: I’m quite thrifty so this is kind of a big decision

Jonathan Low reposted

Thomas Jiang@thomasjiangcy·2d

Correct user attribution is pretty important when working with background agents (esp. in teams):
- who made the commit?
- who opened the PR?
- who left a comment?
This sounds easy on the surface: just ensure your credentials are scoped correctly and owned by the right person when these actions are performed.
Now, what if you're triggering these actions from Slack? or Linear?
Here's how @mistledev does it 👇

Jonathan Low@jonathanlowhy·3d

@did0f Model and thinking settings are tied to the configuration of a turn. Changing them only takes effect on your next submit after the turn ends. (Steering does not count, as it's still mid-turn.)
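
One way to picture the behaviour described in this reply is a session that snapshots its settings when a turn starts. This is only a toy model under that assumption: the Session class, TurnConfig type, and method names are invented for illustration and do not reflect Codex's actual implementation.

```typescript
// Hypothetical model of "settings are fixed per turn"; not Codex's real code.
interface TurnConfig {
  model: string;
  thinking: "low" | "medium" | "high";
}

class Session {
  private pending: TurnConfig; // what the NEXT turn will use
  private active: TurnConfig | null = null; // non-null while a turn is running

  constructor(initial: TurnConfig) {
    this.pending = { ...initial };
  }

  // Changing settings mid-turn only updates the pending configuration.
  setThinking(level: TurnConfig["thinking"]): void {
    this.pending = { ...this.pending, thinking: level };
  }

  // A new submit starts a turn and snapshots the pending settings.
  submit(_prompt: string): TurnConfig {
    this.active = { ...this.pending };
    return this.active;
  }

  // Steering feeds input into the running turn and keeps that turn's snapshot.
  steer(_message: string): TurnConfig {
    if (this.active === null) throw new Error("no turn in progress");
    return this.active;
  }

  endTurn(): void {
    this.active = null;
  }
}
```

In this model, calling setThinking("high") during a turn leaves steer() on the old snapshot, and the new level is first seen by the next submit().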

Francesco Di Donato@did0f·3d

In Codex, if I change the thinking mode mid-execution, will it:
> ignore it until my next prompt?
> complete the current internal thinking step and switch to new intelligence level on next one?
> immediately switch it mid-thinking (would that even be possible?!)

Jonathan Low@jonathanlowhy·3d

I like the autocorrect feature in @cotypist. I make a lot of small typos and not having to backspace to correct them is great!

Peter Steinberger 🦞@steipete

Can't recommend @cotypist cotypist.app enough. Autocomplete everywhere.

Jonathan Low@jonathanlowhy·3d

Building background agents internally can take months to get right. You have to think about sandboxes, credential brokering, permissions, integrations and more. With @mistledev, we're building it to enable teams to get their own Stripe Minions up and running in minutes.

Jonathan Low@jonathanlowhy·3d

@steipete @cotypist Okay, it does after I remapped Queue to `alt-enter` instead.

Jonathan Low@jonathanlowhy·3d

@steipete @cotypist Does this work on Codex CLI?

Peter Steinberger 🦞@steipete·3d

Can't recommend @cotypist cotypist.app enough. Autocomplete everywhere.

Jonathan Low@jonathanlowhy·3d

This is a forcing function to throw tokens at a problem instead of money.

Tyler Bosmeny@bosmeny

A mic drop moment @ycombinator tonight
@sama just offered $2M in OpenAI tokens to EVERY YC startup in the current batch in exchange for equity
Just like Yuri Milner offering to invest in every startup back when Sam was a YC partner
I can't wait to see what's unlocked when you let the most driven, creative and formidable founders tokenmaxx

Jonathan Low@jonathanlowhy·3d

Funny. If they really wanted, I'm sure they could just set up a Codex automation to track YC companies against token usage.

@jason@Jason

Fair warning, YC founders: if you take these tokens, there’s a non-zero chance that OpenAI will study exactly what your startup is doing, copy your idea and put your app into their free offering. This is the classic platform playbook — be careful, founders!
