
Nate
223 posts

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is the first to measure: would you actually merge this code?


SOURCES: ANTHROPIC MYTHOS GONNA BE A MAJOR FLOP

The Omarchy 4 branch is now 30,000 lines of new code. The majority of it was written by GPT5.5. It's been so, so good at QML. You still need to review, but there's just no way this scale of a conversion would be feasible without AI in a reasonable time. github.com/basecamp/omarc…

Today’s Codex quality-of-life updates start in settings. You can now search Codex settings, with results grouped by category, so you can find what you want to change without scanning every section – this makes setup and customization easier.

Send help.