beginbot 🃏
8.6K posts


she looped on my specification until my acceptance criteria succeeded.

Here’s your monthly reminder that you shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.



New Google Gemma 4 12B claims near-26B performance - we tested both! We ran both models locally on one RTX 4090 and gave each the same task: write a self-contained HTML5 canvas animation with real physics in one file without libraries. Three scenes - a Galton board, two blocks colliding off a wall, and a chaotic triple pendulum Outputs: Gemma 4 26B-A4B: 15 GB VRAM usage, 6.9k tokens, 138 tok/s Gemma 4 12B: 9 GB VRAM usage, 8.9k tokens, 80 tok/s Same Gemma 4 family, but the 26B-A4B won every scene and ran ~1.7x faster - on just 4B active params. The 12B stayed very close though, on almost half the VRAM - which makes it the ideal model for a 16 GB laptop

Uber reportedly now caps coding agents at $1,500/month per employee per tool - seems sensible to me, but it's also an interesting hint at the value Uber thinks these tools are providing simonwillison.net/2026/Jun/3/ube…














