Post

spent the last hour with gemini 3.5 flash on command code asked it to build a model-routing web tool (cost: $0.32). One-shot capable test prompt, I benchmark with on every new release. here's what i actually saw: 1. built a todo list before writing a line of code. tracked it. closed it well. It chose linear agent execution instead of the much hyped all things parallel. 2. shipped a ~500-line single-page app. deterministic scoring, documented well. design nothing to write home about. 3. zero clarifying questions. it just committed to choices. lazy gemini is gone in my book. 4. asked it what it'd change in another hour. answer: "tf-idf or vector embeddings for query understanding." specific, names a real weakness in the code it just wrote. 5. tie-breaker in modelroute: picks the cheaper model. it wasn't asked. that's the "pro-coding" move. trying out more experiments. exciting stuff.





