AGI Prophet



Gemini Flash 3.5 is now on CursorBench, our main coding agent eval. We’ll keep updating the leaderboard as new models come out. cursor.com/evals

Giving a physics talk on Dark Energy at @UTAustin in the Karch Seminar.
DATE: Tues, May 19, 2-3pm
TITLE: "From Dark to Geometric Energy: Equivariant Distortion in Geometric Unity"
PLACE: UTexas Austin, Physics, Math & Astronomy Bldg., Room 9.222, 2515 Speedway, Austin TX, 78712


Just started testing the @grok Build beta. First feel: UX is nice, still some rough edges, but model speed is genuinely cool. If task quality on hard stuff matches opus 4.7 (or even slightly below) at this speed, it's a game-changer. Good chance they steamroll the competition.

Phew, Grok Build is really thorough, pretty incredible. Relay feature for PasteLocal is done, pushing to GitHub now. If you want to know what this adds, here's a bit more about it. This is a feature I really wanted in there, and honestly didn't expect to have done this week.
• Per-peer E2E encryption: clipboard data is encrypted individually for each paired device using X25519 + HKDF + AES-GCM (no plaintext ever leaves the client)
• Durable persistence: relay state is now stored in an atomic state.json with proper TTL handling and compaction. Pending clips survive relay-server restarts
• Safe compaction: fixed unsafe map mutation during expiration/compaction that could cause data loss or undefined behavior
• Full bidirectional CLI: new commands pastelocal relay send, inbox, fetch, and status, plus pastelocal-remote --relay --peer --send
• Auto-sync: when watch.enabled + relay.auto_upload are on, meaningful clipboard changes are automatically pushed to paired devices
• Improved DX: doctor checks are now only shown when relay is enabled, and the TUI shows basic relay status
• Multiple review cycles: went through full implement → review → fix → re-review (effort 4), plus a final targeted regression fix round
Try it out and please send any and all feedback!
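For readers curious what the durable persistence and safe compaction points in the post above could look like, here is a minimal sketch in Python. It is not PasteLocal's actual code: the language, the function names (compact, save_state, load_state), and the JSON layout with a "pending" map and an "expires_at" field are all assumptions made for illustration. It shows two general techniques: writing the state file through a temp file plus an atomic rename, and expiring entries by building a new map instead of deleting from one while iterating over it.

```python
import json
import os
import tempfile


def compact(pending, now):
    """Return a NEW dict without expired clips; the input dict is never mutated.

    Assumed (hypothetical) clip shape: {"expires_at": <unix seconds>, ...}.
    """
    return {cid: clip for cid, clip in pending.items() if clip["expires_at"] > now}


def save_state(path, pending):
    """Write state to a temp file in the same directory, then swap it in atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"pending": pending}, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp_path, path)  # atomic rename: readers see old or new, never partial
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path, now):
    """Load pending clips after a restart, dropping anything past its TTL."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        pending = json.load(f).get("pending", {})
    return compact(pending, now)
```

Because the temp file lives in the same directory as state.json, the final os.replace stays on one filesystem, which is what makes the swap atomic. A crash mid-write leaves the previous state.json intact.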

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.


Neural networks might speak English, but they think in shapes. Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision. Starting today, we’re releasing a series of posts on this research agenda. 🧵









