Pinned Tweet

New blog post, wherein I beat a dead horse for the last time.
mariozechner.at/posts/2025-11-…
Mario Zechner
104.2K posts

@badlogicgames
Old man yelling at Claudes. The fucking bluecheck is temporary... https://t.co/mnOoWUqt4g https://t.co/8i5vIRDt6P








🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

And the most important part: we open sourced the /autoresearch plugin for pi. Just tell it what you want, it will do the rest. github.com/davebcn87/pi-a…



how dirty of a hack do i go with to auto inject new tools mid-turn in Pi 😅




