
Mitchell Bosley
@mitchellbosley
334 posts
Postdoc at CENIA | Previously at the Schwartz-Reisman Institute at UofT | PhD from University of Michigan Poli Sci

I read a few dozen pages of this and it is not bad for LLM fiction, but it is also very, very LLM-y, from the themes to the abundance of staccato conversations, meaningful silences, and overwrought metaphors, with very little differentiated character development.

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

We are back to the phase of the AI news cycle where people underestimate how jagged the AI capability frontier is, and how much models still depend on expert human decision-making or guidance at key points to function well. Still far from "doing all jobs" today.

@_aidan_clark_ My working hypothesis is that, all else equal, humans enjoy producing AI content more than consuming already-generated AI content, because the content generation loop is essentially a Skinner box.

One of the advantages of being an early LLM user is that I have seen The Curve with my own eyes (as in this post, written before ChatGPT or the term "generative AI" existed). I notice recent AI users and companies anchoring on current capabilities as if they were stable. Probably not.
