
Adrian de Wynter

@deWynterruption
Scientist at Microsoft/University of York. Opinions are my own.





I cannot overstate how important it is to not anthropomorphize models. “Happiness” is a property of sentient beings, which this measurement is very much not. A simulacrum at best. Once again, it is important to read what the actual eval is instead of the paper title, and it is just this 🤦‍♂️



Whatever AI sceptics say, LLMs really can reason. They're not just doing an imitation that looks like reasoning, it's the real deal. But even though they are able to reason, sometimes they won't! If you ask an LLM a question it can't answer, sometimes it will just try to imitate reasoning without doing it. The chain of thought looks basically indistinguishable from actual reasoning. But under the hood something very different is going on. @TrentonBricken talked with me about what work on circuits inside LLMs has revealed:





The title of this paper is click-bait and doesn't make sense. x.com/MilesCranmer/s… Let me fix the title of this paper to: “If LLMs have human-like attributes, then so does the neural network inside Age of Empires II.” The paper's argument appears to conflate the substrate with the implemented system. Age of Empires II, as a classical videogame, is not shown to possess anthropomorphic attributes, as the title suggests. What is actually shown is that a trainable neural network or LLM-like system can be implemented inside a game environment such as Age of Empires II. Nothing new is shown in this paper. Therefore, its content is stale and does not support the title's claim.
















We've made a breakthrough in self-evolving AI scientists moving from "search" to "principled discovery": Scientific discovery requires that the search space itself changes, and an AI scientist must perceive this shift without intervention.
We built an AI that achieves this for the first time with the ability to discover the scientific vocabulary it reasons in. Evidence, tools, artifacts, verifiers, failures & claims become typed provenance.
We show three distinct modalities: 1) retrieval, adding known objects; 2) search, exploring a fixed schema; and critically: 3) discovery, a verified regime transition.
We solve the open-endedness evaluation problem by lifting agentic workflows into a typed copresheaf and proving, via a Kan obstruction, that true discovery is not unbounded generation but a verifiable schema expansion: old evidence is transported by Left Kan extension, and genuine novelty is mathematically quantified by the pointwise residual beyond the transported image - separating discovery from mere search and making novelty objective and measurable rather than a subjective judgment or benchmark delta.
Our AI scientist is built in a way that does not pre-conceive the approach it chooses; instead, we endow the system with formal power to adapt, evolve, and reason from first principles.
Case studies include:
1️⃣ Builder/Breaker model that discovers mode-conditioned compliance in proteins;
2️⃣ CategoryScienceClaw that finds anisotropic fiber-network stiffness rules.
Great work in collaboration with my graduate student @fwang108_ @MITdeptofBE
F.Y. Wang & M.J. Buehler, Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence, arXiv:2606.01444, 2026


