Harmonic

290 posts

Harmonic
@HarmonicMath

Building Mathematical Superintelligence

Joined January 2024
7 Following · 19.2K Followers

Pinned Tweet
Harmonic @HarmonicMath
Many of us intuitively feel that the field of mathematics is going to change, so let's unpack the likely outcomes, without resorting to hyperbole or doomerism.
Harmonic retweeted
Pietro Monticone @PietroMonticone
Nathanson has just published the recording of his talk about Aristotle’s solutions and it is very interesting to watch! “I tried to figure out what it did that I didn’t do to solve the problems.” “The incredibly clever idea that Aristotle had was…” youtu.be/VBIxv-6m7sk
Pietro Monticone@PietroMonticone

Interesting update: a few days ago, Nathanson presented a talk at the New York Number Theory Seminar explaining how Aristotle solved some of his problems.

Harmonic @HarmonicMath
Cool use of Aristotle to power formal verification inside a lambda calculus lab
Bartosz Naskręcki@nasqret

I think a radical new viewpoint is emerging on the many activities that mathematicians do. Perhaps a novel profession of mathematical engineering is emerging from the early chaos of AI for mathematics. I can see very clearly that the coordination, setup, technical pursuit, and orchestration of AI systems scaled for massive mathematical efforts and projects will require a special engineering mindset that is currently lacking, or almost completely absent, in mathematical projects.

The existence of such a profession is not in opposition to mathematical tinkerers who use their artisanal craft to produce genuinely novel content. As with any kind of content, someone needs to adapt it to the grand scheme of things. This is why these roles are starting to appear complementary rather than competitive. Maybe this is a temporary activity, soon to be replaced by computers, but I think the major role of mathematical engineers will be to stay in touch with the tinkerers and provide a human cushion around their internal activities.

I am enjoying this kind of activity (*), where you orchestrate with models and see how the project itself becomes a challenge in design and scale. This might well mean more jobs for mathematicians. In the long run, I suspect we may become secondary cognitive powers in parts of the mathematical information chain. But I do not think this will happen very soon across the whole system. And I hope it never happens at the most human layer: the joy people feel when a new idea is born.

(*) This project is essentially a lambda-calculus lab, fully integrated with classical topics such as Church's lambda calculus, the Aristotle formal proof system, and extensions over particular papers. I presented it to students at the workshop in Warszawa-Falenty and was very pleased with the result. I am now using this framework for proof development.

What strikes me most is that this is primarily an engineering challenge: the mathematics entering the pipeline is being handled, structured, and formalized, but not radically developed inside the pipeline itself.
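The "lambda-calculus lab" mentioned above centers on Church's lambda calculus; as a minimal, self-contained sketch of the kind of material involved (these names — `zero`, `succ`, `to_int`, `add` — are illustrative, not from any actual project code), here are Church numerals in plain Python:

```python
# Church numerals: a number n is encoded as the function that applies
# its argument f exactly n times. Names here are illustrative only.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

def to_int(n):
    """Decode a Church numeral by counting applications of f."""
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)

# Addition: apply succ `m` times starting from `n`.
add = lambda m: lambda n: m(succ)(n)

print(to_int(add(two)(three)))  # 5
```

Encoding numbers as iterated function application is the standard Church construction; `to_int` simply instantiates the numeral at the successor function on machine integers.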

Harmonic retweeted
Harmonic @HarmonicMath
Aristotle is getting more and more capable, assisting mathematicians not just in formalization but also discovery. Team continues to cook 🔥
Pietro Monticone@PietroMonticone

"Aristotle's proof is correct, simple, elegant, and beautiful. It uses techniques in the original paper and adds its own new ideas. I am amazed and impressed by what Aristotle has done." This is what Melvyn Nathanson, a leading additive number theorist and longtime Erdős collaborator, wrote to me after reading solutions by Aristotle (@HarmonicMath) to two problems he had posed earlier this year. Our paper answers Nathanson's Problems 10 and 11 on product intersection sets in semigroups, and also settles the second parts of Problems 4 and 7 as corollaries.

Eric Weinstein @ericweinstein
Pure mathematics will be mostly unrecognizable. Like very early black-and-white talkies becoming color home theater. Today's math will not be unwatchable like silent pictures, but even that will happen eventually too.

This AI math hype cycle will have crashed in wildly overclaimed tech-CEO BS, but the CEOs will be proven correct for the cycles that followed. Older mathematicians and younger colleagues may be seriously divided in a way that we haven't seen.

Papers will not exist in the same way. You will have an automatically adaptive custom presentation based on your abilities and interests. Many established results will survive revelations that the proofs in the literature were flawed. This will be very disturbing to mathematicians. We will find out that a lot of problems we thought were hard were actually completely misgauged.

Machines will write for each other and translate to English when needed. There will be too much mathematics to sort through. Amateurs will submit their machines' results, which will be AI-verified as valid. The successors to LLMs will relentlessly rely on a few main tricks to generate nonimitative discoveries. LLMs in math will have crashed.

It's going to be both ego-crushing and magnificent. A tragedy and a liberation. Currently unthinkable visualizations will democratize what can be understood, like exotic structures on 7-spheres.

Humans will still matter, but less and less so. They will move from doing research to directing it. We will in a sense get worse at mathematics as we atrophy, but the machines will compensate for that too. Computers may invent new areas or "theories" within 7 years. But maybe not. Hard to say. Our abysmal mathematical pedagogy will have finally fallen.

One math PhD's guess anyway, in 2026.
Paata Ivanisvili@PI010101

What will mathematics look like 7 years from now? I’m really curious to hear your brief opinion.

Harmonic @HarmonicMath
Source: github.com/teorth/erdospr… (#2-secondary-contributions-by-ai-tools)
Harmonic @HarmonicMath
Update: Aristotle has been used to autoformalize 105 of the 115 Erdős problem formalizations produced by AI tools, i.e. 91% of the total. This includes formalizations of classical proofs such as Pólya (1918), Barrow and Mordell (1937), Hall (1947), de Bruijn (1951), and Lorentz (1954), alongside very recent proofs such as Tao (2026), He–Li–Tang (2026), Pomerance (2026), and Chojecki (2026).
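As a toy illustration of what a Lean 4 / Mathlib-style formalization looks like (this is not one of the actual repository entries; the theorem name is hypothetical), here is a classical statement proved against the library:

```lean
-- Toy example, not an actual Erdős-problem entry: the infinitude of
-- primes, stated as "above every bound there is a prime".
-- `Nat.exists_infinite_primes` is Mathlib's form of this fact.
theorem toy_primes_unbounded (n : ℕ) : ∃ p, n < p ∧ p.Prime := by
  obtain ⟨p, hle, hp⟩ := Nat.exists_infinite_primes (n + 1)
  exact ⟨p, hle, hp⟩  -- on ℕ, `n < p` is definitionally `n + 1 ≤ p`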
Harmonic @HarmonicMath
Aristotle fixes this
Nav Toor@heynavtoor

🚨SHOCKING: Apple just proved that AI models cannot do math. Not advanced math. Grade-school math. The kind a 10-year-old solves. And the way they proved it is devastating.

Apple researchers took the most popular math benchmark in AI — GSM8K, a set of grade-school math problems — and made one change. They swapped the numbers. Same problem. Same logic. Same steps. Different numbers. Every model's performance dropped. Every single one. 25 state-of-the-art models tested.

But that wasn't the real experiment. The real experiment broke everything. They added one sentence to a math problem. One sentence that is completely irrelevant to the answer. It has nothing to do with the math. A human would read it and ignore it instantly. Here's the actual example from the paper:

"Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?"

The correct answer is 190. The size of the kiwis has nothing to do with the count. A 10-year-old would ignore "five of them were a bit smaller" because it's obviously irrelevant. It doesn't change how many kiwis there are. But o1-mini, OpenAI's reasoning model, subtracted 5. It got 185. Llama did the same thing. Subtracted 5. Got 185.

They didn't reason through the problem. They saw the number 5, saw a sentence that sounded like it mattered, and blindly turned it into a subtraction. The models do not understand what subtraction means. They see a pattern that looks like subtraction and apply it. That is all.

Apple tested this across all models. They call the dataset "GSM-NoOp" — as in, the added clause is a no-operation. It does nothing. It changes nothing. The results are catastrophic. Phi-3-mini dropped over 65%. More than half of its "math ability" vanished from one irrelevant sentence. GPT-4o dropped from 94.9% to 63.1%. o1-mini dropped from 94.5% to 66.0%. o1-preview, OpenAI's most advanced reasoning model at the time, dropped from 92.7% to 77.4%.

Even giving the models 8 examples of the exact same question beforehand, with the correct solution shown each time, barely helped. The models still fell for the irrelevant clause. This means it's not a prompting problem. It's not a context problem. It's structural.

The Apple researchers also found that models convert words into math operations without understanding what those words mean. They see the word "discount" and multiply. They see a number near the word "smaller" and subtract. Regardless of whether it makes any sense. The paper's exact words: "current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data." And: "LLMs likely perform a form of probabilistic pattern-matching and searching to find closest seen data during training without proper understanding of concepts."

They also tested what happens when you increase the number of steps in a problem. Performance didn't just decrease. The rate of decrease accelerated. Adding two extra clauses to a problem dropped Gemma2-9b from 84.4% to 41.8%, and Phi-3.5-mini from 87.6% to 44.8%. The more thinking required, the more the models collapse. A real reasoner would slow down and work through it. These models don't slow down. They pattern-match. And when the pattern becomes complex enough, they crash.

This paper was published at ICLR 2025, one of the most prestigious AI conferences in the world. You are using AI to help you make financial decisions. To check legal documents. To solve problems at work. To help your children with homework. And Apple just proved that the AI is not thinking about any of it. It is pattern matching. And the moment something unexpected shows up in your question, it breaks. It does not tell you it broke. It just quietly gives you the wrong answer with full confidence.
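The kiwi problem reduces to three additions; a minimal sketch (variable names are illustrative) of the correct computation next to the pattern-matched error the thread describes:

```python
# Worked arithmetic for the GSM-NoOp kiwi example quoted above.
friday = 44
saturday = 58
sunday = 2 * friday  # "double the number of kiwis he did on Friday"

# "Five of them were a bit smaller than average" changes nothing about
# the count; a correct solver ignores the clause entirely.
correct = friday + saturday + sunday  # 44 + 58 + 88 = 190

# The failure mode reported in the thread: the irrelevant "5" gets
# pattern-matched into a subtraction.
distracted = correct - 5  # 185

print(correct)     # 190
print(distracted)  # 185
```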

Harmonic @HarmonicMath
One of our goals when we designed Aristotle was to accelerate the creation of the world’s largest repository of mathematics. It’s a pity that many important results are buried inside obscure papers, many of which aren’t yet digitized. AI and formal verification together can assist in the creation of the Great Library of Mathematics. We don’t yet know whether this will be in the form of a single large open-source monorepo like mathlib, or a decentralized network of repositories, mirroring the existing paradigm of research papers linked by citations. What we do know is that it will be done by humans in tandem with AI, and so we build our tools accordingly.
Alex Kontorovich@AlexKontorovich

A preview of my talk tomorrow at the Newton Institute @NewtonInstitute (comments welcome).

My primary interest is research math: solving problems, proving theorems. Before 2019, I was accustomed to using Mathematica to check tedious, error-prone algebra in my papers. Do it once, and never waste time checking it again. But algebra was only part of the issue. If I had a lemma, and in a 60-page paper I might have 20 of them, with a dozen parameters all moving around in different ranges and needing to line up perfectly at the end, then even a single stray minus sign could kill the entire paper. The whole enterprise was extremely complex and fragile. (What I'm describing is very common in loads of fields in modern research math.)

In 2019, I watched a lecture of Kevin Buzzard's, and realized the answer: I should use an interactive theorem prover like Lean to check my lemmas the same way Mathematica checks my algebra. (Of course, as I've since learned, there are many benefits to working formally beyond correctness, and these have been extensively enumerated elsewhere, so I won't repeat them here.) But my original motivation for getting involved in formalization was simple: I hoped it would speed up my workflow. It did not. In fact, formalization is brutally tedious, requiring painstakingly spelling out facts that to a human expert are blatantly obvious.

Fast forward to 2025, and AI was getting genuinely good at helping with formalization. I was already using Claude rather extensively when we crossed the finish line on the "Medium" PNT in July 2025. By September 2025, Math Inc's Gauss system autoformalized the Strong PNT, writing over 20K lines of compiling Lean autonomously. Earlier this month, they outdid themselves again, writing 200K lines autonomously and formalizing Viazovska's theorems on optimal sphere packing in dimensions 8 and 24.

So isn't that the dream? AI can now, in some instances, autoformalize very significant theorems. Can we mathematicians just get back to thinking, sketching, and letting AI do the formalization for us?

Not so fast. Autoformalization only works because it is built on top of a big, comprehensive, efficient, coherent monorepo of high-quality formalized mathematics, namely Mathlib. And even in the PNT+ and Viazovska examples, the autoformalizations still depended on substantial earlier human work: setting up the right definitions, the right API, the right abstractions, and so on.

So maybe we now get a nice positive feedback loop: research -> formal math (thanks to AI) -> grows Mathlib -> enables more research. Still no. AI formalization, and frankly the first-pass human formalization too, is usually local, ad hoc, single-purpose work. It is not necessarily general, abstract, efficient, or reusable. So it does not in and of itself help grow Mathlib. The second arrow is broken.

Actually, this is not some temporary annoyance, it is inevitable! The goals of doing research and building libraries are misaligned, like scrambling up a cliff versus building an elevator to the top. Both are trying to go up, but for completely different reasons and in completely different ways.

In fact, it is even worse than that: the second arrow may make the feedback loop negative. Let us give that second arrow a name: "canonization". By canonization, I mean the process of taking a local, one-off formalization and turning it into library mathematics: general, reusable, coherent, efficient, and compatible with the rest of the monorepo. This is an extremely difficult and time-consuming task. It requires a large amount of prior knowledge and skill, often in several quite different areas at once.

And here's why the feedback loop may be negative: while a rough formalization can certainly be a technical head start, socially it often strands the problem in the worst possible state: too solved to feel pressing, too idiosyncratic to be reusable. If a formalization already exists in some ad hoc form, then people are much less incentivized to do this work! They get less credit for succeeding, there is less urgency, and less motivation. Does this sound familiar? It's the same structural problem we had back in 2019, going from proved results to formalized results!

So the answer should be obvious. In June 2025, I claimed that (quasi)autoformalization, meaning not entirely autonomous but allowing human intervention and steering, was the greatest short-term challenge in realizing the dream of speeding up research [K2025]. The corresponding claim today is: (quasi)auto-canonization is the greatest short-term challenge for AI systems. I personally know of only one AI company so far that seems to be taking this challenge seriously, namely Harmonic with its Aristotle agent.

Imagine if we get this right. Definitions will still be difficult to automate, but there are orders of magnitude fewer definitions than theorems. Once those foundations are laid (which will still be a ton of human time and effort!), everything else can scale on top.

Right now, the vast majority of research mathematicians working in formalization are, very commendably, working toward growing Mathlib. But they comprise maybe 1% of all professional mathematicians. This is not necessarily because people do not want to work formally. It is because the current system does not match how most mathematicians want to work. People are diverse. They have different strengths and weaknesses, different interests, different workflows.

If we embrace an ecosystem where people are encouraged to formalize freely, with heavy AI assistance, and where the right pieces later get (quasi)auto-canonized into the central monorepo, then I think we could potentially be in position, given the right incentives, training, and culture-shifts, to move from a handful to the majority of mathematicians doing math formally.
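The distinction Kontorovich draws between ad hoc formalization and canonized library mathematics can be sketched in Lean 4 against Mathlib; in this toy pair (both theorem names are hypothetical), the first lemma closes one gap in one argument, while the second is stated at the natural level of generality so other proofs can reuse it:

```lean
-- Toy contrast, with hypothetical names. An ad hoc, single-purpose
-- lemma: it serves exactly one step of one paper.
theorem adhoc_step (n : ℕ) : 2 * n + 3 * n = 5 * n := by ring

-- The "canonized" version: fully general, so it slots into a library
-- and makes the special case above a one-liner.
theorem canonical_step (a b n : ℕ) : a * n + b * n = (a + b) * n :=
  (add_mul a b n).symm

example (n : ℕ) : 2 * n + 3 * n = 5 * n := by
  simpa using canonical_step 2 3 n
```

Canonization in this sense is the passage from the first statement to the second: same mathematical content, restated so that it composes with the rest of the monorepo.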

Harmonic @HarmonicMath
We’re building the future of mathematical reasoning, and we need the right interface to bridge the gap between human and machine. Harmonic is hiring a Frontend Engineer to lead the evolution of Aristotle’s UX. Help us make complex reasoning intuitive.
Harmonic @HarmonicMath
@nielstron thanks for the feedback! we’re working on both currently - should ship soon!
Niels Mündler @nielstron
@HarmonicMath I have two pieces of feedback: (1) it's impossible to provide feedback through any channel on your website; (2) please show completed projects *by time of completion*. I have a bunch of proofs running in parallel and now need to sift through all the completed ones to figure out what got recently resolved. Thanks! Amazing tool.