Jason Rute

497 posts

@JasonRute

AI Researcher @ Mistral AI | Formerly IBM Research | Former Mathematician/Logician/Data scientist | Building AI for math and reasoning

Joined July 2022

226 Following · 712 Followers

Pinned Tweet

Jason Rute@JasonRute·16 Mar

Announcing our fully open source code agent to support development in @leanprover. This has been a labor of love by our team at @MistralAI and we look forward to seeing what the #LeanProver community does with it!

Jason Rute@JasonRute·1d

@gro_tsen Never mind. I see you say "for arbitrarily large n" and talk just about the sup of delta. My bad.

Jason Rute@JasonRute·1d

@gro_tsen I think you also need a big-O or little-o fudge term. For example, Sawin’s result is n^{1.014114}/C for some constant C, and Erdős’s original was n^{1 + o(1)}. (Or maybe this is implied without saying in your post.)

Gro-Tsen@gro_tsen·1d

So, since three days ago, there's a new constant in mathematics, which one might call the “plane unit distance exponent”: the sup of all δ≥0 such that there exist unit distance graphs in the plane with n vertices and n^(1+δ) edges for arbitrarily large n.

Jason Rute@JasonRute·1d

@prz_chojecki It will be interesting to compare this to the gap that the LANA project said they found and will announce this summer.

Przemek Chojecki | PC@prz_chojecki·2d

Funny things happen when you start asking GPT-5.5 Pro to fill gaps in Mochizuki's work, especially the 3.11.5 ⇒ 3.12 passage objected to by Scholze-Stix. LLMs are probably the best shot at digesting this 1000+ page proof and translating it into a more standard Arakelov geometry approach.

Jason Rute@JasonRute·17 May

@aaswaminathan01 @mathandcobb While I think your take has some truth (we will soon be able to autoformalize a nontrivial number of math papers into, say, Lean), I think it is missing a large degree of technical, practical, and sociological nuance.

Ashvin Swaminathan@aaswaminathan01·17 May

@mathandcobb I think we are close (i.e., within two years) to a moment where people should be expected to provide formalizations of their work, built on formal statements of inputs from references. This isn't foolproof, but will go some way to solve the fabricated reference problem.

Ashvin Swaminathan@aaswaminathan01·16 May

In the context of math papers, all this discussion about the arXiv is moot. We are in the era of formalization. We should aim for end-to-end Lean proofs.

Jason Rute@JasonRute·16 May

@Kseniase_ @CarinaLHong @ylecun @logic_int @evelovesolive I agree with @CarinaLHong. Everything I’ve seen from @evelovesolive says that Aleph prover is not an EBM but an LLM wrapper. @evelovesolive has even said it in X replies when asked, but I would have to search to find them.

Ksenia Se@Kseniase_·16 May

@CarinaLHong @ylecun @logic_int I really appreciate your input. I think it would be tremendously interesting if you and @evelovesolive could have a recorded conversation discussing the science behind your approaches.

Ksenia Se@Kseniase_·15 May

EBMs are so back! @ylecun has been pointing here for years: AI reasoning needs systems that check structure before they answer. Aleph from @logic_int now leads the major formal reasoning benchmarks – let me explain what it is -> 📺

Jason Rute@JasonRute·15 May

@danrobinson I think it might be unethical to raise Erdős from the dead, although the idea has been considered for other purposes: xkcd.com/599/

Dan Robinson@danrobinson·15 May

Everyone seems to be working on tools to automatically solve Erdős-style problems. Is anyone working on automated generation of new ones?

Jason Rute@JasonRute·15 May

@LatinumAI Did you rewrite the interpreter too or just the compiler? I guess now @leanprover needs an external compiler bench (alongside their external kernel bench).

Latinum Frontier Mathematics Research Lab@LatinumAI·14 May

Why are formal languages not used as programming languages? Most were built only for mathematics. Lean lets you write real code, but its compiler assumes a garbage collector, threads, and an operating system underneath. So I rewrote it.

Jason Rute@JasonRute·14 May

@lpachter Do you have good examples?

Lior Pachter@lpachter·13 May

There is much being written about AI in pure mathematics but less in applied mathematics, where I'm finding it to be just as impactful, if not more so.

Jason Rute@JasonRute·14 May

@ChrSzegedy @prz_chojecki That paper was very influential to my view on this field. Especially the autoformalization/proving flywheel. It feels close.

Christian Szegedy@ChrSzegedy·13 May

@prz_chojecki leanprover.zulipchat.com/user_uploads/3…

Przemek Chojecki | PC@prz_chojecki·13 May

Solve math → Solve everything else

This is why scaling curves still look smooth but the jumps feel discontinuous. Math is the narrowest bottleneck in the entire intelligence stack. Crack it cleanly and the rest of cognition is just downstream reuse of the same circuits.

No more “we need a special module for X.” X is always math in disguise.

LLMs that only pattern-match calculus can win existing benchmarks. The ones that internalize proof, generalization, and counterfactuals at the root level get the keys to every lock.

If you want to get even close to AGI, you have to pass through mathematics. Math is the ur-substrate. This is what I'm building for.

Jason Rute@JasonRute·14 May

@littmath Can you explain “verification is the bottleneck”?

Daniel Litt@littmath·13 May

stochastic parrots: “it doesn’t think” “verification is the bottleneck” “solve math, solve everything else,” “they’re just stochastic parrots”

Jason Rute@JasonRute·13 May

@giffmana Math is usually fairly robust to errors, much more than code. There are lots of articles about why. This particular benchmark however is designed adversarially to be very fiddly, calculation based, and non-intuitive (else the model will guess the solution).

Lucas Beyer (bl16)@giffmana·13 May

Turns out that professional mathematicians make pretty much the same amount and style of bugs as professional programmers make:

Greg Burnham@GregHBurnham

Thread with a few notes on this. It’s a disappointing finding, of course. The best we can do is fix it up and learn lessons for future work.

Jason Rute@JasonRute·12 May

@EpochAIResearch One third of solved problems? Or unsolved problems? Or both?

Epoch AI@EpochAIResearch·12 May

We are conducting an AI-assisted review of FrontierMath: Tiers 1-4. This has flagged fatal errors in about a third of problems, and we believe most of these flags to be valid. We will release updated scores on a corrected dataset after completing a thorough human review.

Jason Rute@JasonRute·12 May

@ElliotGlazer Is there something special about aleph_17?

Elliot Glazer@ElliotGlazer·12 May

Thank you for your participation in this year’s continuum survey. Please enjoy your complimentary continuum clock.

Jason Rute@JasonRute·10 May

@VictorTaelin Have you ever considered writing an external checker for Lean? (I assume this particular one is different from Lean.)

Taelin@VictorTaelin·9 May

To whom it may concern

NanoProof.hs: the smallest viable proof checker

I posted something similar before, but it was more of a research experiment with weird λ-encoded shit than something usable. This new repo contains a tiny, 1000-LOC, self-contained Haskell proof checker that you can actually use to prove arbitrary theorems.

The language has just 6 base types:
→ Empty (`⊥`): type with 0 elems
→ Unit (`⊤`): type with 1 elem (`()`)
→ Bool (`𝔹`): type with 2 elems (`0 | 1`)
→ Sigma (`ΣA.B`): dependent pairs (`(x,y)`)
→ Pi (`ΠA.B`): dependent functions (`λx.f`)
→ Equal (`a==b`): propositional equality (`{==}`)

That's all you need. Each of these is needed, as it introduces something fundamental.

The file includes a parser, stringifier, equality, a bidirectional type checker, and a simple CLI. It also includes first-class reduction relations, which allow us to pretty print goals just like Lean. You can place '()' in a position to inspect the current context and goal there. I also include a demo proof for the commutativity of multiplication.

Jason Rute@JasonRute·10 May

@lacker @julianboolean_ I think we have plenty of more difficult problems already? But if it is a new conjecture, it would be interesting (at least the first time) exactly because the AIs are trying to convince us it is important.

Kevin Lacker@lacker·10 May

@JasonRute @julianboolean_ I guess after the math AIs solve the Riemann Hypothesis, their next challenge will be inventing a hypothesis that they can convince humanity is even more important.

Julian@julianboolean_·9 May

the more i think about this, the more wrong I think Gowers is here.

Math is infinite. Every proof or problem you could ever write about is in the Library of Babel. LLMs just make it more accessible.

Finding a proof of a particular problem may have gotten easier, like how complex-bashing made elementary geometry easy. But the fun is in picking out the right problem - the interesting problem - from all the books in the Library. And that will always be possible.

Nabeel S. Qureshi@nabeelqu

The mathematician Tim Gowers: "the era where you could enjoy the thrill of having your name forever associated with a particular theorem or definition may well be close to its end"

Jason Rute@JasonRute·10 May

@lacker @julianboolean_ There are a number of good videos aimed at more general audiences explaining advanced math concepts, including Fields Medal-winning papers. In this hypothetical scenario where AI is good at everything, it would also be good at making this kind of content.

Kevin Lacker@lacker·9 May

@julianboolean_ I dunno, if cutting edge math becomes too hard for humans to understand, why would anything be interesting any more? Would we have to trust the AIs when they told us which open questions were interesting, and worth working on?

Jason Rute@JasonRute·9 May

@littmath Did it work out that it is likely open, or find that out from, say, your paper?

Daniel Litt@littmath·8 May

I have an ambitious conjecture that reasoning models, since o3, are convinced is false. I pose it to each new model. Earlier model generations would consistently hallucinate counterexamples; GPT 5.5 Pro spends ~an hour searching and then grudgingly concedes it’s open.

Jason Rute@JasonRute·9 May

@j_dekoninck @xeophon I know one benchmark where over half the problems are impossible to get correct.

Jasper Dekoninck@j_dekoninck·9 May

@xeophon I think the more problematic statement that people make for such benches is: "models really suck at this task"

Florian Brand@xeophon·9 May

"our bench is really hard*" * (we prompt the models in emojis, it has access to 841 tools which are loaded in the user message and we give it 12 seconds wall clock time)

Jason Rute@JasonRute·9 May

@Anthony_Bonato AI can do the verification as well, especially with formal verification (but humans can still verify the verifiers).

Anthony Bonato@Anthony_Bonato·9 May

Waiting for the day when AI claims to solve some big problem/conjecture in mathematics and no one can understand the proof so we can't verify whether it is correct

Timothy Gowers @wtgowers@wtgowers

I've recently got in on the act of getting AI to solve open problems in mathematics. More precisely, I gave some questions asked by Melvyn Nathanson to ChatGPT 5.5 Pro, to which I have been given access, and it answered them. 🧵

Jason Rute@JasonRute·9 May

@LawrPaulson @mathladyhazel I think the best thing would be to have two different operations for exponentiation: one where the exponent is in Nat (for mult. monoids) / Int (for mult. groups), and 0^0=1 is right for that; and one where the exponent is a real/complex/vector/matrix. For the latter, use, say, exp(ln(a)*b)?
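
A minimal Python sketch of the two-operation idea in the reply above. The function names `monoid_pow` and `real_pow` are hypothetical, chosen only for illustration, and the sketch covers just the Nat-exponent and real-exponent cases. It shows why 0^0 = 1 falls out of the first operation (an empty product is the multiplicative identity) while the second operation, taken literally as exp(ln(a)*b), is simply undefined at a = 0.

```python
import math


def monoid_pow(a, n):
    """a**n for a natural-number exponent n, computed as an n-fold product."""
    if n < 0:
        raise ValueError("exponent must be a natural number")
    result = 1  # empty product: the multiplicative identity
    for _ in range(n):
        result *= a
    return result


def real_pow(a, b):
    """a**b for a real exponent b, defined as exp(ln(a) * b); requires a > 0."""
    # math.log raises ValueError for a <= 0, so 0**0 has no value here.
    return math.exp(math.log(a) * b)
```

Under these assumptions, `monoid_pow(0, 0)` returns 1 because the loop body never runs, whereas `real_pow(0.0, 0.0)` raises `ValueError` from `math.log`.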

Lawrence Paulson@LawrPaulson·8 May

@mathladyhazel It’s not something you prove. It’s a definitional choice. And it does work.

Math Lady Hazel 🇦🇷@mathladyhazel·8 May

This is why some math people say 0⁰ = 1.
