Julia Kempe

209 posts

@KempeLab

Silver Professor at NYU Courant and CDS, Research Scientist at FAIR. Research in Machine Learning, past in Quantum Computing & Finance. Posts my own.

Joined April 2024

190 Following, 2.2K Followers

Julia Kempe retweeted

NYU Center for Data Science@NYUDataScience·6d

MIT PhD student Shobhita Sundaram (@shobsund) & CDS Silver Professor Julia Kempe (@KempeLab), and a team from FAIR show how AI can generate its own practice problems to solve nearly impossible tasks. nyudatascience.medium.com/generating-the…

1.2K

Julia Kempe@KempeLab·3 Mar

Our groups' ICLR presentations:
🎉 Oral: OpenApps arxiv.org/abs/2511.20766
Soft Tokens, Hard Truths arxiv.org/abs/2509.19170
How reinforcement learning after next-token prediction facilitates learning arxiv.org/abs/2510.11495
From Concepts to Components arxiv.org/abs/2506.17052

2.8K

Julia Kempe@KempeLab·20 Feb

Nice overview of recent self-distillation works out there!

Emiliano Penaloza@emilianopp_

x.com/i/article/2024…

24.8K

Julia Kempe@KempeLab·18 Feb

The funny thing is not that GPT could give 3 simple, elegant proofs. It is that it knew what it means to prove "Fortnow-style"...

Lance Fortnow@fortnow

A cute not too hard problem: If k <= n/2 give a fully combinatorial proof that (n choose k) >= 2^k. There's more than one way to do this.
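One possible combinatorial argument (a sketch, not necessarily any of the proofs referred to above): when 2k <= n, split the first 2k elements into k disjoint pairs and let each of k bits pick one element from its pair. Distinct bit strings give distinct k-subsets, so there are at least 2^k of them. The function names below are hypothetical and only illustrate the injection for small cases.

```python
from itertools import product
from math import comb


def subset_from_bits(bits):
    """Map a k-bit string to a k-subset of {0, ..., 2k-1}.

    Bit i picks one element from the pair {2i, 2i+1}.
    """
    return frozenset(2 * i + b for i, b in enumerate(bits))


def check_injection(n, k):
    """Check the pairing argument for one (n, k) with 2k <= n."""
    if 2 * k > n:
        raise ValueError("requires k <= n/2")
    subsets = {subset_from_bits(bits) for bits in product((0, 1), repeat=k)}
    # Every image is a k-subset of {0, ..., n-1}.
    valid = all(len(s) == k and max(s, default=-1) < n for s in subsets)
    # 2^k distinct images means the map is injective.
    injective = len(subsets) == 2 ** k
    return valid and injective and comb(n, k) >= 2 ** k


# Exhaustive check for small n.
all_ok = all(check_injection(n, k) for n in range(12) for k in range(n // 2 + 1))
```

The brute-force loop is only a sanity check. The proof itself is the observation that the pairs are disjoint, so the chosen subset determines every bit.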

1.3K

Julia Kempe@KempeLab·18 Feb

@polynoamial @HarvardMath Very apt quote from co-1stProver @scottnarmstrong

104

Julia Kempe retweeted

Noam Brown@polynoamial·18 Feb

@HarvardMath AI isn’t replacing mathematicians today, but it is changing mathematics:

145

6.4K

Harvard Department of Mathematics@HarvardMath·17 Feb

"The verdict, it seems, is in: artificial intelligence is not about to replace mathematicians. That is the immediate takeaway from the “First Proof” challenge—perhaps the most robust test yet of the ability of LLMs to perform mathematical research." scientificamerican.com/article/first-…

213

106.8K

Julia Kempe@KempeLab·17 Feb

8/ Final note: thanks to the mathematicians behind #1stProof. Some fellow 1st-provers + colleagues: @AcerFur @OpenAI @merettm @DayShuai @littmath @nasqret @jleda_x @Tomodovodoo @c2v47 @BGrayzel @ArsSocraticaAI 1stproof.org #1stProof

701

Julia Kempe@KempeLab·17 Feb

7/ Recommendations (2/2): Build an audit registry. Machine-led literature search + AI-written papers = citation risk. We need a public “verification traces” layer (arXiv-adjacent): papers accumulate audit logs/certificates (models dissect/rewrite/check).

449

Julia Kempe@KempeLab·17 Feb

1/ #1stProof New write-up is live: “Takeaways from the First-Proof Trenches” (with @scottnarmstrong + @MunosRemi). Enjoy! kempejulia1.github.io/1stProof-Attem… Working across #1stProof we pulled together takeaways and recommendations for the scientific community.

8.2K

Julia Kempe@KempeLab·13 Feb

Nikhil Srivastava Rachel Ward Shmuel Weinberger Lauren Williams Colleagues: @AcerFur @littmath @nasqret @c2v47 @jleda_x @davidbessis Comments and scrutiny are very welcome.

798

Julia Kempe@KempeLab·13 Feb

11/ Again, we thank the authors for creating such a fascinating testbed — and we genuinely appreciate careful checks and independent audits from the community. Mohammed Abouzaid Andrew J. Blumberg @MartinHairer Joe Kileel @TammyKolda @nick_sriv Paul D. Nelson Daniel Spielman

862

Julia Kempe@KempeLab·13 Feb

7/ *Third — interlude: “Humor from your bot.”* We found all bots prone to shortcuts and laziness. “Let’s wait until Feb 13 to see the proof” was one of the most frequently proposed options.

979

Julia Kempe@KempeLab·13 Feb

6/ *Second:* During literature searches, the models surfaced very recent references that appeared to prove key theorems they needed. Several turned out to be AI-generated papers. Lesson: literature search is acquiring a new level of difficulty in the age of AI slop.

755

Julia Kempe@KempeLab·13 Feb

5/ After working with our AI “scientist fleet,” we want to share a few takeaways. *First:* Even top frontier models produced many false proof manuscripts. Only careful back-and-forth auditing between models exposed the bugs. We hope we got them all!

794
