Semon Rezchikov

1.1K posts

@eigenstate

Mathematician with wide interests. My job is at IHES but you can also find me in NYC/Bay Area. Talk to me! Twitter is for fun 🦋

NYC · Joined June 2009

847 Following · 1.1K Followers

Pinned Tweet

Semon Rezchikov@eigenstate·Feb 13

For the record I think that progress in AI-for-math will completely change the way math research is done over the next few years. I am not at all a skeptic. I just want a) honesty b) for people to understand what math research *really is*.

8.6K

Semon Rezchikov retweeted

American Bird Conservancy@ABCbirds·Feb 20

🥇 We aren't saying Alysa Liu "Stole the Look" from the White-crowned Sparrow, but we aren't not saying it either. Congrats on an epic win! Learn how to help birds on and off the ice on our blog: bit.ly/45usF82

833

11.7K

589.9K

Semon Rezchikov@eigenstate·Feb 18

@littmath @jasondeanlee Then again, the unofficial (but widely distributed) pronouncements of folks at various labs were also too early. (I have the god-given right to complain about *everyone*!)

318

Daniel Litt@littmath·Feb 17

@jasondeanlee Yes, way too early for this article!

2.7K

Jason Lee@jasondeanlee·Feb 17

What? How did they get to this conclusion? The solutions haven't even all been read yet.

Harvard Department of Mathematics@HarvardMath

"The verdict, it seems, is in: artificial intelligence is not about to replace mathematicians. That is the immediate takeaway from the “First Proof” challenge—perhaps the most robust test yet of the ability of LLMs to perform mathematical research." scientificamerican.com/article/first-…

10.6K

Semon Rezchikov@eigenstate·Feb 17

Key properties: a) output only depends on x + maybe rng b) no further inputs x’ are provided to f by A after x is revealed by B.

224

Semon Rezchikov@eigenstate·Feb 17

What is the status of verified secret computation? Specifically, suppose company A wants to claim that they have a fixed secret function f (some massive tool-using LLM ensemble) which at t_0 will be evaluated on value x submitted by unrelated org B. How can A prove this to B? (1/2)

305
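
One partial ingredient for the question above can be sketched in Python: a hash commitment that A publishes before t_0. This is only an illustration under stated assumptions, not a known solution to the problem as posed. It assumes f can be serialized to bytes, and the names `commit` and `verify_opening` are hypothetical. It binds A to a fixed f, but checking it requires opening f to whoever verifies (for example a trusted auditor), so it does not by itself keep f secret from B or prove that a given output came from f; that gap is where zero-knowledge proofs or trusted-hardware attestation would have to come in.

```python
import hashlib
import secrets


def commit(f_bytes: bytes) -> tuple[str, bytes]:
    """A's side, before t_0: commit to a serialized f (hypothetical helper).

    Returns (digest, nonce). A publishes the digest and keeps the nonce
    and f_bytes private. The random nonce hides f against guessing.
    """
    nonce = secrets.token_bytes(32)  # fixed length, so nonce + f_bytes is unambiguous
    digest = hashlib.sha256(nonce + f_bytes).hexdigest()
    return digest, nonce


def verify_opening(digest: str, nonce: bytes, f_bytes: bytes) -> bool:
    """Verifier's side: check that an opened f matches the earlier commitment."""
    recomputed = hashlib.sha256(nonce + f_bytes).hexdigest()
    return secrets.compare_digest(recomputed, digest)


if __name__ == "__main__":
    digest, nonce = commit(b"serialized model, version 1")
    # The same f opens the commitment; a swapped f does not.
    print(verify_opening(digest, nonce, b"serialized model, version 1"))
    print(verify_opening(digest, nonce, b"serialized model, version 2"))
```

The commitment addresses only the "fixed f" part of the claim. Property (a), output depending only on x plus maybe rng, and property (b), no further inputs after x is revealed, are statements about the evaluation itself and are not covered by this sketch.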

Semon Rezchikov@eigenstate·Feb 17

Given that there is no formal evaluation of the 1st proof challenge for round 1, it does not seem fair to interpret initial results in any way whatsoever. AI hype notwithstanding. Let's do it properly for round 2.

Harvard Department of Mathematics@HarvardMath

2.1K

Semon Rezchikov@eigenstate·Feb 16

@dieworkwear You should do a Cédric Villani critique!!!

528

derek guy@dieworkwear·Feb 16

I don't think it has to be binary. Clothing can be an interesting lens to view culture. It also intersects with economics, politics, and sociology in interesting ways. At the same time, it doesn't tell us much about a person's deeper, more important qualities, such as their character, intelligence, or morality. And it's hardly the most important thing in the world. Many of the world's most brilliant minds don't dress particularly well, such as Grigori Perelman and Terence Tao (two of the world's best mathematicians). And it's perfectly fine to prioritize other things in your life. I only object to people who conflate a rejection of fashion with high-mindedness or virtue. To me, that's making the same superficial judgement in reverse. You are not better than other people for dressing badly, just as you're not better than other people for dressing well. I think it's fine to like clothing and not put too much weight on it. Sometimes I think people get carried away with the extremes on these two positions. My friend and fellow menswear writer Bruce Boyer is fond of saying: "Clothing is more important than most people think, but less important than fashion people think."

tuhin@tuhin

Sorry PG. How you do one thing is how you do everything. Dressing well and by extension fashion is about how you see yourself in the world and an extension of your inner beliefs. It’s also a desire to see beauty in the world and be a part of that for others. It’s also about seriousness and reverence. It’s quite common to mark it up as superficial aesthetics. But it’s not.

219.4K

Semon Rezchikov@eigenstate·Feb 15

@ben_golub As I understand it, the organizers are not treating the "first release" as a formal benchmark that they will be writing official evaluations for -- this will happen with subsequent releases. You should probably wait for the First Proof org to formally comment.

1.5K

Ben Golub@ben_golub·Feb 15

So what's the state of First Proof? Is there some consensus on how OAI did?

17.6K

Semon Rezchikov@eigenstate·Feb 14

@MysteryHacker1 I'm not at all a relevant expert, and obviously I can't make any sense of your links. I'm sure that you can find relevant experts and explain your dramatically improved argument in a conventional manner and they would be appreciative!

1223334444555554444333221@MysteryHacker1·Feb 14

@eigenstate I have a much better proof if you're actually interested (the case analysis is extraordinarily simplified with respect to resolving a single edge-1-deficiency for pentagonal faces in the topological reconstruction of cubic planar bridgeless mated binary tree graphs [Whitney 1931]).

Semon Rezchikov@eigenstate·Feb 14

Actually, benchmark idea: is it easy to redo the computer part of the 4 color theorem argument with AI assistance now? :D And write up a nice streamlined summary of the strategy that undergrads can understand?

Bojan Tunguz@tunguz

I’ll take some time to think through the implications of this work, but my first impression is that it’s a Quantum Field Theory equivalent of the four color theorem - computer assisted proof, rather than computer generated. And definitely not profound new physics from scratch.

2.8K

Semon Rezchikov@eigenstate·Feb 14

@boazbaraktcs I totally agree that users will be constantly using models to prove research level lemmas by Feb 2027. (Would be idiotic if not.) Your earlier claim reads like “in 6 months math centaurs are ~ over”, very different!

1.1K

Semon Rezchikov@eigenstate·Feb 14

This did not happen (The OAI work is super interesting, I don’t know why I have to caveat this)

Jason Lee@jasondeanlee

Must be at 8/10 or 9/10 for the first proof. Some prodding from the boss to deliver on the last mile

2.6K

Boaz Barak@boazbaraktcs·Feb 14

@eigenstate They could come up with harder and harder problems! Though at some point, maybe a year, it would seem pointless to ask if AI can do the thing that tons of users are constantly using it to do. Just as there is no "First Code" project.

4.5K

Boaz Barak@boazbaraktcs·Feb 14

#1stProof challenge is exciting because it is at the edge of current AI capabilities. 6 months ago, all or almost all problems would have been infeasible for AIs. 6 months from now, all will be routine.

5.9K

Semon Rezchikov@eigenstate·Feb 13

@AcerFur Including the chat instances is important! I really do think with a mathy human thinking about them while talking to a model quite a lot of them are solvable straightforwardly — these are lemmas, after all.

Acer@AcerFur·Feb 13

I'll write up the methodology and workflow soon (and will include the chat instances once I've gone through and grabbed the relevant ones); I just needed to get these out before the deadline.

2.4K

Acer@AcerFur·Feb 13

Here are my attempts at getting GPT-5.2 Pro/Gemini 3 Deep Think to give solutions to all 10 #1stProof problems: github.com/KStarGamer/Fir… Some caveats: I know the attempts to problems 4, 6, and 7 are incomplete, and that the attempt for problem 8 is probably incorrect.

110

15.6K

Semon Rezchikov@eigenstate·Feb 13

@deevrod Also to be fair, having the AI solve new problems is obviously super cool. Why not do it!

153

rodion n. déev@deevrod·Feb 13

@eigenstate I’m sure one can get infinite money for this by enveloping it with right buzzwords (like, “building an AI-powered temple of all human knowledge” or whatever is fashionable among billionaires at the moment)

178

Semon Rezchikov@eigenstate·Feb 13

@deevrod Because of strong financial and reputational incentives!

346

rodion n. déev@deevrod·Feb 13

@eigenstate I don’t understand why people keep trying to make AI solve new problems while it would be obviously much more efficient in helping to navigate the present body of works

423

Semon Rezchikov@eigenstate·Feb 13

At this point I use the models on many days; they were net negative back in August and are solidly net positive now.

1.2K

Semon Rezchikov@eigenstate·Feb 13

Cheating here means claiming 2 when really something like 1 happened. It's not that they're not both meaningful, but they show pretty different behaviors and that's important. Oneshotting these problems totally autonomously (use many agents sure) is meaningfully different.

443

Semon Rezchikov@eigenstate·Feb 13

Re: #1stproof I just want to point out I think that 1) a math human with moderately relevant background can solve many of these problems by interactively talking to a model and then giving it hints, 2) this is extremely different from 1-shot performance, 3) it's tempting to cheat

1.3K
