Mark Kisin

20 posts

@marcusgnt

Math Professor at Harvard. Interested in arithmetic geometry.

Joined October 2013
125 Following, 37 Followers
Mark Kisin
Mark Kisin@marcusgnt·
@mbeisen @littmath Why do you think that AI slop is a transient problem? It’s completely possible, btw, that AI will soon be capable of producing (some) non-slop papers while humans using AI still produce a lot of slop.
2
0
1
151
Michael 英泉 Eisen
Yes. But I think you’ve articulated nicely elsewhere that something is fundamentally changing here - not because of the policy per se but because of the context in which it’s happening. AI slop and hallucinations are a transient problem and relatively easy to deal with. The real challenge is that AIs will soon - if they aren’t already - be capable of autonomously generating papers that aren’t slop, and of doing it at a scale that would swamp humans on arXiv. What then? This is why I think this is a consequential policy decision even if it is understandable in a narrow immediate sense. It’s a move to gating on the mode of authorship, done - or at least presented - without really grappling with what that means.
4
0
5
1.4K
Michael 英泉 Eisen
Too many threads going on, so I'm going to try to consolidate. I don't think anyone objects to the core principle nominally at play here: that if you put science out into the world, you are responsible for that work. This is what science is. I don't want to get distracted by questions of authorship or how responsibility is apportioned amongst authors - that's an orthogonal issue. The expectation that you can trust the scientific outputs (and I'm intentionally broadening this beyond papers) of others is really a defining feature of science as a collective endeavor. And obviously, if a paper contains hallucinated references, fake citations, placeholder text, or obvious autogenerated junk, it’s hard to argue the authors exercised even minimal scholarly care. People have tried to paint me and the others who have expressed concern about the new arXiv policy as somehow questioning this. We're not. To me, that move represents a deeper shift, and I think it warrants at least acknowledgment - and IMO deeper discussion. The value of preprint servers to the research community comes from them being fast, open, effectively unfiltered, and agnostic about correctness. A lot of great science is published first on arXiv and other preprint servers, and so is a lot of science that is poorly executed and often poorly presented. Since the existence of the latter doesn't devalue the former, it's a bargain most people are happy with. One of the things that kept this model afloat was the fact that producing a paper required some non-trivial effort, and therefore people inclined to produce works that could en masse disrupt the ecosystem could not actually produce them at scale. AI has obviously shattered any remnant of connection between things that look like papers and scholarly output and effort (mind you, I think this is a good thing, but that's also a somewhat separate topic).
**But the response to it has also broken something.** arXiv (and other preprint servers) have always had to impose some kind of screening to keep out obviously inappropriate stuff, and I think most of us agree that asking "Is this an actual work of science?" before posting something is a reasonable thing for a preprint server to do (provided that the definition of what a work of science is is intentionally fairly broad). However, the new policy is explicitly changing that bargain. The question is no longer "Is this a relevant scholarly work?" Rather it is becoming "Can we trust this authorial process?". That is a HUGE shift. Look, I understand why moderators feel existential pressure - the system isn't architected, in its infrastructure, processes or modes of use, to cope with a massive flood of AI-generated papers. But there are some real risks in the new direction. 1) The thing that makes preprint servers different from (and better than) journals is that there is no gatekeeping. The new policy threatens this. Once moderation becomes about inferring authorial integrity, the boundary between “quality control” and “editorial policing” gets blurry. The fact that one of the 'punishments' is to force people to go through peer review before posting to arXiv (an idea too absurd to even mock) suggests that current leadership has a comfortable relationship to journal peer review, which raises the risk that arXiv will become a journal in every meaningful sense. 2) “Incontrovertible evidence” sounds, well, incontrovertible, but moderation systems take on a life of their own via various forms of procedure, precedent and social signaling. Today it’s hallucinated references. Tomorrow it could become stylistic mimicry. Slippery slope here. 3) The policy misdiagnoses the real problem. As I've said elsewhere, the issue is not “AI use” but the system that leads people to think it will benefit them to push slop onto arXiv.
LLMs may amplify the negative effects of metric-driven academia, but they didn't create it. To me we are at a fork-in-the-road moment. There is a world within our grasp where an alignment of preprinting and AI actually breaks the toxic stranglehold that traditional publishing has on science. A world where actual communication (not the facsimile of it we have today) takes place between people, between machines and from people to and from machines, around data and ideas in science. But there is also a world where the preprint servers we love collapse, out of fear and a lack of imagination, into irrelevancy and we lose the moment. I'm not saying this policy itself will cause that. But I am saying that it's not a good sign.
Thomas G. Dietterich@tdietterich

Attention @arxiv authors: Our Code of Conduct states that by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the contents were generated. 1/

22
17
94
31.7K
David Savitt
David Savitt@dsavitt·
Well, good luck everybody.
David Savitt tweet media
1
1
7
1.7K
Mark Kisin
Mark Kisin@marcusgnt·
@littmath As an editor I’ve found it is trivial to catch LLM generated papers in general this way. I imagine that’s what arxiv is doing. People are fixating on the hallucinated reference but that was just an example.
1
0
4
212
Daniel Litt
Daniel Litt@littmath·
Re: arxiv LLM policies, it is now trivial to catch hallucinated citations, obvious LLM “if you’d like I can etc.” text, and so on, *by using current-gen LLMs*. What we really want is for output to be proof-of-thought, for which the mere existence of a paper no longer suffices.
26
18
359
24.6K
Mark Kisin
Mark Kisin@marcusgnt·
@LucaAmb You’re conflating the crime with evidence of the crime. The crime isn’t the editing ‘slip’. It’s trying to pass off LLM output as something you’ve thought about and checked.
0
1
5
154
Luca Ambrogioni
Luca Ambrogioni@LucaAmb·
A lifetime arxiv ban (arxiving after pubs is meaningless) can completely wreck someone's career. Doing that for a single inconsequential slip on an otherwise good paper is completely reckless. Society is changing fast, we need to support people. Not punish them.
80
17
336
112.6K
John Duncan
John Duncan@johnfrduncan·
@marcusgnt @littmath The boundary of current capabilities is indeed jagged. Out of interest: If AI were to become more capable in your area at some point, would you (expect to) regard that as generally positive, or generally negative?
1
0
1
728
John Duncan
John Duncan@johnfrduncan·
To working mathematicians who are not optimistic about the future of mathematics: Is there any aspect of what you do now that you can't do better with the help of AI tools?
14
1
71
15.9K
Mark Kisin
Mark Kisin@marcusgnt·
@sebngriego @littmath I wouldn’t trust the proof or the refutation in this thread without thinking about it myself!
1
0
2
158
Daniel Litt
Daniel Litt@littmath·
New project: problemsilike.com, a website collecting open problems that I, personally, like, with comments on their context, difficulty, and interest.
Daniel Litt tweet media
28
109
834
55K
Daniel Litt
Daniel Litt@littmath·
I have an ambitious conjecture that reasoning models, since o3, are convinced is false. I pose it to each new model. Earlier model generations would consistently hallucinate counterexamples; GPT 5.5 Pro spends ~an hour searching and then grudgingly concedes it’s open.
30
16
816
71K
Mark Kisin
Mark Kisin@marcusgnt·
@littmath That is a very tight pair of inequalities!
1
0
3
345
Daniel Litt
Daniel Litt@littmath·
@marcusgnt I think autonomous systems will be able to produce (some) papers at the level of the best human experts now (though not arbitrary papers at the level of the best experts), but I expect humans to be able to produce even better papers using the tools…
1
0
12
559
Daniel Litt
Daniel Litt@littmath·
Something worth clarifying about this quote: it seems likely that the best papers produced in 2030 will be better than the best papers of 2025, and I think it’s unlikely they’ll be produced autonomously by AI systems (rather than by experts using them as tools).
Neel Somani@neelsomani

"In March 2025 I made a bet... that AI tools would not be able to autonomously produce papers I judge to be at a level comparable to that of the best few papers published in 2025, at comparable cost to human experts, by 2030. I now expect to lose this bet." - @littmath

6
10
138
17.7K
Mark Kisin
Mark Kisin@marcusgnt·
@skdh I mean I wouldn’t have predicted it given your success, and yet here we are.
1
0
15
384
Sabine Hossenfelder
Sabine Hossenfelder@skdh·
In a few years, Americans will see that the only consequence of defunding parts of academia was fewer publications which no one cares about anyway. What will happen next?
245
62
977
345.9K
Mark Kisin
Mark Kisin@marcusgnt·
@littmath Yes, it can now get much more difficult problems wrong!
0
0
2
252
Daniel Litt
Daniel Litt@littmath·
Just caught myself thinking “ugh, the AI couldn’t even compute this cohomology group correctly.” OK, admittedly things have come pretty far, pretty fast.
18
58
1.3K
95.3K
Hugh Hewitt
Hugh Hewitt@hughhewitt·
Spent 12 hours doing due diligence after a legacy media journalist told me security had been pulled from people. First, the threat from Iran against @mikepompeo, Brian Hook and others is very real and ongoing. This is what concerns me and should concern the president and everyone in the country right-to-left. Even Bill Clinton hammered Saddam Hussein when he thought about hitting George H.W. Bush. That was probably Clinton’s strongest moment as president. This decision, if not reversed, could well be President Trump’s weakest and a defining moment. Detail? Joe Kent, who is now chief of staff at ODNI and slated to lead NCTC, immediately said “Yes” when @SenTomCotton asked him yesterday if he would want security for himself or his family if he were in the reports the Senate Intel Committee reviewed. The intelligence they were discussing is classified but not that Q-and-A, and many people in the room heard it and can confirm. So President Trump and Secretary @marcorubio, neither of whom could have had a briefing on these ongoing, very specific threats given the timing of the decision to pull security, were very poorly served by whoever made this recommendation to them, if indeed they even made the decision. I can’t believe that, if they get the briefing, both don’t instantly restore the security. The Iranians still want to kill the president and everyone involved in the decision to stop Soleimani in Iraq before the Iranian general killed more Americans. Pulling security from anyone who is a target of an enemy state is weakness on parade, and constitutes waving a red flag in front of the theocrats and the IRGC. If, God forbid, anyone, much less a former Secretary of State, is attacked, wounded or killed by Iran, it will forever tarnish the reputation and legacy of the president and Secretary Rubio.
The president and Secretary Rubio should blast whoever made this decision for endangering not just the targets but the country, which would be greatly damaged if the assassins got through. Hopefully the president will reverse the State Department’s orders immediately and also talk specifically to the Iranians: “If you even think about this, I’ll flatten you.” The “hell to pay” comment got the attention of Hamas. POTUS and Secretary Rubio need to get the attention of the mullahs as well. In the meantime, restore the security details.
295
93
504
210.3K
Mark Kisin
Mark Kisin@marcusgnt·
@KonstantinKisin You’re suffering from WDS (Woke derangement syndrome). Mom is worried about you!
0
0
0
65
Konstantin Kisin
Konstantin Kisin@KonstantinKisin·
"If opposing this insanity makes me "right-wing", so be it. Ultimately, the choice is between civilisation and people who think men can give birth. Everything else is fluff." konstantinkisin.com/p/fine-call-me…
425
1.9K
14.3K
1.5M
Richard Hanania
Richard Hanania@RichardHanania·
Alright, Trump is out of his mind. There was every sign in the world, I thought, that he had enough of a sense of self-preservation to not go full MAHA. Like I thought he would just try to arrest his enemies, not abolish vaccines. New era of idiocy is upon us.
240
54
1.4K
364.3K
Mark Kisin
Mark Kisin@marcusgnt·
@NateSilver538 I remember when Nate used to be a data-driven guy. Now he’s just become the kind of pundit he used to scorn.
0
0
1
211
Nate Silver
Nate Silver@NateSilver538·
There's no escape because renominating Biden is fundamentally a bad idea, there are only bad arguments for it, and everyone will regularly be reminded of this in the form of things like bad polls, bad public appearances (or evasive actions designed to avoid them) and so on.
Conor Sen@conorsen

The AOC comments last night felt like it was starting to wind down but Bennet choosing to go on CNN tonight makes things feel a little more loose now. Guess we’ll see how the presser goes.

134
242
2.8K
426K
Split Ticket
Split Ticket@SplitTicket_·
ICYMI, our 2024 ratings for the House and Senate. HOUSE: 209 DEM, 207 REP, 19 Tossup SENATE: 51 REP, 47 DEM, 2 Tossup Presidential ratings will launch soon.
Split Ticket tweet media (two images)
27
35
297
171.2K
Phillips P. OBrien
Phillips P. OBrien@PhillipsPOBrien·
The interesting thing about the Medvedev threat to use nukes if Ukraine's counteroffensive succeeds is not the threat (he's done that lots). The really interesting thing is that he is speaking openly about Ukraine's counteroffensive being a success. theguardian.com/world/live/202…
Phillips P. OBrien tweet media
107
427
2.1K
361.3K