Erin Carmody

2.9K posts

@uncountableart

https://t.co/unx2g39A3m

Joined August 2021
420 Following · 642 Followers
Erin Carmody @uncountableart
@ShakeelHashim I've realized that since they spent so much money for their AI machines to work, it's just OK to lie and make things up.
Erin Carmody retweeted
Przemek Chojecki | PC @prz_chojecki
STATUS UPDATE: REVISION NEEDED Currently there's an error flagged by @littmath in the proof of Proposition 12.55 that needs correction to have the full argument. There are also some issues raised by @Tomodovodoo here: github.com/Tomodovodoo/de… (haven't checked them yet) If you found other issues let me know, happy to add them to the working queue. If you want to collaborate on making it into a publishable result, also DM!
Przemek Chojecki | PC @prz_chojecki

My multi-agent harness powered by GPT-5.4 settled a FrontierMath Open Problem. The result of 2 weeks of 5-10 agents working 24/7: there are no char 3 rank 1 del Pezzo surfaces with more than 7 singularities. This settles the problem to the negative. Details below.

Erin Carmody retweeted
Franz Schreiber @FJSchreiber
@prz_chojecki The language "this settles the problem to the negative" does not match "I trust the process [of the agents having worked out the proof correctly]" at all. I really think you should be much less definitive in your wording.
Erin Carmody @uncountableart
@prz_chojecki @littmath You "trust the process"? Why? AI makes errors constantly and doesn't even know what a prime number is...
Przemek Chojecki | PC @prz_chojecki
I trust the process here. Models didn't write more than 5-10 pages at a time, and there was always a verifier/reviewer model looking over individual additions for correctness. Finally, the orchestrator mapped out the arguments and checked them too. I didn’t check all the computations myself, but I've read through the strategy and some individual pieces in more detail. That said, I expect there will be more than one correction to be made in the end, as is the case with any human-written paper of that size too.
Przemek Chojecki | PC @prz_chojecki
My multi-agent harness powered by GPT-5.4 settled a FrontierMath Open Problem. The result of 2 weeks of 5-10 agents working 24/7: there are no char 3 rank 1 del Pezzo surfaces with more than 7 singularities. This settles the problem to the negative. Details below.
Erin Carmody retweeted
Daniel Litt @littmath
@prz_chojecki Sorry, I'm not going to go back and forth with your LLM. Please check the paper yourself!
Erin Carmody @uncountableart
@JDHamkins Congrats Joel! It looks great! The essays and images are so good. Such a variety of topics. I feel like this could even complement a calculus class.
Erin Carmody @uncountableart
I gave a talk yesterday at the CUNY Graduate Center and got a lot of positive feedback and comments. A collaborator of some of the students sent me a link to a coding challenge that has a similar vibe to killing primes. Check it out: codegolf.stackexchange.com/questions/1489…
Erin Carmody @uncountableart
I saw Astoria today in Battery Park! And it had a human friend nearby watching out 😊
Erin Carmody retweeted
Dom Lucre | Breaker of Narratives
🔥🚨JUST IN: The Unitree humanoid robot was spotted running and playing with children on the streets of New York City.
Erin Carmody @uncountableart
@wtgowers I tried a game and got something right accidentally. Then tried again today and didn't guess correctly but it looks like something is working. I'm going to read it carefully and try to actually play it. Looks interesting!
Timothy Gowers @wtgowers
I think now both games are working properly, with all puzzles showing up. (I've checked this by opening them in incognito mode.)
Timothy Gowers @wtgowers
I've created a couple of mathematical games, both based on word problems in groups or semigroups. One of them could lead to a Polymath project if enough people are interested in it, as it is connected with an open problem. More details in a blog post linked below. 1/2
Wojciech Aleksander Wołoszyn
@uncountableart @unusual_whales But then the same logic would apply—every trade has two sides :) I just checked and by sellers they mean number of homes listed for sale and by buyers they mean an estimated number of people actively trying to buy :)
unusual_whales @unusual_whales
Home sellers now exceed buyers by over 600,000, marking the widest gap on record, per Redfin.