Bruno Andreis (@andreisbru) - Twitter Profile

Bruno Andreis retweeted

Very excited to announce HorizonMath with @erikyw26 and collaborators! How can we measure AI progress on mathematical discovery? It turns out there are several classes of problems where discovery is hard but verification is easy. We developed a benchmark with 101 such problems and tested GPT 5.4 Pro, Claude 4.6 Opus, and Gemini 3.1 Pro. Pending expert review, GPT 5.4 Pro found two potentially novel solutions that beat existing baselines 🧵

