Chetaslua
10.3K posts

Chetaslua
@chetaslua
AI Insider / Reporter featured in BGR • HackerNews • GIGAZINE • 36Kr | AI Prompting and Testing | Vibe Benchmark and Vibe Marketing



Google DeepMind just dropped one of the clearest signals yet for where math is going. The @GoogleDeepMind AlphaProof Nexus agent autonomously resolved 9 of 353 open Erdős problems, with the proofs checked in Lean. The reported inference cost: a few hundred dollars per problem. That is wild. Not because 9/353 means math is "solved". It clearly doesn’t. Most problems still resisted, and the full search cost is more complicated. But the direction is obvious: AI agents are moving from contest math into real research-level mathematics. And formal verification turns the output from "sounds plausible" into "actually compiles". generate, test, verify, repeat.



HOLLLLY SHIIIIIT 😳 GPT 5.5 xhigh in codex made this paper physics website with wind effect one shot check the physics , ui and interaction @sama you guys cooked and this is my first time i can suggest chatgpt is go to ai solution for everything




Holy shiitt Gemini 3.2 is insanely fast and intelligent 🤯 passed voxel cube test with more than flying colours , it gave so much better output that i didnot even asked for 1700 lines of codes - 48 seconds and solved it , proof attached so that everyone can see @demishassabis thanks for hearing us and gave us first non lazy google model 🥹



ANTHROPIC 🔥: Mythos 1, "claude-mythos-1-preview", is being prepared for a release on Claude Code and Claude Security. The model became visible for a short amount of time on Claude; besides that, new strings mentioning Mythos have been added. > Access to the Claude Mythos model in Claude Code and Claude Security. It still doesn't mean the general public will have access to this exact model, according to Anthropic's earlier communication. More below 👇





Gemini 3.5 Flash vs GPT-5.5 instant vs Sonnet 4.6 Remember guys. #1 in Finance Agent v2. SOTA performance right here. lol 🤣 Prompt : " 300+140=460 Is this correct? Breakdown? "


Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946. For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids. An OpenAI model has now disproved that belief, discovering an entirely new family of constructions that performs better. This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics.

Gemini 3.5 Flash vs GPT-5.5 instant vs Sonnet 4.6 Remember guys. #1 in Finance Agent v2. SOTA performance right here. lol 🤣 Prompt : " 300+140=460 Is this correct? Breakdown? "





