Antony Huerto
























As the images show, given the distance, the Real Madrid players cannot have heard what it is being claimed they heard.








Which should you build first: the frontend, the backend, or the database? I always start with the database: I prefer to have the relationships defined first, then I move on to the API logic, and finally the visual layer. How do you all do it?




Are frontier AI models really capable of “PhD-level” reasoning? To answer this question, we introduce FormulaOne, a new reasoning benchmark of expert-level Dynamic Programming problems. We have curated a benchmark consisting of three tiers, in increasing complexity, which we call ‘shallow’, ‘deeper’, ‘deepest’. The results are remarkable:
- On the ‘shallow’ tier, top models reach performance of 50%-70%, indicating that the models are familiar with the subject matter.
- On ‘deeper’, Grok 4, Gemini-Pro, o3-Pro, Opus-4 all solve at most 1/100 problems. GPT-5 Pro is significantly better, but still solves only 4/100 problems.
- On ‘deepest’, all models collapse to 0% success rate.
🧵












