Nicoló Paternoster 🦇🔊
2.2K posts

Nicoló Paternoster 🦇🔊
@adv4nced
unless it comes out of your soul like a rocket, don't do it. ❝ Verba volant, bits persist ❞










Can @AnthropicAI Claude 3.5 sonnet outperform @OpenAI o1 in reasoning? Combining Dynamic Chain of Thoughts, reflection, and verbal reinforcement, existing LLMs like Claude 3.5 Sonnet can be prompted to increase test-time compute and match reasoning strong models like OpenAI o1. 👀 TL;DR: 🧠 Combines Dynamic Chain of thoughts + reflection + verbal reinforcement prompting 📊 Benchmarked against tough academic tests (JEE Advanced, UPSC, IMO, Putnam) 🏆 Claude 3.5 Sonnet outperformes GPT-4 and matched O1 models 🔍 LLMs can create internal simulations and take 50+ reasoning steps for complex problems 📚 Works for smaller, open models like Llama 3.1 8B +10% (Llama 3.1 8B 33/48 vs GPT-4o 36/48) ❌ Didn’t benchmark like MMLU, MMLU pro, or GPQA due to computing and budget constraints 📈 High token usage - Claude Sonnet 3.5 used around 1 million tokens for just 7 questions

















