
Eng. SomeOne
528 posts

@EngS85
AI | Machine Learning Engineer | CV | Data Scientist in HEALTHCARE #Artificial_Intelligence #Machine_Learning #Deep_Learning #AI_Engineering




Wow, what defensive organization, wow, River Plate. If even Ben Harburg took a shot at you, it would end up in your net







A few more observations after replicating the Tower of Hanoi game with their exact prompts:
- You need AT LEAST 2^N - 1 moves, and the output format requires 10 tokens per move plus some constant overhead.
- Furthermore, the output limit is 128k tokens for Sonnet 3.7, 64k for DeepSeek R1, and 100k for o3-mini. This includes the reasoning tokens they use before outputting their final answer!
- All models will have 0 accuracy with more than 13 disks simply because they cannot output that much!
- The max solvable sizes WITHOUT ANY ROOM FOR REASONING, floor(log2(output_limit/10)), are: DeepSeek: 12 disks; Sonnet 3.7 and o3-mini: 13 disks.
- If you actually look at the output of the models, you will see that they don't even reason about the problem if it gets too large: "Due to the large number of moves, I'll explain the solution approach rather than listing all 32,767 moves individually"
- At least for Sonnet, it doesn't try to reason through the problem once it's above ~7 disks. It will state what the problem is and the algorithm to solve it, and then output its solution without even thinking about individual steps.
- It's also interesting to look at the models as having an X% chance of picking the correct token at each move. Even with a 99.99% probability, the models will eventually make an error simply because of the exponentially growing problem size.
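The arithmetic in the post above can be checked with a short sketch. It takes the 10 tokens per move and the three output limits as quoted, and it assumes that "k" means 1,000 tokens, that the constant overhead is negligible, and that errors on different moves are independent (the post states none of these explicitly). The function names are made up for illustration.

```python
import math

TOKENS_PER_MOVE = 10  # figure quoted in the post; constant overhead is ignored here

# Output limits as quoted in the post; reading "k" as 1,000 tokens is an assumption.
OUTPUT_LIMITS = {
    "Sonnet 3.7": 128_000,
    "DeepSeek R1": 64_000,
    "o3-mini": 100_000,
}


def min_moves(disks: int) -> int:
    """Minimum number of moves needed to solve Tower of Hanoi with `disks` disks."""
    return 2 ** disks - 1


def max_disks(output_limit: int, tokens_per_move: int = TOKENS_PER_MOVE) -> int:
    """Largest disk count whose full move list fits in the output limit,
    using the post's formula floor(log2(output_limit / tokens_per_move))."""
    return math.floor(math.log2(output_limit / tokens_per_move))


def p_all_moves_correct(disks: int, p_move: float = 0.9999) -> float:
    """Chance of a flawless solution if every move is correct with probability
    `p_move`, independently of the others (a simplifying assumption)."""
    return p_move ** min_moves(disks)


if __name__ == "__main__":
    for model, limit in OUTPUT_LIMITS.items():
        print(f"{model}: at most {max_disks(limit)} disks with no room for reasoning")
    for n in (7, 10, 13, 15):
        print(f"{n} disks: {min_moves(n)} moves, "
              f"P(no error) = {p_all_moves_correct(n):.3f}")
```

Under these assumptions the success probability falls off quickly with the number of disks, since the exponent itself grows as 2^N - 1.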



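Someone published a response to Apple's paper and titled it The Illusion of "The Illusion of Thinking" 😂 And to take the mockery further, they listed an AI model, Claude Opus, as the first author 🤣 It reminded me of al-Ghazali's book "Tahafut al-Falasifa" (The Incoherence of the Philosophers) and Ibn Rushd's reply to it in the book "Tahafut al-Tahafut" (The Incoherence of the Incoherence) 😂 Sarcastic title aside, the rebuttals are sound, and I advise everyone who rushed to conclusions and ran away with the story, as we say, to read the response and appreciate the value and benefit of scientific review and peer review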




















