
#Thiqah took part in the #ArabicNLP2024 conference in Thailand, presenting a research paper it produced in collaboration with #KAUST, aimed at measuring the performance of language models and developing smarter applications. #Breaking_News For more: ow.ly/cKq950SZb4a
🇵🇸🇸🇦 Faris Hijazi (e/acc)
1K posts

@theeFaris
Just a man doing what he can with what he has. pro open source decentralized AGI 🐱💻AI Hacker 🪚DOOM enjoyer 🗿@KFUPM https://t.co/pRmnNMs3Vg


Matt Maher tested frontier models in Cursor v. other harnesses. Cursor boosted model performance by 11% on average:
Gemini: 52% → 57%
GPT-5.4: 82% → 88%
Opus: 77% → 93%
His benchmark measures how well models implement a 100-feature PRD. @cursor_ai consistently outperformed.




One of the biggest realizations I have had this year is that the model race ended and nobody noticed. The real race is inference. The real moat is inference. I just read the best breakdown of inference engineering I have come across. Read this article.





Will the ChatGPT app store be the next big App Store? Here's how to take advantage of this opportunity:
3:09 - What Are ChatGPT Apps?
8:25 - Architecture & How They're Built
33:12 - Improving with Evals
52:01 - Ideas for Builders








