Mazen — sa/acc
2.7K posts

@ma7dev
building at @deepforai; @cursor_ai ambassador | prev: @malaa_tech @oregonstate | @pytorch award winner | ms & bs @oregonstate

The importance of stupidity in scientific research:

SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.


Meet Kimi K2.6: Advancing Open-Source Coding
🔹 Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)
What's new:
🔹 Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹 Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹 Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹 Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc. for 24/7 autonomous ops.
🔹 Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop.
K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…



How are games built? And how do I build a game from scratch? The Dal Hackathon #ابنِ_وأطلق ("Build and Launch") has the answer.
Choose your track from 5 tracks:
1- Building a game with a game engine
2- A browser game
3- A mobile game
4- Algorithms for in-game characters
5- A website and platform for players
In just four hours, you will be able to build a complete game from scratch with the help of AI tools 🚀
*Attendees will receive free Cursor credit.
Register now! luma.com/8571jsfj

ok so the default DSPy.RLM is literally going to destroy this benchmark before the end of the day. running now for sonnet 4.5...
🏆 Scoreboard (live)
RLM: 90/94 (95.7%)
Vanilla: 0/94 (0.0%)
anyone want to pay for the opus run? 😉
