Mazen — sa/acc

2.7K posts

@ma7dev

building at @deepforai; @cursor_ai ambassador | prev: @malaa_tech @oregonstate | @pytorch award winner | ms & bs @oregonstate

Riyadh · Joined April 2023

992 Following · 3.2K Followers

Pinned Tweet

Mazen — sa/acc@ma7dev·17 Oct

yay! thanks @marksaroufim 🙏🙏

English

80K

Mazen — sa/acc retweeted

Nous Research@NousResearch·1d

Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data. During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining. Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE. The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.

English

142

392

3.5K

387.7K
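The bag-of-tokens phase described in the Nous Research post above can be sketched in a few lines. This is a rough illustration only, not the released TST code: the post does not spell out the "modified cross-entropy", so the loss below assumes it is the mean negative log-likelihood of every token in the next bag, and it assumes bags are non-overlapping. The names `bag_inputs`, `bag_cross_entropy`, and `bag_size` are hypothetical.

```python
import math

def bag_inputs(token_ids, embedding, bag_size):
    """Average the embeddings of each contiguous, non-overlapping bag of tokens."""
    dim = len(embedding[0])
    usable = len(token_ids) - len(token_ids) % bag_size  # drop a ragged tail
    bags = [token_ids[i:i + bag_size] for i in range(0, usable, bag_size)]
    return [[sum(embedding[t][d] for t in bag) / bag_size for d in range(dim)]
            for bag in bags]

def bag_cross_entropy(logits, next_bag):
    """Assumed loss: mean negative log-likelihood of every token in the next bag."""
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return sum(log_z - logits[t] for t in next_bag) / len(next_bag)

# Toy vocabulary of 4 tokens with 2-dimensional embeddings (made-up values).
embedding = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0], [6.0, 0.0]]
inputs = bag_inputs([1, 2, 3, 0], embedding, bag_size=2)
loss = bag_cross_entropy([0.0, 0.0, 0.0, 0.0], next_bag=[1, 2])
```

With uniform logits over four tokens the loss equals log 4, which is a quick sanity check that the bag loss reduces to ordinary cross-entropy when the bag holds a single token.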

Mazen — sa/acc retweeted

Apurva Gandhi@apurvasgandhi·6d

Sub-agents are a promising inference-time scaling primitive: • Expand an agent's working memory • Divide-and-conquer hard problems • Solve problems faster with parallel execution But how do we train a model to best take advantage of sub-agents and make sure we get these benefits? Very excited to release RAO: Recursive Agent Optimization. RAO is an end-to-end reinforcement learning approach for training LLM agents to spawn, delegate to, and coordinate with recursive copies of themselves (that can themselves spawn other agents) - turning recursive inference into a learned capability. 1/10

English

114

693

127.8K

Mazen — sa/acc retweeted

Elliot Arledge@elliotarledge·4d

Neet@neet_sol

i don’t want a job, i need more leverage

341

18.9K

Mazen — sa/acc retweeted

mazen@mznmel·5d

A new step in the journey of building the best Saudi RAG: alsiyaq.deep.sa

Arabic

200

107K

Mazen — sa/acc retweeted

SANI | صانع@devWithSANI·4 May

A dream without a path... stays a dream. With Masar, you learn tech in a way that keeps pace with a world that changes every day. Register now and #اصنع_مسارك (make your path)

Arabic

164

99.1K

Mazen — sa/acc retweeted

ممدوح الظفيري@MamdouhAI·3 May

Hi, @_y0u_0 and I have launched a course on Agentic Engineering with Claude Code. In it we explain how to build top-quality products and systems with agents! I'm betting on the quality of the course and the information in it.

Arabic

128

1.9K

2.4M

Mazen — sa/acc retweeted

Sudo su@sudoingX·28 Apr

"how do you fit qwen 3.6 27b q4 on 24gb at 262k context" lands in my dms 5 times a week. here is the exact memory math.

model bytes at idle = 16gb (q4_k_m of 27b dense)
kv cache at 262k context with q4_0 for both k and v = 5gb
total = 21gb on the card
headroom = 3gb for prompts and tool call traces

the magic is the kv cache type. most people leave it at default fp16 or push to q8 thinking quality wins. on qwen 3.6 27b dense at 262k:
- fp16 kv cache = does not fit at all
- q8 kv cache = fits at 23gb but runs 3x slower (double penalty: more vram, less speed)
- q4_0 kv cache = fits at 21gb at full speed (40 tok/s flat curve, same speed at 4k or 262k)

most builders never test the kv cache type because tutorials never mention it. it is the single biggest unlock on consumer 24gb hardware.

flags i run:
./llama-server -m Qwen3.6-27B-Q4_K_M.gguf -ngl 99 -c 262144 -np 1 -fa on --cache-type-k q4_0 --cache-type-v q4_0

what they do:
-ngl 99 = offload everything to gpu
-c 262144 = 262k context window
-np 1 = single user slot (do not enable multi-slot, eats headroom)
-fa on = flash attention on (memory and speed both win)
--cache-type-k q4_0 --cache-type-v q4_0 = the unlock

if you are sitting on 24gb and not running this config, you are leaving 250k of context on the table. or worse, you are running q8 kv cache and burning 3x your speed for nothing. q4 is not a compromise on consumer hardware. it is the right call.

English

110

1.3K

74.5K
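The memory math in the Sudo su post above can be sketched as a small calculator. This is an illustration under stated assumptions, not a reproduction of the post's numbers: the layer count, KV-head count, and head dimension below are hypothetical placeholders rather than Qwen's real configuration, and the formula assumes every layer keeps a full-attention KV cache, which may not hold for this model. The bits-per-element values follow the ggml block layouts (q8_0 stores 32 values in 34 bytes, q4_0 stores 32 values in 18 bytes). The helper names are made up for the example.

```python
# Bits per cached element for three llama.cpp cache types.
BITS_PER_ELEMENT = {"f16": 16.0, "q8_0": 8.5, "q4_0": 4.5}

def kv_cache_gib(n_layers, n_kv_heads, head_dim, context, cache_type):
    """Size in GiB of the K and V caches for one sequence slot."""
    elements = 2 * n_layers * n_kv_heads * head_dim * context  # 2 = K and V
    return elements * BITS_PER_ELEMENT[cache_type] / 8 / 2**30

def budget(model_gb, kv_gb, card_gb):
    """Return (total used, headroom left) for a given card size."""
    total = model_gb + kv_gb
    return total, card_gb - total

# Hypothetical architecture: 32 layers, 8 KV heads, head_dim 128 (NOT Qwen's).
for cache_type in ("f16", "q8_0", "q4_0"):
    size = kv_cache_gib(32, 8, 128, 262144, cache_type)
    print(f"{cache_type}: {size:.1f} GiB")  # 32.0, 17.0, 9.0 for this toy config

# The post's own figures: 16gb model + 5gb KV cache on a 24gb card.
print(budget(16, 5, 24))  # (21, 3)
```

The point of the sketch is the scaling: cache size grows linearly with context and with bits per element, so moving from f16 to q4_0 cuts the cache to 4.5/16 of its size at any context length.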

Mazen — sa/acc retweeted

⚚Sage@belikesagee·27 Apr

Me in a Teams meeting, waiting to say "Nothing From my side"

English

295

12.7K

75.1K

2.2M

Mazen — sa/acc retweeted

Rio Rinaldi@udenrio·24 Apr

Feeling stupid = u are actually learning.

Curious Minds@CuriousMindsHub

The importance of stupidity in scientific research:

English

27.3K

108.7K

Mazen — sa/acc retweeted

Qwen@Alibaba_Qwen·22 Apr

🚀 Meet Qwen3.6-27B, our latest dense, open-source model, packing flagship-level coding power! Yes, 27B, and Qwen3.6-27B punches way above its weight. 👇 What's new: 🧠 Outstanding agentic coding — surpasses Qwen3.5-397B-A17B across all major coding benchmarks 💡 Strong reasoning across text & multimodal tasks 🔄 Supports thinking & non-thinking modes ✅ Apache 2.0 — fully open, fully yours Smaller model. Bigger results. Community's favorite. ❤️ We can't wait to see what you build with Qwen3.6-27B! 👀 🔗👇 Blog: qwen.ai/blog?id=qwen3.… Qwen Studio: chat.qwen.ai/?models=qwen3.… Github: github.com/QwenLM/Qwen3.6 Hugging Face: huggingface.co/Qwen/Qwen3.6-2… huggingface.co/Qwen/Qwen3.6-2… ModelScope: modelscope.cn/models/Qwen/Qw… modelscope.cn/models/Qwen/Qw…

English

531

1.7K

12.5K

3.7M

Mazen — sa/acc@ma7dev·22 Apr

@benln Congrats to you and your wife! 🙏❤️

English

249

Ben Lang@benln·22 Apr

Baby sister arrived yesterday. Deeply grateful.

English

145

991

49.7K

Mazen — sa/acc retweeted

Dal | دال@DalData_sa·22 Apr

3 days until the start of the #أبنِ_وأطلق hackathon 🌟 The gaming market in Saudi Arabia is growing fast. Register with us to be part of this journey 🚀 Register now: luma.com/8571jsfj

Arabic

2.8K

Mazen — sa/acc retweeted

Michael Truell@mntruell·22 Apr

Excited to partner with the SpaceX team to scale up Composer. A meaningful step on our path to build the best place to code with AI.

SpaceX@SpaceX

SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.

English

484

1.2K

10.4K

1.6M

Mazen — sa/acc retweeted

elie@eliebakouch·20 Apr

kimi K2.6 vs K2.5, mythos, opus 4.7, and cursor composer 2 (based on K2.5) on every benchmark i could find tl;dr: it's a really really good model

Kimi.ai@Kimi_Moonshot

Meet Kimi K2.6: Advancing Open-Source Coding 🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python(86.7), Math Vision w/ python (93.2) What's new: 🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization). 🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D. 🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files. 🔹Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc for 24/7 autonomous ops. 🔹Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop. - K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code - 🔗 API: platform.moonshot.ai 🔗 Tech blog: kimi.com/blog/kimi-k2-6 🔗 Weights & code: huggingface.co/moonshotai/Kim…

English

135

1.7K

149.1K

Mazen — sa/acc retweeted

Dal | دال@DalData_sa·20 Apr

How are games built? And how do you build a game from scratch? At Dal's #ابنِ_وأطلق hackathon you will find the answer. Choose your track, if you like, from the five tracks, and in just four hours you will be able to build a complete game from scratch with the help of AI tools 🚀 *Attendees will receive free Cursor credit. Register now! luma.com/8571jsfj

Arabic

7.9K

Mazen — sa/acc@ma7dev·20 Apr

Come on, gamers, see you there

Dal | دال@DalData_sa

How are games built? And how do I build a game from scratch? Dal's #ابنِ_وأطلق hackathon has the answer. Choose your track from 5 tracks: 1- building a game with a game engine, 2- a browser game, 3- a mobile game, 4- in-game character algorithms, 5- a website and platform for players. In just four hours you will be able to build a complete game from scratch with the help of AI tools 🚀 *Attendees will receive free Cursor credit. Register now! luma.com/8571jsfj

Arabic

117

Mazen — sa/acc retweeted

Raymond Weitekamp@raw_works·18 Apr

sorry it took me ~50 hrs! now i've got DSPy.RLM as SOTA on LongCOT (Full) by a very large margin, using... ...drumroll... Qwen 3.5 9B! 👑 Qwen3.5-9B + dspy.RLM = 15.69% on LongCoT-full 🔥 ~1.6× GPT 5.2's 9.83% on the same slice!

Raymond Weitekamp@raw_works

ok so the default DSPy.RLM is literally going to destroy this benchmark before the end of the day. running now for sonnet 4.5... 🏆 Scoreboard (live) RLM: 90/94 (95.7%) Vanilla: 0/94 (0.0%) anyone want to pay for the opus run? 😉

English

603

125.4K

Mazen — sa/acc retweeted

Dal | دال@DalData_sa·18 Apr

Interested in building your own product from scratch to launch using AI tools? Dal's new initiative #ابنِ_وأطلق is the ideal environment to pursue that interest, with participants supported by experts in AI tools 🔥 Register now for the first edition of the initiative, titled: "Build & Ship a Game in 4 Hours" luma.com/8571js

Arabic

9.4K

Mazen — sa/acc retweeted

Cursor@cursor_ai·17 Apr

Through the end of this weekend, we are doubling Composer 2 usage limits inside of Cursor's new agents window. Enjoy!

English

120

2.2K

164.3K

Mazen — sa/acc retweeted

Dal | دال@DalData_sa·16 Apr

Interested in developing your first game? We invite you to register for the first edition of #ابني_واطلق, supported by Cursor and led by Eng. Mazen Alotaibi, Cursor ambassador in Riyadh. You will be able to develop a game from scratch with the help of AI in just four hours 🚀 Attendees will receive free Cursor credit. Seats are limited, so seize the opportunity and register now! luma.com/8571jsfj

Arabic

1.8K
