Artificial Intelligence 🤖
8.6K posts

@ArabMindX
🚀 I teach you to use AI the way professionals do 🧠 Prompts | Tools | Apps 📩 Follow me and change the way you work

🚀 Introducing FlashQLA: high-performance linear attention kernels built on TileLang.
⚡ 2–3× forward speedup. 2× backward speedup.
💻 Purpose-built for agentic AI on your personal devices.
💡 Key insights:
1. Gate-driven automatic intra-card CP.
2. Hardware-friendly algebraic reformulation.
3. TileLang fused warp-specialized kernels.
FlashQLA boosts SM utilization via automatic intra-device CP. The gains are especially pronounced for TP setups, small models, and long-context workloads.
Instead of fusing the entire GDN flow into a single kernel, we split it into two kernels optimized for CP and backward efficiency. At large batch sizes this incurs extra memory I/O overhead vs. a fully fused approach, but it delivers better real-world performance on edge devices and long-context workloads.
The backward pass was the hardest part: we built a 16-stage warp-specialized pipeline under extremely tight on-chip memory constraints, ultimately achieving 2×+ kernel-level speedups.
We hope this is useful to the community! 🫶🫶
Learn more:
📖 Blog: qwen.ai/blog?id=flashq…
💻 Code: github.com/QwenLM/FlashQLA

The DeepSeek-V4-Pro discount has been extended until May 31, 2026, 15:59 UTC!

Adobe for creativity + Claude 🤝 Now, Claude users can power their content with more than 50 Creative Cloud tools. Simply describe the outcome you want and let the assistant orchestrate workflows behind the scenes: adobe.ly/4cTkJjF
