Infini-AI-Lab
@InfiniAILab
109 posts
Pittsburgh, PA · Joined September 2024
37 Following · 1.7K Followers
Infini-AI-Lab @InfiniAILab:
Result: high-sparsity attention with no visible quality loss—while speeding up attention enough to matter end-to-end for real-time generation. If attention is your bottleneck, MonarchRT is a drop-in path to lower latency. 🧵5/5
Infini-AI-Lab @InfiniAILab:
Video generation models are improving fast: real-time autoregressive models now deliver high quality at low latency, and they're quickly being adopted for world models and robotics applications. So what's the problem? They're still too slow on consumer hardware.

🚀 What if we told you that we can get true real-time 16 FPS video generation on a single RTX 5090? (1.5-12x over FlashAttention 2/3/4 on the 5090, H100, and B200)

Today we release MonarchRT 🦋, an efficient video attention that parameterizes attention maps as (tiled) Monarch matrices and delivers real end-to-end gains.

📄 Paper: arxiv.org/abs/2602.12271
🌐 Website: infini-ai-lab.github.io/MonarchRT
🔗 GitHub: github.com/Infini-AI-Lab/…
🧵1/n
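The structural idea behind a Monarch parameterization fits in a few lines: an n×n Monarch matrix (with n = m²) factors into two block-diagonal matrices interleaved with a stride permutation, so a matrix–vector product costs O(n·√n) instead of O(n²). This is a generic Monarch sketch with our own naming, not the paper's tiled attention kernel:

```python
import numpy as np

def monarch_matvec(A, B, x):
    """Multiply x by the n x n Monarch matrix P^T @ diag(B) @ P @ diag(A),
    where n = m*m, A and B each hold m blocks of shape (m, m), and P is
    the (m, m) transpose (stride) permutation. Cost: O(n * sqrt(n))."""
    m = A.shape[0]
    y = x.reshape(m, m)                  # split x into m chunks of size m
    y = np.einsum("bij,bj->bi", A, y)    # block-diagonal multiply by A
    y = y.T                              # stride permutation P
    y = np.einsum("bij,bj->bi", B, y)    # block-diagonal multiply by B
    return y.T.reshape(-1)               # undo the permutation, flatten

# Sanity check against the equivalent dense matrix.
m = 4
n = m * m
rng = np.random.default_rng(0)
A = rng.standard_normal((m, m, m))
B = rng.standard_normal((m, m, m))
x = rng.standard_normal(n)

DA, DB, P = np.zeros((n, n)), np.zeros((n, n)), np.zeros((n, n))
for b in range(m):
    DA[b*m:(b+1)*m, b*m:(b+1)*m] = A[b]
    DB[b*m:(b+1)*m, b*m:(b+1)*m] = B[b]
for i in range(m):
    for j in range(m):
        P[i*m + j, j*m + i] = 1.0

assert np.allclose(monarch_matvec(A, B, x), P.T @ DB @ P @ DA @ x)
```

MonarchRT applies this kind of structure to attention maps in tiled form; the factorization is what buys the sparsity/speed tradeoff the thread describes.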
Infini-AI-Lab reposted
Infini-AI-Lab @InfiniAILab:
Empirically, we evaluate Jackpot 🎰 under diverse and extreme mismatch settings:

🤪 Joint training: consistently outperforms TIS across extremely misaligned small→large model RL training setups: Qwen2.5 1.5B→3B, Qwen3 1.7B→4B, Qwen3 1.7B→8B.

🤪 Beyond two-model joint training: in highly stale off-policy regimes (large rollout batches), Jackpot enables:
• removing PPO clipping
• a convergence rate approaching on-policy training, and faster than the staleness baseline

[5/n]
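For reference, the TIS baseline mentioned above reduces to a one-line per-token correction: weight each token by the actor/rollout probability ratio, truncated at a cap c. Jackpot's own estimator differs (see the paper); this is only the baseline it is compared against, with illustrative names:

```python
import numpy as np

def tis_weights(logp_actor, logp_rollout, c=2.0):
    """Truncated importance sampling: w_t = min(c, pi_actor / pi_rollout),
    computed per token from log-probabilities. The cap c bounds variance
    when the rollout policy differs sharply from the actor."""
    ratio = np.exp(np.asarray(logp_actor) - np.asarray(logp_rollout))
    return np.minimum(ratio, c)

# On-policy tokens get weight 1; badly mismatched tokens are capped at c.
w = tis_weights(np.log([0.5, 0.9]), np.log([0.5, 0.1]), c=2.0)
# w ≈ [1.0, 2.0]
```

The cap trades bias for variance: without it, a single token the rollout model found unlikely can dominate the gradient.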
Infini-AI-Lab @InfiniAILab:
RL is notoriously unstable under actor–policy mismatch 😥, a common reality caused by kernel differences, MoE randomness, FP8 rollouts, or asynchronous pipelines.

But here's a crazy thought 🤔 👉 What if you could RL-train a large model using rollouts generated only by a weaker, faster, and completely different model? Sounds doomed from the start? 💩

We are releasing Jackpot 🎰💡, which enables training Qwen3-8B-Base using only rollouts generated by Qwen3-1.7B-Base.

✨ Jackpot is surprisingly powerful:
• Enables cheap, fast rollouts to train stronger models
• Dramatically changes the cost–performance tradeoff of RL training

We release Jackpot 🎰 in the following formats:
🌔 Paper: arxiv.org/abs/2602.06107
🌕 Code: github.com/Infini-AI-Lab/…
🌖 Blog: infini-ai-lab.github.io/jpt_website/

[1/n]
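The "removing PPO clipping" claim later in the thread is about the surrogate loss. A minimal value-only sketch of the two variants, with illustrative names (Jackpot's actual correction to the ratios is in the paper):

```python
import numpy as np

def pg_surrogate(logp_actor, logp_rollout, adv, clip_eps=None):
    """Policy-gradient surrogate with importance ratios r = pi_actor / pi_rollout.
    clip_eps=None gives the plain unclipped objective; a float gives the
    standard PPO clipped objective. Values only (no autograd), for intuition."""
    r = np.exp(np.asarray(logp_actor) - np.asarray(logp_rollout))
    adv = np.asarray(adv)
    if clip_eps is None:
        return -(r * adv).mean()
    r_clip = np.clip(r, 1.0 - clip_eps, 1.0 + clip_eps)
    return -np.minimum(r * adv, r_clip * adv).mean()

# A 2x off-policy token with positive advantage:
unclipped = pg_surrogate(np.log([0.5]), np.log([0.25]), [1.0])          # -2.0
clipped = pg_surrogate(np.log([0.5]), np.log([0.25]), [1.0], clip_eps=0.2)  # -1.2
```

Clipping exists precisely because raw ratios blow up under mismatch; a method that stays stable without it is making a strong claim about controlling that variance.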
Infini-AI-Lab reposted
Infini-AI-Lab @InfiniAILab:
🚀 InfiniAI Lab @ CMU is hiring postdocs! We are looking for outstanding postdoctoral researchers in ML systems and security to join InfiniAI Lab at Carnegie Mellon University.

Research directions include (but are not limited to):
🤖 AI Agents & RL
🔐 Machine Learning Security
🎥 Video Models
🏗️ AI Systems & Architecture Design

We especially encourage candidates interested in applying for the CMU–Bosch Institute (CBI) Postdoctoral Fellowship, which provides strong support for independent, high-impact research:
👉 carnegiebosch.cmu.edu/fellowships/in…
🗓️ CBI application deadline: January 30, 2026

How to apply: please fill out the form and send us an email via 👉 infini-ai-lab.cmu.edu/vacancies
Infini-AI-Lab @InfiniAILab:
The most fun part: interpretability 🔍 Token-specific STEM embeddings behave like steering vectors. Even with the same input text, swapping the STEM embedding can meaningfully shift the output distribution 🎛️✨
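The steering behavior is easy to picture with a toy version: the module adds a token-indexed embedding to the hidden state, so swapping which row is looked up shifts the layer's output by exactly the difference of the two rows. How the real module is placed and gated is in the paper; every name here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab = 8, 100
stem_table = rng.standard_normal((vocab, d))  # token-indexed lookup table

def layer_out(hidden, token_id):
    """Toy STEM-style layer: add the token-specific embedding to the
    hidden state before the rest of the block."""
    return hidden + stem_table[token_id]

h = rng.standard_normal(d)        # same "input text" in both calls
a = layer_out(h, token_id=3)
b = layer_out(h, token_id=7)
# Swapping the STEM row shifts the output by exactly the row difference:
assert np.allclose(a - b, stem_table[3] - stem_table[7])
```

In this additive picture, editing a table row is a localized intervention on one token's contribution, which is what makes knowledge editing through the table plausible.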
Infini-AI-Lab @InfiniAILab:
Where does STEM help most? Big wins on knowledge-heavy benchmarks (ARC-Challenge, OpenBookQA, MMLU)… …but also strong improvements on contextual reasoning (BBH, LongBench) & long-context tasks (NIAH, LongBench). 🧩⏳
Infini-AI-Lab @InfiniAILab:
Lookup memories are having a moment 😄 The whale 🐋 #deepseek dropped engram… and we dropped up-projections from our FFNs… perfect timing 😅

🥳 Introducing STEM: Scaling Transformers with Embedding Modules 🌱, a scalable way to boost parametric memory with extra perks:
✅ Stable training even at extreme sparsity
✅ Better quality for fewer training FLOPs (knowledge + reasoning + long-context gains)
✅ Efficient inference: ~33% of FFN params removed, plus CPU offload & async prefetch
✅ More interpretable → seamless knowledge editing 🔧🧠

Looking forward to DeepSeek v4… feels like we've only scratched the surface of embedding-lookup scaling 👀

📄 Paper: arxiv.org/abs/2601.10639
🌐 Website: infini-ai-lab.github.io/STEM
🔗 GitHub: github.com/Infini-AI-Lab/…
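The efficiency claim comes from swapping matmul parameters for a lookup: a per-token gather reads one row of a table instead of paying for extra up-projection columns, and a table indexed by token id can live in CPU memory with rows prefetched ahead of the forward pass. A toy sketch only; the exact split of FFN parameters, any gating, and the async prefetch machinery are in the paper, and every name here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_ff, vocab = 16, 64, 1000
W_up = rng.standard_normal((d, d_ff)) / np.sqrt(d)
W_down = rng.standard_normal((d_ff, d)) / np.sqrt(d_ff)
stem = rng.standard_normal((vocab, d_ff))  # large table, offloadable to CPU

def ffn_with_stem(x, token_id):
    """Toy FFN where a token-indexed lookup contributes to the hidden
    activation: stem[token_id] is a gather (one row of memory traffic),
    not a matmul, so growing `vocab` adds capacity without adding FLOPs."""
    h = np.maximum(x @ W_up + stem[token_id], 0.0)  # ReLU for simplicity
    return h @ W_down

y = ffn_with_stem(rng.standard_normal(d), token_id=5)
assert y.shape == (d,)
```

Because the row needed at each position depends only on the token id, it is known before the layer runs, which is what makes CPU offload with async prefetch workable.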