Biqing Qi

14 posts

@BiqingQ

Research Scientist at Shanghai AI Lab

Joined January 2022
99 Following · 14 Followers
Biqing Qi retweeted
AI-Insight (@AI_Insight_Talk)
Found an interesting next model architecture exploration work from Shanghai AI Lab: SDAR, a new paradigm that converts trained AR models into blockwise diffusion models for FAST parallel decoding!
✅ AR's training efficiency
✅ Diffusion's inference speed
The 30B MoE model even beats pure AR baselines on GPQA and ChemBench.
HF Papers: huggingface.co/papers/2510.06…
Model (1.7B/4B/8B/30B-A3B): huggingface.co/collections/Je…
Biqing Qi (@BiqingQ)
💥 Token-level AR is hitting its limits. We introduce SDAR: AR for pretraining, then switch to Diffusion+AR for SFT.
✅ 2×+ faster inference
✅ Equal/better accuracy
✅ Works for general & reasoning tasks
🔗 jetastra.github.io/SDAR/
Biqing Qi retweeted
AK (@_akhaliq)
TTRL: Test-Time Reinforcement Learning
Biqing Qi retweeted
Kaiyan Zhang (@OkhayIea)
🧠 Again! Introducing MARTI: Multi-Agent Reinforced Training and Inference
A unified framework for LLM-based Multi-Agent Systems with centralized interaction & distributed policy training. Supports structured workflows (debate, MoA, chain), custom rewards, and 3rd-party MAS (e.g., AutoGen, CAMEL).
📈 Preliminary Highlights:
Multi-agent RL > single-agent RL baselines under same inference budget
MARTI-trained Qwen2.5-3B > standard RL & rivals instruct variants
Strong results on AIME (66.7 score with TTRL + MAD, DeepScaleR-1.5B)
🧪 PPO | GRPO | REINFORCE++ | TTRL
🚀 vLLM V1 & Hybrid Engine compatible
🛠️ github.com/TsinghuaC3I/MA…
🤝 Collabs welcome!
#LLM #MultiAgent #ReinforcementLearning #RLHF #AIResearch #OpenSource #MARTI
Biqing Qi retweeted
Qiushi Sun (@qiushi_sun)
🎉 Introducing our latest work: "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"
🤗 Huggingface: huggingface.co/papers/2505.19…
🏠 Homepage: qiushisun.github.io/ScienceBoard-H…
TLDR: We introduce ScienceBoard, featuring (1) a dynamic OS env with real scientific software (CLI + GUI), and (2) a human-validated benchmark spanning domains like biochem, astronomy, GIS, ATP, and more.
🧵 [1/5]
AK (@_akhaliq)
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Biqing Qi (@BiqingQ)
🚀 Exciting news! Our paper has been accepted at CVPR 2024! 🎉 Dive into the future of machine learning with our groundbreaking work on Interactive Continual Learning, based on a System 1 and System 2 framework: arxiv.org/abs/2403.02628
Code is available at: github.com/Biqing-Qi/Inte…