Han Fang

1.4K posts


@Han_Fang_

AI Research @ Meta SuperIntelligence Labs

United States · Joined August 2011
196 Following · 1.7K Followers
Pinned Tweet
Han Fang@Han_Fang_·
R-Zero (ICLR 2026): a self-evolving LLM trained from zero external data. One base model plays two roles: a Challenger that generates hard problems and a Solver that solves them. The Challenger is rewarded when the Solver fails, and the two co-evolve with GRPO, so the Challenger learns to probe for weaknesses, not just to generate hard problems. +6.49 on math and +7.54 on general reasoning for Qwen3-4B-Base, after three iterations and no human data. arxiv.org/abs/2508.05004
5 replies · 55 retweets · 395 likes · 23.5K views
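A minimal sketch of the Challenger reward loop described in the tweet above, assuming hypothetical `solver` and `checker` callables in place of real model and verifier calls; the actual R-Zero objective may shape the reward differently (a common self-play variant that peaks near a 50% solve rate is shown as an alternative):

```python
from typing import Callable

def challenger_reward(problem: str,
                      solver: Callable[[str], str],
                      checker: Callable[[str, str], bool],
                      n_samples: int = 8) -> float:
    """Reward the Challenger by how often the Solver fails on its problem.

    Hypothetical sketch: `solver` samples an answer, `checker` verifies it.
    """
    attempts = [solver(problem) for _ in range(n_samples)]
    return sum(not checker(problem, a) for a in attempts) / n_samples

def frontier_reward(fail_rate: float) -> float:
    # Alternative shaping (an assumption, not from the tweet): reward is
    # maximal at a 50% solve rate, so problems stay hard but learnable,
    # which is one way a Challenger "probes for weaknesses".
    return 1.0 - 2.0 * abs(fail_rate - 0.5)
```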
Han Fang@Han_Fang_·
@XFreeze @elonmusk Instruction following on easy-to-verify instructions isn't sufficient; that's what IFBench measures. The real test is the instructions users actually ask. For example, do you think the response below is in your tone?
1 reply · 0 retweets · 1 like · 273 views
X Freeze@XFreeze·
Grok 4.20 just claimed #1 on IFBench (Artificial Analysis), the gold standard for instruction following, with an 81% score, outranking every other model. And here is what that actually means: when you ask Grok to do something, it doesn't give you a close-enough answer. It doesn't approximate. It doesn't go off-script. It follows your instructions. Precisely. Every time. xAI is not just racing to build the most intelligent AI; they are also building the most reliable one. An AI that actually listens to you...
193 replies · 610 retweets · 1.9K likes · 523.5K views
Alexandr Wang@alexandr_wang·
Okay, this is too exciting :) Meta AI is now #2 in the App Store, the top AI app! We are so back!
259 replies · 117 retweets · 2.2K likes · 321.8K views
Han Fang Retweeted
Jason Weston@jaseweston·
🏋️Thinking Mid-training: RL of Interleaved Reasoning🎗️ We address the gap between pretraining (no explicit reasoning) and post-training (reasoning-heavy) with an intermediate SFT+RL mid-training phase that teaches models how to think.
- Annotate pretraining data with interleaved thoughts
- SFT mid-training to learn when and what to think alongside the original content
- RL mid-training to optimize reasoning generation, with a grounded reward from future-token prediction
Result: a 3.2x improvement on reasoning benchmarks compared to direct RL post-training on base Llama-3-8B, with gains over SFT-only mid-training as well. Introducing reasoning earlier makes models better prepared for post-training! Read more in the blog post: facebookresearch.github.io/RAM/blogs/thin…
9 replies · 71 retweets · 553 likes · 65.9K views
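A hedged sketch of the "grounded reward from future-token prediction" mentioned in the post above: a generated thought is scored by how much it improves a frozen model's log-likelihood of the next span of real text. The `logprob` callable is a hypothetical stand-in for an actual LM scoring call, and the `<think>` delimiters are an assumption, not the paper's format:

```python
from typing import Callable

# (context, continuation) -> average log-prob of continuation under a frozen LM
LogProbFn = Callable[[str, str], float]

def thought_reward(context: str, thought: str,
                   future_text: str, logprob: LogProbFn) -> float:
    """Reward = gain in future-token log-likelihood from inserting the thought."""
    with_thought = logprob(context + "\n<think>" + thought + "</think>\n",
                           future_text)
    without_thought = logprob(context, future_text)
    # Positive iff the thought actually helps predict the original content.
    return with_thought - without_thought
```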
Han Fang Retweeted
Shengjia Zhao@shengjia_zhao·
Excited to share what we’ve been building at Meta Superintelligence Labs! We just released Muse Spark, our first AI model. It's a natively multimodal reasoning model and the first step on our path to personal superintelligence. We've overhauled our entire stack to support scaling, and this is just the beginning. ai.meta.com/blog/introduci…
74 replies · 173 retweets · 1.7K likes · 224.8K views
Han Fang Retweeted
Hongyu Ren@ren_hongyu·
Check out Muse Spark, our first milestone in the quest for personal superintelligence! Scaling this with the team has been a total blast. Give it a spin and let us know what you think! 🥑
18 replies · 59 retweets · 315 likes · 63.7K views
Han Fang Retweeted
AI at Meta@AIatMeta·
Today we're introducing TRIBE v2 (Trimodal Brain Encoder), a foundation model trained to predict how the human brain responds to almost any sight or sound. Building on our Algonauts 2025 award-winning architecture, TRIBE v2 draws on 500+ hours of fMRI recordings from 700+ people to create a digital twin of neural activity and enable zero-shot predictions for new subjects, languages, and tasks. Try the demo and learn more here: go.meta.me/tribe2
736 replies · 2.5K retweets · 16K likes · 6.8M views
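Purely illustrative: one way a trimodal encoder like the one announced above could be wired, regressing per-parcel fMRI responses from video, audio, and text features. All dimensions, the fusion scheme, and module names here are assumptions, not the actual TRIBE v2 architecture:

```python
import torch
import torch.nn as nn

class TrimodalBrainEncoder(nn.Module):
    """Sketch: fuse three modality streams, predict one value per brain parcel."""

    def __init__(self, d_video=768, d_audio=512, d_text=768,
                 d_model=512, n_parcels=1000):
        super().__init__()
        # Project each modality's pretrained features into a shared space.
        self.proj = nn.ModuleDict({
            "video": nn.Linear(d_video, d_model),
            "audio": nn.Linear(d_audio, d_model),
            "text":  nn.Linear(d_text, d_model),
        })
        self.fuse = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(d_model, n_parcels)  # one output per parcel

    def forward(self, video, audio, text):
        # Each input: (batch, time, d_modality), aligned to the fMRI TRs.
        x = torch.cat([self.proj["video"](video),
                       self.proj["audio"](audio),
                       self.proj["text"](text)], dim=1)
        return self.head(self.fuse(x).mean(dim=1))  # (batch, n_parcels)
```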
Karthik A Sankararaman 🇮🇳🇺🇸
@Yuchenj_UW Even (and especially) assuming AGI, this doesn't logically follow at all. The set of things a frontier lab can do is finite (bounded by compute). The set of things a business can be built on is infinite. So a finite set of companies doing a finite set of things can't cover an infinite one.
3 replies · 0 retweets · 4 likes · 364 views
Yuchen Jin@Yuchenj_UW·
Some people at frontier AI labs told me they believe startups are over. OpenAI, Anthropic, Google, xAI will absorb every industry as AGI nears. Coding today, science, medicine, and finance next. Then everything else. If they’re right, that’s a pretty boring end of the world.
540 replies · 160 retweets · 3K likes · 944.5K views
Han Fang Retweeted
Karina Nguyen@karinanguyen·
Excited to release PostTrainBench v1.0! This benchmark evaluates the ability of frontier AI agents to post-train language models in a simplified setting. We believe this is a first step toward tracking progress in recursive self-improvement 🧵:
45 replies · 90 retweets · 677 likes · 148.3K views
Han Fang@Han_Fang_·
I’ve been thinking about something lately. Every mature science has its central dogma — a foundational claim so deeply embedded that practitioners forget it’s even there. Biology has DNA → RNA → Protein. Thermodynamics has entropy. Does AI have one? Here I want to share a thought exercise of mine: what if we treated the compression-intelligence connection not as a useful intuition, but as our field’s central dogma? See my thoughts here: tokens-for-thoughts.notion.site/the-central-do…
1 reply · 0 retweets · 1 like · 262 views
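A toy illustration of the compression-intelligence connection the post above refers to: under arithmetic coding, a model's average negative log2-probability per symbol is exactly its code length in bits, so a better predictor is a better compressor. Here gzip and a unigram entropy bound stand in as weak "models" for comparison; the sample text is made up:

```python
import gzip
import math
from collections import Counter

def bits_per_byte_gzip(data: bytes) -> float:
    # Achieved code length of a real (weak) compressor.
    return 8 * len(gzip.compress(data)) / len(data)

def bits_per_byte_unigram(data: bytes) -> float:
    # Optimal code length under a unigram byte model: H = -sum p log2 p.
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

text = b"the cat sat on the mat " * 100
# gzip exploits repetition that the unigram model cannot see, so it
# achieves fewer bits per byte: better prediction <=> better compression.
print(bits_per_byte_gzip(text), bits_per_byte_unigram(text))
```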
Han Fang Retweeted
Kimi.ai@Kimi_Moonshot·
Kimi K2.5 tech report just dropped! Quick hits:
- Joint text-vision training: pretrained with 15T vision-text tokens, zero-vision SFT (text-only) to activate visual reasoning
- Agent Swarm + PARL: dynamically orchestrated parallel sub-agents, up to 4.5× lower latency, 78.4% on BrowseComp
- MoonViT-3D: a unified image-video encoder with 4× temporal compression, enabling 4× longer videos in the same context
- Toggle: token-efficient RL, 25–30% fewer tokens with no accuracy drop
Here's our work toward scalable, real-world agentic intelligence. More details in the report 👉 github.com/MoonshotAI/Kim…
54 replies · 279 retweets · 1.9K likes · 313.5K views
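A minimal sketch of the "dynamically orchestrated parallel sub-agents" idea from the tweet above: an orchestrator fans a task out to sub-agents concurrently and merges their results, trading extra tokens for lower wall-clock latency. `run_subagent` is a hypothetical stand-in for a real model or browsing call; this is not Kimi's Agent Swarm API:

```python
import asyncio

async def run_subagent(name: str, subtask: str) -> str:
    # Placeholder for an LLM / tool / browsing call that takes real time.
    await asyncio.sleep(0.1)
    return f"[{name}] result for: {subtask}"

async def orchestrate(task: str, n_agents: int = 4) -> str:
    # Fan out: shard the task and run all sub-agents concurrently, so
    # latency is roughly one call instead of n_agents sequential calls.
    subtasks = [f"{task} (shard {i})" for i in range(n_agents)]
    results = await asyncio.gather(*(
        run_subagent(f"agent-{i}", s) for i, s in enumerate(subtasks)))
    # A real orchestrator would synthesize, not just concatenate.
    return "\n".join(results)

print(asyncio.run(orchestrate("survey recent RL papers")))
```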