Chengzhi Mao
@ChengzhiM
Researcher in Machine Learning and Computer Vision. Assistant Professor at Rutgers CS. Previously a Research Scientist at Google. PhD from Columbia.

Joined September 2018
324 Following · 186 Followers
Pinned Tweet
Chengzhi Mao @ChengzhiM
Call for Papers is OPEN! We want your work on Actionable Perception, VLA models, and Robot Manipulation.
🗓️ Deadline: May 4 (AoE)
✅ Non-archival (dual submission welcome!)
Details & Submissions: 🔗 activis-workshop.github.io
#CVPR2026 #AI #Robotics #VLA
Yinghui He @yinghui_he_
RLVR gives sparse supervision; On-Policy Self-Distillation often requires high-quality demonstrations. Our new method, ✨SD-Zero✨, gets the best of both worlds: we use the model's self-revision to turn binary rewards into dense token-level supervision. No external teacher. No curated demonstrations.

🚨 Introducing Self-Distillation Zero (SD-Zero), which trains one model to play two roles: (1) a "Generator" that makes attempts, and (2) a "Reviser" that conditions on the generator's failed/successful attempt + binary reward to produce a better answer. ‼️Even WRONG attempts can become the training signal.‼️

🔗 Paper: arxiv.org/abs/2604.12002

🏆 SD-Zero brings 10%+ improvement over base models (Qwen3-4B, Olmo3-7B) on math & code reasoning, beating GRPO and vanilla On-Policy Self-Distillation under the same training budget. SD-Zero also enables iterative self-evolution.
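The tweet describes the SD-Zero loop in enough detail to sketch it. Below is a minimal, hypothetical rendering in PyTorch/Hugging Face: gpt2 stands in for the actual base models, and the verifier, revision-prompt format, and filtering details are assumptions for illustration, not the paper's implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in for Qwen3-4B / Olmo3-7B
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

def verify(answer: str, problem: dict) -> int:
    """Binary verifiable reward (e.g., exact match or unit tests).
    `problem["reference"]` is a placeholder for whatever the verifier checks."""
    return int(problem["reference"] in answer)

def rollout(prompt: str, max_new: int = 64) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=max_new, do_sample=True,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

def sd_zero_step(problem: dict) -> float:
    # Role 1: the Generator makes an attempt; a verifier scores it 0/1.
    attempt = rollout(problem["question"])
    reward = verify(attempt, problem)

    # Role 2: the Reviser conditions on the attempt + binary reward to
    # produce a better answer -- even a WRONG attempt is useful context.
    revision = rollout(f"{problem['question']}\n"
                       f"Previous attempt (reward={reward}): {attempt}\n"
                       f"Revised answer:")

    # Self-distill: train the policy on the reviser's tokens, turning the
    # sparse 0/1 reward into dense token-level supervision.
    # (In practice one would likely keep only verified-correct revisions.)
    full = tok(problem["question"] + " " + revision, return_tensors="pt").input_ids
    labels = full.clone()
    labels[:, :len(tok(problem["question"]).input_ids)] = -100  # mask the prompt
    loss = model(full, labels=labels).loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```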
Chengzhi Mao @ChengzhiM
How does visual perception actually serve robotic action? 🤔 Announcing the #CVPR2026 Workshop: ActiVis — "Bridging Vision, Language, and Action: What's Missing in Actionable Visual Perception for Robotics." Submit your paper and join us in Denver this June! 📍
Chengzhi Mao @ChengzhiM
It turns out, the best way to track the world is to learn how to generate it. 📍
Chengzhi Mao @ChengzhiM
We found that Video Diffusion Models naturally solve this. They don't just hallucinate pixels; they inherently learn motion in the early, noisy stages of generation, independent of appearance. By tapping into this, we can track visually identical objects without any supervision.
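To make "tapping into" early-stage diffusion features concrete, here is a hedged sketch: `DiffusionBackbone` is a toy stand-in for a pretrained video diffusion denoiser (the real method would read intermediate U-Net activations), and the noise schedule and timestep are illustrative. The point being sketched is matching points across frames by feature similarity, with no tracking supervision.

```python
import torch
import torch.nn.functional as F

class DiffusionBackbone(torch.nn.Module):
    """Placeholder for a pretrained denoiser whose intermediate activations
    we read out at a chosen (early, high-noise) timestep."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = torch.nn.Conv2d(3, feat_dim, 3, padding=1)
    def features(self, frame, t):
        noise = torch.randn_like(frame)
        alpha = 1.0 - t / 1000.0                 # toy noise schedule
        noisy = alpha * frame + (1 - alpha) * noise
        return self.net(noisy)

@torch.no_grad()
def track_point(backbone, frame_a, frame_b, xy, t=800):
    """Find where the pixel at `xy` in frame_a moved to in frame_b by
    cosine similarity of diffusion features -- no appearance labels."""
    fa = backbone.features(frame_a[None], t)[0]          # (C, H, W)
    fb = backbone.features(frame_b[None], t)[0]
    q = fa[:, xy[1], xy[0]]                              # query feature
    sim = F.cosine_similarity(fb.flatten(1).T, q[None], dim=1)
    idx = sim.argmax().item()
    H, W = fb.shape[1:]
    return (idx % W, idx // W)                           # matched (x, y)

# Usage: two 64x64 RGB frames; a large t (early, noisy stage) emphasizes
# motion-sensitive structure over fine appearance detail.
backbone = DiffusionBackbone()
a, b = torch.rand(3, 64, 64), torch.rand(3, 64, 64)
print(track_point(backbone, a, b, xy=(10, 20)))
```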
Chengzhi Mao @ChengzhiM
#NeurIPS2025 Poster #3611 Computer vision has a dirty secret: most object trackers are actually just "recognizers." They track by looking at colors and textures, not movement. That’s why they fail when two objects look identical.
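The failure mode is easy to demonstrate with a toy appearance-only matcher (color-histogram template matching here, standing in for "recognizer"-style trackers): two visually identical objects score exactly the same, so appearance alone cannot say which detection continues the track.

```python
import numpy as np

def color_hist(patch, bins=8):
    # Normalized 3D color histogram of an RGB patch.
    h, _ = np.histogramdd(patch.reshape(-1, 3), bins=(bins,) * 3,
                          range=[(0, 256)] * 3)
    return h.ravel() / h.sum()

rng = np.random.default_rng(0)
template = rng.integers(0, 256, (16, 16, 3))   # object appearance at time t
object_a = template.copy()                     # two identical-looking objects
object_b = template.copy()                     # at different positions at t+1

sim = lambda p, q: float(np.minimum(p, q).sum())         # histogram intersection
print(sim(color_hist(template), color_hist(object_a)))   # 1.0
print(sim(color_hist(template), color_hist(object_b)))   # 1.0 -- a tie:
# appearance cannot break it; only motion can.
```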
Chengzhi Mao @ChengzhiM
In O‘ahu 🌴—this time for #ICCV2025 (Oct 19–23)! I’ll be speaking at the TrustFM Workshop on Oct 20, 4 PM (HST), Room 308B: “Seeing Through Words: Understanding and Controlling Vision via Language.” Come chat about how language helps interpret and steer vision foundation models!
Chengzhi Mao retweeted
Lihao Sun @1e0sun
🚨New #ACL2025 paper! Today’s “safe” language models can look unbiased—but alignment can actually make them more biased implicitly by reducing their sensitivity to race-related associations. 🧵Find out more below!
Chengzhi Mao @ChengzhiM
#ICML 2024 How do large language models (LLMs) reach their decisions? Our latest research project, SelfIE, is the first to use an LLM to explain the same LLM's internals. The interpretation can be used for safety alignment and understanding hallucinations. selfie.cs.columbia.edu
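The core trick (having the model verbalize its own hidden state) can be sketched as below. The prompt wording, layer choice, and injection point are illustrative assumptions on my part; the actual procedure is described at selfie.cs.columbia.edu.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def interpret_hidden_state(text, layer=6, pos=-1):
    # 1) Run the model and grab a hidden state we want to explain.
    ids = tok(text, return_tensors="pt").input_ids
    hs = model(ids, output_hidden_states=True).hidden_states[layer][0, pos]

    # 2) Build an interpretation prompt with a placeholder token whose
    #    embedding we overwrite with that hidden state.
    prompt = 'The word "_" means'
    p_ids = tok(prompt, return_tensors="pt").input_ids
    embeds = model.get_input_embeddings()(p_ids)
    slot = 3  # index of the "_" token here (assumption; verify per tokenizer)
    embeds[0, slot] = hs

    # 3) Let the same model verbalize the injected state.
    out = model.generate(inputs_embeds=embeds, max_new_tokens=20,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0], skip_special_tokens=True)

print(interpret_hidden_state("The Eiffel Tower is located in"))
```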
Chengzhi Mao @ChengzhiM
@hongyangzh @cvondrick @djhsu The amount of training data is the same for single-task and multi-task. Our guess is that multi-task training biases the learned features toward the robust ones.
Hongyang Zhang @hongyangzh
@cvondrick @djhsu @ChengzhiM Very interesting work! Intuitively, is it because more tasks bring more training data, so the adversarial generalization is better and the robustness is strengthened?
Carl Vondrick @cvondrick
What causes adversarial examples? Latest #ECCV2020 paper from @ChengzhiM and Amogh shows that deep networks are vulnerable partly because they are trained on too few tasks. Just by increasing tasks, we strengthen robustness for each task individually. arxiv.org/pdf/2007.07236…
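The claim in this thread is that training one shared backbone on more tasks, with the data budget held fixed, improves robustness. A minimal sketch of that setup (architecture and task heads are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, n_classes=10, n_tasks=3):
        super().__init__()
        self.backbone = nn.Sequential(                 # shared features
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # one linear head per task (e.g., classification, rotation, ...)
        self.heads = nn.ModuleList(
            nn.Linear(32, n_classes) for _ in range(n_tasks))
    def forward(self, x):
        z = self.backbone(x)
        return [head(z) for head in self.heads]

model = MultiTaskNet()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
ce = nn.CrossEntropyLoss()

x = torch.rand(8, 3, 32, 32)                             # a batch of images
labels = [torch.randint(0, 10, (8,)) for _ in range(3)]  # per-task labels

# Sum the per-task losses: every task regularizes the shared backbone,
# which is the mechanism the tweet credits for improved robustness.
loss = sum(ce(out, y) for out, y in zip(model(x), labels))
opt.zero_grad(); loss.backward(); opt.step()
```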