Tianhao Wang

28 posts

@0920wth

Assistant Professor @HDSIUCSD. Previously Research Assistant Professor @TTIC_Connect and PhD in Statistics & Data Science @Yale.

Chicago · Joined July 2017
255 Following · 214 Followers
Tianhao Wang retweeted
Zhijian Liu
Zhijian Liu@zhijianliu_·
Reasoning LLMs generate very long chains-of-thought, so even small quantization errors add up. With AWQ, Qwen3-4B drops 71.0 → 68.2 on MMLU-Pro (~4% relative loss). 😬 ParoQuant fixes this! It keeps only the critical rotation pairs and fuses everything into a single kernel. Recovers most of the lost reasoning accuracy with minimal overhead — so 4-bit models stay strong at reasoning. 💪💪
31 replies · 143 retweets · 1.4K likes · 170.4K views
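The "~4% relative loss" quoted in the tweet above is just the raw accuracy drop divided by the baseline. A minimal sanity check of that arithmetic:

```python
# Sanity-check the relative accuracy loss quoted in the tweet:
# Qwen3-4B on MMLU-Pro drops from 71.0 to 68.2 under AWQ 4-bit quantization.
baseline = 71.0
quantized = 68.2

relative_loss_pct = (baseline - quantized) / baseline * 100
print(f"{relative_loss_pct:.1f}% relative loss")  # ~3.9%, i.e. the "~4%" in the tweet
```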
Tianhao Wang retweeted
Zhuoran Yang
Zhuoran Yang@zhuoran_yang·
New Paper -- "On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking" We give a complete mechanistic and dynamic picture of how neural networks learn modular addition f(x,y) = (x+y) mod p. We answer three questions: (1) What does the trained network compute? (2) How do Fourier features emerge during training? (3) Why does grokking happen? Each answer comes with a mathematical characterization backed by theory and experiments. Paper: arxiv.org/abs/2602.16849 Blog: y-agent.github.io/posts/modular_… Demo: huggingface.co/spaces/y-agent… Code: github.com/Y-Agent/modula…
[image]
7 replies · 49 retweets · 311 likes · 17K views
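For readers unfamiliar with the task: the target function in the tweet above, f(x, y) = (x + y) mod p, is the toy problem the paper studies. A minimal sketch (the choice p = 7 is illustrative only, not taken from the paper):

```python
# The task the paper studies: learn f(x, y) = (x + y) mod p from examples.
# p = 7 is an arbitrary illustrative modulus; the paper's setting may differ.
def modular_add(x: int, y: int, p: int = 7) -> int:
    return (x + y) % p

# A network trained on (x, y) -> f(x, y) pairs must reproduce this exactly:
print(modular_add(5, 9))      # (5 + 9) mod 7 = 0
print(modular_add(3, 3, 11))  # (3 + 3) mod 11 = 6
```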
Tianhao Wang retweeted
Jiaqi Ma
Jiaqi Ma@Jiaqi_Ma_·
The ARC challenge claims to measure "fluid intelligence" through tasks that are "simple for people yet difficult for AI." However, is the AI failure really due to a lack of "fluid intelligence"? Our recent work shows that the answer is NO, via a carefully designed diagnostic study. ArXiv: arxiv.org/pdf/2512.21329 Joint work with Xinhe Wang, @JinHuang9306000, @_Jimmy_Zhang_, @0920wth Our study is motivated by the observation that ARC problems are easy for humans because their representation strongly favors human vision. For example, in the attached figure, the same ARC problem presented in a serialized way becomes much more challenging for humans. 1/
[image]
1 reply · 3 retweets · 30 likes · 5.9K views
Tianhao Wang retweeted
Arya Mazumdar
Arya Mazumdar@MountainOfMoon·
The University of California, San Diego invites applications for one or more ladder rank faculty appointments based in the Halıcıoğlu Data Science Institute, the academic unit of the newly formed School of Computing, Information and Data Sciences. This is an open rank search for all levels of appointment (assistant, associate, or full professor). We seek outstanding candidates from ALL areas of Artificial Intelligence and Machine Learning as represented within HDSI's research scope, including but not limited to: 1) Computer Vision 2) AI for Science 3) Emerging Technologies for AI (such as Quantum Computing) apol-recruit.ucsd.edu/JPF04397 @HDSIUCSD @UCSD
0 replies · 12 retweets · 88 likes · 25.2K views
Tianhao Wang retweeted
Sadhika Malladi
Sadhika Malladi@SadhikaMalladi·
Excited to share that I will be starting as an Assistant Professor in CSE at UCSD (@ucsd_cse) in Fall 2026! I am currently recruiting PhD students who want to bridge theory and practice in deep learning - see here: cs.princeton.edu/~smalladi/recr…
38 replies · 71 retweets · 547 likes · 86.7K views
Tianhao Wang retweeted
Zhiyuan Li
Zhiyuan Li@zhiyuanli_·
Adaptive optimizers range from AdaGrad-Norm to Shampoo and full-matrix AdaGrad, with increasingly expressive preconditioners. But does more adaptivity always translate to fewer steps to converge? Our ICML 2025 paper answers negatively via a unified convergence analysis. 🧵1/6
[image]
2 replies · 13 retweets · 98 likes · 18.3K views
Tianhao Wang retweeted
Zhuoran Yang
Zhuoran Yang@zhuoran_yang·
🚀 We're excited to share our paper, "Taming Polysemanticity in LLMs," which introduces Group Bias Adaptation (GBA)—the FIRST Sparse Autoencoder (SAE) training method with a provable guarantee for untangling monosemantic concepts! 📄 Paper: arxiv.org/abs/2506.14002 🌐 Website: y-agent.github.io/taming-sae-gba… 🎯 Demo (Layer 26 of Qwen 2.5B-Base): y-agent.github.io/taming-sae-gba… Joint work with @siyuc3141, @HeejuneSheen, Xuyuan Xiong, and @0920wth
[3 images]
5 replies · 24 retweets · 110 likes · 10.2K views
Tianhao Wang retweeted
Zhiyuan Li
Zhiyuan Li@zhiyuanli_·
Why does Adam outperform SGD in LLM training? Adaptive step sizes alone don't fully explain this, as Adam also surpasses adaptive SGD. Is coordinate-wise adaptivity the secret? Not entirely: Adam actually struggles in the rotated parameter space! 🧵 (1/6) arxiv.org/abs/2410.08198
[image]
3 replies · 35 retweets · 266 likes · 47.1K views
Tianhao Wang retweeted
Zhuoran Yang
Zhuoran Yang@zhuoran_yang·
[New paper on in-context learning] "In-Context Linear Regression Demystified" (link: arxiv.org/abs/2503.12734). Joint work @JLiangHe, @xintianpan, @siyuc3141. We establish a rather complete understanding of how one-layer multi-head attention solves in-context linear regression,
2 replies · 24 retweets · 109 likes · 7.7K views
Tianhao Wang retweeted
Ruili Feng
Ruili Feng@feng_ruili_frl·
A step towards neural interactive simulation, where is Neo?
Hongyang Zhang@hongyangzh

Introducing The Matrix --- a foundation world model for generating infinite-length, hyper-realistic videos with real-time, frame-level control: - Infinite-length video generation - 720p high-quality rendering - Real-time, frame-level control at 16 FPS - Generalization to real-world video control 🔗Blog: thematrix1999.github.io 📄Paper: thematrix1999.github.io/article/the_ma… 💻Code & Playable Demo: Coming soon! Key Innovation: A brand new technique called the shift-window denoise process model, enabling auto-regressive generation for diffusion and consistency models in real-time. Special thanks to project leader Ruili Feng and the entire Matrix team for their dedication and hard work over the year-long project.

0 replies · 3 retweets · 6 likes · 1.5K views
Barna Saha
Barna Saha@B1ar2n3a·
Thuy-Duong "June" Vuong and Tianhao Wang @0920wth are joining us as new faculty, and there are multiple wedding bells in the group. (The pictures are missing a few other folks who joined, and the awesome food that we had 🙂)
1 reply · 0 retweets · 7 likes · 1.2K views
Barna Saha
Barna Saha@B1ar2n3a·
The UCSD theory group EOY celebration. We had a lot to celebrate: alum @JessSorrell joining JHU as an assistant prof, @MHop_Theory and Rex Lei graduating, a lot of amazing work including Chris's work selected as an ICML Oral, a big cohort of students and postdocs joining in '24,
[4 images]
2 replies · 1 retweet · 50 likes · 19.1K views
Sadhika Malladi
Sadhika Malladi@SadhikaMalladi·
@0920wth @HDSIUCSD Congratulations, Tianhao! Well-deserved and excited to see where your research agenda goes next :)
1 reply · 0 retweets · 2 likes · 320 views