Peihao Wang
@peihao_wang

41 posts

📚 PhD Student @utexasece @WNCG_UT @VITAGroupUT; 🌟 Stanford Rising Star in Data Science 2025; 🎓 Google Fellowship 2025 in ML & ML foundations; 🎄@ccccrs_0908

Austin, TX · Joined January 2020
238 Following · 212 Followers
Peihao Wang@peihao_wang·
Interestingly, we revealed a duality:
🔵 Training-time alignment ≈ amortized parameter-space optimization
🔵 Test-time optimization ≈ latent-space sampling
From a classical statistical inference lens, these two are tightly connected, just operating over different spaces.
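A rough formal sketch of this duality, via the standard KL-regularized alignment identity (the notation r, β, π_ref is mine, not the thread's):

```latex
% Training-time alignment: amortize the optimization into parameters \theta.
\max_{\theta}\;
  \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}\!\bigl[\, r(x, y) \,\bigr]
  \;-\; \beta\,\mathrm{KL}\!\bigl( \pi_\theta(\cdot \mid x) \,\Vert\, \pi_{\mathrm{ref}}(\cdot \mid x) \bigr)

% Its optimum is a reweighted reference model, so sampling from this target
% at inference time (test-time optimization / latent-space sampling)
% pursues the same distribution without touching \theta.
\pi^{*}(y \mid x) \;\propto\; \pi_{\mathrm{ref}}(y \mid x)\,
  \exp\!\bigl( r(x, y) / \beta \bigr)
```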
Peihao Wang@peihao_wang·
We formulate decoding as an optimization problem: find responses that maximize a differentiable reward subject to being sampled from an LLM. Gradients are backpropagated into the model’s hidden states, steering inference into a form of test-time training.
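A minimal PyTorch-style sketch of this kind of reward-guided decoding (my own illustration under assumptions, not the paper's code; `model` and `reward_fn` are hypothetical placeholders):

```python
import torch

# Sketch: treat the response as a sequence of soft token embeddings z and
# run gradient ascent on a differentiable reward at test time. `model`
# (maps embeddings to next-token logits) and `reward_fn` (scores a relaxed
# response) are hypothetical placeholders, not the paper's interfaces.

def gradient_decode(model, reward_fn, prompt_emb, resp_len, dim,
                    steps=50, lr=0.1):
    # The latent response embeddings are the variables we optimize.
    z = torch.randn(resp_len, dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        seq = torch.cat([prompt_emb, z], dim=0)        # prompt + latent response
        logits = model(seq)                            # forward pass on embeddings
        probs = torch.softmax(logits[-resp_len:], -1)  # relaxed token distribution
        # Maximize reward of the relaxed response; gradients flow into z,
        # i.e. into input/hidden states rather than model weights.
        (-reward_fn(probs)).backward()
        opt.step()
    return probs.argmax(-1)  # discretize the optimized response at the end
```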
Peihao Wang@peihao_wang·
Latent space reasoning via looped transformers has gained attention lately. It is rooted in optimization unrolling, where each loop implicitly models a GD step on hidden states. Our ICLR paper asks: what if we explicitly run GD in latent space at test time?
Zhen Wang@zhenwang9102

1/🧵 What if test-time reasoning wasn't discrete search, but gradient descent in latent space? Happy to share our #ICLR2026 paper ∇-Reasoner: a paradigm shift from zeroth-order search to first-order optim at test time. Led by @peihao_wang @ccccrs_0908 iclr.cc/virtual/2026/p…
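A toy contrast between the two views (my illustration, not ∇-Reasoner's code; `block` and `energy` are hypothetical placeholders):

```python
import torch

# Two views of latent-space reasoning. `block` is a weight-tied transformer
# block; `energy` is a scalar function of the hidden state.

def looped_transformer(block, h, k):
    # Implicit view: looping a shared block k times refines the hidden state,
    # which unrolling arguments interpret as k optimization steps.
    for _ in range(k):
        h = block(h)
    return h

def explicit_latent_gd(energy, h, k, lr=0.1):
    # Explicit view: run k literal gradient-descent steps on the hidden
    # state at test time, descending a scalar energy/objective.
    h = h.detach().requires_grad_(True)
    for _ in range(k):
        g, = torch.autograd.grad(energy(h), h)
        h = (h - lr * g).detach().requires_grad_(True)
    return h.detach()
```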

Peihao Wang retweeted
DAIR.AI@dair_ai·
Are multi-agent systems necessary? Here is a great new paper addressing this.

The big assumption most AI devs make today is that more agents lead to better performance. But here is the overlooked reality: most multi-agent systems are homogeneous. All agents typically share the same base LLM, differing only in prompts, tools, and positions in the workflow. This raises a compelling question of whether a single agent can simulate these workflows through multi-turn conversations.

This new research investigates this across seven benchmarks spanning coding, mathematics, QA, domain-specific reasoning, and real-world planning. A single agent with KV cache reuse can match the performance of homogeneous multi-agent workflows while reducing inference costs. The cost advantage comes from shared KV cache across agent interactions, avoiding redundant prefill computation. Because homogeneous agents possess identical reasoning capabilities and differ only in specialized instructions, a single agent can role-play these agents sequentially, exploiting the workflow's task decomposition without needing separate model instances.

Building on this finding, the researchers propose OneFlow, an algorithm that automatically designs workflows optimized for single-agent execution. OneFlow uses a dual meta-LLM architecture (Creative Designer + Critical Reviewer) with Monte Carlo Tree Search to discover streamlined workflows with comprehensive system prompts and fewer total agents. OneFlow with single-agent execution achieves 92.1% on HumanEval, 81.4% on MBPP, 93.3% on GSM8K, matching or exceeding multi-agent baselines while significantly reducing cost.

Single-LLM methods cannot capture truly heterogeneous workflows where agents use different base models, since KV caches cannot be shared across different LLMs. These results position single-LLM implementation as a strong baseline for MAS research. The authors suggest that the real opportunity lies in developing heterogeneous systems where model diversity benefits outweigh coordination costs.

Paper: arxiv.org/abs/2601.12307

Learn to build effective AI agents in our academy: dair-ai.thinkific.com
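A toy sketch of the single-agent role-play idea described above (my illustration, not the paper's OneFlow code; `llm_generate` is a hypothetical chat-completion call):

```python
# `llm_generate(messages)` stands in for a chat call to one base LLM.
# Because every "agent" shares that model, the growing message prefix plays
# the role of the shared KV cache, so earlier turns are never re-prefilled.

def run_single_agent_workflow(llm_generate, task, roles):
    messages = [
        {"role": "system",
         "content": "You will play several specialist roles in turn."},
        {"role": "user", "content": task},
    ]
    for role in roles:  # e.g. ["planner", "coder", "reviewer"]
        messages.append({"role": "user",
                         "content": f"Now act as the {role} and continue."})
        reply = llm_generate(messages)  # one model, one reusable cache
        messages.append({"role": "assistant", "content": reply})
    return messages[-1]["content"]      # final role's answer
```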
Peihao Wang@peihao_wang·
@zhiwen_fan_ Thx Zhiwen! Glad that I finally made some progress chasing your excellence.
VITA Group@VITAGroupUT·
🎉 Huge congratulations to PhD student Peihao Wang (@peihao_wang ) on two major honors: 🏆 2025 Google PhD Fellowship in Machine Learning & ML Foundations 🌟 Stanford Rising Star in Data Science Incredibly proud of Peihao's outstanding achievements! 🔶⚡
Peihao Wang@peihao_wang·
This work is so special to me. I first touched cryo-EM as a junior - I couldn’t believe a neural net could predict bio structure from extremely low-SNR, unposed images. With so much AI progress over these 5 years, scaling laws make AI-driven protein discovery feel real.
Zhiwen(Aaron) Fan@zhiwen_fan_

DUSt3R-like models work for scientific imaging too! Our ICCV’25 paper “CryoFastAR” shows that a geometric foundation model can do feed-forward ab initio cryo-EM reconstruction—10× faster and state-of-the-art quality on noisy particle images! #ICCV2025 #CryoEM 📎Paper: arxiv.org/abs/2506.05864

Peihao Wang retweeted
Zhiwen(Aaron) Fan@zhiwen_fan_·
We already introduced #LightGaussian last year to accelerate the rendering speed of 3DGS. In our CVPR'25 paper, SteepGS, we go further by demystifying and improving density control during 3DGS optimization — making training more efficient and reliable. Project Page: vita-group.github.io/SteepGS/
Ruisi Cai@ccccrs_0908·
Excited to share that I have been awarded the NVIDIA fellowship! 🎉 Immensely grateful for the recognition and support - this inspires me to continue advancing research in LLM efficiency and AI security. blogs.nvidia.com/blog/graduate-…
Peihao Wang retweeted
Zhiwen(Aaron) Fan@zhiwen_fan_·
🚀 Our NeurIPS '24 work, Large Spatial Model (LSM), is here! LSM performs semantic 3D reconstruction in just 0.1s, processing unposed data via feed-forward 3D reconstruction. 👉It leverages large-scale 3D datasets with minimal annotations, defining a 3D latent space. We are continuously exploring how this explicit 3D representation can further enhance reasoning and robotic learning. 🔗 Try our online Gradio demo with your own data at largespatialmodel.github.io #NeurIPS2024 #3DReconstruction
Peihao Wang retweeted
Ruisi Cai@ccccrs_0908·
Train one - Get many🚀! Check more details about Flextron at cairuisi.github.io/Flextron/
Pavlo Molchanov@PavloMolchanov

🚀 Introducing Flextron - a Many-in-One LLM - Oral at ICML! Train one model and get many optimal models for each GPU at inference without any additional retraining. 🌟 🔗 Paper: arxiv.org/abs/2406.10260 Main benefits with only 5% post-training finetuning: ✅ Best model for every GPU (small & large) without retraining ✅ Change inference cost on the fly based on load ✅ Input-adaptive inference (heterogeneous weight-shared MoE, Attention) ✅ Instead of training many models, we train only 1: LLaMa2-7B ➡️ 3B, 4B, 5B, 6B, etc. Method and observations in the thread. 🧵👇
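A toy sketch of the "train one, get many" weight-sharing idea (my simplification, not Flextron's actual architecture):

```python
import torch
import torch.nn as nn

# The leading rows of one shared weight matrix form the smaller sub-models,
# so a single checkpoint can serve several widths at inference time.

class ElasticLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) / d_in**0.5)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x, width):
        # Slice shared weights to whatever width the target GPU affords.
        w = self.weight[:width]
        return x @ w.t() + self.bias[:width]

layer = ElasticLinear(4096, 4096)
x = torch.randn(2, 4096)
y_small = layer(x, width=1024)  # cheap sub-model
y_full = layer(x, width=4096)   # full model, same parameters
```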

Peihao Wang retweeted
Mingyuan Zhou@MingyuanZhou·
Introducing Score identity Distillation with Long and Short Guidance (SiD-LSG), our data-free solution to distill Stable Diffusion models into one-step text-to-image generators, achieving a COCO2014 zero-shot FID of 8.15. Excited to share the code and checkpoints with the community! Code: github.com/mingyuanzhou/S… Paper: arxiv.org/abs/2406.01561 #Diffusion #Distillation #StableDiffusion @ZhendongWang6 @UnderGroundJeg @haihuang_ml
Peihao Wang retweeted
Ruisi Cai@ccccrs_0908·
Tired of training varying-size LLMs to fit various GPU memory and latency requirements? Check out Flextron! Our new ICML (Oral) paper shows how to train one model deployable across GPU series. Learn more: cairuisi.github.io/Flextron/ 🚀
Peihao Wang retweeted
Ruisi Cai@ccccrs_0908·
The Flextron-Llama2-7B model family demonstrates superior MMLU performance compared to both open-source models (including Pythia, OpenLLaMA-v2) and existing post-hoc compression methods (including Sheared-LLaMA, SliceGPT, LLM-Pruner, Compresso, LaCo).
Peihao Wang retweeted
Ruisi Cai@ccccrs_0908·
Managing long context is challenging due to quadratic attention memory usage. But what if we could compress growing context information into a fixed-size memory? 🤔 Check out our new ICML paper: "LoCoCo: Dropping In Convolutions for Long Context Compression"! 1/3
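A toy sketch of the fixed-size compression idea (my illustration, not LoCoCo's actual method; the kernel here is random rather than learned):

```python
import torch
import torch.nn.functional as F

# Squeeze a growing context of token states into a fixed number of memory
# slots with a 1D convolution, keeping attention memory O(M) instead of O(T).

def compress_context(kv, num_slots, kernel):
    # kv: (T, d) token states; kernel: (d, d, k) convolution filters.
    x = kv.t().unsqueeze(0)                                 # (1, d, T)
    x = F.conv1d(x, kernel, padding=kernel.shape[-1] // 2)  # mix neighbors
    x = F.adaptive_avg_pool1d(x, num_slots)                 # pool to (1, d, M)
    return x.squeeze(0).t()                                 # (M, d) fixed memory

kv = torch.randn(10_000, 64)                 # long context, T = 10k tokens
kernel = torch.randn(64, 64, 5) * 0.01
memory = compress_context(kv, num_slots=128, kernel=kernel)  # (128, 64)
```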
Peihao Wang@peihao_wang·
Training 3D foundation models? In our CVPR 2024 work, we propose a new concept that directly enhances 2D predictions’ view consistency via image-based rendering. It generalizes to many 2D foundation models zero-shot and transfers their success to 3D at little training cost.
Mukund@sneezygiraffe

Progress in 2D vision models has been exciting, e.g. SAM, DINO, etc. But how do we apply them on a 3D scene? We propose Lift3D, a plug ‘n play framework that converts any arbitrary 2D vision model to be 3D consistent w/o any extra optimization. arxiv.org/abs/2403.18922
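A toy sketch of view-consistent feature blending (my simplification, not Lift3D's actual image-based rendering):

```python
import torch
import torch.nn.functional as F

# 2D features of one 3D point, sampled from several source views, are
# blended with weights favoring source views whose viewing direction
# agrees with the target view.

def blend_features(src_feats, src_dirs, tgt_dir, temperature=0.1):
    # src_feats: (K, d) features of one 3D point from K source views
    # src_dirs:  (K, 3) unit view directions; tgt_dir: (3,) unit direction
    sim = src_dirs @ tgt_dir                  # (K,) direction agreement
    w = torch.softmax(sim / temperature, 0)   # nearer views weigh more
    return w @ src_feats                      # (d,) blended feature

feats = torch.randn(4, 256)                   # e.g. hypothetical DINO features
dirs = F.normalize(torch.randn(4, 3), dim=-1)
tgt = F.normalize(torch.randn(3), dim=0)
out = blend_features(feats, dirs, tgt)        # (256,) view-consistent output
```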
