Ziming Liu

21 posts

@lzm_mlsys

PhD Candidate at SoC, @NUSingapore, advised by @YangYou1991. Working on efficient ML systems and algorithms. Past: @MSFTResearch @BytedanceTalk @HPCAITech

Singapore · Joined January 2023
55 Following · 110 Followers
Pinned Tweet
Ziming Liu @lzm_mlsys
🚀 Serving MoE models made EASY and CHEAP!!
We built EaaS: think of experts not as layers in a model, but as microservices you can spin up, replicate, or kill independently.
No all-to-all. No static process groups. No system-wide crash when one GPU dies. Just:
⚙️ Clients (attention) ↔ Servers (experts)
🧠 Stateless → easy replication
📡 Asymmetric async P2P (no CPU involved!)
🧱 Fine-grained scaling without restarting, saving real 💰!
Monolithic inference is over. Serving is becoming cloud-native.
Preprint here → arxiv.org/abs/2509.17863
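
The client/server split described in the pinned post can be illustrated with a toy, in-process simulation. This is a minimal sketch and not the EaaS implementation: `ExpertReplica`, `ExpertPool`, `scale_up`, and `attention_client` are hypothetical names, the expert computation is a stand-in multiplication, and the asynchronous P2P transport is not modeled at all. It only shows why stateless experts make replication and failover simple: any live replica of an expert returns the same result.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ExpertReplica:
    """One stateless server hosting a single expert (hypothetical)."""
    expert_id: int
    alive: bool = True

    def forward(self, token: float) -> float:
        if not self.alive:
            raise ConnectionError(f"replica of expert {self.expert_id} is down")
        # Stand-in for the expert FFN: a pure function of the input.
        return token * (self.expert_id + 1)


@dataclass
class ExpertPool:
    """Registry of replicas per expert; replicas can be added at runtime."""
    replicas: dict[int, list[ExpertReplica]] = field(default_factory=dict)

    def scale_up(self, expert_id: int) -> ExpertReplica:
        replica = ExpertReplica(expert_id)
        self.replicas.setdefault(expert_id, []).append(replica)
        return replica

    def call(self, expert_id: int, token: float) -> float:
        # Experts hold no per-request state, so any live replica will do.
        candidates = list(self.replicas.get(expert_id, []))
        random.shuffle(candidates)
        for replica in candidates:
            try:
                return replica.forward(token)
            except ConnectionError:
                continue  # skip the dead replica, try the next one
        raise RuntimeError(f"no live replica for expert {expert_id}")


def attention_client(pool: ExpertPool, token: float, routed_experts: list[int]) -> float:
    """Client side: send the token to each routed expert and sum the outputs."""
    return sum(pool.call(e, token) for e in routed_experts)


pool = ExpertPool()
for expert_id in (0, 1):
    pool.scale_up(expert_id)
pool.scale_up(1)                    # replicate a hot expert without a restart
pool.replicas[1][0].alive = False   # simulate one GPU dying
result = attention_client(pool, 2.0, routed_experts=[0, 1])
print(result)
```

The request still succeeds after one replica of expert 1 is marked dead, because the pool falls back to the second replica instead of failing the whole system.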

Ziming Liu retweeted
Tencent HY @TencentHunyuan
One static model does not fit all 😭 We just dropped our latest work: Functional Neural Memory. Instead of static models, we generate custom "parameters" for every single input.
✅ Prompt your model anytime
✅ Instant personalization
✅ Better instruction following
✅ Flexible & dynamic memory (w/o memory bank ✌️)
(🧵1/6)

Ziming Liu @lzm_mlsys
Communication library to be released soon!

Ziming Liu @lzm_mlsys
@aryanvs_ @MSFTResearch @YangYou1991 Thanks! We have been trying to make it simple to use as a plug-in for diffusers, and it would be really nice if it could be supported in diffusers LOL.

Ziming Liu retweeted
Ziming Liu @lzm_mlsys
Results: To further evaluate the qualitative performance of RAS, we conducted a human evaluation. We randomly selected 14 prompts from the official research papers and blogs of Stable Diffusion 3 and Lumina, generating two images for each prompt: one using dense inference and the other using RAS, both with the same random seed and default number of timesteps. 633 out of 1400 votes (45.21%) indicated that the two images were of similar quality. Additionally, 28.29% of votes favored the dense image over the RAS result, while 26.50% preferred RAS over the dense result. These results demonstrate that RAS achieves a significant improvement in throughput (1.625× for Stable Diffusion 3 and 1.561× for Lumina-Next-T2I) without noticeably affecting human preference. (5/5)
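
The percentages in the results post can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming all three percentages are shares of the same 1400 votes; the per-option counts for "dense preferred" and "RAS preferred" are back-computed here from the reported percentages and are not stated in the post.

```python
TOTAL_VOTES = 1400       # reported in the post
similar_votes = 633      # reported in the post
dense_share = 0.2829     # reported: votes favoring the dense image
ras_share = 0.2650       # reported: votes favoring the RAS image

# Share of "similar quality" votes, as a percentage.
similar_pct = round(similar_votes / TOTAL_VOTES * 100, 2)

# Back-computed vote counts (an inference, not figures from the post).
dense_votes = round(dense_share * TOTAL_VOTES)
ras_votes = round(ras_share * TOTAL_VOTES)

print(f"similar: {similar_votes} votes ({similar_pct}%)")
print(f"dense preferred: ~{dense_votes} votes")
print(f"RAS preferred: ~{ras_votes} votes")
print("accounted for:", similar_votes + dense_votes + ras_votes, "of", TOTAL_VOTES)
```

Under that assumption the three categories account for every vote, so the reported figures are internally consistent.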

Ziming Liu @lzm_mlsys
We also introduced techniques like starvation prevention, dynamic sampling ratio, accumulated error resetting, key & value recovery, and kernel fusing to further improve the performance of our method. Please refer to our paper for more details. (4/5)

Ziming Liu retweeted
Victor.Kai Wang @VictorKaiWang1
Generating ~200 million parameters in just minutes! 🥳 Excited to share our work with @MTDovent, @heisejiasuo96, and @YangYou1991: 'Recurrent Diffusion for Large-Scale Parameter Generation' (RPG for short). Example: Obtain customized models using prompts (see below). (🧵1/8)

Ziming Liu retweeted
Shijie Wang @ShijieWang20
How can we better animate images solely following text descriptions? We present Motion Focal Loss (MotiF) (arxiv.org/abs/2412.16153) to better align motions with text descriptions in the text-image-to-video (TI2V) task and release TI2V-Bench, a comprehensive TI2V benchmark. (1/n)

Ziming Liu retweeted
Together AI @togethercompute
We are excited that @Leon75421958 is joining the stellar Together AI Research team as VP Frontier Technologies. Leon has driven LLM ecosystem innovations such as Deepspeed and Project Brainwave. Leon will help our mission to build the fastest cloud for generative AI.

Ziming Liu @lzm_mlsys
@YangYou1991 Really enjoyed chipping in on this project. It's been awesome!😃

Yang You @YangYou1991
Want to train a model like #Sora? Check out our new project #OpenDiT! OpenDiT is an easy-to-use, fast, and memory-efficient system for training and deploying DiT models, which are the foundation of models like Sora. With OpenDiT, you can achieve:
* Up to 80% faster in training
* 50% reduction in memory usage
* Over 50% reduction in communication volume with novel sequence parallelism
GitHub: github.com/NUS-HPC-AI-Lab…

Ziming Liu @lzm_mlsys
Really enjoyed chipping in on this project. It's been awesome!😃
Quoted tweet: Yang You @YangYou1991, OpenDiT announcement (see above)