Weiyang Liu

833 posts


@Besteuler

Assistant Professor @CUHKofficial. Postdoc @MPI_IS. PhD @Cambridge_Uni & @GeorgiaTech. Previous Intern @Google & @nvidia. All opinions are my own.

Joined May 2009
778 Following · 2.3K Followers
Weiyang Liu reposted
Jitendra MALIK @JitendraMalikCV
With Emmanuel Dupoux scp.net/persons/dupoux/ and Yann LeCun @ylecun, we consider a cognitive-science-inspired AI. We analyse how autonomous learning works in living organisms and propose a roadmap for reproducing it in artificial systems. lnkd.in/eNWDmuqT
9 replies · 79 reposts · 447 likes · 60.5K views
Weiyang Liu reposted
The Scientific Lens @LensScientific
Why do humans find symmetry so appealing? Is it biological, evolutionary, or a product of our collective consciousness?
238 replies · 576 reposts · 5.1K likes · 1.2M views
Weiyang Liu reposted
Andrej Karpathy @karpathy
@Yulun_Du @ilyasut SGD is a ResNet too (the blocks of it are fwd+bwd), the residual stream is the weights so... 🤔 We're not taking the Attention is All You Need part literally enough? :D
28 replies · 39 reposts · 585 likes · 100.5K views
Weiyang Liu reposted
Wildminder @wildmindai
DiagDistill. Real-time streaming video generation.
- low initial latency
- speedup over the base model
- fixed 17GB VRAM footprint
- Wan2.1 + Tiny VAE
- ultra-long video generation
Probably useful for real-time, responsive video game environments or cutscenes that react to player choices instantly. spherelab.ai/diagdistill/
2 replies · 8 reposts · 68 likes · 6K views
Weiyang Liu @Besteuler
Diagonal distillation is based on an extremely simple and intuitive idea: more denoising steps at the beginning and fewer denoising steps later. With more temporal information accumulated, it becomes easier to generate the next frame (hence fewer denoising steps). I have really learned a lot about video generation from Jinxiu and the team in this project! 😄
Jinxiu Liu@JinxiuLiuAI

Excited to share: our paper DiagDistill is accepted to #ICLR2026! It’s a real‑time autoregressive video generation method that creates a 5‑second video in just 2.61 seconds! Paper: huggingface.co/papers/2603.09… Website: spherelab.ai/diagdistill/ Code: github.com/Sphere-AI-Lab/…

0 replies · 0 reposts · 12 likes · 2.5K views
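The decreasing-step idea described above can be sketched in a few lines. This is a toy illustration only: the linear decay, the function name, and the `max_steps`/`min_steps` parameters are assumptions for exposition, not DiagDistill's actual schedule.

```python
# Hypothetical sketch of a "diagonal" denoising-step schedule: early frames
# get many denoising steps, later frames fewer, because accumulated temporal
# context makes later frames easier to generate.

def diagonal_schedule(num_frames, max_steps=8, min_steps=1):
    """Linearly decay the per-frame denoising steps from max_steps to min_steps."""
    if num_frames == 1:
        return [max_steps]
    steps = []
    for i in range(num_frames):
        frac = i / (num_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        s = round(max_steps - frac * (max_steps - min_steps))
        steps.append(max(min_steps, s))
    return steps

print(diagonal_schedule(10))  # → [8, 7, 6, 6, 5, 4, 3, 3, 2, 1]
```

The total step budget is thus front-loaded: the first frames pay the full denoising cost, while later frames amortize it, which is where the real-time speedup would come from.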
Weiyang Liu @Besteuler
🚀 Excited to introduce POET-X, a scalable and highly memory-efficient algorithm for LLM pretraining. ✨ LoRA-level GPU memory, better-than-AdamW pretraining performance!

POET-X finally marries training stability (from POET's spectrum preservation) with practical scalability (from our new implementation and CUDA kernels). POET-X can pretrain billion-parameter LLMs (e.g., Llama-8B) on a single NVIDIA H100, where standard optimizers like AdamW run out of memory under the same settings.

We carefully reimplemented every computation step of POET (arxiv.org/pdf/2506.08001). POET-X combines many small checkpointing and parallelization tricks. While each may appear incremental, together they dramatically improve scalability and reduce memory usage by over 70% compared to the original POET.

The memory efficiency of POET-X comes from the unique parameter-efficient reparameterization (where sparsity comes in) of the weight-update rule. POET-X thus bridges the gap between parameter efficiency and memory efficiency.

Code is now public. Feel free to try it!
➡️ Paper: arxiv.org/pdf/2603.05500
💻 Code: github.com/Sphere-AI-Lab/…
🌐 Website: spherelab.ai/poetx
#AI #LLM #MachineLearning #DeepLearning
1 reply · 12 reposts · 55 likes · 8.2K views
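The spectrum preservation the tweet credits to POET can be illustrated with a minimal numpy sketch. Everything here (the Cayley-transform parameterization, matrix sizes, variable names) is an assumption for illustration, not POET-X's actual implementation or CUDA kernels.

```python
import numpy as np

# Sketch: the frozen pretrained weight W0 is updated only through a learned
# orthogonal matrix R, so the singular-value spectrum of W0 is preserved.

rng = np.random.default_rng(0)
n = 6
W0 = rng.standard_normal((n, n))  # frozen pretrained weight

# Parameterize R via the Cayley transform of a skew-symmetric matrix Q:
# R = (I - Q) @ inv(I + Q) is orthogonal whenever Q = -Q.T.
A = rng.standard_normal((n, n))
Q = (A - A.T) / 2                 # skew-symmetric "trainable" parameter
I = np.eye(n)
R = (I - Q) @ np.linalg.inv(I + Q)

W = R @ W0                        # reparameterized weight

# R is orthogonal, so W and W0 share the same singular values.
assert np.allclose(R.T @ R, I, atol=1e-8)
assert np.allclose(np.linalg.svd(W, compute_uv=False),
                   np.linalg.svd(W0, compute_uv=False), atol=1e-8)
```

Because any orthogonal R leaves the singular values of W0 untouched, training only R (rather than W directly) is what gives the stability property the thread calls spectrum preservation.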
Michael Black @Michael_J_Black
Big news: @Meshcapade is now part of Epic Games. This is a perfect match for our technology and team and I am super excited about what we will build together.

I want to thank our many supporters who have provided funding and/or advice on our journey from 2018 to today, including @MP_Innovation, @maxplanckpress, @MPI_IS, @PerceivingSys, @matrixvc, @dcstalder, @dianaberlin, @HV_Capital, Zuzanna Czapinska, LBBW VC, @GoodwaterCap, CLO Virtual Fashion (@itsclo3d), @grbradsk, @lucvincent, @JeffDean, @ballmatthew, Andrew Hamel, Bill O’Farrell, Ammar Zakiullah, @Nicolas_Keller, Alex Diehl, @goodwinlaw, YPOG, @NVIDIA Inception Program, @msft4startups, and many more!

Most importantly, I want to thank my co-founders @naureenmahmood and Talha Zaman and the whole @Meshcapade team. There is nothing more rewarding than working with great people who you like and trust to build products that customers love, using technology you believe in. Thank you, thank you, thank you.

Now, on to the next phase! mpg.de/26082348/max-p…
55 replies · 52 reposts · 559 likes · 87.7K views
Weiyang Liu reposted
Vincent Sitzmann @vincesitzmann
In my recent blog post, I argue that "vision" is only well-defined as part of perception-action loops, and that the conventional view of computer vision (mapping imagery to intermediate representations: 3D, flow, segmentation...) is about to go away. vincentsitzmann.com/blog/bitter_le…
43 replies · 157 reposts · 1K likes · 367K views
Michael Black @Michael_J_Black
Science is a team sport. I’ve been fortunate to play on some great teams with outstanding researchers. Today, I am honored to be admitted to the National Academy of Engineering.

I would not have received this recognition, however, without the dedication and brilliance of my students, postdocs, interns, collaborators, data team, software team, administrators, funding agencies, and government supporters. It is through the collective effort of many, with the support of society at large, that science and engineering make progress.

Ultimately, I am grateful to the taxpayer who gives their hard-earned money to support the advancement of knowledge. As a taxpayer myself, I think a lot about my responsibility to society. I will continue to work to deserve your support.

#NAEMember, @maxplanckpress, @mpi_is, @theNAEng, @PerceivingSys nae.edu/345149/NAENewC…
18 replies · 4 reposts · 152 likes · 9K views
Weiyang Liu @Besteuler
Orthogonal Finetuning (oft.wyliu.com; boft.wyliu.com) has the unique advantage of preventing catastrophic forgetting. Inspired by this property, we find that merging models within the orthogonal group can effectively reduce model conflicts and preserve both pretraining and downstream knowledge. This is our OrthoMerge framework.

The idea behind OrthoMerge is extremely simple. For OFT-tuned models, we first map the orthogonal adapters to the Lie algebra with the inverse Cayley transform and then perform merging there. This guarantees that the merged model differs from the pretrained model only up to an orthogonal transformation.

Even better, OrthoMerge can also be applied to non-OFT-tuned models. By solving the orthogonal Procrustes problem, we obtain the projection of the adapter onto the orthogonal group. OrthoMerge is then applied there, and the residual component can be merged using conventional merging methods. That means OrthoMerge can be used together with existing model-merging methods!

This is a great example of a simple yet effective idea. Great efforts by my PhD students Sihan Yang and Kexuan Shi. The project is already open-sourced, so feel free to give it a try!

Project: spherelab.ai/OrthoMerge/
Paper: arxiv.org/pdf/2602.05943
Code: github.com/Sphere-AI-Lab/…
5 replies · 48 reposts · 339 likes · 21.6K views
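The two merging routes described in the thread (inverse-Cayley merging in the Lie algebra for OFT adapters, and the orthogonal Procrustes projection for non-OFT adapters) can be sketched as follows. This is a hedged toy illustration, with plain averaging as the merge rule and made-up matrices; it is not the released OrthoMerge code.

```python
import numpy as np

def cayley(Q):
    """Map a skew-symmetric Q to an orthogonal matrix via the Cayley transform."""
    I = np.eye(Q.shape[0])
    return (I - Q) @ np.linalg.inv(I + Q)

def inverse_cayley(R):
    """Map an orthogonal R (without eigenvalue -1) back to a skew-symmetric Q."""
    I = np.eye(R.shape[0])
    return (I - R) @ np.linalg.inv(I + R)

rng = np.random.default_rng(1)
n = 5

def random_orthogonal(n):
    A = rng.standard_normal((n, n))
    return cayley((A - A.T) / 2)  # orthogonal by construction

R1, R2 = random_orthogonal(n), random_orthogonal(n)

# Route 1: merge OFT adapters in the Lie algebra, then map back.
# Averaging skew-symmetric matrices stays skew-symmetric, so the merged
# result is guaranteed to land back on the orthogonal group.
Q_merged = (inverse_cayley(R1) + inverse_cayley(R2)) / 2
R_merged = cayley(Q_merged)
assert np.allclose(R_merged.T @ R_merged, np.eye(n), atol=1e-8)

# Route 2: for a non-orthogonal adapter A, the orthogonal Procrustes
# solution (the nearest orthogonal matrix) is U @ Vt from the SVD of A;
# the residual A - U @ Vt can be merged with conventional methods.
A = rng.standard_normal((n, n))
U, _, Vt = np.linalg.svd(A)
R_proj = U @ Vt
residual = A - R_proj
assert np.allclose(R_proj.T @ R_proj, np.eye(n), atol=1e-8)
```

Merging in the Lie algebra rather than on the group is what makes the guarantee possible: linear combinations of skew-symmetric matrices remain skew-symmetric, whereas averaging orthogonal matrices directly would generally leave the group.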