Tianqin Li @ CMU

49 posts

@Jack_Litq

CS PhD Student at Carnegie Mellon University

Joined April 2017
366 Following · 33 Followers
Pinned Tweet
Tianqin Li @ CMU (@Jack_Litq)
Thrilled to share our new CVPR 2025 paper “Perceptual Inductive Bias Is What You Need Before Contrastive Learning”! Inspired by David Marr's theory, we show that injecting object shape & surface cues before self-supervised pre-training leads to:
• 2× faster convergence in total pre-training time (Big Save on $$!)
• +2 mIoU on ADE20K & Cityscapes
• Stronger depth & OOD robustness
• Human-level shape bias across 17 benchmarks
📖 Read our paper on arXiv: arxiv.org/abs/2506.01201…
#ComputerVision #SelfSupervisedLearning #CVPR2025 #ShapeBias
0 replies · 0 retweets · 1 like · 109 views
Tianqin Li @ CMU retweeted
Eric J. Michaud (@ericjmichaud_)
How does scaling up neural networks change what they learn? Despite its importance, our understanding of this question remains nascent. I've written a long post reflecting on my model of neural scaling and its relationship to interpretability, etc.: ericjmichaud.com/quanta
38 replies · 165 retweets · 1.4K likes · 360.3K views
Tianqin Li @ CMU retweeted
Google DeepMind (@GoogleDeepMind)
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
812 replies · 2.6K retweets · 13.3K likes · 3.7M views
Tianqin Li @ CMU retweeted
AK (@_akhaliq)
StreamDiT: Real-Time Streaming Text-to-Video Generation
StreamDiT enables real-time text-to-video generation at 16 FPS on a single GPU (H100)
7 replies · 52 retweets · 326 likes · 43.7K views
Tianqin Li @ CMU retweeted
DailyPapers (@HuggingPapers)
Meta introduces StreamDiT, a real-time streaming text-to-video generation model. It delivers continuous 512p video at 16 FPS on a single GPU. Unlock new interactive applications like live content creation and dynamic video editing.
1 reply · 1 retweet · 3 likes · 597 views
Tianqin Li @ CMU retweeted
Hong-Xing (Koven) Yu (@Koven_Yu)
#ICCV2025 🤩3D world generation is cool, but it is cooler to play with the worlds using 3D actions 👆💨, and see what happens! — Introducing *WonderPlay*: Now you can create dynamic 3D scenes that respond to your 3D actions from a single image! Web: kyleleey.github.io/WonderPlay/ 🧵1/7
6 replies · 38 retweets · 187 likes · 57.4K views
Tianqin Li @ CMU retweeted
Overworld (@overworld_ai)
🌟 Got multiple expert models and want them to steer your image/video generation? We’ve re-implemented the Product of Experts for Visual Generation paper on a toy example, and broken it down step by step in our new blog post!
Includes:
- GitHub repo: Annealed Importance Sampling (AIS) implementation + an easy-to-follow Jupyter notebook
- Explanations, intuitions, & visualizations of the math!
Check out the blog post in the comments below!
#AI #Diffusion #PoE #GenerativeModeling
2 replies · 10 retweets · 23 likes · 7.4K views
Tianqin Li @ CMU retweeted
Kyunghyun Cho (@kchonyc)
late 1980s, @ylecun and @LeonBottou used amiga 1000 and a bespoke modem to implement and research artificial neural nets using SN-1. the legend was born.
18 replies · 71 retweets · 752 likes · 99.2K views
Tianqin Li @ CMU retweeted
Hongyu Li (@Hongyu_Lii)
We interact with dogs through touch -- a simple pat can communicate trust or instruction. Shouldn't interacting with robot dogs be as intuitive? Most commercial robots lack tactile skins. We present UniTac: a method to sense touch using only existing joint sensors! [1/5]
4 replies · 25 retweets · 115 likes · 15.9K views
Tianqin Li @ CMU retweeted
Paul Zhou (@zhiyuan_zhou_)
Action chunking works really well in imitation learning, and is essential to learning good BC policies in robotics. Can/should we apply the same idea in RL? We find that RL in the action chunk space, when done right (we call it ✨Q-chunking ✨), can be highly efficient🧵👇
4 replies · 22 retweets · 188 likes · 33.6K views
Tianqin Li @ CMU retweeted
Russ Tedrake (@RussTedrake)
TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the technology, and to share a lot of details for how we're achieving it. youtube.com/watch?v=BEXFnr…
8 replies · 105 retweets · 486 likes · 87.7K views
Tianqin Li @ CMU retweeted
Bolei Zhou (@zhoubolei)
Code is released!
Quoting Wayne Wu (@wayne_wu_0503):

🚀 URBAN-SIM is released! A large-scale robot learning platform for urban spaces, built on NVIDIA Omniverse. Train robots at scale in rich, interactive city environments.
🔗 github.com/metadriverse/u…
Key Features:
⚡️ High Efficiency: Thousands of FPS on a single GPU -- enabling fast robot training.
📈 Scalable Training: Add more GPUs, scale up performance (FPS) continuously.
🏙️ Rich Scene Context: Infinite scene generation -- supporting tasks like visual locomotion, navigation, VLA training, and robot-human-scene interaction.
🎮 Versatile Interfaces: Collect data via VR headset, racing wheel, keyboard, or mouse for imitation learning.
🧩 Ecosystem Compatibility: Built on NVIDIA Omniverse, IsaacSim, and PhysX.

0 replies · 3 retweets · 54 likes · 5.3K views
Tianqin Li @ CMU retweeted
Qiyang (Colin) Li (@qiyang_li)
Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! colinqiyangli.github.io/qc/ The recipe to achieve this is incredibly simple. 🧵 1/N
3 replies · 75 retweets · 369 likes · 48.2K views
Tianqin Li @ CMU retweeted
Phillip Isola (@phillip_isola)
Our new work on adaptive image tokenization: Image → T tokens
* variable T, based on image complexity
* single forward pass both infers T and tokenizes to T tokens
* approximates minimum description length encoding of the image
Quoting Shivam Duggal (@ShivamDuggal4):

Compression is the heart of intelligence.
From Occam to Kolmogorov: shorter programs = smarter representations.
Meet KARL: Kolmogorov-Approximating Representation Learning.
Given an image, token budget T & target quality 𝜖, KARL finds the smallest t ≤ T to reconstruct it within 𝜖 🧵

0 replies · 28 retweets · 203 likes · 15.3K views
Tianqin Li @ CMU retweeted
Sergey Levine (@svlevine)
Action chunking is a great idea in robotics: by getting a model to produce a short sequence of actions, it _just works better_ for some mysterious reason. Now it turns out this can help in RL too, and it's a bit clearer why: action chunks help explore and help with backups. 🧵👇
10 replies · 112 retweets · 686 likes · 58K views
Tianqin Li @ CMU retweeted
DailyPapers (@HuggingPapers)
How Well Does GPT-4o Understand Vision? A comprehensive benchmarking framework for evaluating multimodal foundation models on standard computer vision tasks using established datasets.
2 replies · 3 retweets · 13 likes · 1K views
Tianqin Li @ CMU retweeted
Sergey Levine (@svlevine)
Warm-start RL (WSRL) can learn to control a real robot in under 20 minutes! Deep RL is getting really fast. Warm-start from offline data + super-efficient online learning is increasingly making real world RL not just practical but pretty easy.
Quoting Paul Zhou (@zhiyuan_zhou_):

We tested WSRL (Warm-start RL) on a Franka Robot, and it leads to really efficient online RL fine-tuning in the real world! WSRL learned the peg insertion task perfectly with only 11 minutes of warmup and *7 minutes* of online RL interactions 👇🧵

4 replies · 61 retweets · 374 likes · 57.9K views
Tianqin Li @ CMU retweeted
Sean Kirmani (@SeanKirmani)
🤖🌎 We are organizing a workshop on Robotics World Modeling at @corl_conf 2025! We have an excellent group of speakers and panelists, and are inviting you to submit your papers with a July 13 deadline. Website: robot-world-modeling.github.io
3 replies · 37 retweets · 135 likes · 45.9K views