Tianqin Li @ CMU

49 posts

@Jack_Litq

CS PhD Student at Carnegie Mellon University

Joined April 2017

366 Following · 33 Followers

Pinned Tweet

Tianqin Li @ CMU @Jack_Litq · Jun 13

Thrilled to share our new CVPR 2025 paper “Perceptual Inductive Bias Is What You Need Before Contrastive Learning”! Inspired by David Marr's theory, we show that injecting object shape & surface cues before self-supervised pre-training leads to:
• 2× faster convergence in total pre-training time (Big Save on $$!)
• +2 mIoU on ADE20K & Cityscapes
• Stronger depth & OOD robustness
• Human‐level shape bias across 17 benchmarks
📖 Read our paper on arXiv: arxiv.org/abs/2506.01201…
#ComputerVision #SelfSupervisedLearning #CVPR2025 #ShapeBias

Tianqin Li @ CMU retweeted

Eric J. Michaud @ericjmichaud_ · Jan 13

How does scaling up neural networks change what they learn? Despite its importance, our understanding of this question remains nascent. I've written a long post reflecting on my model of neural scaling and its relationship to interpretability, etc.: ericjmichaud.com/quanta

Tianqin Li @ CMU retweeted

Google DeepMind @GoogleDeepMind · Aug 5

What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵

Tianqin Li @ CMU retweeted

AK @_akhaliq · Jul 8

StreamDiT: Real-Time Streaming Text-to-Video Generation
StreamDiT enables real-time text-to-video generation at 16 FPS on a single GPU (H100)

Tianqin Li @ CMU retweeted

DailyPapers @HuggingPapers · Jul 13

Meta introduces StreamDiT, a real-time streaming text-to-video generation model. It delivers continuous 512p video at 16 FPS on a single GPU. Unlock new interactive applications like live content creation and dynamic video editing.

Tianqin Li @ CMU retweeted

Hong-Xing (Koven) Yu @Koven_Yu · Jun 26

#ICCV2025 🤩3D world generation is cool, but it is cooler to play with the worlds using 3D actions 👆💨, and see what happens! — Introducing *WonderPlay*: Now you can create dynamic 3D scenes that respond to your 3D actions from a single image! Web: kyleleey.github.io/WonderPlay/ 🧵1/7

Tianqin Li @ CMU retweeted

Overworld @overworld_ai · Jul 12

🌟 Got multiple expert models and want them to steer your image/video generation? We’ve re-implemented the Product of Experts for Visual Generation paper on a toy example, and broken it down step by step in our new blog post! Includes:
- GitHub repo: Annealed Importance Sampling (AIS) implementation + an easy-to-follow Jupyter notebook
- Explanations, intuitions, & visualizations of the math!
Check out the blog post in the comments below! #AI #Diffusion #PoE #GenerativeModeling

Tianqin Li @ CMU retweeted

Kyunghyun Cho @kchonyc · Jul 13

late 1980s, @ylecun and @LeonBottou used amiga 1000 and a bespoke modem to implement and research artificial neural nets using SN-1. the legend was born.

Tianqin Li @ CMU retweeted

Hongyu Li @Hongyu_Lii · Jul 11

We interact with dogs through touch -- a simple pat can communicate trust or instruction. Shouldn't interacting with robot dogs be as intuitive? Most commercial robots lack tactile skins. We present UniTac: a method to sense touch using only existing joint sensors! [1/5]

Tianqin Li @ CMU retweeted

Paul Zhou @zhiyuan_zhou_ · Jul 12

Action chunking works really well in imitation learning, and is essential to learning good BC policies in robotics. Can/should we apply the same idea in RL? We find that RL in the action chunk space, when done right (we call it ✨Q-chunking ✨), can be highly efficient🧵👇

Tianqin Li @ CMU retweeted

Russ Tedrake @RussTedrake · Jul 9

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the technology, and to share a lot of details for how we're achieving it. youtube.com/watch?v=BEXFnr…

Tianqin Li @ CMU retweeted

Bolei Zhou @zhoubolei · Jul 12

Code is released!

Wayne Wu @wayne_wu_0503

🚀 URBAN-SIM is released! A large-scale robot learning platform for urban spaces, built on NVIDIA Omniverse. Train robots at scale in rich, interactive city environments. 🔗 github.com/metadriverse/u…
Key Features:
⚡️ High Efficiency: Thousands of FPS on a single GPU -- enabling fast robot training.
📈 Scalable Training: Add more GPUs, scale up performance (FPS) continuously.
🏙️ Rich Scene Context: Infinite scene generation -- supporting tasks like visual locomotion, navigation, VLA training, and robot-human-scene interaction.
🎮 Versatile Interfaces: Collect data via VR headset, racing wheel, keyboard, or mouse for imitation learning.
🧩 Ecosystem Compatibility: Built on NVIDIA Omniverse, IsaacSim, and PhysX.

Tianqin Li @ CMU retweeted

Yann LeCun @ylecun · Jul 12

The optimal batch size is 1 (For suitable definitions of "optimal")

Micah Goldblum @micahgoldblum

🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n

Tianqin Li @ CMU retweeted

Qiyang (Colin) Li @qiyang_li · Jul 12

Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! colinqiyangli.github.io/qc/ The recipe to achieve this is incredibly simple. 🧵 1/N

Tianqin Li @ CMU retweeted

Wuyang Chen @WuyangC · Jul 12

We have two workshops along with ICML next week. (Great support from SFU+UBC+Vector!)
July 14 (9am-5pm): sites.google.com/view/vancouver…
July 21 (9am-noon): sites.google.com/view/sfu-at-ic…
Please join and enjoy the talks!
Location: 515 W Hastings St, Vancouver, BC V6B 4N6 maps.app.goo.gl/kN6o9W87bbL5qC…

Tianqin Li @ CMU retweeted

Phillip Isola @phillip_isola · Jul 11

Our new work on adaptive image tokenization: Image -> T tokens
* variable T, based on image complexity
* single forward pass both infers T and tokenizes to T tokens
* approximates minimum description length encoding of the image

Shivam Duggal @ShivamDuggal4

Compression is the heart of intelligence From Occam to Kolmogorov—shorter programs=smarter representations Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, token budget T & target quality 𝜖 —KARL finds the smallest t≤T to reconstruct it within 𝜖🧵

Tianqin Li @ CMU retweeted

Sergey Levine @svlevine · Jul 11

Action chunking is a great idea in robotics: by getting a model to produce a short sequence of actions, it _just works better_ for some mysterious reason. Now it turns out this can help in RL too, and it's a bit clearer why: action chunks help explore and help with backups. 🧵👇

Tianqin Li @ CMU retweeted

DailyPapers @HuggingPapers · Jul 7

How Well Does GPT-4o Understand Vision? A comprehensive benchmarking framework for evaluating multimodal foundation models on standard computer vision tasks using established datasets.

Tianqin Li @ CMU retweeted

Sergey Levine @svlevine · Jul 3

Warm-start RL (WSRL) can learn to control a real robot in under 20 minutes! Deep RL is getting really fast. Warm-start from offline data + super-efficient online learning is increasingly making real world RL not just practical but pretty easy.

Paul Zhou @zhiyuan_zhou_

We tested WSRL (Warm-start RL) on a Franka Robot, and it leads to really efficient online RL fine-tuning in the real world! WSRL learned the peg insertion task perfectly with only 11 minutes of warmup and *7 minutes* of online RL interactions 👇🧵

Tianqin Li @ CMU retweeted

Sergey Levine @svlevine · Jul 3

For more, check out updated arxiv: arxiv.org/abs/2412.07762 updated website: zhouzypaul.github.io/wsrl

Tianqin Li @ CMU retweeted

Sean Kirmani @SeanKirmani · Jul 3

🤖🌎 We are organizing a workshop on Robotics World Modeling at @corl_conf 2025! We have an excellent group of speakers and panelists, and are inviting you to submit your papers with a July 13 deadline. Website: robot-world-modeling.github.io
