Tsung-Yi Lin

140 posts

@TsungYiLinCV

Principal Research Scientist @Nvidia | Ex-@Google Brain Team | Computer Vision & Machine Learning

Joined November 2018
366 Following · 2.5K Followers
Pinned Tweet
Tsung-Yi Lin @TsungYiLinCV
Honored that COCO received the Koenderink Prize at ECCV 2024. It’s been incredible to witness advancements driven by well-curated data over the past decade. I'm excited for the future of multi-modal understanding and generation—data will remain key, and we’re just getting started.
[image]
8 replies · 5 reposts · 147 likes · 11.4K views
Tsung-Yi Lin @TsungYiLinCV
@peteflorence Great read! Exciting developments ahead as we make physical AI a first-class citizen!!
2 replies · 0 reposts · 1 like · 1.9K views
Tsung-Yi Lin @TsungYiLinCV
@yen_chen_lin Hardware and data will inevitably scale. The real leverage now is in better CV representations that help us scale data faster. Those intermediates probably won’t survive into the final model—but they’ll accelerate the path there.
1 reply · 0 reposts · 4 likes · 105 views
Tsung-Yi Lin reposted
NVIDIA Robotics @NVIDIARobotics
Cosmos Policy just dropped for robotics. 🤖 Cutting-edge research is turning a world foundation model into a unified robot brain that can see, predict, and act—no extra action heads, no complicated control stack.
Read our blog on @HuggingFace ➡️ nvda.ws/3MfqiPX
Want to get hands-on with Cosmos (Reason, Predict, Policy, Cookbook)? Join the Cosmos Cookoff, sponsored by @nebiusai and @milestonesys ➡️ nvda.ws/4a12Z4f
6 replies · 64 reposts · 394 likes · 14.2K views
Tsung-Yi Lin reposted
Moo Jin Kim @moo_jin_kim
We release Cosmos Policy 💫: a state-of-the-art robot policy built on a video diffusion model backbone.
- Policy + world model + value function in one model
- No architectural changes to the base video model
- SOTA on LIBERO (98.5%), RoboCasa (67.1%), and ALOHA tasks (93.6%) 🧵👇
17 replies · 109 reposts · 863 likes · 146.9K views
Tsung-Yi Lin @TsungYiLinCV
Cosmos Reason 2 is already powering video analytics AI agents, autonomous vehicles, and robots, and works hand-in-hand with our newest Cosmos releases: Predict 2.5, Transfer 2.5-2B, and the NVIDIA GR00T N1.6 robot foundation model!
0 replies · 0 reposts · 0 likes · 208 views
Tsung-Yi Lin reposted
Max Li 李赵硕 @mli0603
Our theoretical upper bound reaches 0.611. We believe that scaling model capacity could further absorb and consolidate this knowledge, but we’re still far from solving the BEHAVIOR benchmark. We hope this work provides a strong and practical starting point for the community.
[image]
0 replies · 1 repost · 2 likes · 150 views
Tsung-Yi Lin reposted
NVIDIA AI Developer @NVIDIAAIDev
Today’s Cosmos Cookbook Special 🍽️ A recipe to post-train Cosmos Reason into a physics-savvy critic that judges whether generated videos obey real-world physics.
📖 Score videos for physical plausibility
📖 Detect physically inaccurate issues like impossible trajectories or bad collisions
📖 Incorporate physics-aware rewards into your generation or RL loops to keep the models grounded
Read the full recipe 📖 nvda.ws/4pTI0GZ
2 replies · 3 reposts · 37 likes · 2.5K views
Tsung-Yi Lin reposted
Fei-Fei Li @drfeifei
🎉 How do we measure the rapid progress of robotic learning and embodied AI research? The 1st BEHAVIOR Challenge results are out, and we're excited to see such strong performance on 50 challenging household tasks. Congrats to the winning teams!
🥇 Robot Learning Collective
🥈 Comet
🥉 SimpleAI Robot
Leaderboard: shorturl.at/xaAlU (1/N)
33 replies · 78 reposts · 557 likes · 107.1K views
Tsung-Yi Lin reposted
Marco Pavone @drmapavone
Excited to unveil @nvidia's latest work on #Reasoning Vision–Language–Action (#VLA) models: Alpamayo-R1!

Alpamayo-R1 is a new #reasoning VLA architecture featuring a diffusion-based action expert built on top of the #Cosmos-#Reason backbone. It represents one of the core technologies driving NVIDIA’s push toward Level 4 autonomy and robotaxis (nvidianews.nvidia.com/news/nvidia-ub…), as announced by Jensen Huang at #gtc DC last week.

📄 Paper: Alpamayo-R1 research.nvidia.com/publication/20…

We present:
- Architecture & Design: how to transform a VLM into a driving-ready reasoning VLA
- Chain of Causation Labeling: a new framework enabling reasoning-based learning
- Training Strategy: from internet-scale pre-training → AV-specific SFT → RL-based post-training
- Extensive Evaluation: from closed-loop simulation to real-world, on-vehicle testing

📈 Results: Alpamayo-R1 delivers significant performance gains over end-to-end baselines, especially in rare, safety-critical scenarios, all while maintaining real-time inference (99 ms end-to-end latency).

Coming soon: releases of model variants and reasoning metadata built on top of the Physical AI Dataset (huggingface.co/datasets/nvidi…), with more updates on the way. Stay tuned! 🙌

Huge thanks to Wenjie Luo and @yan_wang_9 (project co-leads); the @nvidia AV Research team (@iamborisi, @YurongYou, @xinshuoweng, @tianran_, @wenhaoding95, and many others); collaborators across @nvidia Research (@liu_mingyu, @visualyang, @PavloMolchanov, and many others); and the @nvidia AV Product team (Sarah Tariq, Patrick Liu, Jack Huang, and many more). Full contributor list in the Appendix. @NVIDIADRIVE @NVIDIAAI
10 replies · 39 reposts · 234 likes · 36.6K views
Tsung-Yi Lin reposted
NVIDIA AI Developer @NVIDIAAIDev
NVIDIA Cosmos open models made major progress. ✨
✅ Cosmos Predict 2.5 unifies text, image, and video world generation into one model that creates longer and more coherent simulations with improved grounding and efficiency.
✅ Cosmos Transfer 2.5 introduces precise, spatially controlled world transformations that are 3.5× smaller, faster, and higher in fidelity than before.
Together, these models push the boundaries of physical AI, enabling robots and agents to learn, reason, and operate in dynamically simulated worlds.
Read the @HuggingFace blog. 🔗 huggingface.co/blog/nvidia/co… #NVIDIAGTC
10 replies · 34 reposts · 176 likes · 12.4K views
William Fedus @LiamFedus
Today, @ekindogus and I are excited to introduce @periodiclabs. Our goal is to create an AI scientist.

Science works by conjecturing how the world might be, running experiments, and learning from the results. Intelligence is necessary, but not sufficient. New knowledge is created when ideas are found to be consistent with reality. And so, at Periodic, we are building AI scientists and the autonomous laboratories for them to operate.

Until now, scientific AI advances have come from models trained on the internet. But despite its vastness, it’s still finite (estimates are ~10T text tokens, where one English word may be 1-2 tokens). And in recent years the best frontier AI models have fully exhausted it. Researchers seek better use of this data, but as any scientist knows: though re-reading a textbook may give new insights, they eventually need to try their idea to see if it holds.

Autonomous labs are central to our strategy. They provide huge amounts of high-quality data (each experiment can produce GBs of data!) that exists nowhere else. They generate valuable negative results which are seldom published. But most importantly, they give our AI scientists the tools to act.

We’re starting in the physical sciences. Technological progress is limited by our ability to design the physical world. We’re starting here because experiments have high signal-to-noise and are (relatively) fast, physical simulations effectively model many systems, and, more broadly, physics is a verifiable environment. AI has progressed fastest in domains with data and verifiable results, for example in math and code. Here, nature is the RL environment.

One of our goals is to discover superconductors that work at higher temperatures than today's materials. Significant advances could help us create next-generation transportation and build power grids with minimal losses. But this is just one example: if we can automate materials design, we have the potential to accelerate Moore’s Law, space travel, and nuclear fusion.

We’re also working to deploy our solutions with industry. As an example, we're helping a semiconductor manufacturer that is facing issues with heat dissipation on their chips. We’re training custom agents for their engineers and researchers to make sense of their experimental data in order to iterate faster.

Our founding team co-created ChatGPT, DeepMind’s GNoME, OpenAI’s Operator (now Agent), the neural attention mechanism, and MatterGen; has scaled autonomous physics labs; and has contributed to some of the most important materials discoveries of the last decade. We’ve come together to scale up and reimagine how science is done.

We’re fortunate to be backed by investors who share our vision, including @a16z, who led our $300M round, as well as @Felicis, DST Global, NVentures (NVIDIA’s venture capital arm), @Accel, and individuals including @JeffBezos, @eladgil, @ericschmidt, and @JeffDean. Their support will help us grow our team, scale our labs, and develop the first generation of AI scientists.
[image]
427 replies · 438 reposts · 4.2K likes · 3.5M views
Tsung-Yi Lin reposted
NVIDIA Robotics @NVIDIARobotics
Facing data bottlenecks in your robotics workflows? Explore how #NVIDIACosmos world foundation models from #NVIDIAResearch can be post-trained for specific #PhysicalAI applications:
🔮 Cosmos Predict to simulate future scenarios.
🎨 Cosmos Transfer to create diverse synthetic environments.
💡 Cosmos Reason to enable advanced robotic decision-making.
Learn more 👉 nvda.ws/4osm5Xt
2 replies · 9 reposts · 46 likes · 3.6K views
Tsung-Yi Lin @TsungYiLinCV
🚀Earlier this year we launched Cosmos-Reason1 — and it just climbed to #1 on the new Physical Reasoning Leaderboard, released alongside V-JEPA 2! 🤗Try it out: huggingface.co/nvidia/Cosmos-…
Quoted tweet: NVIDIA AI Developer @NVIDIAAIDev
Ranked #1 on @Meta's Physical Reasoning Leaderboard on @huggingface for a reason. 👏 🔥 🏆 Cosmos Reason enables robots and AI agents to reason like humans by leveraging prior knowledge, physics, and common sense to intelligently interact with the real world. This state-of-the-art reasoning VLM excels in physical AI applications like:
📊 Data curation and annotation
🤖 Robot planning and reasoning
▶️ Video analytics AI agents
See the leaderboard → nvda.ws/4mLUmjd
Check out Cosmos Reason → nvda.ws/425mMfF
1 reply · 2 reposts · 14 likes · 1.9K views
Tsung-Yi Lin reposted
Hanzi Mao @hanna_mao
We build Cosmos-Predict2 as a world foundation model for Physical AI builders — fully open and adaptable. Post-train it for specialized tasks or different output types. Available in multiple sizes, resolutions, and frame rates.
📷 Watch the repo walkthrough youtube.com/watch?v=ibnVm6…
⚒️ Visit github.com/nvidia-cosmos/… for more #NVIDIACosmos #PhysicalAI
[YouTube video]
8 replies · 66 reposts · 278 likes · 31.8K views