david yan

71 posts

@dzyan01

phd @PrincetonVL. undergrad @princetoncs

Joined March 2025
349 Following · 178 Followers
tender
tender@tenderizzation·
now that all the grandiose posts around “how to be good at life/research/etc” have been debunked as grifter ai slop id like to share my view that the diff between failure and success looks more like “i played too much league of legends and it ruined my life” vs “i also played too much league of legends but am just a slightly higher-functioning being so i ended up fine”
8
8
207
10.7K
Lukas Münzel
Lukas Münzel@lukasmunzel·
Huge personal update: I moved 5800 miles away from home to join @mirendil We're building towards what I think is the most exciting and impactful use of the models: accelerating scientific research. A wonderfully bright future lies ahead if we get this right :)
15
5
159
11.7K
david yan
david yan@dzyan01·
fun fact: dinov3's train set is around 10% ImageNet-1k, which is about a 130x up-weighting (over uniform sampling)
0
0
2
165
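A quick sanity check of the 130x figure above, as a minimal sketch. The dataset sizes are assumptions not stated in the post: roughly 1.28M images for the ImageNet-1k train split and roughly 1.689B images for the full training pool.

```python
# Assumed sizes (not given in the post): ImageNet-1k train split and
# the total size of the training pool it is mixed into.
IMAGENET_1K_TRAIN = 1_281_167
TOTAL_POOL = 1_689_000_000

# From the post: about 10% of training samples come from ImageNet-1k.
SAMPLED_FRACTION = 0.10


def upweighting(sampled_fraction: float, subset_size: int, pool_size: int) -> float:
    """Ratio of the actual sampling share to the share under uniform sampling."""
    uniform_fraction = subset_size / pool_size
    return sampled_fraction / uniform_fraction


factor = upweighting(SAMPLED_FRACTION, IMAGENET_1K_TRAIN, TOTAL_POOL)
print(f"up-weighting over uniform sampling: about {factor:.0f}x")
```

Under these assumed sizes the ratio lands close to the 130x quoted in the post; different size assumptions would shift it.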
david yan retweeted
Dustin Tran
Dustin Tran@dustinvtran·
personal news: i've joined Elorian as Chief Reasoning Architect. multimodal AGI is the most critical frontier as we move from the era of chatbots to coding agents to models that reason and act over the physical world. i'm really excited to design natively visual models across thinking, agents, architectures, and the systems stack with the amazing team at Elorian. i wish the best to everyone at xAI & SpaceX — driving posttraining was a unique experience with so many memorable stories. all the best to the team, and to Elon.
49
14
349
80.4K
david yan retweeted
Raj Ghugare
Raj Ghugare@GhugareRaj·
Does classical computation theory help explain the success of inference time compute in RL? We study this question in our #ICML2026 oral We prove that policies with higher inference compute solve and generalize to a larger set of tasks. Empirically, we show that such policies can outperform 5x larger ResNets. Website - rajghugare19.github.io/computation-rl… 🧵👇
2
8
35
6.5K
andrew gao
andrew gao@itsandrewgao·
@dzyan01 wow interesting, do u happen to have any data on degree conferrals by major? also I always found it sad how little hardware engineers make compared to software engineers
4
0
6
4.7K
andrew gao
andrew gao@itsandrewgao·
this isn't public but stanford CS degrees dropped by 42% YoY. Berkeley down 61%* only 260 degrees conferred. you can rule out: 1. major impaction/capping - stanford doesn't do this 2. class size - this was an abnormally large class (2k+) any theories? my thoughts below
andrew gao tweet media
85
49
914
326.8K
david yan retweeted
Adithya Murali
Adithya Murali@Adithya_Murali_·
@NVIDIA is working on one of the hardest problems in Physical AI so you don’t have to: generalist robotic pick-and-place. We are excited to introduce GraspGenX at #CVPR2026—a foundation model for robotic grasping that works out of the box for unknown robots, novel objects, and unseen environments. Unlike Vision-Language-Action (VLA) models or dedicated grasp networks that require expensive, embodiment-specific training, GraspGenX is cross-embodiment and works zero-shot. You simply pass a "robot prompt" alongside an image of the object to generate actions. 🚀 Key Highlights: 1) Scaling: Trained on over 2 Billion 6-DoF grasp rollouts entirely in physics simulation—a dataset size practically impossible to collect via real-world teleoperation. 2) Zero-Shot Transfer: Works out of the box for several common robot grippers widely used across the research community and industry. 3) Built for the Agentic Era: Features native MCP support, client-server architecture, and skills.md, allowing seamless integration into LLM/Agentic robotics workflows. 4) Full Pipeline Integration: Pair it with other open foundation models (like SAM3) and advanced motion solvers like cuRoboV2 for full deployment in entirely unknown environments. If you are currently executing pick-and-place with a VLA or WAM, you can use GraspGenX to generate sim-verified trajectory data and inject it into your pipeline. No need to waste precious real-world engineering hours on data collection for standard manipulation tasks. 🌐Website: graspgenx.github.io 💻Code: github.com/NVlabs/GraspGe… 📄Paper: arxiv.org/abs/2606.00998 📍CVPR Booth: Poster 619 on Jun 6 1:45 session at ExHall F This work was led by the incredible @BeiningH (Princeton), in collaboration with a phenomenal team at NVIDIA: @erwincoumans, @yu_wei_chao, @balakumar_, @clembow, and Stan Birchfield #CVPR2026
0
10
41
5.3K
Minh Nhat Nguyen
Minh Nhat Nguyen@menhguin·
@tenobrus @lu_sichu if you believe a simulation that can fool humans is possible, imo the only reasonable probability approaches 99%, since a reality can spin out infinite simulations either you believe it's possible, or it's not
1
0
5
331
Tenobrus
Tenobrus@tenobrus·
wow. scott's p(sim) is much higher than i would have thought
Tenobrus tweet media
54
18
582
35.2K
david yan
david yan@dzyan01·
@baaadas didn't know innates get topdecked if you have more than 10. saving this for my next clone run
0
0
0
152
Jiaming Song
Jiaming Song@baaadas·
time for some vacation stuff
Jiaming Song@baaadas

Yesterday was my last day at @LumaLabsAI. Over the last three years, I had the privilege of helping drive the company's transition from 3D AI to video generation and native multimodal foundation models. I am grateful to have worked alongside an extraordinary group of researchers, and I look forward to seeing the next chapter of the company's story unfold.

9
1
174
24.9K
david yan retweeted
Sergey Zakharov
Sergey Zakharov@ZakharovSergeyN·
Releasing RecGen: a collaboration between @ToyotaResearch, @toyota_europe, and @UvA_Amsterdam tackling a core 3D vision challenge: reconstructing complete multi-object scenes (parts, poses, textures, even occluded geometry) from just 1 to a few RGB-D views. Trained purely on synthetic data, RecGen achieves SOTA on real-world robotics and 6D pose benchmarks, handling occlusions, symmetry, and complex interactions. A step toward scalable, high-fidelity digital twins for robotics, and better evaluation and training of generalist policies. reconstruction-by-generation.github.io
3
35
222
27.1K
david yan
david yan@dzyan01·
I’d previously thought that single-view reconstruction would be tough with only synthetic data, but it turns out it’s not! Check out this very cool work applying procedural 3D data to *full* reconstruction.
david yan tweet media
Sergey Zakharov@ZakharovSergeyN

Releasing RecGen: a collaboration between @ToyotaResearch, @toyota_europe, and @UvA_Amsterdam tackling a core 3D vision challenge: reconstructing complete multi-object scenes (parts, poses, textures, even occluded geometry) from just 1 to a few RGB-D views. Trained purely on synthetic data, RecGen achieves SOTA on real-world robotics and 6D pose benchmarks, handling occlusions, symmetry, and complex interactions. A step toward scalable, high-fidelity digital twins for robotics, and better evaluation and training of generalist policies. reconstruction-by-generation.github.io

0
1
14
2.1K
Xindi Wu
Xindi Wu@cindy_x_wu·
Honored to receive the 2026 Apple Scholars in AIML PhD fellowship! 🍎 Extremely grateful to my advisor @orussakovsky and all the incredible mentors, collaborators and friends I’ve had throughout the journey. Excited to push toward more scalable and capable multimodal system! machinelearning.apple.com/updates/apple-…
Princeton Computer Science@PrincetonCS

Congrats to @cindy_x_wu on receiving an @Apple Scholars in AIML fellowship! 🍎 🎉 The fellowship recognizes doctoral students doing innovative research in machine learning and artificial intelligence. bit.ly/3OV0fyP

25
5
230
22.7K
david yan
david yan@dzyan01·
@holoday_ The baselines we use are wider than that (>4 cm), but you can always change the code to generate your own. You should definitely check out @_ilya_c's very great work on this (though they consider the unsupervised setting). arxiv.org/abs/2212.12324
2
0
3
368
Nathan Myers
Nathan Myers@holoday_·
@dzyan01 this is awesome. what baselines are you training on? i'm going to be working on a single camera where the only baseline comes from natural hand tremor (5mm). curious whether disparity at ~5mm baseline is even recoverable with existing stereo matchers...
1
0
0
471
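For a sense of scale on the baseline question above, here is a rough sketch using the standard rectified-stereo relation, disparity = focal length × baseline / depth. The focal length (1000 px) and scene depth (1 m) are illustrative assumptions, not values from the thread; only the 5 mm and 4 cm baselines come from the posts.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Disparity in pixels for a rectified stereo pair: d = f * B / Z."""
    return focal_px * baseline_m / depth_m


# Assumed camera and scene values, for illustration only.
FOCAL_PX = 1000.0  # hypothetical focal length in pixels
DEPTH_M = 1.0      # hypothetical object distance in meters

# Baselines mentioned in the thread: hand tremor (~5 mm) vs. a 4 cm rig.
for label, baseline_m in [("5 mm tremor", 0.005), ("4 cm rig", 0.04)]:
    d = disparity_px(FOCAL_PX, baseline_m, DEPTH_M)
    print(f"{label}: {d:.1f} px disparity at {DEPTH_M} m")
```

Since disparity scales linearly with baseline, the 5 mm case yields one eighth of the 4 cm disparity at any given depth, so the signal shrinks toward sub-pixel levels much sooner as objects get farther away.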
david yan
david yan@dzyan01·
Stereo depth is important in robotics, and relies heavily on synthetic data. But what actually makes for good synthetic data? In WMGStereo, we study dataset design and discover a powerful data recipe - just 500 samples of our data can match 40k Sceneflow samples! 🧵[1/7]
4
40
252
14.7K
david yan
david yan@dzyan01·
By collecting the best design choices from our study, we create a full-scale dataset, WMGStereo-150k. Our data is super sample efficient and scales well! [6/7]
david yan tweet media
1
0
5
898