Se June Joo
@joocjun

155 posts

🤖🦾 Research Engineer @RLWRLD_ai | M.S. @kaist_ai | B.S. in Math & CS @yonsei_u

Seoul, Republic of Korea · Joined September 2022
465 Following · 170 Followers
Se June Joo retweeted
Seonghyeon Ye @SeonghyeonYe
VLAs (from VLMs) ❌ => WAMs (from Video Models) ✅

Why WAMs?
1️⃣ World Physics: VLMs know the internet, but Video Models implicitly model the physical laws essential for manipulation.
2️⃣ The "GPT Direction": VLAs are like BERT (they rely heavily on task-specific post-training). WAMs are like GPT (pre-train & prompt), unlocking incredible zero-shot transfer!

What I want to see in 2026:
📈 Scaling Laws: much clearer scaling laws for robotics with WAMs than with VLAs.
🤝 Human-to-Robot Transfer: unlocking massive transfer capabilities using video as a shared representation space.
🤖 Zero-Shot Mastery: moving from short-horizon tasks to long-horizon, dexterous manipulation without task-specific demonstrations.

We recently open-sourced the checkpoints, training, and inference code. Dive into the research! 👇
📄 Paper: arxiv.org/abs/2602.15922
💻 Code: github.com/dreamzero0/dre…
🤗 HF: huggingface.co/GEAR-Dreams/Dr…
[image attached]
5 replies · 65 reposts · 516 likes · 74K views
Se June Joo retweeted
Joel Jang @jang_yoel
🚀 DreamZero training code is LIVE — train your own WAM (aka VAM)!
🔧 Replicate DROID from-scratch training
📊 Run evals in sim (DROID-Sim, MolmoSpaces, Polaris) & the real world (RoboArena)

No 2 GB200s for real-time inference? No problem — let NVIDIA carry that burden 💪. Sign up for our API and jump into prompting new tasks! (e.g. "fan the burger" 🍔, a totally unseen verb/task for DROID)

Coming soon: new embodiment/robot fine-tuning initialized from our DreamZero-AGIBot checkpoint. Stay tuned! 🤖
🔗 github.com/dreamzero0/dre…
[Quoted tweet] Seonghyeon Ye @SeonghyeonYe: the WAM announcement above.
2 replies · 17 reposts · 116 likes · 10.3K views
Se June Joo retweeted
Thomas Zhang @ThomasTCKZhang
🤖🤖 Very excited to finally share our new work, “Action Chunking and Exploratory Data Collection Yield Exponential Improvements in Behavior Cloning for Continuous Control”!

Everyone in robotics does action chunking, but why does it actually work? 🤔🤔 And what can theory tell us about the properties of the data we should be collecting for robotic behavior cloning? 🧵 1/N
[image attached]
5 replies · 61 reposts · 403 likes · 59.2K views
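A note for context, since the tweet assumes the reader knows the trick: action chunking has the policy regress a short horizon of H future actions per observation and execute them before re-querying. Below is a minimal, hypothetical PyTorch sketch of behavior cloning with chunked actions; every size and name is invented for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn

H, OBS_DIM, ACT_DIM = 8, 32, 7  # hypothetical chunk length and sizes

class ChunkPolicy(nn.Module):
    """Behavior-cloning policy that predicts H actions per observation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 256), nn.ReLU(),
            nn.Linear(256, H * ACT_DIM),
        )

    def forward(self, obs):
        # One forward pass emits the whole chunk of future actions.
        return self.net(obs).view(-1, H, ACT_DIM)

policy = ChunkPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

# One BC step: regress entire expert action chunks (synthetic data here).
obs = torch.randn(64, OBS_DIM)
expert_chunks = torch.randn(64, H, ACT_DIM)
loss = nn.functional.mse_loss(policy(obs), expert_chunks)
opt.zero_grad(); loss.backward(); opt.step()
```

One common intuition (not necessarily the paper's formal argument) is that executing chunks open-loop gives compounding per-step prediction errors fewer chances to feed back into the state distribution.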
Se June Joo retweeted
Pascale Fung @pascalefung
Introducing VL-JEPA: a Vision-Language Joint Embedding Predictive Architecture for streaming, live action recognition, retrieval, VQA, and classification tasks, with better performance and higher efficiency than large VLMs.
• VL-JEPA is the first non-generative model that can perform general-domain vision-language tasks in real time, built on a joint embedding predictive architecture.
• We demonstrate in controlled experiments that VL-JEPA, trained with latent-space embedding prediction, outperforms VLMs that rely on data-space token prediction.
• We show that VL-JEPA delivers significant efficiency gains over VLMs for online video-streaming applications, thanks to its non-autoregressive design and native support for selective decoding.
• We highlight that our VL-JEPA model, with a unified model architecture, can effectively handle a wide range of classification, retrieval, and VQA tasks at the same time.
by @Delong0_0 @MustafaShukor1 @TheoMoutakanni @willyhcchung Jade Lei Yu Tejaswi Kasarla @AllenBolourchi @ylecun @pascalefung
arxiv.org/abs/2512.10942
13 replies · 87 reposts · 557 likes · 89.4K views
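The second bullet is the crux: predict in latent space, not token space. The toy sketch below illustrates that distinction only; the modules and dimensions are invented, and this is not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D = 256  # hypothetical shared embedding width

video_encoder = nn.Sequential(nn.Linear(1024, D), nn.ReLU(), nn.Linear(D, D))
text_encoder = nn.Sequential(nn.Linear(768, D), nn.ReLU(), nn.Linear(D, D))
predictor = nn.Sequential(nn.Linear(D, D), nn.ReLU(), nn.Linear(D, D))

video_feats = torch.randn(32, 1024)  # stand-in for per-clip visual features
text_feats = torch.randn(32, 768)    # stand-in for target text features

# JEPA-style objective: predict the *embedding* of the target text from the
# video embedding, instead of autoregressively decoding its tokens.
pred = predictor(video_encoder(video_feats))
with torch.no_grad():
    target = text_encoder(text_feats)  # target encoder held fixed here;
                                       # JEPA variants often use EMA targets
loss = F.smooth_l1_loss(pred, target)
loss.backward()
```

Since nothing is decoded token by token, one forward pass scores a streaming clip; that non-autoregressive property is the efficiency argument in the announcement.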
Eddy Xu @eddybuild
today, we're releasing the largest egocentric dataset of physical jobs
- 400k action labels
- 2.5k clips
- 2x'd open-source dataset size
(download below)
109 replies · 184 reposts · 2.3K likes · 416.4K views
Se June Joo retweeted
Sourish Jasti @SourishJasti
1/ The future of general-purpose robotics will be decided by one major question: which flavor of data scales reasoning? Every major lab represents a different bet.

Over the past 3 months, @adam_patni, @vriishin, and I read the core research papers, spoke with staff at the major labs, and mapped the talent pool. This has completely changed how we think about general-purpose robotics.

Our paper builds intuition, step by step, across the 2025 frontier: from architectures → evals → data → industry dynamics. Each layer reveals a different bottleneck, but they all converge on one truth: data decides everything.

Our takeaways + process below 👇
If you want access to our graph (sound on), comment or DM me.
87 replies · 189 reposts · 836 likes · 179.4K views
Se June Joo retweeted
Saining Xie @sainingxie
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
[image attached]
57 replies · 324 reposts · 1.9K likes · 413.4K views
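The tweet doesn't spell out the recipe, but the name suggests replacing the VAE's learned latent space with the representation space of a frozen pretrained encoder, training only a decoder for reconstruction. A rough sketch under that assumption, with placeholder modules standing in for the real encoder:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: a frozen pretrained representation encoder
# (a DINO-style ViT in practice) and a small trainable pixel decoder.
D, P = 768, 3 * 32 * 32  # invented representation width and pixel count

encoder = nn.Linear(P, D)            # placeholder for the pretrained encoder
for p in encoder.parameters():
    p.requires_grad_(False)          # representations stay frozen

decoder = nn.Sequential(nn.Linear(D, 1024), nn.ReLU(), nn.Linear(1024, P))

imgs = torch.rand(16, P)             # synthetic flattened images
with torch.no_grad():
    z = encoder(imgs)                # latent = pretrained representation
recon = decoder(z)
loss = nn.functional.mse_loss(recon, imgs)
loss.backward()                      # only the decoder trains
```

Under this reading, a diffusion transformer would then model the frozen representation space directly instead of VAE latents.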
Se June Joo retweeted
C Zhang @ChongZitaZhang
Doing so-called AI+robotics:
- 30% of the time debugging real robot deployment
- 30% fixing simulation and looking at TensorBoard or wandb
- 30% in meetings and all kinds of non-research activities
- 10% spinning my brain to squeeze out a bit of intellectual contribution with AI
4 replies · 5 reposts · 128 likes · 6.3K views
Se June Joo retweeted
RLWRLD @RLWRLD_ai
Just saw this awesome demo by @kaysorin — really proud to share ALLEX in action at the OpenAI Seoul Open Event! Watching it move, interact, and demonstrate real-world dexterity was something special. 🤖🙌 Huge shoutout to everyone involved — pushing the boundaries of what’s possible with physical AI. #RLWRLD #OpenAI #Robotics #PhysicalAI #Dexterity #Innovation #Seoul
[Quoted tweet] Kay @kaysorin:
In Seoul tonight for the @OpenAI Korea launch event. Sora installations, robot high fives, imagegen photobooths, and the amazing Korean founders and artists behind them all.
0 replies · 3 reposts · 9 likes · 620 views
Se June Joo retweeted
Stone Tao @Stone_Tao
Open-sourcing a useful tool to calibrate camera extrinsics painlessly in a minute, no checkerboards! It's based on EasyHEC, using differentiable rendering to optimize extrinsics given object meshes + poses. Crazy that even a piece of paper works too. Code: github.com/StoneT2000/sim…
7 replies · 41 reposts · 244 likes · 43.8K views
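The underlying idea is to treat the camera extrinsics as learnable parameters and minimize a rendering loss by gradient descent. EasyHEC does this with differentiable mask rendering; the self-contained stand-in below swaps in point-reprojection error so it runs without a renderer, and every value is synthetic:

```python
import torch

# Synthetic intrinsics and 3D points (from a known object mesh + pose).
K = torch.tensor([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
pts_world = torch.randn(200, 3) * 0.3 + torch.tensor([0., 0., 2.0])

def axis_angle_to_R(rvec):
    # Rodrigues' formula: differentiable axis-angle -> rotation matrix.
    theta = rvec.norm() + 1e-8
    k = rvec / theta
    Kx = torch.zeros(3, 3)
    Kx[0, 1], Kx[0, 2] = -k[2], k[1]
    Kx[1, 0], Kx[1, 2] = k[2], -k[0]
    Kx[2, 0], Kx[2, 1] = -k[1], k[0]
    return torch.eye(3) + theta.sin() * Kx + (1 - theta.cos()) * (Kx @ Kx)

def project(rvec, t):
    cam = pts_world @ axis_angle_to_R(rvec).T + t
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

# "Observed" pixels synthesized from a hidden ground-truth pose.
with torch.no_grad():
    uv_obs = project(torch.tensor([0.1, -0.2, 0.05]),
                     torch.tensor([0.1, 0., 0.]))

rvec = torch.zeros(3, requires_grad=True)  # initial extrinsics guess
t = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([rvec, t], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = (project(rvec, t) - uv_obs).pow(2).mean()
    loss.backward()
    opt.step()
```

The same loop applies when the loss is the overlap between a rendered object mask and an observed segmentation mask, which is what removes the need for checkerboards.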
Se June Joo retweeted
Jianglong Ye @jianglong_ye
How to generate billion-scale manipulation demonstrations easily? Let us leverage generative models! 🤖✨
We introduce Dex1B, a framework that generates 1 BILLION diverse dexterous hand demonstrations for both grasping 🖐️ and articulation 💻 tasks using a simple C-VAE model.
15 replies · 82 reposts · 375 likes · 72.5K views
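For a sense of what "a simple C-VAE model" looks like in code: a minimal conditional VAE below, with all dimensions and the data interpretation invented for illustration (a real grasp generator would condition on object geometry and decode hand poses):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical dims: x = a flattened grasp/trajectory, c = object condition.
x_dim, c_dim, z_dim = 64, 16, 8

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim + c_dim, 128), nn.ReLU())
        self.mu, self.logvar = nn.Linear(128, z_dim), nn.Linear(128, z_dim)
        self.dec = nn.Sequential(
            nn.Linear(z_dim + c_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))

    def forward(self, x, c):
        h = self.enc(torch.cat([x, c], -1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam trick
        return self.dec(torch.cat([z, c], -1)), mu, logvar

model = CVAE()
x, c = torch.randn(256, x_dim), torch.randn(256, c_dim)
recon, mu, logvar = model(x, c)
kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
loss = F.mse_loss(recon, x) + 1e-3 * kl
loss.backward()

# Generation: sample z ~ N(0, I), condition on an object c, decode.
samples = model.dec(torch.cat([torch.randn(10, z_dim), c[:10]], -1))
```

Once trained, sampling z and decoding is cheap, which is what makes billion-scale generation plausible (presumably with physics checks filtering infeasible samples).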
Se June Joo retweeted
hyunji amy lee @hyunji_amy_lee
🚨 Want models to better utilize and ground on provided knowledge? We introduce Context-INformed Grounding Supervision (CINGS)! Training LLMs with CINGS significantly boosts grounding abilities in both text and vision-language models compared to standard instruction tuning.
[image attached]
2 replies · 38 reposts · 123 likes · 15.6K views
Se June Joo retweeted
Seohong Park @seohong_park
Q-learning is not yet scalable
seohong.me/blog/q-learnin…

I wrote a blog post about my thoughts on scalable RL algorithms. To be clear, I'm still highly optimistic about off-policy RL and Q-learning! I just think we haven't found the right solution yet (the post discusses why).
[image attached]
35 replies · 182 reposts · 1.2K likes · 168.2K views
Se June Joo retweeted
Sohee Yang @soheeyang_
🚨 New Paper 🧵
How effectively do reasoning models reevaluate their thoughts? We find that:
- Models excel at identifying unhelpful thoughts but struggle to recover from them
- Smaller models can be more robust
- Self-reevaluation ability is far from true meta-cognitive awareness
[image attached]
4 replies · 26 reposts · 130 likes · 10.1K views
Se June Joo retweeted
Younggyo Seo @younggyoseo
Excited to present FastTD3: a simple, fast, and capable off-policy RL algorithm for humanoid control -- with open-source code to run your own humanoid RL experiments in no time! Thread below 🧵
15 replies · 110 reposts · 560 likes · 130.5K views
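FastTD3 builds on TD3, whose critic update is compact enough to sketch: clipped double-Q targets plus target-policy smoothing. Toy networks and a synthetic batch below; the replay buffer, actor update, and whatever FastTD3 adds on top are omitted (the linked code has the real thing):

```python
import copy
import torch
import torch.nn as nn

obs_dim, act_dim = 45, 12  # hypothetical humanoid sizes

def mlp(i, o):
    return nn.Sequential(nn.Linear(i, 256), nn.ReLU(), nn.Linear(256, o))

actor = mlp(obs_dim, act_dim)
q1, q2 = mlp(obs_dim + act_dim, 1), mlp(obs_dim + act_dim, 1)
actor_t, q1_t, q2_t = map(copy.deepcopy, (actor, q1, q2))  # target nets

# One TD3-style critic update on a synthetic batch.
s, a, r, s2 = (torch.randn(256, obs_dim), torch.randn(256, act_dim),
               torch.randn(256, 1), torch.randn(256, obs_dim))
with torch.no_grad():
    noise = (0.2 * torch.randn(256, act_dim)).clamp(-0.5, 0.5)
    a2 = (torch.tanh(actor_t(s2)) + noise).clamp(-1, 1)  # target smoothing
    q_min = torch.min(q1_t(torch.cat([s2, a2], -1)),
                      q2_t(torch.cat([s2, a2], -1)))     # clipped double Q
    y = r + 0.99 * q_min
loss = ((q1(torch.cat([s, a], -1)) - y) ** 2).mean() + \
       ((q2(torch.cat([s, a], -1)) - y) ** 2).mean()
loss.backward()
```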
Se June Joo retweeted
Yuke Zhu @yukez
We took a short break from robotics to build a human-level agent to play Competitive Pokémon. Partially observed. Stochastic. Long-horizon. Now mastered with Offline RL + Transformers.

Our agent, trained on 475k+ human battles, hits the top 10% of Pokémon Showdown leaderboards. No search or heuristics, just sequence modeling.

Today, we're open-sourcing our Metamon platform with our algorithms, data, and environments:
🌐 metamon.tech

We are excited to see how our work accelerates research on building generally capable AI agents, and more importantly, inspires the next generation of Pokémon trainers!
10 replies · 64 reposts · 362 likes · 50.4K views
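To make "no search or heuristics, just sequence modeling" concrete: the toy below trains a causal transformer to predict the next token over interleaved observation/action tokens from logged games. It is a simplified stand-in; the actual Metamon setup uses offline RL on top of sequence models, and all sizes and the tokenization here are invented:

```python
import torch
import torch.nn as nn

VOCAB, D, CTX = 512, 128, 64  # invented vocabulary, width, context length

embed = nn.Embedding(VOCAB, D)
block = nn.TransformerEncoderLayer(D, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(block, num_layers=2)
head = nn.Linear(D, VOCAB)

tokens = torch.randint(0, VOCAB, (8, CTX))  # logged battle trajectories
mask = nn.Transformer.generate_square_subsequent_mask(CTX)  # causal mask
h = backbone(embed(tokens), mask=mask)
logits = head(h)

# Next-token cross-entropy (on all positions here for brevity; a policy
# would supervise only the action positions).
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
loss.backward()
```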
Joel Jang @jang_yoel
Some personal life update: I joined the @NVIDIAAI GEAR lab as a full-time Research Scientist last month (after one year as a research intern)! I’ll continue working on developing general-purpose robot foundation models. Stay tuned for some exciting updates!
[image attached]
29 replies · 3 reposts · 350 likes · 25.3K views
Se June Joo retweeted
Tairan He @TairanHe99
🚀 Can we make a humanoid move like Cristiano Ronaldo, LeBron James, and Kobe Bryant? YES! 🤖
Introducing ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills
Website: agile.human2humanoid.com
Code: github.com/LeCAR-Lab/ASAP
45 replies · 194 reposts · 1K likes · 257.4K views