Justin Bayer

1.9K posts

Justin Bayer
@usuallyuseless

Student, scholar, teacher & builder of learning autonomous systems.

Munich · Joined December 2008
954 Following · 716 Followers

Pinned Tweet
Justin Bayer@usuallyuseless·
World models of scenes (incl. dynamics) allow both SLAM and prediction for model-based control. Let me show you our recent work (ICLR 2021) where we demonstrate that with realistic drone data. Here is a video with ground truth, filtered locations and long-step predictions.
2 replies · 4 reposts · 25 likes · 0 views
Robots Digest 🤖@robotsdigest·
No pretrained encoder, no complex tricks. LeWorldModel shows how JEPA-based World Models can be trained end-to-end from raw pixels with just 2 loss terms, ~15M params, a single GPU, and ~48× faster planning than foundation-model world models.
[image attached]
14 replies · 43 reposts · 479 likes · 54.2K views
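The tweet doesn't spell out which two loss terms LeWorldModel uses, so here is only a generic illustration of a common JEPA-style pair — a latent prediction loss plus an anti-collapse variance floor. A minimal NumPy sketch; the function name and exact form are my assumptions, not the paper's:

```python
import numpy as np

def jepa_losses(pred, target, eps=1e-4):
    # Hypothetical "just 2 loss terms" in JEPA style:
    # 1) latent prediction loss: the predicted embedding of the next
    #    frame should match the target encoder's embedding;
    # 2) variance regularizer: keep each embedding dimension's std
    #    above a floor of 1 so the representation cannot collapse
    #    to a constant (no reconstruction of pixels needed).
    pred_loss = np.mean((pred - target) ** 2)
    std = np.sqrt(target.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, 1.0 - std))
    return pred_loss, var_loss
```

Note how a collapsed (constant) batch of embeddings drives the variance term toward its maximum while the prediction loss alone would happily stay at zero — which is exactly why the second term is there.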
Ying Wang@yingwww_·
What is a good latent space for world modeling and planning? 🤔 Inspired by the perceptual straightening hypothesis in human vision, we introduce temporal straightening to improve representation learning for latent planning. 📑: agenticlearning.ai/temporal-strai…
[image attached]
29 replies · 132 reposts · 781 likes · 230K views
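The linked write-up is truncated, so purely as an illustration of the straightening idea: curvature of a latent trajectory can be measured as the mean angle between successive displacement vectors, and "temporal straightening" drives that angle toward zero so linear extrapolation in latent space becomes a good predictor. A sketch under those assumptions, with the metric name my own:

```python
import numpy as np

def mean_curvature(z):
    # z: (T, d) latent trajectory. Curvature = average angle between
    # consecutive displacement vectors; a perfectly straightened
    # trajectory scores 0, a random walk scores about pi/2.
    d = np.diff(z, axis=0)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    cos = np.clip(np.sum(d[:-1] * d[1:], axis=1), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))
```

A representation trained to minimize this quantity over encoded video frames is easier to plan in, because future latents lie near the line through the recent past.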
Justin Bayer@usuallyuseless·
@RichardSSutton The people who worked on dual control in the 80s are the worst, eh.
0 replies · 0 reposts · 1 like · 671 views
Justin Bayer@usuallyuseless·
Bam.
DHH@dhh

@windjacker Written for a different world. I removed that clause. Go to town with your agents 🤘

0 replies · 0 reposts · 0 likes · 67 views
Justin Bayer@usuallyuseless·
@LucaAmb World models are not about modelling *the world*, they are about modelling *a world*. @SchmidhuberAI 1990, "Making the world differentiable...".
0 replies · 1 repost · 1 like · 341 views
Luca Ambrogioni@LucaAmb·
LLMs are world models; the idea that understanding the world is about simulating a 3D environment is extremely childish
55 replies · 14 reposts · 241 likes · 34.6K views
Justin Bayer retweeted
Susan Swartz@beadmomsw·
13 years. RIP my darling boy.
[image attached]
518 replies · 2K reposts · 20.7K likes · 1.9M views
Dominique Paul@DominiqueCAPaul·
Another annoying detail about the Jetson Thor: it doesn't support Isaac Sim. Why would you even build such a strong device fOr rOboTiCs but not make it compatible with essential software?
7 replies · 0 reposts · 27 likes · 2.7K views
Lucas Beyer (bl16)@giffmana·
rofl the Bayesians are doing it again! (Might still be an interesting paper)
Lucas Beyer (bl16) tweet media
18 replies · 7 reposts · 281 likes · 37.8K views
Justin Bayer retweeted
Chenhao Li@breadli428·
Extracting physics and dynamics (that are good enough for control) from state data alone is already super challenging, let alone from images. People often think reconstructing images is hard but ignore the difficulty of retaining the low-level dynamics details in VLMs, which are more essential for control.
Jim Fan@DrJimFan

Everyone's freaking out about vibe coding. In the holiday spirit, allow me to share my anxiety on the wild west of robotics. 3 lessons I learned in 2025.

1. Hardware is ahead of software, but hardware reliability severely limits software iteration speed. We've seen exquisite engineering arts like Optimus, e-Atlas, Figure, Neo, G1, etc. Our best AI has not squeezed all the juice out of this frontier hardware. The body is more capable than what the brain can command. Yet babysitting these robots demands an entire operations team. Unlike humans, robots don't heal from bruises. Overheating, broken motors, and bizarre firmware issues haunt us daily. Mistakes are irreversible and unforgiving. My patience was the only thing that scaled.

2. Benchmarking is still an epic disaster in robotics. LLM normies thought MMLU & SWE-Bench were common sense. Hold your 🍺 for robotics. No one agrees on anything: hardware platform, task definition, scoring rubrics, simulator, or real-world setups. Everyone is SOTA, by definition, on the benchmark they define on the fly for each news announcement. Everyone cherry-picks the nicest-looking demo out of 100 retries. We gotta do better as a field in 2026 and stop treating reproducibility and scientific discipline as second-class citizens.

3. VLM-based VLA feels wrong. VLA stands for "vision-language-action" model and has been the dominant approach for robot brains. The recipe is simple: take a pretrained VLM checkpoint and graft an action module on top. But if you think about it, VLMs are hyper-optimized to hill-climb benchmarks like visual question answering. This implies two problems: (1) most parameters in VLMs are for language & knowledge, not for physics; (2) visual encoders are actively tuned to *discard* low-level details, because Q&A only requires high-level understanding. But minute details matter a lot for dexterity. There's no reason for VLA's performance to scale as VLM parameters scale. Pretraining is misaligned.

A video world model seems to be a much better pretraining objective for robot policy. I'm betting big on it.

0 replies · 4 reposts · 41 likes · 11.8K views
Justin Bayer@usuallyuseless·
@liangpan_t Ask yourself why you want multimodal behaviour. Certainly not to achieve maximal return.
0 replies · 0 reposts · 0 likes · 55 views
Justin Bayer@usuallyuseless·
So what if the first hit on Google for the first question is a Stack Exchange question you asked 15 years ago?
artem kirsanov@ArtemKRSV

0 replies · 0 reposts · 1 like · 66 views
Justin Bayer retweeted
Will McGugan@willmcgugan·
Alrighty. The Toad is out of the bag. 👜🐸 Install toad to work with a variety of #AI coding agents with one beautiful terminal interface. Check out the blog post for more information... willmcgugan.github.io/toad-released/ I've been told I'm very authentic on camera. You just can't fake that kind of awkwardness.
60 replies · 87 reposts · 689 likes · 196K views
Justin Bayer@usuallyuseless·
@micahgoldblum Are you sure this is not just because you trained on pairs instead of sequences?
0 replies · 0 reposts · 0 likes · 85 views
Micah Goldblum@micahgoldblum·
For a long time, Yann LeCun and others believed in gradient-based planning, but it didn’t work very well … until now. Here’s how we did it using incredibly simple techniques. But first, an introduction to gradient-based planning: 🧵1/11
[image attached]
24 replies · 173 reposts · 1.4K likes · 158.6K views
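The thread itself explains the technique; as background, gradient-based planning means optimizing the action sequence directly by descending the gradient of a cost through a differentiable dynamics model — no policy network. A toy NumPy sketch with single-integrator dynamics and a hand-derived gradient, not the thread's actual method:

```python
import numpy as np

def rollout(s0, actions):
    # Toy differentiable dynamics: single integrator s_{t+1} = s_t + a_t.
    s = s0.copy()
    traj = [s.copy()]
    for a in actions:
        s = s + a
        traj.append(s.copy())
    return np.array(traj)

def plan(s0, goal, horizon=10, iters=100, lr=0.05, lam=1e-2):
    # Gradient-based planning: gradient descent on the action sequence
    # itself, minimizing terminal cost ||s_T - goal||^2 + lam * ||a||^2.
    actions = np.zeros((horizon, s0.shape[0]))
    for _ in range(iters):
        err = rollout(s0, actions)[-1] - goal
        # For the integrator, d s_T / d a_t = I for every t, so the
        # gradient of the terminal cost w.r.t. each a_t is 2 * err;
        # lam keeps individual actions small.
        actions -= lr * (2.0 * err + 2.0 * lam * actions)
    return actions

s0 = np.zeros(2)
goal = np.array([1.0, -2.0])
final_state = rollout(s0, plan(s0, goal))[-1]
```

With a learned neural dynamics model the analytic gradient above is replaced by autodiff through the rollout, which is where the thread's difficulties (and fixes) live.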
Mario Klingemann💧💦@quasimondo·
Ah, the one thing I do not miss from the good old GANs is mode collapse. Looks like I spent the last $30 on training a model that had died overnight.
[image attached]
3 replies · 0 reposts · 7 likes · 872 views
Mario Klingemann💧💦@quasimondo·
Sometimes I'm a late adopter these days. No idea why I didn't start using @runpod earlier, since watching 4 RTX 5090 blazing through the training of an old-school GAN is sooo satisfying.
5 replies · 0 reposts · 26 likes · 2.8K views
Justin Bayer@usuallyuseless·
@JerryHan_og I really love the force/twist dragging in the MuJoCo viewer. Any chance you will also support that?
1 reply · 0 reposts · 1 like · 248 views
Jerry Han@JerryHan_og·
MuJoCo on server. Three.js in browser. WebSocket in between. Physics and rendering don't need to live together. Streaming robot state, rendering in Three.js, physics running server-side.

Benefits:
- Swap physics engines without touching the renderer
- No browser memory ceiling
- Precise MuJoCo physics + beautiful Three.js visuals

PoC done. Unitree G1 walking demo synced across both views. Working toward a clean streaming protocol. #robotics #mujoco #threejs
5 replies · 23 reposts · 204 likes · 12.4K views
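The streaming protocol isn't published, so this is only a guess at its smallest piece: one frame of server→browser state, carrying simulation time and MuJoCo's generalized coordinates (qpos) for the Three.js client to apply. The function names and JSON layout are hypothetical:

```python
import json

def encode_state_frame(sim_time, qpos):
    # One message of a hypothetical streaming protocol: the server
    # samples qpos each physics step and ships it over the WebSocket;
    # the browser applies it to its copy of the model. JSON keeps the
    # PoC debuggable; rounding trims bandwidth.
    return json.dumps({"t": round(sim_time, 4),
                       "qpos": [round(float(q), 5) for q in qpos]})

def decode_state_frame(payload):
    msg = json.loads(payload)
    return msg["t"], msg["qpos"]

frame = encode_state_frame(1.2345678, [0.0, 0.1, -0.25])
t, qpos = decode_state_frame(frame)
```

A text frame like this is renderer-agnostic, which is what makes the "swap physics engines without touching the renderer" benefit work; a binary layout could replace it once the protocol stabilizes.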
Chaoyi Pan@ChaoyiPan·
Generative models (diffusion/flow) are taking over robotics 🤖. But do we really need to model the full action distribution to control a robot? We suspected the success of Generative Control Policies (GCPs) might be "Much Ado About Noising." We rigorously tested the myths. 🧵👇
16 replies · 91 reposts · 546 likes · 108.6K views
Justin Bayer@usuallyuseless·
@breadli428 Do you know Recurrent Environment Simulators (ICLR 2017)? It looks very related.
0 replies · 0 reposts · 0 likes · 19 views
Chenhao Li@breadli428·
@usuallyuseless Hey, thanks for following up! We tried different architectures and searched hyperparameters. RSSM also does great if you train it similarly with AR. Hope it helps!
[image attached]
1 reply · 0 reposts · 1 like · 28 views
Chenhao Li@breadli428·
🧠Model-Based RL shows promise but has seen limited success in real-world robotics. 🌎Introducing Robotic World Model, a black-box end-to-end neural dynamics model that bridges this gap, where policies are trained purely in imagination. @NeurIPSConf 🎯sites.google.com/view/roboticwo…
13 replies · 64 reposts · 368 likes · 101.2K views