Justin Bayer

1.9K posts

Justin Bayer
@usuallyuseless

Student, scholar, teacher & builder of learning autonomous systems.

Munich · Joined December 2008
954 Following · 716 Followers

Pinned Tweet
Justin Bayer@usuallyuseless·
World models of scenes (incl. dynamics) allow both SLAM and prediction for model-based control. Let me show you our recent work (ICLR 2021) where we demonstrate that with realistic drone data. Here is a video with ground truth, filtered locations and long-step predictions.
2 replies · 4 reposts · 25 likes · 0 views
Robots Digest 🤖@robotsdigest·
No pretrained encoder, no complex tricks. LeWorldModel shows how JEPA-based World Models can be trained end-to-end from raw pixels with just 2 loss terms, ~15M params, a single GPU, and ~48× faster planning than foundation-model world models.
[image attached]
14 replies · 43 reposts · 479 likes · 54.2K views
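The tweet doesn't spell out which two loss terms LeWorldModel uses, so here is only a generic illustration of a common JEPA-style pair — a latent prediction loss plus an anti-collapse variance floor. A minimal NumPy sketch; the function name and exact form are my assumptions, not the paper's:

```python
import numpy as np

def jepa_losses(pred, target, eps=1e-4):
    # Hypothetical "just 2 loss terms" in JEPA style:
    # 1) latent prediction loss: the predicted embedding of the next
    #    frame should match the target encoder's embedding;
    # 2) variance regularizer: keep each embedding dimension's std
    #    above a floor of 1 so the representation cannot collapse
    #    to a constant (no reconstruction of pixels needed).
    pred_loss = np.mean((pred - target) ** 2)
    std = np.sqrt(target.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, 1.0 - std))
    return pred_loss, var_loss
```

Note how a collapsed (constant) batch of embeddings drives the variance term toward its maximum while the prediction loss alone would happily stay at zero — which is exactly why the second term is there.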
Ying Wang@yingwww_·
What is a good latent space for world modeling and planning? 🤔 Inspired by the perceptual straightening hypothesis in human vision, we introduce temporal straightening to improve representation learning for latent planning. 📑: agenticlearning.ai/temporal-strai…
[image attached]
29 replies · 132 reposts · 781 likes · 230K views
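The linked write-up is truncated, so purely as an illustration of the straightening idea: curvature of a latent trajectory can be measured as the mean angle between successive displacement vectors, and "temporal straightening" drives that angle toward zero so linear extrapolation in latent space becomes a good predictor. A sketch under those assumptions, with the metric name my own:

```python
import numpy as np

def mean_curvature(z):
    # z: (T, d) latent trajectory. Curvature = average angle between
    # consecutive displacement vectors; a perfectly straightened
    # trajectory scores 0, a random walk scores about pi/2.
    d = np.diff(z, axis=0)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    cos = np.clip(np.sum(d[:-1] * d[1:], axis=1), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))
```

A representation trained to minimize this quantity over encoded video frames is easier to plan in, because future latents lie near the line through the recent past.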
Justin Bayer@usuallyuseless·
@RichardSSutton The people who worked on dual control in the 80s are the worst, eh.
0 replies · 0 reposts · 1 like · 671 views
Justin Bayer@usuallyuseless·
Bam.
DHH@dhh

@windjacker Written for a different world. I removed that clause. Go to town with your agents 🤘

0 replies · 0 reposts · 0 likes · 67 views
Justin Bayer@usuallyuseless·
@LucaAmb World models are not about modelling *the world*, they are about modelling *a world*. @SchmidhuberAI 1990, "Making the world differentiable...".
0 replies · 1 repost · 1 like · 341 views
Luca Ambrogioni@LucaAmb·
LLMs are world models; the idea that understanding the world is about simulating a 3D environment is extremely childish
55 replies · 14 reposts · 241 likes · 34.6K views
Justin Bayer retweeted
Susan Swartz@beadmomsw·
13 years. RIP my darling boy.
[image attached]
518 replies · 2K reposts · 20.7K likes · 1.9M views
Dominique Paul@DominiqueCAPaul·
Another annoying detail about the Jetson Thor: it doesn't support Isaac Sim. Why would you even build such a strong device fOr rOboTiCs but not make it compatible with essential software?
7 replies · 0 reposts · 27 likes · 2.7K views
Lucas Beyer (bl16)@giffmana·
rofl the Bayesians are doing it again! (Might still be an interesting paper)
Lucas Beyer (bl16) tweet media
18 replies · 7 reposts · 281 likes · 37.8K views
Justin Bayer retweeted
Chenhao Li@breadli428·
Extracting physics and dynamics (that are good enough for control) from state data alone is already super challenging, let alone from images. People often think reconstructing images is hard but ignore the difficulty of retaining the low-level dynamics details in VLMs, which are more essential for control.
Jim Fan@DrJimFan

Everyone's freaking out about vibe coding. In the holiday spirit, allow me to share my anxiety on the wild west of robotics. 3 lessons I learned in 2025.

1. Hardware is ahead of software, but hardware reliability severely limits software iteration speed. We've seen exquisite engineering arts like Optimus, e-Atlas, Figure, Neo, G1, etc. Our best AI has not squeezed all the juice out of this frontier hardware. The body is more capable than what the brain can command. Yet babysitting these robots demands an entire operations team. Unlike humans, robots don't heal from bruises. Overheating, broken motors, and bizarre firmware issues haunt us daily. Mistakes are irreversible and unforgiving. My patience was the only thing that scaled.

2. Benchmarking is still an epic disaster in robotics. LLM normies thought MMLU & SWE-Bench were common sense. Hold your 🍺 for robotics. No one agrees on anything: hardware platform, task definition, scoring rubrics, simulator, or real-world setups. Everyone is SOTA, by definition, on the benchmark they define on the fly for each news announcement. Everyone cherry-picks the nicest-looking demo out of 100 retries. We gotta do better as a field in 2026 and stop treating reproducibility and scientific discipline as second-class citizens.

3. VLM-based VLA feels wrong. VLA stands for "vision-language-action" model and has been the dominant approach for robot brains. The recipe is simple: take a pretrained VLM checkpoint and graft an action module on top. But if you think about it, VLMs are hyper-optimized to hill-climb benchmarks like visual question answering. This implies two problems: (1) most parameters in VLMs are for language & knowledge, not for physics; (2) visual encoders are actively tuned to *discard* low-level details, because Q&A only requires high-level understanding. But minute details matter a lot for dexterity. There's no reason for VLA's performance to scale as VLM parameters scale. Pretraining is misaligned.

A video world model seems to be a much better pretraining objective for robot policy. I'm betting big on it.

0 replies · 4 reposts · 41 likes · 11.8K views
Justin Bayer@usuallyuseless·
@liangpan_t Ask yourself why you want multimodal behaviour. Certainly not to achieve maximal return.
0 replies · 0 reposts · 0 likes · 55 views
Justin Bayer@usuallyuseless·
So what if the first hit on Google for the first question is a Stack Exchange question you asked 15 years ago?
artem kirsanov@ArtemKRSV

0 replies · 0 reposts · 1 like · 66 views
Justin Bayer retweeted
Will McGugan@willmcgugan·
Alrighty. The Toad is out of the bag. 👜🐸 Install toad to work with a variety of #AI coding agents with one beautiful terminal interface. Check out the blog post for more information... willmcgugan.github.io/toad-released/ I've been told I'm very authentic on camera. You just can't fake that kind of awkwardness.
60 replies · 87 reposts · 689 likes · 196K views
Justin Bayer@usuallyuseless·
@micahgoldblum Are you sure this is not just because you trained on pairs instead of sequences?
0 replies · 0 reposts · 0 likes · 85 views
Micah Goldblum@micahgoldblum·
For a long time, Yann LeCun and others believed in gradient-based planning, but it didn’t work very well … until now. Here’s how we did it using incredibly simple techniques. But first, an introduction to gradient-based planning: 🧵1/11
[image attached]
24 replies · 173 reposts · 1.4K likes · 158.6K views
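The thread itself explains the technique; as background, gradient-based planning means optimizing the action sequence directly by descending the gradient of a cost through a differentiable dynamics model — no policy network. A toy NumPy sketch with single-integrator dynamics and a hand-derived gradient, not the thread's actual method:

```python
import numpy as np

def rollout(s0, actions):
    # Toy differentiable dynamics: single integrator s_{t+1} = s_t + a_t.
    s = s0.copy()
    traj = [s.copy()]
    for a in actions:
        s = s + a
        traj.append(s.copy())
    return np.array(traj)

def plan(s0, goal, horizon=10, iters=100, lr=0.05, lam=1e-2):
    # Gradient-based planning: gradient descent on the action sequence
    # itself, minimizing terminal cost ||s_T - goal||^2 + lam * ||a||^2.
    actions = np.zeros((horizon, s0.shape[0]))
    for _ in range(iters):
        err = rollout(s0, actions)[-1] - goal
        # For the integrator, d s_T / d a_t = I for every t, so the
        # gradient of the terminal cost w.r.t. each a_t is 2 * err;
        # lam keeps individual actions small.
        actions -= lr * (2.0 * err + 2.0 * lam * actions)
    return actions

s0 = np.zeros(2)
goal = np.array([1.0, -2.0])
final_state = rollout(s0, plan(s0, goal))[-1]
```

With a learned neural dynamics model the analytic gradient above is replaced by autodiff through the rollout, which is where the thread's difficulties (and fixes) live.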
Mario Klingemann💧💦@quasimondo·
Ah, the one thing I do not miss from the good old GANs is mode collapse. Looks like I spent the last $30 on training a model that had died overnight.
[image attached]
3 replies · 0 reposts · 7 likes · 872 views
Mario Klingemann💧💦@quasimondo·
Sometimes I'm a late adopter these days. No idea why I didn't start using @runpod earlier, since watching 4 RTX 5090 blazing through the training of an old-school GAN is sooo satisfying.
5 replies · 0 reposts · 26 likes · 2.8K views
Justin Bayer@usuallyuseless·
@JerryHan_og I really love the force/twist dragging in the MuJoCo viewer. Any chance you will also support that?
1 reply · 0 reposts · 1 like · 248 views
Jerry Han@JerryHan_og·
MuJoCo on server. Three.js in browser. WebSocket in between. Physics and rendering don't need to live together. Streaming robot state, rendering in Three.js, physics running server-side.

Benefits:
- Swap physics engines without touching the renderer
- No browser memory ceiling
- Precise MuJoCo physics + beautiful Three.js visuals

PoC done. Unitree G1 walking demo synced across both views. Working toward a clean streaming protocol. #robotics #mujoco #threejs
5 replies · 23 reposts · 204 likes · 12.4K views
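The streaming protocol isn't published, so this is only a guess at its smallest piece: one frame of server→browser state, carrying simulation time and MuJoCo's generalized coordinates (qpos) for the Three.js client to apply. The function names and JSON layout are hypothetical:

```python
import json

def encode_state_frame(sim_time, qpos):
    # One message of a hypothetical streaming protocol: the server
    # samples qpos each physics step and ships it over the WebSocket;
    # the browser applies it to its copy of the model. JSON keeps the
    # PoC debuggable; rounding trims bandwidth.
    return json.dumps({"t": round(sim_time, 4),
                       "qpos": [round(float(q), 5) for q in qpos]})

def decode_state_frame(payload):
    msg = json.loads(payload)
    return msg["t"], msg["qpos"]

frame = encode_state_frame(1.2345678, [0.0, 0.1, -0.25])
t, qpos = decode_state_frame(frame)
```

A text frame like this is renderer-agnostic, which is what makes the "swap physics engines without touching the renderer" benefit work; a binary layout could replace it once the protocol stabilizes.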
Chaoyi Pan@ChaoyiPan·
Generative models (diffusion/flow) are taking over robotics 🤖. But do we really need to model the full action distribution to control a robot? We suspected the success of Generative Control Policies (GCPs) might be "Much Ado About Noising." We rigorously tested the myths. 🧵👇
16 replies · 91 reposts · 546 likes · 108.6K views
Justin Bayer@usuallyuseless·
@breadli428 Do you know Recurrent Environment Simulators (ICLR 2017)? It looks very related.
0 replies · 0 reposts · 0 likes · 19 views
Chenhao Li@breadli428·
@usuallyuseless Hey, thanks for following up! We tried different architectures and searched hyperparameters. RSSM also does great if you train it similarly with AR. Hope it helps!
[image attached]
1 reply · 0 reposts · 1 like · 28 views
Chenhao Li@breadli428·
🧠Model-Based RL shows promise but has seen limited success in real-world robotics. 🌎Introducing Robotic World Model, a black-box end-to-end neural dynamics model that bridges this gap, where policies are trained purely in imagination. @NeurIPSConf 🎯sites.google.com/view/roboticwo…
13 replies · 64 reposts · 368 likes · 101.2K views