Jon Barron

3K posts

@jon_barron

Principal research scientist at Google DeepMind. Synthesized views are my own.

SF Bay Area · Joined May 2010
1.4K Following · 33.4K Followers
Pinned Tweet
Jon Barron@jon_barron·
Here's my 3DV talk, in chapters: 1) Intro / NeRF boilerplate. 2) Recent reconstruction work. 3) Recent generative work. 4) Radiance fields as a field. 5) Why generative video has bitter-lessoned 3D. 6) Why generative video hasn't bitter-lessoned 3D. 5 & 6 are my favorites.
Jon Barron@jon_barron·
@AndrewSchmidtFC I think storage and network costs would be the biggest blocker for an OSS system
Jon Barron@jon_barron·
@AjdDavison I bet your unlock will come from doing your LLM conversation and ideation inside of an IDE, instead of a chat exchange. Chat is cheap, unit tests are everything.
Andrew Davison@AjdDavison·
My experience: I have some AI conversations which end up getting overwhelming and confusing; and I might have fun vibecoding some ideas and demos and that helps a bit; but then soon I'm back to paper and pen and sketches and thinking hard while stuck like I always was. 2/2
Jon Barron@jon_barron·
@AjdDavison Yes! You just talk to it. The last few months have been the most exciting time of my research career, I think
Jon Barron retweeted
Jiawei Yang@JiaweiYang118·
Two months ago, I vaguely posted a number: 0.9 FID, one-step, pixel space. Now it is 0.75, and can be even lower. Many wonder how. I thought it might end as a small FID prank: simple and deliberate. It started with one question: can FID be optimized directly, and what does it reveal? Introducing FD-loss.
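For context, FID is the Fréchet distance between two Gaussians fitted to image features: the squared distance between the means plus Tr(Σ_a + Σ_b − 2(Σ_a Σ_b)^(1/2)). Below is a minimal NumPy sketch of that distance. It is not the FD-loss from the tweet; the function name `frechet_distance` is hypothetical, and real FID is computed on Inception features rather than arbitrary arrays.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs have shape (n_samples, n_features) with n_features >= 2.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative
    # for covariance matrices (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

Two identical feature sets give a distance of (numerically) zero, and shifting every sample by a constant vector changes only the mean term.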
Jon Barron retweeted
Jiawei Yang@JiaweiYang118·
Bonus: If you want to see how FID itself can sometimes be misleading (and what reward hacking looks like using FD-loss), check out our appendix. This model: 2.09 FID, 660 IS.
Jon Barron@jon_barron·
@thomasahle does this have a closed form? Feels like it should.
Thomas Ahle@thomasahle·
Voronoi-diagram for optimal Gaussian Quantization (Lloyd-Max-GQ)
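As a point of reference, the Lloyd-Max iteration alternates between two steps: place cell boundaries midway between neighboring levels (the Voronoi step), then move each level to the conditional mean of its cell. The sketch below covers only the 1D standard normal case, a simplification of whatever the figure in the tweet shows; the function names are hypothetical.

```python
import math

def normal_pdf(x):
    """Standard normal density; evaluates to 0.0 at +/- infinity."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lloyd_max_gaussian(levels, iters=200):
    """Lloyd-Max reconstruction levels for a 1D standard normal source."""
    # Start from evenly spaced levels on [-2, 2].
    centers = [-2.0 + 4.0 * (i + 0.5) / levels for i in range(levels)]
    for _ in range(iters):
        # Voronoi step: cell edges are midpoints between adjacent levels.
        mids = [0.5 * (a + b) for a, b in zip(centers, centers[1:])]
        edges = [-math.inf] + mids + [math.inf]
        # Centroid step: conditional mean of N(0, 1) on each cell [lo, hi].
        centers = [
            (normal_pdf(lo) - normal_pdf(hi)) / (normal_cdf(hi) - normal_cdf(lo))
            for lo, hi in zip(edges, edges[1:])
        ]
    return centers
```

For two levels the iteration settles at ±sqrt(2/π), roughly ±0.798, which is the known optimal 1-bit quantizer for a unit Gaussian.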
Ian Curtis@XRarchitect·
@jon_barron @theworldlabs I'll send over a live link to you directly once finished! You will be able to explore a bunch of worlds 🙏 we are getting close!
World Labs@theworldlabs·
60 million Gaussian splats. One massive dark fantasy world ready to explore! ⚔️ Created entirely with Marble, this persistent world is brought to life in-browser via our Spark 2.0 LoD system and Three.js. Fly through it yourself and learn more about how it was made 👇
Jon Barron@jon_barron·
@rms80 @theworldlabs I see a video of a game and the post says it's "ready to explore" so I assume it's the game that is ready, not like the assets underneath the game
Jon Barron@jon_barron·
@theworldlabs cmon man you said it was "ready to explore", gimme the sword already!
Jon Barron@jon_barron·
@DFinsterwalder deep learning people succeeded despite their philosophizing, not because of it
David Finsterwalder | eu/acc@DFinsterwalder·
@jon_barron I get the joke. Most “world model” discourse is vapor. But “no under-the-hood questions” was not the advice that got neural nets out of crackpot territory and got Hinton the Nobel. Just saying.
Jon Barron@jon_barron·
"World Models" discourse will now be paused, pending the invention of terraforming. If we had named LLMs "Thought Models" we'd never get past the philosophical debates around what is *actually* happening under the hood. Just name your model according to its inputs or outputs.
Jon Barron@jon_barron·
@JitendraMalikCV Yeah this definition holds up pretty well, maybe due to the MDP scoping the problem statement narrowly enough that "the world" gains a concrete technical meaning.
Jitendra MALIK@JitendraMalikCV·
@jon_barron "World models" has a technical meaning - the transition model/dynamics model from Bellman/Kalman in the context of MDPs/ state space approach to control theory ~ 1960. I gave a talk on this history youtube.com/watch?v=9B4kka…
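In the MDP sense described here, a world model is the transition distribution P(s' | s, a). A minimal tabular sketch, assuming small discrete state and action spaces, estimates it from observed (state, action, next state) triples by counting; the function name and the uniform fallback for unseen pairs are illustrative choices, not something taken from the talk.

```python
import numpy as np

def fit_transition_model(transitions, n_states, n_actions):
    """Estimate P(s' | s, a) from (s, a, s_next) triples by counting."""
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s_next in transitions:
        counts[s, a, s_next] += 1
    totals = counts.sum(axis=2, keepdims=True)
    # Normalize observed (s, a) pairs; unseen pairs fall back to uniform.
    return np.where(totals > 0, counts / np.maximum(totals, 1), 1.0 / n_states)
```

Each slice `model[s, a]` is then a probability vector over next states, which is what a planner or controller would query.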
Jon Barron retweeted
David Baszucki@DavidBaszucki·
Earlier this year, we launched 4D Generation with mesh-based schemas like the car-5. Now, we're expanding to 30+ new schemas powered by Procedural Model Generation. This shift allows for fully functional and editable 3D assets—from submarines that dive to jet planes that fly. Here's a sneak peek of what's coming soon.
Ryan Julian@ryancjulian·
@jon_barron "forward dynamics model" has been there for decades, but I guess that doesn't pass VC readability
Jon Barron@jon_barron·
@keenanisalive yeah the absence of dynamics in these models is huge. Vanilla dynamic 4D models also feel insufficient, if they're just "animated" rather than simulated. Really looking forward to physics getting into the mix more, that'll be satisfying.
Jon Barron@jon_barron·
@keenanisalive But video generation world models also arguably don't predict how the natural world behaves, they predict pixels that show that behavior. Very hard to nail down what a world model should be but I think the 3D models come slightly closer to obviously modeling "the world"
Keenan Crane@keenanisalive·
A bunch of folks have been building machine learning models that turn a photograph into a 3D environment made of Gaussian splats (read: blobs of color floating in space). Cool technology & a very admirable effort. But marketing these as "world models" seems wrong. More accurate would be to say that they are a riff on the broader class of image-conditioned 3D generators, with a somewhat different flavor of condition image and output representation. As far as world modeling, they don't make great predictions about how the natural world looks or behaves. (Even for, say, a chair behind a table.) Again: I love the technology. Super cool creative stuff. I don't love the marketing and hype around it.
Jon Barron@jon_barron·
@gentile_captial yeah loop closure seems hard. But making individual small environments seems doable, you just gotta chain them together.
Gentile_Capital@gentile_captial·
@jon_barron only works for open spaces. Try generating a dungeon crawler that gets you back to your original starting point. I've got some ideas for how to solve this, but alas I lack the compute for the effort.
Jon Barron@jon_barron·
Gen3D world models seem to work now. I humbly request that someone finally put some guns and swords into one of these systems and host a deathmatch. I will be there and I can bring some gen 3D luminaries, it'll be a blast, just set it up and ping me on discord.
SpAItial AI@SpAItial_AI

Echo-2 is a physically-grounded world model from which we can distill meshes, point clouds, or 3DGS scene representations. Directly usable in a myriad of downstream applications from gaming to training robots. Want to build your own world? Try it here: spaitial.ai

Jon Barron retweeted
Massimiliano Viola@massiviola01·
New fancy depth colormaps now on PyPI? Hell yeah! With the help of my friend Claude, I (vibe) coded a tiny library that implements the depth-to-RGB mapping from Vision Banana 🍌 and its generalization. pip install hilbertmap