Jon Barron

3K posts

@jon_barron

Principal research scientist at Google DeepMind. Synthesized views are my own.

SF Bay Area · Joined May 2010

1.4K Following · 33.4K Followers

Pinned Tweet

Jon Barron@jon_barron·Apr 28

Here's my 3DV talk, in chapters: 1) Intro / NeRF boilerplate. 2) Recent reconstruction work. 3) Recent generative work. 4) Radiance fields as a field. 5) Why generative video has bitter-lessoned 3D. 6) Why generative video hasn't bitter-lessoned 3D. 5 & 6 are my favorites.

English

104

812

117.5K

Jon Barron@jon_barron·2d

@AndrewSchmidtFC I think storage and network costs would be the biggest blocker for an OSS system

English

802

Schmidt@AndrewSchmidtFC·3d

Instead of letting Google or Apple own it, what if we made great open source software and hardware to capture and render these things, and built a public repository of the world? But, y’know. I dream.

Bilawal Sidhu@bilawalsidhu

The next generation of street view will be wildly immersive.

English

12.5K

Jon Barron@jon_barron·2d

@AjdDavison I bet your unlock will come from doing your LLM conversation and ideation inside of an IDE, instead of a chat exchange. Chat is cheap, unit tests are everything.

English

782

Andrew Davison@AjdDavison·2d

My experience: I have some AI conversations which end up getting overwhelming and confusing; and I might have fun vibecoding some ideas and demos and that helps a bit; but then soon I'm back to paper and pen and sketches and thinking hard while stuck like I always was. 2/2

English

Andrew Davison@AjdDavison·2d

Related... is anyone out there making progress on their *hardest* research problems using LLMs? The kind you've been wondering about for years, where it's hard to even describe what you're trying to do but just have a feeling there's something to find. Honest question: how? 1/2

kache@yacineMTB

you can outsource your thinking but you cannot outsource your understanding

English

12.5K

Jon Barron@jon_barron·2d

@AjdDavison Yes! You just talk to it. The last few months have been the most exciting time of my research career, I think

English

735

Jon Barron retweeted

Jiawei Yang@JiaweiYang118·3d

Two months ago, I vaguely posted a number: 0.9 FID, one-step, pixel space. Now it is 0.75, and can be even lower. Many wonder how. I thought it might end as a small FID prank: simple and deliberate. It started with one question: can FID be optimized directly, and what does it reveal? Introducing FD-loss.

English

147

877

184.7K
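
For reference, the FID mentioned in the post above is conventionally defined as the Fréchet distance between two Gaussians fit to Inception features of real and generated images. The symbols below are standard notation chosen here for illustration, not taken from the post, and this is the metric only, not the FD-loss the post introduces:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Here $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the feature mean and covariance of the real and generated sets, respectively.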

Jon Barron retweeted

Jiawei Yang@JiaweiYang118·3d

Bonus: If you want to see how FID itself can sometimes be misleading (and what reward hacking looks like with FD-loss), check out our appendix. This model: 2.09 FID, 660 IS.

English

7.1K

Jon Barron@jon_barron·3d

@thomasahle does this have a closed form? Feels like it should.

English

2.7K

Thomas Ahle@thomasahle·4d

Voronoi-diagram for optimal Gaussian Quantization (Lloyd-Max-GQ)

Español

384

32.3K
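
A minimal sketch of Lloyd-Max quantization for a 1-D standard Gaussian may help ground the exchange above. It is an assumption-laden illustration, not the method behind the posted figure, which shows a Voronoi diagram and is therefore presumably multi-dimensional. The function names and the initialization are hypothetical. Each iteration places cell boundaries at midpoints between levels, then moves every level to the conditional mean of its cell, which for a Gaussian has a closed form in terms of the pdf and cdf:

```python
import math


def pdf(x):
    """Standard normal density; evaluates to 0.0 at +/- infinity."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal cumulative distribution via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lloyd_max_gaussian(num_levels, iters=200):
    """Iterate Lloyd-Max updates for a scalar quantizer of N(0, 1)."""
    # Hypothetical initialization: evenly spaced levels inside [-2, 2].
    levels = [-2.0 + 4.0 * (i + 0.5) / num_levels for i in range(num_levels)]
    for _ in range(iters):
        # Nearest-neighbor rule: boundaries are midpoints between levels.
        mids = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
        edges = [-math.inf] + mids + [math.inf]
        # Centroid rule: E[X | lo < X < hi] = (pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo)).
        levels = [
            (pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo))
            for lo, hi in zip(edges, edges[1:])
        ]
    return levels


if __name__ == "__main__":
    print(lloyd_max_gaussian(2))
    print(lloyd_max_gaussian(4))
```

In the two-level case the fixed point is available in closed form, at plus and minus sqrt(2/pi). For more levels this sketch simply iterates the two update rules rather than solving for the levels directly.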

Jon Barron@jon_barron·3d

@XRarchitect @theworldlabs Please do, looking forward to it!

English

191

Ian Curtis@XRarchitect·3d

@jon_barron @theworldlabs I'll send over a live link to you directly once finished! You will be able to explore a bunch of worlds 🙏 we are getting close!

English

256

World Labs@theworldlabs·3d

60 million Gaussian splats. One massive dark fantasy world ready to explore! ⚔️ Created entirely with Marble, this persistent world is brought to life in-browser via our Spark 2.0 LoD system and Three.js. Fly through it yourself and learn more about how it was made 👇

English

203

31.3K

Jon Barron@jon_barron·3d

@rms80 @theworldlabs I see a video of a game and the post says it's "ready to explore" so I assume it's the game that is ready, not like the assets underneath the game

English

198

Ryan Schmidt@rms80·3d

@jon_barron @theworldlabs the post copy does actually only say "fly through it" 🫤

English

247

Jon Barron@jon_barron·3d

@theworldlabs cmon man you said it was "ready to explore", gimme the sword already!

English

475

World Labs@theworldlabs·3d

@jon_barron Game coming soon! 🤩 You can fly through the splat here (also linked in our blog via thread): wlt-ai-cdn.art/spark-2.0/2604…

English

886

Jon Barron@jon_barron·3d

@DFinsterwalder deep learning people succeeded despite their philosophizing, not because of it

English

David Finsterwalder | eu/acc@DFinsterwalder·3d

@jon_barron I get the joke. Most “world model” discourse is vapor. But “no under-the-hood questions” was not the advice that got neural nets out of crackpot territory and got Hinton the Nobel. Just saying.

English

116

Jon Barron@jon_barron·5d

"World Models" discourse will now be paused, pending the invention of terraforming. If we had named LLMs "Thought Models" we'd never get past the philosophical debates around what is *actually* happening under the hood. Just name your model according to its inputs or outputs.

English

130

13.7K

Jon Barron@jon_barron·5d

@JitendraMalikCV Yeah this definition holds up pretty well, maybe due to the MDP scoping the problem statement narrowly enough that "the world" gains a concrete technical meaning.

English

1.9K

Jitendra MALIK@JitendraMalikCV·5d

@jon_barron "World models" has a technical meaning - the transition model/dynamics model from Bellman/Kalman in the context of MDPs/ state space approach to control theory ~ 1960. I gave a talk on this history youtube.com/watch?v=9B4kka…

YouTube

English

295

57K
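
To make the control-theory sense of "world model" above concrete, here is a toy sketch of a transition (dynamics) model: a function mapping a state and an action to the next state, which can then be rolled forward over a sequence of actions. Everything here is a hypothetical illustration (a 1-D point mass with an explicit Euler step), not something taken from the thread or the linked talk:

```python
from typing import List, Tuple

# Hypothetical toy state: (position, velocity) of a 1-D point mass.
State = Tuple[float, float]


def transition(state: State, action: float, dt: float = 1.0) -> State:
    """Deterministic dynamics model: one explicit Euler step.

    The action is interpreted as an acceleration applied for dt seconds.
    """
    position, velocity = state
    return (position + velocity * dt, velocity + action * dt)


def rollout(state: State, actions: List[float], dt: float = 1.0) -> List[State]:
    """Predict the state trajectory produced by a sequence of actions."""
    states = [state]
    for action in actions:
        state = transition(state, action, dt)
        states.append(state)
    return states


if __name__ == "__main__":
    # Start at rest and accelerate at 1 unit for two steps.
    print(rollout((0.0, 0.0), [1.0, 1.0]))
```

In a stochastic MDP the same role is played by a distribution over next states given the current state and action, which this deterministic sketch leaves out.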

Jon Barron retweeted

David Baszucki@DavidBaszucki·6d

Earlier this year, we launched 4D Generation with mesh-based schemas like the car-5. Now, we're expanding to 30+ new schemas powered by Procedural Model Generation. This shift allows for fully functional and editable 3D assets—from submarines that dive to jet planes that fly. Here's a sneak peek of what's coming soon.

English

138

583

104.9K

Jon Barron@jon_barron·5d

@ryancjulian "velocity models" sounds pretty cool (and fundable)

English

807

Ryan Julian@ryancjulian·5d

@jon_barron "forward dynamics model" has been there for decades, but I guess that doesn't pass VC readability

English

Jon Barron@jon_barron·5d

@keenanisalive yeah the absence of dynamics in these models is huge. Vanilla dynamic 4D models also feel insufficient, if they're just "animated" rather than simulated. Really looking forward to physics getting into the mix more, that'll be satisfying.

English

295

Jon Barron@jon_barron·5d

@keenanisalive But video generation world models also arguably don't predict how the natural world behaves, they predict pixels that show that behavior. Very hard to nail down what a world model should be but I think the 3D models come slightly closer to obviously modeling "the world"

English

2.8K

Keenan Crane@keenanisalive·5d

A bunch of folks have been building machine learning models that turn a photograph into a 3D environment made of Gaussian splats (read: blobs of color floating in space). Cool technology & a very admirable effort. But marketing these as "world models" seems wrong. More accurate would be to say that they are a riff on the broader class of image-conditioned 3D generators, with a somewhat different flavor of condition image and output representation. As far as world modeling, they don't make great predictions about how the natural world looks or behaves. (Even for, say, a chair behind a table.) Again: I love the technology. Super cool creative stuff. I don't love the marketing and hype around it.

English

346

51K

Jon Barron@jon_barron·5d

@gentile_captial yeah loop closure seems hard. But making individual small environments seems doable, you just gotta chain them together.

English

144

Gentile_Capital@gentile_captial·5d

@jon_barron only works for open spaces. Try generating a dungeon crawler that gets you back to your original starting point. I've got some ideas for how to solve this, but alas I lack the compute for the effort.

English

211

Jon Barron@jon_barron·5d

Gen3D world models seem to work now. I humbly request that someone finally put some guns and swords into one of these systems and host a deathmatch. I will be there and I can bring some gen 3D luminaries, it'll be a blast, just set it up and ping me on discord.

SpAItial AI@SpAItial_AI

Echo-2 is a physically-grounded world model from which we can distill meshes, point clouds, or 3DGS scene representations. Directly usable in a myriad of downstream applications from gaming to training robots. Want to build your own world? Try it here: spaitial.ai

English

125

15.9K

Jon Barron retweeted

Massimiliano Viola@massiviola01·Apr 26

New fancy depth colormaps now on PyPI? Hell yeah! With the help of my friend Claude, I (vibe) coded a tiny library that implements the depth-to-RGB mapping from Vision Banana 🍌 and its generalization. pip install hilbertmap

English

101

8.3K
