Andy Zeng

472 posts

@andyzengineer

Building robot foundation models @GeneralistAI. Prev @GoogleDeepMind, PhD @Princeton. One experiment away from magic. ✗DMs → email

Joined September 2017
614 Following · 9.7K Followers
Andy Zeng @andyzengineer
Decades of hardware development led to strong, fast, and precise robot arms. The moment we can put general intelligence into these things, we can leverage the full spectrum of capabilities that they were always meant to capture. We’re betting on a future where robot hardware will continue to improve, and we intend to build the best models on top of the best hardware to push the frontier of capabilities and reliability. Here’s an old video from TossingBot that I think helps make this point extra clear. Industrial-grade repeatability started out as a crutch for dumb software -- but if you pair it with the right AI models, it becomes a superhuman advantage. There are factories where the same UR robot arms are still being used to precisely and repeatably build the same car parts, operating for 10 yrs straight without a single failure or shutdown. Humanoids will get there too (among other form factors). Not yet today, but eventually. And our models will be ready to meet them when they do.
Generalist @GeneralistAI

GEN-1 delicately arranges potato chips, and lifts a heavy bag of potatoes: from a gentle touch to a strong grip. Read more about GEN-1 in our blog posts in the comments below ↓

Andy Zeng reposted
Generalist @GeneralistAI
Robot reaches deep for screws. Read more about GEN-1 in our blog posts in the comments below ↓
Andy Zeng reposted
Ben Pekarek @ben_pekarek
Today marks the end of my first full week @GeneralistAI. Last Monday, I was given a challenge: use our GEN-1 model to teach a robot a task of my choosing, using the same no-code platform our customers use. I picked the ball-and-vase magic trick. It was one of my favorites as a kid, and it felt like the right mix of fun and surprisingly hard. A few days later, GEN-1 pulled it off. I left Friday having watched the robot nail it 14 times in a row. What’s wild is that even 4 months ago, if you told me you could go from idea to on-robot skill in a couple of days, I probably wouldn’t have believed you. Really excited to be building with an incredible team. Can’t wait to see what week two brings 🤖
Andy Zeng reposted
Generalist @GeneralistAI
GEN-1 performs a magic trick. Read more about GEN-1 in our blog post in the comments below ↓
Andy Zeng reposted
Generalist @GeneralistAI
GEN-1 cleans a whiteboard. Read more about GEN-1 in our blog post in the comments below ↓
Andy Zeng @andyzengineer
If anyone’s creating a benchmark for frontier physical AI models, this task is a great one to add to the roster. Sensorimotor end-to-end policies must exhibit the long-term visual memory to track and reason about where the object might be. It’s also harder to “cheat” on this task -- it can be difficult to do if you’ve got “gaps” in your model’s memory, e.g. low-frame-rate memory or coarse representations like language. GEN-1 nails it (also on the first try with an unseen object). @BerkayAntmen was really trying hard to fool the model here. Shout out to an excellent task from @RhodaAI.
Generalist @GeneralistAI

GEN-1 plays the 🐚 shell game, trained on just 1 hr of robot data. It also generalizes to unseen objects, like @BerkayAntmen's car keys. Physical AI models should be capable of benchmark tasks like this one. It's interesting for all the reasons @RhodaAI calls out: it requires visual memory, and the model must track the cups from the very start, at high frame rates. Interestingly, GEN-1 appears to exhibit a degree of "active perception." It's subtle; the hands can sometimes appear to "follow" the cups, with the model using its own movements to help attend to where it thinks the object should be. Read more about GEN-1 in our blog post in the comments below ↓

Andy Zeng reposted
Kevin Peterson @kevinmpeterson1
A week of contrasts... @sudo_robotics says why do IRL training? Generalist says why do sim? Very cool to see some of these examples coming together. Putting a dollar into a wallet definitely feels like a tail task I would not have expected robotics to solve for another 5(?) years at least! Kudos @GeneralistAI
Generalist @GeneralistAI

Every day for the past 2 weeks, we've been sharing something new from GEN-1, our latest milestone in scaling robot learning. This has never been done before. Going from ideas to skills in days (or faster) is what physical AI models should deliver. More coming. Stay tuned. Read more about it in our blog post in the comments below ↓

Andy Zeng reposted
Generalist @GeneralistAI
Every day for the past 2 weeks, we've been sharing something new from GEN-1, our latest milestone in scaling robot learning. This has never been done before. Going from ideas to skills in days (or faster) is what physical AI models should deliver. More coming. Stay tuned. Read more about it in our blog post in the comments below ↓
Andy Zeng reposted
Generalist @GeneralistAI
GEN-1 still works with lights off, and generalizes under harsh lighting conditions. The model uses raw video pixels to make decisions, so strong lighting changes can drastically alter its input distribution. Yet performance still holds. Why? GEN-1 was pre-trained on a massive, diverse dataset of different lighting conditions—everywhere from outdoor farms, to warehouses, from grocery stores, to dimly lit homes—it's already seen it all, and transfers this knowledge to new tasks. This is a glimpse of what we call Mastery, and is part of the reason these models can cross a new performance threshold. Read more about it in our blog post in the comments below 👇
Andy Zeng reposted
Pete Florence @peteflorence
Tbh, this is my favorite yet. Also, one of the best things is that I literally can’t even keep track of all the tasks happening in the offices anymore. I learn about this one day and boom we publish it on the internet. Amazing work by the full Generalist team to get here
Generalist @GeneralistAI

GEN-1 removes thumbtacks and papers from corkboard. Other tasks GEN-1 can do: youtube.com/playlist?list=… Read more about GEN-1, our latest foundation model for the physical world: generalistai.com/blog/apr-02-20…

Andy Zeng @andyzengineer
The first time we rolled a robot into a new warehouse, it didn’t perform as well as we expected. It took us an entire day of debugging before we realized it was something simple… the cameras were wired completely wrong. 🤦 Left camera to right gripper 🔀 right camera to left gripper. But what was interesting was that the model still kind of worked. In post-training data, left side = bin, right side = conveyor. But even when swapped, the model would still do the task -- just slightly worse. Enough to fool us into thinking it was something else for hours… until the moment we switched the cameras back. Then it worked great. This wasn’t the first time we’ve seen emergent ambidexterity. During a packing demo last year, we spotted the robot using the “wrong” hand to shake a USB brick out of a tight baggie. This was totally outside the post-training data (we watched all 17 hrs of it to double-check): 100% left hand. But for some reason, at inference time, it felt the need to use the right hand. Nothing in the model architecture could obviously explain this kind of invariance. If these models are headed where I think they are, imagine one day having a generalist “substrate of intelligence” where you can plug in any number of sensors and actuators, and the whole thing just springs to life. It wouldn’t matter how you wired it up. It would just work. That would be pretty cool.
Generalist @GeneralistAI

GEN-1 puts plushies into polybags, in a warehouse outside the lab in New Hampshire.
