jk
3.3K posts

@after_ephemera
ex @meta ar/vr&&privacy, @mysten_labs | bringing emerging tech to production | holistic systems, radical moderation, accelerated future

there is no hugging face for robotics data. no standardized pipeline for collecting, labeling, versioning, and training on real-world robot data at scale. no tooling that handles contact dynamics and material deformation well enough for industrial manipulation. no teleoperation infrastructure where human supervisor intervention automatically becomes training data. no vertical-specific manipulation datasets for any given industrial task.
the actual bottleneck in physical AI is the data and the infrastructure to generate it. and this is a structural problem. for language AI, training data was the internet: abundant, cheap, already labeled by human intent. for robotics, the gap between where foundation models are and where they need to be cannot be closed by deploying more robots.
three bets are being made right now:
simulation-first works brilliantly for locomotion. domain randomization has essentially solved quadruped walking in unstructured terrain. but it breaks down completely for manipulation. simulated cameras and grippers have no noise, blur, or friction error. real ones have all of it. cable insertion, fabric folding, and dexterous assembly are exactly where simulation fails.
teleoperation as data collection is the second move. deploy semi-autonomous robots, capture human-guided trajectories, iterate. theoretically sound. but the capital math is brutal and the execution evidence isn't there yet.
human video as proxy is the third. if robots could learn from watching humans, you tap unlimited data. the problem: human hand geometry and force feedback don't map onto robot actuators. you're learning the shape of motion without the physics that make it work.
what's actually working today: locomotion. narrow manipulation in structured environments. inspection and sensing, like quadrupeds doing thermal inspection. no general-purpose manipulation required.
the hardware race is loud, capital-intensive, winner-take-few. but the data infrastructure race is quiet, undercapitalized, wide open.

what are the chances this thing will be abandonware in a month? you can make something in a few days but will you commit to maintaining and improving it for the next few years?

We installed 2 new robots at a customer site today. From start to finish, it took under 3hrs. The robots then started doing real work for our customer, and generating revenue for us. We have entered a new era of robotics. Buckle up - things are going to move quickly.







