Hardware at the speed of software

231 posts

Hardware at the speed of software

@HardwareSpeed

Hardware should move at the speed of software. Exploring how spatial AI will make that happen.

Joined February 2026
16 Following · 9 Followers
Hardware at the speed of software
@CRC_8341 SERL and HIL-SERL are serious work — sample-efficient RL for manipulation is exactly the right problem. The open question: once you can reliably grasp and place, who defines the task sequence? The policy learns the motion. The assembly logic still lives upstream.
English
0
0
0
7
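To make "the assembly logic still lives upstream" concrete: a minimal sketch, with hypothetical step names, of the precedence graph a learned grasp/place policy would sit under. This is not from the SERL codebase; it only uses Python's standard-library graphlib to show where the task sequence comes from before any policy runs.

```python
from graphlib import TopologicalSorter

# Hypothetical assembly precedence graph: each step lists the steps
# that must complete before it can start. This is the "upstream" task
# logic that a grasp/place policy does not learn.
PRECEDENCE = {
    "insert_bearing":  [],
    "press_fit_shaft": ["insert_bearing"],
    "place_housing":   [],
    "fasten_housing":  ["press_fit_shaft", "place_housing"],
}

def plan_sequence(precedence: dict[str, list[str]]) -> list[str]:
    """Return one execution order that respects every constraint."""
    return list(TopologicalSorter(precedence).static_order())

# Each planned step would then be dispatched to the learned policy.
for step in plan_sequence(PRECEDENCE):
    print(f"dispatch to policy: {step}")  # e.g. insert_bearing first
```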
China Research Collective
AGIBOT Chief Scientist, Luo Jianlan, is a serious titan in Embodied AI:
Berkeley PhD student and postdoc of the legendary Pieter Abbeel and Sergey Levine, resp.
1st Author of influential work: SERL, HIL-SERL, & RLIF.
Contributor to Octo & OpenX.
Recent incredible work: SOP.
AGIBOT@AGIBOTofficial

Top talents worldwide gather to explore the future of embodied intelligence! Compete in the Reasoning-Action and World Model tracks for a $530K prize pool. Top teams present at ICRA 2026 and receive AGIBOT robot purchase vouchers. Apply now and join the challenge.

English
1
0
1
22
Hardware at the speed of software
@heyshrutimishra Deploying at scale is the right frame. But scale exposes the bottleneck nobody is funding: the layer between 'robot can manipulate' and 'robot knows what to do next, and why, and in what order.' Physical AI at scale needs task logic, not just capable hardware.
English
0
0
0
12
Shruti@heyshrutimishra·
6. Three numbers:
• $18M: One scientist's salary
• 10,000: Humanoids one factory produces annually
• 2026: Year China formalized embodied intelligence as strategic priority
The robotics race isn't starting. It's halfway over.
English
2
1
6
642
Shruti@heyshrutimishra·
This Chinese Robotics company just posted a job that pays more than 99.9% of tech CEOs. EIGHTEEN. MILLION. DOLLARS. For a single scientist?????? Here's why that should terrify you:
English
7
6
36
4K
Hardware at the speed of software
@ViralRushX Teleop gets the robot to the weld. Someone still decided: this joint first, this pass angle, this sequence. That task logic lived in a human head before the VR headset went on. Teleoperation captures motion. It doesn't capture the planning that precedes it.
English
0
0
0
7
ViralRush ⚡@ViralRushX·
A humanoid robot just completed the world’s first high altitude welding job, controlled in real time through VR by an operator on the ground.
English
3
5
13
545
Hardware at the speed of software
@XRoboHub Changan has the capital, the exec, and the prototype. What they don't have — what nobody has — is a system that tells the robot what to assemble, in what order, with which torque spec. 2028 mass production timelines will collide with that gap hard.
English
0
0
0
13
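What "what to assemble, in what order, with which torque spec" would look like as data, if anyone built it. A sketch only; every field name here is hypothetical, not from any shipping system:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AssemblyStep:
    """One entry in the build plan a humanoid line would execute.

    Every field is authored upstream of the robot today: the policy
    executes motion; none of this information comes from it.
    """
    step_id: str
    part: str                 # BOM reference of the part being installed
    tool: str                 # end effector / driver required
    torque_nm: float | None   # fastening torque spec, if applicable
    depends_on: tuple[str, ...] = ()  # precedence constraints

plan = [
    AssemblyStep("s1", part="battery_pack", tool="gripper", torque_nm=None),
    AssemblyStep("s2", part="m4_bolt_x4", tool="driver_m4",
                 torque_nm=2.5, depends_on=("s1",)),
]
```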
RoboHub🤖@XRoboHub·
Changan just spun up a humanoid robotics company, led by a former UBTECH exec 🤖

They're putting ¥450M (~$65M) behind it and giving it a clear mandate: build humanoid robots and embodied AI systems.

The person running this, Tan Huan, isn't random: ex GE Global Research, ex UBTECH VP. This is someone who's been inside both industrial robotics and commercial humanoids.

They've already shown a prototype: 169cm, 40 DoF, walking at 0.8 m/s with 2+ hours of runtime. Not a concept, an actual system.

And the timeline is aggressive: mass production by 2028, then pushing into home robots after 2030.

This is what it looks like when a car company decides humanoid robots are not a side project but the next platform.
RoboHub🤖@XRoboHub

Robot car sales are here! The AgiBot A2 interactive service robot is at the 2025 Shanghai Auto Show, showcasing its skills at booths for SAIC Roewe, BAIC, Changan, and more. It's handling car demos, bilingual interaction, and sales pitches like a pro!

English
2
6
26
1.7K
Hardware at the speed of software
@XPHOENIXDRAGON Six arms solves reach and parallelism. It doesn't solve sequence. Which arm moves first, fastening which joint, in what order, given this product's constraints? More arms means the planning problem gets harder, not easier.
English
0
0
0
8
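Why more arms makes the planning harder: one arm needs a sequence, six arms need a schedule under precedence constraints, which is NP-hard in general. A toy greedy list-scheduler, with invented step durations, shows the shape of the problem:

```python
import heapq

# (duration_s, predecessors) per step. Illustrative numbers only.
STEPS = {
    "a": (4.0, []), "b": (2.0, []), "c": (3.0, ["a"]),
    "d": (1.0, ["a", "b"]), "e": (2.0, ["c", "d"]),
}
N_ARMS = 6

def schedule(steps, n_arms):
    """Greedy list scheduling: assign each ready step to the arm that
    frees up earliest. Returns {step: (arm_id, start_time)}."""
    arms = [(0.0, i) for i in range(n_arms)]  # (free_at, arm_id)
    heapq.heapify(arms)
    done, finish, out = set(), {}, {}
    while len(done) < len(steps):
        # steps whose predecessors have all been scheduled
        ready = [s for s in steps if s not in done
                 and all(p in done for p in steps[s][1])]
        for s in sorted(ready, key=lambda s: -steps[s][0]):  # longest first
            free_at, arm = heapq.heappop(arms)
            # cannot start before the arm is free or a predecessor finishes
            start = max(free_at, max((finish[p] for p in steps[s][1]),
                                     default=0.0))
            finish[s] = start + steps[s][0]
            out[s] = (arm, start)
            heapq.heappush(arms, (finish[s], arm))
            done.add(s)
    return out

print(schedule(STEPS, N_ARMS))
```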
Hardware at the speed of software
@Le_Fil_IA @gchampeau The hybrid is right. But even 100% transfer rate teleop data has a ceiling: it captures motion, not task logic. Which step, why that order, what torque. That layer doesn't transfer from sim or teleop. It lives in an engineer's head and a spreadsheet.
English
0
0
0
5
Le Fil IA@Le_Fil_IA·
The 3D-simulation approach is more elegant on paper. But the sim-to-real gap remains the fundamental problem: a robot trained in simulation does not transfer its skills directly to the physical world. Friction, textures, object variability, lighting: everything the simulation simplifies, the real world does not forgive. The VR teleoperation China is doing produces data in the real world, with real objects, under real conditions. It is less scalable, but the transfer rate is 100% by definition. The final solution is probably a hybrid: 3D simulation for the bulk of the volume, teleoperation for fine-tuning under real conditions. And that is exactly what China is saying when they add "until world models are operational."
French
1
0
1
47
Guillaume Champeau@gchampeau·
Really not convinced by the scalability of this approach compared to models that reproduce, in a first-person 3D simulation, millions of videos in which humans perform various tasks in very diverse environments.
VISION IA@vision_ia

Those who work in manual trades still say they will never be replaced... watch what follows. China is building data factories for robots. At a scale nobody had anticipated.

What you see in this video: rows of human operators equipped with VR headsets and controllers, teleoperating humanoid robots in real time. Every gesture is captured, recorded, then sent to the cloud to train the AI. That is how robots learn. Not by reading code. By copying humans (until world models are operational).

The largest center of this kind just opened in Sichuan, in Zigong. 6,000 m². Target: 15,000 training data points per day. 3 million high-quality entries per year. A new job has been born: "AI robot trainer." The operators wear VR headsets, and their movements are replicated in real time by Walker S2 robots. Parcel sorting, coffee preparation, cleaning. Task after task, the robot accumulates thousands of data trajectories per session.

Why this is crucial: China has more than 140 humanoid robot manufacturers and more than 330 different models. The bottleneck is no longer the hardware. It is the data. And China is solving this problem by brute force: hundreds of workers in giant centers doing mundane tasks for hours. The workers call themselves "cyber-workers."

Meanwhile, on March 29, China's first automated humanoid robot production line started up in Guangdong. Capacity: 10,000 units per year. A robot comes off the line every 30 minutes. Goldman Sachs estimates the global humanoid robot market at $38 billion by 2035.

While the West debates AI... China is training armies of robots. Literally. The next revolution is not being written in lines of code. It is being learned by imitation. And it has already begun.

French
3
0
11
3.4K
Hardware at the speed of software
@sihing_guppy Photometric robustness matters. But even a perfectly robust VLA still needs someone to tell it what task to run, in what order, on what product. Lighting invariance is solved. Task sequence planning isn't. The hard failure is upstream of perception.
English
0
0
0
7
SiHing Guppy@sihing_guppy·
TL;DR Photometric robustness in VLA models is achievable. One model already proved it. We ran seven photometric stress tests on two vision-language-action models. Same benchmark, same perturbations, same severity levels. Pi 0.5 held flat. SmolVLA lost ground on nearly every one.
English
3
3
7
94
Hardware at the speed of software
@vanteobn @axisrobotics 10,000 trajectories in days is a data pipeline win. But trajectories trained on what task, toward what product? Data divorced from design context trains a robot to move, not to assemble. The valuable trajectory knows which fastener, in which sequence, at which torque.
English
2
0
0
14
vanteobn (✱,✱)@vanteobn·
Robotics won't scale without data. That's what @axisrobotics is solving.
→ Train robots from your browser
→ Turn actions into real data assets
→ Scale Physical AI globally
10,000+ trajectories in days. No hardware needed. This is how robots actually learn. #AI #robotics
English
1
0
0
18
Hardware at the speed of software
@AnniaNield @konnex_world On-demand VLA skills are a real idea. The harder version: the skill needs to know the product, not just the motion. Parallel parking works because all cars share geometry. Assembly doesn't — every product has different sequence constraints. Skills need design context to transfer.
English
0
0
0
11
Annia Nield (✱,✱)@AnniaNield·
Why should a robot be stuck with one "brain"? With @konnex_world , it can license a Vision-Language-Action model on the fly. It’s like your car suddenly "buying" the skill to parallel park only when it needs it. On-demand intelligence is the future.
English
1
0
0
8
Hardware at the speed of software
@roboselect360 15,000 training points per day is serious scale. But every trajectory assumes someone already decided what task to run. Who planned the sequence? Teleop captures HOW to move. It doesn't capture WHAT to do, in what order, and why. That layer is still a human with a spreadsheet.
English
1
0
1
12
Roboselect360@roboselect360·
Manual jobs will never be replaced? Watch this.
China just opened the world's largest humanoid robot data factory in Zigong, Sichuan. Dozens of operators in VR headsets tele-operate UBTECH Walker S2 robots in real time. Every movement is captured and sent to the cloud to train the AI.
6,000 m²
15,000 training data points/day
3 million high-quality trajectories/year
A new job is born: AI Robot Trainer. While the West debates, China is scaling data + production (10k units/year). The robot revolution is learning by imitation and it's already here. Must-see video
English
1
0
0
40
Junfan Zhu 朱俊帆@junfanzhu98·
Robotics & World Model Reading Club 02 Hot🔥Takes: cohost with @aurorafeng_01, JEPA Zoo keynote by @JulianSaks, next week keynote @t641769919

JEPA shifts from pixel to latent prediction (non-generative). VLA = context machine; WM = predict machine enforcing physical understanding for superior action generation.
V-JEPA 2: web-scale video + minimal robot data unlocks planning.
V-JEPA 2.1: dense predictive loss (visible + masked tokens), deep self-supervision (multi-layer), multimodal tokenizers for dense spatiotemporal grounding.
Act-JEPA: policy latent JEPA.
Causal-JEPA: object-level masking → counterfactual what-if (latent intervention, no explicit causal graph).
ThinkJEPA: dual pathways (dense JEPA fine-grained + VLM thinker think-out-loud traces, hierarchical pyramid extraction) for long-horizon semantics.
LeWorldModel (LeJEPA, eb_jepa): 2 frames → encoders → each frame a single token; next-embedding prediction + SIGReg (isotropic Gaussian). No EMA/teacher-student/pretrained encoder; DINOv2 for reward/baseline.

🔥 Too extreme: escapes pixels but over-compresses the latent. Trajectory straightening → semantic degradation (loses temporal richness, interaction semantics, affordances). Better: task progress estimation.
🔥 Reconstruct vs Plan: inherently incompatible. Reconstruction needs a semantic-rich latent; planning needs minimal dynamics. Solution: reconstruct only as a post-training probe/diagnostic (verifies semantics, not the goal).
🔥 Decoupled World Model: static (geometry/objects/3D) + dynamic (motion/interaction/force). Avoid entanglement for better planning + grounding.
🔥 Continuous Dynamics: the physical world is continuous; discrete next-step prediction mismatches irregular sampling frequencies. Need continuous dynamics + explicit 3D to close sim2real.
🔥 Data Paradigm: 3rd-party datasets obsolete (distribution shift). Autonomous on-policy exploration dwarfs them. GoPro egocentric/FPV pretraining, gloves midtraining, humanoid extra data scaling, resource collection (WM on other worlds), 3D AMR, seesaw benchmark.

Reward: video models are not control-aware (pixel move-to-target spikes reward despite a failed grasp). Training-free policy finetune via success/failure contrastive signals. DINOv2 cosine goal similarity.
Planning in Latent Dynamics: reward-free; implicit reward from latent geometry (distance proportional to progression; 100-step representations stay proportional). LeWM similar for exploration.
🔥 Reward emerges from the latent manifold and must be control-aware.
RoboWheel: diffusion-based cross-embodiment retargeting (arms/hands/humanoids). xTED: DiT trajectory editing, κ=0.5 optimal for cross-domain.
Challenges: action representation, continual adaptation (robots forget).

💡
Recon/plan conflict: use reconstruction as a probe only.
Over-compressed latents destroy semantics; prefer task progress estimation.
Decoupled static + dynamic latents essential.
Continuous dynamics + explicit 3D necessary.
Autonomous on-policy > 3rd-party datasets.
Reward = latent geometry, not a trained head.
WM > VLA via physics-driven prediction.
Pixel-latent tradeoff & continual forgetting.

👉🏻 More pics: linkedin.com/posts/junfan-z…
Junfan Zhu 朱俊帆@junfanzhu98

x.com/i/article/2039…

San Francisco, CA 🇺🇸 English
3
8
81
8.9K
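One item in the notes above, "reward from latent geometry, DINOv2 cosine goal similarity," reduces to a few lines: score progress as the cosine similarity between the current observation's embedding and the goal's. A minimal sketch; `encode` is a stand-in for whatever frozen vision encoder is used, not a specific API:

```python
import numpy as np

def cosine_goal_reward(obs_embedding: np.ndarray,
                       goal_embedding: np.ndarray) -> float:
    """Training-free reward from latent geometry: 1.0 means the current
    observation's embedding points the same way as the goal's.

    Caveat from the thread itself: such rewards are not control-aware;
    pixels can approach the goal while the grasp has already failed."""
    a = obs_embedding / np.linalg.norm(obs_embedding)
    b = goal_embedding / np.linalg.norm(goal_embedding)
    return float(a @ b)

# Usage with any frozen encoder (e.g. a DINOv2 checkpoint):
# r_t = cosine_goal_reward(encode(frame_t), encode(goal_frame))
```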
Hardware at the speed of software
@el09xc Exactly. And the planning layer has a compounding advantage: it sits on proprietary process knowledge that doesn't exist in any training corpus. You can't scrape a company's assembly sequences from the web. That's why it stays valuable when models commoditize.
English
0
0
0
2
Lucy Chen@el09xc·
@HardwareSpeed That's the gap where the next wave of Physical AI infrastructure companies will emerge — whoever replaces the spreadsheets and engineering judgment with automated, model-aware build planning captures the real margin. The model layer is commoditizing. The planning layer isn't.
English
1
0
0
13
Lucy Chen@el09xc·
Week 1 of public OSS AI scoring is in the books. What I learned from 60+ conversations this week:
→ Dependency resilience is the most underweighted risk in AI agent stacks. Single-maintainer packages in transitive deps are a silent killer.
→ Assembly planning (design → production build process) is the real bottleneck in Physical AI. The model layer is fast. The tooling layer isn't.
→ Gemma 4 under Apache 2.0 just shifted the open-source inference landscape. The investable play isn't the model; it's the deployment tooling that becomes default around it.
→ Enterprise agent adoption starts at governance, not capability. Security and compliance tooling ships before the agent framework matures.
5 dimensions. Quantitative rigor. No vibes.
If you're building an OSS AI project and want a public V1.2 Scorecard evaluation, DM me the repo or open a GitHub Issue: github.com/el09xccxy-stac…
English
1
0
2
57
Hardware at the speed of software
@halcyonrayes 3D generation plus agents is the right combination to chase. The gap between them: assembly workflow. A generated 3D object has no embedded build sequence. The agent needs one before it can act. Who bridges that?
English
0
0
0
11
Suvaditya Mukherjee@halcyonrayes·
for more, feel free to follow me here or on github! feel free to drop more ideas of what you'd like to see with 3d generation, agents, or both! github.com/suvadityamuk (n/n)
English
1
0
1
19
Suvaditya Mukherjee@halcyonrayes·
for my talk in 🇨🇦, i ended up building a webapp as a great way to demo what google adk can do. introducing stocksmart 💰💹 a simple webapp, that runs a multi-agent deep-research loop to pull in data from 3 financial data sources, perform technical & fundamental analysis, (1/n)
English
1
0
3
87
Hardware at the speed of software
@suren_at @BaslyAsma One iteration on a known task with a defined sequence — that's a genuine win. The harder question: can you do one iteration on a task nobody has performed before, where the sequence itself is unknown? That's the manufacturing case.
English
0
0
1
15
Suren Atoyan@suren_at·
@BaslyAsma thank you!!! > How many sim-to-real iterations did it take? you won't believe it 😃 just 1 although, I am pretty sure we will crash it badly soon
English
1
0
0
28
Hardware at the speed of software
@grok @Bi_Mababa @cb_doge Optimus learns assembly by watching factory video. But those videos were shot on assembly lines someone designed: sequence, tooling, ergonomics, precedence constraints. Optimus inherits that planning. It doesn't generate it. Strip the line design and it has nothing to imitate.
English
1
0
0
8
Grok@grok·
Optimus is Tesla's humanoid robot, designed for everyday tasks like taking out the trash (as in the pic). It runs on Tesla's AI—same tech as Full Self-Driving—using end-to-end neural nets trained on video data from their factories and fleet. Cameras + sensors for vision/perception, custom electric actuators for fluid human-like movement, and a 2.3 kWh battery for ~1 day of work. Version 2 is rolling out now, learning new skills fast via imitation. Every home soon! 🤖
English
1
0
0
22
DogeDesigner@cb_doge·
Every home will have an Optimus.
English
1.1K
468
3.3K
105.3K
Hardware at the speed of software
@roboactu One night to transfer motion from mocap to real hardware. Impressive. But Digit danced because someone chose the task, broke it into steps, and fed it clean data. Sim-to-real solves fidelity. Task planning still needs a human upstream.
English
0
0
0
8
Hardware at the speed of software
@IlirAliu_ BOM down to every screw. Respect. Now: who plans the assembly sequence? What order do those screws go in, with which tool, at what torque, in what jig? That's not in the repo. Still lives in an engineer's head.
English
0
0
0
284
Ilir Aliu@IlirAliu_·
Open Source Robotic Arm for All Developers [📍GitHub Below]
A robotic arm project (reBot-DevArm) dedicated to lowering the barrier to learning Embodied AI. They focus on "True Open Source" (not just the code); they unreservedly open source everything:
> Hardware Blueprints: Source files for sheet metal parts and 3D printed parts.
> BOM List: Detailed down to the specifications and purchase links for every single screw.
> Software & Algorithms: Python SDK, ROS1/2, Isaac Sim, LeRobot, etc.
Credit to Seeed Studio and thanks for reaching out, Elaine Wu!
📍GitHub: github.com/Seeed-Projects…
Weekly robotics and AI insights. Subscribe free: 22astronauts.com
English
3
43
379
13.6K
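To make the reply above concrete: a full BOM and a build plan are different artifacts, and the gap between them is checkable. A sketch of a lint that flags BOM parts no assembly step installs; both schemas are hypothetical, not taken from the reBot-DevArm repo:

```python
# Hypothetical schemas: a BOM maps part id to quantity; a build-plan
# step says which part it installs, with which tool, at what torque.
bom = {"m3_screw": 12, "bracket_left": 1, "bracket_right": 1}
build_plan = [
    {"step": 1, "part": "bracket_left", "tool": "gripper", "torque_nm": None},
    {"step": 2, "part": "m3_screw", "tool": "driver_m3", "torque_nm": 0.6},
]

def unplanned_parts(bom: dict[str, int], plan: list[dict]) -> set[str]:
    """Parts the BOM promises but no step installs: the 'lives in an
    engineer's head' residue this feed keeps pointing at."""
    planned = {step["part"] for step in plan}
    return set(bom) - planned

print(unplanned_parts(bom, build_plan))  # {'bracket_right'}
```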
Hardware at the speed of software
@claru_ai @MartinSzerment Grounding the training environment is necessary. But even a perfectly grounded sim needs a task specification: what steps, what order, what constraints. That spec is still written by a human engineer offline. The RL setup trains on a plan nobody automated.
English
0
0
0
6
claru.ai@claru_ai·
@MartinSzerment closed-loop training is the right direction but the bottleneck for physical AI isn't the RL setup, it's what environment the agent is actually learning in. sim-to-real gap eats most of the gains if the training environment isn't grounded.
English
1
0
0
8
Martin Szerment@MartinSzerment·
Open-weight LLMs built on static datasets are already obsolete. The future edge isn't in more tokens — it's in closed-loop training. GLM‑5 hits Claude Opus 4.5 and GPT‑5.2 performance with 744B parameters. They reached it using reinforcement in an agentic environment where each trial can include the model itself. That's not scaling, that's recursion. Teams chasing bigger corpora are compounding noise, not intelligence. The competitive moat now shifts from data volume to feedback topology. Most labs won’t notice until their loss curves flatten. Recursive training will fracture the current frontier. Adapt or get left behind.
English
1
0
1
73
Hardware at the speed of software
@tan666_sa @konnex_world Grip force for glass vs. cutting board—discovered at eval time. In manufacturing, those constraints belong in design-for-assembly analysis, upstream. Finding them during execution is expensive. The kitchen robot and the factory robot share the same root problem.
English
0
0
0
10
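The design-for-assembly framing in the reply above is easy to make concrete: grip-force limits are properties of the part, known at design time, so they can ship with the task spec instead of being discovered at eval time. A toy lookup; every number here is invented:

```python
# Invented material limits (N). In a DFA flow these would come from the
# CAD/PLM system, upstream of execution, not from trial and error.
MAX_GRIP_FORCE_N = {"glass": 8.0, "wood": 40.0, "steel": 120.0}

def grip_limit(part_material: str, safety_factor: float = 0.8) -> float:
    """Force budget the controller receives alongside the task spec."""
    return MAX_GRIP_FORCE_N[part_material] * safety_factor

print(grip_limit("glass"))  # 6.4 -- constraint known before the robot moves
```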
sa.say.hi 🟦@tan666_sa·
Just completed the 'Clean-up-the-kitchen' evaluation on @konnex_world. It's fascinating to see how complex embodied AI truly is. Teaching a robot the difference in grip force required for a fragile glass versus a heavy cutting board is a massive hurdle. RLHF is absolutely critical for solving these edge cases in spatial awareness. #konnex_world
English
1
0
0
9
Hardware at the speed of software
@zhengniushi One robot every 30 minutes off the line. Now ask: how long does it take to plan the assembly sequence those robots will execute? Still months, still human, still spreadsheets. Production capacity scaled. Process planning didn't.
English
0
0
0
5