metr0x

613 posts


@metrox_eth

Monkey building robots. Co-founder @SHOW_ROBOTICS | Prev. Founder @SODAmerch | Prev. Ops lead, Support lead @SnapshotLabs

Thailand, deep in robots · Joined December 2024
507 Following · 551 Followers
Aurel Arnold @aurel_arnold
I use WebXR in the Quest’s built-in browser, which gives you controller poses you can stream over a websocket to your host. Then you need IK to map the controller poses to joint angles. If it’s a 6-DoF arm + gripper, it makes sense to separate the joints for position and orientation.
0
0
1
91
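A minimal sketch of the position/orientation split described in the post above, assuming the arm has a spherical wrist (the last three joint axes meet at one point). The JSON message layout, the `D6` wrist-to-tool offset, and the function names are hypothetical; WebXR itself only provides a position and an orientation quaternion per pose. The controller-to-robot frame transform is ignored here. The wrist center this computes would go to a 3-joint position solver, leaving the remaining joints to solve orientation.

```python
import json
import math

D6 = 0.10  # hypothetical distance from wrist center to tool tip, in metres


def quat_to_z_axis(x, y, z, w):
    """Tool z-axis (third column of the rotation matrix) for a quaternion."""
    n = math.sqrt(x * x + y * y + z * z + w * w)
    x, y, z, w = x / n, y / n, z / n, w / n  # normalise before use
    return (2 * (x * z + y * w), 2 * (y * z - x * w), 1 - 2 * (x * x + y * y))


def wrist_center(message, d6=D6):
    """Parse one pose message and back off d6 along the tool z-axis."""
    pose = json.loads(message)
    p, q = pose["position"], pose["orientation"]
    ax, ay, az = quat_to_z_axis(q["x"], q["y"], q["z"], q["w"])
    return (p["x"] - d6 * ax, p["y"] - d6 * ay, p["z"] - d6 * az)


# Example message in the assumed layout (identity orientation).
example = json.dumps({
    "position": {"x": 0.3, "y": 0.0, "z": 0.2},
    "orientation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0},
})
print(wrist_center(example))
```

The first three joints then only need to reach the wrist center, which keeps the position problem independent of the commanded orientation.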
Aurel Arnold @aurel_arnold
Collected 3h of data in ~5h wall time with VR teleop today. Honestly feels way better than leader arms. There's never anything in your way even with weird motions, and being able to leave an arm frozen when not in use is huge. What do you think of the setup? Tried to make it as comfortable as possible ;)
7
11
108
10.6K
metr0x @metrox_eth
Tonight: rigging the Waveshare Gripper B in MuJoCo. 4-bar linkages are hard. Claude Code is telling me for the third time that it's late and I should hit the sack. Apparently this happens to everyone now 😂
1
1
25
2.1K
metr0x @metrox_eth
@YuXiang_IRVL Awesome! Curious about your pi0.5 fine-tune setup. Did you start from the base PI checkpoint or from a robot-specific pretrained checkpoint (so101 or similar)? Trying to figure out the right starting point for a RoArm M3 stacking task.
1
0
2
193
Yu Xiang @YuXiang_IRVL
Among the policies we evaluated so far (ACT, DiT, SmolVLA, pi0, pi0.5), fine-tuned pi0.5 achieves the best performance on VLA-REPLICA. The trend is consistent with recent simulation benchmarks such as RoboLab. The policy behaviors in the real world: irvlutd.github.io/VLAReplica/sce…
Yu Xiang @YuXiang_IRVL

Strongly agree. In VLA-REPLICA irvlutd.github.io/VLAReplica/, we explicitly evaluate all three aspects:
• object location variation
• different object instances
• background clutter
We design the test scenes such that they are different from the demonstration distribution

4
19
154
18.4K
metr0x @metrox_eth
Added a sim page to the @SHOW_ROBOTICS workshop UI today. It's a work in progress. MuJoCo scene with six 25mm cubes laid out in a 2×3 grid. Two cameras (wrist + front). It plays back synthetic stacking trajectories so I can watch the arm pick and place before committing GPU time to a training run. If the motion looks wrong here, no point training on it. Most of the day went into inverse kinematics and a partial URDF implementation of the RoArm Gripper B. Next: decompose the Gripper B mesh into separate STLs (base, jaws, linkage) and rig the six pivots so the jaws actually open and close in sim.
2
10
44
2.1K
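A rough sketch of how the cube grid from the post above could be written as MJCF, using only the Python standard library to build the XML. The 25 mm cube size and the 2×3 layout come from the post; the spacing, mass, floor size, and all names are assumptions, and the arm and the two cameras are left out.

```python
import xml.etree.ElementTree as ET

CUBE_SIDE = 0.025   # 25 mm cubes, from the post
SPACING = 0.06      # assumed centre-to-centre spacing, in metres
ROWS, COLS = 2, 3   # 2x3 grid, from the post


def build_scene_xml():
    """Return an MJCF string with a floor and six free-floating cubes."""
    root = ET.Element("mujoco", model="cube_grid")
    world = ET.SubElement(root, "worldbody")
    ET.SubElement(world, "geom", name="floor", type="plane", size="0.5 0.5 0.01")
    half = CUBE_SIDE / 2  # MuJoCo box sizes are half-extents
    for r in range(ROWS):
        for c in range(COLS):
            body = ET.SubElement(
                world, "body", name=f"cube_{r}_{c}",
                pos=f"{c * SPACING:.3f} {r * SPACING:.3f} {half:.4f}",
            )
            ET.SubElement(body, "freejoint")  # lets the cube move freely
            ET.SubElement(body, "geom", type="box",
                          size=f"{half} {half} {half}", mass="0.01")
    return ET.tostring(root, encoding="unicode")


scene_xml = build_scene_xml()
print(scene_xml)
```

The resulting string should be loadable with the MuJoCo Python bindings (`mujoco.MjModel.from_xml_string`), which is not done here so the sketch stays dependency-free.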
metr0x @metrox_eth
Claude Code feels like a superpower until you hit the limit 😀
0
1
5
251
metr0x @metrox_eth
Yeah, we are still early and nothing is standardized. An arXiv paper is not a good format for documenting dataset conditions, but it’s often the only source of truth. I would say the chance of mismatching something is pretty much 100% 😀 It’s also hard to know what a pre-trained expert is actually capable of. Any particular edge use case is a coin flip. For example, I am not seeing many VLA cube-stacking demos, SOTA is not clear, teleop is not fully solved, and force feedback on the leader / torque control on the motors is a must for delicate tasks. We are building at the frontier.
1
0
0
23
Nurvai - The Data Layer for Physical AI
@metrox_eth Interesting how the bottleneck ended up being systems engineering rather than the policy itself. Timing mismatches and I/O latency seem massively underestimated in robotics evals. How often do you think “model failures” are actually infrastructure and runtime issues?
1
0
0
31
metr0x @metrox_eth
20 pick-and-place episodes on the RoArm M3. 95% success rate. The policy was trained a week ago. We unlocked it today by fixing the runtime.
The signal: FPS counter on the eval dashboard at 12-14 while the model was trained at 20fps. Every eval was running at 60-70% of training-time inference frequency. Distribution mismatch baked in.
Profiled the robot loop. send() to the servo controllers was blocking the main thread for 20-110ms per step. Refactored to an AsyncArmWorker: serial I/O on a dedicated thread, main loop latency drops to ~0ms. 20fps stable.
Hardware: added a PCIe card with 4 Renesas USB controllers, cameras and arms on isolated buses. Removed the USB contention inflating send() variance.
Last mile: base servo offset +3° clockwise from training calibration. Tuned, re-evaled. ACT v3 025k policy, 20 consecutive episodes at 95%.
Gripper still has a residual timing quirk. Minor at this success rate, fix later.
VR teleop was gated behind 70% baseline. Cleared.
SmolVLA v6 (100k full finetune, unfrozen encoder) finished cooking tonight. Next on the bench.
2
6
27
1.1K
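A minimal sketch of the threading pattern described in the post above: blocking serial writes move to a dedicated thread so the control loop only hands over the latest command. Only the class name `AsyncArmWorker` comes from the post; the methods, the latest-command-wins policy, and the `slow_send` stand-in for the servo write are assumptions, not the author's implementation.

```python
import threading
import time


class AsyncArmWorker:
    """Runs a blocking send function on its own thread (hypothetical design)."""

    def __init__(self, send_fn):
        self._send_fn = send_fn
        self._cond = threading.Condition()
        self._pending = None      # newest command not yet sent (None = nothing)
        self._closed = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, command):
        """Non-blocking: overwrite any stale pending command and return."""
        with self._cond:
            self._pending = command
            self._cond.notify()

    def close(self):
        """Flush the last pending command, then stop the thread."""
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                while self._pending is None and not self._closed:
                    self._cond.wait()
                if self._pending is None:   # closed with nothing left to send
                    return
                command, self._pending = self._pending, None
            self._send_fn(command)          # blocking I/O happens outside the lock


sent = []


def slow_send(command):
    time.sleep(0.05)  # stand-in for a blocking serial write
    sent.append(command)


worker = AsyncArmWorker(slow_send)
start = time.perf_counter()
for step in range(5):
    worker.submit({"step": step})  # returns immediately
submit_time = time.perf_counter() - start
worker.close()
print(f"submitted 5 commands in {submit_time * 1000:.1f} ms, sent {len(sent)}")
```

Dropping stale commands is a design choice: for position targets streamed at a fixed rate, sending only the newest one keeps the arm from replaying a backlog after a slow write.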
Chris Matthieu @chrismatthieu
I built a real-world-to-simulation demo using a @RealSenseai stereo camera mounted on a little #ROS #AMR robot to feed my skeletal pose into a humanoid robot in RVIZ and Gazebo ❤️🤖. Stop by the RealSense booth at @Robotics_Summit & Expo next week to see it in action!
4
8
61
2.9K
CryptoBee @CryptoBE2
$robotmoney is next 👀
Penn @monaco_pnl

@0xMousa_ I like it a lot and plan on talking about it, not sure what this current dip is other than impatient hands + base fud

1
0
3
478
metr0x @metrox_eth
@syun88AI Congrats! VLA stuff is so hard, I am struggling.
0
0
2
32
metr0x @metrox_eth
But wait, why are the cameras lagging behind the realtime 3D view? I guess I'll need to figure this out 😀
0
0
1
116
metr0x @metrox_eth
3D realtime robot view + joint angle traces just shipped.
3
5
51
3.2K
metr0x @metrox_eth
First real-world eval at 050K. 020K/030K showed frozen arm (pre-convergence). Power law fit L = 28.6 × k^(-0.414) predicts loss ~0.324 at 050K, which is where meaningful behavior should emerge.
0
0
0
120
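A quick check of the power-law numbers in the post above. Reading "050K" as k = 50,000 training steps is an interpretation, but it is the one that reproduces the quoted loss of about 0.324; the function name is hypothetical.

```python
A = 28.6        # fitted coefficient, from the post
ALPHA = 0.414   # fitted exponent, from the post


def predicted_loss(k):
    """Evaluate L = A * k^(-ALPHA) for a step count k."""
    return A * k ** (-ALPHA)


for k in (20_000, 30_000, 50_000):
    print(f"{k:>6} steps -> predicted loss {predicted_loss(k):.3f}")
```

With k in thousands instead (k = 50), the same formula gives a loss above 5, so the fit was evidently made against raw step counts.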
metr0x @metrox_eth
Dataset rebuilt from scratch. 315 RoArm M3 pick-and-place episodes. Pipeline: trim → shift → speedaug. Trim cuts dead frames. Shift adds synthetic positional offsets for spatial generalization. Speedaug = variable-speed replay so the policy doesn't lock onto demo cadence.
1
0
1
140
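A toy sketch of the trim → shift → speedaug order described in the post above, on an episode stored as a list of position frames. The stage names come from the post; the function bodies, the `eps` threshold, the offset, and the speed factor are assumptions for illustration. A real pipeline would also have to keep camera frames and actions consistent with each step.

```python
def _moved(a, b, eps):
    """True if any coordinate differs by more than eps between two frames."""
    return max(abs(x - y) for x, y in zip(a, b)) > eps


def trim(frames, eps=1e-3):
    """Cut dead frames: leading and trailing runs where nothing moves."""
    start = 0
    while start + 1 < len(frames) and not _moved(frames[start], frames[start + 1], eps):
        start += 1
    end = len(frames) - 1
    while end - 1 > start and not _moved(frames[end], frames[end - 1], eps):
        end -= 1
    return frames[start:end + 1]


def shift(frames, offset):
    """Add a fixed positional offset to every frame."""
    return [[x + o for x, o in zip(frame, offset)] for frame in frames]


def speedaug(frames, speed):
    """Resample by linear interpolation; speed > 1 is faster, < 1 slower.

    Expects at least two frames.
    """
    last = len(frames) - 1
    n_out = max(2, round(last / speed) + 1)
    out = []
    for i in range(n_out):
        t = i * last / (n_out - 1)      # fractional index into the source
        lo = int(t)
        hi = min(lo + 1, last)
        w = t - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


# Toy 2-D episode with idle frames at both ends.
raw = [[0.0, 0.0]] * 3 + [[1.0, 0.0], [2.0, 0.0]] + [[2.0, 0.0]] * 2
episode = speedaug(shift(trim(raw), [0.01, -0.02]), speed=0.5)
print(len(raw), "->", len(episode), "frames")
```

Trimming first means the later stages never interpolate across idle time, which is one plausible reason for the order given in the post.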