Brandon Ong

359 posts

@bytedunks

building robots for datacenters; prev Robotics PhD (left) @Columbia @NTUsg; @join_ef

SG · Joined May 2018
425 Following · 520 Followers
Pinned Tweet
Brandon Ong (@bytedunks)
To understand what it takes to build a humanoid robot with model-based control, we fine-tuned @physical_int's (PI) Pi05 model for our custom use case and environment. We incurred ~$10K in hardware costs, compared to the typical ~$20K setup (DROID/ALOHA). Here are the lessons and challenges we faced building the first working prototype (shown in the video) in 3 months.

Part 1: Hardware, Software, Model Selection, Custom Embodiment, Inference, Embedded Hardware, Hierarchical Planner
Part 2: Model Evaluation, Data Collection, Model Training, Simulation and Teleoperation

We hope sharing our experience accelerates the learning of others at a similar starting point.
Brandon Ong (@bytedunks)

x.com/i/article/2018…

Brandon Ong (@bytedunks)
We're in Boston this week for #RoboticsSummit, and NYC next week. Who should I meet to exchange notes and explore overlaps on vertical robotics? We're building robots for datacenters and are actively thinking about early commercialization angles, continuous post-training pipelines fed by fleet data and scalable verification for RL.
Brandon Ong (@bytedunks)
👀
kingston kuan (@kstonekuan)

Glad to see this project getting some recognition, thanks @dwarkesh_sp

People are surprised when I say I worked on datacenters when I was at Jane Street:
“Why do you need to run datacenters?”
“Aren’t you a software engineer?”

When I first joined the team it was daunting. I was a fresh grad and knew nothing about DC ops.

Over time I realized it was one of the best positions I could find myself in early in my career.

Beyond the technical complexity of running critical infrastructure, supporting the entire firm led to collaborations across multiple teams.

I was often reminded how much we take physical infrastructure for granted, and how much impact I can have solving problems with new tech in a traditional industry.

Brandon Ong (@bytedunks)
We'll be in SF next week for Data Center Expo. Do I know anyone deploying robots in industrial or similar settings? We're building robots for datacenters and are working together with our first enterprise partner. Would love to share notes and explore overlaps. Real-world RL, engineering deployment pipelines for continuously improving models, and early commercialization are top of mind.
Brandon Ong (@bytedunks)
mandatory day zero shot
[3 images]
Brandon Ong retweeted
kingston kuan (@kstonekuan)
Digital twins for datacenters are still hard to build in 2026.

As agents control systems more autonomously, the need for high-level observability and visualization will only increase.

We need to know quickly how agents are running to make sure they do not go off the rails.
[image]
Brandon Ong (@bytedunks)
Who’s working on open-source tools for continuous-improvement post-training of AI robotics models? Sounds like the kind of thing that would benefit from co-developing with the community. Would love to meet others building their own.
Jianlan Luo (@jianlanluo)

Excited to share LWD: Learning While Deploying. Our robots learn while doing real tasks—restocking groceries, brewing Gongfu tea, making cocktails, making juice, and packing shoes. Deployment is no longer just evaluation; it becomes the training loop. 🧵

Brandon Ong (@bytedunks)
Congrats on the new MolmoAct 2 release by @allen_ai! A few features that stood out for those considering this for real-world deployments:

1. YAM embodiment unlock + 720h teleoperated dataset
720 hours of bimanual YAM data is a meaningful contribution. The YAM embodiment is a simple bimanual arm setup for dexterous tasks, very similar to the dual PiperX and Trossen WidowX arms. Anyone building on @physical_int's Pi05 or similar models with a YAM-type robot now has significantly more data to fine-tune from, which should reduce the fine-tuning samples needed for a custom task, assuming the target task and environment fall within the dataset's distribution. The dataset spans household, factory, and coffee-shop settings with high object and scene variation. @cortexairobot was the data vendor. Hoping the appendix detailing the quality control protocol gets released.

2. Depth reasoning as a reproducible recipe, but only with layer-level access
MolmoAct2-Think shows one way to inject depth information into the action model. Before producing an action, the model predicts a compact discrete depth representation that conditions the action expert through per-layer KV conditioning. The mechanism requires surgical access to the VLM's intermediate attention states at every layer, something only possible with fully open architectures.

3. Swappable VLM backbones for converting VLM -> VLM-ER
The released training recipe effectively decouples the perception backbone from the action head. You can pick a VLM optimized for your task domain rather than accepting a generic vision encoder. Hypothetical example: for warehouse sorting where success hinges on reading tiny, cluttered, blurry SKU labels, start from a VLM fine-tuned for OCR (e.g., a custom Qwen-VL or InternVL variant) instead of a generalist web-scale VLM. Apply the MolmoAct2-ER training recipe to that backbone to produce an "OCR-VL-ER" variant, then attach a flow-matching Action Expert.

The result is a bespoke VLA that inherits your perception fine-tuning, optimized for label-reading manipulation rather than generic open-world scenes. This assumes catastrophic forgetting is minimized and the backbone retains most of the baseline capabilities it had before fine-tuning. With this recipe, you can swap in domain-specific backbones (medical imaging, industrial inspection, high-res OCR) and convert them into action models entirely from open components.
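
To make the per-layer KV conditioning in point 2 concrete, here is a toy sketch in plain Python. It is one plausible reading of the idea, not MolmoAct 2's actual implementation: at every layer, the action expert's tokens attend over their own states plus keys and values taken from the matching VLM layer (for example, those of the predicted depth tokens). The function names, the single-head attention, and the absence of learned projections are all simplifying assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return out


def action_expert_forward(action_tokens, vlm_kv_per_layer):
    """Toy action expert (hypothetical): layer i attends to its own tokens
    plus the keys/values exposed by VLM layer i, then adds a residual."""
    hidden = action_tokens
    for vlm_keys, vlm_values in vlm_kv_per_layer:
        keys = vlm_keys + hidden      # VLM (e.g. depth-token) keys, then own states
        values = vlm_values + hidden  # no learned projections in this sketch
        attended = attend(hidden, keys, values)
        hidden = [[h + a for h, a in zip(hv, av)] for hv, av in zip(hidden, attended)]
    return hidden
```

The sketch shows why layer-level access matters: `vlm_kv_per_layer` needs one (keys, values) pair per layer, which a closed model that only returns final outputs cannot supply.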
Ai2 (@allen_ai)

Robotics models often struggle outside controlled environments. Ours is built to work in real ones. Today we're launching MolmoAct 2, which can assist with a host of chores & lab tasks, plus the MolmoAct 2-Bimanual YAM dataset—the largest open robotics dataset of its kind. 🧵

Brandon Ong retweeted
kingston kuan (@kstonekuan)
Simulations are core to robotics research, but spinning up custom scenes is still tedious and has a steep learning curve.

We built mujoco workbench (mwb): cli + agent skills for codex/claude code to scaffold and debug sim scenes from natural language.
Brandon Ong (@bytedunks)
"A robotics-shaped take-off curve."
Cracking distribution is as hard as the technical challenges to commercialize early, especially when customers need to see a plausible trajectory, hardware budgets are tight, and data collection needs funding.

"Channel construction is the most underestimated lever in the stack."
Creating the data flywheel that powers continuous improvement and eventual task reliability is what closes the gap to real-world deployments that actually create value.

"Deployment-system engineering matters as much as model architecture."
Similar to the systems around early LLM applications and AI voice agents, a model's capabilities are only as good as the scaffolds around it. This is where engineering depth and iteration speed compound into capability.

The teams that earn the right to a first deployment reliable enough to kickstart the data flywheel will crack adoption in a new vertical.
York Yang (@YorkYang5050)

x.com/i/article/2051…

Brandon Ong (@bytedunks)
@LukasForTech The skills and CLI focus on scene creation and not the physics engine for now. So if you mean creating scenes from the standard MuJoCo library for your UAV testing then yes, it is supported.
Lukas Die Kunst (@LukasForTech)
@bytedunks This would streamline my UAV prototyping. Does it support aerodynamic force modeling, or is it limited to rigid-body kinematics?
Brandon Ong (@bytedunks)
Sims are an essential piece of any researcher's experimental loop, both for training data and for evaluation. However, spinning up quick scenes still requires a learning curve and reading extensive documentation. This is true even for the small but critical fraction of training data needed for diversity, or for quick, directionally correct evaluation checks.

In response, we built MuJoCo Workbench (MWB), a CLI and set of agent skills to prototype custom scenes with coding agents like Codex and Claude Code. Repo is in the next post.

It's an attempt to make building diverse scenes a delightful experience, and to maximize what coding agents can do in researchers' hands to accelerate their experimental loops.

How it works:
1. Install the bundled agent skills.
2. Describe what you want, and the agent scaffolds a working sim for you.

No MuJoCo experience is required to get started. The skills teach the agent the mwb CLI, the scene layout conventions, and the debug tools, so it can iterate on behavior without you needing to know the plumbing.

We're working on extending MWB with a built-in integration for real-time inference on open-source VLAs/VAMs like Pi05. If you recognize the problems mentioned here and want to learn more, reach out.
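
For readers who have not seen what a scaffolded scene boils down to, here is a minimal hand-written sketch of a MuJoCo scene description (MJCF): a ground plane with one free-falling box. It is not MWB output and does not use the mwb CLI; the `build_scene` helper and its parameters are hypothetical names chosen for illustration, and only the Python standard library is used to assemble the XML.

```python
import xml.etree.ElementTree as ET


def build_scene(box_size=0.05, box_height=0.5):
    """Return MJCF XML for a ground plane with one free-falling box.

    Hypothetical helper for illustration; not part of MuJoCo or MWB.
    """
    root = ET.Element("mujoco", model="minimal_scene")
    world = ET.SubElement(root, "worldbody")
    ET.SubElement(world, "light", pos="0 0 3")
    ET.SubElement(world, "geom", name="floor", type="plane", size="1 1 0.1")
    # A body with a free joint can translate and rotate, so it falls under gravity.
    body = ET.SubElement(world, "body", name="box", pos=f"0 0 {box_height}")
    ET.SubElement(body, "freejoint")
    half = f"{box_size} {box_size} {box_size}"  # box geoms take half-extents
    ET.SubElement(body, "geom", name="box_geom", type="box", size=half)
    return ET.tostring(root, encoding="unicode")


scene_xml = build_scene()
```

With the `mujoco` Python package installed, the string can be loaded with `mujoco.MjModel.from_xml_string(scene_xml)` and stepped with `mujoco.mj_step`.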
Suraj Sharma (@suraj_sharma14)
Robots that learn. Live. In 72 hours.

Embodied AI Hackathon @ SPC:
→ May 15-17 • San Francisco • In-person
→ Tracks: Manipulation • Locomotion • Aerial • Best Overall
→ Hardware + compute provided • OpenAI • AWS • Hugging Face

Applications close: May 12.

If you've ever wanted to ship AI that moves in the real world, this is your weekend.

Register 👇 embodied-ai-hackathon-spc.replit.app
@southpkcommons