
Using only the box's forward speed as the reward, our Stackelberg PPO automatically evolves robots with arms for pushing and legs for moving. The key idea is a novel game-theoretic view of structure–control co-design, which yields more effective optimization and dramatically better designs. Come see our poster at ICLR 2026 on Apr 25, 10:30 AM, at P4-#4810. With @YuhuiWangAI, @YanningD_AI, @oneDylanAshley. Paper: arxiv.org/abs/2603.15388 Project page: yanningdai.github.io/stackelberg-pp…






