Yu Lei@_OutofMemory_
🤖Co-training is everywhere (sim↔real [e.g. GR00T, LBM], human↔robot [e.g. PI, EgoScale], even non-robot data [e.g. PI, LBM]).
But why does it work? How can we improve it further?
Taking sim-and-real imitation learning with diffusion/flow-based models as our test bed, we performed a rigorous mechanistic analysis, drawing on theoretical insights and multi-layered experiments.
😮Key insight: it’s all about representations.
- Alignment → enables transfer
- Discernibility → enables adaptation
⚖️Both are necessary: more aligned representations transfer better, but the model must still be able to discern the domains. We term this structured representation alignment.
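To make the two properties concrete, here is a minimal sketch (not from the paper) of how one might probe them on feature batches: alignment as mean cosine similarity between paired sim/real features, and discernibility as the accuracy of a simple nearest-centroid domain classifier. Both metrics and the synthetic data are illustrative assumptions.

```python
import numpy as np

def alignment_score(z_sim, z_real):
    # Hypothetical alignment probe: mean cosine similarity of paired features.
    a = z_sim / np.linalg.norm(z_sim, axis=1, keepdims=True)
    b = z_real / np.linalg.norm(z_real, axis=1, keepdims=True)
    return float((a * b).sum(axis=1).mean())

def discernibility_score(z_sim, z_real):
    # Hypothetical discernibility probe: can a nearest-centroid classifier
    # tell the two domains apart in representation space?
    z = np.concatenate([z_sim, z_real])
    y = np.concatenate([np.zeros(len(z_sim)), np.ones(len(z_real))])
    c0, c1 = z_sim.mean(axis=0), z_real.mean(axis=0)
    pred = (np.linalg.norm(z - c1, axis=1) < np.linalg.norm(z - c0, axis=1)).astype(float)
    return float((pred == y).mean())

# Synthetic features: shared structure (aligned) plus a constant domain
# offset (discernible) -- the regime the thread argues co-training wants.
rng = np.random.default_rng(0)
base = rng.normal(size=(256, 32))
z_sim = base + 0.05 * rng.normal(size=(256, 32))
z_real = base + 0.05 * rng.normal(size=(256, 32)) + 0.5

print(alignment_score(z_sim, z_real))       # high: representations transfer
print(discernibility_score(z_sim, z_real))  # high: domains stay separable
```

The point of the toy setup: the shared `base` keeps cosine similarity high (alignment), while the constant offset keeps the domains linearly separable (discernibility), so both scores can be high at once.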
⬇️Let’s take a deep dive into that:
Paper: arxiv.org/pdf/2604.13645
Website: science-of-co-training.github.io