
world modeling is never about rendering pixels. rendering is local. world state is global. as soon as more than one agent exists, the only thing that truly matters is the shared representation beneath individual views. that shared representation is what scales into collective capability.

this is why I'm super excited to share project Solaris, our new work on building a multiplayer video world model in Minecraft. this release includes three main pieces:

1⃣ Solaris Engine, a fully featured multiplayer data collection system with built-in visuals. the team put a huge amount of work into this since nothing like it really exists yet. github.com/solaris-wm/sol…

2⃣ Solaris Model, a multiplayer DiT with a new memory-efficient self-forcing design, trained on 12.6M frames of coordinated Minecraft gameplay. github.com/solaris-wm/sol…

3⃣ Solaris Eval, which uses a VLM as a judge to evaluate different multiplayer capabilities.

read the full technical breakdown by @ojmichel4, and start building with Solaris. solaris-wm.github.io


