

UCSB Computer Science Department
985 posts

@ucsbcs
Official account of UC Santa Barbara's Computer Science Department.





Human perception is inherently situated – we understand the world relative to our own body, viewpoint, and motion. To deploy multimodal foundation models in embodied settings, we ask: “Can these models reason in the same observer-centric way?” We study this through SAW-Bench: a novel benchmark for observer-centric situated awareness: - 786 real world egocentric videos - 2,071 human-annotated QA pairs Across all tasks, we evaluate 24 state-of-the-art MFMs: 📉 Best model: 53.9% 🧑 Humans: 91.6% Models systematically: ❌ Confuse head rotation with physical movement ❌ Collapse under multi-turn trajectories ❌ Fail to maintain persistent world-state memory 👉 We see that maintaining a stable observer-centric representation remains challenging. As MFMs are increasingly integrated into embodied agents, situated awareness becomes essential for reliable real-world interaction. We release SAW-Bench and encourage further research toward improving observer-centric reasoning in multimodal foundation models.















Introducing @NeoCognition, the agent lab for specialized intelligence. Everyone needs experts, but human expertise does not scale. Backed by $40M seed funding, we build self-learning agents that specialize across domains to make expertise abundant.


















