Michael Munje

10 posts

@michaelmunje

PhD Student @UTAustin | MS @GeorgiaTech | Former Intern @Microsoft @NASAJPL @IBMResearch

Austin · Joined September 2025
42 Following · 13 Followers

Pinned Tweet
Michael Munje @michaelmunje ·
[1/8] New social navigation paper + benchmark: SocialNav-SUB 🚶🤖 Recent work puts VLMs on robots for navigation, but can they really interpret scenes and extract key details for social navigation? 🔎 larg.github.io/socialnav-sub
[image]
Michael Munje retweeted
Zichao @ZichaoHu99 ·
How can robots follow complex instructions in dynamic environments? 🤖 Meet ComposableNav — a diffusion-based planner that enables robots to generate novel navigation behaviors that satisfy diverse instruction specifications on the fly — no retraining needed. 📄 Just accepted to CoRL 2025 🔗 Project: amrl.cs.utexas.edu/ComposableNav/ A Thread (1/8)
Michael Munje @michaelmunje ·
[7/8] SocialNav-SUB is also fully open-source, actively maintained, and easily extendable to customized prompts and/or additional VLMs! Pull requests are always welcome! github.com/LARG/SocialNav…
Michael Munje @michaelmunje ·
[6/8] 🧪 Does chain-of-thought (answering spatial/spatiotemporal VQAs first) improve social reasoning? ✅ Yes. Does BEV context help models? ⚖️ Model-dependent (sometimes a lot). Does richer spatial/spatiotemporal context improve social reasoning? ✅ Yes.
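The chain-of-thought setup above can be sketched as a two-stage prompting loop: spatial/spatiotemporal questions are answered first, and their Q/A pairs are prepended as context before the social-reasoning question. This is a minimal illustration only; `query_vlm` is a stand-in stub, not the benchmark's actual API, and the prompt format is assumed.

```python
def query_vlm(prompt: str) -> str:
    """Stand-in for a real VLM call; echoes the last prompt line."""
    return f"answer to: {prompt.splitlines()[-1]}"

def chain_of_thought_vqa(scene_description, spatial_qs, social_q):
    context = [scene_description]
    # Stage 1: spatial / spatiotemporal questions build grounded context.
    for q in spatial_qs:
        prompt = "\n".join(context + [q])
        context.append(f"Q: {q}\nA: {query_vlm(prompt)}")
    # Stage 2: the social-reasoning question sees all earlier Q/A pairs.
    final_prompt = "\n".join(context + [social_q])
    return query_vlm(final_prompt)
```

Swapping `query_vlm` for an actual model call would reproduce the ablation the tweet describes: with vs. without the stage-1 context.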
Michael Munje @michaelmunje ·
[5/8] 📊 Do today’s VLMs agree with human judgments? We find that they still trail behind humans and simple rule-based baselines.
[image]
Michael Munje @michaelmunje ·
[4/8] 👥 We collected human data from an IRB-approved human-subject study to construct our benchmark and evaluate whether models align with human judgments in social navigation scenes.
Michael Munje @michaelmunje ·
[3/8] SocialNav-SUB features real-world social navigation scenarios built from SCAND: scenarios sampled @ 4 Hz → PHALP tracking → front-view & BEV images with labeled pedestrians, combined with a set of carefully designed questions to create our VQA prompts (5k in total).
[image]
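The construction steps named in the tweet can be sketched loosely: sample scenario frames at 4 Hz, then pair each scenario with templated questions to form VQA prompts. All function names here are illustrative, not the repo's real code, and the PHALP tracking step is omitted.

```python
def sample_at_4hz(timestamps, rate_hz=4.0):
    """Keep timestamps spaced at least 1/rate_hz seconds apart."""
    period = 1.0 / rate_hz
    kept, last = [], None
    for t in sorted(timestamps):
        if last is None or t - last >= period:
            kept.append(t)
            last = t
    return kept

def build_vqa_prompts(scenario_id, frames, question_templates):
    """One prompt per (scenario, question) pair, as in the ~5k-prompt set."""
    return [
        {"scenario": scenario_id, "frames": frames, "question": q}
        for q in question_templates
    ]
```

For example, `sample_at_4hz([0.0, 0.1, 0.25, 0.5, 0.6, 0.75])` drops the frames that arrive less than 0.25 s after the last kept one.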
Michael Munje @michaelmunje ·
[2/8] We introduce SocialNav-SUB: a VQA benchmark to evaluate spatial, spatiotemporal, and social reasoning for real-world social navigation scenarios with object-centric grounding (front view + Bird’s-Eye-View (BEV) + numbered markers) to provide rich context to VLMs.