GAMMA UMD

1.1K posts


@gammaumd

Geometric Algorithms for Modeling, Motion, and Animation research group: UNC Chapel Hill (1992-2018); University of Maryland, College Park (2018 onwards)

College Park, MD · Joined October 2009
530 Following · 756 Followers
Pinned Tweet
GAMMA UMD @gammaumd
Thrilled to share that our team has multiple papers accepted at #NeurIPS2025 🎉🚀 We’re excited to contribute to advancing multi-modal learning, physical reasoning, and embodied AI. Here’s a quick overview of the works 👇🧵
GAMMA UMD @gammaumd
Huge congrats to James Mullen @jamesfmullen and Geonsun Lee @gsun_lee on successfully defending their PhDs this week! 🎓👏 We’re so proud to celebrate your achievements — your hard work and perseverance have inspired everyone in the GAMMA Lab and beyond! 🌟
GAMMA UMD @gammaumd
GAMMA Group presented 5 papers at ICCV:

1. "AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs"
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Yaoting Wang, Mohamed (Mo) Elhoseiny, Ruohan Gao, Dinesh Manocha
📍 Tue 21 Oct | 11:45 a.m. – 1:45 p.m. HST | Poster Session 1 | Exhibit Hall I #141
🌐 Project page: lnkd.in/g5mMwkHj 💻 GitHub: lnkd.in/gxWJibcx

2. "EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception"
Sanjoy Chowdhury, Subrata Biswas, Sayan Nag, Tushar Nagarajan, Calvin Murdock, Ishwarya Ananthabhotla, Yijun Q., Vamsi Krishna Ithapu, Dinesh Manocha, Ruohan Gao
📍 Wed 22 Oct | 11:15 a.m. – 1:15 p.m. HST | Poster Session 3 | Exhibit Hall I #983
🌐 Project page: lnkd.in/gF3c2eKt 💻 GitHub: lnkd.in/g-iiQaed

3. "AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs"
Sanjoy Chowdhury, Hanan Gani, Nishit Anand, Sayan Nag, Ruohan Gao, Mohamed (Mo) Elhoseiny, Salman Khan, Dinesh Manocha
📍 Thu 23 Oct | 11:15 a.m. – 1:15 p.m. HST | Poster Session 5 | Exhibit Hall I #2113
🌐 Project page: lnkd.in/gwd_T5VM 💻 GitHub: lnkd.in/gV-5aWRs

4. "DMesh++: An Efficient Differentiable Mesh for Complex Shapes"
Sanghyun Son, Matheus Gadelha, Yang Zhou, Matthew Fisher, Zexiang Xu, Yi-Ling Qiao, Ming Lin, Yi Zhou
📍 Thu 23 Oct | 2:45 p.m. – 4:45 p.m. HST (5:45 p.m. – 7:45 p.m. PDT) | Exhibit Hall I #2456
🌐 Project page: lnkd.in/epYCB43E 💻 GitHub: lnkd.in/eabpDQGx

5. "IM360: Large-scale Indoor Mapping with 360 Cameras"
Dongki Jung, Jaehoon Choi, Yonghan Lee, Dinesh Manocha
📍 Thu 23 Oct | 2:45 p.m. – 4:45 p.m. HST | Poster Session 6 | Exhibit Hall I #2691
🌐 Project page: lnkd.in/ewQ4uEdS 💻 GitHub: lnkd.in/e4FQpnNE
GAMMA UMD @gammaumd
GAMMA Group at UMD presented 13 papers at IROS 2025:

1. "ET-Former: Efficient Triplane Deformable Attention for 3D Semantic Scene Completion from Monocular Camera" | 13:50–13:55, TuBT10.7
Liang, Jing; Yin, He; Qi, Xuewei; Park, Jong Jin; Sun, Min; Madhivanan, Rajasimman; Manocha, Dinesh

2. "Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robotic Navigation" | 13:30–13:35, TuBT11.3
Patel, Bhrij; Kulathun Mudiyanselage, Kasun Weerakoon; Suttle, Wesley A.; Koppel, Alec; Sadler, Brian; Zhou, Tianyi; Manocha, Dinesh; Bedi, Amrit Singh

3. "TK-Planes: Tiered K-Planes with High Dimensional Feature Vectors for Dynamic UAV-Based Scenes" | 13:25–13:30, TuBT30.2
Maxey, Christopher; Choi, Jaehoon; Kwon, Heesung; Lee, Hyungtae; Manocha, Dinesh

4. "CROSS-GAiT: Cross-Attention-Based Multimodal Representation Fusion for Parametric Gait Adaptation in Complex Terrains" | 17:10–17:15, TuDT11.7
Seneviratne, Gershom Devake; Kulathun Mudiyanselage, Kasun Weerakoon; Elnoor, Mohamed; Rajagopal, Vignesh; Varatharajan, H.; M Jaffar, Mohamed Khalid; Pusey, Jason; Manocha, Dinesh

5. "VL-TGS: Trajectory Generation and Selection Using Vision Language Models in Mapless Outdoor Environments" | 13:20–13:25, WeBT7.1
Song, Daeun; Liang, Jing; Xiao, Xuesu; Manocha, Dinesh

6. "AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning" | 15:20–15:25, WeCT5.5
Kong, Yangzhe; Song, Daeun; Liang, Jing; Manocha, Dinesh; Yao, Z.; Xiao, Xuesu

7. "Cross-Source-Context Indoor RGB-D Place Recognition" | 15:30–15:35, WeCT10.7
Liang, Jing; Deng, Zhuo; Zhou, Zheming; Ghasemalizadeh, O.; Kuo, Cheng-Hao; Sen, A.; Manocha, Dinesh

8. "SkyVLN: Vision-and-Language Navigation and NMPC Control for UAVs in Urban Environments" | 13:45–13:50, ThBT12.6
Payandeh, Amirreza; Song, Daeun; Nazeri, M.; Liang, Jing; Mukherjee, P.; Raj, Amir Hossain; Kong, Y.; Manocha, Dinesh; Xiao, Xuesu

9. "Is the House Ready for Sleeptime? Generating and Evaluating Situational Queries for Embodied Question Answering" | 15:25–15:30, ThCT3.6
Dorbala, Vishnu Sashank; Goyal, P.; Piramuthu, R.; Johnston, M.; Ghanadan, Reza; Manocha, Dinesh

10. "LBAP: Improved Uncertainty Alignment of LLM Planners Using Bayesian Inference" | 15:10–15:15, ThCT8.3
Mullen, James; Manocha, Dinesh

11. "MMCD: Multi-Modal Collaborative Decision-Making for Connected Autonomy with Knowledge Distillation" | 17:10–17:15, TuDT9.7
Liu, Rui; Wang, Zikang; Gao, Peng; Shen, Yu; Tokekar, Pratap; Lin, Ming C.

12. "Quantifying and Modeling Driving Style in Trajectory Forecasting" | 15:05–15:10, WeCT17.2
Zheng, Laura; Yaghoubi Araghi, Hamidreza; Wu, Tony; Thalapanane, Sandeep; Zhou, Tianyi; Lin, Ming C.

13. "On the Vulnerability of LLM/VLM-Controlled Robotics" | 13:45–13:50, TuBT4.6
Wu, Xiyang; Chakraborty, S.; Xian, Ruiqi; Liang, Jing; Guan, Tianrui; Liu, F.; Sadler, B.; Manocha, Dinesh; Bedi, Amrit S.
GAMMA UMD reposted
DailyPapers @HuggingPapers
NVIDIA just released Audio Flamingo 3 on Hugging Face! This fully open, state-of-the-art Large Audio-Language Model excels at understanding & reasoning across speech, sounds, and music, setting new benchmarks on 20+ tasks. huggingface.co/nvidia/audio-f…
GAMMA UMD reposted
Amrit Singh Bedi @amritsinghbedi3
Excited to share our #NeurIPS2025 paper 🎉 "more thinking ≠ better reasoning" 👉 We uncover the mirage of test-time thinking scaling: increasing thinking tokens at test time boosts accuracy briefly, then hurts as response variance increases
GAMMA UMD @gammaumd
(5/n) VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Authors: @zli12321, @wu_xiyang, @guangyao_shi, Yubin Qin, Hongyang Du, @zhoutianyi, @dmanocha, @boydgraber
Paper: arxiv.org/abs/2505.01481
Project: wuxiyang1996.github.io/videohallu_pag…
Dataset: huggingface.co/datasets/Intel…
We present VideoHallu, a benchmark of 3,000+ synthetic videos with counterintuitive QA pairs that reveal how state-of-the-art MLLMs hallucinate on physics violations, spatio-temporal inconsistencies, and commonsense errors, while showing that targeted fine-tuning significantly improves abnormality detection and reasoning.
GAMMA UMD @gammaumd
(4/n) MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Authors: Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe, Junjie Fei, Sayan Nag, Salman Khan, @moElhoseiny, @dmanocha
Paper: arxiv.org/abs/2506.07016
Project page: schowdhury671.github.io/magnet_project/
Benchmark: huggingface.co/datasets/elmog…
We introduce MAGNET, a multi-agent framework with a new benchmark (AVHaystacks), task (AVHaystacksQA), and metrics (StEM & MTGS) that enables and evaluates multi-video audio-visual reasoning, achieving state-of-the-art performance with large gains over strong baselines.
GAMMA UMD @gammaumd
🎓 We’re thrilled to celebrate the successful PhD thesis defenses of Jing Liang @Jing53582 and Senthil Hariharan Arul! 🥳👏 A huge congratulations to both; your hard work and dedication have truly paid off. We’re so proud of you and can’t wait to see what’s next! 🌟
GAMMA UMD @gammaumd
🚀 New @UMD_CollegeISR research by @vdorbala helps robots truly “get” situational context! 🤖 Introducing Situational EQA (S-EQA), enabling embodied agents to reason over multiple object states & relationships to answer complex, real-world queries like “Is the house ready for sleep time?” 📄 Accepted to #IROS2025 in Hangzhou, China 🌐 Paving the way for smarter, more context-aware home robots. 🔗 isr.umd.edu/news/story/new… #AI #Robotics #ComputerVision
GAMMA UMD reposted
laura z @laurayuzheng
I DEFENDED MY THESIS TODAY!!! Thanks to my advisor Ming Lin and also my committee and also my family. And like 300 other people
GAMMA UMD @gammaumd
Excited to share our paper HALO is accepted to #CoRL2025! 🧭 HALO introduces a novel method to train vision-based reward models that align with human navigation preferences without requiring online rollouts or hand-engineered rewards.
📌 Key ideas:
• We collect binary user feedback on intuitive queries like "Should the robot turn left?", "Should the robot turn right?", "Should it accelerate?" based on egocentric camera input.
• The user feedback, combined with the expert reference action, is used to construct a probabilistic action preference distribution with its mode at the reference action.
• We train a reward model to rank all feasible actions using the Plackett-Luce loss, a generalization of the Bradley-Terry model to n-way comparisons.
🏁 We deploy HALO’s reward in both a classical planner and an RL-based planner. Real-world evaluation on a Clearpath Husky robot shows:
✅ ≥33.3% improvement in success rate
✅ ≥12.9% reduction in trajectory length
✅ ≥26.6% reduction in Fréchet distance to human demonstrations
📸 All with RGB cameras only, no LiDAR or depth.
Authors: @gershom_96, Jianyu An, Sahire Ellahy, @kaweer_, Mohamed Elnoor, Jonathan Deepak Kannan, Amogha Sunil, @dmanocha
Paper: arxiv.org/pdf/2508.01539
Website: gamma.umd.edu/researchdirect…
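For readers unfamiliar with the ranking loss named above, here is a minimal sketch of the generic Plackett-Luce negative log-likelihood. The function name and NumPy formulation are illustrative assumptions, not HALO's actual implementation; see the paper for the real training setup.

```python
import numpy as np

def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of a full ranking under the Plackett-Luce model.

    scores  : (n,) array, one scalar reward per candidate action
    ranking : list of candidate indices, most-preferred first

    The ranking's probability factorizes into sequential softmax choices:
    at each step the top remaining candidate is chosen from the shrinking
    pool. With n = 2 this reduces to the Bradley-Terry pairwise model.
    """
    nll = 0.0
    remaining = list(ranking)
    while len(remaining) > 1:
        s = np.asarray([scores[i] for i in remaining], dtype=float)
        # log-sum-exp over the remaining pool, shifted for numerical stability
        log_z = s.max() + np.log(np.exp(s - s.max()).sum())
        nll -= s[0] - log_z  # log P(remaining[0] picked first from the pool)
        remaining.pop(0)
    return nll
```

For two actions with scores 2.0 and 0.0, preferring the first gives a loss of about 0.127, matching the Bradley-Terry value -log σ(2.0).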
GAMMA UMD @gammaumd
🤖 What if your LLM confidently misread a chart, and you couldn’t tell?
📊 Multimodal LLMs are improving, but still hallucinate when answering chart-based questions.
🎯 Presenting at ACL 2025 (Virtual): ChartLens grounds model answers to specific chart elements, helping users verify claims and spot hallucinations.
🛠️ We introduce:
• ChartLens: Fine-grained chart attribution
• ChartVA-Eval: Benchmark across finance, policy & economics
📍 Session 12: V-Presentations
🗓️ Today | July 30 | 11:00–12:30 CEST | 5:00–6:30 AM ET
📄 ChartLens: Fine-grained Visual Attribution in Charts
👥 Authors: Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha
#ACL2025 #MultimodalLLM #AIhallucination #ChartUnderstanding
GAMMA UMD reposted
Amrit Singh Bedi @amritsinghbedi3
Are you interested in test-time AI alignment? If you are attending #ICML2025, please visit our poster: 📍 Poster Location: East Exhibition Hall A–B, Booth #E-2701 📅 When: Wednesday, July 16 | ⏰ 11:00 AM – 1:30 PM PDT @HaoZhu6 @MFHChehade @SOURADIPCHAKR18 will be presenting
Amrit Singh Bedi @amritsinghbedi3

Can decades old ideas from #psychology help fix critical issues in modern LLM alignment? 🤔 We're tapping into #BoundedRationality & 'satisficing principles' to build an alternate way to align LLMs. Our new #ICML2025 paper 👇 🧵 arxiv.org/pdf/2505.23729

GAMMA UMD @gammaumd
🚀 Audio General Intelligence (AGI) is no longer a dream — it’s here. Introducing Audio Flamingo 3 — open-source, multimodal, and groundbreaking. It listens. It understands. It reasons across sound and language. 💥 Code, weights, datasets, paper — all open. 📄Paper: arxiv.org/abs/2507.08128 🤗HuggingFace: huggingface.co/nvidia/audio-f… Built by the amazing team at NVIDIA & UMD. Let’s shape the future of audio intelligence together!