
David Chan
97 posts

@_dmchan
Postdoc at @berkeley_ai studying contextual grounding in multimodal AI. These are the voyages of the... Crap. I don't have a name for my own ship...

The 5th edition of the MMFM Workshop is coming to @CVPR 2026! "What is Next in Multimodal Foundation Models?" exploring the frontiers of vision, language, and beyond. June 2026 | Denver, CO Details in thread 👇

🎮 We release VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents (w/ @junyi42 @aomaru_21490) 🌐 With 17 environments across multiple domains, we systematically show the brittleness of VLMs in visual interaction, and what training leads to. 🧵[1/8]
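
For context on what an "environment for multimodal agents" involves, here is a generic Gym-style interaction loop. This is purely an illustrative sketch with a toy environment and a random agent; none of the names below are claimed to be VisGym's actual API.

```python
import random

class ToyGridEnv:
    """Tiny stand-in environment: an agent on a 1-D strip must reach the goal."""
    def __init__(self, size=5):
        self.size, self.pos = size, 0

    def reset(self):
        self.pos = 0
        return self.render()

    def render(self):
        # Stand-in for an image observation: a textual "frame" of the strip.
        return "." * self.pos + "@" + "." * (self.size - self.pos - 1)

    def step(self, action):
        if action == "right":
            self.pos = min(self.pos + 1, self.size - 1)
        elif action == "left":
            self.pos = max(self.pos - 1, 0)
        done = self.pos == self.size - 1
        return self.render(), (1.0 if done else 0.0), done, {}

def random_agent(frame, instruction):
    # A real harness would send the rendered frame plus the instruction
    # to a VLM here and parse its reply into a legal action.
    return random.choice(["left", "right"])

env = ToyGridEnv()
obs, total = env.reset(), 0.0
for _ in range(100):
    obs, reward, done, _ = env.step(random_agent(obs, "walk to the right edge"))
    total += reward
    if done:
        break
print("episode return:", total)
```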

The Computer Science section of @arxiv is now requiring prior peer review for Literature Surveys and Position Papers. Details in a new blog post

Chinese DoorDash dropping MIT-licensed foundation video models??? “We introduce LongCat-Video, a foundational video generation model with 13.6B parameters, delivering strong performance across Text-to-Video, Image-to-Video, and Video-Continuation generation tasks.” huggingface.co/meituan-longca…

Humans handle dynamic situations easily; what about models? Turns out, they break in three distinct ways: ⛔ Force Stop → Reasoning leakage (won’t stop) ⚡️ Speedup → Panic (rushed answers) ❓ Info Updates → Self-doubt (reject updates) 👉 Check out dynamic-lm.github.io

✨Introducing ECHO, the newest in-the-wild image generation benchmark! You’ve seen new image models and new use cases discussed on social media, but old benchmarks don’t test them! We distilled this qualitative discussion into a structured benchmark. 🔗 echo-bench.github.io

Some problems can’t be rushed; they can only be done step by step, no matter how many people or processors you throw at them.

We’ve scaled AI by making everything bigger and more parallel: our models are parallel, our scaling is parallel, our GPUs are parallel. But what if the real bottleneck isn’t size, but depth? What if the model just didn’t have enough serial steps to get it right? Some problems need depth, not width. This is the Serial Scaling Hypothesis.

This is not the same as recent studies on scaling test-time compute, which focus on train vs. test and are agnostic to parallel vs. serial. For example, test-time majority voting increases compute by running models in parallel, but that doesn’t help when the task itself is serial. We argue that what really matters is how the compute is structured, and for many real-world problems, it must be serial.

Read more at arxiv.org/abs/2507.12549 or 🧵. (In collaboration with @layer07_yuxi, Kananart Kuwaranancharoen, and @YutongBAI1002)
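
To make the serial-vs-parallel distinction concrete, here is a toy sketch (mine, not from the paper): iterating a chaotic map. Each step depends on the previous one, so running many shallower copies in parallel and aggregating them, the analogue of test-time majority voting, cannot substitute for the missing serial steps. All names and numbers here are illustrative.

```python
import random

def step(x, r=3.9):
    # One step of the chaotic logistic map. The next value depends
    # entirely on the previous one, so computing step T is inherently serial.
    return r * x * (1.0 - x)

def run_serial(x0, n_steps):
    x = x0
    for _ in range(n_steps):
        x = step(x)
    return x

def run_parallel_shallow(x0, n_steps, replicas=1000, depth_frac=0.1):
    # "More parallel compute": many independent, slightly perturbed runs,
    # each with only a fraction of the serial depth, then averaged.
    shallow_depth = int(n_steps * depth_frac)
    outs = [run_serial(x0 + random.uniform(-1e-6, 1e-6), shallow_depth)
            for _ in range(replicas)]
    return sum(outs) / len(outs)

truth = run_serial(0.2, 1000)
approx = run_parallel_shallow(0.2, 1000)
# Typically large: 1000x more width doesn't recover the 10x missing depth.
print(abs(truth - approx))
```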

🚀 Call for Papers! 🚀 Excited to help organize the 4th Workshop on What is Next in Multimodal Foundation Models? at ICCV in Honolulu, Hawai'i 🌺 Submit work on vision, language, audio & more! 🗓️ Deadline: July 1, 2025 🔗 sites.google.com/view/mmfm4thwo… #MMFM4 #ICCV2025 #AI #multimodal
