lmms-lab
@lmmslab

33 posts

Feeling and building multimodal intelligence.

Singapore · Joined May 2024
41 Following · 333 Followers
lmms-lab retweeted
Shuai Liu @choiszt
Agents are mind-blowing. But they don't remember things consistently. Or when they do — it's not safe. We built Engram. AES-256 encrypted. Keys stay on your device. Zero-knowledge sync. No cloud. No middleman. Use it. Your agent memory is yours. @lmmslab github.com/EvolvingLMMs-L… youtu.be/I6xVuNRMkVc
[YouTube video]
1 reply · 3 reposts · 10 likes · 4.7K views
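The tweet describes Engram's design (local keys, AES-256, nothing readable leaves the device) but not its code. Below is a minimal Python sketch of that encrypt-before-sync pattern using AES-256-GCM from the `cryptography` package; the key path and function names are hypothetical illustrations, not Engram's actual API (see the linked repo for that).

```python
# Minimal sketch of local-key, encrypt-before-sync agent memory.
# Hypothetical names throughout; not Engram's real API.
import json
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

KEY_PATH = os.path.expanduser("~/.agent-memory/key.bin")  # stays on-device

def load_or_create_key() -> bytes:
    """Create a 256-bit key on first use; it never leaves this machine."""
    if os.path.exists(KEY_PATH):
        with open(KEY_PATH, "rb") as f:
            return f.read()
    os.makedirs(os.path.dirname(KEY_PATH), exist_ok=True)
    key = AESGCM.generate_key(bit_length=256)
    with open(KEY_PATH, "wb") as f:
        f.write(key)
    return key

def encrypt_entry(entry: dict, key: bytes) -> bytes:
    """Serialize and encrypt one memory entry; the blob is safe to sync."""
    nonce = os.urandom(12)  # unique nonce per message
    return nonce + AESGCM(key).encrypt(nonce, json.dumps(entry).encode(), None)

def decrypt_entry(blob: bytes, key: bytes) -> dict:
    """Split off the nonce and recover the entry locally."""
    nonce, ciphertext = blob[:12], blob[12:]
    return json.loads(AESGCM(key).decrypt(nonce, ciphertext, None))

key = load_or_create_key()
blob = encrypt_entry({"note": "user prefers concise answers"}, key)
print(decrypt_entry(blob, key))
```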
lmms-lab retweeted
Adhiraj Ghosh @adhiraj_ghosh98
Every day I'm alive I become more of a fan of @lmmslab. lmms-eval is coming in clutch during #CVPR2026 rebuttals!
0 replies · 3 reposts · 9 likes · 1.3K views
lmms-lab retweeted
Ziwei Liu @liuziwei7
🥳Year-End Reflection on the Growth of LMMs-Lab🥳
2025 has been a fruitful year for 🧠LMMs-Lab🧠 @lmmslab (lmms-lab.com), a non-profit open-source research organization dedicated to feeling and building the future of multimodal intelligence, with:
🌟 > 12,000 total GitHub stars
🍴 > 2,000 forks
🧑‍💻 > 30 core repositories
3 replies · 28 reposts · 234 likes · 9.7K views
lmms-lab retweeted
Ziwei Liu @liuziwei7
🔥LLaVA-OneVision upgraded to V1.5🔥
We @lmmslab present 🌋LLaVA-OV-1.5🌋, a fully open framework for democratized multimodal training:
* Superior performance surpassing Qwen2.5-VL
* High-quality data at scale
* Ultra-efficient training framework
- Repo: github.com/EvolvingLMMs-L…
6 replies · 36 reposts · 156 likes · 17.8K views
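The release above is a training framework; for readers who just want to run inference with a LLaVA-OneVision model, here is a minimal sketch using the LLaVA-OneVision classes in Hugging Face transformers. The checkpoint id below is an earlier public OneVision model used as a stand-in; the linked repo documents the actual V1.5 weights and entry points.

```python
# Minimal LLaVA-OneVision inference sketch via Hugging Face transformers.
# Checkpoint is an earlier public OV model, standing in for the V1.5 release.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "llava-hf/llava-onevision-qwen2-7b-ov-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, device_map="auto"
)

image = Image.open(requests.get("https://picsum.photos/512", stream=True).raw)
conversation = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in one sentence."},
    ],
}]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```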
lmms-lab @lmmslab
@_jasonwei bro really has the insight and the peace of mind to find more insights
0 replies · 0 reposts · 0 likes · 66 views
Jason Wei @_jasonwei
AI research is strange in that you spend a massive amount of compute on experiments to learn simple ideas that can be expressed in just a few sentences. Literally things like “training on A generalizes if you add B”, “X is a good way to design rewards”, or “the fact that method M is sample efficient means that we should create environments with this specific property”. But somehow if you find the correct five ideas and you really understand them deeply, suddenly you’re miles ahead of the rest of the field
25 replies · 37 reposts · 596 likes · 68.1K views
/ @gazorp5
@liuziwei7 @lmmslab @Gradio @_akhaliq The text formatting of the transcript leaves a lot to be desired, any plans on improving it? Thanks for open sourcing the model.
1 reply · 0 reposts · 0 likes · 144 views
lmms-lab @lmmslab
Feel the vibe~
[image]
4 replies · 1 repost · 5 likes · 579 views
lmms-lab retweeted
Brian Li @Brian_Bo_Li
VideoMMMU is a meticulously crafted benchmark designed to evaluate multimodal models' video understanding abilities on college-level videos. Videos carry tremendous knowledge; learning from them remains challenging for current models, but it is expected to become a crucial capability on the path toward AGI.
Kairui Hu @kairuicarry

🚀 Introducing Video-MMMU: Evaluating Knowledge Acquisition from Professional Videos

🎥 Knowledge-intensive videos: spanning 6 professional disciplines (Art, Business, Science, Medicine, Humanities, Engineering) and 30 diverse subjects, Video-MMMU challenges models to learn and apply college-level knowledge from videos.

❓ Knowledge-acquisition-based QA design: QA pairs are aligned with the three stages of cognitive learning:
· Perception: identifying knowledge.
· Comprehension: understanding the underlying concepts.
· Adaptation: applying the knowledge to practical scenarios.

📊 Quantitative knowledge-acquisition assessment (Δknowledge): a novel metric that quantifies how much a model improves after watching a video, providing unique insight into its knowledge-acquisition capability.

Why it matters:
🚀 Pushing the boundaries: Video-MMMU moves beyond perception and understanding of video to knowledge acquisition from video, positioning videos as a powerful medium for transmitting knowledge.
📚 Cognitive-level insights: Video-MMMU introduces three cognitive tracks (Perception, Comprehension, and Adaptation) that mirror human learning stages, providing a structured framework to evaluate how effectively models acquire, understand, and apply knowledge.
🧠 Bridging the gap: Video-MMMU uncovers critical limitations in current LMMs and provides insights for advancing LMMs' capabilities in knowledge acquisition from video.

Project page: videommmu.github.io
ArXiv: arxiv.org/html/2501.1382…

2 replies · 5 reposts · 27 likes · 2.9K views
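The thread defines Δknowledge only informally, as "how much a model improves after watching a video". One common way to formalize that is a normalized accuracy gain, which the sketch below implements; the exact formula used by Video-MMMU is in the linked paper, so treat this form as an assumption.

```python
# Sketch of a normalized knowledge-gain metric in the spirit of Δknowledge:
# raw improvement divided by the headroom the model had left. Assumed form;
# see the Video-MMMU paper for the exact definition.
def delta_knowledge(acc_before: float, acc_after: float) -> float:
    """Normalized gain in percent, given accuracies in [0, 1]."""
    if acc_before >= 1.0:
        return 0.0  # already perfect: no headroom to improve
    return 100.0 * (acc_after - acc_before) / (1.0 - acc_before)

# Example: 35% correct before watching the video, 48% after.
print(f"{delta_knowledge(0.35, 0.48):.1f}%")  # -> 20.0%
```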
lmms-lab retweeted
Ziwei Liu @liuziwei7
🤖Interpreting Large Multimodal Models (LMMs)🤖
We present an automatic framework to identify, interpret, and steer neurons within LMMs for safe AGI.
- Paper: arxiv.org/pdf/2411.14982
- Code: github.com/EvolvingLMMs-L…
- Model @huggingface: huggingface.co/collections/lm…
Thanks @_akhaliq!
[image]
lmms-lab @lmmslab

New work from LMMs-Lab! This time we present our latest research on the interpretation and safety of multimodal models.

2 replies · 41 reposts · 273 likes · 31.6K views
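The linked paper's framework decides automatically which neurons to identify, interpret, and steer; the toy PyTorch sketch below only illustrates the steering mechanism itself, scaling one hidden unit's activation through a forward hook. The model, layer, neuron index, and gain are all made up for illustration.

```python
# Toy illustration of neuron steering: amplify one hidden unit's activation
# via a forward hook and compare outputs. All choices here are hypothetical;
# the paper's framework selects neurons automatically.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
NEURON_IDX, GAIN = 7, 5.0  # which unit to steer, and how hard

def steer(module, inputs, output):
    output = output.clone()          # don't mutate the original activation
    output[..., NEURON_IDX] *= GAIN  # scale the chosen neuron
    return output                    # returned tensor replaces the output

x = torch.randn(1, 16)
baseline = model(x)
handle = model[1].register_forward_hook(steer)  # hook the ReLU's output
steered = model(x)
handle.remove()

print("baseline:", baseline)
print("steered: ", steered)
```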
lmms-lab retweeted
Wenhao Chai @wenhaocha1
🔥 We just submitted some baselines and benchmarks to lmms-eval @lmmslab (LLaVA team): evaluation is now just one line of code away! We call for the reporting of visual token counts when evaluating LMM performance!
- lmms-eval repo: github.com/EvolvingLMMs-L…
- VDC, the first benchmark for detailed video captions: rese1f.github.io/aurora-web/#ev…
- AuroraCap (VDC baseline): github.com/rese1f/aurora
- MovieChat, the first long-video understanding benchmark: github.com/rese1f/MovieCh…
- MovieChat baseline: github.com/rese1f/MovieCh…
[two images]
1 reply · 4 reposts · 19 likes · 3.3K views
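For anyone who wants to try the "one line" themselves: below is a hedged sketch of a typical lmms-eval invocation, wrapped in Python so the example is self-contained. The model name, checkpoint, task, and flags follow the repo's documented CLI but may have changed; check github.com/EvolvingLMMs-Lab/lmms-eval for the current interface.

```python
# Hedged sketch of a typical lmms-eval run; flags mirror the repo's CLI
# and may have drifted, so treat this as illustrative rather than exact.
import subprocess

subprocess.run([
    "python3", "-m", "lmms_eval",
    "--model", "llava",                                # model family
    "--model_args", "pretrained=liuhaotian/llava-v1.5-7b",
    "--tasks", "mme",                                  # benchmark task name
    "--batch_size", "1",
    "--output_path", "./logs/",
], check=True)
```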
lmms-lab retweeted
Shuang Li @ShuangL13799063
We are organizing a new workshop on "Knowledge in Generative Models" at #ECCV2024 to explore how generative models learn representations of the visual world and how we can use them for downstream applications. sites.google.com/ttic.edu/knowl… 📅30 September 2024, 2 PM
Anand Bhattad @anand_bhattad

We are organizing a new workshop on "Knowledge in Generative Models" at #ECCV2024 to explore how generative models learn representations of the visual world and how we can use them for downstream applications. For the schedule and more details, visit our website:
🔗 Website: sites.google.com/ttic.edu/knowl…
📅 Date: 30 September 2024, 2 PM
📍 Location: Brown 1, MiCo Milano, Italy 🇮🇹
🎤 Speakers: an amazing lineup providing diverse perspectives: @davidbau, David Forsyth, @shalinidemello, @YGandelsman, @phillip_isola and @liuziwei7
Organizing with @DuXiaodan, @nickKolkin, @graceluo_, @ShuangL13799063 and @grshakh
See you all in Milano!

1 reply · 3 reposts · 34 likes · 6.8K views
lmms-lab retweeted
Brian Li @Brian_Bo_Li
Great experience working with Lianmin to integrate LLaVA-OneVision into SGLang, and huge thanks to @PY_Z001 and @KaichenZhang358 for helping finish this. Try it on: github.com/sgl-project/sg… Directly try our demo (with the SGLang SRT API service): llava-onevision.lmms-lab.com
LMSYS Org @lmsysorg

We worked with the LLaVA team to integrate LLaVA-OneVision into SGLang v0.3. You can now launch a server and query it using the OpenAI-compatible vision API, supporting interleaved text, multi-image, and video formats.

0 replies · 8 reposts · 21 likes · 2.4K views
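The quoted announcement means any OpenAI-compatible client can talk to a locally served LLaVA-OneVision model. A minimal sketch, assuming SGLang is already serving the model (e.g. via `python -m sglang.launch_server --model-path lmms-lab/llava-onevision-qwen2-7b-ov --port 30000`; the launch flags, port, and model path are illustrative, check the SGLang docs):

```python
# Query a local SGLang server through its OpenAI-compatible vision API.
# Endpoint, port, and model name are assumptions for illustration.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="default",  # SGLang serves the launched model under a default name
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://picsum.photos/512"}},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }],
    max_tokens=64,
)
print(response.choices[0].message.content)
```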
CAMEL-AI.org @CamelAIOrg
🐫 CAMEL-AI Project Meeting (US-time friendly) - Next Monday
📢 Join us for our next development meeting to discuss our project's latest integrations and upcoming features.
🔧 New features from the last sprint:
- ✨ Integrated @togethercompute: a cloud platform for building and running generative AI.
- 👨‍💻 Integrated @LinkedIn: get information from LinkedIn.
- 🎥 Integrated a text-to-video model to boost @CamelAIOrg's multi-modal capability.
- 📜 PR review guidelines updated.
- 📂 Integrated @NebulaGraph: an open-source distributed graph database built for super-large-scale graphs.
- 🤖 @CamelAIOrg role-playing scraper for report & knowledge-graph generation.
- 📚 Added cookbooks to @CamelAIOrg's repo.
- 🤝 Integrated a model from @RekaAILabs.
- ⚙️ Integrated @SambaNovaAI: deploy state-of-the-art AI.
- 🗂️ Improved collection naming for the retrieval pipeline.
🚀 Features for the coming sprint:
- 📄 Add structured-output support from @OpenAI.
- ⚙️ Unify tool settings in ChatAgent.
- ✨ Support FLUX-1: a 12-billion-parameter rectified flow transformer capable of generating images from text descriptions.
- 💻 Integrate @Reddit: fetch data from it.
- 📱 Integrate @WhatsApp: popular social-platform support.
- 📰 Integrate @AskNewsApp: fetch real-time news with minimal bias.
- 🖥️ Add SmolLM model and WebGPU support.
- 👁️ Add Phi-3.5 vision-instruct VLM model.
- 📊 Add GraphDBBlock to LongtermAgentMemory.
- 🧾 Add OCR capability to @CamelAIOrg.
⏰ When: 17:00 (BST) / 09:00 (PDT)
🗺️ Where: discord.gg/Edhz4jYwMW?eve…
[image]
1 reply · 0 reposts · 6 likes · 2.1K views