Gregor Geigle

119 posts

@GregorGeigle

PhD student @Uni_WUE| NLP, Multimodal Vision+Language

Joined December 2020
92 Following · 189 Followers
vik @vikhyatk
New Moondream 2B release is out! Includes structured outputs, improved text understanding, gaze detection. And probably more things I'm forgetting about right now.
[image]
Gregor Geigle @GregorGeigle
@vikhyatk @snowclipsed @teortaxesTex @j0yk1ll Can confirm this works well. Used it for my recent model, too, because those mega long sequence lengths are a pain to train with. Do you pool the crops or concatenate them all together channel-wise? I found pooling to work better, surprisingly.
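The crop-aggregation question in the reply above can be sketched as follows — a minimal, hypothetical NumPy illustration of the two options (pooling crops vs. concatenating them channel-wise), not the actual code from either model; shapes and function names are assumptions:

```python
import numpy as np

def pool_crops(crop_feats):
    """Average per-crop features position-wise.

    crop_feats: (num_crops, seq_len, dim) array of vision-encoder
    outputs, one slice per image crop. Returns (seq_len, dim), so the
    LLM sees one short sequence regardless of how many crops exist --
    this is what keeps the sequence length from exploding.
    """
    return crop_feats.mean(axis=0)

def concat_crops_channelwise(crop_feats):
    """Concatenate crops along the feature (channel) dimension.

    Sequence length also stays fixed, but the feature dimension grows
    to num_crops * dim, so a wider projection layer must absorb it.
    """
    n, s, d = crop_feats.shape
    # (num_crops, seq_len, dim) -> (seq_len, num_crops, dim) -> (seq_len, num_crops*dim)
    return crop_feats.transpose(1, 0, 2).reshape(s, n * d)
```

Both variants avoid the long concatenated-token sequences mentioned in the tweet; they differ only in whether extra crops cost model width (concat) or are averaged away (pooling).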
Gregor Geigle @GregorGeigle
Next, we apply our lessons learned to train Centurio - state-of-the-art multilingual LVLMs trained with 100 languages based on Aya-Expanse @CohereForAI and Qwen 2.5 @Alibaba_Qwen. Weights on HuggingFace!
[image]
Gregor Geigle @GregorGeigle
@mervenoyann @visheratin Right, good point. As a PhD student at a university, I don't have to pay much attention to whether something is commercially permissible, but that's not true for others, of course.
merve @mervenoyann
Best multilingual SigLIP ever is now compatible with transformers 🫡🤗
[image]
Gregor Geigle @GregorGeigle
The monkey's paw worked well, so I will present 2(!) posters at @emnlpmeeting Wednesday at 4pm. I will be easy to spot - just look for the guy with crutches🩼
Gregor Geigle retweeted
Fabian David Schmidt @fdschmidt
Excited to present NLLB-LLM2Vec at @emnlpmeeting Tuesday 2pm! Drop by our poster to chat about multilingual & multimodal research. NLLB-LLM2Vec can now easily be used with @huggingface AutoModels — try it esp. for embedding low-resource languages! 🌐 huggingface.co/fdschmidt93/NL…
Fabian David Schmidt @fdschmidt

Introducing NLLB-LLM2Vec! 🚀 We fuse the NLLB encoder & Llama 3 8B trained w/ LLM2Vec to create NLLB-LLM2Vec which supports cross-lingual NLU in 200+ languages🔥 Joint work w/ Philipp Borchert, @licwu, and @gg42554 during my great research stay at @cambridgeltl

Gregor Geigle @GregorGeigle
Awesome work! I don't know why but it feels strange to see my University logo in the same figure as these big labs & groups😅
Xiang Yue @xiangyue96

🌍 I’ve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries!

🚀 Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! 🌐✨ neulab.github.io/Pangea/ arxiv.org/pdf/2410.16153

The Pangea family includes three major components:

🔥 Pangea-7B: A state-of-the-art multilingual multimodal LLM capable of 39 languages! Not only does it excel in multilingual scenarios, but it also matches or surpasses English-centric models like Llama 3.2, Molmo, and LlavaOneVision in English performance.

📝 PangeaIns: A 6M multilingual multimodal instruction tuning dataset across 39 languages. 🗂️ With 40% English instructions and 60% multilingual instructions, it spans various domains, including 1M culturally-relevant images sourced from LAION-Multi. 🎨

🏆 PangeaBench: A comprehensive evaluation benchmark featuring 14 datasets in 47 languages. Evaluation can be tricky, so we carefully curated existing benchmarks and introduced two new datasets: xChatBench (human-annotated wild queries with fine-grained evaluation criteria) and xMMMU (a meticulously machine-translated version of MMMU).

🙌 This is a joint leading effort with @yueqi_song. Also kudos to the amazing team @AkariAsai, @seungonekim, @Jeande_d, @simi_97k, @anjali_ruban, @lintangsutawika, @Sathya8NR, @gneubig for their hard work! Check out more results and insights we conclude from our training in the thread below. 👇

Gregor Geigle @GregorGeigle
@giffmana I knew of the B/16 model but must have missed that one. So I tested it to shamelessly plug my work (github.com/gregor-ge/Babe…): For classification, it is by far the best for English + mid/high-resource languages. Retrieval lags behind NLLB-SigLIP (-English). tl;dr: SigLIP sweep
[image]
Lucas Beyer (bl16) @giffmana
Yall know SigLIP-So400m right? Did you know there is also an international version of it? It slipped through the cracks during the original release, but now it’s on timm too. What to expect: slightly worse EN benchmarks, but significantly better language and culture coverage!
Ross Wightman @wightmanr

OpenCLIP passed 10K stars on GitHub this week. A big milestone for any open-source project. 🍻 to the many collaborators that made that possible. Coincidentally, I pushed a new release with a port of the largest multi-lingual SigLIP -- a SO400M/16 @ 256x256 that appeared on big_vision a little while back. Now on the @huggingface hub and usable via timm or OpenCLIP (update your timm too)! huggingface.co/timm/ViT-SO400…

Gregor Geigle @GregorGeigle
@ChenLiu47008770 Not surprising, since your work was the main inspiration for the master's thesis 👍 We did not use Fisher information, though — only a simple epoch-wise top-down schedule.
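The "epoch-wise top-down schedule" mentioned in the reply above could be sketched roughly like this — a hypothetical illustration of scheduled unfreezing, not the thesis's actual code; the block size and layer indexing here are assumptions:

```python
def layers_to_unfreeze(epoch, num_layers, layers_per_epoch=4):
    """Top-down epoch-wise unfreezing schedule (illustrative only).

    Layer num_layers - 1 is the top of the encoder. At epoch 0 only
    the top `layers_per_epoch` layers are trainable; each subsequent
    epoch unfreezes the next block below, until the whole stack
    trains. Returns the indices of the trainable layers for `epoch`.
    """
    n = min(num_layers, (epoch + 1) * layers_per_epoch)
    return list(range(num_layers - n, num_layers))
```

In a training loop one would set `requires_grad` only on the returned layer indices at the start of each epoch, so gradients flow top-down progressively rather than through the full encoder from step one.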
Gregor Geigle @GregorGeigle
A broken ankle might stop me from going to #ACL2024 myself but it won't stop *you* from checking out my accepted papers (1x main conference, 2x workshop):
Gregor Geigle @GregorGeigle
... 2) the thesis of my first Master's student Max: "Improving Vision-Language Cross-Lingual Transfer with Scheduled Unfreezing" (in the workshop proceedings).
[image]