Bo

1.4K posts


@bo_wangbo

@perplexity_ai

Berlin, Germany · Joined January 2014
664 Following · 3.7K Followers
Bo
Bo@bo_wangbo·
Hype aside, have you tested Gemini Embedding 2 on your private eval? In our case it was weaker than bge-m3.
English
4
0
36
2.2K
Bo retweeted
Denis Yarats
Denis Yarats@denisyarats·
looks like folks are really enjoying our embeddings, 500k downloads already
Denis Yarats tweet media
English
7
6
146
9.4K
Bo
Bo@bo_wangbo·
Computer on phone allows me to code while watching the baby ;)
Bo tweet media
Bo tweet media
English
4
2
39
33.6K
Bo
Bo@bo_wangbo·
@aniketmaurya I don’t know, but this is what I’m asking now
Bo tweet media
English
0
0
0
24
Antoine Chaffin
Antoine Chaffin@antoine_chaffin·
And I think it should be even more impressive in practice. Multi-vector models crush benches, yes. But most importantly, they generalize very well, and that makes a huge difference in production.
English
3
2
22
2.2K
Antoine Chaffin
Antoine Chaffin@antoine_chaffin·
Anyone need proof that multi-vector is going to win? Have a look at what it looks like when a cracked team tries hard. An omni model that actually **crushes** anything that exists, on any modality, on any domain. Congratulations to the team, this is truly impressive.
Mixedbread@mixedbreadai

Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.

English
3
8
85
11.4K
Ben Clavié
Ben Clavié@bclavie·
@bo_wangbo Whoops, sent that too fast. It's a tough nut to crack, especially indexing it in a way where it can hit high numbers. We found it to be extremely relevant to many real-world cases though, more than we thought
English
2
0
2
215
Ben Clavié
Ben Clavié@bclavie·
I'm so excited to introduce this! We've worked on a million different moving parts to produce this. I'm fairly confident it's the best multimodal model that exists, period -- and it's not too shabby at pushing back the LIMITs of retrieval either...
Mixedbread@mixedbreadai

Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.

English
36
40
410
138.2K
Bo retweeted
Perplexity
Perplexity@perplexity_ai·
Announcing Personal Computer. Personal Computer is an always on, local merge with Perplexity Computer that works for you 24/7. It's personal, secure, and works across your files, apps, and sessions through a continuously running Mac mini.
English
1.6K
3.5K
32.5K
14M
Panda
Panda@Jiaxi_Cui·
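It feels like Google relied on the audio, PDF, and video data accumulated through NotebookLM to fill in training data for those three modalities; otherwise it is hard to explain where so much aligned PDF data came from. In the CLIP era you could still crawl the internet at scale to find text-image pairs for contrastive learning, but the internet has no large supply of PDF-audio-video pairs. So the approach to multimodal embeddings has changed: you can no longer rely on unauthorized crawling to get large amounts of paired data, and you need a real product entry point of your own to keep accumulating it. Doing multimodal embeddings well actually has little to do with whether you are a model company; a standard transformer architecture with heavy contrastive learning could work just as well. What you need is people who own product scenarios involving documents, audio, and video. If Kimi, GLM, or MiniMax want to catch up with Google on multimodal embedding quality, the most convenient route seems to be acquiring an existing product entry point, such as @lifesinger's YouMind or @oran_ge's ListenHub!
Chinese (translated)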
4
10
72
24.1K
Knut Jägersberg
Knut Jägersberg@JagersbergKnut·
Alibaba not to abandon open source models
Knut Jägersberg tweet media
English
1
1
9
511
Bo
Bo@bo_wangbo·
@jackminong Time to show off your Singlish
English
0
0
1
141
Jackmin
Jackmin@jackminong·
everyone thinks i'm singaporean, am i too kiasu?
English
6
0
21
2K