1.4K posts

Bo

@bo_wangbo

@perplexity_ai

Berlin, Germany · Joined January 2014

664 Following · 3.7K Followers

Bo@bo_wangbo·10h

after hacking pylate and fast-plaid for 2 days now i'm a huge fan of LightOn's work!

Antoine Chaffin@antoine_chaffin

BrowseComp-Plus, perhaps the hardest popular deep research task, is now solved at nearly 90%... and all it took was a 150M model ✨ Thrilled to announce that Reason-ModernColBERT did it again and outperforms all models (including models 54× bigger) on all metrics

Bo@bo_wangbo·18h

Perplexity@perplexity_ai

Comet is now available for iOS. Download on the App Store: apps.apple.com/us/app/comet-a…

Bo@bo_wangbo·2d

@antoine_chaffin @matospiso Sparse ColBERT

Antoine Chaffin@antoine_chaffin·2d

@matospiso future is sparse isn't it

matospiso@matospiso·2d

bullish on sparse retrieval arxiv.org/abs/2603.13277

Bo@bo_wangbo·2d

Besides the hype, have you tested Gemini embedding 2 on your private eval? Our case: weaker than bge-m3.

Bo reposted

Denis Yarats@denisyarats·4d

looks like folks really enjoying our embeddings, 500k downloads already

Bo@bo_wangbo·4d

@atitaarora 😆 haha

atitaarora@atitaarora·4d

@bo_wangbo You are an inspiration man!

Bo@bo_wangbo·4d

Computer on phone allows me to code while watching the baby ;)

Bo@bo_wangbo·6d

@aniketmaurya I don’t know, but this is what I’m asking now

Aniket@aniketmaurya·6d

@bo_wangbo What do I use it for?

Bo@bo_wangbo·6d

Let’s burn credits!!

Perplexity@perplexity_ai

Perplexity Computer is now on mobile. Start any task on any device. Manage Computer from your phone or desktop with cross-device synchronization. Available now for iOS in the Perplexity app. Coming soon to Android.

Bo@bo_wangbo·12 Mar

@antoine_chaffin @lateinteraction I’m honestly impressed, might be a good time to come back to multi vec

Antoine Chaffin@antoine_chaffin·12 Mar

And I think it should be even more impressive in practice. Multi-vector models crush benches, yes. But most importantly, it does generalize very well, and it makes such a huge difference in production.

Antoine Chaffin@antoine_chaffin·12 Mar

Anyone need proof that multi-vector is going to win? Please have a look at what it looks like when a cracked team tries it hard. An omni model that actually **crushes** anything that exists, on any modality, on any domain. Congratulations to the team, this is truly impressive.

Mixedbread@mixedbreadai

Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.

Bo@bo_wangbo·12 Mar

@bclavie found it (huggingface.co/datasets/orion…) thanks and congrats for the release!

Ben Clavié@bclavie·12 Mar

@bo_wangbo Whoops sent that too fast - tough nut to crack, especially to index it in a way where it can hit high numbers. We found it to be extremely related to many real world cases though, more than we thought

Ben Clavié@bclavie·12 Mar

I'm so excited to introduce this! We've worked on a million different moving parts to produce this. I'm fairly confident it's the best multimodal model that exists, period -- and it's not too shabby at pushing back the LIMITs of retrieval either...

Mixedbread@mixedbreadai

Bo@bo_wangbo·11 Mar

Proud to see what the PPLX Search team and API team have delivered in the past weeks: 1. Better Search API: perplexity.ai/hub/blog/searc… 2. Embeddings API: docs.perplexity.ai/docs/embedding…

Perplexity@perplexity_ai

The Perplexity API platform is now a full-stack, model-agnostic API platform for building agents. It replaces your model provider, search layer, and embeddings, built on the same infrastructure that powers Perplexity.

Bo reposted

Perplexity@perplexity_ai·11 Mar

Announcing Personal Computer. Personal Computer is an always on, local merge with Perplexity Computer that works for you 24/7. It's personal, secure, and works across your files, apps, and sessions through a continuously running Mac mini.

Bo@bo_wangbo·11 Mar

@Jiaxi_Cui Synthetic data

Panda@Jiaxi_Cui·11 Mar

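My feeling is that Google relied on the audio, PDF, and video data accumulated through NotebookLM to fill in the training data for these three modalities; otherwise it is hard to explain where so much aligned PDF data came from. In the CLIP era you could still crawl the internet at scale to find text-image pairs for contrastive training, but the internet does not have large numbers of PDF-audio-video pairs. So the approach to building multimodal embeddings has actually changed: you can no longer rely on unauthorized crawling to get large amounts of paired data; you need a real product entry point of your own to keep accumulating data. Doing multimodal embeddings well actually has little to do with whether you are a model company; just taking a standard transformer architecture and pushing hard on contrastive learning would work too. So what is needed is people who have product scenarios involving documents, audio, and video. If Kimi, GLM, or MiniMax want to catch up with Google's multimodal embedding quality, the most convenient route seems to be acquiring an existing product entry point, such as @lifesinger's YouMind or @oran_ge's ListenHub!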

Bo@bo_wangbo·6 Mar

watching @AskPerplexity step-by-step help me convert pplx-embed from huggingface to GGUF in 1-shot is satisfying huggingface.co/bowang0911/ppl…