Marek Galovic

45 posts

@marek_galovic

Building the next-generation of search @topk_io. ex-Pinecone, ex-Shopify.

Joined August 2011
348 Following · 129 Followers

Marek Galovic @marek_galovic
vector search is a primitive. retrieval is the product.
Replies: 1 · Reposts: 5 · Likes: 39 · Views: 3.2K

Marek Galovic @marek_galovic
@tomaarsen @matospiso Sparse-only can work reasonably well with some tuning, but obviously, reranking makes it better. @tomaarsen, any plans to add multi-vec model support in Sentence Transformers?
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 35

tomaarsen @tomaarsen
@matospiso Very cool. I bet you can implement this into the SparseEncoder in Sentence Transformers too, with custom modules, but it would only output the sparse embeddings and not also the multi-vector embeddings for reranking, sadly :/
Replies: 2 · Reposts: 0 · Likes: 2 · Views: 323

Marek Galovic @marek_galovic
@adxtyahq We built this at @topk_io. 100% agree that retrieval quality matters more than the model, which is why we're all in on multi-vec retrieval.
Replies: 0 · Reposts: 0 · Likes: 0 · Views: 77

aditya @adxtyahq
“design a RAG pipeline for 10M docs with zero hallucination”

apparently this was asked in a Google L5 interview round. came across it somewhere on the internet and honestly it’s a way more interesting system design problem than most classic distributed systems questions

1. ingest + normalize docs
- remove duplicates, standardize formats, extract metadata, maintain version history
2. hybrid retrieval (BM25 + embeddings)
- BM25 handles exact keyword matching while embeddings capture semantic meaning
- semantic search alone usually struggles with precision at massive scale
3. ANN retrieval + reranking
- ANN (approximate nearest neighbor) quickly pulls top candidate chunks from millions of docs
- then a reranker rescoring step improves relevance by deeply comparing query vs retrieved chunks
4. source confidence scoring
- every retrieved chunk gets scored based on freshness, trust level, overlap and retrieval consistency
- low-confidence context should never heavily influence generation
5. constrained generation
- the model is only allowed to answer using retrieved context (nothing new to be invented outside of the retrieved context)
6. citation-backed responses
- every major claim links back to exact chunks, documents or timestamps
7. hallucination fallback layer
- if retrieval confidence drops below a threshold: “insufficient evidence found”
8. continuous evals
- run adversarial queries, retrieval recall benchmarks and hallucination tests continuously
9. caching + memory layer
- cache high-frequency enterprise queries and retrieval paths (improves latency and output)
10. observability everywhere
- trace retrieval paths, chunk rankings, token attribution and failure points

Also at 10M docs, retrieval quality matters more than the frontier model itself.
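
As a rough illustration of steps 2 and 7 above, here is a minimal sketch of fusing a BM25 ranking with an embedding ranking and falling back when confidence is low. The tweet does not say how the two rankings are combined; reciprocal rank fusion (RRF) is swapped in here as one common choice. The function names, the constant `k=60`, the `min_score` threshold, and the use of the fused score as a stand-in for "retrieval confidence" are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids (best first) into one ranking.

    Each doc gets 1 / (k + rank) from every list it appears in.
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def answer_or_fallback(fused, min_score):
    """Return ranked doc ids, or the fallback message from step 7.

    Using the top fused score as the confidence signal is an assumption;
    the tweet lists freshness, trust level, overlap and consistency instead.
    """
    if not fused or fused[0][1] < min_score:
        return "insufficient evidence found"
    return [doc_id for doc_id, _ in fused]


# Hypothetical rankings from a keyword index and a vector index.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
embedding_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(answer_or_fallback(fused, min_score=0.03))
```

In a fuller pipeline the fused candidates would go to the reranker from step 3 before any thresholding; that stage is left out to keep the sketch short.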
Replies: 85 · Reposts: 323 · Likes: 2.7K · Views: 191.9K

Marek Galovic @marek_galovic
While SMVE is great, it's only a part of the full picture. You still need model inference, durability, a scalable write & read path, doc. quantization, refinement, and more. @topk_io already solved it and offers an e2e solution for multi-vec retrieval on top of object storage.
Replies: 1 · Reposts: 0 · Likes: 5 · Views: 302

Marek Galovic @marek_galovic
Cool talk on multi-vector retrieval, why it beats single vector methods, and why it's hard to use in practice. 🧵on what's next (1/n):
Ben Clavié @bclavie

Information Retrieval is about making knowledge accessible. Late Interaction is the best way to do that today. But now that we have a new kind of user, it's time to zoom out so we can plan the future of retrieval. I gave a talk about this at @ir_tsukuba: docs.google.com/presentation/d…
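
For readers unfamiliar with late interaction, the sketch below shows ColBERT-style MaxSim scoring, one common late-interaction rule: each query token vector is matched to its best document token vector, and those maxima are summed. The toy two-dimensional vectors and the function names are made up for illustration and are not taken from the talk or from any particular library.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Sum, over query token vectors, of the best match among doc token vectors."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy token embeddings (hypothetical): one vector per token.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_1 = [[1.0, 0.0], [0.0, 1.0]]  # covers both query tokens
doc_2 = [[1.0, 0.0], [1.0, 0.0]]  # covers only the first query token

print(maxsim_score(query, doc_1))  # 2.0
print(maxsim_score(query, doc_2))  # 1.0
```

A single-vector model would compress each document into one embedding before comparison; keeping one vector per token is what lets the score reward a document for covering every part of the query.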

Replies: 1 · Reposts: 9 · Likes: 48 · Views: 7.3K

Marek Galovic @marek_galovic
running multi-vec search at 1.1TB/s inside @topk_io to mine hard negatives for a better multi-vec model is pretty cool
Replies: 1 · Reposts: 3 · Likes: 33 · Views: 2.7K

Marek Galovic @marek_galovic
@chamath Uploading everything into the context window was never the right approach. Context is fundamentally a search problem and that’s what we’re solving at @topk_io.
Replies: 0 · Reposts: 0 · Likes: 0 · Views: 21

Chamath Palihapitiya @chamath
Use Claude they said. Upload your decks they said. Unleash all this productivity they said. But apparently, I first need to start a new chat, delete some of the deck and not exceed the maximum image count…just like my existing brain.
Replies: 378 · Reposts: 65 · Likes: 2.5K · Views: 378K

Marek Galovic reposted
topk.io @topk_io
TopK is heading to @AICouncilConf in San Francisco. We're proud to be sponsoring this year and excited to connect with teams building real-world AI systems. At the Expo Hall, we'll be showcasing how the TopK Context Engine turns unstructured documents into evidence-backed context to make vertical AI agents more accurate and trustworthy. Come find us — we'd love to meet. May 12–14 · SF Marriott Marquis
Replies: 0 · Reposts: 3 · Likes: 5 · Views: 336

Marek Galovic reposted
topk.io @topk_io
Context is a search problem. TopK Context Engine significantly improves answer accuracy in enterprise research tasks while reducing token costs up to 13x. Get access: topk.io
Replies: 0 · Reposts: 2 · Likes: 8 · Views: 972

Marek Galovic reposted
Antoine Chaffin @antoine_chaffin
The new generation of open state-of-the-art single and multi-vector retrieval models is here. It's time, DenseOn with the LateOn 🎶 @LightOnIO releases models that leap past existing ones, and everything you need to do the same!
Replies: 13 · Reposts: 52 · Likes: 222 · Views: 39.7K