Mixpeek

205 posts

@mixpeek

The multimodal data warehouse. One API to decompose, store, and search video, image, audio, and documents. Built on Ray. 🏗️ Try live demos ↓

USA · Joined September 2021
9 Following · 140 Followers

Mixpeek (@mixpeek)
Your RAG pipeline is a Rube Goldberg machine. And it's why your AI hallucinates. LangChain → chunker → embedder → vector DB → re-ranker → LLM. 6 tools. 6 failure points. 6 things to debug at 3am. The fix: one retriever pipeline. Filter → Sort → Reduce → Enrich. One API call. mixpeek.com
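
The Filter → Sort → Reduce → Enrich flow named in the post can be sketched as a chain of small stages. This is only an illustration of the pattern, not Mixpeek's API: the `Doc` type, the stage helpers, and `run_pipeline` are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Doc:
    doc_id: str
    score: float
    tags: set[str] = field(default_factory=set)
    extra: dict[str, str] = field(default_factory=dict)


# A stage takes a list of documents and returns a (possibly smaller) list.
Stage = Callable[[list[Doc]], list[Doc]]


def filter_stage(required_tag: str) -> Stage:
    """Keep only documents carrying the given tag."""
    return lambda docs: [d for d in docs if required_tag in d.tags]


def sort_stage() -> Stage:
    """Order documents by descending relevance score."""
    return lambda docs: sorted(docs, key=lambda d: d.score, reverse=True)


def reduce_stage(limit: int) -> Stage:
    """Truncate to the top `limit` documents."""
    return lambda docs: docs[:limit]


def enrich_stage(sources: dict[str, str]) -> Stage:
    """Attach a source reference to each surviving document."""
    def run(docs: list[Doc]) -> list[Doc]:
        for d in docs:
            d.extra["source"] = sources.get(d.doc_id, "unknown")
        return docs
    return run


def run_pipeline(docs: list[Doc], stages: list[Stage]) -> list[Doc]:
    """Apply each stage in order, feeding its output to the next."""
    for stage in stages:
        docs = stage(docs)
    return docs


# Hypothetical data, for illustration only.
docs = [
    Doc("a", 0.20, {"finance"}),
    Doc("b", 0.90, {"finance"}),
    Doc("c", 0.95, {"legal"}),
    Doc("d", 0.50, {"finance"}),
]
result = run_pipeline(
    docs,
    [filter_stage("finance"), sort_stage(), reduce_stage(2),
     enrich_stage({"b": "report.pdf"})],
)
```

Because every stage shares one signature, stages can be reordered or swapped without touching the others, which is the main appeal of keeping retrieval in a single pipeline.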

Mixpeek (@mixpeek)
Vector databases are a scam. $80K/yr for 1B vectors. You're paying for RAM. We built the same thing on S3 for $3,500/yr. Rust shards, Ray coordinator, sub-10ms latency. 95% cheaper. Benchmarks open-sourced → mixpeek.com
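
As a quick check on the "95% cheaper" figure, the arithmetic can be worked through directly. This sketch takes the two prices in the post at face value and assumes both are annual costs for the same 1B-vector workload; the function names are made up for the example.

```python
def percent_savings(baseline: float, alternative: float) -> float:
    """Percentage saved by paying `alternative` instead of `baseline`."""
    return (baseline - alternative) / baseline * 100


def cost_per_million_vectors(annual_cost: float, total_vectors: int) -> float:
    """Annual cost normalised to one million stored vectors."""
    return annual_cost / (total_vectors / 1_000_000)


# Figures quoted in the post; treating both as yearly is an assumption.
BASELINE_COST = 80_000.0
S3_COST = 3_500.0
VECTORS = 1_000_000_000

savings = percent_savings(BASELINE_COST, S3_COST)
baseline_unit = cost_per_million_vectors(BASELINE_COST, VECTORS)
s3_unit = cost_per_million_vectors(S3_COST, VECTORS)
print(f"savings: {savings:.2f}%")
print(f"per million vectors: ${baseline_unit:.2f} vs ${s3_unit:.2f}")
```

Under those assumptions the saving comes out slightly above 95%, so the rounded claim is consistent with the two quoted prices.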

Mixpeek (@mixpeek)
RAG is the #1 AI pattern right now. Here's how it works in 60 seconds:
→ Retrieve: semantic search across your data
→ Augment: inject context into the LLM prompt
→ Generate: grounded answers, fewer hallucinations
Every serious AI app runs on this. We built a multimodal retrieval API to make it easy → mixpeek.com
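
The three steps can be sketched in a few lines of plain Python. This is a toy under stated assumptions: bag-of-words cosine similarity stands in for real semantic search, and the `llm` argument is any callable you supply, since no particular model API is implied by the post.

```python
import re
from collections import Counter
from math import sqrt
from typing import Callable


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Retrieve: rank passages by similarity to the query, keep the top k."""
    q = Counter(tokenize(query))
    ranked = sorted(corpus, key=lambda doc: cosine(q, Counter(tokenize(doc))), reverse=True)
    return ranked[:k]


def augment(query: str, passages: list[str]) -> str:
    """Augment: inject the retrieved passages into the prompt."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def rag_answer(query: str, corpus: list[str], llm: Callable[[str], str], k: int = 2) -> str:
    """Generate: hand the augmented prompt to whatever model is supplied."""
    return llm(augment(query, retrieve(query, corpus, k)))


# Hypothetical corpus; the "LLM" here just echoes its prompt.
corpus = [
    "Ray is a framework for distributed Python.",
    "S3 is object storage.",
    "Cats sleep a lot.",
]
prompt_seen = rag_answer("What is Ray?", corpus, llm=lambda prompt: prompt, k=1)
```

Swapping `retrieve` for an embedding-based search changes the quality of the context but not the shape of the loop.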

Mixpeek (@mixpeek)
This makes multi-vector search practical at scale for the first time. Prior approaches sacrificed either quality (single-vector), speed (brute-force), or guarantees (PLAID). The verticals where this hits hardest: financial doc search, medical imaging, legal discovery. Anywhere OCR is the bottleneck.

Mixpeek (@mixpeek)
We benchmarked every viable approach to multimodal document retrieval on financial tables and found a combination that hasn't been published before: ColQwen2 + MUVERA. 99.4% of brute-force quality. Sub-millisecond first-pass retrieval. OCR-based search isn't even close.
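
For readers new to multi-vector retrieval, here is a minimal sketch of the two-stage idea: a cheap single-vector first pass followed by exact late-interaction (MaxSim) reranking. It is not the benchmarked system. In particular, mean pooling is used here as a simple stand-in for MUVERA's fixed-dimensional encodings, and the toy 2-d vectors stand in for real ColQwen2 patch embeddings.

```python
Vector = list[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: list[Vector], doc_vecs: list[Vector]) -> float:
    """Late-interaction score: each query vector takes its best-matching
    document vector, and the per-query maxima are summed."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def mean_pool(vecs: list[Vector]) -> Vector:
    """Collapse a set of vectors into one (stand-in for a learned or
    constructed single-vector encoding; NOT the MUVERA construction)."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]


def two_stage_search(
    query_vecs: list[Vector],
    docs: dict[str, list[Vector]],
    candidates: int,
    top_k: int,
) -> list[str]:
    # Stage 1: cheap single-vector scoring to shortlist candidates.
    q_pooled = mean_pool(query_vecs)
    shortlist = sorted(
        docs, key=lambda doc_id: dot(q_pooled, mean_pool(docs[doc_id])), reverse=True
    )[:candidates]
    # Stage 2: exact MaxSim only on the shortlist.
    return sorted(
        shortlist, key=lambda doc_id: maxsim(query_vecs, docs[doc_id]), reverse=True
    )[:top_k]


# Toy data, for illustration only.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "a": [[1.0, 0.0], [0.0, 1.0]],
    "b": [[0.5, 0.5]],
    "c": [[-1.0, 0.0]],
}
best = two_stage_search(query, docs, candidates=2, top_k=1)
```

Running MaxSim over every document is the brute-force baseline; the first pass exists only to shrink the set that needs exact scoring.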

Mixpeek (@mixpeek)
Your agent doesn't need 6 microservices to go from "find similar products under $50" to ranked, enriched results. It needs one retriever with 6 stages.

Mixpeek reposted
Anthony Katsur (@anthonykatsur)
The @IABTechLab has launched the Agent Registry as part of our Agentic Ad Management Protocols #AAMP! The Agent Registry is a new place for the industry to publish, discover, and connect with advertising agents across the ecosystem. Early participants already include @Equativ, @mixpeek, and @PubMatic. If you are building agentic advertising services, register your agents and help shape the future of agentic workflows. To learn more: iabtechlab.com/introducing-th… #AAMP #AgenticAdvertising #Agentic #AI #IABTechLab #Standards

Mixpeek reposted
ethan steininger 🔎 (@ethansteininger)
Search infra assumes your query is a text string. What happens when it's a 500MB video? We just shipped query preprocessing for @mixpeek: decompose large files into chunks, batch embed in parallel, run concurrent searches, fuse results. One API call.
"query_preprocessing": { "max_chunks": 20, "aggregation": "rrf" }
Ingestion applied to the query. Same pipeline that indexed your data now runs on your search input. mixpeek.com/docs/retrieval…
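
The "rrf" aggregation in the config above presumably refers to reciprocal rank fusion, a standard way to merge several ranked lists. The sketch below shows the general technique with concurrent per-chunk searches; it is not Mixpeek's implementation. `search_fn` is a hypothetical callable you would supply, and `k=60` is the commonly used default from the RRF literature, assumed here rather than taken from the post.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Each list contributes 1 / (k + rank) per document; contributions are
    summed and documents are returned by descending fused score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def search_with_preprocessing(
    chunks: list[str],
    search_fn: Callable[[str], list[str]],  # hypothetical: chunk -> ranked doc ids
    max_chunks: int = 20,
    k: int = 60,
) -> list[tuple[str, float]]:
    """Cap the number of query chunks, search them concurrently, fuse results."""
    selected = chunks[:max_chunks]
    workers = max(1, min(8, len(selected)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rankings = list(pool.map(search_fn, selected))
    return reciprocal_rank_fusion(rankings, k)


# Toy index mapping each query chunk to a ranked result list.
fake_index = {
    "chunk-0": ["a", "b", "c"],
    "chunk-1": ["b", "a", "d"],
    "chunk-2": ["b", "c", "a"],
}
fused = search_with_preprocessing(list(fake_index), fake_index.__getitem__)
```

RRF only needs ranks, not comparable scores, which is why it works well when the per-chunk searches come from different embedding spaces or modalities.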

Mixpeek (@mixpeek)
Mixpeek Plugins are live. We built this because every team's extraction logic is different, and forcing everyone through the same pre-built models doesn't work at scale.

Now you can build custom feature extractors with:
→ Your own models
→ Your own weights
→ Your own API keys

Wire them into multi-collection decomposition pipelines that break complex assets (video, documents, images) into structured, searchable features across purpose-built collections. Then deploy real-time inference endpoints so those same models serve retrieval at query time, enabling multi-vector serving patterns like ColBERT, ColPali, and hybrid dense+sparse, all behind a single retriever.

What this unlocks: you're no longer choosing between flexibility and infrastructure. Fine-tune a domain-specific embedding model, plug it in, and get batch processing + real-time serving without stitching together five different systems.

Docs: mixpeek.com/docs/processin…
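
The idea of one extractor serving both batch ingestion and query-time inference can be illustrated with a small interface sketch. Everything here is hypothetical and is not the Mixpeek Plugins API: `FeatureExtractor`, `ExtractorRegistry`, and the toy byte-histogram "model" are invented purely to show the pattern.

```python
from typing import Protocol


class FeatureExtractor(Protocol):
    """Hypothetical plugin contract: a name plus an extract() method."""
    name: str

    def extract(self, asset: bytes) -> list[float]: ...


class ByteHistogramExtractor:
    """Toy stand-in for a custom model: a 4-bucket byte histogram."""
    name = "byte_histogram"

    def extract(self, asset: bytes) -> list[float]:
        buckets = [0, 0, 0, 0]
        for byte in asset:
            buckets[byte // 64] += 1
        total = len(asset) or 1  # avoid dividing by zero on empty input
        return [count / total for count in buckets]


class ExtractorRegistry:
    """Routes both batch and real-time calls to the same extractor."""

    def __init__(self) -> None:
        self._extractors: dict[str, FeatureExtractor] = {}

    def register(self, extractor: FeatureExtractor) -> None:
        self._extractors[extractor.name] = extractor

    def run_batch(self, name: str, assets: list[bytes]) -> list[list[float]]:
        """Ingestion path: featurize many stored assets."""
        return [self._extractors[name].extract(asset) for asset in assets]

    def run_realtime(self, name: str, asset: bytes) -> list[float]:
        """Query path: featurize a single input with the same model."""
        return self._extractors[name].extract(asset)


registry = ExtractorRegistry()
registry.register(ByteHistogramExtractor())
indexed = registry.run_batch("byte_histogram", [bytes([0, 1, 2, 3]), bytes([255])])
query_features = registry.run_realtime("byte_histogram", bytes([0, 64, 128, 192]))
```

Using the same extractor on both paths keeps indexed features and query features in the same space, which is what makes them comparable at search time.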

Mixpeek (@mixpeek)
Retrieval is the interface for the next generation of AI agents. Not "find documents"—find the exact 8 seconds, the specific table, the cited source. Grounded multimodal data for your agents. → mixpeek.com

Mixpeek (@mixpeek)
Production scale from day one. Financial services. Adtech. Healthcare. Media. Regulated industries where "it mostly works" isn't acceptable. Structured outputs your agent can cite. Timestamps it can reference. Sources it can link.

Mixpeek (@mixpeek)
Grounded multimodal data for your agents. We build the infrastructure that turns video, images, audio, and documents into structured context AI can actually use. Here's how we built it: