LanceDB

981 posts

@lancedb

Developer-friendly, open source AI-Native Multimodal Lakehouse https://t.co/wXn4tw5ySn

San Francisco, CA · Joined April 2023
62 Following · 4.2K Followers
LanceDB@lancedb·
@duckdb @AICouncilConf 3/ Lance covers the catalog — local dirs, object stores, REST namespaces. Quack covers the transport. Nothing changes on the storage or query side.
LanceDB@lancedb·
1/ Hannes Mühleisen announced Quack — @duckdb's new client-server protocol — at Day 1 of @AICouncilConf. Lance works with it out of the box.
LanceDB@lancedb·
2/ Index build: split into bounded segments, build in parallel. 10 workers → 5x faster. Build time is bounded by the slowest segment.
3/ Queries: Plan Executors fan out per segment. HNSW over centroids replaces linear scans. Walsh-Hadamard rotation drops RaBitQ prep from O(d²) → O(d log d).
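To make the O(d²) → O(d log d) claim concrete, here is a minimal sketch of a fast Walsh-Hadamard transform used as a randomized rotation. It is illustrative only and not LanceDB's implementation: the function names (`fwht`, `random_signs`, `pad_to_pow2`, `rotate`) are hypothetical, and zero-padding a 1536-dim vector up to 2048 is an assumption, since the transform needs a power-of-two length and the thread does not say how that is handled.

```python
import math
import random


def fwht(vec):
    """Normalized fast Walsh-Hadamard transform of a power-of-two-length vector.

    Runs log2(n) butterfly passes over n values, so the cost is O(n log n)
    instead of the O(n^2) of multiplying by a dense rotation matrix.
    """
    out = list(vec)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # keeps the transform orthonormal
    return [x * scale for x in out]


def random_signs(n, seed=0):
    """Fixed random +/-1 diagonal (assumption: one shared draw per index)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def pad_to_pow2(vec):
    """Zero-pad to the next power of two (assumption: e.g. 1536 -> 2048)."""
    n = 1
    while n < len(vec):
        n *= 2
    return list(vec) + [0.0] * (n - len(vec))


def rotate(vec, signs):
    """Randomized rotation: flip signs, then apply the Hadamard transform."""
    return fwht([s * x for s, x in zip(signs, vec)])
```

Because both the sign flip and the normalized Hadamard transform are orthogonal, `rotate` preserves vector norms and inner products, which is the property a rotation step before quantization relies on.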
LanceDB@lancedb·
1/ Past 1B vectors, three things break: index won't fit on one node, centroid scans go linear, RaBitQ rotation costs O(d²) per query at 1536 dims. Each needs a different fix 🧵
LanceDB@lancedb·
Vector search gets expensive because the index has to live in RAM — bigger dataset, bigger instance. LanceDB stores the index in S3 and memory-maps it, so RAM scales with QPS not data size. At 100M docs (1152-dim, SQ8): ~$779/mo. At 10M: ~$148. At 1M: ~$65. Full cost breakdown + OpenSearch comparison: lancedb.com/blog/opensearc…
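The quoted figures can be checked with a little arithmetic. The sketch below only reuses the three monthly costs from the post; the helper name `cost_per_million` is hypothetical and nothing here models how the costs were derived.

```python
# Monthly cost figures quoted in the post (1152-dim, SQ8), in USD.
quoted_monthly_usd = {
    1_000_000: 65.0,
    10_000_000: 148.0,
    100_000_000: 779.0,
}


def cost_per_million(docs, monthly_usd):
    """Monthly USD per one million documents."""
    return monthly_usd / (docs / 1_000_000)


for docs, usd in sorted(quoted_monthly_usd.items()):
    print(f"{docs:>11,} docs: ${usd:>6.0f}/mo, "
          f"${cost_per_million(docs, usd):.2f} per million docs")

# Growth factor in cost when the dataset grows 100x (1M -> 100M docs).
growth = quoted_monthly_usd[100_000_000] / quoted_monthly_usd[1_000_000]
```

On these numbers, 100x more data costs roughly 12x more per month, and the cost per million documents falls as the dataset grows.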
LanceDB@lancedb·
1/ @AICouncilConf starts tomorrow 🚀 Find us at the LanceDB booth — especially if you're training multimodal models at scale and your data layer is the bottleneck.
LanceDB@lancedb·
Tune in to Chang She's session, "Trillion is the New Billion: Managing Really Large Multimodal Datasets for AI":
- Why existing data infra wasn't built for search, curation, and training workloads
- How the Lance format addresses them at a foundational level
- How LanceDB fits alongside Iceberg in the data stack
aicouncil.com/talks26/trilli…
LanceDB@lancedb·
Hear ye, hear ye — the LanceDB team rides forth to @AICouncilConf in SF next week ⚔️🏰 Seek us at our booth within the walls of the Marriott Marquis. There, our knights of AI shall demonstrate how the multimodal lakehouse collapses five unruly systems into one sovereign table — so your researchers stop waiting and your GPUs stop starving. We'll be guarding the gates of the multimodal lakehouse. Come find us. aicouncil.com/sf-2026
LanceDB@lancedb·
Apache DataFusion meetup in San Francisco is back! @tech_optimist, AI Engineer at LanceDB, will be diving into the internals of distributed query execution built with Apache DataFusion and Lance, the multimodal lakehouse format. Also tune in to the other sessions by speakers from @RisingWaveLabs, @wherobots, and @paradedb covering data compaction, spatial data, and more. 📅 May 11, SF 🔗 Register: luma.com/k3ointcl Thank you @divs1101 for organizing!
LanceDB@lancedb·
Modern problems = modern solutions! Hear from the engineers at LanceDB, @dlthub, and @DataHubCloud building the ingestion, retrieval, and metadata layers of the open source AI stack. This event is designed for data engineers, ML engineers, platform teams, and anyone running pipelines in production. 📅 Wed May 13, Menlo Park luma.com/80pocni3
LanceDB@lancedb·
Model quality is a numbers game. Checkpoint overhead shouldn't be the thing that limits how many experiments you run. UDFs in LanceDB checkpoint at the fragment level. A crash at frame 70,000 of 80,000 resumes from the last checkpoint — not from zero. Each feature pipeline runs in isolation, so a failure in one doesn't touch the others. New data arrives → only new rows are computed. lancedb.com/blog/unifying-…
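The resume behaviour described above can be sketched in plain Python. This is not the LanceDB UDF API: `run_udf`, the in-memory `checkpoint` dict, and the `fail_at` crash hook are all hypothetical names used only to show the idea of committing results one fragment at a time.

```python
def run_udf(rows, udf, fragment_size, checkpoint, fail_at=None):
    """Apply `udf` to `rows` one fragment at a time.

    `checkpoint` maps fragment id -> results and stands in for durable
    storage (an assumption). A fragment is committed only after every row
    in it succeeds, so a rerun skips all committed fragments.
    `fail_at` simulates a crash at the given row index.
    """
    for frag_id, start in enumerate(range(0, len(rows), fragment_size)):
        if frag_id in checkpoint:
            continue  # already computed in an earlier run
        results = []
        for idx in range(start, min(start + fragment_size, len(rows))):
            if fail_at is not None and idx == fail_at:
                raise RuntimeError(f"simulated crash at row {idx}")
            results.append(udf(rows[idx]))
        checkpoint[frag_id] = results  # commit the whole fragment
    return [value for frag_id in sorted(checkpoint)
            for value in checkpoint[frag_id]]


# Scaled-down version of the example: 80 frames, crash at frame 70.
frames = list(range(80))
calls = []


def double(x):
    calls.append(x)  # record every real computation
    return x * 2


ckpt = {}
try:
    run_udf(frames, double, fragment_size=10, checkpoint=ckpt, fail_at=70)
    crashed = False
except RuntimeError:
    crashed = True

# The rerun recomputes only the fragment that had not been committed.
output = run_udf(frames, double, fragment_size=10, checkpoint=ckpt)
```

In this sketch the first run commits seven fragments before the crash, and the second run calls `double` only for the remaining ten frames. Rows appended later as new fragments would be handled the same way, since committed fragments are skipped.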