vortex

22 posts

@vortexdotdev

An extensible, state-of-the-art columnar file format. Formerly at @spiraldb, now a Linux Foundation project (@LFAIDataFdn). Apache-2.0

GitHub · Joined May 2025
14 Following · 249 Followers
vortex reposted
Ankush Gola (@ankush_gola11)
We leveraged two amazing open source projects when building SmithDB. One is @ApacheDataFusio: an extensible Rust-based query engine. We built custom execution plans specifically tuned for our workloads and storage backend, and DataFusion made it straightforward to plumb everything together. The other is @vortexdotdev: an extensible file format that allows you to build custom layouts with specific encoding and chunking strategies for different columns. I would highly recommend checking out both of these projects if you're interested in modern data systems.
Ankush Gola (@ankush_gola11)

We built SmithDB: the database purpose-built for agent observability workloads that now powers many parts of LangSmith.

Agent observability presents a challenging data problem. Agent traces can contain tens of thousands of intermediate spans and large, unbounded payloads. These characteristics are a direct result of agents running for longer time horizons and LLM context window sizes growing. Traditional data infrastructure was not built to handle the complexities associated with storing and querying this data.

SmithDB brings LangSmith up to 12x performance improvements across the access patterns most important for agent observability.

I’ve been working on SmithDB directly with an amazing team over the past few months, and I’m incredibly proud of the results we’re seeing. I wrote a bit more about the story and engineering challenges behind SmithDB in this blog. Additionally, if you’re a systems engineer interested in building the future of agent observability, please reach out!

vortex reposted
Spice AI (@spice_ai)
The Research Behind Modern Data Compression & @vortexdotdev

When we chose Vortex as the storage layer for Spice Cayenne (the data accelerator engine in Spice), we were betting on decades of database research finally reaching production-ready maturity.

Here's the research behind Vortex:

📄 BtrBlocks (SIGMOD 2023) - The core algorithm from the Technical University of Munich. Cascading multiple lightweight encodings outperforms monolithic compression. Optimize for decompression speed, not just compression ratio.
📄 FastLanes (VLDB 2023) - Hardware-friendly integer compression. Structures bit-packing to maximize SIMD utilization across AVX-512, AVX2, and ARM NEON. Near-memory-bandwidth decompression.
📄 FSST (VLDB 2020) - Fast Static Symbol Table for strings. Near-LZ4 ratios at 5-10× faster decompression. Critical for string-heavy columns.
📄 ALP (CWI Amsterdam) - Adaptive Lossless floating-Point compression. Exploits real-world float patterns (prices with 2 decimals, sensor readings with limited precision).
📄 MonetDB/X100 + Morsel-Driven Parallelism - Foundations for vectorized, NUMA-aware query execution that Vortex builds on.

The result? Compression that is tailored to your data:
• Integers via FastLanes bit-packing
• Floats via ALP adaptive encoding
• Strings via FSST symbol tables
• Timestamps via delta encoding
• Sorted columns via run-length encoding

Why does this matter for production systems?
1️⃣ Query performance scales with decompression speed. Focus on decode performance translates directly to faster queries.
2️⃣ Automatic encoding selection means zero configuration. The algorithm samples your data and picks optimal strategies per column.
3️⃣ SIMD acceleration is baked in. FastLanes was designed for vectorized, hardware accelerated execution from day one.
4️⃣ Zero-copy Arrow access. Data decompresses directly to Arrow arrays with no intermediate copies.

Vortex is now a Linux Foundation AI & Data project, and researchers are building on it (Anyblox, F3). You get SOTA research in production systems. The future of data storage is exciting.

To learn more about our Vortex implementation, check out the blog: hubs.ly/Q04bGfvf0

#datafusion #ai #data #vortex #spiceai #arrow #parquet
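
The idea in the post above (lightweight encodings chosen per column from a sample of the data) can be illustrated with a toy Python sketch. This is not Vortex's or BtrBlocks' actual algorithm or API: the function names, the cost model, and the `sample_size` parameter are all hypothetical, and the sketch only estimates sizes rather than producing packed bytes.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode into (value, run_length) pairs."""
    return [(v, len(list(g))) for v, g in groupby(values)]

def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]

def delta_encode(values):
    """Keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

def bits_needed(values):
    """Bits per value after subtracting the minimum (frame of reference)."""
    if not values:
        return 0
    return (max(values) - min(values)).bit_length()

def estimated_bits(values):
    """Toy cost model (an assumption): payload bits only, no headers."""
    plain = len(values) * bits_needed(values)
    deltas = delta_encode(values)[1:]
    delta = 64 + len(deltas) * bits_needed(deltas)  # 64 bits for the first value
    runs = rle_encode(values)
    run_values = [v for v, _ in runs]
    run_lengths = [n for _, n in runs]
    rle = len(runs) * (bits_needed(run_values) + bits_needed(run_lengths))
    return {"bitpack": plain, "delta+bitpack": delta, "rle+bitpack": rle}

def pick_encoding(values, sample_size=1024):
    """Choose the cheapest scheme on a prefix sample (hypothetical heuristic)."""
    costs = estimated_bits(values[:sample_size])
    return min(costs, key=costs.get)

timestamps = list(range(1_700_000_000, 1_700_001_000))  # steadily increasing
sorted_flags = [1] * 500 + [2] * 500                     # long runs
scattered = [5, 900, 17, 402, 3, 768]                    # no obvious pattern

print(pick_encoding(timestamps))
print(pick_encoding(sorted_flags))
print(pick_encoding(scattered))
```

Each scheme here feeds its output into bit-packing, which is the "cascading" part: delta or run-length encoding first shrinks the value range, and bit-packing then stores the smaller numbers in fewer bits. A real implementation would sample more carefully and account for metadata overhead.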
vortex reposted
Will Manning (@willmanning)
Connor Tsui & I just merged a first cut of TurboQuant into @vortexdotdev, already validated on production embeddings 🚀🚀🚀
vortex (@vortexdotdev)
you took up with Weasley, but he can't afford sliceable cascaded encodings. now your random access is dogged, and your cortisol is properly spiked, potter
vortex reposted
Luke Kim (@lukekim)
CASE-WHEN support coming to @vortexdotdev. Guess I'm a Vortex contributor now!
vortex reposted
Luke Kim (@lukekim)
🌪️ Why LF Vortex for hot data?
@ApacheParquet: great compression, slow decode
@ApacheArrow: instant decode, no compression
Vortex: encoding-efficient compression with SIMD decode to Arrow
80% of Parquet's compression, 10x faster decode
vortex reposted
Alfonso Subiotto ❄️ (@asubiotto)
Happy to share that I've been nominated to the @vortexdotdev Technical Steering Committee! It's been fun and productive switching to Vortex from Parquet as our storage format at Polar Signals and I'm excited to continue contributing to the Vortex project.
vortex reposted
Polar Signals (@PolarSignalsIO)
We completed a major project to switch our storage file format from Parquet to Vortex 🌪️ resulting in 70% average query performance improvement across the board 🚀 Learn more about how rethinking interface-imposed limitations unlocked these gains in our latest blog post 👇
vortex reposted
Andrew Lamb (@andrewlamb1111)
The talk on @SpiralDB at @CMUDB youtube.com/watch?v=zyn_T5… is a great one. I think it would also be interesting to hear a counterpoint about @ApacheParquet that explains actual technical details of that format, the Cathedral vs Bazaar management, options with Metadata, etc.
vortex (@vortexdotdev)
Go check out our latest post, sharing new developments from the past month 🗓️💻☕️ vortex.dev/blog/september