Delta Lake
2.4K posts

Delta Lake
@DeltaLakeOSS
Delta Lake is an open-source storage framework that enables building a Lakehouse architecture for Spark, Flink, Trino, Hive, Scala, Java, Rust, Python, & more!
San Francisco, CA · Joined April 2019
67 Following · 10.7K Followers

Best of Both Worlds: Leveraging delta-kernel-rs to Unify the Open Lakehouse x.com/i/broadcasts/1…

📣 𝗡𝗲𝘅𝘁 𝗢𝗽𝗲𝗻 𝗟𝗮𝗸𝗲𝗵𝗼𝘂𝘀𝗲 + 𝗔𝗜 𝗪𝗲𝗯𝗶𝗻𝗮𝗿: 𝗧𝘂𝗲𝘀𝗱𝗮𝘆, 𝗠𝗮𝗿𝗰𝗵 𝟭𝟬!
Open table formats promise engine-agnostic access, but independent protocol maintenance is costly. The Delta Kernel solves this by abstracting the Delta Lake protocol behind a clean API. 🛠️
Join this session to explore how @ClickHouseDB integrated 𝚍𝚎𝚕𝚝𝚊-𝚔𝚎𝚛𝚗𝚎𝚕-𝚛𝚜 into its single-binary C++ build system. 🚀
🎟️ Register: luma.com/OLAI-310
#openlakehouse #oss #deltalake #openlakehouseai #clickhouse

Building a Scalable Usage Insights Platform with Delta Sharing x.com/i/broadcasts/1…

The Next Evolution of Delta Lake: Catalog-Managed Tables 🚀
We are excited to share that 𝗗𝗲𝗹𝘁𝗮 𝗟𝗮𝗸𝗲 𝟰.𝟭.𝟬 introduces 𝗰𝗮𝘁𝗮𝗹𝗼𝗴-𝗺𝗮𝗻𝗮𝗴𝗲𝗱 𝘁𝗮𝗯𝗹𝗲𝘀, which establish the catalog as the coordinator of table access and source of truth for table state! This evolution simplifies discovery and governance while unlocking significant performance gains.
🔹 𝗦𝘁𝗮𝗻𝗱𝗮𝗿𝗱𝗶𝘇𝗲𝗱 𝘁𝗮𝗯𝗹𝗲 𝗱𝗶𝘀𝗰𝗼𝘃𝗲𝗿𝘆 𝗮𝗻𝗱 𝘂𝗻𝗶𝗳𝗶𝗲𝗱 𝗴𝗼𝘃𝗲𝗿𝗻𝗮𝗻𝗰𝗲: The catalog facilitates access through logical table identifiers and grants clients appropriate permissions to data, dramatically simplifying how engines discover and use tables in a governed manner.
🔹 𝗘𝗻𝗳𝗼𝗿𝗰𝗲𝗮𝗯𝗹𝗲 𝗰𝗼𝗻𝘀𝘁𝗿𝗮𝗶𝗻𝘁𝘀: The catalog can authoritatively validate or reject schema and constraint changes, preventing incompatible updates that could compromise data integrity or break downstream workloads.
🔹 𝗢𝗽𝗲𝗻 𝗳𝗼𝘂𝗻𝗱𝗮𝘁𝗶𝗼𝗻: This design aligns Delta with the catalog-managed model pioneered by Apache Iceberg, making it simpler for practitioners to discover and govern data consistently regardless of format.
Check out the blog to learn more 👉 delta.io/blog/2026-02-0…
#deltalake #catalogs #unitycatalog #opensource #oss

Open table formats promise engine-agnostic access, but independent protocol maintenance is costly. The Delta Kernel solves this with a clean API for optimized readers.
Join us to see how @ClickHouseDB integrated delta-kernel-rs into its zero-dependency C++ build system. 🚀
We will cover:
📌 The Kernel’s architecture
📌 Real-world challenges of embedding Rust in a C++ codebase—from static linking and sanitizer support to cross-compilation failures
📌 What’s next for the project
🗓️ March 10, 2026
🕕 9:00AM PT
Save your spot! ➡️ luma.com/OLAI-310
#DeltaLake #ClickHouse #Rust #CPP #OpenSource

📣 Join us next Wednesday, Feb 11 at 9AM PT for Delta Hacks: How Delta can propel organizations to AI data readiness!
🔗 REGISTER: luma.com/deltahacks-0211
AI success starts with your data, but building reliable data pipelines shouldn’t feel like rocket science. 🚀 We’re cutting through the noise to show you how to build production-grade pipelines using the first principles of software and data engineering.
We’ll cover:
🔹 Bridging the Gap: Moving from traditional SQL thinking to scalable PySpark patterns
🔹 Architecture over Hype: Applying software engineering principles to your data flow
🔹 Practical Frameworks: How to write cleaner, more maintainable pipelines that solve data quality issues for good
#deltalake #opensource #oss #ai #dataengineering #pipelines

📣 Engineering Dynamic Lineage: Column-level lineage using @OpenLineage, @ApacheSpark, and Delta Lake
Traditional static lineage tools fail to track real-time data flow in complex enterprise environments. Join us as we explore 𝗱𝘆𝗻𝗮𝗺𝗶𝗰, 𝗱𝗲𝘁𝗲𝗿𝗺𝗶𝗻𝗶𝘀𝘁𝗶𝗰 𝗹𝗶𝗻𝗲𝗮𝗴𝗲 𝗮𝘀 𝘁𝗲𝗹𝗲𝗺𝗲𝘁𝗿𝘆—treating query execution as parseable events that reveal current data state.
𝗪𝗲'𝗹𝗹 𝗰𝗼𝘃𝗲𝗿:
🔹 Dynamic, deterministic lineage vs. static maps
🔹 OpenLineage as the open source alternative to proprietary solutions
🔹 Stitching flows across 1000s of jobs via Spark listeners
🔹 Integrating lineage alongside tables in the Lakehouse
🔗 Register: luma.com/delta-0224
🗓️ Feb 24 | 9AM PT
📺 Live on LinkedIn, YouTube & X
#opensource #oss #deltalake #telemetry #openlineage #apachespark

Thinking about open lakehouse evolution, deferred computation, or large-scale multimodal data platforms? 🤔 This blog, Multimodal with Delta Lake, by R. Tyler Croy (founding member of the delta-rs project) is a must-read.
Read the full post here 👉 brokenco.de/2026/01/19/mul…
(Image generated by ChatGPT based on Tyler's blog focusing on vdt1, @dennylee)
#deltalake #opensource #oss #multimodal

AI success starts with data—but building reliable pipelines shouldn't feel like rocket science. 🚀
It is time to cut through the complexity. Join Andrew Sitz to learn how to build production-grade pipelines using the first principles of software and data engineering.
Discover how to bridge the gap between traditional SQL thinking and modern Delta Lake and PySpark patterns. 🙌 Walk away with a clear framework for writing cleaner, more maintainable pipelines.
🕓 9AM PT
🗓️ February 11
🔗 RSVP: luma.com/deltahacks-0211
#opensource #deltalake #oss #ai

Unified Data, Unified Insights: Meet the Graph Lakehouse x.com/i/broadcasts/1…

Hi all! Thanks for being here. Where are you tuning in from? 🌏
Have questions? Please ask away! We will answer them during the Q&A portion of today's session. 🙌
cc @puppyquery