Julien Le Dem

11K posts

@J_

Architect, Founder, Angel, Advisor, Keynote speaker, OSS: @OpenLineage @MarquezProject, ASF: Parquet Arrow Iceberg 🐖. 🦋 https://t.co/4VQUXaZ5vu . he/him

California · Joined May 2009
2.1K Following · 4.1K Followers
Julien Le Dem @J_
@changhiskhan My point is that the upper layers can be built on top of the existing format. You don’t need to start a new stack from scratch. Not everything needs to be in Parquet, and you don’t need a new format to build them.
changhiskhan @changhiskhan
@J_ Great article and really thoughtful! Encodings are almost the least interesting part of Lance, though. Requirements for AI data and workloads span multiple layers. Even if Parquet moves forward on encodings, there’s still so much that’s missing for AI engineers and researchers *now*.
Julien Le Dem @J_
In the past few years, we’ve seen a Cambrian explosion of new columnar formats challenging the hegemony of Parquet. Presumably, the design of yore is not going to cut it moving forward. I spent some time understanding how things have actually changed. sympathetic.ink/2025/12/11/Col…
Julien Le Dem retweeted
Andrew Lamb @andrewlamb1111
There is some crazy (good) activity on the @ApacheParquet mailing list for new encodings. A sample: PFOR, FSST, ALP, Strings and Cascaded Encodings. 🤯 Huge kudos to Arnav Balyan, Prateek Gaur, and Micah Kornfield for driving this. lists.apache.org/list.html?dev@…
Julien Le Dem @J_
I’ll be speaking in Mountain View on Thursday. Come say hi!

Delta Lake @DeltaLakeOSS
Parquet sparked a revolution in columnar storage. Now AI workloads are driving a new wave of change. At Open Lakehouse + AI Mini Summit, Julien Le Dem (@datadoghq) will cover:
🔹 What’s changed since Parquet was introduced
🔹 Why new columnar formats are emerging now
🔹 The encoding advances shaping what comes next, and how they’re pushing Parquet to evolve
📅 Nov 13 | Mountain View
🔗 Register: luma.com/OLMS-1113
#openlakehouse #opensource #columnstorage #ai #parquet

Julien Le Dem retweeted
Hyperparam @hyperparamapp
Cool Parquet metadata visualizer by @J_, powered by Hyparquet
Julien Le Dem @J_
If you've been wondering why we see a flurry of new columnar formats, come see me present "Column Storage for the AI Era". I'll talk about what has changed, new advances in data encoding and how that's pushing Parquet to evolve. Event tomorrow: luma.com/pxikwty3
Julien Le Dem @J_
@julianhyde Ironically it is simultaneously terrible and the thing everyone compares themselves against :)
Julian Hyde @julianhyde
@J_ I hope you get a dime every time someone says Parquet is terrible.
Julien Le Dem @J_
I'm trying to get a better understanding of real-life deployments of open-source ClickHouse. If you're using it, what does your deployment look like?
Julien Le Dem retweeted
Andrew Lamb @andrewlamb1111
Quite a list of contributors already to the Rust @ApacheParquet implementation of Variant (support for semi-structured data). I was making some slides to explain what Variant is and put together a list I wanted to share. The feature will be amazing. github.com/apache/arrow-r…