CedarDB

41 posts

@cedar_db

One Database, Endless Possibilities

Joined May 2024
31 Following · 2K Followers

CedarDB @cedar_db
Not every feature gets the attention it deserves at release time. This is the first in our new monthly series: a closer look at the most impactful things we have shipped recently. 👉 cedardb.com/blog/release_n… 🌲

CedarDB @cedar_db
When you’re over the tips of your skis in analytics, simple tools don’t cut it. We used Parquet to carry Stack Overflow data from ClickHouse to CedarDB. Result: CedarDB ran our complex queries more than 4X faster. Read the full ski-themed benchmark ⛷️ cedardb.com/blog/ski_parqu…

CedarDB @cedar_db
Strings are everywhere, and they are often the most filtered columns in analytics. So compression isn't just about saving space, it's also about query speed. CedarDB now uses FSST to reduce string storage while keeping queries fast with the help of a dictionary. cedardb.com/blog/string_co…
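
To illustrate why a dictionary can keep filters fast on a string column, here is a minimal sketch in Python. It shows plain dictionary encoding only: it does not implement FSST, and it makes no claim about CedarDB's actual storage layout. The function names `dictionary_encode` and `filter_rows` are hypothetical and exist only for this example.

```python
from typing import Callable, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Return (sorted list of distinct strings, one integer code per row)."""
    dictionary = sorted(set(values))
    code_of = {s: i for i, s in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]


def filter_rows(
    dictionary: List[str],
    codes: List[int],
    predicate: Callable[[str], bool],
) -> List[int]:
    """Return the row indices whose string satisfies the predicate.

    The string predicate runs once per distinct value in the dictionary.
    The per-row work is only an integer membership test on the codes.
    """
    matching_codes = {i for i, s in enumerate(dictionary) if predicate(s)}
    return [row for row, code in enumerate(codes) if code in matching_codes]


if __name__ == "__main__":
    column = ["python", "sql", "python", "rust", "sql", "python"]
    dictionary, codes = dictionary_encode(column)
    print(dictionary)
    print(codes)
    print(filter_rows(dictionary, codes, lambda s: s.startswith("p")))
```

With many repeated values, the expensive string comparison happens a handful of times instead of once per row, which is the intuition behind pairing compression with a dictionary.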

CedarDB retweeted
sisyphus bar and grill @itunpredictable
@cedar_db is incredibly cool and more people should know about it. They’re a team of PhDs in Munich building a new relational database, on top of almost 10 years of academic research, that crushes existing benchmarks and maybe (finally?) gets us to the HTAP grail.

The core idea is that existing RDBMSes like MySQL and Postgres were built more than 30 years ago, on assumptions about hardware constraints that are just not true anymore. These ecosystems have evolved admirably but ultimately…it’s a database. It’s built not to change very much.

Here are a few of the ways that CedarDB is rethinking every element of the database:

1) A better query optimizer

In the last 30 years we’ve made a lot of progress on how to optimize SQL queries, to the point where an optimized query can easily outperform a not-so-optimized query by a ton. But not many query optimization improvements have made the leap from research into databases today. CedarDB did a few things on this front:
- Implemented the unnesting algorithm developed by Thomas Neumann (one of the leaders of the Umbra research project CedarDB came from), an improvement of more than 1000x
- Developed a novel approach to join ordering using adaptive optimization that can handle 5K+ relations
- Created a statistics subsystem that tells the optimizer things that traditional databases can’t

2) What if your database was actually a compiler?

CedarDB doesn’t interpret queries; it generates code instead. For every SQL query that a user writes, CedarDB processes and optimizes it, then generates machine code that the CPU can directly execute. This has been a holy grail for a while, and they implemented it via a custom low-level language that is cheap to convert into machine code via a custom assembler.

Another way that CedarDB improves performance is through Adaptive Query Execution. Essentially they start executing each query immediately with a “quick and dirty” version, while working on better versions in the background.

3) Taking advantage of all cores / Amdahl’s law

Distributing work fairly across all available cores is notoriously difficult, and the CedarDB team would argue that most databases underutilize their hardware. Their clever approach to this problem is called morsel-driven parallelism. CedarDB breaks down queries into segments: pipelines of self-contained operations. Then, data is divided into “morsels” per segment: small input data chunks containing roughly 100K tuples each. You can read more in the original paper here: db.in.tum.de/~leis/papers/m…

4) Rethinking the buffer manager

Modern systems come equipped with massive amounts of RAM; there’s actually much more “room at the club” than database developers initially assumed. So the idea of the revamped buffer manager in CedarDB is that you can (and should) expect variance not just in data access patterns, but in storage speed and location, page sizes and data organization, and memory hierarchy.

CedarDB’s buffer manager is designed from the ground up to work in a heavily multi-threaded environment. It decentralizes buffer management with pointer swizzling: each pointer (memory address) knows whether its data is in memory or on disk, eliminating the global lock that throttles traditional buffer managers.

5) Building a database for change

Databases are built to not change. It’s exactly this stability that gives each generation the confidence to build their apps (no matter how different they are) on systems like Postgres. You know what you’re getting. But there’s also a clear downside to this rigidity. CedarDB’s storage class system employs pluggable interfaces where adding new storage types doesn’t require rewriting other components. E.g. if CXL becomes the go-to storage interface at some point in the future, you don’t need to write another whole component, you just need another endpoint for the buffer manager.

Anyway these are just a few of the ideas they’re bringing to the table. Maybe it’s because they’re in Germany, maybe it’s because they’re just really humble, but more people should know about this team!! Check out the full post here: amplifypartners.com/blog-posts/the…
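
The morsel idea described in the thread above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CedarDB's implementation: `MorselDispenser` and `parallel_filtered_sum` are hypothetical names, the "pipeline" is a single filter plus sum, and Python threads share one interpreter lock, so the sketch shows only the scheduling pattern, not a real speedup.

```python
import threading
from typing import List, Optional, Tuple

MORSEL_SIZE = 100_000  # the thread cites roughly 100K tuples per morsel


class MorselDispenser:
    """Hands out half-open [start, end) ranges over the input, one morsel at a time."""

    def __init__(self, total: int, morsel_size: int = MORSEL_SIZE) -> None:
        self._total = total
        self._size = morsel_size
        self._next = 0
        self._lock = threading.Lock()

    def next_morsel(self) -> Optional[Tuple[int, int]]:
        # The lock protects only the cursor; the work itself runs unlocked.
        with self._lock:
            if self._next >= self._total:
                return None
            start = self._next
            self._next = min(start + self._size, self._total)
            return start, self._next


def parallel_filtered_sum(
    data: List[int],
    threshold: int,
    workers: int = 4,
    morsel_size: int = MORSEL_SIZE,
) -> int:
    """Sum all values greater than threshold, one morsel per worker at a time."""
    dispenser = MorselDispenser(len(data), morsel_size)
    partials = [0] * workers  # one slot per worker, so no shared accumulator

    def run(worker_id: int) -> None:
        local = 0
        # A fast worker simply comes back for more morsels, which balances load.
        while (morsel := dispenser.next_morsel()) is not None:
            start, end = morsel
            for i in range(start, end):
                if data[i] > threshold:
                    local += data[i]
        partials[worker_id] = local

    threads = [threading.Thread(target=run, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)
```

Because workers pull morsels on demand instead of receiving one fixed partition each, a slow core or a skewed chunk delays only one morsel rather than the whole query.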

CedarDB @cedar_db
Stop nesting database systems just to paper over analytics pain. pg_duckdb and pg_clickhouse look tidy, but they rarely fix the real bottlenecks. We explain why, and what we had to build to get HTAP right end‑to‑end in our latest blog post: cedardb.com/blog/unnest_db…

CedarDB @cedar_db
As we prepare for AWS re:Invent, we wrote about a recent PoC in "CedarDB Tames the Slopes." The steady line on the graph might evoke images of the long, straight drive to Vegas on I-15. Enjoy our post; we look forward to seeing you in Vegas next week! cedardb.com/blog/takes_to_…

CedarDB @cedar_db
🎃 Ready for some code chills for Halloween? 👻 In “Down with template (or not)!”, we venture into the dark world of C++ templates. 😱 Prepare for some template madness! cedardb.com/blog/down_with…

CedarDB @cedar_db
What if a database could be your game engine? During parental leave @VogelLu built DOOMQL: A multiplayer DOOM-like where everything (rendering, game loop, state) runs in pure SQL on CedarDB. It's fast, ridiculous, and surprisingly elegant. Full write-up: cedardb.com/blog/doomql

CedarDB @cedar_db
Want to know what makes CedarDB special? You’re in luck! Our co-founder @PhilippFent sat down with Kaivalya Apte on The GeekNarrator Podcast to dive into the innovations and engineering behind CedarDB.
Kaivalya Apte - The Geek Narrator @thegeeknarrator

Next episode (releasing soon) is another banger on The GeekNarrator. @cedar_db with @PhilippFent. We go really deep into the architecture, data structures, analytics, query optimisation and so on. In case you missed the latest banger on unikernels, check this out: You don't need Linux, Docker, k8s? Future with Unikernels ft. NanoVMs youtu.be/IQJl6rgpibY


CedarDB @cedar_db
Ever wished your analytics could keep up with reality instead of lagging behind? We wrote about connecting #CockroachDB change data capture (CDC) with #CedarDB, and what that means for running lightning-fast analytical queries on live data. cedardb.com/blog/crdb_cdc_…

CedarDB @cedar_db
Leaving academia is always a big step, especially if you bring your research project with you into the real world. Read our latest post to learn what we did to prepare a research project for production workloads and what we learned along the way: cedardb.com/blog/research_…

CedarDB @cedar_db
Congratulations to SortMergeJoins from TU Munich - winners of the 2025 SIGMOD Programming Contest! Built by the Umbra research group (CedarDB’s roots), their system ran 12× faster than median - entirely open-source and no sort-merge-joins to be found 😉: github.com/umbra-db/conte…

CedarDB @cedar_db
Join us on an AI and vector-powered journey, as we explore key philosophical topics such as "Does pickled watermelon belong on a taco?", and how to search CedarDB docs using CedarDB's vector support. cedardb.com/blog/semantic_…

CedarDB @cedar_db
🎓 Supporting the next gen of database talent! We are proud to back @TUMuchData with new team t-shirts and fresh merch for their growing DB community at TUM. From research talks to Amsterdam confs, they’re repping "Compile and Conquer" in style!

CedarDB @cedar_db
A great question from our Community Slack sparked this demo on spatial queries aka "finding things" with geospatial data. Check it out: cedardb.com/blog/geospatia…

CedarDB @cedar_db
CedarDB Community Edition is here! Download it today: no paywall, no signup, just pure performance. Read more about CedarDB on our blog: cedardb.com/blog/launch/

CedarDB @cedar_db
Many database systems claim to be compatible with PostgreSQL. But what does that really mean? Find out in our latest blog post and learn more about what it takes to become PostgreSQL compatible. cedardb.com/blog/postgres_…

CedarDB @cedar_db
You don’t need an army of C++ devs to hand-optimize every query. We let the code write the code. Read our latest blog post to see how we mix runtime flexibility with almost magical performance! cedardb.com/blog/compilati…

CedarDB @cedar_db
B-trees may be decades old, but we still use them extensively in CedarDB. Read our latest blog post to learn how to scale B-tree operations to hundreds of cores. cedardb.com/blog/optimisti…

CedarDB @cedar_db
@MarkCallaghanDB We generate a single executable per query that only contains the functions and logic blocks it actually requires to execute the query. This way we don't have a ton of functions lying around.
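
As a rough analogy for "one executable per query that only contains what it needs", the sketch below generates Python source for a tiny filter-and-project query and compiles it with the built-in `compile` and `exec`. It is a toy under stated assumptions: CedarDB emits machine code rather than Python, and the names `generate_source` and `compile_query` and the `(column, operator, constant)` query shape are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, int]
ALLOWED_OPS = {"<", "<=", ">", ">=", "==", "!="}


def generate_source(columns: List[str], where: Optional[Tuple[str, str, int]] = None) -> str:
    """Emit source for a function specialized to exactly this query."""
    lines = ["def run_query(rows):", "    out = []", "    for row in rows:"]
    indent = "        "
    if where is not None:
        column, op, constant = where
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        # The comparison is baked into the code; queries without a WHERE
        # clause get no filter logic at all.
        lines.append(f"{indent}if row[{column!r}] {op} {constant!r}:")
        indent += "    "
    projection = ", ".join(f"row[{c!r}]" for c in columns)
    lines.append(f"{indent}out.append(({projection},))")
    lines.append("    return out")
    return "\n".join(lines)


def compile_query(
    columns: List[str], where: Optional[Tuple[str, str, int]] = None
) -> Callable[[List[Row]], List[tuple]]:
    """Compile the generated source and return the resulting function."""
    namespace: dict = {}
    exec(compile(generate_source(columns, where), "<query>", "exec"), namespace)
    return namespace["run_query"]


if __name__ == "__main__":
    rows = [{"id": 1, "score": 10}, {"id": 2, "score": 50}, {"id": 3, "score": 70}]
    print(generate_source(["id"], ("score", ">", 20)))
    print(compile_query(["id"], ("score", ">", 20))(rows))
```

The generated function carries no generic operator dispatch and no unused branches, which mirrors the point of the reply: each query ships only the logic it actually executes.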

CedarDB @cedar_db
In our return to blogging after the winter break, we follow up on our earlier claim that fewer code branches are better. Read on to find out why branches are a burden on the CPU, and what both you and the CPU can do to avoid performance penalties. cedardb.com/blog/reducing_…