Zhou Sun

36 posts

@ZhouSun

Co-founder & CEO @ Mooncake Labs. Researching open data infra and AI. Previously Lead Query & Storage @ SingleStore.

San Francisco · Joined November 2013
47 Following · 74 Followers
Zhou Sun @ZhouSun
Dreaming of a future with one single system: for app developers and agents it's their beloved database, and for data engineers it's their familiar data lake.
Reynold Xin @rxin

We disclosed today as part of our Series L that our 4-yr old data warehousing business is now >$1B revenue run rate. This is to the best of my knowledge the fastest to $1B DW product in the industry. How did we do it, and what’s next?

The conventional wisdom is that it would take 5+ years to build a new database (just to release one). Four years ago, the linked blog announced that Databricks had won the official TPC-DS 100TB benchmark with DBSQL, which was in preview back then. It had the best perf and the best price/perf, and notably beating Snowflake by 12x in price/perf in that benchmark. (Note: we are still the top place on the official TPC-DS benchmark today.)

That blog post launched a contentious "benchmarking war" with a lot of back and forth between vendors, but more importantly it marked the very beginning of our data warehousing business.

To build this business, we assembled the best engineering team and established a new infrastructure product category called Lakehouse that inherits the flexibility and openness of data lakes and performance of data warehouses. Lakehouse is now the standard for data infrastructure, and organizations are migrating from legacy data warehouses to the Lakehouse.

The result so far is a testament to the team and their execution. We have a lot of ideas on how to take performance and usability to the next level, and the team is working hard to make that happen. Expect some big announcements next year. We want to lay the foundation for growing the data warehousing product to a $10B business.

Databricks had operated largely in the “analytics” side of data in the past, and we believe the “operational” side of data (aka “OLTP”) is also ready for a “Lakehouse” style disruption. A huge chunk of the founding team’s time is now focusing on “Lakebase”, a new category of OLTP databases that separates storage (in the lake) from compute. That architecture enables features that have been virtually impossible for databases in the past: instant provisioning, elastic scaling (down to zero), branching, high throughput scan directly from Spark, …

I won’t go into too much detail about Lakebase here, but we expect a similar trend to happen in the next few years: Lakebase will transform the industry and other OLTP systems will re-architect or position towards it. The best data warehouse is a lakehouse, and the best database is a lakebase!

databricks.com/blog/2021/11/0…

Zhou Sun @ZhouSun
@iskyzh lol are you going to start a Seattle travel note series on zhihu
迟猫猫🐱 @iskyzh
✈️Pittsburgh➡️Seattle in 2 weeks. Back to the time of the acquisition in May, the company asked me to move closer to the office. It's been 3 amazing years at CMU+neon+dbx and Pittsburgh has become my second home after Shanghai. See you next time! (⬇️Chi's PIT photo collections)
[4 images]
Zhou Sun retweeted
Nikita | Scaling Postgres @nikitabase
Six months ago we joined Databricks. Today, Lakebase is powering production-grade AI products for multi-billion-dollar enterprises. The pace here is unreal, it’s only possible because the team ships world-class engineering at world-class speed.
[image]
Zhou Sun @ZhouSun
@ericzakariasson you will soon find postgres analytics performance to be the bottleneck :)
eric zakariasson @ericzakariasson
postgres MCP + mermaid/notebooks have completely replaced database clients for my ad-hoc data analysis. being able to query data from my db and visualize it directly in cursor makes everything so much easier. here’s my setup:
[image]
Eric Allam @maverickdotdev
@kiwicopple @mooncakelabs Would love to see what people actually store in a "lakehouse" type thing, like a real application example instead of just OLAP this and OLTP that
Zhou Sun @ZhouSun
@maverickdotdev @kiwicopple @mooncakelabs lol let's say we are building x.com. For posting/commenting, we start with just Postgres (users, posts...). When we start to build 'trends' and 'explore', we find that we need to scan large tables and their history. That is what brings in the data warehouse.
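A minimal sketch of the two access patterns in that reply, not anyone's actual schema. It uses Python's built-in sqlite3 as a stand-in for Postgres, and the tables, columns, and the `get_post` and `trending` helpers are all hypothetical. The point lookup is the kind of query posting and commenting need; the aggregate has to read every row, which is the kind of scan a 'trends' feature needs.

```python
import sqlite3

# sqlite3 stands in for Postgres; schema and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, handle TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        hashtag TEXT NOT NULL,
        created_day TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users (id, handle) VALUES (1, 'alice'), (2, 'bob')")
conn.executemany(
    "INSERT INTO posts (user_id, hashtag, created_day) VALUES (?, ?, ?)",
    [
        (1, "postgres", "2025-01-01"),
        (2, "postgres", "2025-01-01"),
        (1, "iceberg", "2025-01-02"),
        (2, "postgres", "2025-01-02"),
    ],
)


def get_post(post_id):
    """Transactional access: fetch a single row by primary key."""
    return conn.execute(
        "SELECT id, user_id, hashtag FROM posts WHERE id = ?", (post_id,)
    ).fetchone()


def trending(limit):
    """Analytical access: scan and aggregate the whole posts table."""
    return conn.execute(
        "SELECT hashtag, COUNT(*) AS n FROM posts "
        "GROUP BY hashtag ORDER BY n DESC, hashtag LIMIT ?",
        (limit,),
    ).fetchall()
```

With four rows both queries are instant. The difference only shows up once the posts table and its history grow large, since the lookup touches one row through the index while the aggregate still reads everything.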
Zhou Sun retweeted
mooncake @mooncakelabs
Gradually, and then suddenly.
[image]
Zhou Sun retweeted
mooncake @mooncakelabs
We're cooking pg_mooncake v0.2. The release where ‘out-of-the-box analytics’ comes true. 🥮🥮🥮
Here’s what’s coming:
1️⃣ Full Postgres Table Access Method
2️⃣ Logical Replication into Columnstore Tables
3️⃣ Smarter handling of small inserts
Full details ⬇️⬇️
[image]
Zhou Sun retweeted
mooncake @mooncakelabs
pg_mooncake 🤝 @DrizzleORM Run your analytics with fully typed Postgres SDKs you love. Stoked to partner with @_alexblokh and team. Lots more to come together. Stay tuned.
[image]
Zhou Sun retweeted
mooncake @mooncakelabs
Postgres is now top 10 fastest analytic databases 🥮 booooooom. day 136.
[image]
Zhou Sun @ZhouSun
@dataenggdude Thx. I actually read it before. My take is that at its core it is just Iceberg without a catalog, so soon more engines will be able to read/write it.
Zhou Sun @ZhouSun
I studied #Amazon S3 Tables so you don't have to. Here's what I learned: TLDR: S3 Tables allows standalone Iceberg tables without a catalog. So, IMO it's not a big data solution but rather a good ad-hoc analytics storage. mooncake.dev/blog/s3tables
Zhou Sun retweeted
mooncake @mooncakelabs
Day 131; and pg_mooncake v0.1 is out now 🥮 pg_mooncake is the easiest way for fast analytics in Postgres. Available on @neondatabase . And coming soon to @supabase.
[GIF]