Mark Lyons

1.4K posts

@mcl5tech

product @cloudera | prev product @aws @dremio @verticaunified • #data #analytics #design #tech for 🌍

Somerville, MA · Joined October 2012
4.9K Following · 898 Followers
Nikunj Kothari
Nikunj Kothari@nikunj·
Spent 18 months trying to find what's coming beyond chat, here are some emerging patterns..
Nikunj Kothari tweet media
English
36
83
1K
99.4K
CedarDB
CedarDB@cedar_db·
Have you ever wondered why existing database systems focus on either analytical or transactional performance? Learn why this is the case and how a hybrid storage engine can deliver high performance for combined workloads: cedardb.com/blog/colibri/
English
2
16
72
7K
Jack Vanlightly
Jack Vanlightly@vanlightly·
I'm working on a set of blog posts that compare the internals of Apache Iceberg, Delta Lake, Apache Hudi and Apache Paimon. No benchmarking, no judgments etc, just a comparison of internal mechanics.
English
4
3
81
5.5K
Mark Lyons retweeted
Aakash Gupta
Aakash Gupta@aakashgupta·
True:
Aakash Gupta tweet media
English
7
55
469
99.6K
Peter Kraft
Peter Kraft@petereliaskraft·
Firecracker is an incredibly cool piece of technology. Built by AWS and open-sourced, it's essentially a virtual machine monitor that tries to be as lightweight as possible, providing the minimal OS functionality most apps need to run (particularly network and file I/O) and passing through much of the implementation to the host OS. At DBOS, we use Firecracker microVMs to serverlessly host user applications. We really like them because they're fast to start up and don't require many resources, but provide the high level of isolation and security our users need. The AWS team that built Firecracker wrote a great paper about it--highly recommend checking it out if you want to learn more.
Peter Kraft tweet media
English
7
76
521
53K
Mark Lyons retweeted
Marc Brooker
Marc Brooker@MarcJBrooker·
Microsecond-accurate time is now available in EC2 US East. So many cool things this makes possible: aws.amazon.com/about-aws/what…
English
5
19
150
19.4K
Kristen Anderson
Kristen Anderson@FintechKristen·
Public service announcement: two children in daycare at @BrightHorizons in Cambridge, MA costs $95,400/year. This is after-tax money (i.e., about $130k in income would be needed to afford this). Shame on this country. Cc @reshmasaujani
English
108
78
700
303.3K
Mark Lyons
Mark Lyons@mcl5tech·
@JoshuaSteinman I’ve been working on measuring credibility & expertise via Proof of Research (a proof-of-work concept). Any interest in discussing?
English
0
0
0
32
joshua steinman (🇺🇸,🇺🇸)
Request for Startup: Batting average for public personae and organizations, preferably open and auditable. Perhaps an open database linking individuals to predictions, and enabling a sort of “Rotten Tomatoes” style rating for accuracy of both predictions AND overall accuracy.
English
12
3
73
9.8K
Mark Lyons
Mark Lyons@mcl5tech·
Anyone looking for a new SA opportunity, DM me and I can intro you to Roger Frey! (Great team & Roger is fantastic!!) lnkd.in/edKZsu-b
English
1
1
7
145
Mark Lyons
Mark Lyons@mcl5tech·
@mim_djo @teej_m They def are compressing and encoding the data, and doing as much query execution as possible without materializing it.
English
0
0
2
49
Mim
Mim@mim_djo·
@teej_m something bothers me and I can't explain it; the only rational explanation for Snowflake's performance is that they may be operating directly on compressed data or some shit like this.
English
3
0
3
865
Mark Lyons
Mark Lyons@mcl5tech·
@thetinot @mim_djo I believe for a fair comparison you need to generate a net-new TPC-H or TPC-DS data set, per my comment the other day. There’s so much possibly fishy business with the data set already generated by Snowflake.
English
1
0
1
211
Tino Tereshko 🇺🇦
Tino Tereshko 🇺🇦@thetinot·
@mim_djo You're also using their dataset, which is optimized to heck and potentially extra cached. Not a fair analysis methinks
English
2
0
2
387
Mim
Mim@mim_djo·
1/ Querying 40 GB of data from #duckdb: first try, reading directly from cloud storage. The throughput is so slow it hurts; after 10 minutes, I get an OOM on Query 18
English
2
4
38
26.8K
Mark Lyons
Mark Lyons@mcl5tech·
@mim_djo Was it a newly generated data set or one they had already created?
English
1
0
1
350
Mim
Mim@mim_djo·
You think you have a basic understanding of OLAP databases, then you run TPCH-SF100 (that's 600M rows) on #Snowflakedb using the smallest size. This is just wild!!! 102 seconds, I have no idea what they are doing!!!
Mim tweet media
English
7
2
37
13.2K
Mark Lyons retweeted
Mim
Mim@mim_djo·
TPCH-SF30; 180 million rows
#AZURE D16DS_V5; 16 cores, 64 GB RAM
#Databricks Photon: 41 seconds
#DuckDB: 43 seconds
Query Parquet files from the VM SSD, no Azure storage involved
Databricks software cost (not hardware): $4.4/hour
github.com/djouallah/Test…
Mim tweet media
English
4
5
43
8.3K
David Maier
David Maier@DavidAMaier·
Wowowo, @neondatabase.. You are telling me you've built a database that allows me to just branch off my production data at any time in the past and use it for testing/debugging/development? That's way too cool.
English
3
2
18
0
Mark Lyons retweeted
Dipankar Mazumdar
Dipankar Mazumdar@Dipankartnt·
Join @dremio’s Tech advocacy & Eng team for the very first installment of the @ApacheIceberg Office Hours 📆 🚀 We will kick off with a brief presentation on Copy-on-Write vs. Merge-on-Read strategies, followed by Q&A on anything Iceberg related. When: December 7th, 12 PM
Dipankar Mazumdar tweet media
Toronto, Ontario 🇨🇦 English
2
4
15
0