Michael Armbrust

272 posts

@michaelarmbrust

Lead developer of Spark SQL @databricks, formerly @ucberkeley. Distributed databases, query languages, scala, other nerdy stuff...

Joined January 2012
0 Following, 6.2K Followers
Jorge Ortiz
Jorge Ortiz@JorgeO·
Anyone have plumber recommendations for SF or the Bay Area? My gas water heater needed 3 repairs in the last month, another 3 in the 12mo prior, and it’s *still* having issues. I want to replace it. Also wondering if I should take the opportunity to electrify. Ty!
4
0
5
4.8K
Michael Armbrust
Michael Armbrust@michaelarmbrust·
@jaceklaskowski I think we are just missing some context information as we start the extra threads we use to improve cluster utilization. We will fix it.
0
0
1
242
Jacek Laskowski
Jacek Laskowski@jaceklaskowski·
Found a way to dig deeper into #DeltaLiveTables 😎 And found the history of a published table all with nulls in userId and userName yet it's known who ran the pipeline?! 🤔 Should be easy to add it or I'm missing a bigger picture /cc @michaelarmbrust #Databricks
1
0
1
1.1K
Michael Armbrust
Michael Armbrust@michaelarmbrust·
@jaceklaskowski @Falydoor Yeah, that is confusing. We should fix the icon. Longer term we are working on the ability for you to attach the notebook to your DLT pipeline cluster, so you can get interactive feedback on syntax or analysis errors.
1
0
3
185
Jacek Laskowski
Jacek Laskowski@jaceklaskowski·
Just came across this button "Delta Live Tables" in a #Databricks notebook Reminds me days with #IBM #WebSphere dev tools where writing code wasn't as cool as I always wanted 😏 (I'm not losing hope that there're some places for some code writing for DLTs, e.g. in #Python)
2
1
3
1.2K
Michael Armbrust retweeted
Matei Zaharia
Matei Zaharia@matei_zaharia·
Insightful benchmark of Linux Foundation Delta Lake and Apache Iceberg by @BrooklynData that shows Delta is up to 8x faster in workloads with data updates. Most storage benchmarks only test reads, but with updates, care is needed to maintain performance. brooklyndata.co/blog/benchmark…
3
49
108
0
Adriana Porter Felt
Adriana Porter Felt@__apf__·
fifteen years ago my husband took me on a first date, one thing led to another and now he's building a large tree house
9
2
325
0
Michael Armbrust
Michael Armbrust@michaelarmbrust·
@jaceklaskowski @bigklata That is not exclusively true. Partitions can also help, for example with a SELECT DISTINCT <partition_column> and other aggregations, however, you are generally right that if the query is SELECT * you have to read the whole table definitionally.
1
0
1
0
Jacek Laskowski
Jacek Laskowski@jaceklaskowski·
Kept hearing about data partitioning as a way to make many reads from a #DeltaLake table faster *somehow* but...that's what I "discovered" today...without WHERE clause (that #SparkSQL can push down) the whole table is scanned anyway (with or without partitions).
6
1
12
0
Michael Armbrust retweeted
Matei Zaharia
Matei Zaharia@matei_zaharia·
Databricks just set a new record on the official TPC-DS data warehousing benchmark, showing that a lakehouse system based on open data formats can outperform previous DW systems. Don't listen to folks who say open means bad performance! databricks.com/blog/2021/11/0…
4
65
277
0
Adriana Porter Felt
Adriana Porter Felt@__apf__·
I'm seriously considering equipping people with deli meats and then "coincidentally" walking by them on the sidewalk
12
0
25
0
Adriana Porter Felt
Adriana Porter Felt@__apf__·
my pandemic pup is afraid when strangers get closer than 6ft, since that's been the rule his whole life 🤦🏻‍♀️ he warms up after a few min, but it isn't ideal. how do I fix him?
9
0
45
0
Michael Armbrust retweeted
Ali Ghodsi
Ali Ghodsi@alighodsi·
@technology @emilychangtv Why do you have to pick between diversity and merit? Make your work env. inclusive, decrease bias in hiring, as a leader don't make statements that alienate large groups. That'll give you a competitive advantage to a talent pool that won't join unwelcoming companies.
6
34
173
0
Michael Armbrust
Michael Armbrust@michaelarmbrust·
Really excited to announce Delta Live Tables, a system that turns your SQL queries and pyspark DataFrame code into production-ready ETL pipelines! We've been working on this for almost 3 years, and it powers some of our largest pipelines internally. youtube.com/watch?v=fJhlTs…
3
11
55
0
Michael Armbrust retweeted
Delta Lake
Delta Lake@DeltaLakeOSS·
Did you know you can now use Delta Lake straight from #Python? Just a quick: pip install deltalake pandas
4
23
100
0