Mirco Dotta

815 posts

@mircodotta

@triple_quote co-founder - https://t.co/3qqgS9Mv65 Rock your #Scala compile time 🤘

Switzerland · Joined August 2008
139 Following · 1.2K Followers
Mirco Dotta retweeted
Stanislav Kozlovski @kozlovski
The way Datadog calculates percentiles at scale is very innovative 🔥

Usually, calculating the percentiles of large datasets is very expensive. To know the 99th percentile of a stream of values, you need to:
- keep all the values
- sort them
- return the value whose rank matches the percentile (e.g. the 99th item out of 100)

Datadog cannot afford to do this with the many millions of data points that come in every second - the space and CPU requirements are not practical for a company with thousands of customers. 🐾

Naturally, they opted for sketch algorithms - those should provide a good-enough probabilistic result while being vastly more efficient to compute. Unfortunately, they couldn't get satisfactory results. The algorithms produced results that were too inaccurate. ❌

Why? Many percentile sketches have guarantees in terms of *rank error*. A rank-error guarantee of 2% means that the p95 value returned by the sketch is somewhere between the p93 and p97 values. But system latencies exhibit very fat tails - the difference between the p97 and p99 values can be 2-10x!

So what did the dogs do? 🐶 They invented a new sketch algorithm - DDSketch. Instead of rank-error guarantees, they designed it for *relative error* guarantees. If the p99 is 60s, a 2% error means the sketch would return 58.8-61.2s.

The algorithm is surprisingly simple:
• They create buckets, each covering a range of values as wide as the desired error rate (+-2% in this case). 🪣
• Each bucket keeps a counter of the number of data points within that range. 💯
• When processing an item (a latency data point), increment the counter of the appropriate bucket. ➕
• To compute the desired percentile, you sum up the buckets' counters until you reach the desired rank. Whatever bucket that rank falls in - that's your value. 🏆

In this example, the 50th percentile is 1033ms (the 4th value out of our total of 8). Going by count, the 4th value is in the second bucket (b-1), and the algorithm would produce a result of 1021-1061ms.

To cover the range from 1 millisecond to 1 minute, you only need 275 buckets. With 64-bit counters, that's just ~2kB of memory, regardless of the amount of input data. This is why we call sketch algorithms sublinear in space growth - memory requirements do NOT grow linearly with input.

The exponential nature of the bucket distribution makes it cheap to cover an even wider range: 1 nanosecond to 1 day takes only about 3x as many buckets - 802 buckets at ~6kB.

As you can probably tell, this is pretty easy to parallelize. You can divide this bucket-building exercise into many parallel lightweight substreams, and then merge the results freely. 🕊 The merge operation is a simple sum of the buckets' counters, which ensures that the accuracy is kept in the same range.

It is a very scalable and performant sketch algorithm. Kudos to Datadog for inventing it. Good boy! 🫳🐕‍🦺
Stanislav Kozlovski tweet media
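Editor's sketch: the bucket scheme described in the tweet above can be tried out in a few lines of Python. This is a toy illustration, not Datadog's implementation. The class name `RelativeErrorSketch`, the helper `bucket_span`, the choice of gamma = (1 + alpha) / (1 - alpha) as the bucket growth factor, and the midpoint estimate are all assumptions made for this example, and only positive values are handled.

```python
import math
from collections import Counter


class RelativeErrorSketch:
    """Toy log-bucketed quantile sketch for positive values (illustrative only)."""

    def __init__(self, alpha=0.02):
        self.alpha = alpha
        # Assumed growth factor: bucket i covers (gamma**(i-1), gamma**i].
        self.gamma = (1 + alpha) / (1 - alpha)
        self.log_gamma = math.log(self.gamma)
        self.buckets = Counter()  # bucket index -> number of data points
        self.count = 0

    def add(self, value):
        if value <= 0:
            raise ValueError("this toy sketch only handles positive values")
        index = math.ceil(math.log(value) / self.log_gamma)
        self.buckets[index] += 1
        self.count += 1

    def quantile(self, q):
        if self.count == 0:
            raise ValueError("empty sketch")
        rank = q * (self.count - 1)
        seen = 0
        # Walk the buckets in value order until the running count passes the rank.
        for index in sorted(self.buckets):
            seen += self.buckets[index]
            if seen > rank:
                # Estimate chosen so it is within alpha (relative) of any value in the bucket.
                return 2 * self.gamma ** index / (self.gamma + 1)

    def merge(self, other):
        if other.alpha != self.alpha:
            raise ValueError("sketches must share the same alpha")
        self.buckets.update(other.buckets)  # Counter.update adds the counts
        self.count += other.count


def bucket_span(lo, hi, alpha=0.02):
    """Approximate number of buckets needed to cover [lo, hi] at the given alpha."""
    gamma = (1 + alpha) / (1 - alpha)
    return math.log(hi / lo) / math.log(gamma)
```

Under these assumptions, `bucket_span(1, 60_000)` (1 ms to 1 minute, in milliseconds) and `bucket_span(1, 86_400 * 10**9)` (1 ns to 1 day, in nanoseconds) come out close to the 275 and 802 buckets quoted in the tweet. Because `merge` only adds counters, sketches built on separate substreams combine into the same buckets a single sketch would have produced.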
Mirco Dotta retweeted
Xebia @Xebia_Global
Did you know that @gradle offers much more than a build tool? We caught up with @mircodotta at the Scala Tooling Summit where he explained that Gradle Enterprise is like a Swiss Army knife for developers' productivity. #Scala @scala_lang
Mirco Dotta retweeted
Kit Langton @kitlangton
Scala 3 is so damn good.
Mirco Dotta retweeted
Martin Odersky @odersky
For everyone who has asked about the Safer Exceptions paper, here's a link that should work: infoscience.epfl.ch/record/290885?… It seems there is a way to get open access to ACM papers but you have to go through some other link first.
Mirco Dotta @mircodotta
@mariofusco Have you heard of checked exceptions in Scala 3? >> contributors.scala-lang.org/t/pre-sip-chec… I haven’t been following, so I don’t know whether the idea has been pushed forward or not. But I guess we would all love to listen to a talk on the subject. For sure it will be polarizing! 😅
Mario Fusco @mariofusco
This is a Tyrrell P34. No other F1 manufacturer produced a 6-wheeled car before and AFTER it. And indeed it was a terrible idea. Java has checked exceptions. No other programming language introduced this feature before and AFTER it.
Mario Fusco tweet media
Cedric Beust @cbeust
@mariofusco @maxandersen That's argumentum ad populum, though. We want compiler-enforced error path checks; checked exceptions are just one way of supporting that. Rust's approach is another. Runtime exceptions just give you clean-looking code that's actually crashy.
Mirco Dotta @mircodotta
@afaranwide @Livex EP demand for this chateau is far exceeding availability for the retailers I am a client of. Based on the few data points I have, the free market is proving them right.
Afaranwide @afaranwide
@Livex How can they justify such a whopping increase on last year, which was rated higher by Neal Martin? And the '19 is available for less than the release price (£746/£813, Liv-ex data). Can't see why anyone would buy it. #Bordeaux2020
Liv-ex @Livex
#Bdx20 release: Smith Haut Lafitte 2020 - €96 p/b ex-negociant, up 48% on 2019 (€64.80). Analysis coming soon. Compare previous Smith Haut Lafitte release prices while you wait: bit.ly/bdx20prices