Emanuele Sabellico

116 posts

@emasab

Joined June 2009
149 Following · 53 Followers
Emanuele Sabellico reposted
Stanislav Kozlovski @kozlovski
Everyone is using Kafka.

But almost no one is using its new Infinite Storage feature. ✨

KIP-405 is introducing the ability to store Kafka data in S3, and in any other external store for that matter.

It's incredibly needed, because storage is Kafka's biggest flaw right now. ❌

Kafka was not originally developed with elasticity in mind. Its key limitation is that it co-locates the data with the broker, resulting in many problems:

1. Competition for disk IOs ⚔️

Historical consumers can decrease write performance by up to 43% (as shown in the tests below) because they force Kafka to read from the disk (as opposed to its page cache), and that causes extra disk strain.

This is especially bad for HDDs, which have notoriously improved exponentially in all aspects BUT their IOPS capacity. 👎

They have been stuck at roughly 120 IOPS for the last two decades, so you can't allow those precious IOPS to be used up... without expecting catastrophic latencies, that is.

2. Speaking of HDDs - Kafka is practically REQUIRED to use them due to their cost-effectiveness. 💾

But they can give you large tail latencies.

With tiered storage, you can afford very fast, small local SSD storage. ⚡️

This also means you can provision less memory, because you don't need to be as reliant on page cache for serving historical reads. 💡

Previously, you'd rely on it to reduce disk hits in historical read scenarios, which result in prolonged periods of under-replicated partitions and in extra cluster network bandwidth being used up.

3. Disastrous broker disk failure scenario - in such cases, the broker has to re-fetch everything it had on disk (TBs).

This sudden extra historical read traffic at max allowed throughput severely impacts latencies.

Such broker startup scenarios can be 12,000% slower and can worsen produce latencies by up to 900%, as shown by the tests below. 😱

4. Slow broker failure recovery - if a broker simply restarts for whatever reason (e.g. the host VM died), it has to catch up with a lot of data, proportional to the time it was dead.

This again exhausts precious IOPS, uses extra bandwidth and generally makes under-replicated partitions take longer to resolve. 🐌

5. Reassigning partitions - partitions that have a lot of data are extremely slow to move.

At a decent 100MB/s replication rate, moving 10TB of data takes a whopping 27.8 hours! That's more than one day!

In the intermediate state, this means you're replicating 2x as much data for 28 hours. 🤦🏻‍♂️

This means that any action like expanding or shrinking a cluster will realistically take you days to finish, while also consuming a TON of extra replication bandwidth. It's a real disaster.

And finally - you can never effectively rebalance partitions in a reactive way to resolve any problem fast (since fast isn't measured in hours). 👎

6. Impractical to scale storage 😮‍💨

If you decide to increase your retention settings across the cluster for whatever reason (e.g. GDPR), you need to either:

• scale horizontally: add new brokers (and waste unnecessary CPU/memory resources)
• scale vertically: do some complex and fragile disk swaps on the existing brokers

7. Your cluster setup ends up impacted by disk requirements 😠

You end up with a significantly larger cluster than you would need if disk size wasn't a concern, i.e. you're buying extra CPU/memory you don't need. 👎

This is because HDDs have limited size, so there are cases where you may need extra machines just so you can place more HDDs in them.

Not to mention the extra maintenance burden of supporting more nodes.

8. High cloud cost ☁️ 💸

In the cloud, it's more expensive to provision larger disk volumes that are attached to the instance.

9. Max storage limitation per partition 🛑

The amount of data you can store on a single partition is capped by the size of the physical disk on the broker.

While admittedly a very niche use case, why couldn't you have a single partition that consists of terabytes of historical data?

Those are a lot of problems...

How do we solve all of this? 😥

Simple. Put the data in S3 ✨

That is what Tiered Storage is - it extends Kafka's storage beyond the local one by retaining the data in a pluggable external store (HDFS, S3, etc).

Pluggable is a key word here, as it will enable the open-source community to develop different implementations for different external stores in parallel. This can be implemented via the RemoteStorageManager interface.

Kafka will end up having TWO tiers of storage placement:

1. a local one (hot) 🥵
2. a remote one (cold) 🥶

You will be able to enable this individually per topic, with varying local and remote retention settings.

This will be done transparently to any clients - they won't be able to tell when they're fetching from the remote store, as the Kafka API remains the same and simply abstracts it away.

😡 Won't this kill latency?

In theory, one should expect slower reads from the remote log store.

But. This isn't a problem in practice, as historical workloads are usually not performance sensitive. 💡

The latency-sensitive workloads usually read from the tail of the log (latest data), and are therefore not impacted by this feature.

Performance tests were done nevertheless! (using HDFS as the external system)

They focused exclusively on write latency and the impacts there:

• The largest produce latency increase in the tests was 21ms → 25ms of p99 produce latency in the steady state.

With different scenarios came different results.

Get this - when there are historical reads (out-of-sync consumers), the produce latency actually improved! 🔥

This is because without tiered storage, consumers reading old data compete for IOs on the disk (normal consumers don't, since their data is served from page cache).

This reduces the IOs that writes can get, and write disk latency increases. ❌

The tests showed 42ms of p99 produce latency with tiered storage and 60ms without it in this historical consumer scenario - a 30% latency decrease.

And the heavy-hitter final case - rebuilding a broker with an empty disk.

For just 12TB of data, recovery took almost 4 hours in their test without tiered storage, and only 2 minutes with tiered storage (a 120x improvement).

During this broker recovery, the p99 produce latency was 490ms without tiered storage and 56ms with it - a 9x improvement!

Most importantly?

This completely flips the script and enables Kafka to be used as a true long-term store.

This is in strong opposition to its most widely used use case today, which is durable but ultimately ephemeral storage - akin more to a pipe than anything else. 🪈

Lakehouses are all the buzz today, but has anybody pondered what a Streaming Lakehouse might look like? 🤔

--

Let's be frank... is there any place with better Kafka content on the internet right now? If you like this and want to see more, help me out in two ways:

1. Repost so your network learns too! 💥
2. Follow me here for more high-quality Kafka content - ✅ @kozlovski

They take 10 seconds to do - writing this takes me 10+ hours. ✌️

#Kafka #ApacheKafka
[Attached: 4 images]
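The figures quoted in the thread above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the helper names are made up for illustration, and it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a perfectly sustained replication rate, neither of which the thread states.

```python
# Back-of-the-envelope checks for the numbers quoted in the thread.
# Assumptions (not from the thread): decimal units and a constant rate.
TB = 10**12
MB = 10**6


def transfer_hours(data_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to copy data_bytes at a sustained rate."""
    return data_bytes / rate_bytes_per_s / 3600


def percent_decrease(before: float, after: float) -> float:
    """Relative drop from before to after, in percent."""
    return (before - after) * 100 / before


def speedup(before: float, after: float) -> float:
    """How many times smaller after is than before."""
    return before / after


if __name__ == "__main__":
    # Moving a 10 TB partition at 100 MB/s.
    print(f"reassignment: {transfer_hours(10 * TB, 100 * MB):.1f} hours")
    # p99 produce latency with historical consumers: 60 ms vs 42 ms.
    print(f"latency decrease: {percent_decrease(60, 42):.0f}%")
    # Empty-disk broker rebuild: about 4 hours vs 2 minutes.
    print(f"recovery speedup: {speedup(4 * 60, 2):.0f}x")
    # p99 produce latency during recovery: 490 ms vs 56 ms.
    print(f"recovery latency ratio: {speedup(490, 56):.2f}x")
```

The reassignment time scales linearly with partition size, which is why the thread argues that keeping only a small hot set on local disk makes rebalancing practical.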
Emanuele Sabellico @emasab
@ArminSh80 @pranavrth @confluentinc KIP 714 metrics aren't exposed to the user but are sent to the broker. librdkafka and the .NET client provide stats_cb (or StatsDelegate), which is agnostic to the telemetry standard and exposes a set of counters, gauges and rolling windows that can be used to populate them.
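As a rough illustration of that approach, librdkafka delivers its statistics to the callback as a JSON string, which can then be mapped onto whatever metrics library is in use. The sketch below assumes a handful of field names from librdkafka's statistics format (`txmsgs`, `rxmsgs`, `msg_cnt`, and the per-broker `rtt` window with `p99`); the sample payload is hand-written for illustration and is not real client output, and `summarize_stats` is a hypothetical helper.

```python
import json

# Hand-written, heavily trimmed sample payload (illustrative only).
SAMPLE_STATS = json.dumps({
    "name": "rdkafka#producer-1",
    "txmsgs": 1200,      # counter: messages transmitted
    "rxmsgs": 0,         # counter: messages received
    "msg_cnt": 15,       # gauge: messages currently queued
    "brokers": {
        "broker-1:9092/1": {"rtt": {"min": 800, "avg": 1500, "p99": 4200}},
        "broker-2:9092/2": {"rtt": {"min": 900, "avg": 1700, "p99": 5100}},
    },
})


def summarize_stats(stats_json: str) -> dict:
    """Split one statistics payload into counters, gauges and windows."""
    stats = json.loads(stats_json)
    summary = {
        "client": stats.get("name"),
        "counters": {key: stats.get(key, 0) for key in ("txmsgs", "rxmsgs")},
        "gauges": {"msg_cnt": stats.get("msg_cnt", 0)},
        "broker_rtt_p99_us": {},  # rolling-window p99 round trip, microseconds
    }
    for broker_name, broker in stats.get("brokers", {}).items():
        rtt = broker.get("rtt", {})
        if "p99" in rtt:
            summary["broker_rtt_p99_us"][broker_name] = rtt["p99"]
    return summary


if __name__ == "__main__":
    # In a real client this function would be called from the statistics
    # callback with the JSON string the client hands over.
    print(summarize_stats(SAMPLE_STATS))
```

From here each entry could be pushed into an application-side meter or gauge. The callback only fires when a statistics interval is configured on the client.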
Armin @ArminSh80
@emasab @pranavrth @confluentinc How can we see the Meter class specified for the client in the C# client? Is it available right now, or should we consume it manually?
Emanuele Sabellico @emasab
Thanks to @edenhillm and Apoorv Mittal too for starting and finishing the KIP 714 design for client-side telemetry.
Emanuele Sabellico @emasab
A big thanks to @rayokota and the whole Governance & Metadata team for the contributions to the Schema Registry Clients!
Emanuele Sabellico @emasab
@confluentinc #ApacheKafka Clients v2.4.0 are out! This release contains the Early Access of the new consumer group rebalance protocol defined in KIP 848, which removes the need for client-side partition assignment and delegates that responsibility to the broker-side group coordinator.
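A minimal sketch of what opting in could look like on the client side, assuming the librdkafka-style property `group.protocol=consumer` selects the new protocol and that the classic client-side settings listed below are then handled on the broker side. `to_kip848_config` is a hypothetical helper, and the exact property list should be checked against the release notes of the client in use.

```python
# Assumption: these classic-protocol properties are not used with the
# new protocol, because assignment and timeouts move to the broker side.
CLASSIC_ONLY_PROPERTIES = {
    "partition.assignment.strategy",
    "session.timeout.ms",
    "heartbeat.interval.ms",
}


def to_kip848_config(classic_config: dict) -> dict:
    """Return a copy of a consumer config switched to the new protocol."""
    config = {
        key: value
        for key, value in classic_config.items()
        if key not in CLASSIC_ONLY_PROPERTIES
    }
    config["group.protocol"] = "consumer"  # assumed opt-in switch
    return config


if __name__ == "__main__":
    classic = {
        "bootstrap.servers": "localhost:9092",  # placeholder address
        "group.id": "example-group",            # placeholder group name
        "partition.assignment.strategy": "cooperative-sticky",
        "session.timeout.ms": 45000,
    }
    print(to_kip848_config(classic))
```

The resulting dictionary is the kind of object that would be handed to a consumer constructor. The brokers also need the new protocol enabled for the group to use it.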
Emanuele Sabellico reposted
Miguel de Icaza ᯅ🍉 @migueldeicaza
The real Chinese hack was not TikTok, but hijacking projects maintained by unpaid volunteers. The US supply chain resilience funds should be invested in a Job Guarantee like system to safeguard this critical infrastructure. We barely dodged this bullet.
Quoting Rob Mensching @robmen:
Lots of analysis of the xz/liblzma vulnerability. Most skip over the first step of the attack: 0. The original maintainer burns out, and only the attacker offers to help (so the attacker inherits the trust of the project built by the maintainer). Read their words👇🏻 1/
Emanuele Sabellico reposted
Stanislav Kozlovski @kozlovski
Apache Kafka 3.7.0 was just released! 🔥 What comes with this new release? Here are the top features you should know about: (2-minute read) 🧵
Emanuele Sabellico @emasab
This is a continuation of the huge work of the open source community, especially @Blizzard_Ent with node-rdkafka, Túlio Ornelas and Tommy Brunn with the KafkaJS interface, and of course @edenhillm and @matt_howlett with librdkafka.
Emanuele Sabellico @emasab
Congratulations to the Confluent Clients Team, especially to Milind Luthra, Nusair Haq and Chase Thomas, who followed this project most closely.
Emanuele Sabellico @emasab
Finally I can share the early availability of the JS client we're working on. Externally it has the simplicity of the KafkaJS interface, and internally the power and compatibility of librdkafka, as it's a continuation of the node-rdkafka project.