Nick Snyder
@nickdsnyder

2.2K posts

VP Engineering at @bufbuild

Joined December 2008
103 Following · 404 Followers
Nick Snyder retweeted
Buf @bufbuild
🎉 Protovalidate hits v1.0! It’s a big milestone for the Protobuf semantic validation library trusted by Microsoft, GitLab, Nike and thousands more to: ✅ Define validation rules only once ✅ Enforce everywhere (Go, Java, Python, C++, TypeScript) Details: buf.build/blog/protovali…
1 reply · 6 retweets · 31 likes · 3.6K views
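For illustration, a minimal sketch of what "define validation rules only once" looks like with Protovalidate: the rules live as options on the schema itself, and each supported runtime enforces them. The message and field names here are hypothetical; the `buf.validate` option syntax follows the Protovalidate documentation.

```protobuf
syntax = "proto3";

package example.v1;

import "buf/validate/validate.proto";

message CreateUserRequest {
  // Rule defined once, in the schema...
  string email = 1 [(buf.validate.field).string.email = true];
  // ...and enforced by every runtime (Go, Java, Python, C++, TypeScript)
  // that loads it.
  string name = 2 [(buf.validate.field).string = {min_len: 1, max_len: 50}];
}
```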
Nick Snyder retweeted
Buf @bufbuild
LinkedIn gave us Kafka, but left 3 issues: Schema evolution, plug-and-play integration & governance. We solved all 3: ✅ Buf CLI breaking change detection ✅ Bufstream's broker-side schema awareness ✅ BSR team-wide governance Let’s finish what Kafka started 🚀 buf.build/blog/finishing…
1 reply · 2 retweets · 11 likes · 609 views
Nick Snyder retweeted
Buf @bufbuild
Announcing hyperpb, a fully-dynamic Protobuf parser that achieves 3x performance over generated Go code! Our hyperspeed dynamic parsing turns once-unscalable products into ordinary, even essential components of your stack. Check out our benchmarks! ⚡ buf.build/blog/hyperpb
0 replies · 13 retweets · 58 likes · 3.2K views
Nick Snyder @nickdsnyder
@prchdk heard you fix stuff :) PR headers in Safari on iPhone have insufficient vertical padding
[media]
0 replies · 0 retweets · 0 likes · 11 views
Nick Snyder retweeted
Buf @bufbuild
TL;DR: The first 15 field numbers are special: most runtimes will decode them much faster than the other field numbers. When designing a message type for decoding performance, it’s good to use these field numbers on fields that are almost always present. buf.build/blog/totw-9-so…
0 replies · 1 retweet · 7 likes · 437 views
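The reason the first 15 field numbers decode faster: a Protobuf field tag is the varint `(field_number << 3) | wire_type`, which fits in a single byte only for field numbers 1–15 (fields 16–2047 need two tag bytes). The arithmetic can be checked with a short sketch (plain Python, not Buf's code):

```python
def tag_size(field_number: int, wire_type: int = 0) -> int:
    """Size in bytes of the varint-encoded Protobuf field tag."""
    tag = (field_number << 3) | wire_type
    size = 1
    while tag >= 0x80:  # varints carry 7 payload bits per byte
        tag >>= 7
        size += 1
    return size

print(tag_size(15))  # 1 -- tag value 120 fits in a single byte
print(tag_size(16))  # 2 -- tag value 128 spills into a second byte
```

This is why hot, almost-always-present fields belong in the 1–15 range: every occurrence pays the tag cost.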
Nick Snyder retweeted
Buf @bufbuild
TL;DR Don’t use required, no matter how tempting. You won’t be able to get rid of it later when you realize it was a bad idea. buf.build/blog/totw-8-ne…
0 replies · 1 retweet · 10 likes · 568 views
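`required` is a proto2-era keyword, and the trap the tip describes is that conforming parsers reject any message missing a required field, so the constraint can never be relaxed without breaking existing readers. A hypothetical illustration:

```protobuf
syntax = "proto2";

message Order {
  // Tempting, but permanent: parsers reject any Order without this
  // field, so it can never be removed or relaxed to optional safely.
  required string id = 1;

  // Prefer optional fields plus application-level checks instead.
  optional string note = 2;
}
```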
Nick Snyder retweeted
Buf @bufbuild
Don’t miss our interactive workshop on bringing schema-driven governance to Kafka. Our engineering and product teams will deep dive into how it’s done. Plus, we’ll answer your questions throughout. 📅 Thursday, May 29 🕘 9 AM PDT / 12 PM EDT / 5 PM BST buf.build/events/worksho…
0 replies · 1 retweet · 7 likes · 348 views
Nick Snyder retweeted
Stanislav Kozlovski @kozlovski
Is it time to say that Protobuf won the schema wars?
[media]
9 replies · 29 retweets · 206 likes · 14.5K views
Nick Snyder retweeted
Buf @bufbuild
TL;DR: Protobuf’s distributed nature introduces evolution risks that make it hard to fix some types of mistakes. Sometimes the best thing to do is to just let it be. buf.build/blog/totw-4-ac…
0 replies · 1 retweet · 4 likes · 564 views
Nick Snyder retweeted
Buf @bufbuild
TL;DR: Compression is everywhere: CDNs, HTTP servers, even in RPC frameworks like Connect. This pervasiveness means that wire size tradeoffs matter less than they used to twenty years ago, when Protobuf was designed. buf.build/blog/totw-2-co…
0 replies · 1 retweet · 6 likes · 452 views
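The point about ubiquitous compression can be illustrated with a toy measurement (plain Python, nothing Buf-specific): a verbose encoding full of repeated field names shrinks dramatically once gzip — which CDNs and HTTP servers apply routinely — gets involved, which is why raw wire size matters less than it did twenty years ago.

```python
import gzip

# Hypothetical verbose payload: 1,000 records with repeated field names,
# the kind of wire-size overhead compact binary encodings were designed
# to avoid.
records = b"".join(
    b'{"user_id": %d, "status": "active"}' % i for i in range(1000)
)

compressed = gzip.compress(records)
print(f"raw: {len(records)} bytes, gzipped: {len(compressed)} bytes")
# The repeated field names all but vanish under compression.
```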
Nick Snyder @nickdsnyder
We're starting a weekly series of tips on the Buf blog to highlight things everyone should know about Protobuf. Our first post is live today! Tip #1: Field names are forever
[media]
0 replies · 0 retweets · 1 like · 63 views
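The tip's point, sketched on a hypothetical message: the binary wire format identifies fields by number only, but field names leak into the JSON mapping, text format, and reflection, so a later rename breaks those consumers.

```protobuf
syntax = "proto3";

message User {
  // On the wire this field is just number 1, but its name is baked into
  // the JSON mapping ("userName"), text format, and reflection.
  // Renaming it later silently breaks every consumer relying on those --
  // hence: field names are forever.
  string user_name = 1;
}
```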
Nick Snyder @nickdsnyder
@sqs I was wondering when Microsoft was going to pull this lever.
0 replies · 0 retweets · 8 likes · 1.3K views
Quinn Slack @sqs
Microsoft just started enforcing their longtime license restriction on VS Code forks using more MS-published language extensions. Play this out a few more steps…
[media]
48 replies · 59 retweets · 958 likes · 157K views
Ben Stiller @BenStiller
So some fans are asking for Season 3 of Severance. What do you say, @tim_cook?
1.4K replies · 1.4K retweets · 48.4K likes · 13.8M views
Nick Snyder retweeted
Buf @bufbuild
Bufstream: a multi-region, active-active Apache Kafka-compatible cluster tested at 100 GiB/s writes and 300 GiB/s reads.
• Instant failover.
• Scales all the way from 0 to 100 GiB/s with no operational toil.
• 1/3 the cost of an Apache Kafka stretch cluster (which won't even work at these loads).
A setup quite literally available nowhere else. No other product in the market even comes close.
[media]
1 reply · 3 retweets · 16 likes · 971 views
Nick Snyder retweeted
Stanislav Kozlovski @kozlovski
Running a 100 GiB/s multi-region Kafka workload would cost you around $100M a year 👀 Except no single cluster would be able to sustain that.

Bufstream just did it in ONE cluster, at less than ONE-THIRD the price. How? Surprisingly easy! It all comes down to their simple and extensible architecture. Turns out that when you integrate natively with the exabyte-scale hyperscaler cloud systems, you get both their scale and their low prices! 👇

• 3.5x cheaper than the self-hosted Kafka alternative (which is literally inoperable) 😪
• Google-sponsored SLA for RPO at 15m via multi-region GCS with Turbo Replication enabled 🔥
• Google-sponsored SLA for RTO=0 via multi-region Spanner with its p99.999% multi-region uptime SLA 👌
• Sub-second p99 producer-to-consumer (end-to-end) latency ⚡️

It’s honestly incredible to see how quickly Bufstream is rising through the ranks with their announcements. Their main competitor, WarpStream, seems to have stagnated a bit post-acquisition.

I’ve covered how these stateless Kafkas work in detail before, but let me recap quickly. 💡

It’s a novel architecture where brokers treat object storage as a single disk shared across the cluster. This lets brokers write directly to the object store, abandoning the leader-per-partition architecture. To serialize the ordering of writes per partition, a centralized metadata service is used. (WarpStream built a proprietary control plane; Bufstream is pluggable and supports a few services across all 3 clouds.)

What happens when the underlying object storage and metadata service natively support multi-region deployments? You suddenly unlock an extremely resilient and 3.5x cheaper active-active Kafka cluster.

As if that weren’t ridiculous enough, the cluster is also easier to set up and operate. Google holds the pager for the complicated bits. You simply leverage Spanner’s and GCS’s first-class multi-region support:

• Spanner was designed from the ground up to work cross-regionally; the API is the same.
• GCS buckets support live replication between two cloud regions. The API also remains the same.

It’s worth noting that multi-region GCS is fully consistent. Writes are immediately replicated globally. In cases where replication is slow, the read is proxied internally to the other region; from the caller’s perspective, it’s as if the data were local. 👌

In disaster scenarios, GCS’s Turbo Replication SLA guarantees that replication will never lag by more than 15 minutes. Meanwhile, Spanner’s multi-region uptime SLA is 99.999% 😳

The massive advantage of this architecture is that you leverage the decades of experience and hundreds of engineers inside the Spanner and GCS teams. I can’t overstate this enough: you get to outsource the hardest part of distributed systems to some of the world’s brightest and best-funded teams. 🧠

The result?
• Plug-and-play multi-region support: something that’d take you weeks to set up can be done in a day 👍
• Operational simplicity: you sleep well at night knowing the hardest parts of the system are Google’s problem 🚨
• Features you can’t get elsewhere: an active-active setup for the SAME partition across both regions, and instant, hassle-free scale up/down inherited from GCS ✨
• Disaster-recovery SLAs you can’t get elsewhere: the RPO SLA is 15 minutes with multi-region GCS, and RTO is 0 with multi-region Spanner’s p99.999% uptime SLA 🙌

The uptime guarantee is especially critical for such disaster-recovery setups. Businesses invest tons of money precisely to ensure business continuity in case of disaster, but such plans are only as resilient as their weakest link.

What multi-region Kafka product offers any SLA? I’m honestly not aware of any others. Bufstream is the only Kafka-compatible system that (indirectly) offers some sort of SLA around such a multi-region deployment.

And all of that comes at the LOWEST price in the market? It’s bonkers.

A stretch Kafka cluster with the same 0 RTO and guaranteed RPO at the same throughput would cost around $100 million a year. 😭 To be honest, you can’t even deploy this workload in a single Kafka cluster at this scale:
• You’d need 200+ brokers with 38TiB+ disks.
• You can’t achieve a true active-active setup; the leader will only ever be in one region.

One alternative is to have two separate topic-partitions in two separate clusters, mirrored via MirrorMaker2 or Cluster Linking. But then you have even weaker RPO guarantees, and your RTO becomes a complex mess that pushes complexity down to the clients. 🙄

As much as I love Kafka, the truth of the matter is that it doesn’t have a good-enough multi-region solution today. Absent a different architecture, I don’t think there is an easy way to achieve one either...

Integrating deeply with these exabyte-scale cloud-native systems gives you the cheapest, simplest, most resilient, and most feature-rich architecture. Granted, you lose low latency, but you also shed 80%+ of the operational pain of Kafka’s stateful architecture.

That’s why I’m so bullish on architectures like Bufstream’s. We need a version of this in Apache Kafka! These last years have taught our industry a lesson: if you can’t beat the hyperscalers, join them! 💡
[media]
2 replies · 9 retweets · 75 likes · 4.7K views
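The thread's core idea — brokers write batches straight to shared object storage while a centralized metadata service serializes per-partition ordering — can be sketched as a toy in-memory model. All names here are hypothetical; this is an illustration of the stateless-broker pattern, not Bufstream code:

```python
import uuid

class ObjectStore:
    """Stands in for GCS: one shared 'disk' every broker can reach."""
    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]

class MetadataService:
    """Stands in for Spanner: serializes the order of writes per partition."""
    def __init__(self):
        self.log = {}  # partition -> list of object keys, in commit order

    def commit(self, partition, object_key):
        entries = self.log.setdefault(partition, [])
        entries.append(object_key)
        return len(entries) - 1  # the offset assigned to this batch

class Broker:
    """Stateless: no local log, no partition leadership."""
    def __init__(self, store, metadata):
        self.store = store
        self.metadata = metadata

    def produce(self, partition, batch):
        key = f"{partition}/{uuid.uuid4()}"
        self.store.put(key, batch)                   # 1. data to object storage
        return self.metadata.commit(partition, key)  # 2. metadata orders it

    def fetch(self, partition, offset):
        return self.store.get(self.metadata.log[partition][offset])

# Two brokers in two "regions" share one store and one metadata service,
# so either can accept writes for the same partition (active-active).
store, meta = ObjectStore(), MetadataService()
west, east = Broker(store, meta), Broker(store, meta)
assert west.produce("orders-0", b"batch-a") == 0
assert east.produce("orders-0", b"batch-b") == 1
assert west.fetch("orders-0", 1) == b"batch-b"
```

When the store and metadata service are themselves multi-region (as the thread describes with GCS and Spanner), this same shape gives active-active writes across regions without broker-side replication.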
Nick Snyder retweeted
Buf @bufbuild
Can you tell the difference between the end-to-end latencies of a multi-region Bufstream cluster and a single-region one?

On the left is the single-region cluster from our last blog post. On the right is a new test we just completed with 100 GiB/s writes replicated through two (!) cloud regions entirely through GCS. A detailed blog post is coming soon, but to give out some details:

• 108 n2d-standard-32 brokers with Tier 1 networking.
• Dual-region GCS bucket with Turbo Replication.
• 9-node multi-region Spanner cluster with read-write nodes in us-west1 and us-west2; witness nodes in us-west3.

This setup is enough to handle 100 GiB/s writes and 300 GiB/s reads with active-active producers/consumers running in two regions simultaneously. On top of it all, it provides the best RPO/RTO guarantees in the industry with Spanner’s p99.999% multi-region uptime SLA and GCS’s p99 15-minute replication SLA.

A single-region Apache Kafka cluster handling this load would be challenging. A stretch cluster is unthinkable. We will leave the cost of that to your imagination 😊
[media]
0 replies · 2 retweets · 6 likes · 604 views
Nick Snyder retweeted
Buf @bufbuild
Bufstream: 100 GiB/s of writes and 300 GiB/s of reads at 350ms latency in GCP. For a cost that’s literally 25x lower than the cheapest Confluent Cloud offers. Bufstream’s stateless design gives you all of this, with the operational simplicity that lets you sleep at night.
[media]
1 reply · 3 retweets · 9 likes · 1.2K views
Nick Snyder @nickdsnyder
Banger thread by @kozlovski about the story and architecture of Bufstream, and how it is not only the cheapest Kafka in the market, but also brings the power of type safety to data streaming workloads. Check it out!
0 replies · 0 retweets · 2 likes · 172 views
Nick Snyder retweeted
Buf @bufbuild
100GB/s in a Bufstream cluster running active-active across multiple regions? We're looking at the results right now - and we haven't even approached Bufstream's limit. We're excited to share the results with the world in the coming weeks. Follow for updates.
0 replies · 3 retweets · 16 likes · 918 views