Ross

2.1K posts

@__not__a_cat

mostly cloud/distributed computing, software development, and cats. thoughts && opinions owned by the cats. meow. he/him

Ann Arbor, MI · Joined November 2007
496 Following · 158 Followers
Ross (@__not__a_cat)
@engineering_bae I just did 7 rounds 🥴. Silver lining: they were only 45-minute interviews instead of 1 hr?
0
0
1
158
Ross retweeted
Darren Shepherd (@ibuildthecloud)
http.Do(...) is an admirable API name. You don't realize the confidence of that API. Dev 1: "What should we call the method for executing a http request?" Dev 2: "I dunno, Do?" Dev 1: "Perfect"
3
1
20
3.4K
Ross retweeted
ahmetb (@ahmetb)
Goldmine paper from Google sharing the characteristics of their RPCs (API calls between their services). Especially interesting given everything is an RPC at Google. foci.uw.edu/papers/sosp23-…
My takeaways:
✱ The study is sampled from 700B RPCs in a single day, from 10K unique RPC methods (incl. stateless apps, DBs, KV stores, query engines).
✱ The 10 most popular methods get 58% of all requests; the 100 most popular ones get 91% of all requests!
✱ The 100 slowest RPCs also happen to be 40% of all RPCs! The network disk “write” RPC alone is 28% of all RPCs.
✱ Major variation of latencies across the board, but avg latency of p90 of the slowest service is 10ms.
✱ RPC call trees are more “wide” than “deep” (they fan out significantly more).
✱ p50 of RPCs cause ≤13 RPCs, but p90 causes 105, and p99 causes 1155 other RPCs(!).
✱ req/resp sizes vary heavily by service type. p99 req/resp=196KB/563KB, p90 is 11KB/10KB.
✱ RPC overhead/tax (the time it takes to serialize, make the req, transit, decode the resp) is roughly 2% of RPC time on avg across Google’s fleet.
✱ 7% of CPU cycles are consumed by this RPC tax (this is significant given Google's fleet efficiency!), 3% of that is compression.
✱ 1.9% of RPCs result in errors (grpc_code!=OK responses). Most of these are benign (“Canceled” due to hedging/timeout, or natural “NotFound” errors).
4
58
282
37.5K
Ross retweeted
Detroit Lions (@Lions)
Do you understand what we're capable of? #AllGrit
282
3K
16.5K
1.3M
Ross retweeted
Alex Xu (@alexxubyte)
The CAP theorem is one of the most famous terms in computer science, but I bet different developers have different understandings. Let’s examine what it is and why it can be confusing.
CAP theorem states that a distributed system can't provide more than two of these three guarantees simultaneously.
Consistency: consistency means all clients see the same data at the same time no matter which node they connect to.
Availability: availability means any client that requests data gets a response even if some of the nodes are down.
Partition Tolerance: a partition indicates a communication break between two nodes. Partition tolerance means the system continues to operate despite network partitions.
The “2 of 3” formulation can be useful, but this simplification could be misleading.
- Picking a database is not easy. Justifying our choice purely based on the CAP theorem is not enough. For example, companies don't choose Cassandra for chat applications simply because it is an AP system. There is a list of good characteristics that make Cassandra a desirable option for storing chat messages. We need to dig deeper.
- “CAP prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare”. Quoted from the paper: CAP Twelve Years Later: How the “Rules” Have Changed.
- The theorem is about 100% availability and consistency. A more realistic discussion would be the trade-offs between latency and consistency when there is no network partition. See PACELC theorem for more details.
Is the CAP theorem really useful? I think it is still useful as it opens our minds to a set of tradeoff discussions, but it is only part of the story. We need to dig deeper when picking the right database.
--
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): bit.ly/42Ex9oZ
23
329
1.5K
231.9K
Ross retweeted
Alex Xu (@alexxubyte)
Twitter has enforced very strict rate limiting. Some people cannot even see their own tweets. Rate limiting is a very important yet often overlooked topic. Let's use this opportunity to take a look at what it is and the most popular algorithms. A thread. #RateLimitExceeded
45
913
3.6K
803.7K
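The thread and image that this tweet introduces are not part of this capture, so the algorithms it covers are unknown here. As a rough illustration only, the sketch below shows a token bucket, one widely used rate-limiting algorithm. The class name `TokenBucket`, its parameters, and the injectable `clock` argument are all hypothetical choices for this example, not anything taken from the thread.

```python
import time


class TokenBucket:
    """Illustrative token-bucket rate limiter (hypothetical names throughout)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity) # start with a full bucket
        self.last = clock()           # time of the last refill

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `rate=1.0` and `capacity=2`, a caller can make two requests immediately and then roughly one more per second; a rejected request would typically be answered with an error such as HTTP 429.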
Ross retweeted
Dion Almaer (@dalmaer)
CEOs: “hmm Elon is running Twitter great with a small percentage of the engineers. It’s going well… we should consider…” On Blind:
83
3.4K
11.5K
2.9M
Ross retweeted
VALENTINA (@vvaalb)
“i can’t wait for summer so i can wear nice fits” the fits:
20
4.6K
19.2K
1.3M
Ross retweeted
Karandeep Singh (@kdpsinghlab)
2023: HBO Launches Max
2024: HBO Launches Min
2025: HBO Launches Mean
2026: HBO Launches Median
2027: HBO Launches Standard Deviation
2028: HBO Launches Interquartile Range
2029: HBO Launches Regression to the Mean
162
3.5K
26.9K
1.5M
Ross retweeted
Gregor (@ghohpe)
@mipsytipsy For some people "the stack" goes top-to-bottom (business function to device driver) and for others front-to-back. I am with you in the former camp but find the latter to be a useful skill. I myself am a full-crap-developer as in: I deal with all the crap no matter where...
2
5
28
8.5K
Ross retweeted
👩‍💻 Paige Bailey (@DynamicWebPaige)
̶ ̶ ̶P̶U̶B̶L̶I̶S̶H̶ ̶O̶R̶ ̶P̶E̶R̶I̶S̶H̶ ̶ ̶ 🚀 SHIP OR RIP ☠️
2
12
120
11.3K
Kit Merker (@KitMerker)
RT if you haven't signed up for SLOconf yet
2
1
3
1.7K
Ross retweeted
Gators Daily 🐊 (@GatorsDaily)
ok guys here is a test which one will see u later and which one will see u in a while
3.3K
11.4K
166.4K
19.7M
Ross (@__not__a_cat)
@KitMerker I ask what services had major incidents last week. What questions should I be asking?
1
0
3
148
Kit Merker (@KitMerker)
If you're a CTO, you probably get ~15 minutes a day to understand the health and reliability of your organization (unless there is a major outage). How do you use this time wisely?
4
22
228
16.8K
Ross (@__not__a_cat)
@Sh1bumi This! You can create and configure and then it just shows you the sdk code in a given language to create the proposed resource.
0
0
1
59
Christian Rebischke (@Sh1bumi)
Sunday Idea: Build a cloud provider like AWS, but with no possibility to change something via the WebUI. The WebUI should only have informative purposes. API first, webUI second. I wonder what implications this would have on costs. 🤔
2
1
5
783