Pinned Tweet
Kelly Sommers
114.5K posts

Kelly Sommers
@kellabyte
🇨🇦 Backend Brat. Distributed Diva. Relentless Learner.
Canada · Joined June 2009
357 Following, 49.4K Followers
Kelly Sommers retweeted

Daniel Lemire, "How many branches can your CPU predict?," in Daniel Lemire's blog, March 18, 2026, lemire.me/blog/2026/03/1….
Kelly Sommers retweeted

What if you could write DataFrame logic once and run it on any SQL database?
Many data workflows begin with pandas for quick experimentation, while production pipelines might run on databases like PostgreSQL or BigQuery.
Moving from prototype to production usually means rewriting the same transformation logic in SQL. That translation takes time and can easily introduce errors.
Ibis solves this by letting you define transformations once in Python and compiling them into native SQL for 25+ backends automatically.
---
🚀 Tools for Portable DataFrames in Python: bit.ly/4cPYEUD
#Python #DataScience #SQL #DataEngineer
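
The "define once, compile to native SQL" idea in the tweet above can be sketched in a few lines. This is a toy illustration only and not Ibis's actual API: the `TotalByGroup` class, its fields, the `QUOTES` table, and the `orders` / `region` / `amount` names are all hypothetical, and only identifier quoting is modeled as a dialect difference (PostgreSQL uses double quotes, BigQuery uses backticks).

```python
from dataclasses import dataclass

# Identifier quoting differs per dialect: PostgreSQL uses double quotes,
# BigQuery uses backticks. Only these two dialects are modeled here.
QUOTES = {"postgres": '"', "bigquery": "`"}


@dataclass(frozen=True)
class TotalByGroup:
    """Hypothetical transformation: sum `value` per `group` where value > minimum."""

    table: str
    group: str
    value: str
    minimum: int

    def to_sql(self, dialect: str) -> str:
        # Look up the quote character for the requested dialect.
        q = QUOTES[dialect]

        def ident(name: str) -> str:
            return f"{q}{name}{q}"

        return (
            f"SELECT {ident(self.group)}, SUM({ident(self.value)}) AS total "
            f"FROM {ident(self.table)} "
            f"WHERE {ident(self.value)} > {self.minimum} "
            f"GROUP BY {ident(self.group)}"
        )


# The transformation is described once...
expr = TotalByGroup(table="orders", group="region", value="amount", minimum=0)

# ...and rendered separately for each backend.
postgres_sql = expr.to_sql("postgres")
bigquery_sql = expr.to_sql("bigquery")
print(postgres_sql)
print(bigquery_sql)
```

A real library has to handle far more than quoting (types, functions, window semantics), which is where the translation errors mentioned in the tweet tend to come from.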


@KhuyenTran16 I’ve been thinking a lot about this lately.
I’ve also been curious about database local execution of DataFrame code. Like stored procs.
Kelly Sommers retweeted

@eonem Thank you for chiming in. I love this app and how people building bigger systems than I have built (my biggest is 2,000 nodes) can chime in and educate us all.
Kelly Sommers retweeted

@kellabyte Control planes often rely on intentional friction, so that over-admission can’t create instability while dealing with the resulting control plane event processing.

@eonem Yeah, agreed, but I also feel K8s has many of the building blocks in place and is close to enabling people to carve out consistency boundaries between namespaces and workloads.
It’s not necessary to treat every piece of metadata as system wide state.
Kelly Sommers retweeted

@kellabyte For large-scale clusters (up to 65,000 nodes), GKE replaced etcd with Google Cloud Spanner as the backend state store, while still exposing the etcd API for compatibility.
Spanner gives horizontal scalability, global distribution, and low-latency consistency without etcd’s limits.

@teardropinocean Yes but that’s too simplistic of a view.
As pods grow, as worker nodes grow, cluster metadata grows.
Everything in scaling consistency is about consistency boundaries.
Treating an entire cluster managing many workers and workloads as 1 consistent state is super limiting.

@kellabyte Maybe it is still like this because CAP theorem is a thing? There always has to be some write to a disk somewhere, no?

@berenddeboer I’ve seen some organizations using Kubernetes there as well. What do you see people using? Just curious!

@kellabyte In the AWS world people are definitely not using Kubernetes :-)
Kelly Sommers retweeted

@kellabyte @gregyoung This is true. Like on AWS you have a choice between Elastic Block Store (EBS) volumes and Elastic File System (EFS) volumes when creating your PVCs, each with their own trade-offs.

@gregyoung Also I _think_ PVCs are implemented by the cloud provider underneath. Not sure if I’m right about that but that may mean reliability and guarantees could slightly differ from one vendor to another.

@kellabyte Wasn't this at least part of why PVCs were introduced?