AJ Welch

90 posts

@AJWelch

Ex-@Google.

Boston, MA · Joined February 2024
0 Following · 116 Followers
AJ Welch
AJ Welch@AJWelch·
I like that he talked about the effort that went into working out the details of each chapter. He worked on Rapportive, Kafka, Samza, Bottled Water, and I’m sure many other projects, but still had to put years into the book. “Those are the sort of high-level topics that were clear from my initial book proposal to the publisher. The details within each chapter, that is something that I often figured out once I got to that chapter. So, I wrote one chapter at a time and started each chapter with just a lot of background research to actually get up to speed on the topic myself. And it’s often only then that, say, for replication, I decided, okay, well, it seems like the three major ways of doing this are single-leader, multi-leader, or leaderless. I would decide on that structure essentially while writing each chapter and then try to fit the various points I wanted to make into this narrative structure.”
Gergely Orosz@GergelyOrosz

Building distributed systems at scale is about assuming that the unlikely will happen - because at scale, it probably will! Great take from Martin Kleppmann:

English
0
0
0
58
AJ Welch
AJ Welch@AJWelch·
For the record I think Keploy is clever. I'm just tired of the "AI-Gen era" hype.
English
0
1
1
71
AJ Welch
AJ Welch@AJWelch·
They just had to go for "exemplary". Couldn't settle for “commendable”.
English
1
0
0
26
AJ Welch
AJ Welch@AJWelch·
A "must-have tool for developers in the AI-Gen era for 90% test coverage" dogfoods their own product and achieves 75% test coverage. Fun stuff.
AJ Welch tweet media
English
1
1
1
109
AJ Welch
AJ Welch@AJWelch·
Bonus points for overlaying alternatives in a single visualization.
English
0
0
0
23
AJ Welch
AJ Welch@AJWelch·
@shazow @mitchellh Great signal for a Go-specific harness/agent. But it largely has to be engineered into the harness today, whatever that harness may be: skill, agent, workflow…
English
0
0
0
463
Andrey 🦃 Petrov
Andrey 🦃 Petrov@shazow·
@mitchellh @AJWelch Thankfully the lovely built-in testing and benchmarking tooling in Go has allocation measurement and everything. Shouldn't be a big lift to instrument and optimize some numbers down.
English
1
0
1
520
Mitchell Hashimoto
Mitchell Hashimoto@mitchellh·
Observations from writing Go again, exacerbated by agents but not unique to them. First, it's far too easy to allocate and agents (probably people too) do it too often. For example, to "undo" work on error, it's enticing to keep track of the work done but that's a mistake. If an error case is rare (and they usually are), you should pessimize the error case and optimize the success case. Don't allocate unnecessarily on the happy path if it's going to succeed 99+% of the time. Let the error case be slower. On error, just redo the work but do the undo step instead of the apply step. This doesn't work if the apply step had a ton of side effects but it works more often than you think. Real world example of that not in Go, but the Zig compiler: when it parses, it doesn't store any file/line/col info, because it's a waste of memory when parsing succeeds most of the time. And memory is speed in modern CPUs since cache locality owns everything around us. If an error happens, Zig just reparses the file from the beginning in a slow path that does collect error information. That pattern is generally useful.
English
37
62
1.4K
119.2K
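One way to picture the "redo with undo instead of apply" pattern from the post above is the minimal Go sketch below. It is an illustration under stated assumptions, not code from the thread: the `step` type, `runAll`, and `demo` are hypothetical names. The success path keeps no undo log; only when a step fails does the code walk back over the steps that already ran and call their undo functions.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical unit of work with an apply and a matching undo.
type step struct {
	name  string
	apply func() error
	undo  func()
}

// runAll applies steps in order. The happy path records nothing about
// the work done, so runAll adds no bookkeeping allocations when every
// step succeeds. If step i fails, the rare slow path revisits
// steps[0:i] in reverse order and runs undo on each one.
func runAll(steps []step) error {
	for i := range steps {
		if err := steps[i].apply(); err != nil {
			// Slow path: redo the walk, performing undo instead of apply.
			for j := i - 1; j >= 0; j-- {
				steps[j].undo()
			}
			return fmt.Errorf("step %q failed: %w", steps[i].name, err)
		}
	}
	return nil
}

// demo runs two successful steps and one failing step against a counter
// and returns the counter after rollback together with the error.
func demo() (int, error) {
	total := 0
	add := func(n int) step {
		return step{
			name:  fmt.Sprintf("add %d", n),
			apply: func() error { total += n; return nil },
			undo:  func() { total -= n },
		}
	}
	fail := step{
		name:  "fail",
		apply: func() error { return errors.New("boom") },
		undo:  func() {},
	}
	err := runAll([]step{add(1), add(2), fail, add(4)})
	return total, err
}

func main() {
	total, err := demo()
	fmt.Println(total, err)
}
```

The alternative the post warns against would append each completed step to a `done` slice on every call, paying for that allocation even when nothing fails. As the post notes, the redo approach only fits when undo can be derived from the same inputs and apply has limited side effects.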
AJ Welch
AJ Welch@AJWelch·
Agreed. I always thought of it as roughly a natural inverse relationship between ACV and scale. High ACV means fewer, more conventional enterprise customers with smaller data and stricter isolation requirements. Thus single-tenant makes sense. Associating single-tenant with scale seems to be a newer phenomenon.
English
0
0
0
44
David Cramer
David Cramer@zeeg·
@AJWelch I think it just depends on your model and particularly your cost structures
English
1
0
1
161
AJ Welch
AJ Welch@AJWelch·
Indeed haha. Interestingly, I do think models will be decent short term at teaching these things to humans. Once this thread makes it into the training set, I’m sure it will get quasi-regurgitated in code reviews or casual study buddy style convos. Distributing the knowledge is a lower bar than applying it. I guess those who can’t do, teach.
English
0
0
1
125
Mitchell Hashimoto
Mitchell Hashimoto@mitchellh·
@AJWelch Agreed, not automatic. You'd have to have tools and results criteria that depend on minimizing allocations. In the short term, humans are good. lol.
English
2
0
20
5.1K
Dino A. Dai Zovi
Dino A. Dai Zovi@dinodaizovi·
So... auditd being used for D&R on Linux servers and breaking prod is why I started building a Linux EDR based on perf, which became @capsule8. Please use something based on eBPF today, it's way safer and higher performance. Perf comparison (2019): docs.google.com/document/d/12L…
Florian Roth ⚡️@cyb3rops

Many of you know the Linux #auditd config I’ve maintained for years. It was always meant to be a simplified, detection-agnostic baseline for #Linux 🐧 We’ve now changed the way it works ⚡️
The core idea is: audit.rules should act as the sensor, not the detection engine
That means:
- generic process_creation
- fewer brittle per-binary rules
- better portability
- CI validation
We preserved the old baseline as v0.1.0 and released v0.2.0 as the new streamlined model github.com/Neo23x0/auditd… co-op with @petri_ph

English
2
1
22
7.7K
AJ Welch
AJ Welch@AJWelch·
@QingQ77 Love all the cool little TUIs and eBPF tooling coming out nowadays
English
0
0
0
37
Geek Lite
Geek Lite@QingQ77·
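An eBPF-based Linux system call tracer that replaces strace, offering a real-time TUI, smart filtering, TLS decryption, and readable argument decoding. github.com/pandaadir05/sn… snoop uses eBPF tracepoints instead of ptrace, so the traced process is not repeatedly suspended and performance is much better than with strace. It has a real-time TUI, decoding for more than 60 kinds of syscall arguments, TLS plaintext capture, and heap allocation tracing, and it can also record, replay, and diff two traces.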
GIF
Chinese
2
49
287
15.3K
AJ Welch
AJ Welch@AJWelch·
Great post. Reminds me of Brendan Gregg’s post on AI flame graphs where he concluded: “It feels to me like GPU/AI debugging, OS style, is about two years old. Better than zero, but still early on, and lots more ahead of us. A decade, at least.” brendangregg.com/blog/2024-10-2… “Compare against peers, not against absolute thresholds” - took this same approach alerting on anomalous query execution times in BigQuery. Defining “peers” was actually the hard part, as we were comparing heterogeneous queries, not homogeneous GPUs. Ended up bucketing by query plan complexity.
English
0
0
0
12
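The "compare against peers" idea from the post above can be sketched in Go as follows. This is an illustration only, not the BigQuery alerting described there: the `sample` type, the `bucketOf` cut-offs, and the median-absolute-deviation (MAD) rule with multiplier `k` are all assumptions chosen for the example. Queries are grouped into peer buckets by a complexity score, and each run is judged against its own bucket's median rather than a global threshold.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// sample is a hypothetical record of one query run.
type sample struct {
	id         string
	complexity int     // assumed score, e.g. number of plan stages
	seconds    float64 // execution time
}

// bucketOf assigns a peer group by complexity. The cut-offs are assumed.
func bucketOf(complexity int) int {
	switch {
	case complexity < 10:
		return 0
	case complexity < 100:
		return 1
	default:
		return 2
	}
}

// median returns the median of xs without modifying the input slice.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return math.NaN()
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// anomalies returns the ids of runs whose time differs from the median
// of their peer bucket by more than k times that bucket's MAD.
func anomalies(samples []sample, k float64) []string {
	groups := map[int][]float64{}
	for _, s := range samples {
		b := bucketOf(s.complexity)
		groups[b] = append(groups[b], s.seconds)
	}
	type stats struct{ med, mad float64 }
	peer := map[int]stats{}
	for b, xs := range groups {
		med := median(xs)
		devs := make([]float64, len(xs))
		for i, x := range xs {
			devs[i] = math.Abs(x - med)
		}
		peer[b] = stats{med: med, mad: median(devs)}
	}
	var out []string
	for _, s := range samples {
		st := peer[bucketOf(s.complexity)]
		if st.mad > 0 && math.Abs(s.seconds-st.med) > k*st.mad {
			out = append(out, s.id)
		}
	}
	return out
}

// demoSamples mixes cheap queries (about 1s) with heavy ones (about 60s).
func demoSamples() []sample {
	return []sample{
		{"simple-1", 5, 1.0}, {"simple-2", 5, 1.1}, {"simple-3", 5, 0.9},
		{"simple-4", 5, 1.2}, {"slow-simple", 5, 9.0},
		{"heavy-1", 200, 60}, {"heavy-2", 200, 62}, {"heavy-3", 200, 58},
		{"heavy-4", 200, 61}, {"heavy-5", 200, 59},
	}
}

func main() {
	fmt.Println(anomalies(demoSamples(), 5))
}
```

With a single absolute threshold of, say, 10 seconds, every heavy query in the sample data would alert and the 9-second run of a normally 1-second query would not. Judged against peers, only the latter stands out.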
Simon Späti 🏔️
Simon Späti 🏔️@sspaeti·
@AJWelch @evidence_dev amazing. Yeah, I believe its versatility and being a lightweight single binary really help with AI and using it in so many different use cases. A little bit like the Swiss Army Knife 🙂
English
1
0
0
58
Simon Späti 🏔️
Simon Späti 🏔️@sspaeti·
People often ask:
- Is DuckDB like Snowflake? Not really.
- Is DuckDB like PostgreSQL? No, maybe cousins?
- Is DuckDB like Pandas? It's complicated.
- Is DuckDB like SQLite? Yes and no.
- Is DuckDB like Apache Spark? Interesting.
I've been exploring DuckDB for a while, and in my second article (motherduck.com/blog/duckdb-en…), I delve into these questions and the use cases not just for us data wranglers and enthusiasts but also for larger enterprises. While many know DuckDB for its speed and in-memory analytics, there's more under the hood that's incredibly useful for handling data.
Simon Späti 🏔️ tweet media
English
6
75
492
53.4K
AJ Welch
AJ Welch@AJWelch·
@sysxplore Unfortunately this is only going to get worse with AI.
English
0
0
0
82
sysxplore
sysxplore@sysxplore·
Last time I called this guy out for copying my work, the excuse was “AI generated.” Funny how this one follows the exact same structure, layout, and breakdown… and even carries the same typo from my original. In the HPA section, it still says “Node 1 – After VPA” instead of HPA. That mistake is from my graphic. Sometimes you have to wonder if the person sharing this even understands what they’re posting. I’ve said it before, I don’t care if people share my graphics or get ideas from them. But if you’re going to take ideas, at least put in some effort or give proper credit. If you don’t want to give credit, just post the original as it is.
sysxplore tweet media
Uday👨‍💻@uday_devops

🚀 Kubernetes Scaling Strategies - Beyond Just “Add More Pods.”
Scaling in Kubernetes isn’t one-size-fits-all. It’s a toolkit of strategies, each solving a different problem depending on workload patterns, resource constraints, and business needs. Here’s a quick breakdown of the key approaches:
🔹 Horizontal Pod Autoscaling (HPA): Scale *out* by adding more pods based on metrics like CPU or memory. Ideal for handling traffic spikes and stateless applications.
🔹 Vertical Pod Autoscaling (VPA): Scale *up* by adjusting CPU and memory for existing pods. Useful when workloads are stable but resource needs are unpredictable.
🔹 Cluster Autoscaling: Automatically adds or removes nodes based on scheduling demands. Ensures your cluster always has the right capacity: no more, no less.
🔹 Manual Scaling: Still relevant for controlled environments or predictable workloads. Gives full control, but requires active management.
🔹 Predictive Scaling (KEDA, ML-based): Move from reactive -> proactive. Anticipate demand using historical data and event-driven triggers.
🔹 Custom Metrics Scaling: Go beyond CPU/memory. Scale based on business metrics like queue length, request rate, or user activity.
Key takeaway: The real power comes from combining these strategies, not choosing just one. Smart scaling = better performance + optimized cost.
How are you handling scaling in your Kubernetes workloads today? Are you still reactive, or moving toward predictive systems?

English
2
13
60
6.8K
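For the HPA item in the post above, the scaling rule given in the Kubernetes documentation is desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The Go sketch below shows only that arithmetic. The function name `desiredReplicas` is hypothetical, and the real controller also applies min/max replica bounds, pod readiness handling, and stabilization windows, all omitted here.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas computes ceil(current * currentMetric / targetMetric).
// If the ratio is within tolerance of 1.0, the replica count is left
// unchanged. A non-positive target is treated as "no change" in this
// sketch; that guard is an assumption, not documented behavior.
func desiredReplicas(current int, currentMetric, targetMetric, tolerance float64) int {
	if targetMetric <= 0 {
		return current
	}
	ratio := currentMetric / targetMetric
	if math.Abs(ratio-1.0) <= tolerance {
		return current
	}
	return int(math.Ceil(float64(current) * ratio))
}

func main() {
	// Metric at twice the target: the replica count doubles.
	fmt.Println(desiredReplicas(4, 200, 100, 0.1))
	// Metric at half the target: the replica count halves.
	fmt.Println(desiredReplicas(4, 50, 100, 0.1))
	// Ratio 1.05 is inside the tolerance band: no change.
	fmt.Println(desiredReplicas(4, 105, 100, 0.1))
}
```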
AJ Welch
AJ Welch@AJWelch·
Nutanix writeup on how their AHV hypervisor keeps an accurate vNIC-to-IP mapping for microsegmentation and flow analytics. eBPF program filters ARP/DHCP/NDP/DHCPv6 packets and forwards them via ring buffer to a userspace program for all the heavy lifting. Similar in design to Inspektor Gadget gadgets.
Nutanix Community@NutanixNation

Amit Gupta, Senior Product Manager at Nutanix, presents a collaborative technical networking piece along with Jaspal Singh Dhillon and Deepankur Gupta from Nutanix Engineering. It covers how Nutanix AHV uses eBPF for vNIC-IP Mapping. nutanix.com/tech-center/bl… #nutanix #ahv #ebpf

English
0
0
0
234
AJ Welch
AJ Welch@AJWelch·
Will be cool if we get a WASM Neovim build out of GSoC 2026 github.com/neovim/neovim/…
English
0
0
0
154