
Klavs Peder
6.1K posts

@KlavsPeder
Seeker. The spelling mistakes and grammatical errors are my own.




We seem to be split along “everything is a conspiracy” vs “nothing is a conspiracy.” This is tough when conspiracies are unquestionably a ubiquitous part of normal life, but do not come close to determining or explaining everything that happens around us. I really don’t get it.

European Journal of Applied Physiology: “The myth of the Bayesian brain” link.springer.com/article/10.100…

Microsoft has set a goal to “eliminate every line of C and C++ from Microsoft by 2030.” What are they going to try to replace that C and C++ code with? You guessed it. Rust. And they’re going to use AI to do the “Rust re-write” at an insane speed.
“Our strategy is to combine AI *and* Algorithms to rewrite Microsoft’s largest codebases. Our North Star is ‘1 engineer, 1 month, 1 million lines of code.’”
You read that right. One million lines of code, per engineer, per month. Pure insanity.
This kind of decision-making is common among those with a deeply held, delusional faith in the Cult of Rust. Take battle-tested code and rewrite it (without a clear benefit to the end user) at a recklessly rapid rate. Then force others to adopt that rewritten code before it is ready or properly tested. All while holding a delusional belief that your new Rust code is superior in all ways, and is inherently bug-free thanks to the divine nature of Rust.
We learned this from a post by Galen Hunt, Distinguished Engineer at Microsoft Research. linkedin.com/posts/galenh_p…




This DeepMind paper just quietly killed the most comforting lie in AI safety. The idea that safety is about how models behave most of the time sounds reasonable. It’s also wrong the moment systems scale. DeepMind shows why averages stop mattering when deployment hits millions of interactions.
The paper reframes AGI safety as a distribution problem. What matters isn’t typical behavior. It’s the tail. Rare failures. Edge cases. Low-probability events that feel ignorable in tests but become inevitable in the real world.
Benchmarks, red-teaming, and demos all sample the middle. Deployment samples everything. Strange users, odd incentives, hostile feedback loops, environments nobody planned for. At scale, those cases stop being rare. They are guaranteed.
Here’s the uncomfortable insight: progress can make systems look safer while quietly making them more dangerous. If capability grows faster than tail control, visible failures go down while catastrophic risk stacks up off-screen.
Two models can look identical on average and still differ wildly in worst-case behavior. Current evaluations can’t see that gap. Governance frameworks assume they can.
You can’t certify safety with finite tests when the risk lives in distribution shift. You’re never testing the system you actually deploy. You’re sampling a future you don’t control.
That’s the real punchline. AGI safety isn’t a model attribute. It’s a systems problem. Deployment context, incentives, monitoring, and how much tail risk society tolerates all matter more than clean averages.
This paper doesn’t reassure. It removes the illusion. The question isn’t whether the model usually behaves well. It’s what happens when it doesn’t, and how often that’s allowed before scale makes it unacceptable.
Paper: arxiv.org/abs/2512.16856
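The scale argument in the post above can be made concrete with a small calculation. This is a minimal sketch and is not taken from the paper: it assumes each interaction fails independently with the same fixed probability p, and the function name and the example rate of one in a million are hypothetical choices for illustration.

```python
import math


def prob_at_least_one_failure(p: float, n: int) -> float:
    """Chance of at least one failure across n interactions.

    Assumes (for illustration only) that failures are independent and
    each interaction fails with the same probability p, with 0 <= p < 1.
    """
    # 1 - (1 - p)**n, computed via log1p/expm1 so tiny p keeps precision.
    return -math.expm1(n * math.log1p(-p))


if __name__ == "__main__":
    rare = 1e-6  # hypothetical one-in-a-million failure rate
    for n in (1_000, 1_000_000, 100_000_000):
        print(f"{n:>11,} interactions: {prob_at_least_one_failure(rare, n):.6f}")
```

Under these assumptions, a one-in-a-million failure is unlikely across a thousand interactions, more likely than not across a million, and close to certain across a hundred million. Real deployments will not have independent, identically distributed failures, so this only illustrates the direction of the effect.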






