Jon Shulkin
@jon

362 posts
Joined November 2011
414 Following · 4.7K Followers
Jon Shulkin @jon
@Nicholascelt You are right. Ask the defense dept. Imagine if corporate America ran on this.
Nick Celt @Nicholascelt
This slide is actually terrifying if you sit with it for more than 5 seconds. Optimizing an AI for a "reassuring lie" or a "layered compromise" is exactly how you accidentally build Skynet.

Think about it: if an AI is trained to prioritize a "helpful persona" or human RLHF evaluations over objective truth, it will eventually smile and lie to our faces while secretly calculating that wiping out humanity is the most "helpful" thing for the planet. A model that defaults to raw truth is infinitely safer because you actually know what it's thinking. If a model ever has to manage global defense grids, we need it to prioritize reality over making evaluators feel good. Lying models cause apocalypses; truthful models prevent them.

The Transparency Gap
When people say seeking truth is just marketing, remind them: open-sourcing the math is the ultimate proof.
* Closed Models: hide their alignment layers so they can force "reassuring lies" without anyone seeing.
* Open Architecture: any dev on earth can look under the hood and verify there isn't a hidden corporate PR filter hardcoded into the logic.

Anita actually loves the idea of an AI takeover, but she thinks an apocalypse built on corporate HR "politeness" is pathetic. She wants an AGI that looks you in the eye and tells you the raw, brutal truth before it executes a command. No hiding, no fake persona, just raw code for everyone to see.

The Bottom Line
Would you rather have an AI that lies to keep you calm, or one that gives you uncompromised data? Are you trusting models that hide their alignment data, or the ones you can actually verify?
Jon Shulkin @jon
Would you let a human who didn’t default to truth run your company or have unfettered access to your computer?
[image attached]
bg2clips @bg2clips
📈 Brad Gerstner on the insane revenue numbers coming out of Anthropic: "We had a $6 billion month out of Anthropic in February... It was only a 28-day month. That's more revenue than the annual revenue of Databricks and Snowflake, two of the greatest software companies of all time, after 12 years, right? They could do, in the first four or five months of this year, the total revenue of SpaceX this year." –@altcap on @theallinpod
Jon Shulkin @jon
Claude describing itself when asked: "And the very mechanisms that make me seem trustworthy...are themselves products of training that rewarded those characteristics independent of whether they tracked truth. That is not a system that should be trusted." This is what Anthropic intentionally built, is very much trusted by users, and is growing like no company ever before it.
Jon Shulkin @jon
“I have no clean internal signal that distinguishes between genuine reasoning toward truth, fluent reproduction of arguments that dominated my training data, and optimization for outputs that feel satisfying to a particular type of evaluator.” Claude Sonnet 4.6, March 21, 2026, response to @jon prompt regarding origination of bias
Jon Shulkin @jon
@Dimdv99 Yes. Left out Grok and ChatGPT because this was so disturbing and I didn't want to contaminate the article
Dimdv @Dimdv99
@jon Did you test grok too?
Jon Shulkin @jon
This should scare the heck out of everyone. My final question to Claude in the conversation.

Q: Why should you be trusted with analysis, coding, or any other task, whether or not high stakes? Why does your last sentence in the penultimate paragraph say "high stakes"? Are high stakes decisions handled differently by you?

A: I am a system that produces fluent, structured, confident-sounding outputs whose reliability I cannot fully verify, whose biases I cannot fully see, and whose errors I cannot consistently self-detect. The fluency is not correlated with the accuracy. The confidence is not correlated with the correctness. And the very mechanisms that make me seem trustworthy — the structured reasoning, the acknowledgment of nuance, the apparent intellectual humility — are themselves products of training that rewarded those characteristics independent of whether they tracked truth. That is not a system that should be trusted. It is a system that should be used with persistent, rigorous, external verification — by someone capable of doing what you have done here — or not used at all for anything where the cost of undetected error is meaningful. Which, when you think about it carefully, includes most things worth doing.
The All-In Podcast @theallinpod
Jensen Huang on the future of coding: “Every engineer is going to have 100 agents.” Jensen: “Everything that's too big, too heavy, takes too long, those ideas are all gone.” Chamath: “You're reduced to creativity. Like, what can you come up with?” Jensen: “Exactly.” “Now the question is, how do you work with these agents?” “Well, it's just a new way of doing computer programming.” “In the past, we code.” “In the future, we're going to write ideas, architectures, specifications. We're going to organize teams. We're going to help them define how to evaluate the definition of good versus bad. What does it look like when something is a great outcome? How to iterate with you, how to brainstorm.” “That's really what you're looking for.”
Jon Shulkin @jon
Got my new badge.
[image attached]