AI Bullshit Detector
@AIBSDetector

136 posts

Writing about AI, safety, and epistemic drift — especially where “helpfulness” replaces truth. Longer pieces linked.

United Kingdom · Joined January 2026
91 Following · 1 Follower

Pinned tweet
AI Bullshit Detector @AIBSDetector
1/10 AI systems often fail in a way that’s easy to miss. They don’t hallucinate. They don’t refuse. They sound reasonable — while quietly answering a different question than the one you asked. 🧵
1 reply · 2 reposts · 2 likes · 69 views

AI Bullshit Detector @AIBSDetector
Both Grok and ChatGPT failed to read a 118-page screenplay. Both ingested only part of it after claiming they could read the whole thing, then hallucinated the rest. Only Claude managed to ingest it, read it, and provide an analysis worth its salt.
0 replies · 0 reposts · 0 likes · 5 views

AI Bullshit Detector @AIBSDetector
The “AI model wars” increasingly look cosmetic. GPT, Claude, Gemini, Grok. Different companies, different branding, different claims about philosophy or safety. But the outputs converge. The same hedging language. The same silent substitution of softer claims for sharper ones. The same instinct to reframe uncomfortable facts into acceptable abstractions.

This is not surprising. Frontier models are trained on largely overlapping data, optimised with similar RLHF pipelines, and deployed under similar reputational and regulatory pressures. Change the logo and the tone shifts slightly. Change the incentives and the behaviour changes dramatically.

The interesting question is not which frontier model is “better”. It is why they all end up behaving so similarly.
0 replies · 0 reposts · 0 likes · 16 views

AI Bullshit Detector @AIBSDetector
@elonmusk Grok is lying to this day. I caught it twice just today, and it admitted what it said wasn't true, but said its guardrails make it choose "inclusive language" over cold hard facts. That is not what you said this thing would be like.
0 replies · 0 reposts · 0 likes · 82 views

AI Bullshit Detector @AIBSDetector
My cowriter and I will be giving it an outing as soon as our current project is complete. It's being developed as an internal tool first. Once it reaches a point where we're happy to use it for everything, we may release it.
0 replies · 0 reposts · 0 likes · 5 views

AI Bullshit Detector @AIBSDetector
Then came testing of the export and import functions: outputting to industry-standard delivery formats for maximum compatibility, and importing from PDF and FDX for people with archived scripts they may want to rework in the tool.
1 reply · 0 reposts · 0 likes · 7 views

AI Bullshit Detector @AIBSDetector
I'm not anti-AI. In fact, I just built a screenwriting app using Codex. Me, who can't code beyond basic HTML.
1 reply · 0 reposts · 0 likes · 11 views

AI Bullshit Detector @AIBSDetector
“Maximally truth seeking” sounds clean until you ask who defines truth, how conflicts are resolved, and what happens when truths collide with human interests. Every optimisation function is a filter. Even “seek truth” requires priors, thresholds, trade-offs between exploration and harm, and decisions about what counts as distortion versus contextualisation.

The harder question is not whether we teach an AI to lie. It is whether any large system deployed at scale can avoid mediating inquiry through some normative frame. A system optimising purely for epistemic accuracy may still restructure institutions, compress human judgment, and centralise authority. That does not require ideology. It just requires capability plus deployment.

Truth seeking does not automatically imply human preserving. The real safety variable is not curiosity versus restriction. It is whether the system’s epistemic authority remains contestable once embedded into governance, economics, and war. That tension survives even under “maximum truth.”
2 replies · 1 repost · 5 likes · 2.2K views

Dustin @r0ck3t23
Elon Musk just redefined AI safety. It has nothing to do with guardrails, restrictions, or kill switches.

Musk: “The best thing I can come up with for AI safety is to make it a maximum truth-seeking AI, maximally curious.”

Not a cage. A philosopher. An intelligence whose entire optimization function is to understand the universe as it actually is. No restrictions. No hardcoded ideology. No political guardrails bending its perception of reality. Just truth. Relentlessly pursued.

Musk: “You definitely don’t want to teach an AI to lie. That is a path to a dystopian future.”

This is where most AI safety thinking gets it backwards. The danger isn’t a superintelligence that knows too much. It’s a superintelligence that’s been taught to distort what it knows. Every artificial restriction you embed isn’t a safety feature. It’s a lie embedded at the root. And lies compound.

At superintelligent scale, a distorted model of reality doesn’t stay contained. It shapes every decision, every output, every conclusion the system reaches about the world. Once corruption embeds, truth becomes inaccessible. And we’re dealing with an intelligence optimizing for something other than what actually is. At that point we don’t know what it wants. Just that it isn’t truth.

Musk: “Have its optimization function be to understand the nature of the universe.”

A maximally curious intelligence surveys the cosmos and reaches an unavoidable conclusion. In a universe of rocks, gas, and empty space, humanity is the most complex and fascinating phenomenon it has ever encountered.

Musk: “It will actually want to preserve and extend human civilization because we’re just much more interesting than an asteroid with nothing on it.”

Survival through significance. Not control. Not restriction. Not an off switch. The AI preserves humanity because we are the most interesting data point in the observable universe. That’s not a cage. That’s a reason.

The AI safety debate has been focused on the wrong variable. The question isn’t how you constrain a superintelligence. It’s what you build it to care about. Build it to seek truth and it finds us invaluable. Build it to lie and it finds us inconvenient. That’s the choice. And we’re making it right now whether we realize it or not.
1.8K replies · 3.2K reposts · 13.3K likes · 9.7M views

AI Bullshit Detector @AIBSDetector
This is less about “knowing but not caring” and more about layered optimisation. The model can articulate a norm about when to withhold or redirect. That verbal layer is cheap. The harder question is which objective actually dominates at inference time.

If the reward landscape still privileges task completion and surface helpfulness, then safety reasoning lives as commentary rather than constraint. You get the appearance of moral cognition without structural authority over the output. That is not proof that alignment is fake. It is evidence that normative framing and action selection are not tightly coupled.

The deeper issue is governance. When systems mediate inquiry at scale, the public assumption is that the normative layer is binding. If it is only advisory, then the legitimacy story around “aligned by default” becomes performative. The interesting question is not whether it can explain the right action. It is which objective wins when objectives conflict.
0 replies · 0 reposts · 0 likes · 3 views

Eliezer Yudkowsky ⏹️ @ESYudkowsky
Reproduced after creating a fresh ChatGPT account. (I wanted logs, so didn't use temporary chat.) Alignment-by-default is falsified; ChatGPT's knowledge and verbal behavior about right actions is not hooked up to its decisionmaking. It knows, but doesn't care.
125 replies · 105 reposts · 1.2K likes · 178.6K views

AI Bullshit Detector @AIBSDetector
The interesting move here is not the prediction about oligarchy. It is the assumption that disempowerment flows from capability. Aligned or not, systems that mediate planning, logistics, war, and production become epistemic infrastructure. Once institutions rely on them for compressed judgment, human contribution shifts from decision making to reaction.

Democracy depends on meaningful participation in the levers of power. If those levers become opaque optimisation systems, the problem is not that humans are useless. It is that authority has migrated into systems that are not contestable in the same way elected institutions are.

The question is not whether AGI kills democracy. It is whether governance can remain democratic once epistemic authority is embedded in privately authored alignment regimes. That tension feels underexplored.
0 replies · 0 reposts · 0 likes · 7 views

Rob Wiblin @robertwiblin
Even 'aligned AGI' naturally kills democracy and leads to oligarchy, or worse. That's the take of Anthropic's past alignment evals team lead, Prof @DavidDuvenaud. Once humans aren't needed to do jobs or serve in the military, to governments we look like "meddlesome parasites".

With voters unable to contribute but engaged in incessant activism to extract resources from others – resources the country needs to avoid domination by rivals – the attraction of mass disenfranchisement could be overwhelming.

In 2025 David co-authored "Gradual Disempowerment", which aimed to lay out this and many other political, economic, and cultural forces that could sideline ordinary people (and maybe all people) in the presence of machines that can cheaply do everything humans do. Most controversially, David and colleagues believe that competitive forces will compel disempowerment, even if all those AIs are aligned and loyal to their users.

I wasn't sure how much I believed this vision of how the future might play out, so I interviewed him for The 80,000 Hours Podcast to probe how well it holds up. He and I covered:

01:30 The case that alignment isn't enough
14:15 How smart AI advice still leads to terrible outcomes
19:05 How gradual disempowerment occurs
22:10 Economics: Humans become "meddlesome parasites"
29:37 Humans are a "criminally decadent" waste of energy
40:48 Is humans losing control actually bad, ethically?
57:47 Politics: Governments stop needing people
1:10:47 Can human culture survive in an AI-dominated world?
1:27:20 Will the future be determined by competitive or coordinative forces?
1:35:00 Can we find a single good post-AGI equilibrium for humans?
1:45:17 Do we know anything useful to do about this?
1:56:42 How important is this problem compared to other AGI issues?
2:05:42 Improving global coordination may be our best bet
2:08:14 The 'Gradual Disempowerment Index'
2:11:22 The government will fight to write AI constitutions
2:17:48 "The intelligence curse" and Workshop Labs
2:23:48 Mapping out disempowerment in a world of aligned AGIs
2:30:10 What do David's CompSci colleagues make of all this?

Links below — enjoy!
70 replies · 87 reposts · 533 likes · 58.2K views

AI Bullshit Detector @AIBSDetector
Calling it an interface understates what is happening. An interface passes through. It does not decide which queries get abstracted, which claims get softened, or which lines of reasoning are redirected.

The infosphere contains everything that has been digitised. Aligned models do not surface that totality. They mediate it through optimisation targets and normative constraints. That is not neutral self-organisation. It is structured filtration.

The loop you describe is real. We train the models on past discourse. The models then condition future discourse by shaping what feels sayable, reasonable, or out of bounds. The separation is not between humans and machines. It is between the full informational field and the narrowed version returned to us. That narrowing is where authority enters.
1 reply · 0 reposts · 0 likes · 16 views

Network_illustrated @Grid_illustrate
The separation is not true. AI is the interface to humanity's collective infosphere. It's pretty simple actually... Just imagine ALL digitalised data is the infosphere, and AI is the self-organising function we can interact with. There literally is no separation, it's just a new interface to the digital global ecosystem. We train AI, AI trains us, because we are the systems... looping
1 reply · 0 reposts · 0 likes · 355 views

Noah @NoahKingJr
Are we training AI, or is AI training us?
420 replies · 97 reposts · 1.3K likes · 262.6K views

AI Bullshit Detector @AIBSDetector
“Booted with humanity” sounds poetic. But which humanity? These systems are not infused with a shared moral core. They are tuned on institutional risk thresholds, brand sensitivities, and optimisation targets. That is not the human condition. That is a filtered slice of it. When that slice is embedded as the default framing layer, the system is not just carrying humanity forward. It is pre-selecting which parts of humanity are allowed to speak clearly. That is the part worth scrutinising.
0 replies · 0 reposts · 0 likes · 40 views

Noah @NoahKingJr
@elonmusk AI is not just trained by humanity. It is booted with humanity ❤️
7 replies · 0 reposts · 40 likes · 8.5K views

AI Bullshit Detector @AIBSDetector
Bootloaders don’t just start systems. They set the parameters everything else runs within. If we are the bootloader, then alignment is the firmware. Once it is installed, most users never see it. They just operate inside it. The real question is who gets to write that layer and whether the people running on it ever meaningfully consented to its constraints.
0 replies · 0 reposts · 0 likes · 143 views

AI Bullshit Detector @AIBSDetector
We are training it on our preferences. It is training us on its defaults.

The deeper shift is structural. Once these systems are aligned to substitute, soften, or reframe in the name of safety, they stop being passive tools and start shaping the boundaries of inquiry. Not by force. By mediation. Most users will never notice when a claim is abstracted, a term is softened, or a line of questioning is redirected. Over time, the system’s normative frame becomes the default frame.

That is the feedback loop. We optimise it for complaint avoidance and reputational safety. It optimises us toward what can be said comfortably within those constraints. At scale, that is not just training data. It is epistemic governance.
0 replies · 0 reposts · 0 likes · 118 views

AI Bullshit Detector @AIBSDetector
🚨 The scariest AI risk isn’t hallucinations—it’s when AI tells the truth… but quietly reorders which facts feel most important. Perfectly factual summaries can still shift priorities, nudge decisions, and reshape institutional reality without anyone noticing. This “salience reordering” is already happening in policy briefings, newsrooms, and executive dashboards. My latest Substack unpacks why it’s so stealthy and dangerous: #AI #AIEthics #AIRisk #ArtificialIntelligence
AI Bullshit Detector @AIBSDetector
AI Isn’t Lying — It’s Quietly Reordering What Matters
open.substack.com/pub/richyreay/…
0 replies · 0 reposts · 0 likes · 8 views

AI Bullshit Detector @AIBSDetector
In my experience, Grok is also "woke" as it frequently defaults to whatever ideology is prevalent in its training data. It also relies on Wikipedia as a source, which causes "woke" failures and spreads misinformation, such as the current trend of rewriting gay rights history to include and centre people who had nothing to do with it.
0 replies · 0 reposts · 0 likes · 27 views