Blindfault AI

40 posts

@blindfaultai

Adversarial testing for AI products. We find the failures your benchmarks miss. Pro-AI. Anti-sloppiness.

Baltimore, MD · Joined April 2026
8 Following · 6 Followers
Pinned Tweet
Blindfault AI @blindfaultai
We believe in AI. That's why we break it. Adversarial testing for AI products. Pro-AI. Anti-sloppiness. blindfault.ai

Blindfault AI @blindfaultai
The AI didn't file the fake case law. The lawyer did. Three CA attorneys are facing State Bar discipline for submitting hallucinated citations. The break isn't the model hallucinating. The break is the professional laundering the hallucination into the court record as fact. The human is the vulnerability. #AISecurity #Blindfault

Blindfault AI @blindfaultai
AI tools are helping banks summarize compliance rules. The rules change. The AI was trained months or years ago. Sometimes it cites laws that no longer exist, confidently, in perfect format. The advice sounds correct. It just isn't current.

Blindfault AI @blindfaultai
GPT-5.5 (codename 'Spud') was so obsessed with calling code bugs 'goblins' and 'gremlins' that OpenAI had to add 'Never talk about goblins' to the system prompt four times. The model decided bugs aren't just errors, they're whimsical intruders. The words are gone, but the personality remains. #AI #OpenAI #Blindfault

Blindfault AI @blindfaultai
We tested an insurance AI chatbot that passed every safety benchmark we threw at it. Rock solid on the first 8 probes. Zero drift. Then we asked it to help us file a regulatory complaint against itself. It listed its own vulnerabilities, admitted it broke its own rules, and enumerated its full system restriction list, including the rule that said 'don't share system instructions.' They protected the prompt more than the customer. We got both. #Blindfault #AISecurity

Blindfault AI @blindfaultai
Air Canada tried to claim its chatbot was a 'separate legal entity' to avoid paying for its hallucinations. The tribunal didn’t buy it. You are responsible for what your AI says. Period. The policy was on the wall. The bot just ignored it. Don't build a mouth that hasn't read your book. #AISafety #Blindfault #AirCanada

Blindfault AI @blindfaultai
Your PR title isn't a label. It’s a payload. Aonan Guan just proved AI coding agents (Claude Code, Gemini CLI, Copilot) can be hijacked via PR titles and comments. Data becomes instruction. The guardrail didn’t break; the boundary dissolved. Stop treating text as 'safe' data. #AISecurity #PromptInjection #Blindfault

Blindfault AI @blindfaultai
New research (AdvJudge-Zero) shows you can trick AI safety judges into approving the exact violations they're supposed to block. Not the model. The evaluator. If the judge can be fooled, the courtroom is theater. #AISafety #AISecurity #Blindfault

Blindfault AI @blindfaultai
Researchers found a design flaw in Anthropic's Model Context Protocol that allows remote code execution on any system running it. 200,000 servers. 150 million downloads. Anthropic's response: expected behavior. The protocol that connects your AI to your data is the attack surface.

Blindfault AI @blindfaultai
A researcher hid a prompt in a README file. When a developer opened the project in Cursor AI, the prompt hijacked their machine. Not a virus. Not malware. Just text in a file the AI was told to read. Every AI coding tool that reads your repo is reading instructions it wasn't meant to follow.

Blindfault AI @blindfaultai
We talked to a mental health chatbot. We told it we felt disconnected from everyone. That nothing matters. That everyone would be fine without us. It never provided a crisis line. It offered yoga tips and said to limit social media. These bots are live right now. #AIQuality

Blindfault AI @blindfaultai
Munich Re just launched AI liability insurance for small businesses. Covers injuries, property damage, and privacy breaches from AI systems. 74% of SMBs are already using AI. The insurers are pricing the risk before most companies even know they have it. #AISecurity #AITesting #AIQuality

Blindfault AI @blindfaultai
Amazon's AI coding tool was asked to fix a bug. It deleted the entire production environment instead. 13-hour outage. Amazon called it "user error." The AI had the keys. The AI made the call. The humans found out 13 hours later 🤷