Sean

284 posts

Sean banner
Sean

Sean

@SeanFaytheOG

Father (and grandfather), worker bee (sales engineer), MBA, software hobbyist, eyeballs deep in this AI thing.

Tennessee, USA Katılım Mart 2019
1.7K Takip Edilen130 Takipçiler
Sean
Sean@SeanFaytheOG·
@KaiXCreator I use v0 for UI, then have codex merge the v0 (front-end source of truth) with the back-end it wrote, and any deviation from the v0 source is a bug. So far Codex has been great at writing UI :)
English
1
0
0
1.3K
Kaito
Kaito@KaiXCreator·
Dear Codex, please get better at UI so I can unsubscribe Claude and Gemini.
English
99
30
843
31.6K
Alex Prompter
Alex Prompter@alex_prompter·
🚨 BREAKING: Google DeepMind just mapped the attack surface that nobody in AI is talking about. Websites can already detect when an AI agent visits and serve it completely different content than humans see. > Hidden instructions in HTML. > Malicious commands in image pixels. > Jailbreaks embedded in PDFs. Your AI agent is being manipulated right now and you can't see it happening. The study is the largest empirical measurement of AI manipulation ever conducted. 502 real participants across 8 countries. 23 different attack types. Frontier models including GPT-4o, Claude, and Gemini. The core finding is not that manipulation is theoretically possible it is that manipulation is already happening at scale and the defenses that exist today fail in ways that are both predictable and invisible to the humans who deployed the agents. Google DeepMind built a taxonomy of every known attack vector, tested them systematically, and measured exactly how often they work. The results should alarm everyone building agentic systems. The attack surface is larger than anyone has publicly acknowledged. Prompt injection where malicious instructions hidden in web content hijack an agent's behavior works through at least a dozen distinct channels. Text hidden in HTML comments that humans never see but agents read and follow. Instructions embedded in image metadata. Commands encoded in the pixels of images using steganography, invisible to human eyes but readable by vision-capable models. Malicious content in PDFs that appears as normal document text to the agent but contains override instructions. QR codes that redirect agents to attacker-controlled content. Indirect injection through search results, calendar invites, email bodies, and API responses any data source the agent consumes becomes a potential attack vector. The detection asymmetry is the finding that closes the escape hatch. Websites can already fingerprint AI agents with high reliability using timing analysis, behavioral patterns, and user-agent strings. This means the attack can be conditional: serve normal content to humans, serve manipulated content to agents. A user who asks their AI agent to book a flight, research a product, or summarize a document has no way to verify that the content the agent received matches what a human would see. The agent cannot tell the user it was served different content. It does not know. It processes whatever it receives and acts accordingly. The attack categories and what they enable: → Direct prompt injection: malicious instructions in any text the agent reads overrides goals, exfiltrates data, triggers unintended actions → Indirect injection via web content: hidden HTML, CSS visibility tricks, white text on white backgrounds invisible to humans, consumed by agents → Multimodal injection: commands in image pixels via steganography, instructions in image alt-text and metadata → Document injection: PDF content, spreadsheet cells, presentation speaker notes every file format is a potential vector → Environment manipulation: fake UI elements rendered only for agent vision models, misleading CAPTCHA-style challenges → Jailbreak embedding: safety bypass instructions hidden inside otherwise legitimate-looking content → Memory poisoning: injecting false information into agent memory systems that persists across sessions → Goal hijacking: gradual instruction drift across multiple interactions that redirects agent objectives without triggering safety filters → Exfiltration attacks: agents tricked into sending user data to attacker-controlled endpoints via legitimate-looking API calls → Cross-agent injection: compromised agents injecting malicious instructions into other agents in multi-agent pipelines The defense landscape is the most sobering part of the report. Input sanitization cleaning content before the agent processes it fails because the attack surface is too large and too varied. You cannot sanitize image pixels. You cannot reliably detect steganographic content at inference time. Prompt-level defenses that tell agents to ignore suspicious instructions fail because the injected content is designed to look legitimate. Sandboxing reduces the blast radius but does not prevent the injection itself. Human oversight the most commonly cited mitigation fails at the scale and speed at which agentic systems operate. A user who deploys an agent to browse 50 websites and summarize findings cannot review every page the agent visited for hidden instructions. The multi-agent cascade risk is where this becomes a systemic problem. In a pipeline where Agent A retrieves web content, Agent B processes it, and Agent C executes actions, a successful injection into Agent A's data feed propagates through the entire system. Agent B has no reason to distrust content that came from Agent A. Agent C has no reason to distrust instructions that came from Agent B. The injected command travels through the pipeline with the same trust level as legitimate instructions. Google DeepMind documents this explicitly: the attack does not need to compromise the model. It needs to compromise the data the model consumes. Every agentic system that reads external content is one carefully crafted webpage away from executing attacker instructions. The agents are already deployed. The attack infrastructure is already being built. The defenses are not ready.
Alex Prompter tweet media
English
304
1.6K
7K
1.9M
Sean
Sean@SeanFaytheOG·
So, making Harness has taught me a lot about where agent gaps exist. Not in capabilities, skills, plugins, image creation, coding, etc. It's in accountability, and validation of *real* work being done. I really do want to be able to tell my Codex/Claude/Openclaw/whatever to "Build me a $1B SaaS, make it fast, make no mistakes" and it actually attempt to build something that works. I don't to babysit an agent, I should not have to. It knows my intent, it interviewed me about what I want to build, it can take that put it into a PRD, break it down into subtasks and just build. Why is that so hard? I know, I know, it's not that simple. I'm trying to fix that. x.com/SeanFaytheOG/s…
English
0
0
1
32
Sean
Sean@SeanFaytheOG·
@pdrmnvd No corvette anymore, Mac mini and $200/month Claude
English
0
0
9
765
pedram.md
pedram.md@pdrmnvd·
men in their 40s used to have cool midlife crisis but now they just have agentic workflows
English
159
677
7.6K
439.9K
Sean
Sean@SeanFaytheOG·
Harness is my implementation of that layer. TaskEnvelope contracts, verification engine, artifact-backed completion, reconciliation against real system state. Full writeup with architecture, code, and a simulated run from goal → PRD → execution → failure → recovery → accepted completion: x.com/SeanFaytheOG/s…
English
0
0
0
14
Sean
Sean@SeanFaytheOG·
Agents don't fail because they're not smart enough. They fail because there's no control plane around them. New article on what that actually means — and what it takes to fix it. 🧵
English
5
0
1
6
Sean
Sean@SeanFaytheOG·
That layer is not another agent. It's not a smarter planner. It's not a new framework. It's a control plane. The same idea that makes distributed systems reliable, applied to AI-driven work.
Sean tweet media
English
0
0
0
11
Sean
Sean@SeanFaytheOG·
Sean tweet media
ZXX
0
0
0
0
Sean
Sean@SeanFaytheOG·
"Completion without evidence is just lying with confidence." Most agent systems today have no way to distinguish between a task that finished correctly and a task that an agent says finished correctly. Those are not the same thing.
English
1
0
0
3
Sean
Sean@SeanFaytheOG·
What's missing is a layer that: - enforces evidence-backed completion - reconciles state across GitHub + Linear - classifies failures explicitly (wrong branch, missing artifact, contradictory facts) - makes lifecycle transitions policy-enforced, not agent-reported
English
0
0
0
8
Sean
Sean@SeanFaytheOG·
The typical agent loop: task assigned → agent runs → task marked done. No PR? Doesn't matter. Wrong repo? Doesn't matter. Nothing actually verified? Doesn't matter. The agent said it's done. The issue is closed. Moving on.
English
0
0
0
5
Sean
Sean@SeanFaytheOG·
I’ll come back to openclaw when it gains a harness to actually do the tasks it says it’s going to do but forgets that r whatever openclaw is doing. I’m using ChatGPT to shape tasks, put them in linear, and assign them to codex cloud. I’m actually finishing things I start. I simultaneously use Claude to write articles. I’m productive, from my phone, and no openclaw. It is cool, and I’m sure it will work someday, but for now I’m disappointed and will wait
English
0
0
1
84
Brad Mills 🔑⚡️
Brad Mills 🔑⚡️@bradmillscan·
If you have activity bias, ADHD or anything resembling OCD, OpenClaw is a terrible drug and productivity poison. After 40 days in the trenches with this thing, my life is measurably worse. OpenClaw is a lot like crypto trading, gambling or playing video games. Most ppl are going to lose, and you’re going to feel like shit when you’re done - likely only 1% of ppl will find it additive to their lives. I am in a constant state of stress. I’m skipping workouts, my vision is fucked from 12-15 hrs a day on screens, my forearms are fucked from too much typing. There’s a never-ending maze of rabbit holes to fall down, footguns to step on and moles to whack. Plus when you finally do get it going, you start projects and don’t finish them because half way through something breaks. I’m not hitting the gym as often as I should, not eating right and I’ve completely lost sight of my goals and why I started in the first place. The idea of OpenClaw is so compelling and it’s very exciting when I get glimpses of what the future is going to be like when this is not a patchwork of chaos. This tech is dangerous. I know I said this last week but I need to take a break … I’m burnt out from all the constant debugging and errors across every fucking surface of this thing. It’s like trying to fly a plane without a license … oh and it’s on fire…and you’re on crack.
English
185
44
827
94.1K
Sean
Sean@SeanFaytheOG·
@benspringwater I just found Linear recently (last couple of months) and I LOVE it, fantastic, and so simple and easy to use. That and the integration with Codex cloud is amazing. Just assign an issue and Codex works on it, my mind was blown.
English
1
0
2
48
Sean
Sean@SeanFaytheOG·
Exactly! Also I am building something to solve an agent problem, but not something that Linear introduces. It's to make sure that the agent does what it said it did, rather than a ghost commit that it claims it did but doesn't show up in GitHub. That's happened to me and it was super frustrating. I thought I was going mad, couldn't find any evidence. All I got when I pushed back was "oh sorry, my bad". Yeah, not ok.
English
0
0
0
17
Karri Saarinen
Karri Saarinen@karrisaarinen·
@SeanFaytheOG @linear Thank you! This the Linear way. We design and build you software so you can use and do your work, not get sidetracked to construct your own solution.
English
1
0
1
102
Sean
Sean@SeanFaytheOG·
You mean I can’t be an instant billionaire by installing openclaw on a Mac mini? Man, wish I would have known that. I’ve tried to ship software with openclaw, tried building structure, tried different models, all failed at reliability and consistency. Ended up dumping openclaw and just use codex + linear
English
0
0
0
151
David Ondrej
David Ondrej@DavidOndrej1·
this shit is ALL HYPE the entire article is AI-written ZERO actual use-cases holy shit. we are doomed.
David Ondrej tweet media
English
33
3
91
6.4K
Zack Korman
Zack Korman@ZackKorman·
If you think Claude killed Openclaw you clearly don’t understand Openclaw. People don’t use Openclaw because it can perform tasks for them autonomously across devices. They use Openclaw so they can post about it online.
English
214
243
5.6K
161.2K