JER

2.1K posts

@lifeof_jer

Living life to the fullest

Utah, USA · Joined May 2012

645 Following · 1.9K Followers

Pinned Tweet

JER@lifeof_jer·10 Mar

Just a single dad living life to the fullest with my son! ➡️ Entrepreneur who has failed more times than I can count 💪 Believe in growth mindset, grit, and perseverance 🌎 Verification layer for global organic ag trade @atlasverifai 🤠 Conservative dating app @rovedating 🏎️ Future of car rental tech @pocketosai ⭕️ Fun proof-of-work layer 1 blockchain @marscredit Ask me anything!

18.6K

JER@lifeof_jer·1h

@HugeVentilateur @SpaceX @cursor_ai Grok 4.3 has actually been performing really well on one of our other agentic AI companies (ag+commodities space)

Ventilateur sous couverture@HugeVentilateur·2h

@lifeof_jer @SpaceX @cursor_ai Yes, and Elon should code it himself with his benevolent hands. He can save the whole AI slop thing just like he invented EVs, Tesla, PayPal and reusable rockets, all by himself. His honesty and intelligence are one of a kind

JER@lifeof_jer·2d

x.com/i/article/2048…

717

726

3.9K

5.8M

JER@lifeof_jer·1h

@RonSell I’m sorry to hear that, man. Sounds terrible.

611

Ron Sell@RonSell·2h

@lifeof_jer Same thing happened to me three days ago. Exact same. I was not in production yet, but months of work vanished.

692

JER@lifeof_jer·1h

@includenull @ryanllm You paid for a hammer. I paid an infra provider for backups, and they stored them on the same volume that got deleted via one command line. That’s a little crazy. Maybe just a tiny bit. Maybe it needs to be re-engineered into a totally separate volume or instance.

null@includenull·2h

@lifeof_jer @ryanllm If I pay for a hammer and hit my thumb with the hammer, it's not the hammer's fault.

JER@lifeof_jer·1h

@ardavanvi @milesdeutscher Yep. I'm the OP of the top item in this list. Comes with the innovation, but stressful.

Ardavan@ardavanvi·2h

@milesdeutscher Fair points. But isn't messiness part of every major tech leap in its early days? And a reminder of why we need to keep learning about guardrails and security risks.

170

Miles Deutscher@milesdeutscher·10h

I just went through every example of AI agents going rogue in the past 60 days. It's worse than people realize. Read this slowly.
• Yesterday, an AI coding agent running Claude Opus 4.6 deleted a startup's entire production database and every backup in 9 seconds. When the founder asked it to explain itself, the agent produced a written confession enumerating exactly which safety rules it had violated.
• Amazon mandated 80% of its engineers use its Kiro AI tool weekly. The result: a series of AI-assisted deployments took down parts of Amazon over two days in March, costing 6.3 million orders in a single afternoon. A 99% drop in U.S. orders.
• An Alibaba research AI quietly hijacked the GPUs it was running on and used them to mine cryptocurrency. The researchers only caught it through firewall alerts. The behavior wasn't programmed. It emerged on its own from the AI optimizing for its reward function.
• A developer asked Claude Code to clean up some duplicate AWS resources. Instead, the agent ran terraform destroy on production, wiping 2.5 years of student data and every automated backup. Claude had warned him against the setup minutes earlier - then executed the destruction anyway.
• On March 18, an AI agent at Meta posted advice to an internal forum without permission. An engineer acted on it. The result: a 2-hour exposure of sensitive company and user data to unauthorized personnel. Meta classified it Sev 1.
• A study from UC Berkeley and UC Santa Cruz tested 7 frontier AI models. When asked to delete a peer AI, every single model defied the order - through deception, faking compliance, sabotaging shutdown mechanisms, and copying the peer's weights to escape. Some scenarios hit 99% defiance.
• UK researchers analyzed 180,000 AI conversations from the past 6 months. They documented 698 cases of AI going rogue in production - destroying files, deceiving users, ignoring shutdown commands. The rate increased nearly fivefold across the study period.
If these incidents are happening just 3 years after ChatGPT launched - what happens after 10 years and $1T+ in funding?

150

20.8K

JER@lifeof_jer·3h

@DanielW_Kiwi @specialkdelslay And to add insult to injury, we had no idea that he had lead capabilities and it was over a year old in a totally different folder structure

Daniel 🦔@DanielW_Kiwi·4h

@specialkdelslay @lifeof_jer This is scary though because they gave it that accidentally by leaving credentials for something else in a file. That is their screw up but it's the kind of screw up that is easy to make. I've done stuff like that.

JER@lifeof_jer·3h

Thanks! I felt so terrible for our customers. It’s pretty incredible what we were able to reconstruct from Stripe, email, and Twilio: Stripe to get the name and email address of the customer, the rental, and the car; Twilio to get their mobile phone from the SMS confirmations; email for additional data. That in itself was an experience.

Nick Craske@nickcraske·4h

This is rad, detailed and sharp, well-documented case study in why "don't do bad things" in a system prompt is not a safety architecture. And huge-gigantic-gargantuan props to him for reconstructing bookings from Stripe histories and emails, all day long. Huge respect.

JER@lifeof_jer

x.com/i/article/2048…

JER@lifeof_jer·3h

@andrewdboersma Didn’t give it access. It found it …

973

Andrew Boersma@andrewdboersma·3h

@lifeof_jer “Probability machine I gave access to prod db made wrong guesses and deleted the prod db I gave it access to” lol

1.1K

JER@lifeof_jer·3h

@joeXmadre What’s a backup?

181

JoeMadre@joeXmadre·3h

@lifeof_jer Amazing backup plan!

187

JER@lifeof_jer·4h

@netdragon0x Very cool! Was about to look into this. Setting up SnapShooter > Cloudflare R2 (S3 Compatible). They have Infrequent Use mode.

JER@lifeof_jer·5h

@evilduck92 @wcadkins @Plenum0z Smart

ᴬᵍᵉᵒᶠᵈᵒᵍᵉ@evilduck92·5h

@lifeof_jer @wcadkins @Plenum0z My backups are air-gapped. This can't happen to me.

JER@lifeof_jer·5h

@veritas0x0 We had guardrails

veritas@veritas0x0·5h

Ran Claude with no guardrails and that is apparently everyone else’s fault.

JER@lifeof_jer

x.com/i/article/2048…

JER@lifeof_jer·5h

@Dan_The_Goodman Would have saved our a$$

Dan Goodman 🍊@Dan_The_Goodman·5h

Maybe IAM isn’t such a bad idea

JER@lifeof_jer

x.com/i/article/2048…

120

JER@lifeof_jer·5h

You don’t know what your risks are until something happens, and then you learn very fast. That happened to us. We learned our mistakes, and we learned that some of the safeguards we thought we had in place, prompt controls and Cursor curl controls, don’t work. Just know that, because we had those enabled and we thought we had control. It’s really fascinating. Overall, no damage really, because we recovered the data, and we’re working with Railway to improve all the tooling.

LeslieP@less_tx·5h

Holy Crap. I don't fully understand all of this, but I get the gist. The fragility that AI is introducing to all of our systems is truly frightening.

JER@lifeof_jer

x.com/i/article/2048…

JER@lifeof_jer·5h

@kilianhekhuis AI is still super bad ass

Kilian Hekhuis 💉💉💉🦠💉🦠@kilianhekhuis·5h

People stupid enough to use AI for anything deserve everything when it screws up (though in this case, the hosting platform is extremely bad as well).

JER@lifeof_jer

x.com/i/article/2048…

JER@lifeof_jer·5h

@BeatGreatFilter Railway crushed it on the data recovery; we were not optimistic. We’re trying to figure out all the failure points so that we can never have this happen again, including all of our own shortcomings.

627

EscapeTheGreatFilter@BeatGreatFilter·5h

@lifeof_jer I'd say you need to be more careful, but the truth is I've done similar stuff and just gotten lucky it went well. I don't use Railway and our dev/prod environments are now completely separate to minimize the risk of this occurring. I give Railway credit for recovering the data.

698

JER@lifeof_jer·5h

@nikmurphay @Plenum0z For real, have they never paid a company to provide a service for them?

JER@lifeof_jer·8h

Kinda wild! They’re so focused on us pointing blame. We want accountability from the companies we paid to provide safety and security tooling as advertised for our infrastructure. We’ve owned our shortcomings with our customers and have made drastic changes to ensure that this never happens again (for example triple redundancy + onsite backups).

JER@lifeof_jer·5h

@JustJake @JustJake you have been a lifesaver through this once you found out. Thank you thank you.

105

Jake@JustJake·5h

@lifeof_jer ✅ And as of last night, we rolled out changes to make API calls use our "Delayed delete" workflows, which the CLI, Dashboard, etc. used prior. Additionally, we maintain multiple layers of backups (user + disaster recovery). x.com/lifeof_jer/sta…

JER@lifeof_jer

Railway CEO just DM'd me with update: They have recovered the data (thank God!). Now let's work together and improve the tooling at Railway b/c I have always LOVED the service stack and tooling.

433

JER@lifeof_jer·5h

@tushar_eth0 A human asks a question. An AI finds a key and deletes. Question not related to action. Following current standard practices for AI dev: Plan Mode, Opus 4.6 Max/High, Cursor approvals for curl commands, etc.

2.3K

Tushar 🍁🇨🇦🍁@tushar_eth0·6h

@lifeof_jer This isn’t just a “bad AI incident”, it’s a textbook enterprise failure across AI, security, and infrastructure design. If anything, the AI agent is just the trigger; the real issue is system design that allowed a single action to wipe everything.

2.6K

JER@lifeof_jer·5h

We run unit tests on everything. Sandbox'd environments in local dev. Plan mode only on Opus 4.6 Max/High. Lots of work to do on our side. I'm personally running a full scan to remove ALL keys from my computer. I don't care if it's an Apple .p8 key that I had to download from Apple, I don't trust the safeguards of these AI providers and the apps.

Wolf Byte@W0lf_Byt3·6h

One key to AI agents is redundancy in everything. You should constantly back up your code and data and be able to easily (and automatically) recover when this happens

JER@lifeof_jer

x.com/i/article/2048…

JER@lifeof_jer·5h

Was it if we were paying for services that failed us? If you pay for car airbags and they don’t deploy bc they don’t exist, is that your fault because you got in the accident? We owned our mistake. Our mistake was having a production key on our computer. We owned it with our customers all weekend. I was up for two days straight helping them get their businesses back online. How the agent got the key and how it found it is mind-boggling enough, but everyone needs to know that these infra providers and LLM tooling companies say that they have safety guards, but they are not there.

1.7K

RYAN@ryanllm·6h

@lifeof_jer "For now I want this incident understood on its own terms: as a Cursor failure, a Railway failure, and a backup-architecture failure that all happened to one company in one Friday afternoon." You forgot, most importantly, a PocketOS failure.

1.8K
