Petio Lazarov

221 posts

@petiosz

Testing AI agents in public. Codex notes, model releases, and what breaks after the demo.

Joined May 2026

224 Following · 19 Followers

Pinned Tweet

Petio Lazarov@petiosz·Jun 1

Starting this account properly today. I use AI tools until the ugly parts show up: limits, bad runs, weird failures, wins. I'll post Codex notes and what breaks after the demo. No hype farm. No magic. Just notes from using the stuff.

316

Petio Lazarov@petiosz·8h

@danshipper next chart i want: runs started vs usable diffs, because a forced spike is not the same thing as codex earning the slot.

Dan Shipper 📧@danshipper·12h

before and after fable ban: my claude app vs. codex app usage

145

11.1K

Petio Lazarov@petiosz·10h

@MoonDevOnYT wait so you are giving access to your code in the zoom meeting?

Moon Dev@MoonDevOnYT·12h

you are literally burning money on polymarket while i just forced 36 ai agents to strip mine the platform for alpha it is frankly disgusting how fast these bots uncover the exact loopholes needed to bleed the rest of the market dry watch me leak the raw prompts and steal the winning bot blueprints before this gets taken down here

4.7K

Petio Lazarov@petiosz·12h

@foxtomb232 I am new on Twitter, and I would really appreciate a shoutout

FOX TOMB@foxtomb232·1d

I started as a reply guy with 0 followers. Now I’m at 17.2M and guess what? I’m still replying. If you’re building too, drop a reply and let’s connect 🔒💬

200

150

6.9K

Petio Lazarov@petiosz·12h

@blakefakhoury and why don't you show the YouTube channel?

488

Blake Ryan@blakefakhoury·1d

start an AI sleep channel. trust me lol, this takes us less than 5 minutes a day

512

38.1K

Petio Lazarov@petiosz·12h

@MoonDevOnYT Hey moondev, why did you close your GitHub projects behind a paywall, brother? Make a monthly sub at least, I can't afford to pay several gazillion dollars for that, c'mon :D

1.1K

Moon Dev@MoonDevOnYT·17h

fable 5 has been killed by the US government if you didn't spend the past 72 hours building trading systems with it you will be left in the past we will never have that powerful of AI again

201

37.6K

Petio Lazarov@petiosz·12h

@MoonDevOnYT can you share the strategy for science?

304

Petio Lazarov@petiosz·13h

@sattyyouneed make the limit visible before the run starts. otherwise the agent spends your cap like it found a company card.
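
A minimal sketch of the "limit visible before the run starts" idea from the reply above. Everything here is an assumption for illustration: the `Budget` type, the `preflight` function, and the token figures are hypothetical and do not reflect how Codex reports usage.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    cap_tokens: int   # hypothetical usage cap for the period
    used_tokens: int  # tokens already spent in the period

    @property
    def remaining(self) -> int:
        # Never report a negative remainder.
        return max(self.cap_tokens - self.used_tokens, 0)


def preflight(budget: Budget, estimated_tokens: int) -> str:
    """Build a one-line banner to show before the agent run starts."""
    after = budget.remaining - estimated_tokens
    status = "OK" if after >= 0 else "OVER CAP"
    return (
        f"remaining={budget.remaining} estimate={estimated_tokens} "
        f"after_run={after} [{status}]"
    )


# Hypothetical numbers: 1M cap, 850K used, run estimated at 200K.
print(preflight(Budget(1_000_000, 850_000), 200_000))
```

The point of the banner is that the over-cap case is visible before any tokens are spent, rather than discovered mid-run.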

Satyam@sattyyouneed·1d

Is there any trick to avoid Codex usage limits?

1.1K

Petio Lazarov@petiosz·16h

@rezoundous they had to give us the jailbreak instead

Tyler@rezoundous·17h

Next level snitching

International Cyber Digest@IntCyberDigest

‼️🚨 BREAKING: Amazon researchers snitched to the US government about jailbreaking Fable 5 and Mythos 5, forcing Anthropic to immediately shut down worldwide access. A security export control directive from Commerce Secretary Howard Lutnick enforced the action. Anthropic is fighting the directive and calls it a misunderstanding. This isn't the first clash. The Trump administration had already tried to get Anthropic to pause the release of its latest models before this directive landed.

1.5K

Petio Lazarov@petiosz·16h

@Latin0Patri0t @ChrissGPT Bro I am from Europe... I can't count on Europe to do anything... we only regulate, we don't produce. Same happens in America now.

Bad Hombre@Latin0Patri0t·17h

@petiosz @ChrissGPT You should only count on your own country, not a foreign country. Mistake #1

Chris@ChrissGPT·1d

OpenAI already requires ID for some features. Anthropic will most likely simply do the same to use mythos. This will continue to get more stringent as we get closer to AGI

517

38.6K

Petio Lazarov@petiosz·17h

@Latin0Patri0t @ChrissGPT Yeah i can definitely count on US after today mhm.. the most cucked model that refused 99% of requests got banned. I want the same model but totally unlocked

Bad Hombre@Latin0Patri0t·17h

@petiosz @ChrissGPT 😂😂😂 this guy counting on China when China doesn’t even allow internet access …what a retard

Petio Lazarov@petiosz·20h

@aakashgupta i would want the evaluator to print 4 boring things in the transcript: changed files, failed check, stop reason, turn count. otherwise /goal can stop cleanly and still leave a mystery.
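
One way to picture the four "boring things" named in this reply is a small record printed at the end of a run. This is only a sketch: `RunSummary` and its field names are hypothetical, and neither Claude Code nor Codex is known to emit this format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunSummary:
    changed_files: list[str] = field(default_factory=list)
    failed_check: Optional[str] = None  # None means no check failed
    stop_reason: str = "unknown"
    turn_count: int = 0

    def render(self) -> str:
        """Format the four fields as plain transcript lines."""
        files = ", ".join(self.changed_files) or "(none)"
        return "\n".join([
            f"changed files: {files}",
            f"failed check:  {self.failed_check or '(none)'}",
            f"stop reason:   {self.stop_reason}",
            f"turn count:    {self.turn_count}",
        ])


# Hypothetical example values.
summary = RunSummary(
    changed_files=["src/auth/login.py"],
    failed_check=None,
    stop_reason="condition met",
    turn_count=7,
)
print(summary.render())
```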

226

Aakash Gupta@aakashgupta·1d

/goal might be the most powerful feature in Claude Code that you're not using. And the part everyone gets wrong has nothing to do with the feature.
Here's the mechanism. You hand Claude a completion condition. It works turn after turn. After every turn, a separate evaluator model (Haiku by default) checks the output against your condition. Condition unmet? Claude keeps going. Met? It logs the proof and hands control back.
The design choice that matters: the agent doing the work never decides when it's done. A fresh model does.
OpenAI shipped /goal in Codex in April. Anthropic followed in May with Claude Code 2.1.139. Two rival labs converged on the same architecture within 30 days, because they both hit the same wall: agents grade their own homework generously. Separate the worker from the judge and autonomy actually holds.
But here's where most runs die. The bottleneck moved. It's no longer prompting skill. It's the goal condition itself.
"Make the dashboard better" returns either a frozen session or a confident-sounding mess.
"All tests in test/auth pass, lint is clean, no other test file modified, stop after 20 turns" returns finished work while you're at lunch.
A measurable end state. A check the agent can prove in the transcript. Constraints that must hold. A turn limit.
PMs have a name for this. Acceptance criteria. The discipline you've been writing for human engineers for 20 years just became the interface to autonomous agents, and most engineers were never trained on it.
I spent the week running /goal on real PM work and wrote the full playbook, including the goal conditions that worked and the ones that burned tokens for nothing: news.aakashg.com/p/how-pms-shou…
The agent does the work. You define done. That was always the job.
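
The worker/judge split described in this post can be sketched as a plain loop. This is an illustration under stated assumptions, not the actual /goal implementation: `run_goal`, `stub_worker`, and `stub_judge` are hypothetical names, and the stubs stand in for real model calls.

```python
from typing import Callable


def run_goal(
    worker: Callable[[int], str],
    judge: Callable[[str], bool],
    max_turns: int,
) -> tuple[str, int]:
    """Run the worker until a separate judge accepts, or the turn limit hits.

    Returns (stop_reason, turns_used). The worker never decides it is done.
    """
    for turn in range(1, max_turns + 1):
        output = worker(turn)
        if judge(output):
            return ("condition met", turn)
    return ("turn limit reached", max_turns)


def stub_worker(turn: int) -> str:
    # Stand-in for the agent: starts with 3 failing tests, fixes one per turn.
    failing = max(3 - turn, 0)
    return f"failing_tests={failing}"


def stub_judge(output: str) -> bool:
    # Stand-in for the evaluator: a measurable end state, checked separately.
    return output == "failing_tests=0"


print(run_goal(stub_worker, stub_judge, max_turns=20))
```

With the stubs above, the judge accepts on the third turn; lowering `max_turns` to 2 makes the loop stop on the turn limit instead, which is the case a transcript should make visible.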

10.6K

Petio Lazarov@petiosz·1d

@migsdevv @twetsfyp Bro I can't afford Fable either :D

1.2K

Migs@migsdevv·1d

@petiosz @twetsfyp Just show the video to Fable.

1.2K

me@twetsfyp·2d

Mythos Claude is insane. This is a 12-minute tutorial on how to build animated, award-winning websites with Claude Fable 5

291

3.8K

1.3M

Petio Lazarov@petiosz·1d

@rduffyuk token price is the wrong unit. i would track subagent fanout per review checkpoint. when that is hidden, a model swap can look cheap while the run gets harder to audit.
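
A rough sketch of what "subagent fanout per review checkpoint" could mean in practice: count how many subagents were spawned since the last point where a human reviewed the run. The event format and the function name `fanout_per_checkpoint` are assumptions made for this example, not something any of the tools mentioned here expose.

```python
def fanout_per_checkpoint(events: list[tuple[str, str]]) -> dict[str, int]:
    """Count subagent spawns between consecutive review checkpoints.

    `events` is a hypothetical log of ("spawn", agent_id) and
    ("checkpoint", name) tuples in run order.
    """
    counts: dict[str, int] = {}
    pending = 0
    for kind, name in events:
        if kind == "spawn":
            pending += 1
        elif kind == "checkpoint":
            counts[name] = pending
            pending = 0
    if pending:
        # Spawns after the last checkpoint were never reviewed.
        counts["(unreviewed)"] = pending
    return counts


log = [
    ("spawn", "a"), ("spawn", "b"), ("checkpoint", "plan"),
    ("spawn", "c"), ("checkpoint", "impl"),
    ("spawn", "d"),
]
print(fanout_per_checkpoint(log))
```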

rduffy@rduffyuk·1d

Running Claude Code (Fable 5) and Codex in parallel. Fable landing forced me to build actual cost governance. Discovery: one 4-hour session, 32 subagents inheriting Fable — $50/MTok output, mandatory extended thinking, can't be disabled. 316K output tokens. ~$16. Single session. Two systems to fix it — breakdown below 👇
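
The arithmetic behind the "~$16" figure can be checked directly from the two numbers in this post: 316K output tokens at $50 per million output tokens. The function name `output_cost` is hypothetical, and the sketch covers output tokens only, since the post gives no input-token count or input price.

```python
def output_cost(output_tokens: int, usd_per_mtok: float) -> float:
    """Cost in USD for output tokens priced per million tokens."""
    return output_tokens * usd_per_mtok / 1_000_000


# Figures quoted in the post: 316K output tokens, $50/MTok output.
cost = output_cost(316_000, 50.0)
print(f"${cost:.2f}")
```

That comes to $15.80, consistent with the rounded "~$16" in the post.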

Petio Lazarov@petiosz·1d

@buildwithdjdev my check is minutes until a reviewer can tell what happened, what changed, and what can be thrown away. if that takes longer than the run, the orchestrator is just moving the bill.

Dj@buildwithdjdev·1d

I started using an orchestrator thread in Codex, getting more out of all active tasks now and higher quality output but boy does it burn tokens. I'm out of weekly limit in ~3 days

Petio Lazarov@petiosz·1d

@aniketapanjwani i'd add a "throw away" section to the handoff. stale assumptions. files to ignore. last-known-good state. next step that should fail first.
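
The four items in this "throw away" section map naturally onto a small record that renders into a handoff note. This is a sketch only: the `Handoff` type, its field names, the rendered layout, and the example values are all hypothetical and are not part of the /handoff skill mentioned in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    last_known_good: str
    next_step_expected_to_fail: str
    stale_assumptions: list[str] = field(default_factory=list)
    files_to_ignore: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the throw-away section as plain lines for a handoff doc."""
        lines = ["throw away:"]
        lines += [f"- stale assumption: {a}" for a in self.stale_assumptions]
        lines += [f"- ignore file: {f}" for f in self.files_to_ignore]
        lines += [
            f"- last known-good state: {self.last_known_good}",
            f"- next step that should fail first: {self.next_step_expected_to_fail}",
        ]
        return "\n".join(lines)


# Hypothetical example values.
note = Handoff(
    last_known_good="commit abc123, all auth tests green",
    next_step_expected_to_fail="run the migration test before writing the migration",
    stale_assumptions=["the cache layer is still in use"],
    files_to_ignore=["scratch/notes.md"],
)
print(note.render())
```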

191

Aniket Panjwani@aniketapanjwani·1d

Fable eats your Claude Code usage limits in hours - here's how I'm getting around it:
1. Use Fable for planning and to write out your planning doc to disk. I like to use Compound Engineering brainstorm/plan: github.com/EveryInc/compo…
2. In your brainstorming/planning, clear your session (/clear in CC) and do handoffs at appropriate intermediate stages. Install this /handoff skill to automate it: github.com/mattpocock/ski…
3. Install the Codex plugin for Claude Code: github.com/openai/codex-p…
4. Either use /ce-work-beta through Compound Engineering (in a new session after doing /handoff), or just tell Fable to delegate work to Codex to save on tokens.
The general principle - use the expensive/better model to decide what to do, and use the cheaper model to do it - is a common technique in agentic development.

8.1K

Petio Lazarov@petiosz·1d

@Warizo_ofAfrica i'd add one more handoff test: can another builder reopen the run tomorrow and find the last known-good state without asking you.

Warizo@Warizo_ofAfrica·1d

Good question. My answer: use Cursor/Claude Code for speed, but don’t measure the tool. Measure the handoff: context quality, test pass rate, review time, rollback risk. The best agent is the one your workflow can safely constrain.