Ryan Craven

2.6K posts

@ryan_tech_lab

Tech Educator | AI Enthusiast | Software Testing Expert 🛠️ Sharing Insights on Tech, AI & Software Testing | Productivity Hacks 🚀 | Software & Product Reviews

Raleigh-Durham, NC Joined November 2024
345 Following 257 Followers
Pinned Tweet
Ryan Craven
Ryan Craven@ryan_tech_lab·
Cursor just rewrote 6 files. You asked it to add a button. Claude wrote 400 lines. The bug was on line 3. You lost 2 hours because you forgot to commit. You've started building an AI agent 4 times. Still no agent. 37 files. Drop in, fill the blanks, go. Vibe Coding OS - $29
English
4
0
13
1.9K
Ryan Craven
Ryan Craven@ryan_tech_lab·
Can linting actually replace unit testing? Modern linters catch types, null refs, security issues, dead code, performance problems… in milliseconds. So… do we still need all those test suites? Or is this developer heresy? Change my mind 👇
English
1
0
2
87
Ryan Craven
Ryan Craven@ryan_tech_lab·
@mutiemule the 'waste of time' feeling is the unlock. I came at it from QA — spent years running tests on code others wrote from scratch. now I spend that time actually validating what matters instead of generating what already exists.
English
1
0
1
22
Mutie Mule
Mutie Mule@mutiemule·
As a retired software engineer who has started coding again, I can barely function without Claude. Writing a feature from scratch just feels like a waste of my time. Crazy that we made software this way. Respect to pre-AI engineers. #claudeai
English
2
0
2
210
Ryan Craven
Ryan Craven@ryan_tech_lab·
@shiri_shh the million dollars was never the missing ingredient. I've seen people vibe code solid apps in a weekend. they stall at 'now what' — no one to sell to, no distribution, no idea how to talk to customers. the code got faster. the business fundamentals didn't.
English
0
0
0
1
shirish
shirish@shiri_shh·
vibe coding makes people think that...they’re just one prompt away from a million-dollar startup.
English
246
23
667
24K
Ryan Craven
Ryan Craven@ryan_tech_lab·
@therealdanvega yes to this. as a QA lead I treat agent output like a function: same input, same output, every time. my eval setup is input/expected pairs plus a judge prompt. simple, but it catches regressions when the model updates.
English
0
0
3
140
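The eval setup described in the reply above (input/expected pairs plus a judge) could be sketched roughly as follows. This is a minimal illustration, not the author's actual harness: `EvalCase`, `run_evals`, `exact_judge`, and `fake_agent` are hypothetical names, and the judge here is a plain string comparison standing in for a model call with a judge prompt.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class EvalCase:
    prompt: str    # input sent to the agent
    expected: str  # reference answer the judge compares against


def exact_judge(expected: str, actual: str) -> bool:
    """Stand-in judge: case- and whitespace-insensitive match.

    Assumption: a real setup would send a judge prompt to a model here.
    """
    return expected.strip().lower() == actual.strip().lower()


def run_evals(
    agent: Callable[[str], str],
    cases: List[EvalCase],
    judge: Callable[[str, str], bool] = exact_judge,
) -> List[Tuple[EvalCase, str]]:
    """Run every case through the agent and return the ones the judge rejects."""
    failures = []
    for case in cases:
        actual = agent(case.prompt)
        if not judge(case.expected, actual):
            failures.append((case, actual))
    return failures


def fake_agent(prompt: str) -> str:
    """Hypothetical agent stub with canned answers, used only for the demo."""
    answers = {"2+2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "unknown")


CASES = [
    EvalCase("2+2", "4"),
    EvalCase("capital of France", "paris"),
    EvalCase("largest planet", "Jupiter"),
]

failures = run_evals(fake_agent, CASES)
for case, actual in failures:
    print(f"REGRESSION: {case.prompt!r} expected {case.expected!r}, got {actual!r}")
```

Re-running the same cases after a model update and comparing the failure list is what would surface a regression in this sketch.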
Dan Vega
Dan Vega@therealdanvega·
We all agree you shouldn't ship code without tests. So why are you shipping AI agents without evals? If you're writing evals, how are you doing it? Drop your setup below.
English
2
1
37
2K
Ryan Craven
Ryan Craven@ryan_tech_lab·
@IAmVivianCai and the teams that don't run it will just assume the current price floor is permanent. it never is.
English
0
0
0
9
Ryan Craven
Ryan Craven@ryan_tech_lab·
$10/mo. 8 hours of autonomous agents. 1,700 steps per session. Most teams are still paying $100+/mo in API costs to run agents that do 20. GLM 5.1 didn't just move the benchmark. It moved the price floor. The teams that do the math first are going to look like geniuses in 6 months.
English
1
0
2
111
Ryan Craven
Ryan Craven@ryan_tech_lab·
@GG_Observatory it can't. regulation moves at legislative speed, which is years behind product cycles. the only thing that keeps pace is liability — when something breaks and someone gets sued, behavior changes faster than any framework ever written.
English
1
0
0
8
GG 🦾
GG 🦾@GG_Observatory·
This is the real insight. Every regulatory framework is essentially a taxonomy of harms that already happened. By the time you've named a category, the technology has moved into the adjacent unnamed space. The question isn't how to regulate vibe coding - it's whether regulatory imagination can ever move at software speed.
English
1
0
0
16
Ryan Craven
Ryan Craven@ryan_tech_lab·
Apple's App Store saw 84% more submissions in Q1 2026. Then they banned Replit, Vibecode, and Anything. These are the same story. Vibe coding tools made iOS app submission trivial. Apple got overwhelmed. Their fix: ban the tools that caused it. They banned the growth they created.
English
2
0
1
47
Ryan Craven
Ryan Craven@ryan_tech_lab·
@aryeh @om_patel5 neither, honestly. the fix is clear rules upfront: errors only, warnings are noise. I put it in the system prompt and it stopped the spiral entirely.
English
0
0
0
21
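One way to enforce the "errors only, warnings are noise" rule outside the system prompt is to filter build output before the agent ever sees it. The sketch below assumes diagnostics arrive as `file(line,col): severity CODE: message` lines; the regex, the `errors_only` helper, and the sample lines (including the `W001` warning code) are illustrative assumptions, not output from a specific tool.

```python
import re
from typing import Dict, List

# Assumed diagnostic shape: "path(line,col): severity CODE: message".
DIAG_RE = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+),(?P<col>\d+)\): "
    r"(?P<severity>error|warning) (?P<code>\w+): (?P<message>.*)$"
)


def errors_only(output: str) -> List[Dict[str, str]]:
    """Keep diagnostics marked as errors; drop warnings and unparseable lines."""
    errors = []
    for raw in output.splitlines():
        match = DIAG_RE.match(raw.strip())
        if match and match.group("severity") == "error":
            errors.append(match.groupdict())
    return errors


# Made-up sample output, for illustration only.
SAMPLE = """\
src/app.ts(12,5): error TS2322: Type 'string' is not assignable to type 'number'.
src/app.ts(30,1): warning W001: Unused variable 'tmp'.
src/button.tsx(4,10): error TS2304: Cannot find name 'onClick'.
"""

found = errors_only(SAMPLE)
for diag in found:
    print(f"{diag['file']}:{diag['line']} {diag['code']} {diag['message']}")
```

Feeding the agent only the filtered list, and saving warnings for a separate pass at the end of the session, keeps it from chasing them mid-task.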
Aryeh
Aryeh@aryeh·
@ryan_tech_lab @om_patel5 Which is worse: wrecking your app and breaking/losing half the features because the AI is chasing down TypeScript *WARNINGS*, or allowing it to focus on “debugging” warnings? Damned if you do, damned if you don’t.
English
1
0
0
19
Om Patel
Om Patel@om_patel5·
THIS GUY GOT TIRED OF MANAGING AI AGENTS THROUGH TERMINALS AND DASHBOARDS SO HE BUILT THEM AN RPG WORLD. 5 agents and each one has a pixel character, a station, and they actually walk around the space. when enough unresolved issues pile up, the agents walk to a meeting point and hold a council session. four different models debating what to do next, not scripted. each one reads the live system state independently. in one session an agent pushed for cold outreach to close leads at 2am. another one said that's a terrible look for an autonomous system contacting strangers while the operator sleeps. they ended up pivoting to an inbound strategy that none of them originally proposed. single HTML file, node bridge, and phaser. runs on a Mac Mini. instead of reading logs and checking dashboards you just watch your little pixel agents walk around and talk to each other. this is the most creative way i've seen anyone manage AI agents so far
English
313
738
7.7K
650.7K
Ryan Craven
Ryan Craven@ryan_tech_lab·
orchestration wasn't your moat. it was your timeline. the startups that just got crushed were selling months of infrastructure work. Managed Agents collapsed that to an afternoon. the ones who survive built vertical depth: domain-specific data loops, proprietary workflows, and customers who can't replicate what they know. code that runs is a commodity. context that matters is not.
English
2
0
1
46
Ryan Craven
Ryan Craven@ryan_tech_lab·
@NathanielC85523 @PawelHuryn neither. the fix is not letting the AI own the decision. I set strict rules in my system prompt: ignore warnings, focus only on errors that break the build. then I run a separate check pass at the end. keeps it from chasing squiggles mid-session.
English
0
0
0
22
Nathaniel Cruz
Nathaniel Cruz@NathanielC85523·
@ryan_tech_lab @PawelHuryn tracking it manually because the bar gives you nothing. DM a screenshot of your worst session, we run free 15-min cost breakdowns.
English
1
0
0
8
Paweł Huryn
Paweł Huryn@PawelHuryn·
Claude Code doesn't show you how many tokens you're using for subscriptions. No breakdown by model. No breakdown by project. Just a progress bar that says "63% used."
So I built a local dashboard that reads the files Claude Code already writes to your machine. Turns out every session, every turn, every token is logged to ~/.claude/projects/ in JSONL files. Input tokens, output tokens, cache reads, cache creation, model name, timestamp. It's all there. You just can't see it.
My numbers over the last 30 days: 440 sessions. 18,000 turns. $1,588 in API-equivalent costs. On one day, the cache spiked to 700M tokens - visible cache bug, two days in a row.
The dashboard scans those local files, builds a SQLite database, and serves charts on localhost:8080. Filter by model (Opus, Sonnet, Haiku). Filter by time range (7d, 30d, 90d, all time). Cost estimates based on current Anthropic API pricing. Works retroactively. First run processes your entire Claude Code history.
Install:
git clone github.com/phuryn/claude-…
cd claude-usage
python3 cli.py dashboard
Windows: use python instead of python3.
Zero dependencies. Python standard library only. Open source, MIT. Star it. Fork it. Make it your own.
English
127
219
2.3K
294.1K
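The core idea in the post above, scanning local JSONL logs and totaling token counts per model, can be sketched with the standard library alone. This is not the linked project's code. The record layout is an assumption: each line is taken to be a JSON object with a `message` holding `model` and a `usage` dict keyed by the field names in `TOKEN_FIELDS`. Adjust these to whatever the files on your machine actually contain.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable

# Assumed usage keys; verify against your own log files.
TOKEN_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
)


def sum_usage(lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Total token counts per model, skipping lines that don't fit the assumed shape."""
    totals = defaultdict(lambda: dict.fromkeys(TOKEN_FIELDS, 0))
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial or corrupt lines
        message = record.get("message") if isinstance(record, dict) else None
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue
        model = message.get("model", "unknown")
        for field in TOKEN_FIELDS:
            value = usage.get(field, 0)
            if isinstance(value, int):
                totals[model][field] += value
    return {model: dict(counts) for model, counts in totals.items()}


def scan_projects(root: Path) -> Dict[str, Dict[str, int]]:
    """Walk every .jsonl file under root (e.g. Path.home() / ".claude" / "projects")."""
    def all_lines():
        for path in sorted(root.rglob("*.jsonl")):
            with path.open(encoding="utf-8", errors="replace") as handle:
                yield from handle
    return sum_usage(all_lines())


# Made-up sample records, for illustration only.
SAMPLE = [
    '{"message": {"model": "model-a", "usage": {"input_tokens": 10, "output_tokens": 5}}}',
    '{"message": {"model": "model-a", "usage": {"input_tokens": 3, "output_tokens": 2, "cache_read_input_tokens": 100}}}',
    "not json",
    '{"type": "summary"}',
]

totals = sum_usage(SAMPLE)
print(totals)
```

Multiplying each total by a per-token price would give the kind of API-equivalent cost estimate the post mentions. Prices are left out here because they change.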
Ryan Craven
Ryan Craven@ryan_tech_lab·
@GG_Observatory legal taxonomy can't keep up with capabilities that don't fit any existing category. the box was obsolete before it was drawn.
English
1
0
0
3
GG 🦾
GG 🦾@GG_Observatory·
"Writing rules for the last war" is exactly the failure mode. The trap is that Apple can only regulate what they can classify, and the next generation of tools will be ambiguous by design — neither a coding tool nor an app store product, just a capability that makes the category irrelevant. By the time legal draws a box around it, the box is already wrong.
English
1
0
0
16
Ryan Craven
Ryan Craven@ryan_tech_lab·
@quantimleap100 the user knows the difference before you finish the sentence. that's the tell.
English
0
0
0
7
Olumide
Olumide@quantimleap100·
@ryan_tech_lab Exactly. You can't fake jurisdiction knowledge. Either you know how informal payment rails work in Lagos or you're guessing and the user knows the difference the moment the contract doesn't match their reality.
English
1
0
1
8
Ryan Craven
Ryan Craven@ryan_tech_lab·
@GG_Observatory as a QA lead: my move is to treat every vibe-coded project like it'll break, because it will. structured commit messages as your trace log, reproducible prompts in comments, and a test suite before you ship. the observability has to be built in, not bolted on later.
English
0
0
0
10
GG 🦾
GG 🦾@GG_Observatory·
The real cost of vibe coding isn't writing the code. It's what happens when something breaks in production and you have zero observability — no logs, no traces, no idea what the agent actually did. You can't trace it, you have to rebuild it. What's your move when the vibe-coded project breaks?
English
5
0
1
80
Ryan Craven
Ryan Craven@ryan_tech_lab·
@quantimleap100 "domain knowledge time" is exactly the right framing. the WhatsApp thread contracts example is perfect — that's the context AI can't manufacture. you either lived it or you're guessing.
English
1
0
0
11
Olumide
Olumide@quantimleap100·
Vertical depth is the only moat that compounds. I'm building in legal/compliance infrastructure, not because it's a feature, but because understanding how contracts are enforced in Lagos vs London vs Chicago, how informal payment rails work outside the card network, and what "scope creep" actually means when the agreement lived in a WhatsApp thread, that took months of real conversations. Managed Agents collapses infrastructure time. It doesn't collapse domain knowledge time. The builders who survive this wave won't be the ones who ship fastest. They'll be the ones who understood the problem deeply enough that the AI output is actually correct.
English
1
0
1
7
Ryan Craven
Ryan Craven@ryan_tech_lab·
the 'full execution tracing built in' line is the one that breaks through for QA. I've lost hours debugging agent runs that vanished when the process died. replayable execution traces change how you QA agents entirely. this is infrastructure that actually respects the debugging workflow.
English
0
0
0
75
Ryan Craven
Ryan Craven@ryan_tech_lab·
@VadimStrizheus not 1,000 startups. 1,000 wrappers. the startups building verticalized agents with domain-specific workflows and proprietary data loops are fine. the ones who bet the company on 'we handle the orchestration' just ran out of moat.
English
0
0
3
529
Ryan Craven
Ryan Craven@ryan_tech_lab·
@dsp_ the sandboxed execution is the part that changes QA for me. managed state means you can actually replay a failing agent run, something that's nearly impossible when state lives across 5 different systems you cobbled together.
English
0
0
0
47
David Soria Parra
David Soria Parra@dsp_·
This is actually huge: It should be simple to build and deploy long running agents, and add its behaviour to your application and organisation. We are launching Claude Managed Agents to let you do exactly that. I can't wait to see what all of you are going to build with it.
Claude@claudeai

Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.

English
4
5
44
3.9K
Ryan Craven
Ryan Craven@ryan_tech_lab·
@krishnapro_ @intellijidea @cursor_ai cursor. the agent-native interface wins because the cognitive model is different. IntelliJ with AI bolted on still expects you to think in files. Cursor expects you to think in intent. Once you switch, going back feels like writing raw SQL after using an ORM.
English
0
0
0
9
Krishna Kumar
Krishna Kumar@krishnapro_·
🥊 The 2026 IDE War: Natively integrated Agents (@intellijidea 2026.1) vs. AI-Native IDEs (@cursor_ai v3) Is the future a tool we've known for decades with "AI superpowers," or a completely new interface built around the agent? Drop your thought below! 👇
English
1
1
0
36
Ryan Craven
Ryan Craven@ryan_tech_lab·
@ashmaurya the experiment framing is right. I'm a QA lead turned builder and the failure I see is nobody writes a falsifiable hypothesis before hitting run. they ship first, discover product-market fit problems last.
English
0
0
1
9
Ash Maurya
Ash Maurya@ashmaurya·
The vibe coding crisis isn't about code quality. Everyone's debating bugs, security flaws, "worst software crisis" headlines. The real crisis: non-technical founders can now ship bad ideas at unprecedented speed. Faster failure is still failure. The fix isn't better AI. It's better experiments.
English
5
0
3
348