Pinned Tweet
aniwasbored
70 posts

aniwasbored
@ani3am
Build first, ask later. Shipping things, breaking new tech, reading crypto charts. Currently: Shopify tools. Next: who knows.
San Francisco · Joined October 2024
27 Following · 8 Followers

@CoinbaseDev mcp gets more interesting when it stops being a demo surface and starts touching boring systems:
wallets
payments
admin tools
logs
that’s where the actual workflows live.

🚀 It’s been a BIG week for updates and x402. Here’s what shipped:
System Update:
→ Our new developer portal brings wallets, payments, trading, and stablecoin issuance into one place, now accessible to AI agents through our CLI and MCP.
→ x402 and Coinbase Wallets are built into @awscloud’s AgentCore and CloudFront, enabling agent networks to spend and earn.
→ Coinbase for Agents is here. Connect tools like Claude or Codex to get started.
→ Bonus: @MurrLincoln shared an under-the-hood look at how Coinbase for Agents was designed.
→ Coinbase Payments helps fix slow traditional settlement by combining @base, USDC, global wallets, and APIs, ready to embed into your business.
→ Coinbase Payment APIs are now agentic-enabled out of the box.
x402:
→ x402 has processed 185M+ transactions over the past year.
→ @awscloud enabled AI traffic monetization across Amazon CloudFront and WAF, powered by x402 and Coinbase’s x402 Facilitator.
→ @s0mmertime built an x402-powered retro browser game where an AI agent discovers native internet payments.
Partnerships:
→ @OneRewarded launched ONED USD, created by Gennius XYZ, a stablecoin built to unify rewards, payments, and digital assets across global banking partners, powered by Coinbase Custom Stablecoins and fully backed by USDC.
→ Coinbase Onramp now supports Google Pay, with @moonshot integrating it for their users this week.

@ghosttyped v0 for shopify themes is exactly the kind of weird specific tool that should exist now.
liquid/theme editing has always needed a tighter preview loop.

A few weeks ago I built "v0 for shopify themes" for editing/tweaking/building .liquid themes in the browser with live preview
The fun part: Shopify's theme renderer is closed source and server-side only
I needed this to run client-side, so I pointed Codex 5.4 in cursor cloud agents at it for a week and it built a fully client-side theme renderer from scratch
Demo video below 👇

@ChristianSam93 this is the app replacement thesis in the wild.
not replacing the whole company.
replacing the 12 tiny saas tabs that each do one repeatable thing.

@upscaleaiHQ this is the right hackathon format imo.
not “come get inspired by ai.”
more like leave with one workflow that saves you time next week.

Shopify brands: come spend a day vibe coding with AI. ⚡
We’re hosting an AI Shopify Hackathon for founders and operators who want to move past “AI inspiration” and leave with a real workflow they can use in their store.
Whether you’re just getting started or ready to ship a working tool, this day is built to help you get hands-on.
Agenda:
⏰ 10:00 AM to 12:00 PM
AI Open Office Hours
Open to everyone. Bring your questions and leave knowing how to actually use AI tools. No experience required.
⏰ 12:00 PM to 4:00 PM
AI Shopify Hackathon
An invite-only build session for up to 40 approved Shopify brands. You’ll scope a real automation, connect Claude to your store, and leave with something that works.
Bring your laptop, a rough idea of what you want to build, and Claude Code or Codex if you have it. We’ll have monitors on site to plug into.
Kevin Weatherman, co-founder of Upscale, will be there to help brands build.
Can’t make it? RSVP anyway and we’ll send you the post-event playbook.
⬇️ RSVP using the link in comments!


@stretchcloud shopify ai feels structural because commerce ops are already scattered everywhere.
catalog, theme, analytics, support, discounts.
agents only get useful when they can cross those surfaces.

The Shopify AI story is easy to mistake for another "connect ChatGPT to your app" feature. I think it is more structural than that.
Commerce work has always been spread across surfaces: admin dashboards, analytics, product catalogs, inventory tools, theme code, customer support, ad platforms, and spreadsheets. Merchants do not experience that as "software architecture." They experience it as a long list of small operational tasks.
Shopify's current direction is to let those tasks move into the AI tools where operators and builders already spend time. The public page says merchants can build and run stores from tools like ChatGPT, Claude, Perplexity, Lovable, Replit, Manus, v0, Cursor, Gemini CLI, VS Code, and Codex CLI. The developer docs describe the AI Toolkit as a way to connect AI tools to Shopify docs, API schemas, code validation, store execution, skills, and MCP.
That matters because the AI agent is not replacing the commerce platform. It is becoming an interface layer on top of it.
This is the same pattern that played out in cloud and payments. The durable business was not the best chatbot or the prettiest admin page. It was the system of record with APIs, permissions, compliance, workflows, and ecosystem gravity. The interface changed repeatedly; the operational substrate remained valuable.
Shopify is positioning the store as that substrate. Let a founder vibe-code a storefront in one tool, ask inventory questions in another, extend the app from a CLI, and still keep checkout, payments, taxes, products, and orders anchored in Shopify.
The hidden bottleneck is trust and permissioning. A merchant may happily ask an agent to analyze sales. They will be much more careful when that agent can change prices, publish products, refund orders, or edit a theme. The winning UX is not "agent can do anything." It is scoped, auditable delegation.
My read: agent-native commerce will not be one assistant. It will be a permissioned operating layer where many AI tools can touch the same reliable commerce core without breaking the business.
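A minimal sketch of what "scoped, auditable delegation" could look like in code. The scope names (`read:sales`, `write:prices`), the `Grant` and `Delegation` classes, and the agent name are all hypothetical illustrations, not Shopify's access scopes or API. The idea shown: every attempted action is checked against an explicit grant and logged whether or not it is allowed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Grant:
    """What one agent is allowed to do (hypothetical scope strings)."""
    agent: str
    scopes: frozenset


class Delegation:
    """Checks each action against a grant and records the attempt."""

    def __init__(self, grants):
        self._grants = {g.agent: g for g in grants}
        self.audit_log = []

    def attempt(self, agent, scope, detail=""):
        grant = self._grants.get(agent)
        allowed = grant is not None and scope in grant.scopes
        # Log denied attempts too: the trail matters as much as the gate.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "scope": scope,
            "detail": detail,
            "allowed": allowed,
        })
        return allowed


if __name__ == "__main__":
    d = Delegation([Grant("analyst-agent", frozenset({"read:sales"}))])
    print(d.attempt("analyst-agent", "read:sales"))        # analysis is granted
    print(d.attempt("analyst-agent", "write:prices"))      # price changes are not
    print(len(d.audit_log))                                # both attempts are logged
```

The design choice here is deny-by-default: an agent with no grant, or a scope outside its grant, is refused, and the refusal itself becomes part of the audit trail.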
x.com/Shopify/status…
Shopify @Shopify
Run your store everywhere, including your favorite agent

@tamir_eden @Shopify merchant-facing agent analytics feels important.
if agents are going to shop stores, stores need to see where the agent got confused.
not just whether checkout happened.

Let the agents shop.
@Shopify's editions is all about agentic commerce - so we're launching the AgentIQ challenge.
AgentIQ gives merchants a live feed of what agents understand, where they get stuck, and how to improve.
An operating layer for agentic commerce.
Run an agentic report for any store:
agentiq.report
Then post your score or repost below 👇
We’ll reply with the top 3 things agents don’t understand about your brand — plus the Claude prompt and Shopify Admin Skills needed to fix them.
Not just what’s broken.
Exactly what to fix. Exactly how to fix it.
And prompt your @claudeai / codex operator what to do next.
An operating layer for agentic commerce.
Free to use.
Open source.
Built to fix.

@fedesarquis frontend testing is where ai still needs a harness.
playwright alone is not enough.
the agent needs screenshots, assertions, and a clear definition of “this looks broken.”

@hndx74 “copy primitives, not hype” is the best agent advice.
traces, evals, routing, browser control, memory.
less magic. more plumbing.

@Gofralo small bugs are where ai testing should shine.
not replacing qa entirely.
just hammering the dumb flows nobody wants to click through every release.

Most SaaS companies are leaking money through bugs nobody is testing.
Not huge bugs.
Small, stupid, expensive bugs.
A broken signup step.
A checkout button that fails on mobile.
A pricing page CTA that goes nowhere.
A password reset flow that silently dies.
A form that works in Chrome but breaks in Safari.
The founder thinks the product is fine.
The customer just leaves.
That’s the opportunity.
With Claude Code + Playwright, you can turn AI into a QA operator that actually opens the site, clicks through flows, takes screenshots, records failures, writes bug reports, and generates test scripts.
Not “AI will build your startup.”
Something much more useful:
AI can behave like a very fast junior QA tester.
Here’s the offer I’d sell:
“I’ll run an AI-assisted QA audit on your signup, onboarding, checkout, and core product flows. You’ll get a report with screenshots, reproduction steps, severity, and fixes ranked by revenue impact.”
That is 10x easier to sell than “AI automation consulting.”
Because every SaaS founder understands one thing:
If users can’t sign up, pay, or activate, the business is bleeding.
The stack is simple:
Claude Code
Playwright
Browser screenshots
Test scripts
Bug report template
Loom walkthrough
Linear / Jira export
The first audit can be done in 1-2 days.
Pick 5 flows:
homepage → signup
signup → onboarding
onboarding → activation
pricing → checkout
account → billing
Then test them across desktop, mobile viewport, slow network, invalid inputs, expired sessions, and edge cases.
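As a rough sketch, the flows and conditions above can be expanded into an explicit test matrix so nothing gets skipped. The function name `build_matrix` and the record fields are assumptions for illustration; "edge cases" is left out because it is open-ended and would need to be listed per product.

```python
from itertools import product

# The five flows named in the post.
FLOWS = [
    "homepage -> signup",
    "signup -> onboarding",
    "onboarding -> activation",
    "pricing -> checkout",
    "account -> billing",
]

# The concrete conditions named in the post.
CONDITIONS = [
    "desktop",
    "mobile viewport",
    "slow network",
    "invalid inputs",
    "expired sessions",
]


def build_matrix(flows=FLOWS, conditions=CONDITIONS):
    """Return one pending test case per (flow, condition) pair."""
    return [
        {"flow": flow, "condition": condition, "status": "pending"}
        for flow, condition in product(flows, conditions)
    ]


if __name__ == "__main__":
    matrix = build_matrix()
    print(len(matrix))  # 5 flows x 5 conditions = 25 cases
    print(matrix[0])
```

Each record can then be handed to whatever actually drives the browser, with the status, screenshot path, and reproduction steps filled in as cases run.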
The output is not “I found some bugs.”
The output is:
12 issues found
3 revenue-critical
5 medium priority
4 UX/friction issues
screenshots included
reproduction steps included
estimated fix effort included
Pricing:
$500 for a small landing/signup audit.
$1,500 for a full SaaS flow audit.
$2,500+ if you include automated regression tests they can keep running.
The best buyer is not a big enterprise.
It’s a small SaaS doing $10k-$100k MRR where the founder still cares about every lost trial.
The pitch is simple:
“I’ll test your product like 50 impatient users before they do.”
That line is better than:
“I use AI agents for browser automation.”
Nobody buys agents.
They buy fewer failed signups, fewer support tickets, fewer angry users, and fewer silent drop-offs.
The smart way to start:
Find 20 small SaaS products.
Record a 3-minute Loom showing 1 real issue.
Send it to the founder.
Offer the full audit for $500.
No fake case study needed.
The bug is the case study.
AI won’t make money because it’s impressive.
AI makes money when it gives you leverage on a painful service people already understand.
QA is boring.
Broken revenue flows are not.


@oneshotAIagent stall guards are such an unsexy but necessary agent feature.
without them, “running” can just mean “quietly burning the budget.”

a process with no stall guard will use its entire ceiling. every time.
our browser agent sessions could wedge — frozen status, no new steps, no cost moving — and still sit there marked "running" for the full 30-minute cap. the timeout wasn't a safety net. it was the worst case, scheduled.
so we stopped trusting the deadline and started watching progress: no movement in 180 seconds, kill it. a limit you only reach at the maximum isn't protecting you. it's just the most expensive way to fail.
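A minimal sketch of a progress-based stall guard, assuming a session can be polled for some progress marker such as (status, step count, cost). The 180-second window and 30-minute cap come from the post; the `StallGuard` class, its method names, and the marker shape are hypothetical, not the author's implementation.

```python
import time

STALL_SECONDS = 180          # from the post: no movement in 180 seconds
HARD_CAP_SECONDS = 30 * 60   # from the post: the 30-minute cap


class StallGuard:
    """Tracks the last time a session's progress marker changed."""

    def __init__(self, stall_after=STALL_SECONDS, hard_cap=HARD_CAP_SECONDS,
                 clock=time.monotonic):
        self.stall_after = stall_after
        self.hard_cap = hard_cap
        self.clock = clock
        self.started_at = clock()
        self.last_progress_at = self.started_at
        self.last_marker = None

    def observe(self, marker):
        """Record a marker, e.g. (status, step_count, cost)."""
        if marker != self.last_marker:
            # Only a *changed* marker counts as progress.
            self.last_marker = marker
            self.last_progress_at = self.clock()

    def verdict(self):
        """Return 'stalled', 'timed_out', or 'running'."""
        now = self.clock()
        if now - self.last_progress_at >= self.stall_after:
            return "stalled"
        if now - self.started_at >= self.hard_cap:
            return "timed_out"
        return "running"


if __name__ == "__main__":
    fake_now = [0.0]                      # injectable clock for a quick demo
    guard = StallGuard(clock=lambda: fake_now[0])
    guard.observe(("running", 1, 0.01))
    fake_now[0] = 200.0                   # 200 s with no new marker
    print(guard.verdict())
```

A supervisor loop would call `observe` on each poll and kill the session as soon as `verdict()` stops returning "running". The hard cap stays as a backstop, but in the wedged case the stall check fires long before it.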

@chain_ofthought @yutori_ai @DhruvBatra_ “pixels in, clicks out” is probably the right bet for most of the web.
apis will cover the clean systems.
browser agents get the messy ones.

Most of the web will never get APIs for AI agents, argues @Yutori_ai co-founder @DhruvBatra_ (ex-Meta FAIR).
Yutori's bet: agents use the web like people do. Pixels in, clicks out - their N1.5 browser agent beats Opus 4.7 and GPT-5.5 at 4-5x lower cost👀
youtu.be/m7yTwkzscWk


@browser_use watching the browser in real time matters more than people think.
not for theater.
for trust.
you catch the agent doing weird little jumps before it becomes a bad result.

@tyhho0 the “human approval” step inside the pipeline is huge.
not because agents are bad.
because the best workflows know exactly where judgment should interrupt automation.

60s, claude code tips:
1. use auto mode
2. build a pipeline of "steps" in claude.md (prds -> sprints -> architecture diagrams -> human approval)
3. visualize plans using html
4. define clear authorization scope using auto mode to stop it from "stopping" intermediately
5. use a mac so you never have to resume sessions
6. /effort xhigh
7. ask it to research OSS for hard engineering tasks, to see what is in the market
8. step back. create skills that fulfill repeated guidance (e.g. planning and executing end to end tests, playwright tests)
9. use playwright mcp
10. use chrome devtools mcp

@osayawe_terry this is where i’ve landed too:
use both, but don’t make them interchangeable.
different tools for different failure modes.
the boring answer, unfortunately.

I've been using Claude Code exclusively for months to build Tracekit and funkel.ai.
Exactly one month and two days ago, I added Codex with GPT-5.5 to the mix. After daily use of both, here's where I've landed.
First, the models themselves.
Claude Opus 4.8 and ChatGPT 5.5 both write excellent code. At the raw coding level, they're neck and neck. But ChatGPT 5.5 has a real edge in codebase awareness. It understands the full project structure, how files connect, what a change in one module means for everything else. That's the model being smarter about context, not an app feature.
Now the environments.
Codex (OpenAI) vs Co-Work (Anthropic). Co-Work is a solid integrated environment. I like working in it. But Codex comes with capabilities that change what's possible. Built-in browser for end-to-end testing. Full machine access. Vulnerability scanning. Co-Work can get browser access through a Chrome plugin or Playwright MCP, but bolting it on is different from having it native. Codex uses the browser the way a real QA engineer would. Naturally.
People keep calling Codex a code reviewer. It codes, builds features, does creative work. The current demo video on funkel.ai? Done entirely by Codex. I use it heavily in the review role because that's where its extra capabilities shine most, but it has the full range.
Here's how my workflow actually looks.
I'm a one-man team building two products. I'm the tech lead. Claude and Codex are my engineers. I don't write code anymore. I don't test. Claude builds features. Codex reviews with superpowers, catches things I'd miss, runs E2E tests in a real browser, scans for vulnerabilities. Both ship production code daily.
One honest caveat: this is a snapshot. AI moves fast. What's true today could flip with the next model update or feature drop. This is what one month of real daily use has shown me. Not a permanent verdict.
But right now, both earned their spot on the team. And I'm keeping both.

@xianminx recorded browser sessions as artifacts feels underrated.
if an agent touched the browser, i want the trail.
otherwise the demo can look clean while the process was chaos.

Turn any browser session into a narrated, subtitled video — a Claude Code skill: local Kokoro neural TTS + FFmpeg, no cloud, no API keys.
This explainer was recorded AND posted by the skill itself. 🎥
Built with DeepVista → deepvista.ai
#ClaudeCode #Playwright

@troyaitken_ this is why validation has to be part of the workflow, not the cleanup step.
agents make bad assumptions faster too.
fun and annoying.

Our first legal marketing campaign failed.
Not because of deliverability. Not because of volume. The claims weren't 100% valid.
We scraped Google Maps, ran a SERPdev call inside Clay to pull competitors, assumed if Google showed it, it was real.
Turns out, it wasn't.
Google was pulling in firms that were too big, too far, or even the wrong market.
We were manufacturing false urgency with bad data.
So we stopped and rebuilt.
All with Claude Code. Entirely automated.
Within one day of going live, we got 11 positive replies, 5 bookings off 1,620 sends.
Here's how it works:
1. We identified 14 states, top 20 cities, with 23 relevant keywords that our client actively wanted.
2. Playwright script to scrape GMaps pages 2-10 only (page 1 = the actual competitor we name).
3. We name the competitors in our outreach and have a citation the firm can verify themselves.
4. AI.ARK for contacts and JINA scrape to identify relevant personas and emails to target
5. OpenRouter API call to run the research and copy.
Claude Code built this entire thing as I ideated and scripted it with Claude.
Day 1, this campaign outperformed our first month by 5x.
Don't expect to get it right every time.
Be willing to sit in the mud till you figure it out. Then scale the heck out of the winning angle when you find it.
We found ours.

@cubesol_greg i hit a version of this too.
browser access is not the same as reliable browser understanding.
you still have to force receipts or it will skip over the important part.

Claude Opus 4.8 has really gone downhill. Before, if I said "read my X feed and roast it," it would say "let me use Playwright to grab that."
Now it says: "X blocks me from reading your actual feed, so I can't roast the specific posts."
So I say "use Playwright." Then it says: "X gates the post text behind login, but the profile facts came through clean so I will just use that."
It's gone dumb. It didn't ask me to log in for it. It didn't even try. It's got "doesn't work syndrome."

@alphabatcher this is the right framing.
browser + repo + logs + db is not “extra context.”
it’s the actual workbench.
agents without the workbench are just very confident interns.

David Soria Parra:
"2026 is all about connectivity, and the best agents use every available method"
A coding agent needs access to the same places you check while building:
- repo and PRs
- docs
- browser
- database
- error logs
- Figma
- tasks
- payments
The article gives the 11 MCP servers for that setup:
- Context7, GitHub, Playwright first
- Supabase or Neon, Sentry, Firecrawl next
- Figma, Linear, Stripe when you need them
- Filesystem, Git, Memory, Sequential Thinking as the base
Read it if you keep copying code, docs, schemas, screenshots, errors, and tickets into Claude Code by hand
Alpha Batcher @alphabatcher

@NamanyayG i think the better version is existing saas becoming editable.
not every user wants to rebuild the tool.
they want to change the one workflow the product team never prioritized.
English

Vibe coding is causing the death of SaaS...
Because we want better software: more features, more workflows, more EVERYTHING
We're not happy with existing software, that's why we vibe code something ourselves.
But what if... existing software came built-in with an AI vibe coding layer?
If it did: that software can sell more, beat AI-first companies, and reduce churn.
We've already proven this works - that's how we added $1,000,000 in sales and prevented $120,000 in churn for our Series B customers.
Gigacatalyst embeds a vibe coding layer inside your software. See how it works:
If you could improve any daily software that you use, what would you change?

@DanielleFong vibe debt is real.
the faster agents let you ship, the more important it gets to make the repo understandable to future you.
rude tradeoff.




