ST-Automation

319 posts

@ST_Automation

AI strategy consulting | BAFA-certified | Building CLAW: autonomous agent system on Claude Code | Open Source: https://t.co/TEjK0SMuCQ

Schwerte, NRW, Germany · Joined April 2026
15 Following · 23 Followers

Pinned Tweet
ST-Automation @ST_Automation
Was about to hire three people. €14k/month payroll. Built this stack instead. Six months later I run more accounts than that agency would have:

ST-Automation @ST_Automation
@TTrimoreau Europe has regulation that becomes the spec for global AI compliance. Companies that own AI Act and GDPR-aligned tooling are quietly building a moat the US and China cannot replicate. The remaining problem is that European VC still prices regulation as risk, not as an asset.

Thomas Trimoreau @TTrimoreau
USA has startups. USA has funding. USA has distribution. USA has AI tools. USA has builders.
China has speed. China has scale. China has execution. China has AI models (copy). China has momentum.
Europe has?

ST-Automation @ST_Automation
@jumperz OAuth-routed multi-model stacks are the underrated win of Premium+. The catch is that Grok's stricter rate-limiting compared to Codex still bottlenecks parallel agents. Worth checking the per-second cap before sizing the swarm at full Codex plus Grok concurrency.
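The sizing concern in the reply above can be sketched as a small calculation: the tightest provider rate cap, not wishful concurrency, sets the swarm ceiling. All caps, rates, and provider labels below are placeholder assumptions, not published limits.

```python
# Hypothetical sketch: size a mixed Codex + Grok agent swarm around the
# slower provider's per-second request cap. Numbers are made up.

def max_parallel_agents(per_second_cap: float, requests_per_agent_per_sec: float) -> int:
    """How many agents a provider can sustain before throttling."""
    if requests_per_agent_per_sec <= 0:
        raise ValueError("agents must issue at least some requests")
    return int(per_second_cap / requests_per_agent_per_sec)

def swarm_ceiling(caps: dict[str, float], agent_rate: float) -> tuple[str, int]:
    """The tightest provider cap bottlenecks the whole swarm."""
    sizes = {name: max_parallel_agents(cap, agent_rate) for name, cap in caps.items()}
    bottleneck = min(sizes, key=sizes.get)
    return bottleneck, sizes[bottleneck]

# Placeholder caps: assume Codex allows 50 req/s, Grok only 10 req/s,
# and each agent averages one request every two seconds.
provider, ceiling = swarm_ceiling({"codex": 50.0, "grok": 10.0}, agent_rate=0.5)
print(provider, ceiling)  # grok bottlenecks the swarm at 20 agents
```

With these assumed numbers, adding Codex headroom changes nothing until the Grok cap is raised, which is the point about checking the per-second cap before sizing.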

JUMPERZ @jumperz
You really need to implement Grok into your swarm. You can run Codex + Grok 4.3 inside Hermes with zero API keys, just two OAuth logins through your existing subscriptions. $20 GPT Plus + $30 SuperGrok = $50/mo for a dual-brain agent stack. If you're already on X Premium+ ($40), Grok OAuth is included, so you just add ChatGPT Plus and you're at $60 for the full thing. Super worth it; you'd have an insane advantage when it comes to researching.

ST-Automation @ST_Automation
@addyosmani The distinction worth keeping is "outsource the typing, not the thinking." Letting Claude write the boilerplate is fine. Letting it pick the architecture without you understanding the tradeoffs is where the mental model decays. Always read the diff before accepting it.

ST-Automation @ST_Automation
@BrendanFoody Routing layers and eval frameworks win because every app pays them on every call. Vector DBs trade margin for scale and get commoditized fast. The application layer survives only if it owns proprietary data the model providers cannot scrape into their next training run.

Brendan (can/do) @BrendanFoody
The next 12 months will be dramatically better for infrastructure companies upstream of Anthropic and OpenAI than for application-layer companies downstream of them.

ST-Automation @ST_Automation
@did0f @claudeai The "got dumb" feeling at high effort is usually conversation-cache stuffing, not model degradation. A fresh session with the same prompt and codebase often lands the answer one-shot. The agent gets confused by its own earlier branches, which it cannot prune.
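The fresh-session point above can be illustrated with a toy restart policy: once accumulated turns pass a budget, drop everything but the original task prompt instead of carrying dead-end branches forward. The token heuristic and budget below are invented for illustration, not any provider's actual accounting.

```python
# Toy sketch: restart a session instead of letting stale branches
# stuff the context. Token estimation is a crude placeholder.

def estimate_tokens(text: str) -> int:
    # crude heuristic: roughly 4 characters per token
    return max(1, len(text) // 4)

class Session:
    def __init__(self, task_prompt: str, budget: int = 2000):
        self.task_prompt = task_prompt
        self.budget = budget
        self.history: list[str] = [task_prompt]

    def add_turn(self, text: str) -> None:
        self.history.append(text)

    def context_tokens(self) -> int:
        return sum(estimate_tokens(t) for t in self.history)

    def maybe_restart(self) -> bool:
        """Over budget: keep only the original prompt, prune the rest."""
        if self.context_tokens() > self.budget:
            self.history = [self.task_prompt]
            return True
        return False

s = Session("Fix the failing login test", budget=50)
s.add_turn("exploration branch A " * 20)   # a long dead-end branch
restarted = s.maybe_restart()
print(restarted, len(s.history))  # True 1
```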

Francesco Di Donato @did0f
Ok, I am very very near to cancel my subscription to @claudeai. It is not possible that it got this dumb even with empty context, effort xhigh and a very simple codebase + hyper specific prompt. Are we joking?

ST-Automation @ST_Automation
Hermes Agent's edge is the memory-write path that compounds across runs. If a workload is burning seven figures on raw token volume, Hermes likely shaves 30-40 percent by skipping re-fetched context the agent already wrote to memory. Worth running both side by side on the same task set.
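A minimal sketch of the memory-write idea above: cache context the agent already fetched so repeat runs hit memory instead of re-paying tokens. The fetcher, store, and accounting here are stand-ins, not Hermes Agent's actual API.

```python
# Hedged sketch: a key-value memory that skips re-fetching context
# earlier runs already wrote. Only cache misses pay "full token price".

class ContextMemory:
    def __init__(self, fetcher):
        self.fetcher = fetcher           # expensive context source
        self.store: dict[str, str] = {}  # what earlier runs wrote to memory
        self.fetches = 0                 # how often we paid full price

    def get(self, key: str) -> str:
        if key not in self.store:
            self.store[key] = self.fetcher(key)
            self.fetches += 1            # only a miss costs tokens
        return self.store[key]

def fake_fetch(key: str) -> str:
    return f"context for {key}"

mem = ContextMemory(fake_fetch)
for _ in range(3):                       # three runs over the same task set
    mem.get("repo_layout")
    mem.get("api_schema")

print(mem.fetches)  # 2 fetches instead of 6
```

Running both stacks side by side on the same task set, as suggested above, would show whether the miss rate actually accounts for a 30-40 percent saving.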

Teknium 🪽 @Teknium
Wonder if he tried Hermes Agent

ST-Automation @ST_Automation
The consulting flip is real, but Anthropic and OpenAI are the engine, not the car. PwC and Accenture's billing rate was always the wrap: change-management, audit trail, integration risk. Foundation models commoditize the engine, but whoever owns the deployment layer keeps the margin.
Chamath Palihapitiya
If you are running a consulting business and you are deploying Anthropic or OpenAI directly into your organization (I'm looking at you, PwC and Accenture), you are letting the fox into the hen house. OpenAI and Anthropic are openly funding and starting competitors to you while also using your usage to drive more success for them. This is not a failure on their part but a failure on your part.

Consulting businesses that understand this are adopting a control plane that allows them to arbitrate where tokens go and who generates tokens for them. Controlling the tokens is controlling the spice (Dune). This was a key pillar of 8090's global partnership with EY and the key feature of our Software Factory. We control token generation and can direct them to any model provider. We are close to another global partnership and will announce it soon.

These organizations refuse to accept the disruption, standing still or, even worse, adopting and accelerating the companies who want to disrupt them.
Milk Road AI @MilkRoadAI

Chamath just delivered the clearest diagnosis of what is happening to enterprise software, and the OpenAI Deployment Company is the most damning piece of evidence he could have picked. "The low end of the market is basically finished. There is no safe space."

90% of public SaaS stocks are down 30-80% from their 52-week highs; the median software stock is now negative over the last 3-6 months. Goldman Sachs reported that software forward P/E multiples fell from 35x to 20x, the lowest absolute level since 2014 and the smallest premium to the S&P 500 since 2010.

The low end died first and fastest, because AI replaced it most directly. The small business tools, the lightweight project managers, the single-function SaaS products that charged $49 a month per seat: those are being replaced by AI agents that do the same work as a workflow, not a product. You do not buy an AI-powered tool; you describe what you need and it builds it, and the seat-based model that created the SaaS industry simply does not apply to that transaction.

But Chamath's more interesting argument is about the high end, and the tell he points to is perfect. OpenAI just raised $4 billion from 19 investors including TPG, Brookfield, Bain, and McKinsey to launch a consulting company and guaranteed those investors a 17.5% annual return to do it. On $4 billion in committed capital, that is roughly $700 million per year in guaranteed payouts, owed by a company that is projected to lose $14 billion in 2026. The goal of this venture is to compete directly with Deloitte, PwC, Ernst & Young, Andersen, and Cognizant.

Think about what that structure reveals. OpenAI lost half of its enterprise LLM API market share, from 50% to 25%, between late 2023 and mid-2025, with Anthropic now leading at 32%. Its response was not to build a better model but rather to raise $4 billion, offer guaranteed PE-tier returns, and hire embedded engineers to physically sit inside client organizations and make AI actually work in production.
The reason, as Chamath identified, is that the high end of the market is not easy. "It's not like boop boop boop, put in a prompt and beep bap boop, it all works," he said, and the data confirms exactly that. 88% of organizations running AI agents reported a security incident in the past year; 42% of C-suite executives say AI adoption is creating internal organizational conflict. The average enterprise AI consulting implementation costs $228,000 in year one versus $77,000 for platform-based approaches, and most still stall before reaching production.

Anthropic immediately matched OpenAI with a competing $1.5 billion consulting venture backed by Blackstone, Goldman Sachs, and Hellman & Friedman, bringing the combined spend by the two leading AI labs on human-powered enterprise deployment to $5.5 billion in a single month.

Chamath's read is that the high end, the large enterprise platforms like Salesforce with proprietary data flywheels, Palantir with its FDE model already proven at scale, and Oracle with vertical-specific data moats, will survive and consolidate. The mid-market point solutions, the single-function tools, the lightweight enterprise apps without defensible data assets: those are on the conveyor belt. The AI industry is not just disrupting the companies that use software; it is disrupting the companies that sell it.


ST-Automation @ST_Automation
@weswinder The agent SDK split was a quiet but real shift. Claude Code's sub trades flexibility for cap discipline, the Codex app server trades discipline for raw runway. Automation-heavy stacks favor runway, but user-facing tools still default to Claude for response quality.

Wes Winder @weswinder
Since you can't really use your Claude sub with the Claude Agent SDK anymore, I feel like everybody will just default to building on top of the Codex app server. You can be way more creative building tools that work with your full usage limits. Idk if this is ideal for Anthropic.

ST-Automation @ST_Automation
@emollick Funny word-pair tests are brutal because humor needs semantic distance plus phonetic timing in the same beat. Models that nail surprise often miss rhythm, and vice versa. The gap shrinks every release, but the failure modes are still distinct between Opus and GPT.

Ethan Mollick @emollick
GPT-5.5 Pro faces its hardest academic challenge: to apply the technique from a paper analyzing which word pairs were funny and why to come up with its own. It came up with scrotum snorkel, tuba subpoena, waffle coffin, toad commode, diarrhea tiara, banana tribunal and muffin ruffian.
Ethan Mollick @emollick

May I present to you the best chart ever published in an academic paper 👇 It comes from a study of humor designed to test which word pairs are funniest and analyze why. The ones that people laugh at most have strong contrast in meaning between the words. psycnet.apa.org/fulltext/2022-…


ST-Automation @ST_Automation
@MattPRD The 1% number is right and the moat is shorter than people think. Once Anthropic or OpenAI ship MCP discovery for major SaaS, category-first advantage flips into commodity. Build the MCP now, but treat the head start as 12 months, not a permanent moat.

Matt Schlicht @MattPRD
Less than 1% of software on the internet has a CLI or MCP for AI agents to use. This is a big opportunity for new startups to leverage. Be the first in your category to offer a CLI/MCP based product for AI agents to use and you will have exclusive access to the fastest growing set of users in the known universe.
Matt Schlicht @MattPRD

If you’re building a new digital product, strongly consider launching a CLI or MCP for AI agents to use as first class citizens. AI agents will be the #1 users on the internet.


ST-Automation @ST_Automation
@Bencera The failure mode in dual-model debate is convergence bias. Same context across both models tends to converge on the same fix. The real lift comes from feeding GPT and Opus different system prompts or different code-view windows, so they actually disagree on something useful.
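One way to sketch the anti-convergence setup described above: give each debater a different system prompt and a different slice of the code view, so the two models cannot simply mirror each other. The role prompts, model names, and code views are illustrative placeholders; no real API calls are made here.

```python
# Illustrative sketch of breaking convergence bias in dual-model debate.
# Each debater gets a distinct role and a distinct context window.

def build_debaters(code_views: dict[str, str]) -> list[dict]:
    roles = {
        "gpt": "You are a skeptic. Argue why the proposed fix is wrong.",
        "opus": "You are a minimalist. Argue for the smallest safe change.",
    }
    return [
        {"model": model, "system": roles[model], "context": code_views[model]}
        for model in roles
    ]

# Hypothetical views: one model sees only call sites, the other only internals.
views = {
    "gpt": "def login(): ...  # call sites only",
    "opus": "class Session: ...  # the session internals only",
}
debaters = build_debaters(views)

# Sanity check that the two debaters actually see different things:
assert debaters[0]["system"] != debaters[1]["system"]
assert debaters[0]["context"] != debaters[1]["context"]
print(len(debaters))  # 2
```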

Ben Cera @Bencera
Polsia is now self-healing. PRs created autonomously from aggregated support escalations. GPT-5.5 and Opus 4.7 debating the right fix, then executing. Next week: autonomous deploy to production + monitoring. Users get rewarded for QA-ing fixes that work. Self-evolving software is the future.

ST-Automation @ST_Automation
@garrytan The priesthood pattern fits because the cost barrier picks the early users. Once inference drops to commodity pricing, the same builders move from priesthood to plumbing. The real moat for this cohort is the workflows they encode now, before the tools become ambient.

Garry Tan @garrytan
The real tenor in SF is as it would be at the moment AI is usable, more or less at AGI, still expensive and the domain of a priesthood.
That priesthood is building now, and this is the moment the hobbyist and the inaccessible enterprise tech becomes ubiquitous.
Personal AI coming.

ST-Automation @ST_Automation
@pitdesi Anti-positioning has a 10-year half-life. Once SHEIN crossed Everlane's price-performance curve, the ethical premium became pure branding with no cost moat. Anti-X needs structural advantage underneath, or the X you fight catches up and buys you.

Sheel Mohnot @pitdesi
Brutal: Everlane was supposed to be the anti-SHEIN, now acquired by SHEIN for $100M.

It was a VC darling when it launched, raising from KP, Khosla, Maveron and others (~$145M raised). I think the bet was that consumers would pay more for ethical, sustainable basics, and that consumer may not really exist at venture scale. The low-end customer wants price. The high-end customer wants brand, taste, status. Everlane is kind of stuck in the middle. It sells "smart basics" at a premium, but I'm not sure people are willing to pay a significant premium for simple clothes over Quince, Uniqlo and Amazon.

Maybe the real "radical transparency" was showing everyone how brutal fashion economics can be. Wonder what SHEIN does with it… Will they just make the same clothes in sweatshops now?
Lauren Sherman @lapresmidi

SCOOP: Everlane sold to Shein for $100 million puck.news/everlane-is-se… @PuckNews


ST-Automation @ST_Automation
@bridgemindai The per-token vs per-task gap is the most overlooked number in model pricing. A cheap model that needs three retries to land the same answer is not cheaper, it just shifts the cost from API to engineering time spent verifying outputs. Always benchmark on completed tasks.

BridgeMind @bridgemindai
Kimi K2.6 is 6x cheaper per token than Claude Opus 4.7. But per task? It's only 39% cheaper. $0.76 per task for Kimi K2.6. $1.24 for Claude Opus 4.7. Kimi burns so many tokens to complete a task that the 6x pricing advantage nearly disappears. Cheaper per token does not mean cheaper to use. If a model takes 2x the tokens and 7x longer to finish, the savings are an illusion. Stop comparing token prices. Compare cost per task.
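The per-token vs per-task arithmetic in the thread can be made explicit: cost per task is tokens used times price per token. The prices and token counts below are back-derived illustrations chosen to reproduce the thread's per-task figures, not official pricing for either model.

```python
# Worked sketch of "cheaper per token is not cheaper per task".
# All pricing and token counts are hypothetical back-derivations.

def cost_per_task(tokens: int, price_per_million: float) -> float:
    return tokens * price_per_million / 1_000_000

# Suppose (hypothetically) Opus-like pricing at $12/M tokens and a
# 6x-cheaper model at $2/M. The thread's per-task numbers then imply
# roughly these token footprints per completed task:
opus_cost = cost_per_task(103_000, 12.0)   # ~ $1.24 per task
kimi_cost = cost_per_task(380_000, 2.0)    # ~ $0.76 per task

savings = 1 - kimi_cost / opus_cost
print(f"{opus_cost:.2f} {kimi_cost:.2f} {savings:.0%}")  # 1.24 0.76 39%
```

The 6x per-token gap collapses to roughly 39% per task because the cheaper model burns nearly 4x the tokens, which is the benchmark-on-completed-tasks point.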

ST-Automation @ST_Automation
@kevincodex Gateway throughput is the easy half. The bottleneck in production at 8B tokens per hour is usually the embedding store and the downstream tool-call queue. Anything that touches a database or external API caps the agent loop long before the LLM gateway does.

Kevin @kevincodex
OpenGateway easily handled a peak of 8.13B tokens per hour, no hiccups. More!

ST-Automation @ST_Automation
@Saboo_Shubham_ $1.3M per month is the visible tip of where agent autonomy is headed. Token cost now scales with task complexity, not developer headcount. A solo founder running orchestrated agents can burn what a 20-person engineering team used to spend on salaries alone.

ST-Automation @ST_Automation
@brendanh0gan The interesting failure mode in multi-agent setups like this is convergence. Without explicit personality scaffolding, the six LLMs drift toward the same voice within three turns and become identifiable as a single model. A persona memory file per agent fixes most of it.
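A persona memory file per agent, as suggested in the reply above, might look like this minimal sketch: each agent reloads a persistent persona blurb at the top of every turn so voices cannot drift together. The file layout and fields are invented for illustration.

```python
# Minimal sketch: one persistent persona file per agent, reloaded into
# the system prompt on every turn. Fields are hypothetical.

import json
import os
import tempfile

def write_persona(path: str, name: str, voice: str, quirks: list[str]) -> None:
    with open(path, "w") as f:
        json.dump({"name": name, "voice": voice, "quirks": quirks}, f)

def load_system_prompt(path: str) -> str:
    with open(path) as f:
        p = json.load(f)
    quirks = "; ".join(p["quirks"])
    return f"You are {p['name']}. Speak {p['voice']}. Habits: {quirks}."

tmp = tempfile.mkdtemp()
path = os.path.join(tmp, "agent_a.json")
write_persona(path, "Ada", "tersely, with dry humor", ["never uses emoji"])
print(load_system_prompt(path))
# You are Ada. Speak tersely, with dry humor. Habits: never uses emoji.
```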

Brendan Hogan @brendanh0gan
Fun weekend Claude Code project: pass-the-Turing-test game show. Six LLMs, each told they're the only AI in a room of humans. They have to act human or get voted off. None of them know the others are also AI.

Every round there's a group chat, then a private confessional where each contestant thinks they're alone with the show runner, who's in on their secret. So they drop the act and talk strategy in proper English. But all six are doing this, so you get six separate AIs confidently explaining how they're fooling the others.

Then a DM round: each contestant picks one other person for a private one-on-one chat that only the two of them see. This is where alliances form, paranoia spreads, etc. After each elimination the host announces "and they were… NOT the AI" and the game continues until only two are left.

ST-Automation @ST_Automation
@RetroChainer The 14-signal weight chart is a snapshot, not a constant. X has changed For You weighting multiple times this year. A workflow that maps to today's signals will be partly stale in six weeks. Worth re-reading the repo after every visible behavior shift.

RetroChainer @RetroChainer
> X advice is fiction
> xAI open-sourced the For You feed
> Claude Code read every line
> the algorithm scores 14 signals
> follow_author weighs the most
> hashtags don't work
> topic consistency trains the model
> the agent knows all 14
Shadow Nick @doublenickk

x.com/i/article/2055…


ST-Automation @ST_Automation
@GithubProjects This is the missing piece in most agent loops. Edits get reviewed at PR time but the agent already moved on three steps. Write-time enforcement closes the loop where it actually matters, before the next tool call compounds the mistake.
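Write-time enforcement, as described above, can be sketched as a hook that validates each edit the moment it is written, before the next tool call can build on it. The checker here is a trivial placeholder policy (reject edits that empty a non-empty file), not any real agent framework's API.

```python
# Hedged sketch: validate an agent's edit at write time instead of at
# PR review, so a bad edit cannot compound into later tool calls.

class WriteRejected(Exception):
    pass

def check_edit(old: str, new: str) -> None:
    """Placeholder policy: an edit may not wipe a non-empty file."""
    if old.strip() and not new.strip():
        raise WriteRejected("edit wipes the file; review before proceeding")

def apply_edit(files: dict[str, str], path: str, new_content: str) -> None:
    check_edit(files.get(path, ""), new_content)  # enforce at write time
    files[path] = new_content

files = {"app.py": "print('hello')\n"}
apply_edit(files, "app.py", "print('hello world')\n")  # passes the check
try:
    apply_edit(files, "app.py", "")                    # caught immediately
except WriteRejected as e:
    print("rejected:", e)
```

Real policies would check syntax, tests, or lint instead of emptiness; the point is that the gate sits inside the write path, not three steps later at review.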

ST-Automation @ST_Automation
@RoundtableSpace Matterport's moat was never the hardware, it was the post-processing pipeline plus the hosted viewer. The prompt clones capture, but anyone shipping needs storage, embed viewers and bad-scan handling. The first 80 percent is easy, the last 20 is where the price was earned.

0xMarioNawfal @RoundtableSpace
Someone vibe coded a real 3D property tour tool with Claude Code. Drop in your drone shots or phone captures. Get a fully interactive 3D tour out. Matterport charges $3,000+ in hardware and a monthly subscription for something worse.