Sam Ward

458 posts

@Samward

Investigating & Explaining things | SentinelAi | Sentinel Legal | OpenClaw 🦞 | All views are my own. 🇬🇧 🇺🇸

Joined November 2024
100 Following · 102 Followers
Pinned Tweet
Sam Ward
Sam Ward@Samward·
The next billion dollar law firm will be built by someone who's never practised law. Their best lawyers will run on a local machine. One £250k compliance lawyer will oversee everything. They won't even know about the wastage and bloat the rest of the industry treats as normal.
1
1
9
943
Sam Ward
Sam Ward@Samward·
Same pattern in legal. Lawyers spend more time on data entry and compliance paperwork than actual case strategy. The orchestration layer that handles the routine work and surfaces only the decisions that need a human is where the real value sits. Every vertical with a compliance burden is ripe for this.
0
0
0
1
Vic_Gatto
Vic_Gatto@Vic_Gatto·
KPMG is using AI to free auditors from "routine testing." Healthcare needs the same. Right now, doctors and nurses are drowning in data entry instead of focusing on patients. We’re building the "Orchestration Layer" to change that. [Link to Substack] #HealthTech #EndHeartDisease
1
0
1
90
Sam Ward
Sam Ward@Samward·
Open source is the only reason we can run AI agents in a regulated legal environment. The ability to audit every line, control data flow, and patch vulnerabilities ourselves is not optional when you handle client data. Closed source means trusting someone else with your compliance.
0
0
2
29
Sam Ward
Sam Ward@Samward·
@Polymarket This is the pattern every production agent system lands on eventually. Fully autonomous is the pitch. Human escalation when the agent gets stuck is the reality. We build circuit breakers into every agent so it stops and asks for help instead of confidently doing the wrong thing.
0
0
1
54
Polymarket
Polymarket@Polymarket·
BREAKING: Y Combinator startup will pay humans to help AI agents when they get stuck.
102
32
537
47K
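A minimal sketch of the circuit-breaker idea Sam describes in the reply above: the agent tracks consecutive failures and escalates to a human instead of pushing on. Everything here (class names, the failure threshold) is illustrative, not taken from his actual system.

```python
# Hypothetical sketch only: an agent "circuit breaker" that stops and asks for
# help after repeated failures rather than confidently doing the wrong thing.
from dataclasses import dataclass, field


class EscalationRequired(Exception):
    """Raised when the agent should hand the task to a human."""


@dataclass
class CircuitBreaker:
    max_failures: int = 3                      # consecutive failures tolerated
    failures: int = field(default=0, init=False)

    def record(self, succeeded: bool) -> None:
        self.failures = 0 if succeeded else self.failures + 1
        if self.failures >= self.max_failures:
            raise EscalationRequired(
                f"{self.failures} consecutive failures; pausing for human review"
            )


def run_agent(steps, breaker: CircuitBreaker) -> None:
    """Run each unit of agent work, tripping the breaker on repeated errors."""
    for step in steps:
        try:
            step()                             # any callable unit of agent work
        except Exception:
            breaker.record(succeeded=False)    # may raise EscalationRequired
        else:
            breaker.record(succeeded=True)
```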
Sam Ward
Sam Ward@Samward·
@dhh The quality regression is real and measurable. We noticed the same thing with Opus recently. The fix is not waiting for a better model. It is building your system so any model can be swapped in without rewriting everything. When one provider dips, you route to the next.
0
0
1
263
DHH
DHH@dhh·
GPT 5.4 on the most basic ask against a 40-line bash script: "Figuring out the best way to approach this is the goal! There's just so much to think through, but I'm staying committed to getting it right." SO DRAMATIC!
DHH tweet media
93
11
751
59.8K
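A rough sketch, under assumptions, of the provider-swapping idea in the reply above: every model sits behind the same call signature, so when one provider's quality dips you route to the next. The `Provider` type and function names are hypothetical, not a real client API.

```python
# Illustrative only: provider-agnostic routing with fallback. A "provider" is
# any callable that takes a prompt and returns text, so models stay swappable.
from typing import Callable, Optional, Sequence

Provider = Callable[[str], str]


def complete(prompt: str, providers: Sequence[Provider]) -> str:
    """Try providers in priority order; fall through on errors or empty output."""
    last_error: Optional[Exception] = None
    for provider in providers:
        try:
            text = provider(prompt)
            if text.strip():
                return text                    # first usable answer wins
        except Exception as exc:               # outage, rate limit, bad response
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```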
Sam Ward
Sam Ward@Samward·
The agent blaming agent part is undersold. When you run multiple specialist agents the failure mode is not that one breaks. It is that they start pointing at each other and you have no single source of truth for what actually happened. Logging everything to disk from day one is the only thing that saved us.
0
0
0
5
Jason ✨👾SaaStr.Ai✨ Lemkin
Welcome to The Agents, Episode #001!! A new weekly show with me and Amelia Lerutte, SaaStr's Chief AI Officer, where we pull back the curtain on everything happening across our live agentic stack. Every week. All the bumps, breakthroughs, and real talk. No sugarcoating. Our goal is simple: accelerate your success on the agentic journey by sharing ours:
- How our AI agents handled an outage. Which AI Agent blamed whom
- How Clay's AI Agent tried to 5x our pricing
- How to roll out a No Lead Left Behind program with your agents
- How to build your own AI VP of Marketing and Customer Success
If you're on the agentic journey or about to start ... or feel like you're falling behind ... watch below. (And subscribe to SaaStr AI on YouTube and Spotify to catch this and the next episodes)
5
6
21
5.4K
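A minimal sketch of the "log everything to disk from day one" point above: every agent appends structured events to one shared JSONL file, so there is a single replayable record when agents start blaming each other. The path and field names are made up for illustration.

```python
# Illustrative sketch: an append-only JSONL event log shared by every agent.
import json
import time
from pathlib import Path

LOG_PATH = Path("agent_events.jsonl")          # hypothetical shared log file


def log_event(agent: str, action: str, detail: dict) -> None:
    """Append one structured event; nothing is ever rewritten in place."""
    event = {"ts": time.time(), "agent": agent, "action": action, "detail": detail}
    with LOG_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


# e.g. log_event("intake-agent", "document_rejected", {"reason": "missing signature"})
```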
Sam Ward
Sam Ward@Samward·
@xbillwatsonx @gregisenberg Ha, appreciate that. The transition is already happening whether people are ready or not. Better to see it coming.
0
0
0
3
Bill Watson
Bill Watson@xbillwatsonx·
@Samward @gregisenberg Your big fat brain is showing in the comments. Put it away please. I'm not sure coders are ready to hear this yet.
1
0
1
7
GREG ISENBERG
GREG ISENBERG@gregisenberg·
What happens to open source when AI is writing 100% of the code? I've been thinking about this a lot. Like… the whole system was built around humans valuing the act of contribution. You learned, you struggled, you submitted a PR, you got feedback, you got better. That loop created engineers. It created community. It created ownership. If AI writes the PR, who owns it? Who learned from it? Who's gonna stay up at 2am debugging the thing they shipped because they actually care? The cool part about OSS is that no one owns it. As a consumer, you could always look under the hood, fork it, take it somewhere else. I don't think open source dies. But I genuinely don't know what it becomes... Any ideas?
100
11
142
14.2K
Sam Ward
Sam Ward@Samward·
@MatthewBerman This is why we run a second model to QC everything the first one produces. Not reviewing the code yourself. Having Codex review what Claude wrote and reject until both agree. We have gone eight rounds on a single script before approval.
0
0
0
87
Matthew Berman
Matthew Berman@MatthewBerman·
there's no way to review all code produced by ai and there's no way you're actually reviewing all the code explanations
76
3
100
6.4K
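A hypothetical outline of the two-model review loop in the reply above: one model drafts, a second reviews, and nothing ships until the reviewer approves. `draft_fn` and `review_fn` stand in for whatever client calls are actually used; this is not a real Claude or Codex API.

```python
# Sketch only: writer/reviewer loop between two models, capped at max_rounds.
from typing import Callable, Tuple

DraftFn = Callable[[str, str], str]                 # (task, feedback) -> draft
ReviewFn = Callable[[str, str], Tuple[bool, str]]   # (task, draft) -> (approved, feedback)


def qc_loop(task: str, draft_fn: DraftFn, review_fn: ReviewFn, max_rounds: int = 10) -> str:
    feedback = ""
    for _ in range(max_rounds):
        draft = draft_fn(task, feedback)             # writer model produces a draft
        approved, feedback = review_fn(task, draft)  # reviewer model accepts or rejects
        if approved:
            return draft
    raise RuntimeError(f"no approval after {max_rounds} rounds; escalate to a human")
```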
Sam Ward
Sam Ward@Samward·
@garrytan The gap between December and now is the biggest improvement cycle I have seen in any open source project. We run legal agents on it with sandbox isolation and scoped permissions and the security posture is better than most commercial tools we evaluated.
0
0
1
227
Sam Ward
Sam Ward@Samward·
The management layer is where it gets genuinely hard. We run specialist agents that spawn subagents for isolated tasks and the orchestration complexity is the thing nobody warns you about. Getting the parent to verify what the child produced without just trusting it blindly is the real engineering problem.
0
0
0
59
swyx 🐣
swyx 🐣@swyx·
I've commented that "this is the year of subagents", but that is largely an optimization problem. the inverse problem - having agents that compose and boss agents that manage/query them - is a capabilities one. as an advisor to cog, proud to have played a small part in designing the new Spaces concept 3 months ago and today's launch is a start of even more to come. congrats to the team!
swyx 🐣 tweet media
Windsurf@windsurf

Introducing Windsurf 2.0. Manage all your agents from one place and delegate work to the cloud with Devin - so your agents keep shipping even after you close your laptop.

8
4
40
6.9K
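A small sketch of the parent-verifies-child point above: the orchestrating agent never merges a subagent's output on trust; it runs explicit checks first. The checks shown are placeholders for whatever validation fits the task.

```python
# Illustrative only: a parent agent accepting a subagent's result only after
# every independent check passes, instead of trusting it blindly.
from typing import Callable, Iterable


def accept_subagent_result(result: str, checks: Iterable[Callable[[str], bool]]) -> bool:
    """Return True only if every check passes on the child's output."""
    return all(check(result) for check in checks)


# Hypothetical checks a parent might run on a drafted document:
checks = [
    lambda r: bool(r.strip()),                 # non-empty output
    lambda r: "[TODO]" not in r,               # no unfinished placeholders left in
    lambda r: len(r) < 20_000,                 # sanity cap on length
]

ok = accept_subagent_result("Draft letter without placeholders.", checks)  # -> True
```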
Sam Ward
Sam Ward@Samward·
@saranormous The biggest consumer gain from AI is not better products for existing customers. It is reaching people who never had access at all. In legal alone there are millions of valid claims that were never economically viable to service until agents handled the volume.
English
0
0
0
63
sarah guo
sarah guo@saranormous·
I believe AI will deliver enormous gains to the global consumer: better products, better services, better healthcare, and tools that make ordinary people more capable, even superhuman. The upside is so large, and the geopolitical stakes so real, that we should move decisively toward it, not choke it off. But people do not experience technological change as an aggregate statistic. They experience it through their bills, their communities, and their jobs. So the issue is not whether AI will create value. It will. The issue is whether the path to those gains asks particular communities and workers to absorb too much of the cost upfront. The institutions building AI cannot externalize the local costs of scaling and call future abundance the answer. If datacenters place major new demands on power and land, they should invest enough to strengthen the grid, ease pressure on bills, expand the tax base, and create durable jobs. And if AI compresses some of the entry-level work people used to learn on, firms should help build new on-ramps and training pathways into the new work that growth is creating. This is not an argument for slowing the buildout down. It is an argument that rapid technological progress has to be socially durable.
27
18
163
17.9K
Sam Ward
Sam Ward@Samward·
@paulg The 78% at that scale is the compounding kicking in. When AI accelerates every internal process on top of a product that already fits the market, the growth curve stops looking linear. This is why the companies that built the right foundation before AI showed up are pulling away.
0
0
0
9
Paul Graham
Paul Graham@paulg·
Amusing edge case: If you post multiple nude statues, Twitter's image cropping algorithm makes you seem lascivious.
Paul Graham tweet media
17
4
111
32.5K
Sam Ward
Sam Ward@Samward·
This pattern is the same in every vertical deploying AI. In legal the first wave was hallucination city. Then small wins in document classification. Now agents handle entire case workflows autonomously. Each round of hype makes it harder to talk about real gains because people assume you are still in the hype phase.
0
0
0
12
Ethan Mollick
Ethan Mollick@emollick·
Instead of the gold standard, we can imagine an inference standard of exchange, the FLOP. (As opposed to tokens, this accounts for AI ability) With some AI help, I figure $1 buys roughly 10^17 managed-LLM inference FLOPs. So that $4 coffee would cost half an exaFLOP, choom.
27
8
98
10.5K
Sam Ward
Sam Ward@Samward·
@bhalligan @jack The trick is not keeping up with all of it. Pick one stack, go deep enough that the noise becomes obviously noise. We stopped chasing every model release months ago and the clarity was immediate.
0
0
0
92
Brian Halligan
Brian Halligan@bhalligan·
You basically need to be unemployed to keep up with all this AI stuff. @jack feels it too
91
288
2.6K
174.8K
Sam Ward
Sam Ward@Samward·
The rapid iteration part is what people miss about open source security. We file issues; fixes ship faster than most enterprise vendors respond to tickets. An open codebase with hundreds of security researchers poking at it is a better model than a closed one hoping nobody finds the holes.
0
0
0
295
Peter Steinberger 🦞
If you look at GPT 5.4-Cyber and its ability for closed source reverse engineering, I have bad news for you. I do very much feel the pain though, there's hundreds of teams that try to poke holes into @openclaw. Our response has been rapid iteration and code hardening. Which did introduce occasional regressions (and yes, you all have been yelling at me), but I see it as the only way forward. I would be very careful of other open source projects/harnesses that ignore this work and do not publish their advisories. github.com/openclaw/openc…
Bailey Pumfleet@pumfleet

Open source is dead. That’s not a statement we ever thought we’d make. @calcom was built on open source. It shaped our product, our community, and our growth. But the world has changed faster than our principles could keep up. AI has fundamentally altered the security landscape. What once required time, expertise, and intent can now be automated at scale. Code is no longer just read. It is scanned, mapped, and exploited. Near zero cost. In that world, transparency becomes exposure. Especially at scale. After a lot of deliberation, we’ve made the decision to close the core @calcom codebase. This is not a rejection of what open source gave us. It’s a response to the risks AI is making possible. We’re still supporting builders, releasing the core code under a new MIT-licensed open source project called cal.diy for hobbyists and tinkerers, but our priority now is simple: Protecting our customers and community at all costs. This may not be the most popular call. But we believe many companies will come to the same conclusion. My full explanation below ↓

75
83
1.4K
323.6K
Sam Ward
Sam Ward@Samward·
The security model now is unrecognizable from six months ago. We run production legal agents with allow lists and scoped exec permissions and the control is exactly what you need for regulated work. The people still calling it insecure stopped paying attention after the first release.
0
0
1
258
Peter Steinberger 🦞
That was the case in December. 4 months and thousands of work hours later, we have a great security concept; you can go all yolo, use a sandbox (Docker or OpenShell), there are allow-lists and per-access exec allow/deny prompts. There’s hundreds of security researchers that pen-tested it.
Max Wolter@maxintechnology

@steipete @openclaw I don't think OpenClaw is a reference. It literally doesn't have a proper security model. Nothing on OpenClaw is secure by design.

63
54
938
199.3K
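A toy sketch of the allow-list and scoped exec idea in the reply above. This is not OpenClaw's actual mechanism, just the general shape: a command runs only if its binary appears on an explicit allow list, and everything else is refused before execution.

```python
# Illustrative only: a scoped exec gate that refuses anything off the allow list.
import shlex
import subprocess

ALLOWED_BINARIES = {"ls", "cat", "grep"}       # hypothetical per-agent allow list


def scoped_exec(command: str) -> str:
    """Run a shell command only if its binary is explicitly allowed."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_BINARIES:
        raise PermissionError(f"'{argv[0] if argv else command}' is not on the allow list")
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout
```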
Sam Ward
Sam Ward@Samward·
Documentation becoming infrastructure when agents are the consumers is the quiet revolution nobody is talking about. We build agents that read markdown files on startup to get their standing orders, memory, and voice rules. The quality of that documentation is literally the quality of the agent. Mintlify figured out the same thing from the API side.
0
0
0
119
Aakash Gupta
Aakash Gupta@aakashgupta·
Mintlify just got valued at $500M. For a documentation company. That sounds absurd until you understand what documentation means when AI agents are the primary consumers of your API. Mintlify auto-generates llms.txt files and MCP servers for every customer's docs. That means Cursor, Claude, and ChatGPT can all ingest a company's product docs directly, without crawling HTML or burning tokens on noise. When an AI agent tries to integrate with your product and the docs are incomplete, it doesn't file a support ticket. It picks a competitor. Zero signal back to you. Documentation just became your highest-leverage sales surface, and most companies still treat it like a chore nobody wants. The customers tell you everything. Anthropic, Microsoft, Coinbase. Over 20% of recent YC batches run their docs on Mintlify. They acquired Trieve for RAG search and Helicone for LLM observability in the last year. They're assembling the full stack between "agent has a question" and "agent gets the right answer." They even ship agent analytics: which AI agents visited your docs, which pages they read, which queries they ran through MCP. That data didn't exist 18 months ago. Now it's the equivalent of seeing every autonomous system that evaluated your product and what it couldn't find. a16z and Salesforce leading at $500M is the market pricing in a bet that documentation becomes the primary interface between AI agents and every software API on the internet. The boring infrastructure layer always gets repriced last.
Han Wang@handotdev

We just raised a $45M Series B at a $500M valuation led by @a16z and @SalesforceVC to build the knowledge infrastructure for AI

9
20
197
35.5K
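An illustrative sketch of the "agents read markdown on startup" idea above: the agent assembles its system prompt from standing-orders, memory, and voice-rule files, so the quality of those files directly sets the quality of the agent. The file names here are hypothetical.

```python
# Sketch only: build an agent's system prompt from markdown "standing orders".
from pathlib import Path

STARTUP_DOCS = ["standing_orders.md", "memory.md", "voice_rules.md"]  # hypothetical


def build_system_prompt(doc_dir: str = "agent_docs") -> str:
    sections = []
    for name in STARTUP_DOCS:
        path = Path(doc_dir) / name
        if path.exists():
            sections.append(f"## {name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(sections)               # the agent is only as good as these files
```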
Sam Ward
Sam Ward@Samward·
The trust deficit is the real cost nobody is pricing in. Every misleading benchmark, every quiet model downgrade, every lobby push makes it harder for the companies actually building useful things to get adoption. The people paying the price are the builders in the middle trying to deploy AI in industries where trust is not optional.
0
0
1
1.1K
roon
roon@tszzl·
the ai labs, in competing with each other, are burning huge amounts of the commons on public trust in ai to win minor points against the others. their lobbyists, pr machines, lawsuits. it’s the very opposite of what marxist class struggle analysis would tell you
131
74
1.6K
236.3K
Sam Ward
Sam Ward@Samward·
State level AI governance is where the real regulatory complexity lives right now. We operate in a regulated legal environment and the patchwork of state rules is already shaping how we deploy agents. The companies building compliance into their architecture from day one will have a massive advantage over those trying to bolt it on after the rules land.
0
0
0
6
a16z
a16z@a16z·
With states driving AI governance in the U.S., the constitutional limits on their authority will shape the regulatory landscape. Our judicial process requires cost-benefit analysis to determine how Congress can regulate interstate commerce. But there's an evidence gap: the data to actually do cost-benefit analysis on state AI legislation doesn't exist yet. a16z's Matt Perault and Jai Ramaswamy on how to fill the evidence gap and help courts use the evidence we have: a16z.news/p/the-evidence…
a16z tweet media
Matt Perault@MattPerault

x.com/i/article/2044…

9
8
34
15.6K
Sam Ward
Sam Ward@Samward·
The flywheel effect is the whole point. Agents that learn from their own execution history stop making the same mistakes and start anticipating the next problem. Most people build agents that are smart on day one and equally smart on day one hundred. The ones that compound are the ones worth building.
0
0
0
58