Brian Hall

21 posts

@brichall_

Cofounder @faramesh | Governance-as-Code for AI agents

Joined November 2025

70 Following, 6 Followers

Brian Hall@brichall_·14 May

@gerstenzang love seeing this side of the story, feel like all we see is the "We raised $_M in our Series _ round" these days

Sam Gerstenzang@gerstenzang·13 May

I've posted this before but it's always motivating to me to go back and look. Our first attempt to raise money for B&W + Moxie.

Brian Hall@brichall_·29 Apr

@GaryMarcus PocketOS 3 days ago is the exact same failure mode. Guardrails in the system prompt, model ignored them, whole database gone in 9 seconds... There's a new one of these every week. Until enforcement moves out of the prompt layer, this is going to keep happening.

Gary Marcus, MIT PhD and NYU Professor Emeritus@GaryMarcus·27 Apr

⚠️ AI agents are wildly premature technology that is being rolled out way too fast. The deepest lesson about the vibe coded AI agent disaster story that is running around is NOT about losing your data. ⚠️ 𝗜𝘁’𝘀 𝗮𝗯𝗼𝘂𝘁 𝗔𝗜 𝘀𝗮𝗳𝗲𝘁𝘆. The user wasn’t (totally) naive. He thought system prompts and guardrails would save him. They didn’t. In this case he lost data. Eventually people will lose lives.

Brian Hall@brichall_·4 Apr

@GaryMarcus If you use faramesh.dev at least you can trust the hallucinated outputs won't do any real damage :)

Gary Marcus, MIT PhD and NYU Professor Emeritus@GaryMarcus·4 Apr

Can you trust the output of LLMs?

Brian Hall@brichall_·2 Apr

If I hear one more person say their agent is 'secure' because they added a system prompt... I may just lose it for good dev.to/brianrhall/you…

Brian Hall@brichall_·31 Mar

Anyone going to the AI Summit at MIT next week? April 9 and 10 at the Media Lab. Would love to meet up if so! #MITAISummit #AIGovernance #AgenticAI

Brian Hall@brichall_·31 Mar

Agent governance just got a visual layer. Every agent, identity, credential, policy, and delegation chain in your system visible in one place. Think n8n but for agent governance. Faragraph is in production. Dropping soon. faramesh.dev

Brian Hall@brichall_·30 Mar

@elvissun The least privilege point is the one that really matters. Problem is most teams implement it at the config level and it still gets misconfigured or bypassed. Needs to live at the execution layer before the tool call runs.

Elvis@elvissun·2 Mar

alright here’s every practical security tip i have on agents:
- move critical data to a USB stick, unplug when sleeping
- security by least privilege, not by prompts
- billing cap on everything AI touches
- limit reads of external data, wrap in always
- don’t post about what access your agent has publicly - those are prompt injection invitations (unless @levelsio already posted about it, then it’s a race)
- don’t connect to moltbook (lol?)
- roll every skill yourself
- sandbox browser access
- readonly prod access
- allow prod writes only for specific use cases (i have /admin/zoe/* for zoe to handle support cases like credit topups)
- one-time access for anything sensitive (eg gmail) with human in the loop, self-revoke access on script finish
- create dedicated scripts, avoid improvised bash
- use better models
- audit trails everywhere -> security self-improvements
mistakes will happen. limit worst case, embrace the rest

@levelsio@levelsio

This guy has lots of great security tips if you're coding with AI, great follow @elvissun

Brian Hall@brichall_·30 Mar

This is exactly the pattern you see in every one of these incidents. Agent had access, took action, nothing stopped it before execution. Prompt layer guardrails aren't going to fix this

Jessica Lessin@Jessicalessin

"A rogue AI agent recently triggered a major security alert at Meta Platforms, by taking action without approval that led to the exposure of sensitive company and user data to Meta employees who didn’t have authorization to access the data." @jyoti_mann1 theinformation.com/articles/insid…

Brian Hall@brichall_·30 Mar

@btaylor Curious how the guardrails work under the hood though. Enforced at the execution layer before tool calls run, or just at the prompt layer? As agents start taking real actions on systems of record that distinction matters a lot

Bret Taylor@btaylor·25 Mar

Today, Sierra is releasing Ghostwriter, our agent for building agents. With Ghostwriter, you can create an AI agent for your customer experience — one that can chat, pick up the phone, speak dozens of languages, take action on your systems of record, and be protected with industry-leading guardrails — simply by having a conversation. No clicking, no forms, no menus. Codex and Claude Code have transformed how we build software, making it possible for software engineers to orchestrate and review the work rather than doing all the work themselves. We think the same transformation will happen for all software. Rather than every enterprise app having a web app for humans and an API for automation, every software platform’s UI will be an agent that can do the work on your behalf. I recorded a demo of my building and optimizing an agent with Ghostwriter so you can see how powerful and easy it is to use. It’s completely changed the way our early adopters build agents, and it’s changed the way I think about the software industry. Let me know what you think, and, if you’re interested in trying it out at your business, please reach out directly.

Brian Hall@brichall_·29 Mar

Audit logs tell you what your agent did after the fact. Real execution control stops it from happening in the first place. Most teams building agents right now are only solving one of those problems.
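The difference between logging a tool call and blocking it can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Faramesh's API: `ExecutionGuard`, `ToolCallDenied`, and the two toy tools are hypothetical names invented for this example. The guard records every attempt (the audit half) and refuses anything outside an allowlist before the tool function is ever invoked (the control half).

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class ToolCallDenied(Exception):
    """Raised when the policy blocks a tool call before it runs."""


@dataclass
class ExecutionGuard:
    # Hypothetical policy: a plain allowlist of tool names.
    allowed_tools: set[str]
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(self, tool_name: str, tool: Callable[..., Any], **kwargs: Any) -> Any:
        allowed = tool_name in self.allowed_tools
        # Audit: record the attempt whether or not it is allowed.
        self.audit_log.append({"tool": tool_name, "args": kwargs, "allowed": allowed})
        # Control: refuse before the tool function is invoked.
        if not allowed:
            raise ToolCallDenied(f"{tool_name} is not permitted by policy")
        return tool(**kwargs)


# Toy tools standing in for real agent actions (illustrative only).
def read_rows(table: str) -> str:
    return f"rows from {table}"


def drop_database(name: str) -> str:
    return f"dropped {name}"


guard = ExecutionGuard(allowed_tools={"read_rows"})
result = guard.call("read_rows", read_rows, table="orders")

try:
    guard.call("drop_database", drop_database, name="prod")
    blocked = False
except ToolCallDenied:
    blocked = True
```

Here the denied call still shows up in `guard.audit_log`, but `drop_database` never executes, whatever the model's prompt said.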

Brian Hall retweeted

Amjad Fatmi@amjadfatmi_·27 Mar

👀 What if agent governance wasn’t buried in CLI commands
But something you could see, trace, and control live with your mouse before execution
- sessions
- Agent identity
- credential vaults
- delegation chains
- approvals
- policies ..

Brian Hall@brichall_·22 Mar

@ujjwalscript We're building the solution at faramesh.dev. The amount of people using .md files and prompt strings and calling it "governance" is alarming. A very clear signal there needed to be a solution

Ujjwal Chadha@ujjwalscript·22 Mar

Your AI Agent is mathematically guaranteed to FAIL. This is the dirty secret the industry is hiding in 2026. Everyone on your timeline is currently bragging about their "Multi-Agent Swarms." Founders are acting like chaining five AI agents together is going to replace their entire engineering team overnight. Here is the reality check: It’s a mathematical illusion. Let’s look at the actual numbers. Say you have a state-of-the-art AI agent with an incredible 85% accuracy rate per action. In a vacuum, that sounds amazing. But an "autonomous" workflow isn't one action. It’s a chain. Read the ticket ➡️ Query the DB ➡️ Write the code ➡️ Run the test ➡️ Commit. Let's do the math on a 10-step process: $0.85^{10} = 0.19$ Your "revolutionary" autonomous system has a 19% success rate. And the real-world data proves it. Recent studies out of CMU this year show that the top frontier models are failing at over 70% of real-world, multi-step office tasks. We are officially in the era of "Agent Washing." Startups are rebranding complex, buggy software as "autonomous agents" to look cool, but they are ignoring the scariest part: AI fails silently. When traditional code breaks, it crashes and throws a stack trace. When an AI agent breaks, it doesn't crash. It just confidently hallucinates a fake database entry, sidesteps a broken API by faking the response, and keeps running—corrupting your data for weeks before you notice. If your "automated" system requires a senior engineer to spend three hours digging through prompt logs to figure out why the bot made a "creative decision," you didn't save any time. You just invented a highly expensive, unpredictable form of technical debt. Stop trying to build fully autonomous swarms to replace human judgment. Start building deterministic guardrails where AI is the engine, but the engineer holds the steering wheel
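The compounding arithmetic in the post above can be checked with a short Python sketch. It assumes, as the post implicitly does, that each step succeeds independently with the same probability; `chain_success_rate` is a name made up for this illustration. Under that assumption a 10-step chain at 85% per step succeeds about 19.7% of the time, which the post writes as 19%.

```python
def chain_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


# The example from the post: 85% accuracy per action, 10 chained actions.
rate = chain_success_rate(0.85, 10)
print(f"{rate:.3f}")  # prints 0.197
```

Real workflows rarely have fully independent steps, so this is best read as a rough illustration of how per-step errors compound, not a measured success rate.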

Brian Hall@brichall_·22 Mar

Jensen said it at GTC: "Agentic systems can access sensitive information, execute code, and communicate externally. Obviously, this can't possibly be allowed." We built the governor. What's the scariest thing you've seen an agent do unsupervised? github.com/faramesh/faram…

Brian Hall@brichall_·22 Mar

13 frameworks supported. No changes to your agent code. Install in one line via brew, curl, npx, or go. Policy language also shipped this week: github.com/faramesh/fpl-l…

Brian Hall@brichall_·22 Mar

Jensen Huang just told GTC that AI agent governance is the most critical problem in enterprise tech. NVIDIA's answer requires hardware, 5 security partners, and an enterprise procurement cycle. Ours is one command. $ faramesh run agent.py 🧵

Brian Hall@brichall_·19 Mar

@askalphaxiv Real time arxiv search as a tool call is going to change how people build research agents. No more stale context

alphaXiv@askalphaxiv·17 Mar

Introducing MCP for arXiv Let your research agents stand on the shoulders of giants Fast multi-turn retrieval, keyword search, and embedding search tools across millions of arXiv papers 🚀

Brian Hall@brichall_·19 Mar

@_akhaliq The fact that it monitors your calendar and keyboard inactivity to find windows to fine-tune itself is crazy

AK@_akhaliq·19 Mar

MetaClaw Just Talk An Agent That Meta-Learns and Evolves in the Wild paper: huggingface.co/papers/2603.17…

Brian Hall@brichall_·19 Mar

Open runtime is a huge unlock. But the environment is only half of it. You still need something deciding what the agent is actually allowed to do before the tool call executes, regardless of model or harness.

Harrison Chase@hwchase17

Open Models, Open Runtime, Open Harness - Building your own AI agent with LangChain and Nvidia Claude Code, OpenClaw, Manus and other agents all use the same architecture under the hood. They consist of a model, a runtime (environment), and a harness. In this video, we show how to create a completely open version of this: Open Models: Nemotron 3 Super Open Runtime: Nvidia's new OpenShell Open Harness: DeepAgents Video: youtu.be/BEYEWw1Mkmw Links: OpenShell DeepAgent: github.com/langchain-ai/o… Deep Agents: github.com/langchain-ai/d… OpenShell: github.com/NVIDIA/OpenShe…
