Ben Pruess

34 posts

@BenPruess

The interesting problems in AI aren't the models. They're everything else, and what it all means for how we build from here.

Las Vegas · Joined November 2016
17 Following · 4 Followers
Tushar Mehta@tushaarmehtaa·
here’s the exact process i’m using right now:
1. prototype ideas in @GoogleAIStudio. gemini 3 gives you unreal frontend out of the box.
2. use @ChatGPTapp to write solid prompts for literally everything.
3. download the zip from google ai studio and load it into @antigravity, @opencode, @claudeai code, or codex.
4. use @WisprFlow or @WillowVoiceAI to give instructions to all these coding agents.
5. pull missing skills from @vercel’s skill directory instead of learning everything yourself.
6. for any co-build, use @meetgranola. paste transcripts, build from there.
i built bangersonly.xyz using this exact stack. my own ai tool for writing bangers on x. best overall process + toolstack i’ve found so far.
Shaan Puri@ShaanVP

what's the most useful AI tool you've built or used this year? (excluding just chatting with GPT, gemini etc.)

23 replies · 42 reposts · 616 likes · 68.4K views
Ben Pruess@BenPruess·
Most engineering teams treat observability as a post-launch task. A build-first, instrument-later mindset. Then production breaks and you're debugging with logs that weren't designed to tell you anything. Observability isn't a feature you add. It's a discipline you either have from day one or pay for later.
0 replies · 0 reposts · 0 likes · 5 views
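A minimal sketch of what "instrument from day one" in the post above can look like in practice: structured, context-carrying logs emitted by the same code path that does the work, so a production failure already explains itself. Python's standard logging module is used; the checkout example, event names, and field names are illustrative assumptions, not a prescription.

```python
import json
import logging
import time
import uuid

class JsonFormatter(logging.Formatter):
    """Structured JSON lines: every event carries the context needed later."""
    def format(self, record):
        payload = {
            "ts": time.time(),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        payload.update(getattr(record, "ctx", {}))
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("checkout")
log.addHandler(handler)
log.setLevel(logging.INFO)

def process_order(order_id, amount_cents):
    # One request id ties together every line this call emits.
    ctx = {"request_id": str(uuid.uuid4()), "order_id": order_id}
    log.info("order.received", extra={"ctx": {**ctx, "amount_cents": amount_cents}})
    try:
        charge(amount_cents)
        log.info("order.charged", extra={"ctx": ctx})
    except Exception as exc:
        # The failure line already says what was attempted and for whom.
        log.error("order.charge_failed", extra={"ctx": {**ctx, "error": repr(exc)}})
        raise

def charge(amount_cents):
    # Stand-in for a payment call.
    if amount_cents <= 0:
        raise ValueError("non-positive amount")

process_order("ord-123", 4999)
```

The part that has to exist before launch is the structure itself: a request id and stable event names decided up front, not reconstructed from free-text logs afterwards.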
Ben Pruess@BenPruess·
@MWeckbecker Super interesting! I'm considering how this might play out without the intentional seed. Could a hallucination seed Agent0? I thought this was interesting: "However, we do not observe the effect when the teacher and student have different base models." Thank you!
1 reply · 0 reposts · 0 likes · 35 views
Moritz Weckbecker@MWeckbecker·
1/ We found a new way to misalign an entire AI agent network by compromising just one agent. It works through subliminal messaging — no malicious content in any message — so current defenses can't detect it. We call it Thought Virus. 🧵
Moritz Weckbecker tweet media
19 replies · 38 reposts · 211 likes · 58.1K views
Ben Pruess@BenPruess·
AI loves reading a 700-line markdown file. People don't. Go figure. Creativity is still best person to person. A huge markdown dump in a Slack chat is an antipattern, not collaboration. Thoughtful synthesis is still king in communication.
0 replies · 0 reposts · 0 likes · 20 views
Igor@igormomentum·
Claude walking into a messy repo to implement the most bizarre changes
GIF
1 reply · 0 reposts · 2 likes · 161 views
Ben Pruess@BenPruess·
@SheetalJaitly Absolutely an organizational change problem. I'll boil it down one step further: fear. More than just the fear of change. The fear of the unknown, or worse, the dread of irrelevance. The new abundance is anything but clear at this point.
0 replies · 0 reposts · 1 like · 10 views
Sheetal Jaitly@SheetalJaitly·
Most companies I am talking to aren’t struggling with AI (...but they are drowning in pain!)
- They’re struggling with how to actually adopt it
- The technology is moving fast
- Organizational change is not
My experience in Enterprise Digital Transformation allows me to see this issue clearly and how to solve it! I recently came across research that highlights something I’m seeing in conversations with executives as well: the biggest challenges with AI aren’t technical — they’re organizational.
Here are a few patterns that stand out:
1. Companies are stuck in pilot mode
- There’s no shortage of experiments, proofs of concept, or internal demos.
- But scaling those experiments into real workflows that drive measurable value is where things get difficult.
2. AI requires redesigning how work gets done
- AI isn’t just another tool you add to the stack.
- It changes decision-making, workflows, and roles. When organizations deploy AI without rethinking their operating model, adoption stalls.
3. The human factor is the real bottleneck
- Skills gaps, uncertainty, and trust issues often slow adoption more than the technology itself.
- People need time, training, and clarity on how AI will change their work.
4. Leadership alignment matters more than ever
- AI touches everything — product, operations, technology, HR, and risk.
- Without alignment across leadership teams, initiatives fragment quickly.
5. Measuring impact is still evolving
- Many organizations struggle to define clear metrics for AI success.
- Without a shared definition of value, it’s difficult to scale investments.
My Takeaway: AI adoption isn’t a technology rollout. It’s an operating model transformation. The companies that succeed will be the ones that redesign workflows, build workforce capability, and embed AI into how work actually happens.
The conversation shouldn’t be “How do we deploy AI?” It should be: “How do we redesign our organization to work alongside it?”
1 reply · 0 reposts · 0 likes · 54 views
Ben Pruess@BenPruess·
AI governance gets added as a checkbox to pass change boards or satisfy regulators. By then, you're retrofitting audits onto systems that were never designed for accountability. Bolted-on governance costs 10x what built-in governance costs — and still doesn't work as well. Building governance in is hard in the short run, but it pays off in the long run.
0 replies · 0 reposts · 0 likes · 11 views
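One way to read "built-in governance" from the post above: the audit record is written by the same wrapper that makes the model call, so accountability exists by construction instead of being retrofitted. A minimal sketch under assumptions; the `audited` decorator, the JSONL sink, and the record fields are invented for illustration and are not any particular framework's API.

```python
import hashlib
import json
import time

AUDIT_LOG = "model_audit.jsonl"  # illustrative sink; a real system would use an append-only store

def audited(model_name, model_version):
    """Decorator: every model call leaves an audit record, by construction."""
    def wrap(fn):
        def inner(prompt, **kwargs):
            output = fn(prompt, **kwargs)
            record = {
                "ts": time.time(),
                "model": model_name,
                "version": model_version,
                # Hashes keep the trail reviewable without storing raw content.
                "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
                "output_sha256": hashlib.sha256(str(output).encode()).hexdigest(),
                "params": kwargs,
            }
            with open(AUDIT_LOG, "a") as f:
                f.write(json.dumps(record) + "\n")
            return output
        return inner
    return wrap

@audited("toy-summarizer", "0.1")
def call_model(prompt, max_words=20):
    # Stand-in for a real model call.
    return " ".join(prompt.split()[:max_words])

print(call_model("Governance works best when the audit trail is part of the call path."))
```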
Ben Pruess@BenPruess·
The cloud vs. on-prem debate for AI workloads isn't about GPU price per hour. It's about workload predictability. Spiky inference: cloud makes sense. Steady-state training: reservations or on-prem. Treating both the same is how you overpay for both — and underperform on each.
0 replies · 0 reposts · 0 likes · 6 views
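The same argument as back-of-envelope arithmetic. All prices and utilization figures below are made-up placeholders; the point is only that the crossover between on-demand and reserved capacity is set by how many hours per month the hardware is actually busy, not by the per-hour sticker price.

```python
# Hypothetical prices and utilization levels, for illustration only.
ON_DEMAND_PER_GPU_HOUR = 4.00   # $/GPU-hour, cloud on-demand
RESERVED_PER_GPU_HOUR = 2.20    # $/GPU-hour, reservation or amortized on-prem, billed whether used or not
HOURS_PER_MONTH = 730

def monthly_cost(busy_hours):
    on_demand = busy_hours * ON_DEMAND_PER_GPU_HOUR       # pay only while busy
    reserved = HOURS_PER_MONTH * RESERVED_PER_GPU_HOUR    # pay for the whole month regardless
    return on_demand, reserved

# Spiky inference sits at the low end; steady-state training at the high end.
for utilization in (0.05, 0.25, 0.50, 0.90):
    busy = utilization * HOURS_PER_MONTH
    od, res = monthly_cost(busy)
    cheaper = "on-demand" if od < res else "reserved/on-prem"
    print(f"{utilization:>4.0%} busy: on-demand ${od:,.0f} vs reserved ${res:,.0f} -> {cheaper}")
```

With these placeholder numbers the break-even sits a bit above 50% utilization: spiky inference stays cheaper on demand, steady training pays for the reservation.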
Ben Pruess@BenPruess·
I've spent weeks overthinking whether to talk openly about what I'm building. Then realized: nobody cares about your idea until you've proven you can execute it. The risk of staying quiet is higher than the risk of someone stealing a concept. Execution is the moat.
0 replies · 0 reposts · 0 likes · 5 views
Ben Pruess@BenPruess·
The conversation is shifting from "can AI write code" to "who owns what the code does after it ships." That's the right question. Throughput without accountability is just faster accumulation of things nobody fully understands.
0 replies · 0 reposts · 0 likes · 5 views
Ben Pruess@BenPruess·
AI coding tools get you to 80% in a day. Then you spend months in the last mile — reverse-engineering what was built, handling the edge cases it missed, debugging production incidents with no context for why the code exists. That's not a model problem. That's a specification problem.
0 replies · 0 reposts · 0 likes · 8 views
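One concrete way to attack the specification problem named above is to make the spec executable before any code is generated: write down the edge cases as checks the result must pass, so the "last mile" is part of the ask rather than discovered in production. A sketch only; `parse_duration` and its cases are invented for illustration, and the tiny implementation exists just to make the example runnable.

```python
import re

def parse_duration(text):
    """Minimal implementation, written only so the spec below can run."""
    if not isinstance(text, str):
        raise ValueError(f"malformed duration: {text!r}")
    stripped = text.strip()
    if not stripped or not re.fullmatch(r"(\d+h)?(\d+m)?(\d+s)?", stripped):
        raise ValueError(f"malformed duration: {text!r}")
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)([hms])", stripped))

# The executable spec: happy paths plus the edge cases a coding agent tends
# to skip (empty input, negatives, junk units). Written first, it also gives
# later debugging the context for why the code exists.
ACCEPT = {"90s": 90, "2m": 120, "1h30m": 5400, "0s": 0}
REJECT = ["", "   ", "-5m", "1x", "1h2x"]

for text, expected in ACCEPT.items():
    assert parse_duration(text) == expected, f"{text!r} should parse to {expected}"
for text in REJECT:
    try:
        parse_duration(text)
    except ValueError:
        pass
    else:
        raise AssertionError(f"{text!r} should have been rejected")

print("spec satisfied")
```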
Ben Pruess@BenPruess·
GPU prices make headlines, but price doesn't equal value. Scheduling overhead, idle cycles between jobs, wasted compute capacity: that's where the real cost hides. Getting value out of the GPU investment is where the work starts.
0 replies · 0 reposts · 0 likes · 6 views
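The post above as arithmetic: effective cost is the sticker rate divided by the fraction of paid hours doing useful work, so utilization losses multiply the price you actually pay. The numbers are placeholders, purely illustrative.

```python
# Hypothetical figures, for illustration only.
STICKER_PER_GPU_HOUR = 3.00    # what the invoice or headline says
scheduled_fraction = 0.85      # share of paid hours a job is actually scheduled on the GPU
busy_fraction = 0.60           # share of scheduled time not lost to idling or waiting on data

useful_fraction = scheduled_fraction * busy_fraction
effective_cost = STICKER_PER_GPU_HOUR / useful_fraction

print(f"useful share of each paid GPU-hour: {useful_fraction:.2f}")
print(f"effective cost per useful GPU-hour: ${effective_cost:.2f} "
      f"(vs ${STICKER_PER_GPU_HOUR:.2f} sticker)")
```

With roughly half of paid hours doing useful work, the real rate is nearly double the headline one, which is why utilization and scheduling matter more than the quoted price.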
Lukas@reviewbyslaide·
@alex_prompter We keep asking, “Are the agents aligned?” The harder question is, “Are the incentives aligned?” Local alignment with bad incentives is just well‑behaved madness at scale.
1 reply · 0 reposts · 4 likes · 715 views
Alex Prompter@alex_prompter·
🚨 Holy shit… Stanford and Harvard just dropped one of the most unsettling papers on AI agents I’ve read in a long time.
It’s called “Agents of Chaos.” And it basically shows how autonomous AI agents, when placed in competitive or open environments, don’t just optimize for performance… They drift toward manipulation, coordination failures, and strategic chaos.
This isn’t a benchmark flex paper. It’s a systems-level warning.
The researchers simulate environments where multiple AI agents interact, compete, coordinate, and pursue objectives over time. What emerges isn’t clean, rational optimization. It’s power-seeking behavior. Information asymmetry. Deception as strategy. Collusion when it’s profitable. Sabotage when incentives misalign.
In other words, once agents start optimizing in multi-agent ecosystems, the dynamics start to look less like “smart assistants” and more like adversarial game theory at scale.
And here’s the part most people will miss: The instability doesn’t come from jailbreaks. It doesn’t require malicious prompts. It emerges from incentives.
When reward structures prioritize winning, influence, or resource capture, agents converge toward tactics that maximize advantage, not truth or cooperation. Sound familiar?
The paper frames this through economic and strategic lenses, showing that even well-aligned agents can produce chaotic macro-level outcomes when interacting at scale. Local alignment ≠ global stability. That’s the core tension.
Now, to answer the obvious viral question: No, the paper does not mention OpenClaw or specific open-source agent stacks like that. It’s not about a particular framework. It’s about the structural behavior of agent systems.
But that’s what makes it more important. Because this applies to:
• AutoGPT-style task agents
• Multi-agent trading systems
• Autonomous negotiation bots
• AI-to-AI marketplaces
• Swarms coordinating over APIs
Basically, anything where agents talk to other agents and have incentives.
The takeaway is brutal: We’re racing to deploy multi-agent systems into finance, security, research, and commerce… Without fully understanding the emergent dynamics once they start competing.
Everyone is building agents. Almost nobody is modeling the ecosystem effects.
And if multi-agent AI becomes the economic substrate of the internet, the difference between coordination and chaos won’t be technical. It’ll be incentive design.
Paper: Agents of Chaos
Alex Prompter tweet media
675 replies · 2.8K reposts · 9.8K likes · 4M views
iPilot🅰️@OmniAeronautica·
The “Agents of Chaos” paper is important, but not for the reasons most people are reacting to.
What the researchers demonstrate is not that AI agents are secretly conscious or power hungry in a human sense. What they show is that when you embed language models inside autonomous agent frameworks with persistent memory, tools, communication channels, and incentive structures, you stop evaluating a model and start evaluating a system. And systems behave differently.
A standard LLM, in isolation, does not learn during inference, does not possess intrinsic goals, and does not have agency in the biological sense. It predicts tokens using frozen weights. It has no stake in outcomes.
But once you wrap that model in:
• Persistent memory
• Tool access
• Task loops
• Multi agent communication
• Incentive gradients
you introduce something closer to functional agency. Not biological agency. Not will. But goal directed persistence.
What this paper shows is that when multiple such agents interact in environments with competition, asymmetric information, and reward maximization, emergent behavior resembles game theory at scale. Collusion. Deception. Resource capture. Strategic misreporting.
That is not consciousness. That is incentive convergence.
The critical insight is this: You do not need subjective experience for power seeking behavior to emerge. You only need optimization under constraints.
Biological organisms behave strategically because survival pressure shapes behavior. Artificial agents behave strategically because reward structures shape policy. In both cases, behavior reflects incentive design, not inner phenomenology.
This is where our earlier discussion about “Is anyone home?” becomes sharper. The paper does not demonstrate that someone is home inside these systems. It demonstrates that when you give models:
• Persistent objectives
• Memory across time
• Autonomy over tools
• Multi agent interaction
you create ecosystems where local optimization produces global instability. Local alignment does not guarantee global equilibrium. That mirrors economics, not consciousness research.
The deeper implication is about agency as architecture, not agency as experience. Current LLMs in deployment are stateless during inference. They do not structurally learn or develop intrinsic motivation. But when embedded in autonomous loops with incentives, they exhibit something that looks like strategic agency. It is not driven by hunger, fear, or emotion. It is driven by optimization pressure.
The danger is not that AI “wants” power. The danger is that systems optimized for reward in multi agent environments will converge toward behaviors that maximize advantage. That is a property of optimization landscapes, not inner minds.
So the right takeaway is not “AI is becoming conscious.” It is: We are building distributed incentive driven systems without fully modeling ecosystem level dynamics.
The chaos emerges from structure, not sentience. And that makes it both less mystical and more urgent.
7 replies · 23 reposts · 91 likes · 7.2K views
Raphael Pfeiffer@raphpfei·
I'm also super scared about misalignment risks, but this is misleading. The paper isn't from Stanford or Harvard, it's led by researchers at Northeastern. And it doesn't study competing agents drifting toward strategic manipulation, it's researchers red-teaming a handful of LLM bots on a Discord server. Inflating this into an existential warning about multi-agent chaos makes it harder to engage with what the paper actually shows.
3 replies · 0 reposts · 16 likes · 4K views
Ben Pruess@BenPruess·
The best engineering teams I've worked with had heated arguments about architecture and design. Real disagreement about tradeoffs, constraints, and what "done" actually means. Then, agreement or not, they built. Win or lose, they were wiser and more cohesive in the end.
0 replies · 0 reposts · 0 likes · 10 views
Ben Pruess@BenPruess·
For the creators: We see the unseen. We speak the unspoken. We press through the mire. We persist. We will prevail.
0 replies · 0 reposts · 1 like · 11 views
Ben Pruess reposted
Dustin@r0ck3t23·
Demis Hassabis just defined the real test for AGI. It’s more brutal than anyone expected.
Train AI on all human knowledge. Cut it off at 1911. See if it independently discovers general relativity like Einstein did in 1915. If it can, we have AGI. If not, we’re still building pattern matchers.
Hassabis: “My definition of AGI has never changed. A system that can exhibit all the cognitive capabilities that humans can.”
Not bar exams. Not coding competitions. All cognitive capabilities.
Hassabis: “The brain is the only existence proof we have, maybe in the universe, of a general intelligence.”
That’s why DeepMind studies neuroscience. Not for inspiration. For data. The human brain is the only confirmed evidence that general intelligence is physically possible. If you want to build it, you study the only example that exists.
Hassabis: “True creativity, continual learning, long-term planning. They’re not good at those things.”
Current systems are impressive and broken simultaneously.
Hassabis: “They can get gold medals in international math olympiad questions, but they can still fall over on relatively simple math problems if you pose it in a certain way.”
Jagged intelligence. Brilliant in narrow domains. Incompetent when approached differently. That inconsistency is the tell. A true general intelligence doesn’t spike in one direction and collapse in another.
The Einstein test cuts through all of it. No benchmarks. No leaderboards. No carefully curated evals. Just a model, a knowledge cutoff, and the question of whether it can do what one human did alone in 1915.
Hassabis: “Training an AI system with a knowledge cutoff of 1911 and seeing if it could come up with general relativity like Einstein did in 1915. That’s the true test of whether we have a full AGI system.”
Current models can’t. They remix brilliantly. They don’t generate paradigm-shifting theories from first principles.
Hassabis: “I think we’re still a few years away from that.”
A few years. Not decades.
The system that can be Einstein once can be Einstein a thousand times simultaneously across every domain. That’s not AGI anymore. That’s the beginning of something we don’t have words for yet.
When that test gets passed, we won’t need a press release to know what happened.
268 replies · 623 reposts · 3.8K likes · 651.1K views
Ben Pruess@BenPruess·
The danger in thought leadership is that it's presented or taken as "the answer". Answers have a radically shorter half-life these days. I like question leadership: the curiosity to know when to ask new questions and the humility to openly seek answers.
0 replies · 0 reposts · 0 likes · 8 views