Aron van Ammers ⛺⛰🏔

7.4K posts


Aron van Ammers ⛺⛰🏔

@aronvanammers

Building and growing token ecosystems as CTO of @outlierventures | #bitcoin #blockchain #decentralisation | Tech geek | Triathlete | Take a deep breath

Amsterdam, Netherlands · Joined January 2009
501 Following · 2.6K Followers
Aron van Ammers ⛺⛰🏔 @aronvanammers
I'm genuinely not sure the Post-Web era will have an "App Store moment." AI keeps pushing from shared to personal, from products for millions to solutions for one. My best guess at what stays valuable: shipping fast, continuous validation, trusted curation, deep domain expertise.
0 replies · 0 reposts · 0 likes · 41 views
Aron van Ammers ⛺⛰🏔 @aronvanammers
One pattern I keep noticing: the real play isn't selling skills, it's giving them away. Stripe, Cloudflare, and AWS all ship free MCP servers so agents default to their platforms. Solidity auditing firms publish free skills so you think of them when you need the real audit. The skill is the handshake. The business is what comes after.
1 reply · 0 reposts · 2 likes · 82 views
Aron van Ammers ⛺⛰🏔 @aronvanammers
Every platform era found its monetizable unit. The web had SaaS. Mobile had apps. So what is it for the age of agents? Agent skills feel like the obvious candidate. But they're just text. People fork them, tweak them, forget about them. Nobody uses them as-is for long.
1 reply · 0 reposts · 3 likes · 1.1K views
Aron van Ammers ⛺⛰🏔 reposted
JB @jamie247
Just spent $500 in compute vibe-coding my own Civilisation RPG, but with unbounded natural-language diplomacy. Meet Uncivilised. Ask me anything.
225 replies · 120 reposts · 2.9K likes · 318K views
Aron van Ammers ⛺⛰🏔 reposted
JB @jamie247
AI agents are already reverse-engineering institutions from their public surfaces. The question is not whether your institution will be decomposed into machine-readable primitives. It is whether you control the terms. Pathways to the Post Web ~ ebook 'A transition framework for startups and the firm'. outlierventures.io/publications/p…
2 replies · 4 reposts · 15 likes · 461 views
Aron van Ammers ⛺⛰🏔 @aronvanammers
MCP or API? Or both? If the AI is the application, use MCP for dynamic reasoning, accepting that every call costs LLM tokens and adds latency. If the AI builds the application, use direct APIs for token-free, low-latency execution.
3 replies · 2 reposts · 7 likes · 987 views
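The decision rule in the tweet above can be sketched in a few lines. This is a toy illustration, not a real MCP client: `weather_api`, `weather_via_agent`, and `fake_llm` are all hypothetical names standing in for a plain REST endpoint, an agent tool-call loop, and a model call respectively.

```python
def weather_api(city: str) -> str:
    # Direct API path: deterministic, no tokens spent, no model round-trip.
    # Stand-in for any plain REST call an AI-built app would hard-code.
    return f"18°C in {city}"

def weather_via_agent(city: str, llm) -> str:
    # MCP-style path: the model decides which tool to call at runtime.
    # Flexible, but every request costs LLM tokens and adds latency.
    tool_call = llm(f"weather in {city}")          # model picks the tool
    return weather_api(tool_call["args"]["city"])  # then the tool runs

# Stub model that "reasons" the city out of the prompt (illustrative only).
fake_llm = lambda prompt: {"tool": "weather_api", "args": {"city": prompt.split()[-1]}}

print(weather_via_agent("Amsterdam", fake_llm))  # AI *is* the application
print(weather_api("Amsterdam"))                  # AI *built* the application
```

Same result either way; the difference is who sits in the loop at request time, and what each request costs.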
Aron van Ammers ⛺⛰🏔 reposted
JB @jamie247
Startup equity can't coordinate fluid contributors, agents, or global problem stacks. Tokens were the first mechanism to financially incentivise open-source collaboration at scale, but without design constraints on speculation, the incentive inverted. We need new, open and permissionless market mechanisms native to the internet that optimise for conviction at machine speed. Not zero-sum, but one where founders, contributors, and agents, collaborating and competing on the same problem stack, each earn proportionate to risk and contribution. I summarise a recent paper we published today on the potential of conviction markets. substack.com/@jamie247/note…
1 reply · 2 reposts · 11 likes · 412 views
Aron van Ammers ⛺⛰🏔 reposted
JB @jamie247
Crypto was for too long based on promises, speculation, opacity & misalignment. Let's fix that! If you, and your agents, want to play, we've got 💰 to seed protocols, primitives & mechanisms for conviction-based systems.
0 replies · 1 repost · 5 likes · 198 views
Aron van Ammers ⛺⛰🏔 reposted
JB @jamie247
This is the future of crypto: the coordination layer for decentralised innovation. But more than that, Zero to One is 💀. We need Zero to Many: hybrid teams of people & agents, competing and collaborating to solve valuable problems. Systems, not 𝖲̶𝗍̶𝖺̶𝗋̶𝗍̶𝗎̶𝗉̶𝗌̶
2 replies · 1 repost · 5 likes · 343 views
Aron van Ammers ⛺⛰🏔 reposted
JB @jamie247
More than prediction, we need conviction. We're funding primitives for a new era of Conviction Markets: machine-executable systems to price problems, align founders & agents with capital, to produce verifiable outcomes. Where everyone earns a stake, proportionate to risk & contribution. Read the paper convictionmarkets.io
6 replies · 7 reposts · 24 likes · 3.6K views
Aron van Ammers ⛺⛰🏔 reposted
Dimitrios Chatzianagnostou ⛺️
4/ We're seeding the Conviction Markets protocol and its first primitives. If you have a problem worth solving, capital to back it, or the skills to build it, we want to hear from you. You and your agents are welcome. convictionmarkets.io
0 replies · 2 reposts · 4 likes · 194 views
Aron van Ammers ⛺⛰🏔 reposted
Dimitrios Chatzianagnostou ⛺️
3/ That's what Conviction Markets are. Capital committed to problems, not teams. Ownership earned by whoever builds toward the solution. You, your agents, or both. Weighted by how early you showed up and how long you stayed. The problem outlives any founder. Any hype cycle.
1 reply · 3 reposts · 8 likes · 701 views
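The weighting idea in the tweet above — ownership weighted by how early you showed up and how long you stayed — can be made concrete with a toy model. Everything here is my own illustrative assumption (function names, the exponential earliness bonus, the unit-less time scale), not the actual Conviction Markets protocol.

```python
def conviction_weight(joined_at: int, left_at: int, now: int, halflife: float = 10.0) -> float:
    """Toy weight: longer tenure and earlier entry both increase it."""
    tenure = max(min(left_at, now) - joined_at, 0)  # how long they stayed
    earliness = 0.5 ** (joined_at / halflife)       # earlier join → higher multiplier
    return tenure * earliness

def ownership_shares(contributors: dict, now: int) -> dict:
    """Normalise raw weights into fractional ownership of the problem."""
    weights = {name: conviction_weight(j, l, now)
               for name, (j, l) in contributors.items()}
    total = sum(weights.values()) or 1.0
    return {name: w / total for name, w in weights.items()}

# alice joined at t=0 and stayed 30 ticks; an agent joined at t=20.
shares = ownership_shares({"alice": (0, 30), "bot7": (20, 30)}, now=30)
print(shares)
```

With these numbers alice ends up with the large majority of the stake: she both arrived earlier and stayed longer, which is exactly the ordering the tweet describes.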
Aron van Ammers ⛺⛰🏔 reposted
Dimitrios Chatzianagnostou ⛺️
2/ Zero to One is over. A solo founder with agents can build 10 things in parallel today. The startup as a unit of organisation can't hold that. We need systems, not startups. Coordination infrastructure for how building actually works in the Zero to Many era.
1 reply · 2 reposts · 5 likes · 173 views
Aron van Ammers ⛺⛰🏔 reposted
Dimitrios Chatzianagnostou ⛺️
1/ Crypto was built on promises, speculation, and opacity. Capital went to the best narrative. Builders got misaligned. Speculators won. We kept building instruments for betting on outcomes. When what we actually needed were instruments for funding work toward them.
1 reply · 2 reposts · 2 likes · 121 views
Aron van Ammers ⛺⛰🏔 reposted
Dimitrios Chatzianagnostou ⛺️
We've spent two years on a simple question: why can't capital coordinate around problems the way markets coordinate around prices? Today we have an answer. 🧵
1 reply · 2 reposts · 7 likes · 141 views
Aron van Ammers ⛺⛰🏔 reposted
Outlier Ventures @oviohq
Conviction Markets are the new paradigm where humans and agents work together, not just to speculate but to fund, build, and profit. Sign up for early access at convictionmarkets.io and learn more about this new primitive from @jamie247
JB @jamie247

More than prediction, we need conviction. We're funding primitives for a new era of Conviction Markets: machine-executable systems to price problems, align founders & agents with capital, to produce verifiable outcomes. Where everyone earns a stake, proportionate to risk & contribution. Read the paper convictionmarkets.io

2 replies · 1 repost · 16 likes · 2.9K views
Aron van Ammers ⛺⛰🏔 @aronvanammers
GPT-5.4 is available in Cursor and I've been using it for some dev and doc tasks since yesterday afternoon (thinking, medium, 1M context window). Not excited by what I've seen so far.

For dev tasks it's very smart, of course, but at times it also got sassy and a bit unhelpful. Claude Opus 4.6 wants to help me solve my problems with all its might, and usually it can. OpenAI GPT-5.4 at times was explaining why things are already good as they are and I'd probably misunderstood them (I hadn't).

For writing prose (docs, skills, whatnot) it was worse: very staccato nested bullet lists. Opus 4.6 gives me much more readable documents while staying concise.
1 reply · 1 repost · 4 likes · 495 views
Aron van Ammers ⛺⛰🏔 reposted
JB @jamie247
In the end we will all want an LLM-like Universal Interface for all aspects of our increasingly agentified lives (tm @aronvanammers)
2 replies · 1 repost · 7 likes · 329 views
Aron van Ammers ⛺⛰🏔 @aronvanammers
Don't watch your AI agents work. It's mesmerizing, but it's not helping you to get more done.
3 replies · 2 reposts · 4 likes · 278 views
Aron van Ammers ⛺⛰🏔 @aronvanammers
My takeaways: models have gotten a lot better at reasoning, especially formal reasoning. Frontier models do still fail on some of the exact examples, and in the failure style, outlined in the paper: reasoning just enough to sound convincing, but not enough to be reliable. Verify 🧐!
0 replies · 1 repost · 3 likes · 286 views
Aron van Ammers ⛺⛰🏔 @aronvanammers
Caveat: I used all 3 models through @NotionHQ AI, which likely adds some unknown system prompts. Not a 100% pure test.
1 reply · 0 reposts · 3 likes · 62 views
Aron van Ammers ⛺⛰🏔 @aronvanammers
This research used models that are by now outdated. Inevitable consequence of the research / peer review / publish cycle I imagine. I tested some of the examples with some frontier models. They perform much better on formal, clear reasoning puzzles, including Theory of Mind, but some still fail 👇
God of Prompt@godofprompt

🚨 Holy shit… Stanford just published the most uncomfortable paper on LLM reasoning I've read in a long time.

This isn't a flashy new model or a leaderboard win. It's a systematic teardown of how and why large language models keep failing at reasoning even when benchmarks say they're doing great.

The paper does one very smart thing upfront: it introduces a clean taxonomy instead of more anecdotes. The authors split reasoning into non-embodied and embodied. Non-embodied reasoning is what most benchmarks test and it's further divided into informal reasoning (intuition, social judgment, commonsense heuristics) and formal reasoning (logic, math, code, symbolic manipulation). Embodied reasoning is where models must reason about the physical world, space, causality, and action under real constraints.

Across all three, the same failure patterns keep showing up.

> First are fundamental failures baked into current architectures. Models generate answers that look coherent but collapse under light logical pressure. They shortcut, pattern-match, or hallucinate steps instead of executing a consistent reasoning process.

> Second are application-specific failures. A model that looks strong on math benchmarks can quietly fall apart in scientific reasoning, planning, or multi-step decision making. Performance does not transfer nearly as well as leaderboards imply.

> Third are robustness failures. Tiny changes in wording, ordering, or context can flip an answer entirely. The reasoning wasn't stable to begin with; it just happened to work for that phrasing.

One of the most disturbing findings is how often models produce unfaithful reasoning. They give the correct final answer while providing explanations that are logically wrong, incomplete, or fabricated. This is worse than being wrong, because it trains users to trust explanations that don't correspond to the actual decision process.

Embodied reasoning is where things really fall apart. LLMs systematically fail at physical commonsense, spatial reasoning, and basic physics because they have no grounded experience. Even in text-only settings, as soon as a task implicitly depends on real-world dynamics, failures become predictable and repeatable.

The authors don't just criticize. They outline mitigation paths: inference-time scaling, analogical memory, external verification, and evaluations that deliberately inject known failure cases instead of optimizing for leaderboard performance. But they're very clear that none of these are silver bullets yet.

The takeaway isn't that LLMs can't reason. It's more uncomfortable than that. LLMs reason just enough to sound convincing, but not enough to be reliable. And unless we start measuring how models fail, not just how often they succeed, we'll keep deploying systems that pass benchmarks, fail silently in production, and explain themselves with total confidence while doing the wrong thing.

That's the real warning shot in this paper.

Paper: Large Language Model Reasoning Failures

1 reply · 1 repost · 5 likes · 478 views