Pete Soderling
@petesoder
3.4K posts

Engineer, Entrepreneur, Investor. Founder @AICouncilConf + @ZeroPrimeVC. Helping 10k engineers start companies 🤓🖖

USA + Europe whenever possible · Joined March 2008
1.6K Following · 3.2K Followers
Pete Soderling@petesoder·
1 billion tokens in, 1 billion tokens out. Opus 4.6 runs you about $30,000, real-time. DeepSeek‑V4‑Pro async on @Doubleword_ lands closer to $4,100. Roughly the same intelligence, ~86% cheaper. That delta is what @MeryemArik9 has been building around.

Most inference stacks were designed for humans sitting and waiting on a response - ChatGPT, Claude, Perplexity, Cursor, Codex. Everything optimized for that near real-time loop, including the spinner verbs you read while you wait.

An async agent is a different pattern entirely. It chugs along for hours and nobody's watching. What matters is the total cost when the job finishes. The teams not lighting cash on fire are getting deliberate about which tokens need a frontier model and an immediate response. Sometimes you pay for the realtime closed frontier reasoner. Sometimes the async open model gets you there just fine.

This is the territory Meryem is covering at AI Council - long-running async agents that don't torch your token budget. Her talk will cover strategies builders can use to maximize async agent performance while keeping inference costs under control: context engineering, compaction, cache maintenance, model routing, and batch inference. Highly relevant for builders.

@AICouncilConf 2026. May 12–14. See you in SF!
0 replies · 2 reposts · 3 likes · 295 views
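The arithmetic in the post above reduces to a quick back-of-envelope check. A minimal sketch — the per-million-token prices here are illustrative assumptions chosen to land near the quoted totals, not published rate cards:

```python
# Back-of-envelope cost comparison for a 1B-tokens-in, 1B-tokens-out job.
# Per-million-token prices below are illustrative assumptions only.
def job_cost(price_in_per_m: float, price_out_per_m: float,
             tokens_in_m: float, tokens_out_m: float) -> float:
    """Total job cost in dollars, given $/1M-token input and output prices."""
    return price_in_per_m * tokens_in_m + price_out_per_m * tokens_out_m

TOKENS_IN_M = 1_000    # 1B input tokens = 1,000 million
TOKENS_OUT_M = 1_000   # 1B output tokens

realtime_frontier = job_cost(10.0, 20.0, TOKENS_IN_M, TOKENS_OUT_M)  # ~$30,000
async_open = job_cost(1.3, 2.8, TOKENS_IN_M, TOKENS_OUT_M)           # ~$4,100

savings = 1 - async_open / realtime_frontier  # ~86% cheaper
print(f"realtime: ${realtime_frontier:,.0f}, async: ${async_open:,.0f}, "
      f"savings: {savings:.0%}")
```

At these assumed prices the delta matches the post's ~86%; the point generalizes — for batch jobs, only the area under the whole cost curve matters, not per-response latency.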
Pete Soderling@petesoder·
Earlier this month, a 30-person US open-source startup shipped a 400B-parameter Mixture-of-Experts reasoning model for long-horizon agents.

The model is Trinity-Large-Thinking from Arcee AI, built on Trinity Large, one of the most ambitious open foundation models ever trained from scratch by a US team. Its predecessor, Trinity-Large-Preview, is already one of the most-used open-weight models on OpenRouter.

OpenRouter, where Trinity has been racking up that traffic, is the unified API for hundreds of open and closed models. The inference runs on platforms like Fireworks AI, which serves open-weight and fine-tuned models to Cursor, Notion, DoorDash, and Uber. On top, you get applications like Kilo Code, the fastest-growing open-source coding agent.

Four companies, four layers of the open stack — model, routing, inference, agent. We've invited all four founders on stage at @AICouncilConf this year to talk about this "open layer." Their upcoming talk, "The Open Layer: How Open Models, Routing, and Inference Are Reshaping Agentic Engineering," gets into what "open" actually means in 2026 (and where it falls short), when open-weight models win (and when they don't), and what it really takes to keep always-on agents reliable on this stack.

On stage:
@MarkMcQuade, Founder & CEO — @arcee_ai
@cclark, Co-Founder & COO — @OpenRouter
@dzhulgakov, Co-Founder & CTO — @FireworksAI_HQ
@s_breitenother, Co-Founder & CEO — @kilocode

Four exceptional founders on one stage. Looking forward to this discussion! May 12–14, SF. aicouncil.com/sf-2026
0 replies · 1 repost · 7 likes · 473 views
Pete Soderling@petesoder·
@BEBischof Yeah, you've worked super hard on the inference track. Awesome speakers: aicouncil.com/sf-2026#inference-systems
0 replies · 0 reposts · 1 like · 41 views
Bryan Bischof fka Dr. Donut
Come see 4 of these founders speak in the Inference Systems track. We have announcements and math and models, oh my!
Pete Soderling@petesoder

One of my favorite things about running @AICouncilConf for eleven years? The founders. There's a secret "track" that's not on the schedule — an invisible hallway of builders. And the next wave is showing up at SF 2026: 🧵
@EnoReyes of @FactoryAI
@vikhyatk of @moondreamai
@ds3638 of @honeyhiveai
Emilie Schario of @kilocode
@ianlivingstone of @KeycardLabs
@neilmovva of @sailresearchco
@CompleteSkeptic of @typesafeai
@HessianFree of @PrismML
@latkins of @arcee_ai
Iona Hreninciuc of @runware
petesoder.substack.com/publish/post/1…

2 replies · 2 reposts · 2 likes · 1.2K views
Pete Soderling@petesoder·
One of my favorite things about running @AICouncilConf for eleven years? The founders. There's a secret "track" that's not on the schedule — an invisible hallway of builders. And the next wave is showing up at SF 2026: 🧵
@EnoReyes of @FactoryAI
@vikhyatk of @moondreamai
@ds3638 of @honeyhiveai
Emilie Schario of @kilocode
@ianlivingstone of @KeycardLabs
@neilmovva of @sailresearchco
@CompleteSkeptic of @typesafeai
@HessianFree of @PrismML
@latkins of @arcee_ai
Iona Hreninciuc of @runware
petesoder.substack.com/publish/post/1…
0 replies · 4 reposts · 16 likes · 1.8K views
Pete Soderling@petesoder·
Meet @lloydtabb. Bike mechanic, co-creator of Malloy, founder and former CTO of Looker.

At last year's AI Council (fka Data Council), you could listen to Lloyd demo Malloy while making the case that semantic modeling is what makes LLMs actually useful on top of your data. Then you'd stick around for Office Hours to ask him a question.

Every year, people tell me the speaker Office Hours is their favorite part of the conference. This year will be no different. It's not often you can be in the same room with your AI & data heroes and the builders of your favorite tools for intimate, small-group chats.

The @AICouncilConf 2026 agenda is live. Pre-plan your must-see talks and Office Hours to get the most out of the conference: docs.google.com/document/d/1J4…
1 reply · 1 repost · 6 likes · 468 views
Pete Soderling@petesoder·
Talked to @changhiskhan of @lancedb and it's got me thinking:

Most of the current data stack was built for a human hitting "search" a few times a minute. Not for an agent firing a hundred queries in parallel and chaining them across a long reasoning path.

Curious what others are seeing. If you're running AI in production right now — what's breaking first? Throughput? Latency? The coordination tax between systems?

Full conversation from our sit-down ahead of @AICouncilConf: open.substack.com/pub/petesoder/…
1 reply · 1 repost · 5 likes · 466 views
Pete Soderling@petesoder·
Engineers in the early days of DNS used to joke it stood for "Does Not Secure." Eventually DNSSEC arrived to harden DNS against spoofing and cache poisoning. It's a reminder that every new layer of the internet goes through this same arc: something useful ships, everyone adopts it, attackers exploit the gaps, and the security rigor shows up later, after a few breaches force the issue.

Agentic AI is squarely in that phase right now. Agents hold credentials, take actions across tools and data stores on our behalf, and consume untrusted inputs along the way. The equivalent of DNSSEC — a widely-adopted, well-understood set of controls for bounding that kind of trust — doesn't yet exist.

We built a dedicated AI Security & Safety track into AI Council 2026 to put the people doing that work into one room. Diana Kelley is one of them. She's CISO at @NomaSecurity, and before Noma, held senior security roles at Microsoft, IBM, and Symantec. She's also in the Cybersecurity Hall of Fame (among many other honors), and co-wrote the book on cybersecurity architecture.

If you ship anything with agent access to production, her session — "Agentic AI: From Risk Awareness to Practical Control" — is one I'd make room for.

Excited for this track! May 12–14, SF. aicouncil.com/sf-2026#ai-security
0 replies · 0 reposts · 0 likes · 101 views
Databricks@databricks·
Databricks is heading to @AICouncilConf in San Francisco, May 12-14, with three sessions covering some of the most pressing questions in data and AI infrastructure right now.

- @nikitabase keynoting on what data infrastructure actually needs to look like in the AI era
- @kelvich on why AI agents need a new kind of OLTP and how Lakebase is built for it
- Robert Martin-Short leading a workshop on systematic LLM prompt optimization with DSPy

Find us in the Expo Hall if you're there!
3 replies · 3 reposts · 32 likes · 2.5K views
Pete Soderling@petesoder·
really proud to show up in lists like this with other high-quality peer funds 👊 doing god's work, most of the time in secret, with the occasional pop of recognition 😎 @alanaagoyal @mantisvc @AlexPallNY @GauravBhogale @atShruti @ileri @YTR4N_
Pavel Prata@pavelprata

Which emerging VCs have the strongest early-stage picking alpha?

Standard emerging manager evaluation still leans heavily on qualitative signals – GP background, thesis articulation, founder references. All useful, but by the time TVPI and DPI tell you something meaningful, you're usually already in or already too late.

So I experimented with a quantitative framework to answer a core LP allocator question: which small, early-stage fund managers consistently back seed-stage companies that go on to raise exceptional Series A rounds – before those outcomes are visible to the broader market?

I started with @harmonic_ai Scout (my fav research tool!) and checked every company globally that raised a first pre-seed or seed round between 2022–2026 (Post-ZIRP). The funnel looks like this:

1/ 55,491 companies raised a pre-seed or seed round – the full opportunity set
2/ 4,368 (7.9%) went on to raise a Series A – the base rate, roughly 1 in 13
3/ 764 (1.4%) qualified as Tier 1 Breakouts – above-median Series A for their vintage year, with at least one top-tier institutional VC (from a defined set of 38 firms: @a16z, @sequoia, @lightspeedvp, @IndexVentures, and peers)

For each of those 1,604 companies, I traced back to every investor who backed them at pre-seed or seed — before the outcome was visible. 4,176 unique investors across the breakout set. Then I computed a simple ratio for each: breakout companies backed at seed divided by total seed investments in the period. I'm calling this the "Tier 1 Concentration Rate".

After filtering out mega-platforms, accelerators, CVCs, and angels and requiring a minimum of 10 seed deals – 20 emerging managers (sub-$250M AUM) surfaced with notably high concentration rates. A few things stood out:

1/ Several micro-funds under $100M were placing 25–35% of their seed bets into companies that later raised from @Sequoia, @a16z, @lightspeedvp – consistently, not as one-off flukes.

2/ Participant concentration and lead concentration are different signals. Participant = network and access. Lead = independent conviction before consensus forms. For LP diligence, these deserve to be evaluated separately.

3/ The data has real limitations: ~12% of breakout companies had no named seed investor in the database, we can't cleanly separate Fund I from Fund III for a given manager, and small sample sizes mean some high concentration rates likely reflect luck rather than repeatable skill.

But the core idea holds. "Tier 1 Concentration Rate" is an early, measurable signal of picking ability – observable years before fund-level metrics tell you anything. For LP allocators evaluating Fund I–III managers, that timing gap is the whole problem. This is one attempt to close it.

What's your take on this experiment?

2 replies · 0 reposts · 8 likes · 820 views
Pete Soderling@petesoder·
How can you trust that your vision AI isn't hallucinating? @vikhyatk's solution at @moondreamai: don't let the model give verdicts. Make it show its work.
0 replies · 1 repost · 1 like · 304 views
Pete Soderling@petesoder·
With @AICouncilConf only four weeks away, I thought I'd share: why SF?

As a founder, I've moved to San Francisco three different times during my career. Every time I've needed to punch through to the next level, SF pulled me back. And every time, it worked. Most recently, I came back to launch @ZeroPrimeVC.

Because if you're building AI infra, dev tools, or anything that other engineers depend on - SF still concentrates an absurd amount of what matters. The other tool builders, the customers, the investors, the future hires, and the hallway convos that can change a startup's trajectory.

This is especially true for founders coming from Europe and beyond. I often tell founders, you don't have to move here, but at some point, you need to commit to spending time here ... even if it's just a few times a year. Build the relationships in person. Compress months of momentum into a few days. Get inspired by other founders moving faster than you.

That's a big part of why we brought the conference back to SF. AI Council is the perfect launchpad - a reason to be in the city with the right people, make a key hire, meet your next investor, and leave with momentum.

See you in 4 weeks.
2 replies · 1 repost · 10 likes · 289 views
Pete Soderling@petesoder·
"Generate an SVG of a pelican riding a bicycle."

A benchmark famously created by @simonw and recently discussed on Lenny's podcast. Ask any text model to draw a pelican on a bike in SVG and the quality of the output tracks surprisingly well with the model's overall capability. The SVG format tests the text model's ability to reason spatially and plot vectors into something recognizable. Probably. It's hard to say exactly why it works so well. Yet with every major model release, the pelicans get a little better.

This was just one thread from Simon's convo on @lennysan's excellent podcast, where they dig into the dark factory pattern, why the bottleneck has shifted from writing code to testing it, the prompt injection "lethal trifecta," and his prediction that 50% of engineers will be writing 95% AI-generated code by end of year. It's one of the best discussions on the state of AI engineering I've heard this year.

Simon co-created Django, coined "prompt injection," popularized "AI slop," and built Datasette and 100+ other open-source tools. He's been building software for over two decades and has gone deeper into agentic workflows than almost anyone, documenting every lesson in real time on his blog. If you're not regularly reading simonwillison.net, you're missing the most honest, ground-level reporting on how AI is changing the craft of building software.

Simon, we'd be honored if you'd join us on the keynote stage at @AICouncilConf in May and bring this conversation to our community in person!
0 replies · 0 reposts · 3 likes · 143 views
Barry McCardel@barrald·
In 2026, all:
Loops are "agents"
Backends are "engines"
UIs are "canvases"
Apps are "context layers"
Engineers are "members of the technical staff"
Collections of markdown files are "moats"
What am I missing?
10 replies · 2 reposts · 29 likes · 1.9K views
Pete Soderling@petesoder·
structure that most data teams outside of very large companies have never had a real reason to build. Until now. Excited to dig into this with the data infrastructure folks at @AICouncilConf this May. Full conversation here: substack.com/home/post/p-19…
0 replies · 1 repost · 1 like · 242 views
Pete Soderling@petesoder·
The application layer gets all the attention. Data infrastructure is where things will quietly break. @EnoReyes at @FactoryAI suggests the only way around it is to introduce traditional backend guardrails — query linters, deterministic checks —
1 reply · 1 repost · 1 like · 227 views
Pete Soderling@petesoder·
What are the hidden evils of vibe coding that no one talks about? A bad table join doesn't throw an error. It quietly 10x's your BigQuery bill. A weird union doesn't crash immediately — it just degrades your data quality in ways nobody notices for weeks.
1 reply · 1 repost · 3 likes · 441 views
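The silent join blow-up in the post above is easy to reproduce. A minimal sketch in plain Python (table names and rows are made up for illustration): joining on a non-unique key fans out rows, and every downstream scan and aggregate pays for the duplicates without a single error being raised.

```python
# A join on a non-unique key silently multiplies rows: no error, just more data.
orders = [
    {"user_id": 1, "amount": 120},
    {"user_id": 2, "amount": 80},
]
# The events table has several rows per user_id (a non-unique join key).
events = [
    {"user_id": 1, "event": "click"},
    {"user_id": 1, "event": "view"},
    {"user_id": 1, "event": "view"},
    {"user_id": 2, "event": "click"},
]

def inner_join(left, right, key):
    """Naive inner join; every match on a duplicated key emits a row."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

joined = inner_join(orders, events, "user_id")
print(len(orders), "->", len(joined))  # 2 -> 4: row count doubled, no error
# Summing amount over the joined rows now triple-counts user 1's order.
print(sum(r["amount"] for r in joined))  # 440, not the true 200
```

In a warehouse the same fan-out means every byte scanned is multiplied too, which is exactly how a bad join quietly inflates a BigQuery bill instead of throwing.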