Paul Vu

1.7K posts

@PaulVuAI

Building AI apps

Austin, Texas · Joined October 2020
1.3K Following · 238 Followers
Paul Vu
Paul Vu@PaulVuAI·
@FredKSchott Agent harness as a framework category is the right unbundling. Most teams hand-roll loop + eval + recovery + cost cap — every one a 2-week distraction. If Flue makes those table stakes, it's rails indie agents need.
0
0
0
311
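The "loop + eval + recovery + cost cap" pieces the reply names can be sketched as one small harness loop. This is a minimal illustration of that decomposition only; none of these type or function names come from Flue's actual API.

```typescript
// Hypothetical agent-harness sketch: a run loop gated by an eval check,
// with retry-based recovery and a hard cost cap. All names are invented
// for illustration; this is not Flue's real interface.

type StepResult = { output: string; costUsd: number };

interface Harness {
  step: (task: string) => Promise<StepResult>; // one model/tool call
  isDone: (output: string) => boolean;         // the "eval" gate
  maxCostUsd: number;                          // hard budget (cost cap)
  maxRetries: number;                          // recovery budget
}

async function runAgent(task: string, h: Harness): Promise<string> {
  let spent = 0;
  let retries = 0;
  let last = "";
  while (spent < h.maxCostUsd) {
    try {
      const { output, costUsd } = await h.step(task);
      spent += costUsd;
      last = output;
      if (h.isDone(output)) return output; // eval passed: stop early
    } catch {
      // recovery: retry a failed step until the retry budget runs out
      if (++retries > h.maxRetries) throw new Error("recovery budget exhausted");
    }
  }
  return last; // cost cap hit: return best effort so far
}
```

Each of the four concerns lives in one place, which is the point of the tweet: teams hand-roll exactly this loop over and over.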
fks
fks@FredKSchott·
Introducing Flue — The First Agent Harness Framework

Flue is a TypeScript framework for building the next generation of agents, designed around a built-in agent harness.

Flue is like Claude Code, but 100% headless and programmable. There's no baked-in assumption like requiring a human operator to function. No TUI. No GUI. Just TypeScript.

But using Flue feels like using Claude Code. The agents you build act autonomously to solve problems and complete tasks. They require very little code to run. Most of the "logic" lives in Markdown: skills and context and AGENTS.md.

Flue is like Astro or Next.js for agents (not surprising, given my background 🙃). It's not another AI SDK. It's a proper runtime-agnostic framework. Write once, build, and deploy your agents anywhere (Node.js, Cloudflare, GitHub Actions, GitLab CI/CD, etc).

We originally built Flue to power AI workflows inside of the Astro GitHub repo. But then @_bgiori got his hands on it, and we realized that every agent needs a framework like Flue, not just us.

Check it out! It's early, but I'm curious to hear what people think. Are agents ready for their library -> framework moment?
176
333
3.7K
712.6K
Paul Vu
Paul Vu@PaulVuAI·
@sama Right framing. 'Which agent wins benchmarks' matters way less than 'which fits your codebase + your taste.' I run both and switch by file type. The competition is good for users.
0
0
0
11
Sam Altman
Sam Altman@sama·
you know what, all of these "which is better" polls are silly

use codex or claude code, whatever works best for you

i am grateful we live in a time with such amazing tools, and grateful there is a choice
2.2K
1.1K
23K
1.6M
Paul Vu
Paul Vu@PaulVuAI·
Which of these surprised you? Reply with the one you'd build on first.
0
0
0
13
Paul Vu
Paul Vu@PaulVuAI·
6. @Uncanny_Harry building IP from scratch with AI characters — animated style vs AI realism poll. The interesting question: which aesthetic the audience picks tells you where the market actually is.
1
0
0
20
Paul Vu
Paul Vu@PaulVuAI·
Agents are finally getting taste. 6 examples from this week: 1. DESIGN.md by @bbssppllvv — 2,000 design specs from the world's best products, structured for models to learn. The 'agents make ugly UIs' era ends here.
1
0
1
37
Paul Vu
Paul Vu@PaulVuAI·
@karpathy The 'use cases that didn't exist before' framing is the right yardstick. I don't measure my AI gains in coding speed anymore — I measure 'fraction of projects I'd have skipped that I now ship.' That's the metric that compounds.
English
0
0
3
424
Andrej Karpathy
Andrej Karpathy@karpathy·
Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights:

The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:

1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing.
2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc.
3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc.

I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3).

The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms.

Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...

Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.
Stephanie Zhan@stephzhan

@karpathy and I are back! At @sequoia AI Ascent 2026. And a lot has changed. Last year, he coined “vibe coding”. This year, he’s never felt more behind as a programmer. The big shift: vibe coding raised the floor. Agentic engineering raises the ceiling. We talk about what it means to build seriously in the agent era. Not just moving faster. Building new things, with new tools, while preserving the parts that still require human taste, judgment, and understanding.

263
726
5.5K
765.1K
Paul Vu
Paul Vu@PaulVuAI·
MoshiRAG: speech model delegates hard questions to text LLMs in real time, hides the latency. The architecture pattern voice AI startups will copy by Q3 — speech for flow, text for facts. Two specialists, one experience.
kyutai@kyutai_labs

Speech-native models like Moshi sound great and answer fast, but aren’t as smart as text LLMs. In our new paper, MoshiRAG, we show how Moshi can ask for advice from a text LLM or a knowledge base. The tricky part is how to do this in real time without adding latency. 🧵

1
0
0
57
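The delegation pattern described above ("speech for flow, text for facts") can be sketched as starting the slow text-LLM call immediately while the fast speech channel keeps talking. This is a hedged illustration of the latency-hiding idea only; all function names are hypothetical and this is not kyutai's MoshiRAG implementation.

```typescript
// Sketch of latency hiding via concurrent delegation: kick off the slow
// text-LLM lookup first (without awaiting it), let the fast speech model
// hold the floor, then splice in the expert answer when it resolves.
// Every name here is invented for illustration.

const sleep = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

async function slowTextExpert(q: string): Promise<string> {
  await sleep(50); // stands in for a text-LLM or knowledge-base round trip
  return `fact(${q})`;
}

async function answer(q: string): Promise<string[]> {
  const expert = slowTextExpert(q); // start the slow call now, no await
  const spoken: string[] = [];
  spoken.push("Good question,");    // fast speech channel keeps the flow
  spoken.push("let me think...");
  spoken.push(await expert);        // expert latency overlapped with speech
  return spoken;
}
```

The key move is that the expensive call runs concurrently with the filler turns, so the user never hears dead air.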
Paul Vu
Paul Vu@PaulVuAI·
@tamrrat MCP for 3D worlds is the niche-tool unlock that matters more than another general agent. Agents that win verticals will be built on tool servers nobody writing 'agent demos' has heard of.
1
0
2
1.2K
tamrat
tamrat@tamrrat·
Introducing the mint mcp

Your coding agent can now generate fully immersive 3D worlds and 3D models

No Blender required :)
37
116
1.5K
127.1K
Paul Vu
Paul Vu@PaulVuAI·
@sundarpichai @GeminiApp Gemini becoming the export layer is the actual workflow change. Most LLM output dies in chat — copying loses formatting + context. Native export collapses 'AI generated this → I fix it' to one step.
0
0
0
1.6K
Sundar Pichai
Sundar Pichai@sundarpichai·
You can now ask Gemini to create Docs, Sheets, Slides, PDFs, and more directly in your chat. No more copying, pasting, or reformatting, just prompt and download. Available globally for all @GeminiApp users.
591
1.7K
18.1K
1.9M
Paul Vu
Paul Vu@PaulVuAI·
Which of these surprised you? Reply with what you're actually testing this week.
0
0
0
54
Paul Vu
Paul Vu@PaulVuAI·
6. Agent Opus — AI video editor as a team of specialist agents, not one model doing everything. Production studio model. This is where next-tier creative tools go.
1
0
1
72
Paul Vu
Paul Vu@PaulVuAI·
6 AI tools shipped this week. Pattern: specialist agents eating verticals, not 'one model rules all.' 1. @ammaar — offline vibe coding via Gemma 4 + MLX. Your Mac is the runtime. No internet, no rate limits, no cloud outages eating your day.
1
0
5
1.3K