Garry Tan

73.4K posts

Garry Tan

@garrytan

President & CEO @ycombinator —Founder @garryslist—Creator of GStack & GBrain—designer/engineer who helps founders—SF Dem accelerating the boom loop

San Francisco, CA Joined January 2008

5.8K Following 832K Followers

Pinned Tweet

Garry Tan@garrytan·Aug 11

Tech gave me everything I have. Its capacity to lift people into abundance is incredible and there is nothing like it. We must make that into prosperity for everyone.

Bloomberg Technology@technology

"I realized tech is this thing that can bring people out of whatever situation they're in and often into prosperity. And that's what I want for everyone." @ycombinator’s @garrytan tells @emilychangtv how tech changed his family's life. Watch here: trib.al/sxg1VGR

897

814

6.3K

4.3M

Garry Tan@garrytan·46m

This is going to be common from here. Brave new world. Prompters of the world unite.

Patrick McKenzie@patio11

Today is May 25th, 2026. This is the first time I remember reading an LLM-produced public artifact which is obviously professionally relevant and which is sufficiently complete that I do not perceive the lack of a human author materially compromising its utility to me.

Garry Tan@garrytan·1h

Ultimately the golden age of abundance will be this kind of tech built and deployed 1000x

Afshine Emrani MD FACC@afshineemrani

1/5 I'm a cardiologist. I have spent twenty years watching cholesterol destroy arteries, trigger heart attacks, and kill people I care about. Today, Eli Lilly presented data that may begin to end that era. VERVE-102. A single infusion. One dose. It uses base editing to permanently turn off the PCSK9 gene in your liver. Presented today at the European Atherosclerosis Society Congress: 88% reduction in PCSK9. 62% reduction in LDL cholesterol. Sustained up to 18 months. No treatment-related serious adverse events. One infusion. Not daily pills you forget to take. Not monthly injections. One dose — and your cholesterol may stay low for the rest of your life.

8.2K

Garry Tan@garrytan·1h

By evals I mean literally tell the agent: given what we discussed about what we are doing and why and what happened, use three different frontier models to look at inputs and outputs of your skill file calling the code, and rate it on effectiveness. Why isn’t it a 10? How could it be made to be so? Run this a few times and you will be surprised how fast it gets astonishingly better. And since it is in a skill file plus code with evals (LLM as judge) and unit tests, it stays better forever.

2.8K
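
The multi-model eval described in this tweet can be sketched roughly as follows. This is a minimal illustration, not GBrain's or any agent's actual implementation: the `Judge` signature, `evaluate_skill`, and the three toy judges are all hypothetical stand-ins. In practice each judge would wrap a call to a different frontier model.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# A judge takes (skill_text, task_input, task_output) and returns
# (score from 1 to 10, critique). Hypothetical interface for this sketch.
Judge = Callable[[str, str, str], Tuple[int, str]]


def evaluate_skill(skill: str, cases: List[Tuple[str, str]],
                   judges: Dict[str, Judge]) -> dict:
    """Score every (input, output) pair with every judge and collect critiques."""
    scores: List[int] = []
    critiques: List[str] = []
    for task_input, task_output in cases:
        for name, judge in judges.items():
            score, critique = judge(skill, task_input, task_output)
            scores.append(score)
            if score < 10:  # "Why isn't it a 10?"
                critiques.append(f"{name}: {critique}")
    return {"mean_score": mean(scores), "critiques": critiques}


# Toy judges so the sketch runs offline (assumptions, not real model calls).
def strict_judge(skill: str, task_input: str, task_output: str) -> Tuple[int, str]:
    if "edge" in skill:
        return 9, "minor wording issues"
    return 6, "skill does not mention edge cases"


def lenient_judge(skill: str, task_input: str, task_output: str) -> Tuple[int, str]:
    return 8, "could be more concise"


def length_judge(skill: str, task_input: str, task_output: str) -> Tuple[int, str]:
    return (10, "") if len(task_output) < 40 else (7, "output too long")


JUDGES = {"judge_a": strict_judge, "judge_b": lenient_judge, "judge_c": length_judge}
CASES = [("summarize ticket 1", "short summary"),
         ("summarize ticket 2", "another short summary")]

report = evaluate_skill("Summarize tickets.", CASES, JUDGES)
print(report["mean_score"], len(report["critiques"]))  # prints: 8 4
```

The collected critiques are what would be handed back to the agent as the "how could it be made a 10" feedback before the next run.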

Garry Tan@garrytan·1h

Funny how simple using OpenClaw and Hermes Agent is these days. Just have it do stuff. Then improve in progressive batches with evals from multiple frontier models. It self improves!

Garry Tan@garrytan

Right now I just use my personal AI and our company brain and it screws up and I tell it to fix it and write tests for it. Also I do cross modal evals on progressive batches (e.g., if there are 10,000 items, do 5 and eval the input and output and skill, then keep doubling the batch size as you go).

10.4K
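
The progressive-batch idea above (start with 5 items, eval, then keep doubling) could look something like this. It is a sketch under assumptions: `progressive_batches`, `run_with_evals`, and the `process` and `passes_eval` callables are hypothetical names, and the eval here is a plain function rather than an LLM judge.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def progressive_batches(items: Sequence, start: int = 5) -> Iterator[Sequence]:
    """Yield consecutive slices of items whose size doubles each round."""
    position, size = 0, start
    while position < len(items):
        yield items[position:position + size]
        position += size
        size *= 2


def run_with_evals(items: Sequence,
                   process: Callable,
                   passes_eval: Callable[[Sequence, List], bool],
                   start: int = 5) -> Tuple[List, bool]:
    """Process growing batches; stop at the first batch whose eval fails."""
    done: List = []
    for batch in progressive_batches(items, start):
        results = [process(item) for item in batch]
        if not passes_eval(batch, results):
            return done, False  # fix the skill before scaling further
        done.extend(results)
    return done, True


sizes = [len(batch) for batch in progressive_batches(range(10000))]
print(sizes[:4], len(sizes), sum(sizes))  # prints: [5, 10, 20, 40] 11 10000
```

With 10,000 items this takes 11 rounds, so a mistake in the skill is caught after 5 items rather than after the whole run.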

Garry Tan retweeted

arman@ksw_arman·12h

it's crazy how @greptile has had such a noticeable improvement in the last few months. i've never seen an agent at that scale improve drastically so fast

6.2K

Garry Tan@garrytan·1h

@aidenybai Sounds like a “giving a shit” problem really

642

Aiden Bai@aidenybai·4h

this is mostly a guardrails problem:
- teams can't keep up with code review
- existing testing is mostly "fake"
- the good ICs care, most don't give a shit. tokens amplify this problem

Hedgie@HedgieMarkets

🦔Uber's COO Andrew Macdonald said on Saturday that the company is having a harder time justifying its AI spend. After CTO Praveen Neppalli Naga went viral in April for admitting Uber burned through its 2026 Claude Code budget in four months, senior engineering leaders concluded higher token usage was not translating into proportionally more useful product. Macdonald said the link between AI consumption and shipped features is "not there yet." CEO Dara Khosrowshahi confirmed on the earnings call that Uber is slowing hiring to fund its AI spend. Duolingo also walked back its decision to include AI usage in performance reviews last month.

My Take

Uber is the first major enterprise where the C-suite has publicly admitted, on the record, that the AI productivity story is not closing for them. That matters because Uber is not a skeptic. The company went all-in on AI tooling, set internal targets, and burned through its annual research and development budget in four months trying to make it work. The conclusion from the people running the experiment is that tokens consumed and value shipped are not the same number, and management is finally noticing.

Duolingo's reversal lands in the same week for a reason. CEO Luis von Ahn said employees were asking whether they needed to use AI just to use AI, which is Goodhart's Law showing up in a performance review system. When usage becomes the metric, employees optimize for usage, not output.

Microsoft canceled internal Claude Code licenses, Google AI Pro stripped credits from paid subscribers, and now Uber is admitting the ROI does not close at scale. The narrative has shifted in the last 30 days from "AI productivity is here" to "AI productivity is harder to measure than we thought." The companies pushing tokenmaxxing internally are now the same companies signaling cost pressure externally.

The IPO calendar for OpenAI and Anthropic is going to get a lot more complicated if the largest enterprise customers keep saying this out loud.

Hedgie🤗

119

24.1K

Garry Tan@garrytan·1h

@karrisaarinen Use AI effectively to create new products and services that didn’t exist before that customers love

861

Karri Saarinen@karrisaarinen·1h

@garrytan True but how do you solve the demand side? Selling more to old or new customers?

1.8K

Karri Saarinen@karrisaarinen·2h

We keep hearing about 10x or 100x productivity gains in engineering and knowledge work. But outside the model labs, I haven’t seen the corresponding 10-100x revenue growth across the market or increase in quality. So where is the productivity going?

125

598

49.4K

Garry Tan@garrytan·1h

This sounds complicated but the agents can implement this in OpenClaw/Hermes Agent trivially (use skillify from GBrain with a link to this tweet). Sounds ridiculous but you should try it.

Muratcan Koylan@koylanai

Gradient descent for SKILL.md files sounds interesting, maybe a bit complex but it's becoming a real part of agent harness. SkillOpt is one of the first papers to treat markdown skill files as trainable parameters and provides a proper optimization framework for them. A few things I learned that you should consider too.

1. The validation gate is the only thing that matters in a self-editing loop. Held-out set, strict improvement, ties rejected. End-to-end, their best skills land with 1 to 4 accepted edits total. If your "self-improving agent" is accepting most of what it proposes, you're shipping slop.
2. Bounded edits are better than full rewrites. 4 to 8 edits per step is the sweet spot. Remove the budget and performance collapses. This is the textual analog of learning rate, and it transfers to any LLM-as-author loop. If you're using an agent to refactor your docs, your prompts, or your skills, cap the diff size.
3. Compactness wins. Median final skill: ~920 tokens. Skills do not need to be long. They need to be high-signal. Most skill files I see are bloated because length feels like effort. It isn't.
4. The harness is becoming less important; the skill is becoming more important. A Codex-trained skill ported into Claude Code hit +59.7 points on SpreadsheetBench. Procedural knowledge is more general than the runtime that produced it.
5. Frozen model + trained context is the practical adaptation. GPT-5.4-nano with a SkillOpt'd skill ≈ frontier behavior on procedural benchmarks. Cheaper, portable, inspectable, zero inference-time cost. This is the answer to "how do we adapt a frontier model for our domain" for almost everyone who isn't training their own models.
6. Verification is the bottleneck. Every gate in this paper depends on an auto-grader. That works for benchmarks. It fails for writing, design, and strategy, exactly the open-ended work we want to automate. Whoever builds the verifier for open-ended tasks owns the next stage.

There are also two lessons I learned while shipping v2.3.0 of my Context Engineering Agent Skills repo, measured across composer-2, claude-opus-4-7, gpt-5.5, and gemini-3.1-pro via the @cursor_ai SDK:
- Description and body are two different surfaces. The router only sees the description. The agent sees the body once activated. They can quietly disagree, and only end-to-end task tests catch it.
- Aggregate accuracy is the wrong unit. When I rewrote three descriptions, the corpus average moved ~1pp. Individual skills moved 23–25pp. Per-skill effect size is where the action is.

Also, in Feb 2026 I shared a piece called Personal Brain OS arguing that the markdown file is a first-class substrate for agent state. SkillOpt is the optimizer-shaped version of that same argument: not "store memory in files" but "treat files as trainable parameters with proper optimization machinery around them." That's the move from static to measured.

The fast/slow split they describe already lives implicitly in the digital-brain-skill repo:
- voice-guide and tone-of-voice.md are slow-state (rarely touched)
- posts.jsonl and bookmarks.jsonl are fast-state

What SkillOpt adds that I didn't have is a protected section invariant, a structural guarantee that fast edits cannot overwrite slow lessons. Removing that mechanism cost them 22 points on SpreadsheetBench. Worth borrowing.

If you're building agents, SkillOpt: Executive Strategy for Self-Evolving Agent Skills is a good paper to read: arxiv.org/pdf/2605.23904

12.1K
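
The three mechanisms named in the quoted post (a validation gate that rejects ties, a cap on edits per step, and a protected section that edits cannot overwrite) can be sketched as one acceptance function. This is a toy illustration, not the SkillOpt algorithm from the paper: the edit format, the `[protected]` marker, `gated_step`, and the keyword-counting grader are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# An edit is ("add", line), ("delete", line) or ("replace", old_line, new_line).
EditOp = Tuple[str, ...]
PROTECTED = "[protected]"  # hypothetical marker for slow, must-keep lessons


def apply_edits(skill: List[str], edits: List[EditOp]) -> List[str]:
    """Apply add/delete/replace operations to a copy of the skill lines."""
    lines = list(skill)
    for op in edits:
        if op[0] == "add":
            lines.append(op[1])
        elif op[0] == "delete" and op[1] in lines:
            lines.remove(op[1])
        elif op[0] == "replace" and op[1] in lines:
            lines[lines.index(op[1])] = op[2]
    return lines


def gated_step(skill: List[str], edits: List[EditOp],
               score_on_heldout: Callable[[List[str]], float],
               max_edits: int = 8) -> Tuple[List[str], bool]:
    """Accept edits only if bounded, protected-safe, and strictly better."""
    if len(edits) > max_edits:  # bounded edits: cap the diff size
        return skill, False
    candidate = apply_edits(skill, edits)
    protected = [line for line in skill if line.startswith(PROTECTED)]
    if any(line not in candidate for line in protected):
        return skill, False  # protected lessons must survive unchanged
    if score_on_heldout(candidate) <= score_on_heldout(skill):
        return skill, False  # ties are rejected, not just regressions
    return candidate, True


def toy_heldout_score(skill: List[str]) -> int:
    """Hypothetical grader: counts which required topics the skill mentions."""
    text = " ".join(skill).lower()
    return sum(topic in text for topic in ("header", "units"))


skill = [f"{PROTECTED} Never overwrite source cells.", "Check the header row."]
skill, ok_add = gated_step(skill, [("add", "Convert units before summing.")], toy_heldout_score)
skill, ok_tie = gated_step(skill, [("add", "Be thorough.")], toy_heldout_score)
skill, ok_unsafe = gated_step(
    skill,
    [("delete", f"{PROTECTED} Never overwrite source cells."), ("add", "Check totals.")],
    toy_heldout_score,
)
print(ok_add, ok_tie, ok_unsafe, len(skill))  # prints: True False False 3
```

Only the first proposal is accepted. The filler line ties on the held-out score and the third proposal deletes a protected line, so both leave the skill unchanged.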

Garry Tan@garrytan·1h

@karrisaarinen I’m sorry to say it requires skills that few people even possess because it is all so new

Garry Tan@garrytan·1h

10.9K

Alex Hovansky@Alex_TGH·1h

@garrytan this sounds like you're building an actual brain instead of just a bot script. curious how the feedback loop looks in practice

425

Garry Tan@garrytan·1h

These concepts coming soon to GBrain this week

elvis@omarsar0

New research from Microsoft Research. I see a lot of AI engineers handwriting agent skill docs and hoping they generalize. Probably not optimal. This work shows why. It treats the skill doc as a trainable external state of a frozen agent instead. It introduces SkillOpt, where an optimizer model makes validation-gated edits to the skill file. It adds, deletes, or replaces instructions, with a textual learning rate that controls how aggressively each round rewrites the doc. The agent itself never changes. SkillOpt is best or tied on all 52 (model, benchmark, harness) cells. On GPT-5.5 it adds 23.5 points in direct chat, 24.8 with Codex, and 19.1 with Claude Code over no skill. It beats human-written skills, TextGrad, GEPA, and EvoSkill, carries zero extra inference-time cost, and the learned skills transfer across models and harnesses. Paper: arxiv.org/abs/2605.23904 Learn to build effective AI agents in our academy: academy.dair.ai

12K

Garry Tan@garrytan·1h

@AroraBhavyam @ycombinator @speedrun @afore Stop capping. We still take late apps throughout.

698

Bhavyam Arora (Content Arc)@AroraBhavyam·17h

Applications for both @ycombinator and A16Z @Speedrun are closed now completely! But if you are a founder who wants to raise funding for your #startup, here's a list of the best pre-seed / seed funds that are investing actively:
- @204BVC $82M, deep tech/bio, pre-seed/seed
- @Afore $185M, generalist, pre-seed specialist
- @AntiFund $30M, AI + defense, $100K-$500K first check

"Follow @AroraBhavyam if you found this valuable 🫡"

- @basecasecapital ~$99M, enterprise infra, solo GP
- @haunventures $1B, crypto + AI agents x finance
- @HaystackVC $85M, generalist software, pre-seed/seed
- @HummingbirdVC $800M, outlier founders globally
- @MantisVC $100M, cyber + B2B multi-sector
- @MischiefVC $80M, generalist software, $1M-$4M
- @ModernTechnical $22M, software infra, solo GP
- @PrecursorVC $66M, generalist tech, $100K-$500K
- @SevenStarsVC $40M, AI applications, pre-seed
- @StrikerVenture $165M, AI + cyber + life sciences
- @ZeroShotFund $100M target, post-AGI builders

Give feedback if I missed any major ones 👇

Bhavyam Arora (Content Arc)@AroraBhavyam

YC deadline has been extended until this weekend. If you are a founder who missed it before, now is the time... Also, you now get $2M worth of OpenAI tokens if you're selected! (screenshot of @agupta's tweet, gp at @ycombinator)

195

25.6K

Garry Tan@garrytan·1h

@anshublog @levie @random_walker I got a second EA because I need more help having all the people stuff across many networks be navigated by them

Anshu Sharma 🌶@anshublog·1d

@levie @random_walker New rule: any ceo who claims work can be fully done by ai needs to immediately let go of their executive assistant. Oh so you’re telling me it can do the job of a software engineer that builds schedulers but not that of a scheduler?

364

21.8K

Garry Tan retweeted

Aaron Levie@levie·1d

CEOs are uniquely prone to AI psychosis because they’re sufficiently distant from the last mile of work that still has to happen to generate most value with AI. So when they play with AI, they see the happy path results, often not considering the next 10 or 20 things that have to happen to get sustainable results from agents. “Look I made this awesome product prototype”. Yes but you didn’t have to review the code before it went into production and fix a bunch of issues. “Look I generated a contract”. Yes but you didn’t verify all the terms before it goes out to the counterparty and didn’t have to wire up all the past contracts to work with. The best thing you can do as a CEO is to use AI a *ton* to figure out the real implications of agents in the enterprise, and come out the other side with an appreciation for both the upside and the real work that goes into them.

Michal Malewicz@michalmalewicz

CEOs are the most delusional about AI. Detached from reality.

272

664

6.3K

Garry Tan@garrytan·1h

@maccaw @levie 🚀

Alex MacCaw@maccaw·1d

@levie If anything, CEOs aren’t AI-pilled enough. As a former manager/CEO turned IC again, my experience is that AI can do everything I throw at it, and more.

1.6K

Garry Tan retweeted

Kathryn Wu@kathrynwu1·10h

I think one reason YC likes logical engineers is not just because they can code. A lot of them are unusually clear communicators. Coding trains you to think in strict logical sequences: input → output, cause → effect, constraint → solution. You can hear it immediately in good founders. Not necessarily charismatic, but coherent. People underestimate how much startup momentum comes from simply being easy to understand.

6.3K

Garry Tan@garrytan·1h

@tszzl Will fight against this until my dying days

107

8.1K

roon@tszzl·2h

i see this kind of just universe reasoning about startups and whatnot but i think it’s wishful thinking. there may be one company that ends up dominating most of the world economy and hopefully is run as some sort of regulated utility

Suhail@Suhail

Possibly the thing we will most realize looking back: intelligence was so big that lots of companies were going to succeed. It's not so simply bucketed into chatgpt and claude code.

410

66.4K

Garry Tan@garrytan·1h

Someone just described hell

roon@tszzl

8.5K

Garry Tan@garrytan·1h

@SplinteredEsq Markdown system of record. GBrain uses pgvector and Postgres. I’m on a Supabase XL instance now.

Chris Baker@SplinteredEsq·4h

@garrytan how do you store them? Like if the models are on a VPS, whats the best storage method for the originals and then the markdown?

Garry Tan@garrytan·6h

GBrain just got a big update: graph generation is now much more automated and powerful. My knowledge wiki is now pushing 300k markdown files across multiple federated company brains.