Azalia Mirhoseini

494 posts

@Azaliamirh

Founder @RicursiveAI, Asst. Prof. of CS at Stanford. Prev: DeepMind, Anthropic, Brain. Co-Creator of MoEs, AlphaChip, Test Time Scaling.

Stanford, CA · Joined May 2013

605 Following · 18.4K Followers

Pinned Tweet

Azalia Mirhoseini@Azaliamirh·2 Dec

Thrilled to share that @annadgoldie and I are launching @RicursiveAI, a frontier lab enabling recursive self-improvement through AIs that design their own chips. Our vision for transforming chip design began with AlphaChip, an AI for layout optimization used to design four generations of TPUs, data center CPUs, and smartphones. AlphaChip offered a glimpse into a future where AI designs the silicon that fuels it. Ricursive extends this vision to the entire chip stack, building AI that architects, verifies, and implements silicon, enabling models and chips to co-evolve in a tight loop. We sat down with WSJ’s @berber_jin1 to discuss Ricursive: wsj.com/tech/this-ai-s…

Ricursive Intelligence@RicursiveAI

Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at ricursive.com

124

135

1.5K

232.6K

Azalia Mirhoseini@Azaliamirh·5d

@edchi Sounds good!

Azalia Mirhoseini retweeted

Striker Venture Partners@strikervp·5d

Striker portfolio companies @ElorianAI and @RicursiveAI were founded by Google's most elite AI researchers. Names like @AndrewDai, @annadgoldie & @Azaliamirh. Read more in @Bloomberg today. Link below.

4.5K

Azalia Mirhoseini@Azaliamirh·6d

@drmapavone @Stanford Congrats!

450

Marco Pavone@drmapavone·6d

Excited to announce the launch of the Stanford Sustainable Mobility Center, where I’ll be serving as inaugural co-director. Housed within Stanford Precourt Institute for Energy, the center brings together @Stanford’s strengths — from energy systems to AI and autonomy — alongside industry and government collaboration to accelerate real-world mobility solutions at scale. 🔗 Overview of the center: news.stanford.edu/stories/2026/0… The center traces its origins to the Center for Automotive Research at Stanford (CARS), which I had the pleasure of directing for several years. 🚗🚢✈️ If you are interested in rethinking how people and goods move across land, sea, and air, I’d love to connect! @StanfordEng @StanfordASL

2.3K

Azalia Mirhoseini@Azaliamirh·6d

@RicursiveAI

Ricursive Intelligence@RicursiveAI

Chips are the fuel for AI. By using AI to design, optimize, and automate chip design, we can close the recursive self-improving loop between AI and its physical substrate. Our co-founders @annadgoldie and @Azaliamirh took the stage at @sequoia AI Ascent 2026 to share more about Ricursive’s vision to accelerate and democratize chip design. Proud of the progress our world-class team has already made. And we’re just getting started!

4.1K

Azalia Mirhoseini retweeted

Ricursive Intelligence@RicursiveAI·14 May

Thanks to @NicolSchwarzK and @CNBC for spotlighting our work. The core AlphaChip team is back together and we're laser focused on the most consequential loop in technology today: AI for chip design and chip design for AI. cnbc.com/2026/04/28/met…

1.2K

Azalia Mirhoseini retweeted

Kelly Buchanan@ekellbuch·7 May

Very excited to release Terminal-Bench 2.1! Coding agents are among the most economically consequential deployments of LLMs to date. As agents improve, benchmark reliability matters more. We audited TB2.0 and found and corrected issues in 28/89 tasks. 30% of the benchmark! But the rankings survived, absolute scores moved up to 12pp!

768

84.2K

Azalia Mirhoseini@Azaliamirh·6 May

@klazizpro @stephzhan @sonyatweetybird Thanks!

Man from Atlantis @ SF@klazizpro·6 May

@Azaliamirh @stephzhan @sonyatweetybird One of the greatest talks - love it

104

Azalia Mirhoseini@Azaliamirh·6 May

It was great to present at Sequoia AI Ascent! Many thanks to @stephzhan, @sonyatweetybird, and the entire Sequoia team for hosting us!

Stephanie Zhan@stephzhan

So fun to host @annadgoldie and @Azaliamirh of @RicursiveAI at @sequoia AI Ascent 2026! They share the story of building AlphaChip which was incorporated into multiple generations of the TPU. They walk through their three-phase roadmap: from AI-powered design tools -> a "fabless era" platform for custom silicon -> full vertical integration, and their LT vision of building true recursive AI.

00:00 Neural Nets Meet Chips
00:21 Meet AlphaChip Creators
00:43 Recursive Intelligence Vision
01:31 AlphaChip Real World Impact
02:17 Three Phase Company Roadmap
02:20 Phase One Speeding Design
04:11 Rebuilding Tools for AI
05:05 STA Engine and RL Loop
06:15 Designless Platform and Custom Chips
08:01 Team and Audience Q&A

4.7K

Azalia Mirhoseini retweeted

Avanika Narayan@Avanika15·4 May

hyped to see computer systems 🐐's like @JeffDean, david patterson, @AzaliaMirh & others discussing how intelligence per watt (ipw) should be the north star metric for computer system design links to event notes + ipw work w/@JonSaadFalcon in comments below!

8.5K

Azalia Mirhoseini retweeted

Emre Can Acikgoz@emrecanacikgoz·26 Apr

It is happening now at Lifelong Agent Workshop (lifelongagent.github.io)! Don't miss the chance to hear from Azalia (@Azaliamirh) about self-improvement at test-time! #ICLR2026

Azalia Mirhoseini@Azaliamirh

Looking forward to giving a talk at the ICLR'2026 workshop on Lifelong Agents: Learning, Aligning, Evolving on: "Self-Improvement Through Test-Time Scaling of Verifiers" April 26, 16:00 – 16:30 lifelongagent.github.io

Azalia Mirhoseini@Azaliamirh·26 Apr

131

9.5K

Azalia Mirhoseini@Azaliamirh·24 Apr

Check out Kevin and DSL-Monkeys for kernels at ICLR!

Simon Guo@simonguozirui

At #ICLR2026 🇧🇷 this week, learning Portuguese is way harder than learning a new programming language! 😅 On that note, come find me presenting some work on post-training and test-time bootstrapping for rare or domain-specialized 🦜code generation!

⚡ Kevin: Multi-Turn RL for Generating CUDA Kernels
Friday 3:15 PM – 5:45 PM, Pavilion 4-#5003

🐒 DSL-Monkeys: Self-Generated In-Context Examples for Low-Resource GPU DSL Kernels
Data-FM (Sunday) and Test-Time Updates (Monday) Workshop

Até lá!

4.6K

Azalia Mirhoseini@Azaliamirh·23 Apr

@lishali88 @a16z Congrats, Lisha!

483

Lisha@lishali88·23 Apr

BIG PERSONAL UPDATE. I've joined a16z as a partner investing in infra and AI. I'm also stepping down as CEO of Rosebud AI. I reflect in this article on my 8 years of building in generative AI. At @a16z I‘ll be focusing on the frontier model stack: the models, and the infra and dev tooling around them. I'm excited about rapid model progress, increasingly driven by AI itself, and about what AI is unlocking for math and the sciences. And I’ll always have a soft spot for AI creative tools, having built them for 8 years.

Lisha@lishali88

x.com/i/article/2046…

451

93.6K

Azalia Mirhoseini retweeted

Cameron R. Wolfe, Ph.D.@cwolferesearch·16 Apr

Strongly recommend the LLM-as-a-Verifier writeup. Biggest takeaway for me is that increasing scoring granularity makes the verifier more effective. This indicates that LLM judges / verifiers are developing new (and better) capabilities.

This did not work well 1-2 years ago. In fact, LLM-as-a-Judge best practice was that lower scoring granularity (e.g., binary, ternary, or 1-5 Likert score) worked way better than granular scores (e.g., 1-100 scale). This was a constant recommendation I gave for setting up LLM judges properly. It seems like recent frontier LLMs now are better at scoring at finer granularities, making this best practice (potentially) obsolete.

One caveat to this finding is that the scoring setup used in this writeup is a specific setup based upon logprobs. Instead of just using the score token outputted by the LLM as the result, they compute the logprob of each possible score token and take a weighted average of scores (with weights given by probabilities). Then, they go further by expanding this weighted average across repeated verifications and multiple criteria:

Reward = (1 / (C·K)) * ∑_{c=1}^{C} ∑_{k=1}^{K} ∑_{g=1}^{G} score_logprob * score_value

where C is the total number of evaluation criteria, K is the number of repeated verifications, and G is the scoring granularity (i.e., number of unique scoring output options). The reward determines if a particular output passes verification across criteria.

When using this logprob setup, we see consistent gains in verifier accuracy by:
- Increasing scoring granularity G.
- Increasing repeated verifications K.
- Increasing the number of evaluation criteria C.

The last two findings are in line with prior work, but the fact that higher scoring granularity is helpful is interesting!

In the LLM-as-a-Verifier paper, this system is used at inference time in a pairwise fashion as described below.

"To pick the best trajectory among N candidates for a given task, a round-robin tournament is conducted. For every pair (i, j) the verifier produces Reward(i) and Reward(j) using the formula above. The trajectory with the higher reward receives a win, and the trajectory with the most wins across all \binom{N}{2} pairs is selected."
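
The reward formula and round-robin selection described in this post can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`expected_score`, `reward`, `round_robin`) and the `verify_pair` callable are hypothetical, the weight on each score is assumed to be the token probability (the exponentiated logprob) renormalized over the G score tokens, and the tie handling is an arbitrary choice made here.

```python
import math
from itertools import combinations


def expected_score(logprobs):
    """Probability-weighted average of score values.

    `logprobs` maps each possible score value to the log-probability of its
    token. Assumption: probabilities are renormalized over these G tokens.
    """
    peak = max(logprobs.values())  # subtract the max for numerical stability
    weights = {value: math.exp(lp - peak) for value, lp in logprobs.items()}
    total = sum(weights.values())
    return sum(value * w for value, w in weights.items()) / total


def reward(verifications):
    """Average expected score over C criteria and K repeated verifications.

    `verifications` is a list (one entry per criterion) of lists (one entry
    per repeat) of logprob dicts, so the mean below is the 1 / (C*K) factor
    when every criterion has the same number of repeats.
    """
    scores = [expected_score(lp) for repeats in verifications for lp in repeats]
    return sum(scores) / len(scores)


def round_robin(candidates, verify_pair):
    """Return the index of the candidate with the most pairwise wins.

    `verify_pair(a, b)` is a hypothetical callable returning
    (reward_a, reward_b). Equal rewards award no win; if win counts tie,
    the earliest candidate is kept (both are assumptions of this sketch).
    """
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        r_i, r_j = verify_pair(candidates[i], candidates[j])
        if r_i > r_j:
            wins[i] += 1
        elif r_j > r_i:
            wins[j] += 1
    return max(range(len(candidates)), key=wins.__getitem__)
```

In practice `verify_pair` would wrap one or more LLM calls that return logprobs for the score tokens and pass them through `reward`; here it is left as a parameter so the selection logic can be exercised with any scoring function.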

103

970

191.6K

Azalia Mirhoseini retweeted

Jacky Kwok@jackyk02·10 Apr

We release LLM-as-a-Verifier 🧠: A general-purpose verification framework that achieves SOTA 👑 on Terminal-Bench 2 (86.4%) and SWE-Bench Verified (77.8%) by scaling:
- scoring granularity
- repeated verification
- criteria decomposition

📄 Blog & Code: llm-as-a-verifier.notion.site

443

54.4K

Azalia Mirhoseini@Azaliamirh·14 Apr

Looking forward to this!

Alex Dimakis@AlexGDimakis

Check our new cool workshop for Agents, Discovery and Optimization: CAIS AI Agents for Discovery in the Wild. We have a pretty good speaker lineup. Submit your papers by: May 1st. (1/2)

4.4K

Azalia Mirhoseini@Azaliamirh·14 Apr

Some more results:

6.1K

Azalia Mirhoseini@Azaliamirh·14 Apr

Turns out we can get SOTA on agentic benchmarks with a simple test-time method! Excited to introduce LLM-as-a-Verifier.

Test-time scaling is effective, but picking the "winner" among many candidates is the bottleneck. We introduce a way to extract a cleaner signal from the model:

1️⃣ Ask the LLM to rank results on a scale of 1-k
2️⃣ Use the log-probs of those rank tokens to calculate an expected score

You can get a verification score in a single sampling pass per candidate pair.

Blog: llm-as-a-verifier.notion.site
Code: llm-as-a-verifier.github.io

Led by @jackyk02 and in collaboration with a great team: @shululi256, @pranav_atreya, @liu_yuejiang, @drmapavone, @istoica05
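
As a rough worked illustration of the expected-score step in this post: the symbols below are assumptions introduced here, not notation from the post, with \ell_g standing for the log-prob of rank token g and the probabilities renormalized over the k rank tokens.

```latex
% Expected score over ranks 1..k (assumed notation)
\[
\hat{s} \;=\; \sum_{g=1}^{k} g \, p_g,
\qquad
p_g \;=\; \frac{\exp(\ell_g)}{\sum_{j=1}^{k} \exp(\ell_j)}
\]
% Example with k = 3 and (p_1, p_2, p_3) = (0.2, 0.3, 0.5):
\[
\hat{s} \;=\; 1(0.2) + 2(0.3) + 3(0.5) \;=\; 2.3
\]
```

Taking only the sampled rank token would return 3 in this example, while the expected score of 2.3 also reflects the probability mass the model places on the lower ranks.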

114

987

115.7K

Azalia Mirhoseini@Azaliamirh·27 Mar

@Avanika15 @guruchahal ❤️

117

Avanika Narayan@Avanika15·27 Mar

@Azaliamirh @guruchahal go @Azaliamirh!!!!

182

Azalia Mirhoseini@Azaliamirh·27 Mar

It was great to chat with @guruchahal about Ricursive, chip design bottlenecks and what's next for this industry!

Lightspeed@lightspeedvp

What if AI could design the chips that power the next generation of AI? That’s the vision behind @RicursiveAI, an AI-driven semiconductor design platform that uses reinforcement learning to compress the chip development process from years to weeks. We led their $300M Series A in January, and Co-Founders @annadgoldie and @Azaliamirh went in-depth about Ricursive and the future of AI-driven chip design in this episode of The Investment Memo with Lightspeed partner @guruchahal.

0:00 Welcome, Anna Goldie and Azalia Mirhoseini!
3:10 How the Idea for AI Chip Design Started
4:58 How They Expanded From Software to Chip Design
5:27 Turning the Idea Into a Google Moonshot
7:20 When Google Leadership Realized the Potential
10:35 Why Physical Design Is the Hardest Part of Chip Design
13:51 The 3-Phase Vision for the Future of Chip Design
17:51 Market Opportunity & Industry Disruption
20:24 The “Designless” Future of Hardware
22:04 Speed, Innovation & AI-Driven Exploration
21:00 The Role of Engineers in an AI-Driven Future
24:22 The Founders’ Unique Partnership Story
25:44 How They Hire World-Class Talent
27:05 Choosing the Right Investors
30:42 Advice for AI Researchers & Founders
33:16 What’s Most Important in the Next 12 months?

13.4K
