PatronusAI

399 posts

PatronusAI

@PatronusAI

Simulation research and infrastructure for human-aligned AGI https://t.co/8X6bVgvCHd

Katılım Temmuz 2023

215 Takip Edilen2.3K Takipçiler

Sabitlenmiş Tweet

PatronusAI@PatronusAI·18 Ara

1/ Today, we are thrilled to announce Generative Simulators, a new class of adaptive, auto-scaling environments for AGI training and evaluation 🤖🧵 Static datasets, hand-authored environments, and human-curated demonstrations do not automatically scale with the learning patterns of the trained model. We propose Generative Simulators as a principled alternative: environments that evolve, evaluate, and adapt to agent behavior over time. Technical Report: patronus.ai/generative-sim… Blog: patronus.ai/blog/introduci…

English

16.5K

PatronusAI@PatronusAI·2d

Meet the Patronus team 🎉 In case you missed it: we recently announced our Series B, and we're hiring like crazy. Here's a chance to meet a few of the people who make the magic happen! Hear from Patronus members dig into why data quality is the real bottleneck to AI reliability, the strange patterns in how models fail, and why simulations are what gets us from here to superintelligence. They also share what kinds of people thrive here and the culture we're building along the way. Check out our open roles: patronus.ai/join #checkthedata

English

2.9K

PatronusAI@PatronusAI·4d

Two weeks ago we announced our $50M Series B. Now that the news has settled, we want to share this article that captures what our round means. AI agents are starting to run real, multi-step work. Before one can be trusted to book a trip or run a financial analysis, someone has to prove it'll do the job right. That's what we build: simulated replicas of real systems where agents get stress-tested against the messy scenarios they'll hit in production, the same way Waymo trained cars in synthetic worlds before the road. What we're most excited about is what comes next. As our co-founder and CEO Anand Kannappan put it: "We want to be able to actually create the environment in which you can operate an agent that can run for 10 hours or 10 days or 10 weeks." That's the frontier we're building toward. Thanks to Marina Temkin from @TechCrunch for capturing our story with this great piece! Read the full article here: techcrunch.com/2026/06/25/pat…

English

2.8K

PatronusAI@PatronusAI·1 Tem

Saturday night we gathered at the Exploratorium in San Francisco to raise a glass to our Series B🥂. The evening was our way of thanking the people who made this milestone possible, and welcoming the ones helping shape what comes next. It was awesome to celebrate with incredible people like @polynoamial, @mobav0, @aparnabsinha, @cutlasskelly, @timshi_ai, @LiorOnAI, @crwhite_ml, and more. It was a beautiful evening on the water, with views of the Bay Bridge, and we even got the whole room in black ties. Thank you to everyone who showed up for us. This is just the beginning.

English

2.1K

PatronusAI@PatronusAI·25 Haz

Digital World Models preview: patronus.ai/dwm

English

576

PatronusAI@PatronusAI·25 Haz

Read our story: patronus.ai/blog/announcin…

English

839

PatronusAI@PatronusAI·25 Haz

Today, we’re excited to announce our $50M Series B, led by @GreenfieldVC, with participation from @lightspeedvp and @notablecap. 🚀 At Patronus AI, we develop simulations and evals to train and improve AI. The first phase of AI was built on static benchmarks, but that era is over. As agents are used to solve longer and longer tasks, they need to practice in dynamic, living worlds to get better. Simulations are the critical infrastructure powering this next phase. As a company, we’re behind the most influential research and products in AI evaluation, like FinanceBench, Lynx, and Percival. And things have moved at the speed of light since.⚡ We partner with the world's leading frontier AI labs and enterprises, and our revenue has grown more than 15x over the past year. Additionally, today, we’re introducing a preview of the first Digital World Model for AI agent training and simulation: Patronus-DWM. Digital World Models are language diffusion world models that predict realistic environment behaviors and steer agent actions across digital workflows. Just as physical world models predict how objects move through space, we’re developing the equivalent for the digital world: predicting how agents act in digital workflows, then using that to scale the creation of high-quality training data for LLMs. Digital World Models help us push the frontier of ultra long horizon workflows, and unlock a new class of self-improving RL environments. This is our scalable approach to simulating all of the world’s intelligence. The round was also joined by @datadoghq, @SamsungVentures, @gokulr, @factorialcap, and a large cohort of amazing AI leaders across @AnthropicAI, @OpenAI, @GoogleDeepMind, @nvidia, @Recursive_SI, and more.✨ It has been the ride of a lifetime. But we’re just getting started. The best is yet to come. "Do not go gentle into that good night, Rage, rage against the dying of the light" - Dylan Thomas (1954)

English

128

93.8K

PatronusAI@PatronusAI·16 Haz

Spotlighting our paper on analyzing and mitigating LLM judge biases that has been accepted to ACL2026 Findings 🎊! When people use an LLM as a judge, they assume the score reflects the content. But ask the same model to rate identical text on a 0-4 scale, then a 1-5 scale, then a 2-6 scale, and the scores shift in ways the content never justifies. Existing work on LLM-as-a-judge largely treats the scoring range as a neutral design choice. We show it is not. We call this failure mode score range bias: a systematic distortion that undermines anyone relying on direct assessment. Our key insight is that models from the same family (Llama-3, Qwen-2.5) encode similar biases regardless of size. We exploit this with contrastive decoding, subtracting the smaller model's logits from the larger one so that the shared bias cancels out while the signal survives. The effect is consistent: up to 11.7% relative improvement in Spearman correlation with human judgments, holding across all score ranges tested. We hope this work pushes LLM-as-a-judge toward more robust evaluation, especially for practitioners working with non-standard score ranges where the bias is most damaging. arXiv Paper: arxiv.org/abs/2510.18196 @akkikiki

English

1.7K

PatronusAI@PatronusAI·12 Haz

Patronus AI is simulating the world's intelligence every day, and now we're doing it from Times Square too 🗽 Pulling off moments like this is exactly the kind of work you'd own as our Founding Marketer, which is one of the roles we're hiring for right now. Take a look at a few of our open roles below: Founding Marketer - patronus.ai/job-detail?gh_… Senior Software Engineer - patronus.ai/job-detail?gh_… Thanks to our friends at @join_arc for the billboard!

English

743

PatronusAI@PatronusAI·4 Haz

Last week we hosted our quarterly RL dinner at Taksim in SF, bringing together a group of people doing post-training work at the frontier. The evening consisted of great Turkish food, sharp conversation and a room full of people shaping what's coming next!

English

PatronusAI@PatronusAI·28 May

Spotlighting our latest research accepted to the ICML 2026 Position Paper track: "Position: We Need A Unified Definition of Hallucination, Or: It's the World Model, Stupid!" 🎉 We keep saying LLMs hallucinate, but what does that really mean? The lack of a clear, unified definition has historically led to disparate characterizations, for example faithfulness, factuality, or calibration failure. While these definitions work for basic QA, they do not extend naturally to multi-turn and agent-in-environment settings. For instance, evaluating "faithfulness to context" becomes inadequate when an agent intentionally chooses how its context builds up over time. To address this, we argue for a neater characterization: casting hallucination as an internal world modeling failure. In other words, a model hallucinates when it begins making claims that contradict a reference world model (which might simply be fixed environment dynamics, rather than a neural model). The Formal Definition: We introduce a reference world model 𝑊 = (𝑆, 𝐻, 𝑅), a conflict policy 𝑃, and a truth function 𝑇_(𝑊,𝑷). A model hallucinates with respect to (𝑊, 𝑃) if and only if it produces at least one atomic claim 𝑐 ∈ 𝒞(𝑦) such that 𝑇_(𝑊,𝑷)(𝑥, 𝘲) = false. This framework matters for two main reasons. First, it makes the scaling of hallucination benchmarks possible, as any environment with known dynamics can be instantiated to match it. Second, it formally handles Parametric vs. Contextual-driven disagreements through the conflict policy 𝘗, which can simply be null where no such divergence exists. Building on this definition, we are also excited to share our sequel benchmark: HalluWorld. This work actively measures hallucination by asking probes as an agent solves tasks in three environments with known, controllable dynamics: GridWorlds, Chess, and the Terminal. Read the papers here: 📄ICML '26 Position Paper: arxiv.org/abs/2512.21577 🌍️HalluWorld Preprint: arxiv.org/abs/2605.19341… (See below for Fig 1 and a breakdown of our definitions!) @VarunGangal

English

526

PatronusAI@PatronusAI·27 May

Honored to be included in @Redpoint's 2026 InfraRed 100 alongside so many innovative AI infrastructure companies. Congratulations to all the companies featured this year!

English

339

PatronusAI@PatronusAI·13 May

Spotlighting our benchmark for agentic search: DETOUR which was accepted to ACL 2026 🎊! When people try to recall something in conversation, they rarely give a perfect query upfront. They say things like “that movie with the scene where…” or “the paper about…” and the assistant has to ask the right follow-up questions to get there. Existing search and agent benchmarks often miss this multi-turn, tip-of-the-tongue behavior. To more realistically evaluate it, we introduce DETOUR: Dual-agent based Evaluation Through Obscure Under-specified Retrieval, an interactive benchmark for dual-agent search and reasoning. DETOUR contains 1,011 prompts across text, image, audio, and video. In the benchmark, a Primary Agent is evaluated on its ability to identify a target entity by querying a consistent Memory Agent, testing whether models can resolve ambiguity through useful follow-up questions. Current state-of-the-art models still struggle: performance reaches only 36% accuracy across all modalities, showing that today’s agents remain weak at clarification-seeking in underspecified, real-world search settings. We hope DETOUR helps push the next generation of search agents toward better reasoning, better questions, and more robust multi-turn retrieval. arXiv Paper: arxiv.org/abs/2602.00352 @getdarshan @anandnk24 @rebeccatqian

English

924

PatronusAI@PatronusAI·11 May

Excited to share that our paper, Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis, has been accepted to ICML 2026 🎉 As RL coding agents become more capable, they also become better at exploiting gaps in reward functions: passing tests, satisfying proxies, or appearing successful without actually solving the underlying task. Detecting this behavior from live training rollouts is difficult, especially when a single trajectory can look plausible in isolation. To study this, we introduce TRACE: a human-verified benchmark of 517 multi-turn trajectories spanning 54 fine-grained categories of code reward hacks. The key finding: models are much better at detecting reward hacks when they analyze trajectories contrastively, rather than one at a time. This setup fits naturally with rollout-based RL pipelines such as GRPO, where multiple trajectories are already generated and compared. In our experiments, GPT-5.2 improved from a 45% detection rate in isolated settings to 63% in contrastive settings, but the gap to human-level performance remains substantial. A few takeaways: Contrast matters: Increasing cluster size from N=1 to N=5 produced a large improvement in Match Rate across models. Semantic hacks are harder: Models detect syntactic exploits, such as test manipulation or hardcoded outputs, more reliably than hacks that require understanding intent or broader context. Model behavior varies but trends remain consistent: GPT-5.2 was the most robust overall, while Claude Opus 4.5 showed the largest gain when evaluated contrastively. Reasoning strategy matters: Models performed better when they grounded their judgments in specific code artifacts and explored downstream consequences. They performed worse when they over-relied on user acceptance or the agent's own explanations in the trajectory. Our hope is that TRACE helps the community build more robust reward functions and better detection systems for RL training pipelines. arXiv Paper: lnkd.in/guWD-dnk Hugging Face Dataset: lnkd.in/gxGcuCUf

English

PatronusAI@PatronusAI·23 Nis

Last week marked a major milestone: the opening of our new headquarters off Market Street in downtown San Francisco. It was a special moment for our team and a meaningful opportunity to bring together the community that has supported us along the way. Huge thanks to everyone who came to celebrate this with us. Here's to what's ahead!

English

1.6K

PatronusAI@PatronusAI·11 Mar

We spent the weekend at the @Meta x @Cerebral_valley Hackathon, one of the largest gatherings focused on RL and post-training systems. It was so much fun meeting builders thinking deeply about agents, environments, and how models actually learn. At Patronus AI, we spend a lot of time thinking about how to simulate the world’s intelligence and it was inspiring to see so many people exploring adjacent ideas. Excited to keep the conversations going with the folks we met this weekend. If we didn’t get a chance to connect, feel free to reach out! Thank you to the OpenEnv team, Cerebral Valley and Shack 15 for the space. Thank you to those we connected with and as always, the best is yet to come!

English

636

PatronusAI@PatronusAI·3 Mar

We're excited to judge the @Meta- @PyTorch Hackathon with @cerebral_valley this weekend! At Patronus AI, we're developing simulation research and infrastructure to accelerate progress toward human-aligned AGI. Looking forward to seeing the creative ideas participants bring and meeting talented builders pushing the boundaries of AI. If you'll be there, come say hi! We'll have some fun merch and would love to connect. See you there!

English

587

PatronusAI retweetledi

Darshan Deshpande@getdarshan·30 Oca

RL coding agents increasingly game rewards by exploiting their semantic and syntactic weaknesses. Can LLMs detect such behaviors from live training rollouts? We find contrastive cluster analysis is key! 🚀 GPT-5.2 jumps from 45% to 63%. Humans reach 90% Paper + data 🧵

English

1.5K

Keşfet

@TechCrunch @polynoamial @mobav0 @aparnabsinha @cutlasskelly @timshi_ai @LiorOnAI @crwhite_ml