PatronusAI

383 posts

@PatronusAI

Simulation research and infrastructure for human-aligned AGI https://t.co/8X6bVgvCHd

Joined July 2023
251 Following · 1.9K Followers
Pinned Tweet
PatronusAI
PatronusAI@PatronusAI·
1/ Today, we are thrilled to announce Generative Simulators, a new class of adaptive, auto-scaling environments for AGI training and evaluation 🤖🧵 Static datasets, hand-authored environments, and human-curated demonstrations do not automatically scale with the learning patterns of the trained model. We propose Generative Simulators as a principled alternative: environments that evolve, evaluate, and adapt to agent behavior over time. Technical Report: patronus.ai/generative-sim… Blog: patronus.ai/blog/introduci…
3
23
56
11.8K
PatronusAI
PatronusAI@PatronusAI·
We spent the weekend at the @Meta x @Cerebral_valley Hackathon, one of the largest gatherings focused on RL and post-training systems. It was so much fun meeting builders thinking deeply about agents, environments, and how models actually learn. At Patronus AI, we spend a lot of time thinking about how to simulate the world’s intelligence and it was inspiring to see so many people exploring adjacent ideas. Excited to keep the conversations going with the folks we met this weekend. If we didn’t get a chance to connect, feel free to reach out! Thank you to the OpenEnv team, Cerebral Valley and Shack 15 for the space. Thank you to those we connected with and as always, the best is yet to come!
0
0
3
172
PatronusAI
PatronusAI@PatronusAI·
We're excited to judge the @Meta- @PyTorch Hackathon with @cerebral_valley this weekend! At Patronus AI, we're developing simulation research and infrastructure to accelerate progress toward human-aligned AGI. Looking forward to seeing the creative ideas participants bring and meeting talented builders pushing the boundaries of AI. If you'll be there, come say hi! We'll have some fun merch and would love to connect. See you there!
1
0
3
261
PatronusAI retweeted
Darshan Deshpande
Darshan Deshpande@getdarshan·
RL coding agents increasingly game rewards by exploiting their semantic and syntactic weaknesses. Can LLMs detect such behaviors from live training rollouts? We find contrastive cluster analysis is key! 🚀 GPT-5.2 jumps from 45% to 63%. Humans reach 90% Paper + data 🧵
1
3
6
516
PatronusAI
PatronusAI@PatronusAI·
At @PatronusAI, we're excited to publish a new article with tutorials and examples for LLM Post Training. 🚀 Post-training further trains pre-trained foundation large language models (LLMs) on curated datasets so that they gain domain-specific knowledge or learn behaviors such as following instructions or adhering to certain styles. In this article, you will learn about the techniques, best practices, and tools for post-training models, including Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning (RL). You will also explore Proximal Policy Optimization (PPO), Group Relative Policy Optimization (GRPO), and reward models for reinforcement learning, and follow RL implementation examples in Python. Read the article at: patronus.ai/guide-to-rl-en… All based on the latest AI research produced by the @PatronusAI Team and the broader research community. #AI #NLP #LLM
0
0
1
214
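As a rough illustration of the GRPO idea named in the post above: the method samples several completions for the same prompt and scores each one relative to the rest of its group, rather than against a learned value function. The sketch below shows only that normalization step. The function name, the `eps` constant, and the choice of population standard deviation are assumptions made for illustration; they are not taken from the linked article.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper: each advantage is (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one prompt, scored 1.0 (pass) or 0.0 (fail).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group where every completion gets the same reward carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Completions that beat their group average get a positive advantage and the rest get a negative one, so the policy update favors the better samples without needing a separate critic.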
PatronusAI
PatronusAI@PatronusAI·
At @PatronusAI, we're excited to publish a new article with tutorials and examples for RL Environments. 🚀 In this article, you will learn the core concepts behind reinforcement learning (RL), where AI models and agents learn by trial and error based on feedback in the form of rewards and penalties. You will also understand when to use RL in place of or in addition to supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). You will then learn how to create RL environments and explore their core components, like state, action space, reward functions, and transition dynamics. Finally, we walk through an example in Python to explain how RL environments are implemented in practice. Read the article at: patronus.ai/guide-to-rl-en… All based on the latest AI research produced by the @PatronusAI Team and the broader research community. #AI #NLP #LLM
0
0
7
508
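The four components listed in the post above (state, action space, reward function, transition dynamics) can be seen together in a very small environment. The following is a minimal sketch, not the example from the linked article: `LineWorld`, `StepResult`, `run_episode`, and the reward values are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    state: int
    reward: float
    done: bool


class LineWorld:
    """Toy environment: the agent walks along positions 0..size-1 toward the right end."""

    ACTIONS = (-1, +1)  # action space: step left or step right

    def __init__(self, size=5):
        self.size = size
        self.state = 0  # state: the agent's current position

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action not in self.ACTIONS:
            raise ValueError(f"invalid action: {action}")
        # Transition dynamics: deterministic move, clipped to the ends of the line.
        self.state = min(max(self.state + action, 0), self.size - 1)
        done = self.state == self.size - 1
        # Reward function: small penalty per step, bonus on reaching the goal.
        reward = 1.0 if done else -0.1
        return StepResult(self.state, reward, done)


def run_episode(env, policy, max_steps=50):
    """Roll out one episode and return the total reward collected."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        result = env.step(policy(state))
        total += result.reward
        state = result.state
        if result.done:
            break
    return total


def always_right(state):
    return +1


total = run_episode(LineWorld(size=5), always_right)
```

The per-step penalty is a design choice: it makes shorter paths score higher, which gives a learning agent a reason to prefer the direct route over wandering.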
PatronusAI
PatronusAI@PatronusAI·
5/ We are partnering with model developers to develop frontier RL environments. With generative simulation, we are constructing hyperrealistic, auto-scaling worlds that are complex and learnable, to train agents to perform real world job functions ranging from equity research analysts to product engineers. 🧑‍💼💼
1
2
6
385
PatronusAI
PatronusAI@PatronusAI·
We're still buzzing from our night at the San Diego Zoo! We had an incredible evening hosting our @NeurIPSConf community. Guests explored the park at sunset and joined us for a private Wildlife Encounter featuring six amazing animals, including a tenrec 🐾 , owl 🦉, opossum 🐀, hedgehog 🦔, cheetah 🐆, and even a howling wolf 🐺! It was truly an unforgettable setting for meaningful conversations about RL environments, AI evaluation, and the future of intelligent systems. We're grateful to everyone who joined us and made the night special. Thank you all!
2
1
8
464
PatronusAI
PatronusAI@PatronusAI·
We’re excited to support @Meta and @huggingface's OpenEnv launch today! OpenEnv provides an open-source framework for building and interacting with agentic execution environments. This allows researchers and developers to create isolated, secure, deployable, and usable environments.
Lately, at Patronus, we’ve been working on RL environments for coding agents, and we were excited to contribute to OpenEnv with real-world-inspired tools and tasks to train and steer AGI. We began with a Gitea-based git server environment. Git server environments are foundational and enable effective collaboration and version control for software workflows, and we thought it would be a perfect way to get started with OpenEnv.
With our git server environment, we support:
* Fast iteration across runs with sub-second resets for RL training loops
* Shared server + isolated workspaces
* Environment variables + setting custom configs for Gitea
We look forward to seeing what everyone builds with OpenEnv!
GitHub: github.com/meta-pytorch/O…
HuggingFace: huggingface.co/openenv
1
0
3
567
PatronusAI
PatronusAI@PatronusAI·
Our CTO, @rebeccatqian, spoke at the @PyTorch Measuring Intelligence Summit 2025 yesterday! She was on the Beyond the Leaderboard: Practical Intelligence in the Wild panel with @jeremyphoward (fast.ai) and @haifengxu0 (@UChicago / ProphetArena), moderated by @shishirpatil_ (@Meta). The group discussed the limitations of public benchmarks and explored how real-world tasks, such as code generation, enterprise analytics, and scientific discovery, can guide evaluation priorities and methodology. We’re excited to continue pushing the boundaries in this space with novel agent evaluation benchmarks and the development of dynamic, feedback-driven training environments. Thank you to @joespeez for organizing the conference, and to the other speakers at the summit. We enjoyed hearing from Vivienne Zhang, @polynoamial, @achowdhery, Yifan Mai, and @ml_angelopoulos! #PyTorchCon
1
3
23
2.5K
PatronusAI
PatronusAI@PatronusAI·
At @PatronusAI, we're excited to publish a new article with tutorials and examples for AI Guardrails. 🚀 In this article, you will learn about the importance of AI guardrails in ensuring the reliable and ethical use of large language models in various industries, and the different components, strategies, and tools involved in their development and deployment. Read the article at: patronus.ai/ai-reliability… All based on the latest AI research produced by the @PatronusAI Team and the broader research community. #AI #NLP #LLM
0
0
2
206
PatronusAI
PatronusAI@PatronusAI·
Introducing MEMTRACK, a new benchmark designed to evaluate long-term memory and state tracking in multi-platform agent environments. 🎉 Human memory enables us to achieve complex objectives by taking in, storing, and applying saved information. We wanted to evaluate how LLMs would perform when given access to memory tools. We found that although LLMs are effective at general tool calling, they struggle to use memory tools properly, which leads to continued underperformance on long-context reasoning and follow-ups. This makes agent memory an exciting space to unlock performance gains. The team is looking forward to presenting this paper as part of the @NeurIPSConf SEA workshop in December! arXiv Paper: arxiv.org/pdf/2510.01353 Blog: patronus.ai/blog/memtrack
0
2
5
1.8K
PatronusAI
PatronusAI@PatronusAI·
At @PatronusAI, we're excited to publish a new article with tutorials and examples for AI Agent Tools. 🚀 In this article, you will learn about AI agent tools that allow AI models to interact with external systems and enhance their capabilities through real-time data access to third-party systems for taking automated actions. You will learn state-of-the-art best practices for invoking tools in agentic workflows, designing guardrails, applying reinforcement learning, and evaluating the functionality and effectiveness of AI agents powered by tools. Read the article at: patronus.ai/ai-agent-devel… All based on the latest AI research produced by the @PatronusAI Team and the broader research community. #AI #NLP #LLM
0
0
2
189
PatronusAI
PatronusAI@PatronusAI·
Introducing Percival Chat, a new way to work with Percival, the first AI agent that can evaluate and fix other agents 🚀 Now, you can chat with Percival to automatically analyze your agent traces and detect complex failures, making your AI more reliable and secure. Read more in our blog post: patronus.ai/blog/percival-…
0
0
2
202
PatronusAI
PatronusAI@PatronusAI·
At @PatronusAI, we're excited to publish a new article on the best practices for Advanced Prompt Engineering. 🚀 In this article, you will learn about advanced prompt engineering techniques that can maximize the potential of large language models, including self-ask decomposition, step-back prompting, contextual priming, and more. Read the article at: lnkd.in/guhqp7_g All based on the latest AI research produced by the @PatronusAI Team and the broader research community. #AI #NLP #LLM
0
1
4
244
PatronusAI
PatronusAI@PatronusAI·
At @PatronusAI, we're excited to publish a new article on the best practices for LLM Observability. 🚀 In this article, you will learn how LLM observability empowers engineering teams by capturing and analyzing various aspects of LLM-based applications—like prompts, responses, latency, costs, hallucinations, and chain trace data—to optimize performance, accuracy, and reliability. This article also covers the tools and best practices for adopting LLM observability in your AI application environment. All based on the latest AI research produced by the @PatronusAI Team and the broader research community. Read the article at: patronus.ai/llm-testing/ll… #AI #NLP #LLM
0
1
3
348