Darshan Deshpande

161 posts

Darshan Deshpande

@getdarshan

Research Scientist working on RL environments and evals @PatronusAI | ex-Research @USC_ISI

San Francisco, CA Katılım Temmuz 2020

71 Takip Edilen204 Takipçiler

Sabitlenmiş Tweet

Darshan Deshpande@getdarshan·30 Oca

RL coding agents increasingly game rewards by exploiting their semantic and syntactic weaknesses. Can LLMs detect such behaviors from live training rollouts? We find contrastive cluster analysis is key! 🚀 GPT-5.2 jumps from 45% to 63%. Humans reach 90% Paper + data 🧵

English

516

Darshan Deshpande@getdarshan·30 Oca

@jonashuebotter Agreed! I will try to experiment a little to see if we can incorporate teacher uncertainty (confidence-based) or cross-rollout outcome variance as a scaling factor for the current advantage function. Happy to collaborate too if this seems interesting to you 🙂

English

Jonas Hübotter@jonashubotter·30 Oca

@getdarshan This would be super interesting to study! We only studied verifiable settings where your verifier tells you (up to noise) whether your outcome was correct or not.

English

Jonas Hübotter@jonashubotter·29 Oca

Training LLMs with verifiable rewards uses 1bit signal per generated response. This hides why the model failed. Today, we introduce a simple algorithm that enables the model to learn from any rich feedback! And then turns it into dense supervision. (1/n)

English

110

798

129K

Darshan Deshpande@getdarshan·30 Oca

@jonashuebotter Environment as a filter makes sense. Did you notice localization problems when there is only partial observability and this filtering becomes unreliable? As a basic example: a captcha appears inconsistently across G rollouts leading to poor cross-trajectory signal

English

105

Jonas Hübotter@jonashubotter·30 Oca

@getdarshan Great question! In our work the self-teacher never generates, so it wouldn’t face the exact same problem. The feedback comes from the environment and the students own generations (which have been marked as successful by the environment). So the environment acts as a filter :)

English

542

Darshan Deshpande@getdarshan·30 Oca

We hope TRACE enables more robust reward function design and better detection in RL training pipelines! 🤖 Dataset: hf.co/datasets/Patro… Paper: arxiv.org/abs/2601.20103 Work done @PatronusAI ❤️ Models used: @AnthropicAI @OpenAI @GeminiApp @Kimi_Moonshot @Zai_org @deepseek_ai

English

272

Darshan Deshpande@getdarshan·30 Oca

Where do models fail? 🤔 - Semantic reward hacks are harder to detect than syntactic hacks! - Models consistently show similar failures QA reveals: ✅ Grounding and exploring consequences helps ❌ Over-reliance on user acceptance or self awareness patterns impact performance

English

Darshan Deshpande@getdarshan·30 Oca

English

516

Darshan Deshpande retweetledi

Varun Gangal@VarunGangal·7 Ara

👋 Folks at #NEURIPS2025, come check out & stop by the poster of our Memtrack env at the SEA workshop happening at Upper Level 23ABC, 3:50pm onwards. Our env studies how well an agent dropped into a workplace can context engineer by composing tool calls to access intertwined slack, linear & git timelines in pursuit of answering a battery of related questions. Full paper arxiv: arxiv.org/abs/2510.01353

Darshan Deshpande@getdarshan

🚨We will be presenting Memtrack today at the SEA workshop from 3:50pm onwards at #NeurIPS2025 Memtrack is a SoTA eval env to study an agent's ability to memorize and retrieve facts using exploration over interleaved enterprise slack, linear and git threads in a multi-QA setting

English

818

Darshan Deshpande@getdarshan·7 Ara

English

2.5K

Darshan Deshpande@getdarshan·30 Kas

I will also be presenting Memtrack at #SEAWorkshop on 7th of Dec! arxiv.org/abs/2510.01353

English

Darshan Deshpande@getdarshan·30 Kas

I will be at #NeurIPS2025 from 2nd-7th Dec. Happy to meet old and new friends and chat about non-deterministic evals, long horizon RL and world building 🌍

English

108

Darshan Deshpande@getdarshan·13 Kas

@AlexanderSpangh Congratulations Alex!

English

Alex Spangher @ Neurips2025@AlexanderSpangh·12 Kas

✨ Very overdue update: I'll be starting as an Assistant Professor in CS at University of Minnesota, Twin Cities, Fall 2026. I will be recruiting PhD students!! Please help me spread the word! [Thread] 1/n

English

142

745

91.6K

Darshan Deshpande@getdarshan·29 Eki

@DanAdvantage @willccbb This was my intro to Prime hub > Sees bounty with benchmark datasets > Visits an eval env on hub > Sees an option to train on eval env > ?? > Posts on X > Comments are about how you can use other trainers too?? Misunderstanding more than shady? Maybe but still needs fixing!

English

Dan Advantage@DanAdvantage·29 Eki

@willccbb @getdarshan imo there is nothing shady going on at all?!

English

Darshan Deshpande@getdarshan·29 Eki

Creating a bounty program out of benchmark datasets that restrict training on to then create RL environments that can be trained on using Prime's "open source" training services. This is scammy practice under the name of open science!

will brown@willccbb

if you or a loved one is looking to learn about building environments and get a bag in the process, inquire within our bounty list is bigger and better than ever

English

4.4K

Darshan Deshpande@getdarshan·29 Eki

@willccbb Here are my good faith recommendations if you want them: 1. Eval-only envs shouldn't link to train with environment (even if it points to docs) 2. Remove outputs dir from eval envs since this upstream eval data 3. READMEs must cite sources (e.g., paperbench is missing them)

English

128

will brown@willccbb·29 Eki

sorry but this feels like you’re grasping at straws looking for some kind of dunk. most of our evals are python wrapper libraries contributed by community members, and are referencing upstream codebases or datasets which themselves have individual fully-permissive licenses which still apply, and which don’t mandate a particular method of displaying them. if you have good-faith suggestions on how we could improve guidance/guardrails for licenses or documentation, we’re all ears. i am not sure what you mean by “lack of transparency” though. there is nothing preventing people from training on test in the first place, and the people who you should be most worried about doing this are the labs who don’t disclose how they’re training models at all, not independent/open-source researchers. we working to make post-training research easier and more accessible, which, sure, makes doing *bad* research easier as well, but that’s not a good reason to not do it.

English

147

Darshan Deshpande@getdarshan·29 Eki

@willccbb It's very easy for people to abuse evals because they have gold labels which can be converted into training rewards. I've seen a lot of such cases and I assume you have too. It's best to stay transparent here for the sake of the community.

English

155

Darshan Deshpande@getdarshan·29 Eki

@willccbb Most of your hosted environments don't even have licenses attached to them as they are expected to. The descriptions don't incentivize fair use/purpose either. I do evals and I understand where you are coming from but open science should come with fair credit and disclosures

English

138

Darshan Deshpande@getdarshan·23 Eki

Excited to have contributed to OpenEnv before its release today! Thanks to @Meta and @huggingface for working towards standardizing RL environment creation!

PatronusAI@PatronusAI

We’re excited to support @Meta and @huggingface's OpenEnv launch today! OpenEnv provides an open-source framework for building and interacting with agentic execution environments. This allows researchers and developers to create isolated, secure, deployable, and usable environments. Lately, at Patronus, we’ve been working on RL environments for coding agents, and we were excited to contribute to OpenEnv with real-world-inspired tools and tasks to train and steer AGI. We began with a Gitea-based git server environment. Git server environments are foundational and enable effective collaboration and version control for software workflows, and we thought it would be a perfect way to get started with OpenEnv. With our git server environment, we support: * Fast iteration across runs with sub-second resets for RL training loops * Shared server + isolated workspaces * Environment variables + setting custom configs for Gitea We look forward to seeing what everyone builds with OpenEnv! GitHub: github.com/meta-pytorch/O… HuggingFace: huggingface.co/openenv

English

208

Darshan Deshpande retweetledi

PatronusAI@PatronusAI·6 Ağu

Thank you, @BerkeleyRDI, for hosting the Agentic AI Summit and having us! @getdarshan, one of our research scientists, who leads agent evaluation here at Patronus, presented at the summit! Here are a few takeaways: * Given context explosion and increasing domain depth and specificity, we are approaching a 𝗻𝗲𝘄 𝗮𝗴𝗲 𝗼𝗳 𝗰𝗼𝗻𝘁𝗲𝘅𝘁𝘂𝗮𝗹 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀. * As the AI we work with becomes exponentially better, nuances become more important, as does the 𝗰𝗿𝗲𝗮𝘁𝗶𝗼𝗻 𝗼𝗳 𝗴𝗿𝗼𝘂𝗻𝗱𝗲𝗱 𝗳𝗲𝗲𝗱𝗯𝗮𝗰𝗸-𝗱𝗿𝗶𝘃𝗲𝗻 𝗲𝗻𝘃𝗶𝗿𝗼𝗻𝗺𝗲𝗻𝘁𝘀. * Agent evaluation and 𝗲𝘅𝗽𝗹𝗮𝗶𝗻𝗮𝗯𝗹𝗲 𝗔𝗜 go hand-in-hand. Explainable agents are optimal for understanding agent workflows, fixing errors, and improving trajectories. * Our team has seen success in developing 𝗵𝗮𝗿𝗱, 𝗱𝗼𝗺𝗮𝗶𝗻-𝘀𝗽𝗲𝗰𝗶𝗳𝗶𝗰, 𝗮𝗻𝗱 𝗻𝗼𝘃𝗲𝗹 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀 to rigorously evaluate AI performance. 𝘠𝘰𝘶 𝘤𝘢𝘯 𝘳𝘦𝘢𝘥 𝘮𝘰𝘳𝘦 𝘢𝘣𝘰𝘶𝘵 𝘋𝘢𝘳𝘴𝘩𝘢𝘯’𝘴 𝘳𝘦𝘤𝘦𝘯𝘵 𝘸𝘰𝘳𝘬 𝘩𝘦𝘳𝘦: * 𝗧𝗥𝗔𝗜𝗟: 𝗔 𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸 𝗳𝗼𝗿 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 patronus.ai/blog/introduci… * 𝗕𝗟𝗨𝗥: 𝗔 𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸 𝗳𝗼𝗿 𝗧𝗶𝗽-𝗼𝗳-𝘁𝗵𝗲-𝗧𝗼𝗻𝗴𝘂𝗲 𝗦𝗲𝗮𝗿𝗰𝗵 𝗮𝗻𝗱 𝗥𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 patronus.ai/blog/the-blur-… * 𝗚𝗟𝗜𝗗𝗘𝗥: 𝗦𝗼𝗧𝗔 𝗦𝗟𝗠 𝗝𝘂𝗱𝗴𝗲 patronus.ai/blog/glider-st… Reach out if you’re interested in chatting more about agent evals and how we can collaborate! #BerkeleyRDI #AgenticAISummit

English

278

Darshan Deshpande retweetledi

Clémentine Fourrier 🍊 is off till Dec 2026 hiking@clefourrier·15 May

Check out the very cool work from our friends @PatronusAI 🔥 work here! huggingface.co/spaces/Patronu…

English

1.1K

Keşfet

@jonashuebotter @PatronusAI @AnthropicAI @OpenAI @GeminiApp @Kimi_Moonshot @Zai_org @deepseek_ai