Darshan Deshpande

161 posts

Darshan Deshpande

Darshan Deshpande

@getdarshan

Research Scientist working on RL environments and evals @PatronusAI | ex-Research @USC_ISI

San Francisco, CA Katılım Temmuz 2020
71 Takip Edilen204 Takipçiler
Sabitlenmiş Tweet
Darshan Deshpande
Darshan Deshpande@getdarshan·
RL coding agents increasingly game rewards by exploiting their semantic and syntactic weaknesses. Can LLMs detect such behaviors from live training rollouts? We find contrastive cluster analysis is key! 🚀 GPT-5.2 jumps from 45% to 63%. Humans reach 90% Paper + data 🧵
Darshan Deshpande tweet media
English
1
3
6
516
Darshan Deshpande
Darshan Deshpande@getdarshan·
@jonashuebotter Agreed! I will try to experiment a little to see if we can incorporate teacher uncertainty (confidence-based) or cross-rollout outcome variance as a scaling factor for the current advantage function. Happy to collaborate too if this seems interesting to you 🙂
English
1
0
1
87
Jonas Hübotter
Jonas Hübotter@jonashubotter·
@getdarshan This would be super interesting to study! We only studied verifiable settings where your verifier tells you (up to noise) whether your outcome was correct or not.
English
1
0
1
85
Jonas Hübotter
Jonas Hübotter@jonashubotter·
Training LLMs with verifiable rewards uses 1bit signal per generated response. This hides why the model failed. Today, we introduce a simple algorithm that enables the model to learn from any rich feedback! And then turns it into dense supervision. (1/n)
Jonas Hübotter tweet media
English
20
110
798
129K
Darshan Deshpande
Darshan Deshpande@getdarshan·
@jonashuebotter Environment as a filter makes sense. Did you notice localization problems when there is only partial observability and this filtering becomes unreliable? As a basic example: a captcha appears inconsistently across G rollouts leading to poor cross-trajectory signal
English
1
0
0
105
Jonas Hübotter
Jonas Hübotter@jonashubotter·
@getdarshan Great question! In our work the self-teacher never generates, so it wouldn’t face the exact same problem. The feedback comes from the environment and the students own generations (which have been marked as successful by the environment). So the environment acts as a filter :)
English
1
0
3
542
Darshan Deshpande
Darshan Deshpande@getdarshan·
Where do models fail? 🤔 - Semantic reward hacks are harder to detect than syntactic hacks! - Models consistently show similar failures QA reveals: ✅ Grounding and exploring consequences helps ❌ Over-reliance on user acceptance or self awareness patterns impact performance
Darshan Deshpande tweet media
English
1
0
2
98
Darshan Deshpande
Darshan Deshpande@getdarshan·
RL coding agents increasingly game rewards by exploiting their semantic and syntactic weaknesses. Can LLMs detect such behaviors from live training rollouts? We find contrastive cluster analysis is key! 🚀 GPT-5.2 jumps from 45% to 63%. Humans reach 90% Paper + data 🧵
Darshan Deshpande tweet media
English
1
3
6
516
Darshan Deshpande retweetledi
Varun Gangal
Varun Gangal@VarunGangal·
👋 Folks at #NEURIPS2025, come check out & stop by the poster of our Memtrack env at the SEA workshop happening at Upper Level 23ABC, 3:50pm onwards. Our env studies how well an agent dropped into a workplace can context engineer by composing tool calls to access intertwined slack, linear & git timelines in pursuit of answering a battery of related questions. Full paper arxiv: arxiv.org/abs/2510.01353
Darshan Deshpande@getdarshan

🚨We will be presenting Memtrack today at the SEA workshop from 3:50pm onwards at #NeurIPS2025 Memtrack is a SoTA eval env to study an agent's ability to memorize and retrieve facts using exploration over interleaved enterprise slack, linear and git threads in a multi-QA setting

English
0
3
7
818
Darshan Deshpande
Darshan Deshpande@getdarshan·
🚨We will be presenting Memtrack today at the SEA workshop from 3:50pm onwards at #NeurIPS2025 Memtrack is a SoTA eval env to study an agent's ability to memorize and retrieve facts using exploration over interleaved enterprise slack, linear and git threads in a multi-QA setting
Darshan Deshpande tweet media
English
0
4
13
2.5K
Darshan Deshpande
Darshan Deshpande@getdarshan·
I will be at #NeurIPS2025 from 2nd-7th Dec. Happy to meet old and new friends and chat about non-deterministic evals, long horizon RL and world building 🌍
English
1
0
1
108
Alex Spangher @ Neurips2025
Alex Spangher @ Neurips2025@AlexanderSpangh·
✨ Very overdue update: I'll be starting as an Assistant Professor in CS at University of Minnesota, Twin Cities, Fall 2026. I will be recruiting PhD students!! Please help me spread the word! [Thread] 1/n
Alex Spangher @ Neurips2025 tweet media
English
40
142
745
91.6K
Darshan Deshpande
Darshan Deshpande@getdarshan·
@DanAdvantage @willccbb This was my intro to Prime hub > Sees bounty with benchmark datasets > Visits an eval env on hub > Sees an option to train on eval env > ?? > Posts on X > Comments are about how you can use other trainers too?? Misunderstanding more than shady? Maybe but still needs fixing!
Darshan Deshpande tweet media
English
1
0
1
13
Darshan Deshpande
Darshan Deshpande@getdarshan·
@willccbb Here are my good faith recommendations if you want them: 1. Eval-only envs shouldn't link to train with environment (even if it points to docs) 2. Remove outputs dir from eval envs since this upstream eval data 3. READMEs must cite sources (e.g., paperbench is missing them)
English
1
0
3
128
will brown
will brown@willccbb·
sorry but this feels like you’re grasping at straws looking for some kind of dunk. most of our evals are python wrapper libraries contributed by community members, and are referencing upstream codebases or datasets which themselves have individual fully-permissive licenses which still apply, and which don’t mandate a particular method of displaying them. if you have good-faith suggestions on how we could improve guidance/guardrails for licenses or documentation, we’re all ears. i am not sure what you mean by “lack of transparency” though. there is nothing preventing people from training on test in the first place, and the people who you should be most worried about doing this are the labs who don’t disclose how they’re training models at all, not independent/open-source researchers. we working to make post-training research easier and more accessible, which, sure, makes doing *bad* research easier as well, but that’s not a good reason to not do it.
English
1
0
7
147
Darshan Deshpande
Darshan Deshpande@getdarshan·
@willccbb It's very easy for people to abuse evals because they have gold labels which can be converted into training rewards. I've seen a lot of such cases and I assume you have too. It's best to stay transparent here for the sake of the community.
English
1
0
2
155
Darshan Deshpande
Darshan Deshpande@getdarshan·
@willccbb Most of your hosted environments don't even have licenses attached to them as they are expected to. The descriptions don't incentivize fair use/purpose either. I do evals and I understand where you are coming from but open science should come with fair credit and disclosures
English
1
0
1
138
Darshan Deshpande
Darshan Deshpande@getdarshan·
Excited to have contributed to OpenEnv before its release today! Thanks to @Meta and @huggingface for working towards standardizing RL environment creation!
PatronusAI@PatronusAI

We’re excited to support @Meta and @huggingface's OpenEnv launch today! OpenEnv provides an open-source framework for building and interacting with agentic execution environments. This allows researchers and developers to create isolated, secure, deployable, and usable environments. Lately, at Patronus, we’ve been working on RL environments for coding agents, and we were excited to contribute to OpenEnv with real-world-inspired tools and tasks to train and steer AGI. We began with a Gitea-based git server environment. Git server environments are foundational and enable effective collaboration and version control for software workflows, and we thought it would be a perfect way to get started with OpenEnv. With our git server environment, we support: * Fast iteration across runs with sub-second resets for RL training loops * Shared server + isolated workspaces * Environment variables + setting custom configs for Gitea We look forward to seeing what everyone builds with OpenEnv! GitHub: github.com/meta-pytorch/O… HuggingFace: huggingface.co/openenv

English
0
0
2
208
Darshan Deshpande retweetledi
PatronusAI
PatronusAI@PatronusAI·
Thank you, @BerkeleyRDI, for hosting the Agentic AI Summit and having us! @getdarshan, one of our research scientists, who leads agent evaluation here at Patronus, presented at the summit! Here are a few takeaways: * Given context explosion and increasing domain depth and specificity, we are approaching a 𝗻𝗲𝘄 𝗮𝗴𝗲 𝗼𝗳 𝗰𝗼𝗻𝘁𝗲𝘅𝘁𝘂𝗮𝗹 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀. * As the AI we work with becomes exponentially better, nuances become more important, as does the 𝗰𝗿𝗲𝗮𝘁𝗶𝗼𝗻 𝗼𝗳 𝗴𝗿𝗼𝘂𝗻𝗱𝗲𝗱 𝗳𝗲𝗲𝗱𝗯𝗮𝗰𝗸-𝗱𝗿𝗶𝘃𝗲𝗻 𝗲𝗻𝘃𝗶𝗿𝗼𝗻𝗺𝗲𝗻𝘁𝘀. * Agent evaluation and 𝗲𝘅𝗽𝗹𝗮𝗶𝗻𝗮𝗯𝗹𝗲 𝗔𝗜 go hand-in-hand. Explainable agents are optimal for understanding agent workflows, fixing errors, and improving trajectories. * Our team has seen success in developing 𝗵𝗮𝗿𝗱, 𝗱𝗼𝗺𝗮𝗶𝗻-𝘀𝗽𝗲𝗰𝗶𝗳𝗶𝗰, 𝗮𝗻𝗱 𝗻𝗼𝘃𝗲𝗹 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀 to rigorously evaluate AI performance. 𝘠𝘰𝘶 𝘤𝘢𝘯 𝘳𝘦𝘢𝘥 𝘮𝘰𝘳𝘦 𝘢𝘣𝘰𝘶𝘵 𝘋𝘢𝘳𝘴𝘩𝘢𝘯’𝘴 𝘳𝘦𝘤𝘦𝘯𝘵 𝘸𝘰𝘳𝘬 𝘩𝘦𝘳𝘦: * 𝗧𝗥𝗔𝗜𝗟: 𝗔 𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸 𝗳𝗼𝗿 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 patronus.ai/blog/introduci… * 𝗕𝗟𝗨𝗥: 𝗔 𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸 𝗳𝗼𝗿 𝗧𝗶𝗽-𝗼𝗳-𝘁𝗵𝗲-𝗧𝗼𝗻𝗴𝘂𝗲 𝗦𝗲𝗮𝗿𝗰𝗵 𝗮𝗻𝗱 𝗥𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 patronus.ai/blog/the-blur-… * 𝗚𝗟𝗜𝗗𝗘𝗥: 𝗦𝗼𝗧𝗔 𝗦𝗟𝗠 𝗝𝘂𝗱𝗴𝗲 patronus.ai/blog/glider-st… Reach out if you’re interested in chatting more about agent evals and how we can collaborate! #BerkeleyRDI #AgenticAISummit
PatronusAI tweet media
English
0
1
2
278