

Ankit Gupta
180 posts

@AnkitGuptaAI
researcher, cmu cs alum // past: databricks, imc trading, facebook, instagram, amazon




How can we make a better TerminalBench agent? Today, we are announcing the OpenThoughts-Agent project. OpenThoughts-Agent v1 is the first TerminalBench agent trained on fully open curated SFT and RL environments. OpenThinker-Agent-v1 is the strongest model of its size on TerminalBench, and sets a new bar on our newly released OpenThoughts-TB-Dev benchmark. (1/n)




History doesn’t repeat but it rhymes… especially in enterprise software.
Hosting an Enterprise Software Trivia Night with @_shreya_s on December 1st in SF!
If you’re curious about how the iconic software companies were built & enjoy studying the greats, you’ll love this event.

Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵




New Scale research: Can smaller models reliably oversee stronger LLM agents? We red team monitoring systems to detect covert sabotage, like agents secretly downloading sensitive information.





1. McKinsey Consultant as a Software
2. Venture Capitalist as a Software
3. FP&A Analyst as a Software
4. Legal / Compliance as a Software
Build and make it widely accessible. If you want credits, reply to this post. If there are other ideas you want to pursue and need credits, reply again.




