
Kate Catlin

@Kate_Catlin
AI Model Lifecycle PM for @github Copilot. Building tools for AI developers. Potluck enthusiast. Laughs often. Views my own.

Hot take from looking at @github Copilot telemetry: benchmarks make coding models look wildly different. Production workflows make them look much more similar. 👀 We looked at 23M+ Copilot requests and examined one simple metric: code survivability.


.@OpenAI’s GPT-5.2-Codex is now rolling out in GitHub Copilot. This model excels at large code changes like refactors or migrations, and has improved performance in Windows environments. Try it out in @code. github.blog/changelog/2026…

At first, prompting seemed to be a temporary workaround for getting the most out of large language models. But over time, it's become critical to the way we interact with AI. On the @LightconePod, Garry, Harj, Diana, and Jared break down what they've learned from working with hundreds of founders building with LLMs: why prompting still matters, where it breaks down, and how teams are making it more reliable in production. They share real examples of prompts that failed, how companies are testing for quality, and what the best teams are doing to make LLM outputs useful and predictable.

0:58 - Parahelp’s prompt example
4:59 - Different types of prompts
6:51 - Metaprompting
7:58 - Using examples
12:10 - Some tricks for longer prompts
14:18 - Findings on evals
17:25 - Every founder has become a forward-deployed engineer (FDE)
23:18 - Vertical AI agents are closing big deals with the FDE model
26:13 - The personalities of the different LLMs
27:26 - Lessons from rubrics
29:47 - Kaizen and the art of communication