
Ted Moskovitz
@ted_moskovitz
science of scaling at @AnthropicAI

We're hosting a social hour in London in early August for quant traders and developers. If you'd like to join us, meet researchers from our London office, and learn about the technical problems we're working on, please sign up at the following form: gem.com/form?formID=3d…



We're launching an "AI psychiatry" team as part of interpretability efforts at Anthropic! We'll be researching phenomena like model personas, motivations, and situational awareness, and how they lead to spooky/unhinged behaviors. We're hiring - join us! job-boards.greenhouse.io/anthropic/jobs…



Transformers employ different strategies throughout training to minimize loss, but how do these trade off, and why? Excited to share our newest work, where we show remarkably rich competitive and cooperative interactions (termed "coopetition") as a transformer learns. Read on 🔎⏬



We’re starting a Fellows program to help engineers and researchers transition into doing frontier AI safety research full-time. Beginning in March 2025, we'll provide funding, compute, and research mentorship to 10–15 Fellows with strong coding and technical backgrounds.


Math reasoning benchmarks keep getting saturated… Excited to introduce HARD-Math: a Human-Annotated Reasoning Dataset for Math. Consisting of 4,780 short-answer problems based on the AHSME, AMC, & AIME contests, HARD-Math still poses a challenge for frontier LLMs. Read on 🔎⏬


When do transformers length-generalize? Generalizing to sequences longer than seen during training is a key challenge for transformers. Some tasks see success, others fail — but *why*? We introduce a theoretical framework to understand and predict length generalization.
