
Pete Shaw
@ptshaw2
Research Scientist @GoogleDeepmind

In the limit, what's important is our ability to adapt. What is a good recipe for teaching agents to adapt on the fly? We introduce two papers on meta-learning for LLMs, written with @JonnyCoook at @GoogleDeepMind. This is research from last year that we can finally share 🧵👇
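For readers new to the setup, here is a generic first-order meta-learning step (Reptile-style) as a reference point for what "learning to adapt" means. This is only a sketch of the general technique, not the recipe from either paper; `task_batches` is any iterator of (inputs, targets) batches drawn from a single task.

```python
import copy
import torch
import torch.nn as nn

def reptile_step(model, task_batches, inner_lr=1e-2, meta_lr=0.1):
    """One meta-training step: adapt a copy of the model on a single task,
    then pull the shared initialization toward the adapted weights."""
    fast = copy.deepcopy(model)                       # task-specific copy
    opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for x, y in task_batches:                         # a few gradient steps on one task
        opt.zero_grad()
        nn.functional.mse_loss(fast(x), y).backward()
        opt.step()
    with torch.no_grad():                             # outer (meta) update
        for p, p_fast in zip(model.parameters(), fast.parameters()):
            p.add_(meta_lr * (p_fast - p))
```

Repeating this step over many sampled tasks moves the initialization toward weights that can be adapted quickly at test time.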


🚨Excited to share our new work viewing reasoning strategies as teaching tools: for a fixed target model, which CoT strategies best support learning and generalization? ✨Our answer: intrinsic dimensionality (the minimum effective capacity a model needs to solve the task). Somewhat counterintuitively, adding CoT – which requires generating longer and more structured outputs – can reduce learning complexity. Good reasoning compresses the task, i.e., it reduces the degrees of freedom the model needs to map inputs to correct solutions. 🧵⬇️ (1/5)
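A toy illustration of the idea as I read the post (not the paper's actual measurement): treat "intrinsic dimensionality" as the smallest model width that still fits a task, and compare supervision with and without CoT-style intermediate steps. The parity task, thresholds, and architecture below are all assumptions for the sketch.

```python
import torch
import torch.nn as nn

def fit_and_eval(x, y, width, steps=2000, lr=1e-2):
    """Train a tiny MLP of the given hidden width and return training accuracy."""
    model = nn.Sequential(nn.Linear(x.shape[1], width), nn.ReLU(),
                          nn.Linear(width, y.shape[1]))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.binary_cross_entropy_with_logits(model(x), y)
        loss.backward()
        opt.step()
    preds = (model(x) > 0).float()
    return (preds == y).float().mean().item()

def min_capacity(x, y, widths=(1, 2, 4, 8, 16, 32), acc_threshold=0.99):
    """Smallest hidden width that clears the accuracy threshold, or None."""
    for width in widths:
        if fit_and_eval(x, y, width) >= acc_threshold:
            return width
    return None

# Toy task: 8-bit parity. Direct supervision exposes only the final bit;
# "CoT-style" supervision also exposes the running prefix parities, which
# may be fit at a smaller width (i.e., lower effective capacity).
x = (torch.rand(512, 8) > 0.5).float()
y_direct = x.sum(dim=1, keepdim=True).remainder(2)   # final parity only
y_cot = x.cumsum(dim=1).remainder(2)                  # all intermediate parities
print("direct:", min_capacity(x, y_direct))
print("with intermediate steps:", min_capacity(x, y_cot))
```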

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

We are releasing the first benchmark to evaluate how well automatic evaluators, such as LLM judges, can assess web agent trajectories. We find that rule-based evals underreport success rates, and no single LLM judge excels across all benchmarks.

We collect trajectories from web agents built on four LLMs (Claude 3.7, GPT-4o, Llama 3.3, Qwen2.5-VL) across popular web benchmarks (AssistantBench, WebArena, VWA, WorkArena, WorkArena++).

An amazing team effort with: @a_kazemnejad @ncmeade @arkil_patel @dcshin718 @alejaz_a @karstanczak @ptshaw2 @chrisjpal @sivareddyg
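A minimal sketch of the two evaluator styles being compared, as I read the post (not the released AgentRewardBench code); the data classes, prompt wording, and helper names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    action: str          # e.g. "click('Add to cart')"
    observation: str     # page text after the action

def rule_based_success(trajectory: List[Step], required_substring: str) -> bool:
    """Rule-based eval: success only if a hand-written pattern appears.
    Brittle patterns are one way this style can underreport success."""
    return any(required_substring in step.observation for step in trajectory)

def llm_judge_success(trajectory: List[Step], task: str,
                      ask_llm: Callable[[str], str]) -> bool:
    """LLM-judge eval: have a model read the whole trajectory and decide.
    `ask_llm` is any prompt -> completion function (e.g. a chat API wrapper)."""
    transcript = "\n".join(f"ACTION: {s.action}\nOBSERVATION: {s.observation}"
                           for s in trajectory)
    prompt = (f"Task: {task}\n\nTrajectory:\n{transcript}\n\n"
              "Did the agent complete the task? Answer YES or NO.")
    return ask_llm(prompt).strip().upper().startswith("YES")
```

The benchmark question is then how well judgments like `llm_judge_success` agree with human annotations of the same trajectories, compared against the rule-based checks.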



