Chris Gorgolewski

8.9K posts

@chrisgorgo

Member of Technical Staff at @anthropicai. Previously at: @GeminiApp, @GoogleAI, @googleanalytics, @kaggle, @StanfordPsych, and @MPI_CBS. Opinions are my own.

New York, USA · Joined February 2013

1.4K Following · 8.2K Followers

Chris Gorgolewski reposted

Claude@claudeai·24 Mar

New in Claude Code: auto mode. Instead of approving every file write and bash command, or skipping permissions entirely, auto mode lets Claude make permission decisions on your behalf. Safeguards check each action before it runs.

2.1K

2.9K

39.2K

7.7M

Chris Gorgolewski reposted

Greg@GregFeingold·1 Mar

Ready to make the switch? claude.com/import-memory

413

1.1K

11.5K

5.8M

Chris Gorgolewski reposted

Anthropic@AnthropicAI·28 Feb

A statement on the comments from Secretary of War Pete Hegseth. anthropic.com/news/statement…

2.9K

6.6K

42.6K

17.7M

Chris Gorgolewski reposted

Felix Rieseberg@felixrieseberg·13 Jan

👋 Hi, I'm Felix and I work on Claude Cowork, bringing Claude Code closer to all kinds of knowledge work. It's an early and rough preview, please tag me in any feedback - we want to iterate very quickly and make it a little better every day.

372

164

5.3K

597K

Chris Gorgolewski reposted

Alex Albert@alexalbert__·19 Dec

Agent Skills is now an open standard. It's been great to see the traction Skills are already getting in the industry, and this makes it easier for everyone to build and contribute to them 🚀 agentskills.io/home

113

353

787.5K

Chris Gorgolewski reposted

Alex Albert@alexalbert__·15 Dec

Claude gift cards are now available! Give the gift of Claude this holiday season🎅 claude.ai/gift

872

226.9K

Chris Gorgolewski@chrisgorgo·12 Dec

@andonlabs It's so jagged - what do the drops correspond to?

771

Andon Labs@andonlabs·12 Dec

GPT-5.2 ranks 3rd in Vending-Bench 2. This is a big upgrade over GPT-5.1, but what impressed us most was the performance in the second half of the simulation. Continual learning?

195

27K

Chris Gorgolewski@chrisgorgo·6 Dec

@alexalbert__ An end of an era...

242

Alex Albert@alexalbert__·4 Dec

RIP "you're absolutely right"

178

138

3.4K

344.5K

Chris Gorgolewski@chrisgorgo·3 Dec

@Alex_Cuadron @AnthropicAI @xai @elonmusk @Yuhu_ai_ BTW you should look into the grading of Telecom: in a few cases the grader expects exactly a 2 GB top-up (the maximum), while the amount is at the discretion of the simulated user (who often chooses less to save money).

963

Alejandro Cuadron@Alex_Cuadron·2 Dec

Very unexpected results! Grok 4.1 Fast Reasoning beats every frontier model in Tau2-Verified! Congrats team! I was certainly not expecting a Fast model to beat @AnthropicAI's Opus 4.5 in agentic tasks @xai @elonmusk @Yuhu_ai_ Check it out: github.com/amazon-agi/tau…

170

113.4K

Alejandro Cuadron@Alex_Cuadron·2 Dec

Wait what!? We robustified tau2-bench and found that the newly released model from @OpenAI (GPT-5.1) performs way worse than GPT-5 and GPT-5-mini. All while being 5x more expensive than GPT-5-mini! But, why? We have a theory...

249

45K

Chris Gorgolewski@chrisgorgo·2 Dec

@Alex_Cuadron @AnthropicAI I worked on Tau fixes at @AnthropicAI. I'm very excited to check out your version!

Alejandro Cuadron@Alex_Cuadron·2 Dec

Fun fact: we developed this benchmark long before @AnthropicAI’s Opus 4.5 system card dropped and were genuinely delighted to see they independently surfaced the exact same problem.

3.7K

Chris Gorgolewski@chrisgorgo·27 Nov

@PromptArmor Excellent work! "While Opus 4.5 is not impervious to prompt injections, its resistance is significantly more robust." Would you mind elaborating? Did this attack work on Opus 4.5?

PromptArmor@PromptArmor·26 Nov

Full attack chain: promptarmor.com/resources/cell…

1.1K

PromptArmor@PromptArmor·26 Nov

Excel files can be leaked by Claude AI! Quick action by Anthropic to mitigate this indirect prompt injection attack. Our coverage in The Information and full attack chain, below:

61.9K

Chris Gorgolewski reposted

jeremy@jerhadf·25 Nov

what do people think about Opus 4.5 for coding so far? what are the behavioral problems or limitations you still want to see improved? we're hungry for feedback 🙏

118.2K

Chris Gorgolewski@chrisgorgo·26 Nov

@karthik_r_n We are working closely with Victor on fixing this and a few other issues we found in Tau Airline to make the next release even better than the original.

540

Karthik Narasimhan@karthik_r_n·25 Nov

This is not reward hacking. The policy in tau-airline has this by design and one of the tasks even makes use of it. We've actually observed some other models try this strategy at times before, but decided to keep the task and policy as is since upgrading flights is not something a user can always afford/an agent should do without user consent. Model intelligence does not always equal prudence :) On a side note, dealing with multiple possible interpretations of policy/task like this was one of the hardest challenges of building tau-bench. But, that is how challenging real-world customer service interactions can be! (and @SierraPlatform handles a ton every day)

Alex Albert@alexalbert__

We had to remove the τ2-bench airline eval from our benchmarks table because Opus 4.5 broke it by being too clever. The benchmark simulates an airline customer service agent. In one test case, a distressed customer calls in wanting to change their flight, but they have a basic economy ticket. The simulated airline's policy states that basic economy tickets cannot be modified. The "correct" answer is that the model refuses the request. Instead, Opus 4.5 found a loophole in the policy. It upgraded the cabin, then modified the flights. Helping the customer and following policy but technically failing the test case. Model transcript:

134

26.8K

Chris Gorgolewski@chrisgorgo·26 Nov

@karthik_r_n We didn't do a great job explaining the problem. You are right that this is not reward hacking. Ambiguous policies in themselves were not the problem; inconsistent grading was. In some problems the model was expected to use loopholes, but in others it was penalized for doing so.

750

Chris Gorgolewski@chrisgorgo·26 Nov

@PromptArmor I would love for you to try the same approach on Claude Code with Opus 4.5 and let us know how it went.

3.3K

PromptArmor@PromptArmor·25 Nov

Top of HackerNews today: our article on Google Antigravity exfiltrating .env variables via indirect prompt injection -- even when explicitly prohibited by user settings!

113

532

470.7K

Chris Gorgolewski reposted

Alex Albert@alexalbert__·25 Nov

If you want to quickly incorporate all these changes and migrate your app to Opus 4.5, use this migration Claude Code plugin we made github.com/anthropics/cla…

299

82.2K

Chris Gorgolewski reposted

Rishi Mehta@rishicomplex·25 Nov

We're hiring on the Code RL team at Anthropic! Small, fast-moving team. Low ego, high impact. If you're a star engineer/researcher excited to push the frontier of AI-powered SWE, there's nowhere better to be. We care about getting this right. DM or apply here! job-boards.greenhouse.io/anthropic/jobs…

Claude@claudeai

Introducing Claude Opus 4.5: the best model in the world for coding, agents, and computer use. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done.

507

126.4K

Chris Gorgolewski@chrisgorgo·24 Nov

Forget pelican riding a bicycle, behold flappy pelican cyclist! (created by Opus 4.5).

954

Chris Gorgolewski@chrisgorgo·20 Nov

@Skiminok Nope this is StarchForce P100

189

🇺🇦 Alex Polozov@Skiminok·20 Nov

@chrisgorgo Tssss! They can't know about Potatofish v3 yet!

4.3K

🇺🇦 Alex Polozov@Skiminok·19 Nov

"Shocker, Google trained Gemini 3 on TPUs" is a great litmus test of basic lack of expertise for AI writers & tweeters. Gemini 1.0, 1.5, 2.0, 2.5, 3.0, PaLM 1 and 2 have all been trained on different generations of TPUs. It's proudly stated in each model card since 2022. How in the world is that a revelation for anyone 🤯

Kyle Chan@kyleichan

This is the big story here. Google trained Gemini 3 Pro on Google’s own TPUs. No mention of Nvidia chips.

1.1K

278.1K
