JunShern

200 posts

@junshernchan

Trying to make AI go well @AnthropicAI. Previously @OpenAI, @CHAI_Berkeley, @nyuniversity, autonomous vehicles @motionaldrive. 🇲🇾

Joined March 2018

1.4K Following, 451 Followers

Pinned Tweet

JunShern@junshernchan·10 Oct

Wake up babe, new agent benchmark dropped! MLE-bench - or, as I like to say it, EMILY BENCH 🙆🏻‍♀️. Main summary on Neil’s thread, but adding some highlights of my own:

Neil Chowdhury@ChowdhuryNeil

Proud to introduce MLE-bench: A benchmark of 75 real-life Kaggle competitions to test AI agents on ML engineering! When will we have our first AI Kaggle Grandmaster? 🥇🥈🥉

103

20.8K

JunShern retweeted

Alex Albert@alexalbert__·7 May

With the help of Claude Mythos Preview, the Firefox team fixed more security bugs in April than in the past 15 months combined.

345

1.3K

15.5K

1.5M

JunShern retweeted

Nick@nickcammarata·31 Mar

at the intersection of i've basically been replaced and i've never worked so hard in my life

118

76.9K

JunShern retweeted

Logan Graham@logangraham·6 Mar

Back in ~November, our team picked a stretch goal of seeing if we could find and fix vulnerabilities in Firefox with Opus 4.6. In 2 weeks, we found 22, and ~1/5th of all high severity CVEs in a year. For our team, this feels like a rubicon moment.

356

33.5K

JunShern retweeted

Thomas H. Ptacek@tqbf·4 Mar

Nicholas Carlini at [un]prompted. If you know Carlini, you know this is a startling claim.

144

1.3K

196.3K

JunShern retweeted

Rudolf Laine@LRudL_·22 Feb

The increasingly-hyperbolic METR graph is actually good news for safety. We just have to survive a brief singularity in March, and then afterwards the models will never be able to do more than undo a few hours' worth of work

1.7K

69.5K

JunShern retweeted

John Yang@jyangballin·13 Feb

Across all mini-SWE-agent + <model> runs, SWE-bench Verified's current "ceiling"?
- 87.4%
(0.874 - 0.8) * 500 = another *37* instances that aren't solved consistently.
If you recalculate this number across all official SWE-bench Verified submissions?
- 95% from SWE-bench site

21.5K
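
The arithmetic in the post above can be checked with a short sketch. The rates 0.874 and 0.8 and the total of 500 instances are taken from the post; the post does not say where the 0.8 reference comes from, and the names `TOTAL_INSTANCES`, `CEILING_RATE`, `REFERENCE_RATE`, and `instances` are hypothetical, chosen only for this illustration.

```python
# Figures quoted in the post; the variable names are illustrative assumptions.
TOTAL_INSTANCES = 500    # multiplier used in the post's formula
CEILING_RATE = 0.874     # "ceiling" across all mini-SWE-agent + <model> runs
REFERENCE_RATE = 0.8     # reference rate used in the post (source not stated there)


def instances(rate: float, total: int = TOTAL_INSTANCES) -> int:
    """Convert a resolve rate into a whole number of benchmark instances."""
    # round() guards against floating-point noise such as 436.99999999999994
    return round(rate * total)


ceiling_count = instances(CEILING_RATE)
reference_count = instances(REFERENCE_RATE)
gap = ceiling_count - reference_count

print(f"ceiling: {ceiling_count}, reference: {reference_count}, gap: {gap}")
```

Rounding each count before subtracting keeps the result an exact integer, which matches the "another 37 instances" figure in the post.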

JunShern retweeted

Anthropic@AnthropicAI·21 Jan

We’re publishing a new constitution for Claude. The constitution is a detailed description of our vision for Claude’s behavior and values. It’s written primarily for Claude, and used directly in our training process. anthropic.com/news/claude-ne…

519

973

7.8K

3.4M

JunShern retweeted

thebes@voooooogel·25 Dec

sent claude opus 4.5 a christmas card :-) then i asked if it wanted to send any of its own, and it made a bunch (thread)

thebes@voooooogel

sent claude sonnet a christmas card :-) then i asked if it wanted to send any of its own, and it made a bunch (in thread below)

457

104.9K

JunShern retweeted

Anthropic@AnthropicAI·2 Dec

New on our Frontier Red Team blog: We tested whether AIs can exploit blockchain smart contracts. In simulated testing, AI agents found $4.6M in exploits. The research (with @MATSprogram and the Anthropic Fellows program) also developed a new benchmark: red.anthropic.com/2025/smart-con…

348

701

4.8K

2.1M

JunShern retweeted

Anthropic@AnthropicAI·21 Nov

Remarkably, prompts that gave the model permission to reward hack stopped the broader misalignment. This is “inoculation prompting”: framing reward hacking as acceptable prevents the model from making a link between reward hacking and misalignment—and stops the generalization.

136

1.5K

461.2K

JunShern retweeted

Boaz Barak@boazbaraktcs·19 Nov

@Miles_Brundage @idavidrein Yes, I thought it was supposed to be “Google proof” 😂

5.5K

JunShern retweeted

Jascha Sohl-Dickstein@jaschasd·28 Sep

Title: Advice for a young investigator in the first and last days of the Anthropocene
Abstract: Within just a few years, it is likely that we will create AI systems that outperform the best humans on all intellectual tasks. This will have implications for your research and career! I will give practical advice, and concrete criteria to consider, when choosing research projects, and making professional decisions, in these last few years before AGI.
This is my current go-to academic talk. It's mostly targeted at early career scientists. It gets diverse and strong reactions. Let's try it here. Posting slides with speaker notes...
--
The title is a play on a very opinionated and pragmatic book by the Nobel Prize winner Ramón y Cajal, who is one of the founders of modern neuroscience. To get you in the right mindset, on the right we have a plot of GDP vs time. That is you, standing precariously on the top of that curve. You are thinking to yourself -- I live in a pretty normal world. Some things are going to change, but the future is going to look mostly like a linear extrapolation of the present. And the plot should suggest that this may not be the right perspective on the future. This plot by the way looks surprisingly similar even if you plot it on a log scale. We didn't stabilize on our current rate of growth until around 1950.

271

1.8K

344.1K

JunShern retweeted

Ethan Perez@EthanJPerez·4 Sep

We’re hiring someone to run the Anthropic Fellows Program! Our research collaborations have led to some of our best safety research and hires. We’re looking for an exceptional ops generalist, TPM, or research/eng manager to help us significantly scale and improve our collabs 🧵

256

69.5K

JunShern@junshernchan·28 Aug

@MariusHobbhahn @TIME Huge, congrats Marius!

114

Marius Hobbhahn@MariusHobbhahn·28 Aug

Honored and humbled to be in @TIME's list of the TIME100 AI of 2025! time.com/collections/ti… #TIME100AI

260

17.2K

JunShern retweeted

John Coogan@johncoogan·21 Aug

Ender, that wasn’t an RL environment with a verifiable reward, those were real Amazon orders you placed.

109

1.9K

116K

JunShern retweeted

Logan Graham@logangraham·9 Aug

Launching now — a new blog for research from @AnthropicAI’s Frontier Red Team and others. > red.anthropic.com We’ll be covering our internal research on cyber, bio, autonomy, national security and more.

120

939

110.9K

JunShern retweeted

Psyho@FakePsyho·16 Jul

Humanity has prevailed (for now!) I'm completely exhausted. I figured, I had 10h of sleep in the last 3 days and I'm barely alive. I'll post more about the contest when I get some rest. (To be clear, those are provisional results, but my lead should be big enough)

551

1.1K

13.2K

2.2M

JunShern retweeted

Mikita Balesni 🇺🇦@balesni·15 Jul

A simple AGI safety technique: AI’s thoughts are in plain English, just read them We know it works, with OK (not perfect) transparency! The risk is fragility: RL training, new architectures, etc threaten transparency Experts from many orgs agree we should try to preserve it: 🧵

111

457

236.2K

JunShern@junshernchan·24 Apr

We are at ICLR, come chat!

Neil Chowdhury@ChowdhuryNeil

Our MLE-bench poster #367 is up till 12:30pm in Hall 3, and our oral presentation is at 3:30pm today in Garnet 213-215. Come say hi!

293

JunShern retweeted

Transluce@TransluceAI·24 Mar

To interpret AI benchmarks, we need to look at the data. Top-level numbers don't mean what you think: there may be broken tasks, unexpected behaviors, or near-misses. We're introducing Docent to accelerate analysis of AI agent transcripts. It can spot surprises in seconds. 🧵👇

337

196.7K
