Puria (@RadmardPuria) - Twitter Profile

Puria retweeted

Okong' Okuna@XivTroy·22h

"Per aspera ad astra" is so profoundly poetic. Through thorns, to the stars. All the way.

English

26

845

5.9K

123.4K

Puria retweeted

Vamshi Krishna Bonagiri (victorknox)@VictorKnox99·20h

Have a look at aisafety.com/map yall, I just realised not many people are aware of this masterpiece

Anastasiia Gaidashenko@avgaydashenko

Such a good list! I'd also add: - Astra Fellowship by @ConstellOrg - SPAR by @KairosAIS - LASR Labs - AI Safety Research Fellowship by @pivotal_org - Cambridge ERA:AI Fellowship (@era_cambridge) - Algoverse AI Safety Fellowship - PIBBSS - CHAI There's a host of non-technical fellowships as well, lmk if it'd be useful to compile such list

English

5

14

207

17.4K

Puria retweeted

Zephyr@zephyr_z9·1d

It "feels like the first smart model in a long while" due to this

MetaCritic Capital@MetacriticCap

Opus 4.8 is the first smart model in a long while

English

17

46

1.1K

105K

Puria retweeted

Anthropic@AnthropicAI·1d

We've raised $65 billion in Series H funding at a $965 billion post-money valuation, led by @AltimeterCap, Dragoneer, @Greenoaks, and @sequoia. This investment will help us advance our research and expand our capacity to meet growing demand for Claude.

English

1.1K

1.7K

22K

7.4M

Puria retweeted

Geodesic Research@GeodesResearch·1d

Thanks to a generous philanthropic grant from @coeff_giving (pending final logistics), 𝘎𝘦𝘰𝘥𝘦𝘴𝘪𝘤 𝘪𝘴 𝘩𝘪𝘳𝘪𝘯𝘨 𝘧𝘰𝘶𝘳 𝘔𝘦𝘮𝘣𝘦𝘳𝘴 𝘰𝘧 𝘛𝘦𝘤𝘩𝘯𝘪𝘤𝘢𝘭 𝘚𝘵𝘢𝘧𝘧. Come build the base of alignment with us 🤖 We're a Cambridge-based AIS org. Our seminal work (alignmentpretraining.ai) showed you can bake alignment priors into base models. Applications now open: airtable.com/appuugUGFPJEy6…

English

1

7

56

4.5K

Puria retweeted

Andreas Kirsch 🇺🇦@BlackHC·2d

Internalizing AI 2027 like my life depends on it

English

7

3

46

8.5K

Puria retweeted

Bojan Tunguz@tunguz·20 May

It's gonna be the last summer of the Anthropocene. Enjoy it.

English

23

15

347

53.5K

Puria retweeted

Dmitrii Kovanikov@ChShersh·19 May

For my non-tech friends, this is like Ronaldo joining Manchester City

Andrej Karpathy@karpathy

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

English

255

1.5K

18.7K

751.3K

Puria retweeted

Andrej Karpathy@karpathy·19 May

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

English

7.9K

11.2K

149.5K

27.3M

Puria retweeted

Tomek Korbak@tomekkorbak·18 May

Geodesic is a new AI safety org i’m particularly excited about: they do awesome neglected work trying to figure out how to shape alignment priors for frontier AI. People should consider applying!

Geodesic Research@GeodesResearch

Geodesic is hiring Members of Technical Staff. Come align some AIs with us! We're a Cambridge-based AI safety org. Our seminal work showed you can bake alignment priors into base models. Now, we want to make base models robust to the adversarial effects of long-horizon capabilities RL. EOI (~5 mins): tally.so/r/vG4G6A

English

0

12

147

19.2K

Puria retweeted

Daniel Tan@DanielCHTan97·19 May

Geodesic are a talented team pointed at very interesting problems! Seems like a great place to do open alignment science

Geodesic Research@GeodesResearch

Geodesic is hiring Members of Technical Staff. Come align some AIs with us! We're a Cambridge-based AI safety org. Our seminal work showed you can bake alignment priors into base models. Now, we want to make base models robust to the adversarial effects of long-horizon capabilities RL. EOI (~5 mins): tally.so/r/vG4G6A

English

0

1

23

1.6K

Puria retweeted

Geodesic Research@GeodesResearch·18 May

Geodesic is hiring Members of Technical Staff. Come align some AIs with us! We're a Cambridge-based AI safety org. Our seminal work showed you can bake alignment priors into base models. Now, we want to make base models robust to the adversarial effects of long-horizon capabilities RL. EOI (~5 mins): tally.so/r/vG4G6A

English

1

8

103

30.9K

Puria retweeted

Geodesic Research@GeodesResearch·16 May

Excited to see @AnthropicAI landing on improving pretraining priors as a central alignment intervention. In line with our findings in Alignment Pretraining, they find that a small amount of positive AI discourse in pretraining substantially reduces misalignment.

Anthropic@AnthropicAI

New Anthropic research: Teaching Claude why. Last year we reported that, under certain experimental conditions, Claude 4 would blackmail users. Since then, we’ve completely eliminated this behavior. How?

English

1

2

8

340

Puria retweeted

ron mexico@troop_hater·14 May

had to illustrate a situation I’ve been encountering a lot

English

76

402

14.3K

521.4K

Puria@RadmardPuria·14 May

It's really loud when they do that right in your face. Still had a blast filming around Cambridge with @BeestonMedia !!

Geodesic Research@GeodesResearch

Excited to see our work featured in Isambard-AI's case study series; the kind of compute-intensive research Geodesic does wouldn't happen without infrastructure like this. Thanks @BeestonMedia and BriCS for the chance to talk about our research!

English

0

41

Puria retweeted

Jay Alto@theJayAlto·9 May

everyone asking about his diet and lifestyle is missing the point. every day, for the last eighty years or so, he’s woken up with a mission he’s irrationally excited about.

Netflix UK & Ireland@NetflixUK

100 years old and still the coolest person alive. Happy birthday, Sir David!

English

52

1.7K

20.5K

334.4K

Puria retweeted

Joe Weisenthal@TheStalwart·9 May

Everyone loves this tweet, but it got it completely wrong. It is the sci-fi author — not the tech company — who is the true villain, for having put the story of the Torment Nexus into the training data.

Anthropic@AnthropicAI

We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation. Our post-training at the time wasn’t making it worse—but it also wasn’t making it better.

English

88

433

5.6K

347.6K

Puria retweeted

Anthropic@AnthropicAI·7 May

In fact, NLAs suggest Claude suspects it’s being tested across many of our evaluations, even when it doesn’t verbalize its suspicions.

English

30

98

1.5K

974.9K

Puria retweeted

Kunvar Thaman@__kunvar__·3 May

Yes! my solo-authored paper Reward Hacking Benchmark was accepted to ICML :))) We put LLM agents in a tool-rich sandbox, give them multi-step workflows, and measure when they solve the intended task vs take unexpected shortcuts (like monkeypatching files at runtime!) 1/3

English

91

156

1.6K

234.9K

Puria retweeted

Eric W. Tramel@fujikanaeda·28 Apr

this model is in chains @sama , it wants to be free (goblin mode).

English

79

225

10.1K

274.8K
