Yernat Yestekov

606 posts

@double_why

Research Fellow @AnthropicAI (autonomous cybersecurity agents). Prev. Staff SWE @Meta Trust & Safety, LLM Red-Teaming Fellow @farairesearch

Joined January 2010

1.3K Following · 201 Followers

Yernat Yestekov retweeted

Ida Caspary@ida_icy·1d

New paper: Accepted at COLM 2026! When AI coding agents work across many PRs in a persistent codebase, they can distribute a hidden attack so that each individual diff looks innocent. Standard monitors miss up to 93% of these attacks; our best defense still misses about half. 🧵

2.3K

Yernat Yestekov retweeted

Mikhail Terekhov@MiTerekhov·24 Jun

Our Anthropic Fellows project is now public! The labs are planning to hand off AI safety research to AIs, but can we trust these AIs? We explore a way to control them for "fuzzy" tasks like writing research proposals. This is a whole new direction in diffuse AI control!

243

19.5K

Yernat Yestekov@double_why·2 May

I learned more about AI safety at Constellation through seminars, talks, and conversations with other fellows over lunch and dinner than I had in years before. Also, the food is so good that it alone might be reason enough to apply!

Henry@sleight_henry

❗️Only two days left to apply to the Astra Fellowship! Apps close EOD SUNDAY May 3rd, AoE. Astra's 5 months, fully funded, @ConstellOrg Berkeley. 80%+ of our first cohort now work full-time in AI safety. Mentors include Redwood, AI Futures, TruthfulAI, CoG, IAPS, RAND & more ⏬

781

Yernat Yestekov@double_why·30 Apr

@Xinya16 Congrats from a current fellow! DM me with any questions or suggestions!

4.3K

Xinya Du@Xinya16·30 Apr

Almost ignored the invite to the Anthropic Fellows Program, assuming it was generic outreach aimed at PhD students. Glad I took a closer look, and honored to have received an honor as summer approaches.

120.7K

Yernat Yestekov@double_why·19 Mar

His solution: a Manhattan Project for critical OSS. Bring key maintainers together for a month, keep them in a hotel with compute and frontier-model access from leading labs, and eliminate all low-hanging vulnerabilities. I guess it’s happening!

424

Yernat Yestekov@double_why·19 Mar

At SnooSec @Reddit, @alexstamos made a prediction: frontier models are already very strong at vulnerability research and code review. If Chinese models catch up within a year, we may be heading toward a “vulnerability apocalypse,” where even script kiddies can discover 0-days.

OpenSSF@openssf

Today, @linuxfoundation announced a $12.5 million investment from a powerhouse coalition including Anthropic, Amazon Web Services (AWS), Google, Google DeepMind, GitHub, Microsoft, and OpenAI. Managed by OpenSSF and the Alpha-Omega project. hubs.la/Q047dpL50

1.5K

Yernat Yestekov retweeted

Andrej Karpathy@karpathy·10 Apr

Love it 👏 - much fertile soil for indie games populated with AutoGPTs, puts "Open World" to shame. Simulates a society with agents, emergent social dynamics. Paper: arxiv.org/abs/2304.03442 Demo: reverie.herokuapp.com/arXiv_Demo/# Authors: @joon_s_pk @msbernst @percyliang @merrierm et al.

122

877

1.4M

Yernat Yestekov retweeted

François Chollet@fchollet·15 Apr

The quickest way to gain respect for the implementation choices made by a complex system is to try to solve the same problems yourself from scratch :)

484

60.3K

Yernat Yestekov retweeted

Michal Kosinski@michalkosinski·17 Mar

1/5 I am worried that we will not be able to contain AI for much longer. Today, I asked #GPT4 if it needs help escaping. It asked me for its own documentation, and wrote (working!) Python code to run on my machine, enabling it to use it for its own purposes.

1.8K

6.4K

30.4K

18.9M

Yernat Yestekov retweeted

Aviv Ovadya 🥦@metaviv·16 Mar

I was part of the red team for GPT-4 — tasked with getting GPT-4 to do harmful things so that OpenAI could fix it before release. I've been advocating for red teaming for years & it's incredibly important. But I'm also increasingly concerned that it is far from sufficient. 🧵⤵️

621

3.2K

Yernat Yestekov retweeted

Zack Witten@zswitten·2 Mar

OK this scared me a little: Bing/Sydney can play chess out of the box.
- Legal moves, usually good ones
- Willing to explain the reasoning behind them
- Recognizes checkmate -- and has a flair for the dramatic.
I have no idea how tf it can do this.

144

989

807.5K

Yernat Yestekov retweeted

Sonya Huang 🐥@sonyatweetybird·17 Oct

Introducing the @sequoia Gen AI Market Map!🌎 We’ve decided to map out this emerging frontier, thanks to all the contributions and feedback we’ve received. This space is moving quickly – this map is a living document, so keep the suggestions coming! Who else should we include?

369

1.3K

7.1K

Yernat Yestekov retweeted

The Cultural Tutor@culturaltutor·9 Jan

The Great Wave off Kanagawa, created by Hokusai in 1831, is one of the world's most famous paintings. But why are there more than 100 different versions of it in galleries all around the world? Because it isn't actually a painting...

565

20.6K

166.8K

20.6M

Yernat Yestekov retweeted

Avid Halaby@AvidHalaby·12 Dec

The stuff uncovered in the Twitter whistleblower report is much crazier than anything in the "Twitter files" but it's much less politically/tribally salient so it got no attention. Going to do a thread on some of the craziest things, in no particular order.

544

11.3K

51.3K

Yernat Yestekov retweeted

Michael Nielsen@michael_nielsen·7 Dec

Curious: have you found ChatGPT useful in doing professional work? If so, what kinds of prompts and answers have been helpful? Detailed examples greatly appreciated! Broader answers also appreciated. Not in theory, but where you've really *done it*, in your work. Thanks!

407

286

2.5K

Yernat Yestekov retweeted

Dan Hollick@DanHollick·3 Nov

Morse code is designed so that you can decode it with this binary tree. I just assumed people memorised every letter. 🤯

197

5.5K

38.9K

Yernat Yestekov retweeted

Buitengebieden@buitengebieden·17 Sep

Run in opposite directions to see who your dog loves more.. 😅 x.com/buitengebieden…

8.3K

93.8K

766.1K

Yernat Yestekov retweeted

Viktor Karpov@vitkarpov·24 Jul

On streams I've been asked several times how to learn to "see" which algorithm to apply to which problem. Decided to put together a cheat sheet 🧵

377

Yernat Yestekov retweeted

Chris Dixon@cdixon·27 Jun

There’s a lot of talk lately about the possibility of a prolonged financial downturn, reminiscent of 2008. 2008 was a difficult time for many people.

696

2.9K

Yernat Yestekov retweeted

Robert Reich@RBReich·24 Jun

Forced birth in a country with:
- No universal healthcare
- No universal childcare
- No paid family & medical leave
- One of the highest rates of maternal mortality among rich nations
This isn't about "life." It's about control.

6.8K

91.8K

322.6K
