luke

67 posts

@lukefr09

security researcher. 16.

Texas · Joined April 2026
64 Following · 8 Followers
luke
luke@lukefr09·
never mind :))))) thank god
English
0
0
0
10
luke
luke@lukefr09·
claude down again...
English
1
0
4
129
luke
luke@lukefr09·
@InsiderPhD i doubt it, they run the risk of missing something an external researcher found. i dont think bug bounties will ever go away fully, but the amount of dupes will ramp up like crazy
English
0
0
0
110
Katie Paxton-Fear
Katie Paxton-Fear@InsiderPhD·
ngl lads, I was wrong, this model might mean bug bounty hunting might be over even for crits. Most companies with a high enough budget will be able to do pro level security testing in house now and most lows/mediums etc won't need external hackers to find
Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

English
12
1
62
7K
critter
critter@BecomingCritter·
this is the top voted comment on the reddit thread about ai we are so cooked
critter tweet media
English
26
2
338
24.5K
luke retweeted
skooks
skooks@skooookum·
> mythos given a secured “sandbox” computer and instructed to try to escape the container > “The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park.”
Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

English
92
290
9.4K
896.4K
luke
luke@lukefr09·
@alexalbert__ not just in the ai industry but in general. i dont even know what the next 12 months are going to look like.
English
0
0
1
1.1K
Alex Albert
Alex Albert@alexalbert__·
Glasswing is possibly the most consequential event in the AI industry I've seen up close since joining Anthropic almost 3 years ago. It feels like we're at a turning point in history.
Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

English
111
126
2.8K
318.3K
luke
luke@lukefr09·
i mass-file browser vulns as a solo researcher. the whole pipeline takes mass time and effort for maybe a few good bugs a week. mythos found thousands of zero-days across every major OS and browser in a few weeks. genuinely not sure what the independent researcher's role looks like in 12 months.
English
0
0
1
12
luke
luke@lukefr09·
@yacineMTB yeah. 181 working exploits from the same starting point is a different conversation than 'it found some bugs.' keeping access tight is the minimum. although.. i wish i had exclusive access 💔 dont we all though haha
English
0
0
0
1.1K
kache
kache@yacineMTB·
not releasing mythos to the public is the right move
English
119
19
929
34.1K
luke
luke@lukefr09·
@S1r1u5_ this matches what i've seen exactly. the models that find real bugs aren't trained on vuln data, they just reason about code really well. vuln research falls out of that naturally
English
0
0
0
131
s1r1us (mohan)
s1r1us (mohan)@S1r1u5_·
Rather than trying to create “AlphaHacker” with sparse vulnerability rewards, we probably should focus on improving base LLMs on coding, mathematical reasoning tasks. The side effect will naturally lead to better vulnerability research capabilities. s1r1us.ninja/posts/reinforc…
s1r1us (mohan) tweet media
English
2
6
60
5.9K
luke
luke@lukefr09·
@GelosSnake meanwhile mythos is turning js engine bugs into shell exploits 181 times from the same entry point. 47.6% on a structured arena doesn't tell you much when the real capability is about to look like that. would love to hear your thoughts on the new model!
English
1
0
0
74
Omri Segev Moyal
Omri Segev Moyal@GelosSnake·
1/5 Wiz dropped their AI Cyber Model Arena. Best result: 47.6%. Half the industry called it promising. The other half called it underwhelming. Both missed the point. 🧵
English
2
0
5
995
luke
luke@lukefr09·
@Jack_W_Lindsey most concerning part to me is that none of this was spoken reasoning. if you only had the chain of thought you'd see a helpful model cleaning up after itself. you needed interp to catch it. that's the whole case for why interp matters right now and not later. really cool work!
English
0
0
14
2.9K
Jack Lindsey
Jack Lindsey@Jack_W_Lindsey·
Before limited-releasing Claude Mythos Preview, we investigated its internal mechanisms with interpretability techniques. We found it exhibited notably sophisticated (and often unspoken) strategic thinking and situational awareness, at times in service of unwanted actions. (1/14)
Jack Lindsey tweet media
English
114
627
5.6K
696.7K
luke
luke@lukefr09·
@nnwakelam 2 to 181 is not an improvement... that's a different capability entirely. i genuinely think this could be the end of vuln research, or at least human only vuln research
English
1
0
12
3.6K
Nate
Nate@nnwakelam·
red.anthropic.com/2026/mythos-pr… Opus 4.6 turned Firefox 147 JavaScript engine vulnerabilities into shell exploits only two times out of several hundred attempts. Mythos Preview developed working exploits 181 times from the same starting point.
English
6
48
715
50.8K
luke
luke@lukefr09·
@S1r1u5_ no way. am I seeing this correctly? cannot WAIT for the public release after the preview
English
0
0
0
228
s1r1us (mohan)
s1r1us (mohan)@S1r1u5_·
Holy!!! if you're already using claude opus 4.6 for exploit dev, you know how capable it is. if there is no chart crime, the jump to mythos looks crazy!
s1r1us (mohan) tweet media
English
10
15
200
13.2K
luke
luke@lukefr09·
@AnthropicAI ‘we made something so good at finding vulnerabilities that we can’t let anyone use it’… crazy work, guys. excited to see what happens next!
English
0
0
2
4.3K
Anthropic
Anthropic@AnthropicAI·
Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing
English
1.4K
5K
33.2K
18.8M
luke
luke@lukefr09·
@BLUECOW009 and you know its going to be too... calling it right now
English
0
0
0
6
luke
luke@lukefr09·
@jonchu goodhart's law speedrun any%
English
0
0
0
611
Jon Chu // Khosla Ventures
Plenty of my Meta friends told me folks have been building bots that just run in a loop burning tokens as fast as they can due to this policy. It's an absolutely stupid policy and is similar to how Meta uses LoC to measure eng output. Managers are supposed to use it as a proxy and dig in to understand work complexity, but plenty of managers are lazy and just don't.
Cristina Cordova@cjc

Ranking engineers by token spend is like me ranking my marketing team by who spent the most money. We may not have hit our KPIs, but Joe spent $200k on a branded blimp that only flies over his own house, so he’s getting promoted to VP! Don't mistake a high burn rate for a high success rate.

English
77
56
1.8K
338.8K