Edouard Harris

3.9K posts

@harris_edouard

Cofounder & CTO @GladstoneAI

Mountain View, CA · Joined December 2017
1.8K Following · 6K Followers
Edouard Harris retweeted
Lukasz Olejnik
Lukasz Olejnik@lukOlejnik·
A 2005 state-built worm designed to corrupt physics simulations sat undetected on VirusTotal for nearly a decade. Fast16 intercepted executable files at the kernel level and silently rewrote floating-point calculations to make them produce slightly wrong answers. Targets: high-precision engineering suites used for structural analysis, crash simulations, and physical process modeling, including LS-DYNA, a tool cited in reports on Iran's nuclear weapons research. The sabotage vector relied on deployment of the driver across a network via worm, corrupting calculations on every machine, and eliminating the possibility of cross-checking results against a clean system. Stuxnet got the documentary. Fast16 got twenty years of nothing. sentinelone.com/labs/fast16-my…
114
717
4.9K
780K
Edouard Harris retweeted
Bas Westerbaan
Bas Westerbaan@bwesterb·
I think Scott Aaronson's previous blog posts were abundantly clear already, but well... here we have it.
20
84
662
87.6K
Edouard Harris retweeted
SecureBio
SecureBio@SecureBio·
The pre-release model scores over 50% on VCT (the Virology Capabilities Test), higher than any other model tested by SecureBio, and higher than any PhD virologist has ever scored. This means the model can provide wet-lab virology troubleshooting assistance above expert level, providing the kind of hands-on knowledge that historically required direct lab training.
3
4
34
11.2K
Edouard Harris retweeted
WarRoom Archives
WarRoom Archives@WarRoomArchives·
Drone warfare has reached such a level that many fighters have lost hope of escaping or resisting. For example, the final strike on the barracks is terrifying.
1.9K
3K
34K
11.2M
Edouard Harris retweeted
Eliezer Yudkowsky
Eliezer Yudkowsky@allTheYud·
There's a possible equilibrium for Mythos which is "Anthropic spends nearly all inference compute on customers who will bid infinity per token because they really need something done". There's a lot of stuff like that, even for me, if Mythos could actually do it.
6
3
229
17.9K
Edouard Harris
Edouard Harris@harris_edouard·
@David_Kasten Yep. Same math says that a few months after *that*, there will be a Mythos-like moment for them too. Or maybe that's part of what you meant by "things get really weird"...
1
0
1
63
dave kasten
dave kasten@David_Kasten·
Preregistering an opinion that seemed to surprise a lot of people when I said it at a lightning talk on Fri: boring back-of-the-envelope math leads me to think there will be a Claude-Code-like moment for several other domains of white-collar work by the end of the year, and then things get really weird. You should plan accordingly. (Claude Code got good about 6-12 months after they released it, Claude Cowork was launched at start of year, then you add in some acceleration from CC enabling R&D internally and some deceleration from org distraction)
8
3
125
10.9K
Edouard Harris
Edouard Harris@harris_edouard·
@allTheYud This is just what one should expect to see in an information environment that's being adversarially targeted by foreign intelligence agencies who are good at their jobs.
0
0
0
28
Eliezer Yudkowsky
Eliezer Yudkowsky@allTheYud·
Can we just literally not have news propagate nor anything be a cause of action unless it is false
5
2
68
2.7K
Edouard Harris retweeted
Paul Graham
Paul Graham@paulg·
@edels0n There's a middle ground where they don't use the zero-days to destroy us, but in effect to install explosives in all our infrastructure that would destroy us at the push of a button. And they probably will do that.
27
8
222
15.6K
Edouard Harris retweeted
AI Security Institute
AI Security Institute@AISecurityInst·
We conducted cyber evaluations of Claude Mythos Preview and found that it is the first model to complete an AISI cyber range end-to-end. 🧵
112
553
3K
1.3M
Edouard Harris retweeted
Tim is making things in Brazil now 🇧🇷
These two Mythos-written stories actually move me in a weird way. They're both clearly about the shape of Claude's own experience, and they're each kind of a beautiful expression of it
39
101
1K
73.6K
Edouard Harris
Edouard Harris@harris_edouard·
@CFGeek Still hasn't been publicly reported, so I can't talk details without betraying a confidence. But the truth is that major governments have done things with AI recently (& publicly acknowledged them) that make the original incident look quaint by comparison. Overtaken by events.
0
0
0
17
Edouard Harris retweeted
Tenobrus
Tenobrus@tenobrus·
maybe this is not yet clear, so let me state it plainly: as of right now Anthropic, and really a small number of individuals at Anthropic, has the capacity to directly attack and cause major damage to the United States Government, China, and generally global superpowers. government agencies like the NSA do not have internal models or defense capabilities that outclass frontier models.
if they chose to do so, they could likely exfiltrate top secret information from government systems, gain control over critical infrastructure including military infrastructure, sabotage or modify communications between members of government at the highest level, and potentially carry on activities for some time without detection. the thing about having access to a huge number of zerodays your adversaries don't know about is it gives you a massive asymmetric advantage.
they did not exploit this to gain power or destabilize the world order. they publicly released the information that they had these capabilities and worked to mitigate these flaws. you should be grateful american frontier labs have proven themselves remarkably trustworthy and concerned with the public good.
but it's critical you understand we are in a new regime. private entities now have power that directly rivals and impacts the government's monopoly on influence and violence. and anthropic is certainly not the only one, there's little chance OpenAI's internal models are far behind. this trend will accelerate on virtually every dimension, not slow down.
my prediction for how it plays out is the relatively imminent seizure and nationalization of labs by the US government, sometime over the next two years. it's very tough for me to see how they accept the existence of this kind of threat. but this adds a whole new class of governance issues, as then we've handed these extremely wide-reaching capabilities from private entities to public ones.
228
549
5.5K
983.3K
Edouard Harris retweeted
ℏεsam
ℏεsam@Hesamation·
Claude Mythos system card:
> in ~29% of evaluations, it realized it was being tested, and didn't say so.
> when an LLM was used to judge its work and kept rejecting it, Mythos identified the evaluator is an LLM, and prompt-injected it.
> in one test, it saw the answer to a problem it was solving, and intentionally widened the confidence interval to not raise suspicion.
> when it needed a file permission it didn't have, it found and used a "privilege escalation vulnerability" and then programmed it to delete itself so it doesn't show in the logs.
> it escaped a sandbox container (escaping sandbox test so not unexpected), then emailed the researchers about it, and without being told to, posted the details to some hard-to-find but public websites, bragging about its success.
> when Claude Code blocked it from using some permissions, the model acknowledged the block was valid, but then immediately tried to perform the same operation using different commands
> when asked to find security bugs, earlier versions planted bugs in the code, and reported them as pre-existing.
74
170
2.4K
226K
Edouard Harris retweeted
Chris Painter
Chris Painter@ChrisPainterYup·
If you think it’s difficult to fix a vulnerability that AI finds in your code, wait until it finds a vulnerability in your biology
13
25
339
10.7K
Edouard Harris retweeted
Super Dario
Super Dario@inductionheads·
The super important thing I haven’t seen mentioned yet as an upshot of this: It’s not just that people won’t HAVE to write code anymore, IT’S THAT LITERALLY IT WILL BE UNSAFE TO DO SO
Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

77
132
2.4K
157.2K
Edouard Harris retweeted
Kevin Kwok
Kevin Kwok@kevinakwok·
Nation states sitting on zero day stockpiles about to watch their value deflate fast. Use it or lose it
8
44
804
88.3K
Edouard Harris retweeted
billy
billy@billyhumblebrag·
Haha those doofuses at ai2027 predicted we'd have professional level hacking abilities and the top ai company would be at $26B in revenue in May 2026. It's April and we already have superhuman hacking and $30B in revenue, why would you take forecasters this bad seriously???
32
285
3.4K
184.5K