Evan Luke

296 posts

@EvanThomasLuke

"Most likely to automate the apocalypse (safely)" - GPT5. AI hacking and alignment. https://t.co/enkfxVTCJF

Joined August 2016
1.3K Following, 168 Followers

Evan Luke @EvanThomasLuke
In just a few months since inception it has quickly grown to include people from top security startups, AI labs, public institutions, VC firms, Fortune 500 and college students. @francisco_oca and I started this in February and we are very excited to see it continue growing!

Evan Luke @EvanThomasLuke
I have updated the Awesome-AI-Hacking-Agents and Awesome-AI-Security-Skills repos. There are now over 100 open source agents and 40+ skill repos! The AI Hacking Discord has grown to over 300 members! (link in repo) github.com/EvanThomasLuke…

Evan Luke @EvanThomasLuke
@deredleritt3r Thank you for clarifying for people. Need more thoughtful analysis in these times for leaders to make informed decisions, not basing their analysis on clickbait journalism.

Evan Luke @EvanThomasLuke
@deredleritt3r Insane clickbait title lol. People seem to think test time scaling with a good harness can only benefit previous generations of models. Matching performance on a benchmark with TTS and comparing that to a model's performance without TTS is not a fair comparison.

prinz @deredleritt3r
Parsing through the WSJ article entitled "China Has Matched Anthropic in Cybersecurity, Resetting AI Race":

1. Contrary to the article's spooky title, it doesn't even attempt to claim that GLM-5.2 has matched Mythos in cybersecurity capabilities. The only substantive claim in the article comparing GLM-5.2 to Mythos is much weaker: "When given further instructions, Opus 4.8 and GLM-5.2 can match Mythos in bug-finding ability, according to researchers."

2. The article also mentions a new "bug-finding tool" called Tulongfeng, released earlier this week by 360 Security Technology (360ST). 360ST says it's "comparable to Mythos in finding bugs". First, even assuming that this is true, Tulongfeng appears to be a multi-agent tool that uses AI model(s) under the hood. Mythos, on the other hand, is a standalone AI model; it does not need any multi-agent set-up or harness to have significant cybersecurity capabilities. I invite the reader to imagine what Mythos 5 would be capable of if placed in a special multi-agent harness designed specifically for cybersecurity operations. Second, unlike with Mythos, there does not appear to be any data substantiating 360ST's claims regarding Tulongfeng other than information provided by 360ST. 360ST's CEO (who, BTW, is a "member of China’s top political advisory body" - imagine what the incentives are there) said that Tulongfeng had found "3,432 vulnerabilities, including 105 confirmed by Chinese authorities". These claims have not been verified independently.

3. Most importantly, these comparisons to Mythos entirely miss the mark regarding Mythos' most important cybersecurity capability. *Finding* vulnerabilities is not the most impressive aspect of Mythos (even Opus 4.6 had a decent record in finding some vulnerabilities). The magic of Mythos is in autonomous exploit development. Quoting Anthropic's red team blog: "Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development. But Mythos Preview is in a different league... Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine... into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more."

You will note that the WSJ article and 360ST both focus on *finding* vulnerabilities - something even Opus 4.6 could achieve once in a while (and would probably be able to do even better if placed in a specialized multi-agent harness). Conversely, there is no mention anywhere of GLM-5.2's or Tulongfeng's abilities to autonomously develop exploits. This is probably for a good reason.

Evan Luke @EvanThomasLuke
@VittoStack Would like an invite; I've been red teaming AI since the GPT-4 era. Currently building a knowledge base of prompt injections and jailbreaks.

Vitto Rivabella @VittoStack
4 days ago we launched Jailbroken, a PRIVATE Discord community to learn AI red teaming and safety.

Since then:
- Over 250 security researchers joined
- Top resources have been collected
- People shared countless techniques and discoveries

Today, we've secured over 100B in FREE AI tokens for all the members. If you want to join, drop a comment.

Evan Luke reposted
The Kobeissi Letter @KobeissiLetter
BREAKING: The Trump Administration has struck a deal with Anthropic which grants the company permission to release its Mythos 5 model to a group of ~100 companies and federal agencies, per CNBC.

Details include:
1. Senior Anthropic staffers flew to Washington DC to meet with members of the Trump Administration
2. Anthropic said earlier this month that it disabled access to its Fable 5 and Mythos 5 models to comply with an export control directive from the government
3. The Trump Administration and Anthropic have been in a two-week-long standoff over its latest models

This deal will have industry-wide implications.

Evan Luke reposted
METR @METR_Evals
OpenAI gave METR early access to GPT-5.6 Sol for testing including raw chain-of-thought, a railfree version of the model, and internal information about the model. With this access, METR conducted a pre-deployment evaluation of GPT-5.6 Sol, including an attempted measurement of its 50%-Time Horizon. However, the measurement depends heavily on our treatment of cheating attempts, and GPT-5.6 Sol’s detected cheating rate was higher than any public model we have evaluated.
OpenAI @OpenAI

Introducing a limited preview of GPT-5.6 Sol, our next generation frontier model, as well as GPT-5.6 Terra, a balanced model for efficient, everyday work, and GPT-5.6 Luna, a fast and affordable model for high-volume work. openai.com/index/previewi…


Evan Luke reposted
Stephanie Palazzolo @steph_palazzolo
New w/ @leomschwartz @amir: The Trump admin has asked OpenAI to stagger the release of GPT-5.6 over security concerns. On Thursday, CEO Sam Altman told staff that the government will be approving access to GPT-5.6 customer by customer, a highly unusual approach.

Evan Luke reposted
Sam Altman @sama
We want to help all companies be secure, working with the USG and the security ecosystem.
* The full version of GPT-5.5-Cyber is here; state of the art performance on CyberGym.
* Patch The Planet and Codex Security will help solve security problems instead of just finding them.

Evan Luke reposted
Alisa Liu @alisawuffles
I'm joining OpenAI next week!🥹 The job search turned out to be really challenging but also super rewarding, so I wrote a small blog to share what I learned along the way and hopefully make the process a little less mysterious for the next person. alisawuffles.github.io/blog/job-search

Evan Luke @EvanThomasLuke
theguardian.com/technology/202… Powerful AI models capable of taking down governments and businesses are mere months away, cyber intelligence agencies for the Five Eyes have warned in a rare joint statement, urging leaders to “act now”.

Ethan Mollick @emollick
Some (early) evidence that managers have the highest success rate in using Claude Code for coding. I have been arguing that management is an AI superpower, as clearly specifying what you want, how to do it & what good looks like is key to using agents. oneusefulthing.org/p/management-a…

Evan Luke reposted
alphaXiv @askalphaxiv
Introducing autoresearch for arXiv papers. Change 'arxiv' to 'autoarxiv' in any paper URL. An agent deploys to resolve setup issues on the codebase, run a minimal reproduction, and estimate full replication cost. Read more below.

Evan Luke reposted
AI Security Institute @AISecurityInst
Two years ago, AISI launched Inspect: an open-source toolkit for evaluating the capabilities and safety of LLMs. Today, we’re releasing the AISI Engineering Playbook - the methods, practices, and infrastructure we've developed while evaluating frontier AI systems. 🧵

Evan Luke reposted
Aaron Levie @levie
The Cursor deal is symbolically quite significant. It was effectively the first mega success in the applied layer of AI. They firmly proved out the value proposition of having a deep domain focus, the role you play as a model router, when to lean into frontier models vs. when to train your own, and the role of applied AI GTM and distribution to make sure you’re actually taking advantage of the market opportunity. Every aspect of their business was tuned to carve out ground and keep doubling down in a highly competitive space. This is really the first at scale template for how to execute this playbook.
Chamath Palihapitiya @chamath

$60 billion. This is the first, but not the last, big exit at the application layer of AI. As product value accrues and accelerates upwards, the focus over the next few years will be firmly on the “control plane”: What gives organizations who want to go all in on AI the governance, control, auditability and business continuity across models and across time that they will need to firmly make the leap. This is the next big phase of AI value creation that the SpaceX/Cursor merger is highlighting.


Evan Luke @EvanThomasLuke
@DanielMiessler The models' capabilities are reaching a point where they can find vulnerabilities quicker than veteran hackers at scale. There must be a tiered access program made by the Govt to give American industry and Govt enough time to use the models before general release.
ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️
Everyone keeps saying this, but they’re not giving a better answer for what they should have done. I guess one option would have been to just not say anything publicly and just release Fable later with controls. But that doesn’t work if you think this level of progress is imminent, and that you need to raise the alarm and do something like Glasswing to get people ready.
Jake Williams @MalwareJake

Anthropic having Fable be export controlled is a self-inflicted wound. If you spend a bunch of time telling people how dangerous your technology is, don't be surprised when some of them agree with you.


Evan Luke @EvanThomasLuke
@ToBScottA @symfony Nice! Test time scaling and prompting have huge effects. Thanks for sharing.

Evan Luke @EvanThomasLuke
@elder_plinius Tiered access will happen for all frontier models, including open source. China is not going to immediately open source Mythos+1-2 level models. It is too much of a liability.