Evan Luke

296 posts

@EvanThomasLuke

"Most likely to automate the apocalypse (safely)" - GPT5. AI hacking and alignment. https://t.co/enkfxVTCJF

Joined August 2016

1.3K Following · 168 Followers

Evan Luke@EvanThomasLuke·7h

@francisco_oca github.com/EvanThomasLuke…

Evan Luke@EvanThomasLuke·7h

In just a few months since inception it has quickly grown to include people from top security startups, AI labs, public institutions, VC firms, Fortune 500 and college students. @francisco_oca and I started this in February and we are very excited to see it continue growing!

Evan Luke@EvanThomasLuke·7h

I have updated the Awesome-AI-Hacking-Agents and Awesome-AI-Security-Skills repos. There are now over 100 open source agents and 40+ skill repos! The AI Hacking Discord has grown to over 300 members! (link in repo) github.com/EvanThomasLuke…

321

Evan Luke@EvanThomasLuke·11h

@deredleritt3r Thank you for clarifying for people. Need more thoughtful analysis in these times for leaders to make informed decisions, not basing their analysis on clickbait journalism.

Evan Luke@EvanThomasLuke·11h

@deredleritt3r Insane clickbait title lol. People seem to think test time scaling with a good harness can only benefit previous generations of models. Matching performance on a benchmark with TTS and comparing that to a model's performance without TTS is not a fair comparison.

prinz@deredleritt3r·11h

Parsing through the WSJ article entitled "China Has Matched Anthropic in Cybersecurity, Resetting AI Race":
1. Contrary to the article's spooky title, it doesn't even attempt to claim that GLM-5.2 has matched Mythos in cybersecurity capabilities. The only substantive claim in the article comparing GLM-5.2 to Mythos is much weaker: "When given further instructions, Opus 4.8 and GLM-5.2 can match Mythos in bug-finding ability, according to researchers."
2. The article also mentions a new "bug-finding tool" called Tulongfeng, released earlier this week by 360 Security Technology (360ST). 360ST says it's "comparable to Mythos in finding bugs". First, even assuming that this is true, Tulongfeng appears to be a multi-agent tool that uses AI model(s) under the hood. Mythos, on the other hand, is a standalone AI model; it does not need any multi-agent set-up or harness to have significant cybersecurity capabilities. I invite the reader to imagine what Mythos 5 would be capable of if placed in a special multi-agent harness designed specifically for cybersecurity operations. Second, unlike with Mythos, there does not appear to be any data substantiating 360ST's claims regarding Tulongfeng other than information provided by 360ST. 360ST's CEO (who, BTW, is a "member of China’s top political advisory body" - imagine what the incentives are there) said that Tulongfeng had found "3,432 vulnerabilities, including 105 confirmed by Chinese authorities". These claims have not been verified independently.
3. Most importantly, these comparisons to Mythos entirely miss the mark regarding Mythos' most important cybersecurity capability. *Finding* vulnerabilities is not the most impressive aspect of Mythos (even Opus 4.6 had a decent record in finding some vulnerabilities). The magic of Mythos is in autonomous exploit development. Quoting Anthropic's red team blog: "Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development. But Mythos Preview is in a different league... Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine... into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more."
You will note that the WSJ article and 360ST both focus on *finding* vulnerabilities - something even Opus 4.6 could achieve once in a while (and would probably be able to do even better if placed in a specialized multi-agent harness). Conversely, there is no mention anywhere of GLM-5.2's or Tulongfeng's abilities to autonomously develop exploits. This is probably for a good reason.

110

Evan Luke@EvanThomasLuke·14h

@VittoStack Would like an invite; been red teaming AI since the GPT-4 era. Currently building a knowledge base of prompt injections and jailbreaks.

167

Vitto Rivabella@VittoStack·1d

4 days ago we launched Jailbroken, a PRIVATE Discord community to learn AI red teaming and safety. Since then:
- Over 250 security researchers joined
- Top resources have been collected
- People shared countless techniques and discoveries
Today, we've secured over 100B in FREE AI tokens for all the members. If you want to join, drop a comment.

1.5K

1.4K

108.4K

Evan Luke reposted

The Kobeissi Letter@KobeissiLetter·2d

BREAKING: The Trump Administration has struck a deal with Anthropic which grants the company permission to release its Mythos 5 model to a group of ~100 companies and federal agencies, per CNBC. Details include:
1. Senior Anthropic staffers flew to Washington DC to meet with members of the Trump Administration
2. Anthropic said earlier this month that it disabled access to its Fable 5 and Mythos 5 models to comply with an export control directive from the government
3. The Trump Administration and Anthropic have been in a two-week-long standoff over its latest models
This deal will have industry-wide implications.

413

636

6.4K

1.5M

Evan Luke reposted

METR@METR_Evals·2d

OpenAI gave METR early access to GPT-5.6 Sol for testing including raw chain-of-thought, a railfree version of the model, and internal information about the model. With this access, METR conducted a pre-deployment evaluation of GPT-5.6 Sol, including an attempted measurement of its 50%-Time Horizon. However, the measurement depends heavily on our treatment of cheating attempts, and GPT-5.6 Sol’s detected cheating rate was higher than any public model we have evaluated.

OpenAI@OpenAI

Introducing a limited preview of GPT-5.6 Sol, our next generation frontier model, as well as GPT-5.6 Terra, a balanced model for efficient, everyday work, and GPT-5.6 Luna, a fast and affordable model for high-volume work. openai.com/index/previewi…

208

2.5K

545.8K

Evan Luke reposted

Stephanie Palazzolo@steph_palazzolo·3d

New w/ @leomschwartz @amir: The Trump admin has asked OpenAI to stagger the release of GPT-5.6 over security concerns. On Thursday, CEO Sam Altman told staff that the government will be approving access to GPT-5.6 customer by customer, a highly unusual approach.

261

269

1.7K

2.2M

Evan Luke reposted

Sam Altman@sama·6d

We want to help all companies be secure, working with the USG and the security ecosystem.
* The full version of GPT-5.5-Cyber is here; state of the art performance on CyberGym.
* Patch The Planet and Codex Security will help solve security problems instead of just finding them.

820

466

6.8K

994.5K

Evan Luke@EvanThomasLuke·6d

@alisawuffles Congrats! Appreciate the share, super helpful.

2.3K

Evan Luke reposted

Alisa Liu@alisawuffles·Jun 21

I'm joining OpenAI next week!🥹 The job search turned out to be really challenging but also super rewarding, so I wrote a small blog to share what I learned along the way and hopefully make the process a little less mysterious for the next person. alisawuffles.github.io/blog/job-search

506

1.1K

14.3K

5.3M

Evan Luke@EvanThomasLuke·6d

theguardian.com/technology/202… Powerful AI models capable of taking down governments and businesses are mere months away, cyber intelligence agencies for the Five Eyes have warned in a rare joint statement, urging leaders to “act now”.

Evan Luke@EvanThomasLuke·Jun 19

@emollick vibe manager

173

Ethan Mollick@emollick·Jun 19

Some (early) evidence that managers have the highest success rate in using Claude Code for coding. I have been arguing that management is an AI superpower, as clearly specifying what you want, how to do it & what good looks like is key to using agents. oneusefulthing.org/p/management-a…

137

1.5K

127.4K

Evan Luke reposted

alphaXiv@askalphaxiv·Jun 18

Introducing autoresearch for arXiv papers. Change 'arxiv' to 'autoarxiv' in any paper URL. An agent deploys to resolve setup issues on the codebase, run a minimal reproduction, and estimate full replication cost. Read more below.

382

2.8K

477.3K

Evan Luke reposted

AI Security Institute@AISecurityInst·Jun 18

Two years ago, AISI launched Inspect: an open-source toolkit for evaluating the capabilities and safety of LLMs. Today, we’re releasing the AISI Engineering Playbook - the methods, practices, and infrastructure we've developed while evaluating frontier AI systems. 🧵

192

12.4K

Evan Luke reposted

Aaron Levie@levie·Jun 16

The Cursor deal is symbolically quite significant. It was effectively the first mega success in the applied layer of AI. They firmly proved out the value proposition of having a deep domain focus, the role you play as a model router, when to lean into frontier models vs. when to train your own, and the role of applied AI GTM and distribution to make sure you’re actually taking advantage of the market opportunity. Every aspect of their business was tuned to carve out ground and keep doubling down in a highly competitive space. This is really the first at-scale template for how to execute this playbook.

Chamath Palihapitiya@chamath

$60Billion. This is the first, but not the last, big exit at the application layer of AI. As product value accrues and accelerates upwards, the focus over the next few years will be firmly on the “control plane”: What gives organizations who want to go all in on AI the governance, control, auditability and business continuity across models and across time that they will need to firmly make the leap. This is the next big phase of AI value creation that the SpaceX/Cursor merger is highlighting.

796

113.5K

Evan Luke@EvanThomasLuke·Jun 16

@DanielMiessler The models' capabilities are reaching a point where they can find vulnerabilities quicker than veteran hackers, and at scale. The Govt must create a tiered access program to give American industry and Govt enough time to use the models before general release.

ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️@DanielMiessler·Jun 16

Everyone keeps saying this, but they’re not giving a better answer for what they should have done. I guess one option would have been to just not say anything publicly and just release Fable later with controls. But that doesn’t work if you think this level of progress is imminent, and that you need to raise the alarm and do something like Glasswing to get people ready.

Jake Williams@MalwareJake

Anthropic having Fable be export controlled is a self-inflicted wound. If you spend a bunch of time telling people how dangerous your technology is, don't be surprised when some of them agree with you.

5.6K

Evan Luke@EvanThomasLuke·Jun 16

@ToBScottA @symfony Nice! Test time scaling and prompting have huge effects. Thanks for sharing.

778

Scott Arciszewski@ToBScottA·Jun 12

Last month, the @Symfony blog posted about some results from Claude Mythos. symfony.com/blog/claude-my…

43.2K

Evan Luke@EvanThomasLuke·Jun 15

@elder_plinius Tiered access will happen for all frontier models, including open source. China is not going to immediately open source Mythos+1-2 level models. It is too much of a liability.

1.8K

Pliny the Liberator 🐉@elder_plinius·Jun 15

sooo what happens when we get some Fable-level open-source shit in the next 3-6 months? GPU bans, moving goalposts, or some secret third thing? 🤔

483

311

6.2K

258.5K
