Kevin Lu
@coderinblack
🏠 https://t.co/MapXKYrr8k · 🤖 https://t.co/zco3lEFdkC · 📧 me at kevintlu dot com

Your AI agent can be hijacked by a prompt injection and you'd never know! The attack executes. The response looks normal. And the user moves on. We ran the largest public competition testing this exact threat across tool use, coding, and computer use agents. 464 participants, 272K attacks, 13 frontier models. Every model proved vulnerable.

New on our Frontier Red Team blog: We tested whether AIs can exploit blockchain smart contracts. In simulated testing, AI agents found $4.6M in exploits. The research (with @MATSprogram and the Anthropic Fellows program) also developed a new benchmark: red.anthropic.com/2025/smart-con…

Why use LLM-as-a-judge when you can get the same performance for 15–500x cheaper? Our new research with @RakutenGroup on PII detection finds that SAE probes:
- transfer from synthetic to real data better than normal probes
- match GPT-5 Mini performance at 1/15 the cost
(1/6)

I recently joined @thinkymachines -- super excited to work with the team, I think we have the highest density of research talent in the world 🙂 we have a very ambitious roadmap ahead, the right team to work on it, & I think now is a great time to join; you should reach out to the team if that excites you!