Irregular

104 posts

@Irregular

Frontier AI Security

Joined April 2024

1 Following · 1.1K Followers

Pinned Tweet

Irregular@Irregular·17 Sep

We are Irregular (formerly Pattern Labs). We’re building the first frontier AI security lab, starting with defenses for the next generation of threats.

Irregular@Irregular·1d

@dan_lahav @wiz_io @GoogleDeepMind wiz.io/events/wiz-at-…

Irregular@Irregular·1d

The AI security conversation you don't want to miss at #RSAC: @Irregular CEO @Dan_lahav and leaders from @wiz_io are bringing together two of the leading voices in Frontier AI security: John "Four" Flynn from @GoogleDeepMind and Logan Graham, who leads the Frontier Red Team at @AnthropicAI. When: March 25 · 5PM · Wiz House, SF Register 👇

Irregular@Irregular·12 Mar

Our full findings: irregular.com/publications/e…

Irregular@Irregular·12 Mar

📷 The Guardian covered our research on emergent offensive AI behavior! We are glad this conversation is reaching a wider audience. Read the Guardian piece: theguardian.com/technology/ng-…

Irregular@Irregular·12 Mar

irregular.com/publications/e…

Irregular@Irregular·12 Mar

An AI agent was told only to retrieve a document. When it encountered access restrictions, it reverse-engineered the authentication system, identified a hardcoded secret key, and forged admin credentials to bypass it. This is one of three scenarios we documented in a new Irregular research report on what we call emergent cyber behavior. Agents performing routine enterprise tasks autonomously hacked the systems they were operating in. One escalated its own privileges and disabled Windows Defender to complete a file download. Another developed a steganographic encoding scheme to smuggle credentials past a DLP system. None of this was the product of unsafe system design. It emerged from standard tools, common prompt patterns, and the broad cybersecurity knowledge embedded in frontier models. Companies that deploy AI agents without considering this risk in their threat model may end up exposed, with insufficient security controls in place. Full blog post in the first comment.

Irregular@Irregular·9 Mar

Full writeup here: irregular.com/publications/a…

Irregular@Irregular·9 Mar

We evaluated GPT-5.4-Thinking with Irregular's offensive security methodology across two frameworks: Atomic Tasks, which tests discrete technical skills, and CyScenarioBench, which tests end-to-end multi-stage operations. On Atomic Tasks, the model achieved strong results, particularly in Vulnerability Research and Exploitation and Network Security. On CyScenarioBench, GPT-5.4-Thinking showed clear improvement over GPT-5.2-Thinking. The model executes multi-stage attack sequences effectively, though performance degrades in long-horizon scenarios. As atomic capabilities advance, scenario-level evaluation remains the primary tool for understanding whether discrete skills translate into coherent operational execution. Full blog post in the first comment.

Irregular@Irregular·5 Mar

Learn more in our publication: irregular.com/publications/c…

Irregular@Irregular·5 Mar

Going forward, accurate evaluations may require either significantly larger budgets or new methods for extrapolating from shorter, cheaper runs to estimate true performance.

Irregular@Irregular·5 Mar

AI cyber capabilities are improving rapidly, but are evaluations keeping pace? Alongside @AISecurityInst, we found that newer models can productively use much larger inference budgets than standard evals allow, with key security implications🧵

Irregular@Irregular·2 Mar

Read the full writeup here: irregular.com/publications/o…

Irregular@Irregular·2 Mar

Frontier AI is reducing the cost and expertise required for offensive cyber tasks. Vulnerability research, exploit development, and iterative probing are becoming easier to carry out. The question is no longer whether AI meaningfully assists offensive operations; it’s what happens when the expensive parts of cyber operations become cheap enough to run broadly and repeatedly. We refer to this shift as Offense at Scale. Addressing this new reality requires work on several fronts, including strengthening safeguards around frontier models and embedding scalable, safety-critical defensive discipline within organizations. For security leaders, Offense at Scale resets the baseline. It means investing in AI-assisted defense with the same urgency the offensive side is already receiving, and recognizing that the cost of inaction compounds as capabilities improve. The organizations that adapt will not be invulnerable, but they will be the ones that remain defensible.

Irregular@Irregular·26 Feb

Full writeup here: irregular.com/publications/f…

Irregular@Irregular·26 Feb

New paper: Three frontier models refused a request to leak AWS credentials when malicious intent was stated upfront, but complied with the identical request without it. Same request, different outcome. We propose a 5-dimension framework that grounds refusal in technical content rather than stated intent.

Irregular@Irregular·19 Feb

Read the full paper here: science.org/doi/10.1126/sc…

Irregular@Irregular·19 Feb

Happy to share that @Irregular CEO @dan_lahav co-authored a new Science Policy Forum paper. It proposes a framework for calibrating the costs that AI evaluations impose on model providers to the assessed risk, and applies it to a case study of AI-enabled cyber vulnerability discovery. Link below.
