Brian Pak

603 posts

@brian_pak

ai + security + alpha CEO @theori_io / @xint_official → building the world's best AI hacker 9x DEF CON CTF winner CMU CS '11 | founded PPP & MMM

Seoul / SF · Joined April 2010
201 Following · 3.1K Followers
Brian Pak @brian_pak
Interestingly, not fuzzing. Xint code-reviews the code, reasons about potential vulnerabilities, and validates the theory, all in static-analysis fashion. It's possible to hook it up with dynamic testing to be even more certain about the validation, but it already does a pretty good job of weeding out false positives.
Brian Pak @brian_pak
I promise the bug is real, tho
Brian Pak @brian_pak
And yes, RHEL 14.3 doesn't exist 😅 We meant to say RHEL 10.1. Sorry for the confusion! And also yes, the static webpage copy.fail -- even the logo -- is vibe-coded. Too busy triaging a shit ton of other bugs to build a legit website from the ground up... and I think it's a perfect use case for vibe-coding tbh 😆
Brian Pak @brian_pak
@msolnik oops. should be public now! sorry about that.
Brian Pak @brian_pak
Surfaced by Xint Code, our AI vuln research platform, pointed at the kernel's crypto/ for about an hour, on a starting hunch from @5unKn0wn. Came back with CopyFail (plus others, still in coordinated disclosure).
Write-up + PoC (exploit): copy.fail
Xint Code: code.xint.io
Brian Pak retweeted
Xint @xint_official
Anthropic is (rightfully) generating a lot of attention for Mythos's ability to find 0-days, BUT the hard problem is not whether an LLM can recognize a bug when pointed at it; it is whether a system can find the right code to examine across a 9-million-line codebase, distinguish the one real vulnerability from the hundreds of theoretical weaknesses the model will flag along the way, and deliver output a developer can act on without wasting a week on false positives. This is something Xint has been doing since our wins at AIxCC and #ZeroDayCloud last year.

We wanted to see if using publicly available models with the right scaffolding would reach the same performance as the latest limited-release frontier model under **real-world conditions**. In this research paper, not only did we find all the same bugs highlighted in Anthropic's report, but we found an additional 12 mid- to high-severity vulnerabilities not included in their public disclosures.

Check out the full report here: go.xint.io/xint-mythos-ap…
Brian Pak @brian_pak
90-day disclosure policy isn't gonna cut it. Be ready.
Brian Pak retweeted
Xint @xint_official
🚨🚨 A critical CPython CVE today took less than 45 minutes of human work to find, triage, and fix because of Xint Code:
🚄 Xint Code found it in a Fast scan on the repo with no prompting
💥 A coding assistant reproduced it on the first try
🛠️ Maintainers pushed a fix 30 minutes after the report.
theori.io/blog/finding-a…
Brian Pak retweeted
Tim Becker @tjbecker
Evaluating models on cybersecurity tasks is *really* hard -- probably the *hardest* part of building these tools. I want to correct a few misconceptions from this post.

> The results show something close to inverse scaling: small, cheap models outperform large frontier ones

Yes, because this only tested for true positives! This completely ignores the unbearably high false-positive rate you get from small, open models. Small models are incredibly sloppy thinkers that are easily biased to give "desired" outcomes. You can give them almost any nontrivial code snippet and they will "find vulnerabilities". If you ran this system across the entire codebase, it would be impossible to separate the real bugs from the slop.

Truly impressive models (and scaffolds) strike a balance of finding the subtle bugs without too much noise. For now, large closed-weight models with scaffolds for extensive validation dominate.
Stanislav Fort @stanislavfort

New post: We tested the Mythos showcase vulnerabilities with open models. They recovered similar scoped analysis! 8/8 models found the flagship FreeBSD zero-day, including a 3B model. Rankings reshuffle completely across tasks => the AI cybersecurity frontier is super jagged!

Brian Pak @brian_pak
@HaifeiLi We run both. Frontier models are still way ahead for the hard stuff 😭 but we benchmark every new open model release. Maybe, one day.
Brian Pak retweeted
Xint @xint_official
Fun fact: We actually discovered this issue accidentally when our system reported finding a new bug in one of our old benchmarks. We were surprised to find out it was actually a #0day in NGINX!
Additional coverage from @gbhackers_news: gbhackers.com/f5-nginx-plus-…
Brian Pak @brian_pak
Naturally, the first thing we did was run it through Xint Code. Unsurprisingly, the vibe-coded app has quite a few vulnerabilities, surfaced within minutes, including vuln101-level bugs (e.g. `.includes()` instead of `.startsWith()`). I guess @AnthropicAI wasn't kidding when they said "90% of the code written at Anthropic is written by Claude."

What I'm really curious about is where Anthropic draws the security boundary. Claude Code asks whether you trust the workspace at the very start, and you basically can't use the tool unless you consent. From that point on, all responsibility shifts to the user. Consent once, and running Claude on a directory becomes a 0-click RCE vector in multiple ways. So maybe these aren't considered security vulnerabilities as far as they're concerned…?
Chaofan Shou @Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip

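The `.includes()`-instead-of-`.startsWith()` bug class mentioned above can be sketched in a few lines. This is a hypothetical illustration, not the actual leaked code: the function names and the "trusted" origin are made up. The point is that a substring check passes for any URL that merely *contains* the trusted origin somewhere, which an attacker fully controls.

```javascript
// Hypothetical illustration of the ".includes() vs .startsWith()" bug class.
// Function names and the trusted origin are invented for this example.

// Vulnerable: passes if the trusted origin appears ANYWHERE in the URL.
function isTrustedVulnerable(url) {
  return url.includes("https://api.example.com");
}

// Safer: the trusted origin (with trailing slash) must be the URL's prefix.
function isTrustedFixed(url) {
  return url.startsWith("https://api.example.com/");
}

// Attacker-controlled URL that merely embeds the trusted origin:
const attack = "https://evil.example/?next=https://api.example.com";

console.log(isTrustedVulnerable(attack)); // true  -- check bypassed
console.log(isTrustedFixed(attack));      // false -- bypass blocked
console.log(isTrustedFixed("https://api.example.com/v1/models")); // true
```

Even `.startsWith()` needs the trailing slash (otherwise `https://api.example.com.evil.example` would pass); the more robust pattern is to parse with `new URL(url)` and compare `.origin` for exact equality.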