Alexander Mackie

1K posts

@ZanderMackie

Security research and other such things. Detection engineer @datadogHQ

Joined October 2022
2K Following · 194 Followers
Pinned Tweet
Alexander Mackie
Alexander Mackie@ZanderMackie·
@IceSolst Any sufficiently advanced LLM tool is indistinguishable from malware.
English
5
3
37
12.7K
Alexander Mackie retweeted
pdawg
pdawg@prathamgrv·
I made a Claude Code skill that turns any arxiv paper into working code. Every line traces back to the paper section it came from & any implementation detail the paper skips will be flagged, and not assumed. open sourcing it - github.com/PrathamLearnsT…
English
50
269
2.5K
145.5K
Alexander Mackie retweeted
Danielle Fong 🔆
Danielle Fong 🔆@DanielleFong·
we live in a reality of agent swarms, so if software has this class of bug deep down we should set up to just migrate every class of heap overflow bug until we solve it, among others
Dan McAteer@daniel_mac8

Nicholas Carlini, Anthropic Researcher, reports that Claude Code found multiple remotely exploitable security vulnerabilities in the Linux kernel. Includes a bug that evaded human software engineers for an astonishing 23 (!) years.

English
4
3
21
2.4K
Alexander Mackie retweeted
William Fedus
William Fedus@LiamFedus·
RL against verifiable rewards in LLMs has clearly opened a very powerful regime. It works, and because it works, there is a strong tendency to view more and more problems through that lens. You optimize for tasks where the reward is clean, where success is easy to check, where the feedback loop closes quickly. This is productive and will keep paying off. But it also creates a bias: you start emphasizing what is legible to the training setup, not necessarily what is most valuable. Scientific reasoning is a good example. Not every step in science is something that can be cleanly graded at the moment it is produced. A hypothesis can later fail experimentally and still have been exactly the right kind of thinking at the time: creative, mechanistically grounded, and responsive to the available evidence. “Turns out to be wrong” does not imply “was low-quality thinking”. A big part of the next frontier will be AI systems that can operate well under this kind of uncertainty, just like a big part of the last one was RL against verifiable rewards.
English
34
60
690
61.3K
Alexander Mackie retweeted
Ryan Naraine
Ryan Naraine@ryanaraine·
"I remind you that this present you're so concerned about losing, you hated it in the first place." @juanandres_gs on why security practitioners should stop clinging to the broken thing and start imagining what the fixed thing looks like. New episode is live 👇 open.spotify.com/episode/7K67rf…
English
3
13
32
3.6K
Alexander Mackie retweeted
Brendan Dolan-Gavitt
One of the really interesting things about this is that they think their dataset is *already* saturated if you can give the models a realistic amount of compute/tokens. 2M tokens is a tiny amount to spend on an exploit!
Ethan Mollick@emollick

Here’s an independent domain extension of METR’s famous time-horizon analysis, applying it to offensive cybersecurity with real human expert timing data. Similar to METR: 5.7 months doubling time. Frontier models now succeed 50% of the time at tasks that take human experts 10.5h.

English
1
5
31
4.3K
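To make the quoted doubling-time figures concrete, here is a small arithmetic sketch. It assumes a pure exponential trend that keeps holding, which the quoted tweet does not claim; the function name and its parameters are hypothetical, and only the 10.5h horizon and 5.7-month doubling time come from the tweet.

```python
def projected_horizon_hours(
    months_ahead: float,
    current_hours: float = 10.5,    # 50%-success horizon quoted in the tweet
    doubling_months: float = 5.7,   # doubling time quoted in the tweet
) -> float:
    """Extrapolate the 50%-success task horizon, ASSUMING the exponential trend continues."""
    return current_hours * 2 ** (months_ahead / doubling_months)


if __name__ == "__main__":
    # One and two doubling periods ahead, purely as an illustration.
    for months in (0.0, 5.7, 11.4):
        print(f"{months:>5.1f} months ahead: {projected_horizon_hours(months):.1f}h")
```

Under that assumption the horizon doubles every 5.7 months; whether the trend actually continues is an open question the numbers alone cannot settle.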
Alexander Mackie
Alexander Mackie@ZanderMackie·
This is great! Here’s my recipe:
- take a domain or attack surface where there is a deterministic test of whether a security property has been violated
- find existing research
- distill vulnerability patterns
- use an LLM to build a massive bucket of candidates where these patterns may exist
- use an LLM to build a filtration system to orchestrate deterministic tests (i.e. exploits) of said candidates against said attack surface
- use LLMs and human judgement to build increasingly improved validations of success / failure of exploitation
- repeat and improve the system
English
0
0
0
53
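The core of the recipe above, a deterministic test applied to a bucket of candidates, can be sketched in a few lines. This is a toy illustration, not the author's tooling: the buffer-copy domain, `CAPACITY`, `Candidate`, `violates_property`, and `triage` are all hypothetical names, and the LLM-driven steps (candidate generation, filter construction, validation refinement) are replaced by a hand-written candidate list.

```python
from dataclasses import dataclass
from typing import Callable, List

CAPACITY = 8  # toy fixed buffer size (assumption for this sketch)


@dataclass(frozen=True)
class Candidate:
    name: str
    # Takes (data, capacity) and returns the bytes it would write into the buffer.
    copy_fn: Callable[[bytes, int], bytes]


def unchecked_copy(data: bytes, capacity: int) -> bytes:
    return data  # ignores capacity: the vulnerable pattern


def checked_copy(data: bytes, capacity: int) -> bytes:
    return data[:capacity]  # truncates to capacity: the safe pattern


def violates_property(candidate: Candidate, probe: bytes = b"A" * 64) -> bool:
    """Deterministic test: did the candidate write more than CAPACITY bytes?"""
    return len(candidate.copy_fn(probe, CAPACITY)) > CAPACITY


def triage(candidates: List[Candidate]) -> List[str]:
    """Filtration step: keep only candidates the deterministic test confirms."""
    return [c.name for c in candidates if violates_property(c)]


# In the recipe this bucket would be produced by an LLM; here it is hand-written.
candidates = [
    Candidate("unchecked_copy", unchecked_copy),
    Candidate("checked_copy", checked_copy),
]
confirmed = triage(candidates)
print(confirmed)
```

The point of the design is that the oracle, not the model, decides what counts as a finding, so a large and noisy candidate bucket costs compute rather than trust.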
Alexander Mackie retweeted
Simon Willison
Simon Willison@simonw·
Started a new tag on my blog to track stories about AI-powered security research, which is very much having a moment right now - 11 posts so far already simonwillison.net/tags/ai-securi…
English
26
36
269
21K
Alexander Mackie retweeted
Ryan Naraine
Ryan Naraine@ryanaraine·
If AI finds the zero-day, writes the exploit, and patches the code, who trains the next generation of security researchers? Chris St. Myers' "Cognitive Rust Belt" essay kicked off a debate we couldn't stop having. Apple Podcasts podcasts.apple.com/us/podcast/thr…
English
5
12
37
3.1K
Alexander Mackie retweeted
s1r1us (mohan)
s1r1us (mohan)@S1r1u5_·
this year's pwn2own isn't just interesting because there will be lots of entries with AI+human. it is also interesting because a) anthropic burned a ton of tokens on firefox, basically running claude in a loop until it found something for a month, probably exhausting whatever claude can one shot. b) if someone submits a full chain without much use of ai, it tells you one shotting plateaus and these models are a bit more like fuzzers than seasoned security researchers. c) even if they used an llm to find the bug, this tells us scaffolding/harness design, prompting, and the operator matter a lot.
English
7
19
214
16.3K
Alexander Mackie retweeted
Anjney Midha
Anjney Midha@AnjneyMidha·
Stanford @CS153Systems, Week 1 (Full Lecture)
AI Scaling, Bottlenecks, and Why Compute Isn't a Commodity Yet
00:00 Compute Coachella
00:29 Simple Life Heuristic
01:08 Uncertainty Creates Opportunity
01:42 Four Bottlenecks Framework
01:51 Empirical Proof Matters
02:05 Cloud Costs Are Shifting
02:15 Verifiable vs Fuzzy Progress
02:48 Scaling Predictability Explained
03:43 CapEx Explosion in Big Tech
04:06 Chips Aren’t Commodities
04:45 Compute Scarcity Conclusion
English
50
240
2.1K
272.4K
Alexander Mackie retweeted
Dr. Anton Chuvakin
Dr. Anton Chuvakin@anton_chuvakin·
"RSA 2026: Agentic Future, Analog Fundamentals — The Paradox of Why the Old Guard Still Survives" bit.ly/47CLA1u <- the classic! My annual #RSAC #RSA2026 reflection post!
English
2
5
18
2.9K
Alexander Mackie retweeted
Florian Roth ⚡️
Florian Roth ⚡️@cyb3rops·
Gemma 4 outperforms all other open source models in my cyber security related benchmark set
Arena.ai@arena

Gemma 4 by @GoogleDeepMind debuts at 3rd and 6th on the open source leaderboard, making it the #1 ranked US open source model. By total parameter count, Gemma 4 31B is 24× smaller than GLM-5 and 34× smaller than Kimi-K2.5-Thinking, delivering comparable performance at a fraction of the footprint.

English
14
25
334
121.9K
Alexander Mackie retweeted
Tim Becker
Tim Becker@tjbecker·
Remote heap-buffer-overflow in CUPS just dropped. Found autonomously by Xint Code. Disclosure + PoC below
English
5
14
104
10.5K
Alexander Mackie retweeted
flavio
flavio@flaviocopes·
How Axios was compromised 🤯
English
147
855
6.9K
1.5M
vx-underground
vx-underground@vxunderground·
There is this strange phenomenon where people new to cybersecurity go way overboard trying to look cool and badass to give the facade of being really technical. I'll tell you something right now. You probably won't like to hear it, but it is important.
Nobody cares about:
- Your certificates
- The conferences you've attended
- Your vendor swag
- What OS you're using
- How many LEDs your computer has
Here is what your peers admire the most:
- If you're polite
- If you're willing to admit if you're wrong
- If you're easy to get along with
If you're just a chill nerd who is nice, easygoing, willing to admit when you're wrong, you will go further than the big mean nerd with the galaxy brain
English
157
316
3.3K
102.2K
Thomas H. Ptacek
@roddux Everything in Syzkaller is about to get higher-stakes now that everyone's LLM agent is going to try to shortcut generating working exploits by pulling from its crashers.
English
2
0
12
1.4K
Jamieson O'Reilly
Jamieson O'Reilly@theonejvo·
@ZanderMackie I've thought about this too. I can totally understand a forced hand situation, where if they were also behind recent but prior attacks, and the creds they gained from those were at high risk of being rotated, classic all or nothing decision imo.
English
1
0
0
28
Alexander Mackie retweeted
sshkhr
sshkhr@sshkhr16·
(1/7) Everyone's been piling on the cs153.stanford.edu syllabus at @Stanford by @AnjneyMidha and @mabb0tt as being too high-level (I saw some comparisons being made to TED Talks, even Coachella 😅) fwiw I personally think this is a great class, and it's probably significantly more valuable in-person than online. I saw the lectures with @GuillaumeLample and @sualehasif996 from the Winter 2025 session and personally found them very informative. But if you're looking for a deeper dive into frontier AI systems, here are 5 courses (and a few other resources) you will like:
English
5
28
208
19K