RustyRabbit

2.2K posts

RustyRabbit

@_RustyRabbit

non fungible dad security researcher

شامل ہوئے Nisan 2018

1.1K فالونگ342 فالوورز

RustyRabbit ری ٹویٹ کیا

Andrej Karpathy@karpathy·6d

- Drafted a blog post - Used an LLM to meticulously improve the argument over 4 hours. - Wow, feeling great, it’s so convincing! - Fun idea let’s ask it to argue the opposite. - LLM demolishes the entire argument and convinces me that the opposite is in fact true. - lol The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.

English

1.7K

2.4K

31K

3.3M

RustyRabbit ری ٹویٹ کیا

Pashov Audit Group@PashovAuditGrp·6d

security auditors only have 5 moods: 1. fk I hit the Claude limit again 2. how we gonna get rich 3. random 2 am motivation 4. intense loneliness 5. my family needs me

English

131

5.2K

RustyRabbit ری ٹویٹ کیا

Hari@hrkrshnn·6d

"The language models we have now are the most significant thing to happen to security since the beginning of the internet". Nicolas Carlini from Anthropic. Full video below:

English

2.2K

RustyRabbit@_RustyRabbit·27 Mar

@DevDacian @milotruck @nisedo_ Tragedy of the commons at work'

English

213

Dacian@DevDacian·27 Mar

How many salaries are you responsible for paying on regular basis? Are you providing the livelihoods for 20-30 people on an ongoing basis? If you are just a solo independent act it is easy to talk smack and want everything for free. Why doesn't Cantina open-source their AI and give away their edge? @hrkrshnn

English

584

nisedo@nisedo_·27 Mar

Fuzzing a codebase from scratch takes hours of setup What if it took 1 command? Echidna/Medusa harness + basic invariants, auto-generated for ANY Foundry project Soon™

English

4.6K

RustyRabbit ری ٹویٹ کیا

Josselin Feist@Montyly·27 Mar

These days, in almost all my discussions I get asked what I think about AI and the future of security, so I figured I should share it here Short version: I try not to have a strong opinion yet. We are clearly in a transition phase, and outside of people working directly on foundation models, no one really has a solid view of where this is going Over the past months, LLMs improved a lot. The releases at the end of 2025 were a real step change. In practice, most people I know (myself included) have barely written code in the past 2-3 months. For security, we went from "this is fun" to "this is actually useful" Right now, the best mental model I have is that we effectively jumped from having no tooling to having an advanced static analyzer or fuzzer. A lot of bugs that used to take time to find can now be surfaced quickly Does that mean security researchers disappear in 2 years? Based on today’s tech, I do not think so. There are a lot of bugs to be found. Some are found by humans, some by traditional techniques, and now some by LLMs. But it does not mean all bugs get found. If anything, history suggests there are always more bugs than anyone expects, and that gap does not go away easily The real question is: do LLMs get another capability jump, or just steady iteration? There are reasonable arguments both ways. To be honest, I do not have enough understanding of how these models evolve to have a confident answer. And anyone giving a very definite answer is probably overconfident, unless they are working directly on the models Depending on that, the role of security researchers could change a lot, including the way we work. The demand could decrease if models get very strong at finding bugs. But it could also increase if the amount of code grows faster than the models’ ability to reason about it. We could even end up with a shortage of experienced researchers in a few years if fewer juniors enter the field while seniors move elsewhere. It is hard to predict because everything depends on how model capabilities evolve On the business side, I am skeptical about "AI audit as a service". If models keep improving, it is hard to see how these companies compete with native offerings from OpenAI or Anthropic. Especially if those providers stop exposing raw capabilities and push everyone into their own products. I tried codex security, and while it is not perfect, it is clear where this is going. Mythos / Capybara seem to be around the corner, and it will be interesting to see how far it goes My current bet is that within a few months, tools like codex or claude security will be great at finding blockchain issues, and they will integrate directly into most dev pipelines. At that point, the marginal value of an extra "AI audit SaaS" becomes limited So what to do as a security researcher? Be adaptive. This is a transition period, and things will likely move fast in 2026. Stay curious, and keep working on skills that give you an edge. Regularly reassess where you are strong or weak, and where AI helps you versus where it replaces you. If you like challenges, see AI as one that pushes you to improve Also, be careful with what people call "cognitive debt" or "brain rot". I was skeptical at first, but I do see it now. The more I rely on LLMs during an audit, the more I lose part of the intuition that I normally build while going deep into code. That intuition is still critical to find complex bugs. I have not found the right balance yet, but it is something to watch It probably makes sense to revisit your view on LLMs every 3-6 months. I have already been wrong a few times on this, and I am fine with that, as long as I don’t get locked into a fixed view Finally, a lot of people focus on the downside for security researchers. But there are also upsides. I can explore codebases much faster, build custom tooling easily, and spend less time on boring tasks. Maybe it’s my last few years/months as a security researcher, maybe not. But at least LLMs let me have some fun before doomsday 😅

English

167

14.6K

RustyRabbit ری ٹویٹ کیا

Tay 💖@tayvano_·26 Mar

@hanni_abu @_esk_kse_ @lex_node bc im fucking censorship resistant and capture resistant and open and freeeeeeee

English

4.4K

RustyRabbit@_RustyRabbit·26 Mar

@cyfrin Great idea but why would users use this chain. And without users there are no funds to act as incentive.

English

Cyfrin Audits@cyfrin·26 Mar

The lifecycle: 1. Deploy audited contracts to BattleChain with real liquidity 2. On-chain Safe Harbor protects whitehats legally 3. DAO approves contracts for attack mode 4. Whitehats, AI agents, experimentalists, open season 5. Survive? Promote to production → deploy to mainnet If you get hacked on BattleChain, that's the plan. You're on the ultimate red team platform.

English

2.8K

Cyfrin Audits@cyfrin·26 Mar

As of today, BattleChain testnet is LIVE. The pre-mainnet, post-testnet blockchain, where whitehats legally attack your smart contracts before they reach production. Deploy. Get attacked. Ship stronger. Here's why we built it, what it is, and how you can get involved 🧵

GIF

English

107

471

107.8K

RustyRabbit@_RustyRabbit·25 Mar

@pashov Could be, but the likelihood of an exploitable bug is a lot smaller in that case.

English

pashov@pashov·25 Mar

@_RustyRabbit Do you think there are vulnerabilities around pausing as well

English

171

pashov@pashov·25 Mar

Web3 Security Horror Story Time A protocol gets reported a Critical vulnerability. They immediately patch it with a code fix and push it on-chain to their upgradeable contracts. A MEV bot picks up the "code fix" transaction before it is validated into a block, re-engineers the vulnerability with AI and front-runs the upgrade patch with an exploit. Upgrade passes successfully, the exploit before it as well. You just exposed the fix of a Critical vulnerability to an untrusted actor. AI allowed seconds to be enough to deduct a vulnerability from a patch. You can argue AI is dumb, sure. But you can't argue AI is not fast - and that it can't be even faster. Upgradeability and MEV bots become an attack vector with time. I challenge you to say how this can be safely secured.

English

245

18.5K

RustyRabbit@_RustyRabbit·25 Mar

@banteg Great Firewall of US

English

banteg@banteg·25 Mar

pour one out for the global idea marketplace. i was never interested in local news/rumors since i was a teen and i found internet access very alluring because i could connect with people based on their ideas alone. seems like there should be a less heavy handed way to fight foreign ops than forcing everyone into their own local bubbles. the suggestion pretty much defeats the purpose of this site, x is much more than news/politics.

Nikita Bier@nikitabier

Starting Thursday, we'll be updating our revenue sharing incentives to better reward the content we want on X: We will be giving more weight to impressions from your home region—to encourage content that resonates with people in your country, in neighboring countries and people who speak your language. While we appreciate everyone's opinion on American politics, we hope this will disincentivize gaming the attention of US or Japanese accounts and instead, drive diverse conversations on the platform. We invite creators to start building an audience locally. X will be a much richer community when there's relevant posts for people in all parts of the world.

English

146

11.8K

RustyRabbit ری ٹویٹ کیا

Varun@varun_mathur·23 Mar

Introducing the Agent Virtual Machine (AVM) Think V8 for agents. AI agents are currently running on your computer with no unified security, no resource limits, and no visibility into what data they're sending out. Every agent framework builds its own security model, its own sandboxing, its own permission system. You configure each one separately. You audit each one separately. You hope you didn't miss anything in any of them. The AVM changes this. It's a single runtime daemon (avmd) that sits between every agent framework and your operating system. Install it once, configure one policy file, and every agent on your machine runs inside it - regardless of which framework built it. The AVM enforces security (91-pattern injection scanner, tool/file/network ACLs, approval prompts), protects your privacy (classifies every outbound byte for PII, credentials, and financial data - blocks or alerts in real-time), and governs resources (you say "50% CPU, 4GB RAM" and the AVM fair-shares it across all agents, halting any that exceed their budget). One config. One audit command. One kill switch. The architectural model is V8 for agents. Chrome, Node.js, and Deno are different products but they share V8 as their execution engine. Agent frameworks bring the UX. The AVM brings the trust. Where needed, AVM can also generate zero-knowledge proofs of agent execution via 25 purpose-built opcodes and 6 proof systems, providing the foundational pillar for the agent-to-agent economy. AVM v0.1.0 - Changelog - Security gate: 5-layer injection scanner with 91 compiled regex patterns. Every input and output scanned. Fail-closed - nothing passes without clearing the gate. - Privacy layer: Classifies all outbound data for PII, credentials, and financial info (27 detection patterns + Luhn validation). Block, ask, warn, or allow per category. Tamper-evident hash-chained log of every egress event. - Resource governor: User sets system-wide caps (CPU/memory/disk/network). AVM fair-shares across all agents. Gas budget per agent - when gas runs out, execution halts. No agent starves your machine. - Sandbox execution: Real code execution in isolated process sandboxes (rlimits, env sanitization) or Docker containers (--cap-drop ALL, --network none, --read-only). AVM auto-selects the tier - agents never choose their own sandbox. - Approval flow: Dangerous operations (file writes, shell commands, network requests) trigger interactive approval prompts. 5-minute timeout auto-denies. Every decision logged. - CLI dashboard: hyperspace-avm top shows all running agents, resource usage, gas budgets, security events, and privacy stats in one live-updating screen. - Node.js SDK: Zero-dependency hyperspace/avm package. AVM.tryConnect() for graceful fallback - if avmd isn't running, the agent framework uses its own execution path. OpenClaw adapter example included. - One config for all agents: ~/.hyperspace/avm-policy.json governs every agent framework on your machine. One file. One audit. One kill switch.

English

130

183

1.3K

134.9K

RustyRabbit@_RustyRabbit·23 Mar

@hrkrshnn It would be interesting to know the design and how it works. Do you plan to make this info public or keep it internal as 'secret sauce'?

English

292

Hari@hrkrshnn·23 Mar

The reason this result is impressive is the ability to match the 34 critical, high, and medium severity findings. That is a lot of findings. This is a pretty large and complex codebase. Most AI systems, including baseline ChatGPT, Claude, and Gemini, will find some bugs (and a ton of false positives), but not all. However, finding some bugs is not enough for an AI system. It needs to be able to find *all* bugs. What does it mean to find all bugs? The baseline: it needs to match all the bugs a competent human team will find over a reasonably sized manual audit. If it can match all critical, high, and medium severity findings, I'd consider it to have 100% coverage. Anything more is icing on the cake. Remember: no human audit today guarantees they'll find *all* bugs; they all come with disclaimers that tell you it's a point-in-time security review over N number of weeks, and many of them will recommend getting another security review to improve confidence that there's nothing left. Clearly, no single human in an audit team can guarantee that they'll find all the bugs in that team audit. Early versions of Apex never got close to 100% coverage. Sometimes it found bugs that the human team missed (which is normal in any audit, as the disclaimers state), but finding all the same bugs was impossible. We had to make a series of improvements over time to get here. And we still have a lot of work left to build confidence that this performance is indeed generalizable. But in getting here, we've made a pretty staggering realization: code security as we know it is on track to be solved! There's a lot of engineering and product work left, but there's a clear path ahead of us that will give us something that's faster, better, and cheaper than a human audit every single time. Maybe not 100% of the time today, but 100% over time. This is a huge statement that will rightfully receive a lot of skepticism, but hear me out: we had a list of bugs that we just couldn't get previous versions of Apex to find. But no longer! Our cracked Apex team pulled their hair out over weeks last year on certain complex bugs. Even when we were 'cheating' by telling Apex about the bug, earlier versions just didn't have enough intelligence to process certain issues. We don't see that anymore. We literally don't know of a bug or bug class that's out of reach today. We methodically track bugs that Apex is missing and bugs that are marked as false positives. We have a clear strategy for fixing every gap we spot in a generalizable way. It's now a lot of shipping, scaling, optimizing, and product work. There are two different ways people are taking this (that an AI can catch any bug): 1. Denial. I've seen this last year when coding agents started to look promising. So many strong engineers were in denial. They loved to point out every single mistake that these coding agents made. But others saw opportunity: what if the coding agents kept improving? 2. The opportunity. So many early users of Apex are finding out they can now get really good security guarantees on full-stack applications, something they could never do in the past. Imagine your backend application that interacts with sensitive data or money. You could never get a similar level of diligence as, say, smart contracts because it would cost too much and was an ever-moving target. You can now get continuous world-class security for the first time in history. In some way, these AI tools are increasing the total addressable market for security. We saw a similar trend with coding agents: people who have never been able to code before are now shipping apps that they've always dreamed of building but didn't have the know-how or time to create. We'll start to see this in security too: applications and teams that could never afford security guarantees that come with an external line-by-line code review by top security researchers can now get it.

Hari@hrkrshnn

Our cracked Apex R&D team has one job: to build the frontier AI security agent. Here's a benchmark on how an experimental version of Apex performed against a 6-person audit. It found all the Crits, Highs and Mediums, and several more!

English

6.6K

RustyRabbit@_RustyRabbit·23 Mar

@Al_Qa_qa Cooperative is better for everyone in the long run.

English

Al-Qa'qa'@Al_Qa_qa·22 Mar

Competitive vs Collaborative Strategy in Web3 Security Firms. I have worked with multiple Web3 security firms, including top-tier ones. I can say that there are two main Strategies used by Firms when reviewing a given codebase, which are Competitive and Collaborative. Competitive is where security researchers do not know each other’s findings. Let's say an audit with 4 engaged auditors, each one of them submits the findings they got; no one knows the findings that the other auditors submitted. Then, at the end, we collect all the findings. Collaborative is the opposite. Auditors know each other's findings. Once a given auditor submits an issue, it is visible to all other auditors, and they can collaborate with each other, etc... I see Collaborative strategy is better for the Client, Competitive is better for the security firm itself. Collaboration is better for the client from different aspects. 1- The Client does not care which auditor gets the issue. 2- Collaborative makes the process more professional, where auditors can engage with each other to discuss a given issue, etc... 3- Sometimes one of the team mate find an issue that opens you to find another issue (this happens to me) 4- For Large protocols with a lot of Findings, the coverage is better. Some protocols end with +60 total findings, with a lot of H/Ms. This increases the process of submitting without consuming time from each auditor (in case all of them find this issue) to validate, write the report, etc... Competitive is better for the Security firm as it will be easier to evaluate the performance of the auditors who engaged in the codebase, who missed a lot, who found unique bugs, etc... While I have worked in Firms from both strategies, I prefer the collaborative one. This is my opinion; you can agree or disagree with me. What strategy do you prefer?

English

1.6K

RustyRabbit@_RustyRabbit·23 Mar

Maybe now they understand what it's all about

Ramy Abdu| رامي عبده@RamAbdu

French judge Nicolas Gouyou, who issued an arrest warrant for Netanyahu at the ICC: • Visa and Mastercard have blocked all my cards • I cannot make any purchases • I am a judge, yet treated like a criminal • Judges, lawyers, and politicians are being intimidated • A colleague told me my name won’t be removed from the blacklist until Trump’s term ends • Despite intervention by the French president, U.S. authorities have not responded

English

RustyRabbit ری ٹویٹ کیا

gmhacker@realgmhacker·21 Mar

JP Morgan's CISO wrote an open letter about agentic AI access management. AI agents are starting to act autonomously across systems, reading emails, executing code, calling APIs. Each one needs credentials, permissions, and guardrails. We haven't figured out how to manage human access properly (38% of ex-employees still have access). Now we're handing keys to autonomous agents. The access-trust gap isn't closing. It's about to get exponentially wider.

English

615

RustyRabbit ری ٹویٹ کیا

Keycard Shell@Keycard_·20 Mar

Your hardware wallet can receive a firmware update that changes how your keys are used. 🔐 Ours can't. Not by hackers. Not by us. Not by anyone. Here's why that matters 👇 keycard.tech/en/blog/a-keyc…

English

2.6K

RustyRabbit@_RustyRabbit·14 Mar

@pashov Probably going to end up on price per finding.

English

pashov@pashov·14 Mar

How much do you think an AI audit scan should cost? Only honest answers, comment below.

English

13.7K

RustyRabbit@_RustyRabbit·14 Mar

@DCinvestor They try to speak the other’s language out of courtesy. Mostly the one that performs best or is able to endure the effort the best is what is continued. If they both don’t speak each other’s language you try to find one that you both understand/speak.

English

141

DCinvestor@DCinvestor·14 Mar

when two Europeans from different countries meet and they can speak each others’ languages, what is the etiquette to determine which language they speak with each other?

English

19K

RustyRabbit@_RustyRabbit·14 Mar

@GalloDaSballo Any card that converts into fiat rails will be forced to comply. So they all have the same issue.

English

Alex the Entreprenerd@GalloDaSballo·13 Mar

What are the best crypto cards?

Afonso Gonçalves@AfonsoJFG

⚠️ SHOCKING: My Revolut account has just been BANNED by order of the Public Prosecutor’s Office! 👉🏻 I am living under tyranny and being treated like an international criminal for defending 🇵🇹 and the Portuguese people. 🫵🏻 They are not after me — they are after YOU. I’m just in the way.

English

1.5K

RustyRabbit@_RustyRabbit·14 Mar

@zacodil @StaniKulechov I’m all for self-custody and every responsibility that Goes along with it but any wallet that let’s this go through without a cool of period is retarded. That being said, I think pretty much any wallet let’s the user do this with a minimal warning.

English

286

Vadim@zacodil·14 Mar

I ran Aave's code locally to show you exactly what a $50M swap screen looks like. Yellow warning. 99.9% price impact. Checkbox. You can't miss it. So how did someone confirm past this with $50M? Could you accidentally check this box?

English

137

27.7K

دریافت کریں

@DevDacian @milotruck @nisedo_ @hrkrshnn @hanni_abu @_esk_kse_ @lex_node @cyfrin