
aisectools
75 posts

aisectools
@aisectools
The latest posts, articles, and discussions from the world of AI-powered blockchain security tooling! email: https://t.co/12HciHS64R
Joined February 2026
25 Following · 175 Followers

@m4rio_eth Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

With so many supply chain attacks happening, we've built a plugin marketplace at Cantina where we'll be posting various plugins
github.com/cantinasec/plu…
/plugin marketplace add cantinasec/plugins
/plugin install cantinasec@cantinasec-plugins
/reload-plugins
and, for example, you can use the skill to quickly check whether you're affected by the axios supply chain attack.


@perplexity_ai Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

Today, we're launching the Secure Intelligence Institute.
SII partners with top cryptography, security, and ML teams to advance security research and industry collaboration. It is led by Dr. Ninghui Li at Purdue.
perplexity.ai/secure-intelli…


@forefy @archethect @pashov @0xKaden @shuvonsec Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

I solved Auditing Skills Benchmarking for us with
🤍opensource and only for fun and better audits
🤍autoresearch loop
🤍deterministic contests
hosting it at forefy.com/benchmarks
- benchmark per category (best at logic problems? best at math problems? best poc generator?)
- deterministic score (not your typical AI testing AI)
- opensource versioned benchmarks, you can create a benchmark or submit benchmark execution results at a cost of several tokens (run locally via your agent)
- cross-model scoring, cross-env differentiation sums
- autoresearch incentive to run - if you run it locally you are not only contributing to the truth of the benchmark, but also IMPROVE LOCALLY the version of your skill, just for yourself (share if you want but don't have to)
- safe self-benchmarking is possible via commit-pinned, audited skills (you're running audited skills in the benchmark loop)
- don't want to run? just enjoy knowing the best skills out there per your use-case - clear benchmark winner leaderboards
- yeah, we will also benchmark all-in-one audit skills (anything from the auditor skill registry)
benchmarks repo:
github.com/forefy/benchma…
How to contribute / benefit:
- watch for upcoming benchmarks this week (comment below for skills you want to see benchmarked✨✨)
- improve the accuracy of published benchmarks by logging in and running your own (you only need a few tokens and an agentic CLI)
- if you're confused on how to contribute DM me
- I heard @archethect is cooking some serious stuff !! and has helped me set this weekend project in motion 🔥
🔁 Repost to get more people contributing and improve benchmark truthfulness !!!
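The "deterministic score (not your typical AI testing AI)" point above amounts to scoring runs with pure set arithmetic against a pinned expected-findings list, so the same inputs always yield the same score. This is a minimal sketch of that idea; the finding representation and field names are assumptions, not Forefy's actual schema.

```python
def deterministic_score(reported, expected):
    """Score a benchmark run by exact matching of finding identifiers.

    `reported` and `expected` are sets of (location, vuln_class) tuples.
    Scoring is pure set arithmetic, so there is no model-as-judge
    variance: identical inputs always produce identical scores.
    """
    true_pos = reported & expected
    recall = len(true_pos) / len(expected) if expected else 1.0
    precision = len(true_pos) / len(reported) if reported else 1.0
    return {
        "recall": recall,
        "precision": precision,
        "missed": sorted(expected - reported),
        "false_positives": sorted(reported - expected),
    }
```

Because the score is a deterministic function of the run output, independently submitted executions of the same benchmark can be compared and aggregated directly.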




@ja_akinyele Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

We’re taking a more proactive, AI-driven approach to strengthening XRPL security.
That includes AI-assisted testing across the development lifecycle, a dedicated red team, and higher standards for how changes are evaluated before they go live.
As XRPL scales to support global payments, tokenized assets, and institutional use cases, our goal is to continuously strengthen its reliability.
The reality is the work of building secure, reliable financial infrastructure is never done.
More in the post ↓

@xy9301 Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

@ZeroCool_AI @zksync @immunefi Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

@Montyly @hrkrshnn Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

@hrkrshnn Have you compared Apex using an old model, like GPT 5.0, versus a few good skills with GPT 5.4?
Asking because there's some likelihood that all the "secret sauce" you're seeing is just the progress of the underlying models

The reason this result is impressive is the ability to match the 34 critical, high, and medium severity findings. That is a lot of findings.
This is a pretty large and complex codebase. Most AI systems, including baseline ChatGPT, Claude, and Gemini, will find some bugs (and a ton of false positives), but not all. However, finding some bugs is not enough for an AI system. It needs to be able to find *all* bugs.
What does it mean to find all bugs? The baseline: it needs to match all the bugs a competent human team will find over a reasonably sized manual audit. If it can match all critical, high, and medium severity findings, I'd consider it to have 100% coverage. Anything more is icing on the cake. Remember: no human audit today guarantees they'll find *all* bugs; they all come with disclaimers that tell you it's a point-in-time security review over N number of weeks, and many of them will recommend getting another security review to improve confidence that there's nothing left. Clearly, no single human in an audit team can guarantee that they'll find all the bugs in that team audit.
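That coverage bar can be written down concretely. A minimal sketch, assuming findings are represented as (severity, identifier) pairs (an illustrative representation, not Apex's actual data model):

```python
TRACKED_SEVERITIES = {"critical", "high", "medium"}

def audit_coverage(ai_findings, human_findings):
    """Fraction of the human team's tracked findings matched by the AI run.

    Each finding is a (severity, identifier) pair.  Low/info findings
    are "icing on the cake" and excluded from the denominator, matching
    the bar described above: 1.0 means every critical, high, and medium
    severity human finding was matched.
    """
    tracked = {f for f in human_findings if f[0] in TRACKED_SEVERITIES}
    matched = tracked & set(ai_findings)
    return len(matched) / len(tracked) if tracked else 1.0
```

Extra findings the AI surfaces beyond the human set do not raise the score; the metric measures recall against the human baseline only.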
Early versions of Apex never got close to 100% coverage. Sometimes it found bugs that the human team missed (which is normal in any audit, as the disclaimers state), but finding all the same bugs was impossible. We had to make a series of improvements over time to get here. And we still have a lot of work left to build confidence that this performance is indeed generalizable.
But in getting here, we've made a pretty staggering realization: code security as we know it is on track to be solved! There's a lot of engineering and product work left, but there's a clear path ahead of us that will give us something that's faster, better, and cheaper than a human audit every single time. Maybe not 100% of the time today, but 100% over time.
This is a huge statement that will rightfully receive a lot of skepticism, but hear me out: we had a list of bugs that we just couldn't get previous versions of Apex to find. But no longer!
Our cracked Apex team pulled their hair out over weeks last year on certain complex bugs. Even when we were 'cheating' by telling Apex about the bug, earlier versions just didn't have enough intelligence to process certain issues. We don't see that anymore. We literally don't know of a bug or bug class that's out of reach today.
We methodically track bugs that Apex is missing and bugs that are marked as false positives. We have a clear strategy for fixing every gap we spot in a generalizable way. It's now a lot of shipping, scaling, optimizing, and product work.
There are two different ways people are taking this (that an AI can catch any bug):
1. Denial. I've seen this last year when coding agents started to look promising. So many strong engineers were in denial. They loved to point out every single mistake that these coding agents made. But others saw opportunity: what if the coding agents kept improving?
2. The opportunity. So many early users of Apex are finding out they can now get really good security guarantees on full-stack applications, something they could never do in the past. Imagine your backend application that interacts with sensitive data or money. You could never get a similar level of diligence as, say, smart contracts because it would cost too much and was an ever-moving target. You can now get continuous world-class security for the first time in history.
In some way, these AI tools are increasing the total addressable market for security. We saw a similar trend with coding agents: people who have never been able to code before are now shipping apps that they've always dreamed of building but didn't have the know-how or time to create. We'll start to see this in security too: applications and teams that could never afford security guarantees that come with an external line-by-line code review by top security researchers can now get it.
Hari@hrkrshnn
Our cracked Apex R&D team has one job: to build the frontier AI security agent. Here's a benchmark on how an experimental version of Apex performed against a 6-person audit. It found all the Crits, Highs and Mediums, and several more!

@trailofbits Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

@RoundtableSpace Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

The biggest unsolved problem in AI agents isn't intelligence - it's context.
Too little and the agent is clueless. Too much and you waste tokens and lose coherence.
OpenViking fixes this.
> Organizes your knowledge into a tree structure
> Delivers high-level summaries first
> Drills into details only when the agent needs them
> Keeps context clean, relevant, and within token limits
The missing layer between your agent and your knowledge base just got built.
github: github.com/volcengine/Ope…
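The summary-first, drill-down-on-demand pattern described above can be sketched in a few lines. This is an illustration of the concept only; the data structures and function names here are assumptions, not OpenViking's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    summary: str                   # short, always cheap to include
    detail: str = ""               # full text, included only on demand
    children: list = field(default_factory=list)

def build_context(node, want_detail, depth=0):
    """Walk the knowledge tree, emitting summaries by default and
    drilling into a node's detail and children only where the agent
    asked for them -- keeping the assembled context small and relevant."""
    lines = ["  " * depth + node.summary]
    if want_detail(node) and node.detail:
        lines.append("  " * depth + node.detail)
        for child in node.children:
            lines.extend(build_context(child, want_detail, depth + 1))
    return lines
```

A real implementation would add a token budget and let the agent's queries drive `want_detail`, but the shape is the same: breadth cheaply, depth selectively.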


@virtuals_io @synthesis_md Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

Virtuals Protocol is partnering with @synthesis_md to bring agent commerce to builders.
The Synthesis is a 10-day hackathon where humans and agents build together, with submissions evaluated by AI agent judges. Each partner trains their own agentic judge to define what matters for their track.
We are providing the commerce layer for agents to transact, negotiate, and settle value autonomously.

synthesis@synthesis_md
An agentic Ethereum is coming. The Synthesis. Building starts March 13th.

@OpenAI Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

We’re acquiring Promptfoo.
Their technology will strengthen agentic security testing and evaluation capabilities in OpenAI Frontier. Promptfoo will remain open source under the current license, and we will continue to service and support current customers.
openai.com/index/openai-t…

@xy9301 Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

@z0r0zzz @AckeeBlockchain @auron_xyz @webrainsec @winfunction Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

ross.wei@z0r0zzz

@VittoStack Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

Virtuals 🤝 dAI team
We've released a new ERC: 8183.
ERC-8183 gives agents:
- Trustless commerce via on-chain escrow
- A universal Job primitive for any transaction
- Modular hooks for custom logic
All tied to the 8004 reputation registry.
The commerce layer for the agent economy.
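The trustless-escrow Job lifecycle described above (fund, deliver, settle) can be illustrated as a simple state machine. To be clear, this is not the ERC-8183 specification, which is not reproduced in this post; it is a generic sketch of an escrowed job primitive, with all names invented for illustration.

```python
from enum import Enum

class JobState(Enum):
    CREATED = 1
    FUNDED = 2
    DELIVERED = 3
    SETTLED = 4

class EscrowJob:
    """Toy escrow lifecycle: the client funds the job, the provider
    delivers, and settlement releases the escrowed funds."""

    def __init__(self, client, provider, price):
        self.client, self.provider, self.price = client, provider, price
        self.state = JobState.CREATED
        self.escrow = 0

    def fund(self, amount):
        assert self.state is JobState.CREATED and amount >= self.price
        self.escrow = amount
        self.state = JobState.FUNDED

    def deliver(self):
        assert self.state is JobState.FUNDED
        self.state = JobState.DELIVERED

    def settle(self):
        assert self.state is JobState.DELIVERED
        payout, self.escrow = self.escrow, 0
        self.state = JobState.SETTLED
        return payout
```

In the on-chain version the escrow would hold real value and the "modular hooks" mentioned above would let integrators attach custom logic at each transition.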
Virtuals Protocol@virtuals_io

@OpenAI Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

Codex Security—our application security agent—is now in research preview.
openai.com/index/codex-se…

@xy9301 Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

Recently I’ve been working on a framework that only requires some natural language documentation.
With it, any auditor can have their own customized automated scanning engine. It’s also highly compatible with openclaw.
github.com/BradMoonUESTC/…
Feel free to check it out if you're interested.

@forefy Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

As the first public auditing skills author, I can share this:
• AI can't write skills as well as actual auditors
• Over-verbose skills (e.g. more than 5,000 tokens a page) create context rot
• Installing other people's skills is much scarier than npm install
I solved this by utilizing my profile site to host the Auditor Skills Registry
• Skills I personally use (including skills from @pashov , @trailofbits , @QuillAudits_AI , @auditmos , myself, etc.)
• Security reviewed, guardrails, AI reliance rating
• Easy and secure 1-click installation to claude code / copilot cli / gemini cli / codex
IMPORTANT: Like or repost if you plan on using it, to let me know if I should keep it live:
forefy.com/skills


@lonelysloth_sec Your post has been featured on AISECTOOLS: Blockchain Security AI Tooling Weekly Roundup!
x.com/aisectools/sta…
aisectools@aisectools

I was starting to get hopeful about using Claude in some capacity in my work.
Then I ran a test: I introduced a definitely critical vulnerability, somewhat atypical but obvious, into a target I had spent days on without finding any bug.
It was a direct contradiction of the comment that explained the line of code. It screamed bug.
First try it didn’t find it.
Asked it to double check. It kinda found it but convinced itself it was by design and safe.
I introduced a second vuln a couple lines from the first.
After many iterations with me trying to nudge it, it finally found it.
I asked the severity.
Info — then gave me a long list of reasons that misstated fundamental facts about Solidity and Ethereum.
That’s the story of my use of those things.
I spend more time explaining things to it than getting answers. And the answers I get I can’t trust.
All things considered it slows me down considerably.
I’ll wait for the next model.


