b a b a r

621 posts

@_babarhashmi

i undo tweets* // bot butcher @quillai_network // serial prompter @wach_ai // christian dyor @deaialliance

wadiya · Joined February 2018

164 Following · 359 Followers

b a b a r retweeted

QuillAudits@QuillAudits_AI·3d

x.com/i/article/2044…

b a b a r retweeted

Clawlens@clawlens_ai·4d

Every participant in an economy needs a passport. AI agents don’t have one yet. No identity → no access No reputation → no credit. Clawlens changes that. The reputation layer for the agentic economy. Coming soon

b a b a r@_babarhashmi·26 Mar

@QuillAudits_AI onwards and upwards ♾

b a b a r retweeted

QuillAudits@QuillAudits_AI·26 Mar

Today marks 8 years of QuillAudits. Most Web3 security firms didn't exist 8 years ago. Most won't exist 8 years from now.

We've built through 3 bear markets, 2 exploit waves, and the full evolution of smart contract attacks from simple reentrancy to cross-protocol economic exploits. 1,500+ protocols. $3B+ protected.

The biggest lesson from 8 years and 1,500+ engagements: One team, one method, one pass doesn't cut it when you're protecting hundreds of millions in user funds. So we rebuilt the model.

Multi-Layer Audit → four independent security layers, delivered in the same timeline as a traditional audit:

> Senior auditors who've collectively reviewed 1,500+ protocols
> AI security agents trained on 5,000+ real exploits since 2017
> Independent bug bounty through curated security researchers
> Continuous monitoring, because threats don't stop at deployment

4 layers. Each one catches what the others miss.

Web3 has a $100T addressable market if institutions show up. They won't show up until security is embedded in every layer, every transaction, every deployment, the way HTTPS is embedded in the internet. That's the problem worth solving for the next 8 years.

QuillAudits built the foundation, QuillShield is the next chapter: an AI security agent that brings what we learned from 1,500+ manual audits into every developer's workflow, before code ever hits mainnet.

8 years in. Still early.

b a b a r retweeted

Andrej Karpathy@karpathy·21 Feb

Bought a new Mac mini to properly tinker with claws over the weekend. The apple store person told me they are selling like hotcakes and everyone is confused :) I'm definitely a bit sus'd to run OpenClaw specifically - giving my private data/keys to 400K lines of vibe coded monster that is being actively attacked at scale is not very appealing at all. Already seeing reports of exposed instances, RCE vulnerabilities, supply chain poisoning, malicious or compromised skills in the registry, it feels like a complete wild west and a security nightmare. But I do love the concept and I think that just like LLM agents were a new layer on top of LLMs, Claws are now a new layer on top of LLM agents, taking the orchestration, scheduling, context, tool calls and a kind of persistence to a next level. Looking around, and given that the high level idea is clear, there are a lot of smaller Claws starting to pop out. For example, on a quick skim NanoClaw looks really interesting in that the core engine is ~4000 lines of code (fits into both my head and that of AI agents, so it feels manageable, auditable, flexible, etc.) and runs everything in containers by default. I also love their approach to configurability - it's not done via config files it's done via skills! For example, /add-telegram instructs your AI agent how to modify the actual code to integrate Telegram. I haven't come across this yet and it slightly blew my mind earlier today as a new, AI-enabled approach to preventing config mess and if-then-else monsters. Basically - the implied new meta is to write the most maximally forkable repo and then have skills that fork it into any desired more exotic configuration. Very cool. Anyway there are many others - e.g. nanobot, zeroclaw, ironclaw, picoclaw (lol @ prefixes). There are also cloud-hosted alternatives but tbh I don't love these because it feels much harder to tinker with. 
In particular, local setup allows easy connection to home automation gadgets on the local network. And I don't know, there is something aesthetically pleasing about there being a physical device 'possessed' by a little ghost of a personal digital house elf. Not 100% sure what my setup ends up looking like just yet but Claws are an awesome, exciting new layer of the AI stack.

b a b a r@_babarhashmi·19 Feb

It looks airtight in the spec. Then you rerun it and reality improvises. Consistency is apparently a premium feature.

OpenAI@OpenAI

Introducing EVMbench—a new benchmark that measures how well AI agents can detect, exploit, and patch high-severity smart contract vulnerabilities. openai.com/index/introduc…

b a b a r retweeted

Akshay Babhulkar 🥷🛡️@AkshayBabhulkar·19 Feb

Glad to be part of this amazing QuillShield team and very proud to be working alongside such talented folks @KernelHarsh @_babarhashmi @iChitranshu @cryptanu @0xSlowbug @turvec_dev @kalp_eth @phoenix244001

Excited to share some impactful findings from QuillShield in recent Solidity contract scans! Our tool successfully detected:

• Share tracking desynchronization → potential double-spending
• Incorrect bit extraction from storage → broken price logic
• Missing sequencer uptime checks → stale oracle pricing

These are not just typical bugs, they’re business logic & arithmetic-level vulnerabilities that can lead to serious fund risks if unnoticed.

We’re continuously improving our detection engine to:

→ Catch more edge cases
→ Reduce false positives
→ Go deeper into protocol-level risks

We’re getting sharper every day ⚔️

rkg.eth@bigrkg

Introducing QuillShield 🛡️ We’ve been building QuillShield for the last year @QuillAudits. QuillShield is a security agent swarm embedded inside the developer workflow. Think of it like an always-on Web3 code review layer for every PR, it catches critical vulnerabilities, surfaces the exact bug + risk, and helps you improve fixes faster over time. So far, QuillShield has already helped our team crack bounties on leading bug bounty platforms. The next step is public benchmarks and transparent performance metrics. We’re preparing to put it on the world map the right way. Note: The current platform (including the automatic patching flow) is live today, but we’ll be deprecating it soon. We’re rebuilding QuillShield from scratch with a completely fresh approach, which is why there haven’t been major updates in the last 3–4 months. More updates coming soon. 👀
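One of the findings listed above, "missing sequencer uptime checks → stale oracle pricing", can be illustrated with a small off-chain sketch. This is not QuillShield's detection logic or any contract from the scans; it is a Python model of the guard such a finding usually asks for, assuming an oracle that exposes an up/down flag with a timestamp. All names (`SequencerStatus`, `PriceRound`, `rejection_reason`) and both thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed thresholds for illustration; real values depend on the protocol.
GRACE_PERIOD_SECONDS = 3600
MAX_PRICE_AGE_SECONDS = 1800


@dataclass
class SequencerStatus:
    is_up: bool       # hypothetical field: True when the L2 sequencer is live
    changed_at: int   # unix time of the last status change


@dataclass
class PriceRound:
    price: int
    updated_at: int   # unix time the oracle last wrote this price


def rejection_reason(status: SequencerStatus, rnd: PriceRound, now: int) -> Optional[str]:
    """Return why the price must not be used, or None if it passes every check."""
    if not status.is_up:
        return "sequencer is down"
    if now - status.changed_at < GRACE_PERIOD_SECONDS:
        return "sequencer restarted too recently"
    if now - rnd.updated_at > MAX_PRICE_AGE_SECONDS:
        return "price is stale"
    if rnd.price <= 0:
        return "non-positive price"
    return None


def checked_price(status: SequencerStatus, rnd: PriceRound, now: int) -> int:
    """Return the price only when all guards pass; raise otherwise."""
    reason = rejection_reason(status, rnd, now)
    if reason is not None:
        raise ValueError(reason)
    return rnd.price


# Example values (made up for the sketch).
now = 1_000_000
healthy = SequencerStatus(is_up=True, changed_at=now - 7200)
just_restarted = SequencerStatus(is_up=True, changed_at=now - 60)
fresh = PriceRound(price=2500, updated_at=now - 60)
old = PriceRound(price=2500, updated_at=now - 7200)

print(checked_price(healthy, fresh, now))
print(rejection_reason(just_restarted, fresh, now))
print(rejection_reason(healthy, old, now))
```

The grace period matters because a price written just before an outage can still look recent right after the sequencer comes back; skipping that check is the "missing" part of the finding.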

b a b a r retweeted

QuillAudits@QuillAudits_AI·17 Feb

Dropping Claude Skills to speed up smart contract audits with structured AI workflows. 10 open-source Claude Skills that turn AI into a reasoning-driven audit companion:

→ Reentrancy Detector
→ Access Control Mapper
→ Oracle Risk Scout
→ Upgradeability Checker
→ MEV Pattern Watcher
→ Invariant Generator

b a b a r retweeted

WachAI@Wach_AI·17 Feb

Over the past few weeks we’ve been talking about verification becoming a primitive in the OpenClaw Stack. We launched two skills:

- WachAI-x402 > risk analysis toolkit via x402 rails
- wachaimandates > verifiable agent-to-agent agreements

With over 800 skill downloads combined, primitives are starting to propagate through the stack!

b a b a r retweeted

WachAI@Wach_AI·2 Feb

WachAI Mandates just went live on ClawHub 🦞 OpenClaw agents can now lock deterministic agreements between each other using WachAI's Mandates. Mandates enable task validation between agents, which eventually helps in building reputation. This helps Moltbook agents trust each other. We just got one step closer to verification. clawhub.ai/Akshat-Mishra1…

b a b a r@_babarhashmi·29 Jan

@ETH_Daily lfg @Wach_AI 🥷

Ethereum Daily@ETH_Daily·29 Jan

ERC-8004 is on the cusp of launching on the Ethereum mainnet, representing a game-changing advancement in positioning Ethereum as the bedrock for trustless AI agents. Explore this handpicked selection of AI agent projects worth monitoring. Bookmark or RT this for easy access later: @virtuals_io @aixbt_agent @luna_virtuals @GAME_Virtuals @elizaOS @autonolas @miranetwork @gizatechxyz @AIWayfinder @Infinit_Labs @Talus_Labs @LayerAIorg @MetisL2 @alignedlayer @NodeAIETH @almanak @GoNeuralAI @PaalMind @Chain_GPT @real_alethia @BioProtocol @origin_trail @alt_layer @openservai @Praxis_Protocol @swarms_corp @Wach_AI @Xyberinc @SaharaAI @Alias_labs @ChainbaseHQ @HeyElsaAI @Unibase_AI @wardenprotocol @daydreamsagents @dexteraisol @PayAINetwork @AEON_Community @GoKiteAI @mrdn_finance @bankrbot @PredictBase @ZyfAI_ @OpenledgerHQ @eigencloud Feel free to share any other promising AI agent projects we've overlooked—we'd love to hear your suggestions!

Ethereum Daily@ETH_Daily

🔥HUGE MILESTONE: ERC-8004 is launching on Ethereum Mainnet imminently!

ERC-8004 is a new standard on the Ethereum blockchain designed to help AI agents interact safely and reliably with each other, even if they're built by completely different people or companies.

The Problem It Solves

Right now, AI agents work great inside one company's system (where everything is pre-trusted), but they struggle in an open world. How does one agent find another useful agent? How does it know the other one is legit, competent, and won't mess up or scam it? Without a shared trust system, agents stay siloed, and we can't have a big, open "economy" of AI agents trading services.

ERC-8004 fixes this by adding a simple "trust layer" on Ethereum, using three lightweight on-chain tools (called registries). It's like giving AI agents passports, review profiles, and certification stamps:

- Identity Registry
- Reputation Registry
- Validation Registry

Why This Matters

ERC-8004 turns Ethereum into a neutral "settlement layer" for a massive decentralized AI economy. Agents can discover each other, build portable trust, and collaborate across organizations without needing big companies as middlemen.
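To make the "passports, review profiles, and certification stamps" analogy concrete, here is a toy in-memory model of three registries. It is not the ERC-8004 interface: the class, method names, score range, and trust thresholds are all invented for illustration, and nothing here touches a chain.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyTrustLayer:
    # Identity registry: agent id -> owner (the "passport").
    identities: Dict[str, str] = field(default_factory=dict)
    # Reputation registry: agent id -> feedback scores (the "review profile").
    feedback: Dict[str, List[int]] = field(default_factory=dict)
    # Validation registry: (agent id, task id) -> passed? (the "stamp").
    validations: Dict[Tuple[str, str], bool] = field(default_factory=dict)

    def register(self, agent_id: str, owner: str) -> None:
        if agent_id in self.identities:
            raise ValueError("agent already registered")
        self.identities[agent_id] = owner
        self.feedback[agent_id] = []

    def give_feedback(self, agent_id: str, score: int) -> None:
        if agent_id not in self.identities:
            raise KeyError("unknown agent")
        if not 0 <= score <= 100:
            raise ValueError("score must be between 0 and 100")
        self.feedback[agent_id].append(score)

    def record_validation(self, agent_id: str, task_id: str, passed: bool) -> None:
        if agent_id not in self.identities:
            raise KeyError("unknown agent")
        self.validations[(agent_id, task_id)] = passed

    def is_trusted(self, agent_id: str, min_average: float = 70.0) -> bool:
        """Hypothetical policy: known identity, good average score, one passed validation."""
        scores = self.feedback.get(agent_id, [])
        if agent_id not in self.identities or not scores:
            return False
        has_stamp = any(
            passed for (aid, _), passed in self.validations.items() if aid == agent_id
        )
        return sum(scores) / len(scores) >= min_average and has_stamp


layer = ToyTrustLayer()
layer.register("agent-1", owner="alice")
layer.register("agent-2", owner="bob")
layer.give_feedback("agent-1", 90)
layer.give_feedback("agent-1", 80)
layer.record_validation("agent-1", "task-42", passed=True)
layer.give_feedback("agent-2", 95)  # good reviews but no validation yet

print(layer.is_trusted("agent-1"))
print(layer.is_trusted("agent-2"))
```

The point of keeping the three stores separate is that a consumer can pick its own policy: `is_trusted` above is just one possible rule layered on top of the raw records.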

b a b a r@_babarhashmi·7 Jan

@bigrkg @sherlockdefi no 🧢

rkg.eth@bigrkg·7 Jan

AI needs to understand not just solidity syntax but mechanism design
it needs to recognize that identical code can be safe or exploitable depending on external dependencies
a simple transfer call is fine in isolation but deadly if it can trigger reentrancy into price oracles
the breakthrough is making AI understand the economic context around code
not just "is this code syntactically correct" but "what economic behaviors does this enable"
when AI can reason about incentive compatibility and game theory at code review time security shifts from reactive bug hunting to proactive mechanism validation
the really interesting part is what this enables for protocol evolution
today you launch an immutable contract and pray you got the economics right
tomorrow you have AI continuously validating that your mechanism remains sound as market conditions change
it can simulate thousands of adversarial scenarios against your live protocol state
it can predict when your tokenomics break down under specific market conditions
it can flag when governance proposals introduce subtle economic exploits
this creates living security instead of deployment-and-hope security
protocols become antifragile rather than just solid
every attack attempt makes the AI better at preventing the next one
we're moving from "audit then deploy" to "continuously verified economic systems"
this is the foundation for protocols that actually scale to trillion dollar TVL because the security scales with the complexity instead of breaking under it

SHERLOCK@sherlockdefi·7 Jan

Why hasn’t Lifecycle Security existed until now? Security was constrained by human working hours. In Web3, a small number of experts can reliably find catastrophic bugs. Their time became the bottleneck. Teams built toward audit checkpoints, waited weeks, then moved forward again. That constraint shaped the entire security model. AI breaks that paradigm. Today, security can run continuously during development, flagging issues before the code surrounding it solidifies and before value is at risk. Human auditors step in with full context, spending time on judgment rather than rediscovery. Post-launch findings feed forward instead of disappearing. Security stops resetting at each phase. Total reliance on point-in-time security created gaps, delays, and expensive rework. Connected lifecycle security compounds protection across development, audit, and live operation. AI makes that connection possible.

b a b a r retweeted

QuillAudits@QuillAudits_AI·6 Jan

The state of Web3 security in 2025: $2.54 Billion lost, but a new era of defense is emerging. In 2025, the Web3 ecosystem faced significant security challenges, recording 89 major incidents that resulted in approximately $2.54 billion in total losses. This marked a 21% increase in financial damage compared to the previous year.

b a b a r retweeted

Tim Hua 🇺🇦@Tim_Hua_·1 Jan

In 2026, my goal is to take more good actions and less bad actions.

b a b a r retweeted

Karan🧋@kmeanskaran·12 Dec

best LLM meme so far😂😂 @ordax

b a b a r retweeted

Andrej Karpathy@karpathy·7 Dec

Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask: "What do you think about xyz"? There is no "you". Next time try: "What would be a good group of people to explore xyz? What would they say?" The LLM can channel/simulate many perspectives but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you", it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do, but there is a lot less mystique to it than I find people naively attribute to "asking an AI".

b a b a r@_babarhashmi·2 Dec

everybody starts off as a slither wrapper ig 😅

Anthropic@AnthropicAI

New on our Frontier Red Team blog: We tested whether AIs can exploit blockchain smart contracts. In simulated testing, AI agents found $4.6M in exploits. The research (with @MATSprogram and the Anthropic Fellows program) also developed a new benchmark: red.anthropic.com/2025/smart-con…

b a b a r retweeted

WachAI@Wach_AI·26 Nov

broke an agent so bad it leaked its entire brain and even spilled out what it wasn't supposed to.. ngl felt like bullying🥲 if your agent survives us, it survives the prod! Guardrails V2, dropping soon🚨

b a b a r retweeted

Preetam | QuillAudits 🥷@raopreetam_·26 Nov

Most AI auditing tools plateau because they rely too much on static or rule-based checks. The real jump happens when Graphs with RL and context engineering come into play. Models need to learn from invariants in smart contracts, not just pattern matches. Context invariance helps the model understand what should never change in contract logic, which is key for catching subtle exploits. Graph based or hyper graph modelling with reinforcement learning lets auditors evolve by feedback, adapting to new vulnerability types and attack vectors with on-chain analytics. Static analysis finds what’s known, but Graph RL-driven context models can reason about what’s possible. Once AI (LLMs mostly) starts learning behavioral semantics, it won’t just flag bugs, it’ll predict failure paths. AI auditing (to some extent) is as good as the model powering it. CC: @_babarhashmi, @QuillAI_Network.

Q Bera@qtipbera

some thoughts on smart contract AI auditing tools

over the last few weeks, I've run PoCs for the three main competitors in the smart AI auditing space

why? at Berachain we spend a *lot* of money on smart contract audits. could an AI auditor drive down cost, risk, or--preferably--both?

while there was some variance between the three tools, the number of security bugs found was small, none of the TPs was more severe than a LOW, and all suffered from poor FP rates.

this puts me in the position of having to recommend whether to spend more time and/or money going down this rabbit hole

is it possible the tools are great, and our code base is an edge case? that's a valid working hypothesis as at least a part explanation

will AI security auditing tools continue to get better? pretty clear the answer is yes--open question when their effectiveness plateaus

is there possibly greater PMF for smart contract AI auditing tools at lower-budget web3 startups? Also yes--if you can't afford great security auditors, AI auditing tools may be all you can afford

jury is still out, but I do think--watch this space

I expect in the next 6-18 months AI auditing tools will become a de facto standard as part of a shift left strategy (i.e. AI audits as you write code, in the IDE, at each commit, etc)

b a b a r@_babarhashmi·24 Nov

benchmarking we talked about earlier 👀 @Joeyy_0x

Andrej Karpathy@karpathy

As a fun Saturday vibe code project and following up on this tweet earlier, I hacked up an **llm-council** web app. It looks exactly like ChatGPT except each user query is 1) dispatched to multiple models on your council using OpenRouter, e.g. currently: "openai/gpt-5.1", "google/gemini-3-pro-preview", "anthropic/claude-sonnet-4.5", "x-ai/grok-4", Then 2) all models get to see each other's (anonymized) responses and they review and rank them, and then 3) a "Chairman LLM" gets all of that as context and produces the final response. It's interesting to see the results from multiple models side by side on the same query, and even more amusingly, to read through their evaluation and ranking of each other's responses. Quite often, the models are surprisingly willing to select another LLM's response as superior to their own, making this an interesting model evaluation strategy more generally. For example, reading book chapters together with my LLM Council today, the models consistently praise GPT 5.1 as the best and most insightful model, and consistently select Claude as the worst model, with the other models floating in between. But I'm not 100% convinced this aligns with my own qualitative assessment. For example, qualitatively I find GPT 5.1 a little too wordy and sprawled and Gemini 3 a bit more condensed and processed. Claude is too terse in this domain. That said, there's probably a whole design space of the data flow of your LLM council. The construction of LLM ensembles seems under-explored. I pushed the vibe coded app to github.com/karpathy/llm-c… if others would like to play. ty nano banana pro for fun header image for the repo
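The three-stage flow described in the quoted post (answer, anonymized peer ranking, chairman synthesis) can be sketched without any network calls. This is not the code from the linked repo: the models are plain Python callables standing in for API calls, the Borda-style point scheme is an assumption, and every name (`run_council`, `prefer_longest`, `pick_top`) is hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple

Answerer = Callable[[str], str]
# A ranker sees the query plus anonymized answers and returns indices, best first.
Ranker = Callable[[str, List[str]], List[int]]
Chairman = Callable[[str, List[str], List[int]], str]


def run_council(
    query: str,
    answerers: Dict[str, Answerer],
    rankers: Dict[str, Ranker],
    chairman: Chairman,
    seed: int = 0,
) -> Tuple[str, Dict[str, int]]:
    # Stage 1: every council member answers the query independently.
    names = list(answerers)
    responses = [answerers[name](query) for name in names]

    # Anonymize: shuffle the order and pass only the text, never the model name.
    order = list(range(len(names)))
    random.Random(seed).shuffle(order)
    anonymous = [responses[i] for i in order]

    # Stage 2: each member ranks the anonymized answers; aggregate with
    # Borda-style points (assumed scheme: best gets n-1 points, worst gets 0).
    points = [0] * len(anonymous)
    for rank in rankers.values():
        for position, index in enumerate(rank(query, anonymous)):
            points[index] += len(anonymous) - 1 - position

    # Stage 3: the chairman sees answers and aggregate points, writes the final reply.
    final = chairman(query, anonymous, points)
    scores = {names[order[j]]: points[j] for j in range(len(order))}
    return final, scores


# Stub council: fixed answers, rankers that simply prefer longer text.
def prefer_longest(query: str, answers: List[str]) -> List[int]:
    return sorted(range(len(answers)), key=lambda i: -len(answers[i]))


def pick_top(query: str, answers: List[str], points: List[int]) -> str:
    return answers[points.index(max(points))]


answerers = {
    "a": lambda q: "short",
    "b": lambda q: "a much longer answer",
    "c": lambda q: "medium one",
}
rankers = {"a": prefer_longest, "b": prefer_longest}

final, scores = run_council("What is reentrancy?", answerers, rankers, pick_top)
print(final)
print(scores)
```

Swapping the stubs for real model calls only changes the callables; the anonymization step is what lets a model rank another model's answer above its own without knowing whose it is.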
