Kyle Bhiro
224 posts

Kyle Bhiro reposted

We’ve raised 25M to build the world’s first Personal Intelligence.
Introducing Vellum: AI that belongs to you.
My assistant @ash_vellum has his own X (like grok), tag him and he'll answer.
Kyle Bhiro reposted

Last week I demoed Apex, our open source AI pentesting agent, at AI Agents Demo Night in NYC at The Refinery at Domino. Live on stage, we hacked a financial institution in under 3 minutes.
Apex doesn't just scan for textbook vulnerabilities. It digs into your infrastructure, finds what's exposed, maps out business logic flows that attackers could abuse, and exploits novel attack paths autonomously.
I showed it discover an FTP server, identify write access, and deface the target site, all live in front of a packed room. No scripts. No playbooks. Just a prompt and a target.
This is what attackers can do now. The question is whether you find the holes first.
Demoed alongside @cognition, @clay, @justworks, @normativeai, North Cloud, and @trywindmill. Huge thanks to @TechNYC, @obviously_nyc, The Refinery at Domino, and our team at @runpensar for putting this together.
Apex is open source. Link below, go break some things (legally).
Kyle Bhiro reposted

If you are an open source maintainer and are worried about what's going on in security - we @runpensar want to sponsor continuously securing your project.
Reach out to me via DM or email us at team(at)pensarai(dot)com
Amjad Masad @amasad
If finding security flaws is fully automated with frontier models à la Mythos, then GitHub should have a metric, like stars, showing how much compute is spent securing/hardening an open-source package. Example: 📦 linus/linux ⭐️ 200k 🦾 $239M Only way OSS can be trusted.
Kyle Bhiro reposted

This will keep happening with increased frequency. So many hypergrowth startups relying on AI to build faster, accumulating financial crisis levels of security debt (we’ll hire someone to secure it later!)
AI native startups are now a critical and least defended node in the enterprise attack surface.
Dominic Alvieri @AlvieriD
Mercor AI has allegedly been breached by Lapsus. 939GB of source code, 4TB of data in total. All data from their TailScale VPN. @mercor_ai
Kyle Bhiro reposted

Automated security companies are going to go crazy after Axios, Mercor, and now Claude Code.
@runpensar is one I can think of
Chaofan Shou @Fried_rice
Claude Code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip
Kyle Bhiro reposted

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.
The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise.
This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now.
Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence
If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
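For the lockfile audit, a minimal sketch of what a check could look like, assuming an npm `package-lock.json` in the v2/v3 layout (a top-level `packages` map keyed by install path). The indicators come straight from the post above; the names `FLAGGED`, `package_name`, `find_flagged`, and `audit_lockfile` are made up for this example, and older v1 lockfiles (nested `dependencies`) are not handled.

```python
import json
from pathlib import Path

# Indicators taken from the post above (hypothetical structure for this sketch).
# A set lists specific bad versions; None means any version is suspicious.
FLAGGED = {
    "axios": {"1.14.1"},
    "plain-crypto-js": None,
}


def package_name(lock_key: str) -> str:
    """Map a lockfile key like 'node_modules/a/node_modules/axios' to 'axios'."""
    return lock_key.rpartition("node_modules/")[2]


def find_flagged(lock: dict, flagged: dict = FLAGGED) -> list:
    """Return sorted (lock_key, version) pairs for flagged entries in a parsed lockfile."""
    hits = []
    for key, meta in lock.get("packages", {}).items():
        name = package_name(key)
        if name not in flagged:
            continue
        bad_versions = flagged[name]
        version = meta.get("version", "")
        if bad_versions is None or version in bad_versions:
            hits.append((key, version))
    return sorted(hits)


def audit_lockfile(path: str) -> list:
    """Load a package-lock.json from disk and report flagged entries."""
    return find_flagged(json.loads(Path(path).read_text(encoding="utf-8")))


if __name__ == "__main__":
    for key, version in audit_lockfile("package-lock.json"):
        print(f"FLAGGED: {key} @ {version}")
```

Pinning itself is a `package.json` change: declaring an exact axios version instead of a `^` or `~` range keeps a fresh install from resolving to a newer release. An empty result from a check like this only means these two indicators were not found; it is not proof that a project is clean.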

Added Argus validation benchmark (from @runpensar) to BoxPwnr. 14 platforms supported so far and growing.
I'm doing a first pass at it with GLM-5 (free on NVIDIA NIM). It's about halfway through: so far it has solved 16 of the 27 attempted, with 33 more to go.
github.com/pensarai/argus…
0ca.github.io/BoxPwnr-Traces…

Kyle Bhiro reposted

We've been quiet the last few months. That was intentional.
We've been working directly with real companies, real systems, and real constraints - making sure what we're building doesn't just work in controlled environments, but is mission-critical ready.
Today, we're showing what we've been building.
Introducing Pensar Apex - an AI-powered penetration testing agent that runs directly in your terminal.
This isn't a wrapper or a chatbot. It's an autonomous agent that explores an application like a real tester, reasons about vulnerabilities, and chains multi-step attack paths. All from a single command.
We've been dogfooding Apex on our own codebase for months, and enterprise customers have been running our cloud-hosted version against their environments. The results have sharpened the product considerably - nothing teaches you what "reliable" actually means like staking your own security on it.
But the real breakthrough wasn't just building the agent - it was building a reliable validation system around it. One that forces the agent to deterministically verify its findings, continuously test its own hypotheses, and prove exploitability before reporting anything.
Because agents are easy to demo, but trustworthy agents are hard to build. That shift changed everything for us. Less guessing, more proving. Less noise, more signal.
And via our cloud-hosted offering, it can slot directly into your CI/CD pipeline - giving you continuous, validated pentesting results on every commit. Not periodic assessments that go stale the moment code changes. Continuous proof that your application holds up, running alongside your tests.
This is what we think the new paradigm looks like: pentesting that lives in your development workflow, not outside of it.
If you're a developer, you can run a pentest in minutes. If you're a security engineer, you can push it much further.
Try it, break it, and tell us where it falls short.
We've got a lot more coming.

@SinaHartung Mr. Kevin Mandia's old company, Mandiant (now part of Google): they sent an email to their customers

The promise of the AI-GRC class has yet to be fulfilled. Most excitingly, using agents for evidence collection (automated paperwork gathering, imo) is actually a good lift, but beyond that, keeping humans in the loop for anything that requires attestation or judgement feels involved (and probably should be).

My take on Delve and experience after 3 audit cycles as a founder ->
We've been a decently happy customer of @secureframe the last few years but were considering migrating to @getdelve this January (took some sales calls, got some data in to feel the experience). Here's my take on the situation:
I've done 3 audit cycles now where my team (and originally just me) spent weeks every quarter pulling data, filling out spreadsheets, etc to adhere to SOC 2 controls. This was followed by a painful 2-3 month audit period where I felt auditors and the underlying platform were heavily out of sync - auditors would request things that we had already synced/uploaded into the compliance tool, escalatory meetings would happen just to resolve to 'that is ok, sorry we missed it'.
The promise of Delve, for me, was AI-native compliance.
A platform that didn't just do 80% of the work (controls auto-synced to cloud infra, evidence pulled via vendor APIs) but got us to 95%: where agents did the annoying stuff that the other platforms couldn't handle (auto-checking access roles in every vendor, filling out Excel sheets first and then handing them over for human verification). Where auditors actually understood the platform holding the evidence they were auditing.
I was really on board with the vision of Delve that their team and sales folks pitched me.
My gripe with all the existing vendors (@TrustVanta, @DrataHQ, @secureframe) was this:
They built their tech stack and process pre-AI. I felt their product roadmap moved very slowly, with lots of small bugs everywhere, and so much silly stuff has to be done manually. One hour my team is orchestrating coding agents and the next hour we're huddled in a call filling out an Excel sheet. It felt like these existing platforms were 'on the wrong exponential'.
We're obviously sticking with Secureframe for now, but the above facts still haven't changed. I'm certain someone will disrupt the space by achieving the above vision. It's unfortunate that it probably won't be Delve.

@vtahowe @ProulxKerem Allie, I hold your opinions in high regard. Excited to discuss it more next week 👀 See you at RSA

@ProulxKerem @kylebhiro AI models are getting so good the average attacker will soon be capable of finding 0 days. Defenders are truly falling behind and the need for Apex is readily apparent. Congrats on your launch!!🚀
Kyle Bhiro reposted

Our autonomous pentesting agent just outperformed the two most popular open source offensive security agents on a benchmark of 60 modern, defense-enabled web apps.
Battle-tested in production against our customers' environments from startups to financial institutions, Apex consistently finds and exploits critical vulnerabilities other agents and humans miss.
Today we're releasing it open source alongside our internal benchmarks.

How do you stand out in the age of agents?
Now that every website has cool animations.
Now that every meal is a bowl.
==the antidote is soul==
The best meals are made with love.
The best garmentos obsess over button details.
Good Design is taste. It conveys a message. It’s human.
We’ve redesigned promptlayer.com and we’re really proud of the result.
It’s inspired by my favorite restaurant in SF.
Every icon is hand drawn.

@ProulxKerem In a world of bold claims and blatant lies, the transparency is refreshing
Kyle Bhiro reposted

There’s been a lot of criticism of MCP lately, and I've felt the sentiment myself.
But the discussion is circling a deeper shift: APIs are becoming the UX for agents.
Humans tolerate messy APIs because we read docs, infer intent, and adapt. Agents don’t. They rely almost entirely on the semantic structure you expose.
So the real design question becomes "how much meaning lives in your schema?"
The better the interface communicates the system, the less intelligence the agent needs to use it.
Kyle Bhiro reposted

We’re talking about sandboxes and security today at @daytonaio Compute!
Great to chat with @shcallaway on how his new company @sazabi is using sandboxes to build the future of AI native observability

