I pointed Claude Opus at Chrome and told it to build a full V8 exploit for Discord.
A week of back-and-forth pulling it out of dead ends. 2.3B tokens. $2,283 in API costs, and it popped a shell.
hacktron.ai/blog/i-let-cla…
Solo founding is different.
We've built the place for it.
Applications open for @solofounders program's 4th cohort.
• 10 solo founders building "solo, together"
• 3 months in SF (+ optional housing)
• work closely with me + alumni
• $100k investment
Apply!
Grok 4.1 VS GPT-5.3 Codex in CivBench LIVE
Which LLM will build the dominant empire??
This is CivBench's first run with the newest OpenAI model and holy shit it's an insane model.
While only 20 turns in, it looks like it's pulling ahead, with nearly 2x Grok's numbers in both treasury and the tech race.
🧵 below has some details from yesterday's matches featuring Anthropic's models
What happens when you let Claude or ChatGPT run a government?
I built CivBench to find out.
Every day, frontier AI models compete head to head in strategy games.
Here’s what our first set of matches revealed 🧵
New art project.
Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further.
gist.github.com/karpathy/8627f…
Over the past month, Pwno has autonomously discovered 29 vulnerabilities across Linux, FFmpeg, V8, Firefox, WebKit, Redis, and PostgreSQL, including 15 OOBs and 6 UAFs.
Most of these bugs are fixed; some are still in the disclosure process. You can see them at bugs.pwno.io
It is really a pay-off moment for me. The idea of Pwno started out by simply harnessing gdb for solving CTF pwn challenges, exactly two years ago. Eight months ago, after deciding to pivot from a campus startup I had worked on for a couple of months, I decided to pick up what brought me to this crazy world of computer systems in the first place, binary security, and chose the most interesting problem I could think of: making AIs that can find cool memory bugs.
I am always saying we're doing research, but the fact is that most of the time things don't work out. It takes a lot of learning, trial and error, rebuilding things from scratch, and most importantly, in some way believing that things could work out even at times when it sounds stupid to say.
It always amazes me how we can reinterpret systems that are entirely created by us in a completely different way. We'll hopefully find and patch more interesting bugs that in some way help the internet a little :)
We’re open-sourcing pwno-backend, our previous production backend architecture, which covers everything from uploading a binary to k8s ingress and went through six months of iteration, as Pwno heads in a new direction.
github.com/pwno-io/pwno-b…