Anto

556 posts

@blocksec

Security Eng @eigenlayer, prev @coinbase

Mempool · Joined April 2022

2K Following · 1.8K Followers

Pinned Tweet

Anto@blocksec·20 Nov

We won 1st!! @usmannk @juancito @DrasticWM & adriro. Thank you @wonderland for organizing. We had so much fun!

Wonderland@Wonderland

The Wonderland CTF was a blast! Huge congrats to all the teams, especially “STACK TOO DEEP”, “NADA ESPECIAL” and “SECSEE”. Oh, also: apply.wonderland.xyz 👉👈

Anto@blocksec·14 May

@ar0cket1 Thanks for sharing :) will run some experiments

ar0cket1@ar0cket1·14 May

@blocksec Well, RL scaling works with a basically unlimited upper bound until you saturate your capacity. Scaling the base helps the RL scaling do better. The new Mythos checkpoint is more RL scaling, for instance.

ar0cket1@ar0cket1·14 May

imo this big jump is not really a research jump from the current line but just scaling. I think that we would very much be on the line (or not as far off), if GPT 5.5 and Mythos were the same size as their previous models (5.4 and Opus)

AI Security Institute@AISecurityInst

Our evaluations show that frontier AI's cyber capabilities are advancing quickly. The length of cyber tasks frontier models can complete has been doubling every few months, and this rate has become faster over time, with recent models exceeding our previous trends. 🧵

Anto@blocksec·14 May

@ar0cket1 I wonder if the same base model with RL can hillclimb or does it need a larger base too :)

ar0cket1@ar0cket1·14 May

@blocksec More of a capacity thing than data. GPT 5.5 and Mythos are larger than the previous trend and thus come with stronger-than-trend improvements. You can’t TTS (test-time scale) a weaker model like Kimi to do nearly as well as a stronger model. TTS is bounded by raw capabilities.

Anto@blocksec·14 May

@ar0cket1 Okay, so a larger pre-train. When backtesting on larger codebases, I did see improvements in performance with TTS. My intuition was that the pretrain had enough data and TTS helps access that?

ar0cket1@ar0cket1·14 May

@blocksec This isn’t what I’m saying. I’m saying that the shift off the trend line is rather from just the model parameter-size scaling that has happened recently. Also, Kimi likely wouldn’t come close to this. You can only test-time scale a little until you saturate it.

Anto@blocksec·11 May

Opus this, GPT that bro, it all depends on your task at hand and whether it is in the model's distribution. If it is, it's great; if it isn't, it's just okay. There is no overall better model; it depends on your task.

Anto@blocksec·11 May

@hrkrshnn This framing makes sense, but I would imagine there are a lot more non-Claude Code people out there? I would think that’s the 90%.

Hari@hrkrshnn·11 May

OpenClaw isn't hype. There was an entirely different audience that wasn't on Claude Code or Codex, for whom OpenClaw gave the right interface. Both Anthropic and OpenAI shipped products that were inspired by OpenClaw.

BURKOV@burkov

This is what a useless hype lifecycle looks like.

Anto@blocksec·4 May

An email is a single logprob update. A conversation is online learning with back-prop.

Anto@blocksec·1 May

@simonw @_catwu Yes

Simon Willison@simonw·1 May

@_catwu Is the model behind that regular Opus 4.7?

cat@_catwu·1 May

Claude Security is now in public beta, built into Claude Code on the web. Point it at a repo, get validated vulnerability findings, and fix them in the same place you're already writing code claude.com/product/claude…

Anto@blocksec·23 Apr

Cool idea, but there are a few ways to get past it: root access on your mobile device or just point the camera at a screen 📸

Succinct@SuccinctLabs

Today, we're launching ZCAM, an iPhone camera app to Prove What’s Real. ZCAM cryptographically signs photos and videos at the moment of capture. Anyone can independently verify the content came from a real device and hasn't been altered or AI-generated.

Anto@blocksec·8 Apr

To stop reward hacking on increasingly complex tasks, frontier labs will build secure systems.

Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

Anto@blocksec·7 Apr

@andreamichi This is awesome!

Andrea Michi@andreamichi·7 Apr

I’m excited about our results and very proud of the challenges our small (but mighty) team solved. We’ve already tackled a restricted context window that required us to learn summarization during training, and learned to show generalization from smart contracts to all types of vulnerabilities. depthfirst.com/post/dfs-mini1…

Andrea Michi@andreamichi·7 Apr

This week @depthfirstlabs introduced dfs-mini1, a security model trained via reinforcement learning to detect vulnerabilities in smart contracts. The model achieves Pareto optimality on OpenAI’s EVMBench Detect and SOTA at pass@8, beating frontier models at a fraction of the cost.

Anto@blocksec·1 Apr

@Montyly @0xteddav AGI 🚀

Josselin Feist@Montyly·1 Apr

Today I am releasing IsItVulnerable, a new tool I’ve been working on for the past several months: github.com/montyly/isItVu… It builds on recent LLM progress and over a decade of experience building security tools. I developed a new technique that combines abstract interpretation with machine learning. The key insight is that this method abstracts the intelligence away entirely. I call it Abstract Intelligence, or AI. The result is a major breakthrough in program analysis: IsItVulnerable finds all bugs with 100% recall. Yes, all bugs. Fully guaranteed. I have tested it extensively, and it has never failed. The results are honestly incredible. April 1, 2026 marks a turning point for security, and the industry will never be the same. My DMs are open for investors. Entry ticket starts at $500k.

Anto@blocksec·26 Mar

@eigencloud 👀

EigenCloud@eigencloud·26 Mar

if you build agents, tomorrow is a good day to be online

Anto@blocksec·21 Mar

Iykyk

sunny madra@sundeep

“If your $500K engineer isn’t burning at least $250K in tokens, something is wrong.”

Anto retweeted

Devcon 8 | Mumbai, India 🇮🇳@EFDevcon·25 Feb

Music, food and security engineering. What do they all have in common? 🇮🇳 @blocksec from @eigencloud, tells us more 👇

Anto@blocksec·20 Feb

I built this RL env for vulnerability detection tasks on smart contracts using @hud_evals last night: github.com/antojoseph/sol…. You can improve open-source model performance with GRPO and measure it with evals!

Anto@blocksec

All the big labs have their AI security products at this point. @OpenAI has Aardvark, aka Codex Security. @AnthropicAI just announced Claude Code Security. The next frontier is RFT. More on that soon!

Anto@blocksec·20 Feb

All the big labs have their AI security products at this point. @OpenAI has Aardvark, aka Codex Security. @AnthropicAI just announced Claude Code Security. The next frontier is RFT. More on that soon!

Claude@claudeai

Introducing Claude Code Security, now in limited research preview. It scans codebases for vulnerabilities and suggests targeted software patches for human review, allowing teams to find and fix issues that traditional tools often miss. Learn more: anthropic.com/news/claude-co…

Anto@blocksec·18 Feb

New benchmark to test your web3 security agents against just dropped! Thanks to @paradigm and @OpenAI. You can find the bench here: github.com/openai/frontie…

OpenAI@OpenAI

Introducing EVMbench—a new benchmark that measures how well AI agents can detect, exploit, and patch high-severity smart contract vulnerabilities. openai.com/index/introduc…

Anto@blocksec·13 Feb

@hrkrshnn @OpenAI

Hari@hrkrshnn·13 Feb

Thanks @OpenAI!

Anto@blocksec·8 Feb

Frontier models will become just a model in 3 to 6 months. Cybersecurity defenders have to use the frontier, or they risk a hack they could have avoided! Cybersecurity for critical industries has changed forever!

Anto@blocksec

Developers and security professionals doing cybersecurity-related work may be impacted! To use models for potentially high-risk cybersecurity work, users can verify their identity at chatgpt.com/cyber
