Malte Ubl

59.3K posts

@cramforce

Self-driving infrastructure @Vercel CTO. Immigrant 🇺🇸/🇩🇪/acc

San Francisco · Joined July 2008
902 Following · 47.8K Followers
Pinned Tweet
Malte Ubl
Malte Ubl@cramforce·
// I was young and needed to ship.
Hydra
Hydra@FatalHydra·
@vercel_dev Anyone have any cost estimates for using this? What was the project size/complexity like?
Vercel Developers
Vercel Developers@vercel_dev·
Introducing deepsec, an open source coding security harness. • CLI-first • Sandbox-based scaling • Pluggable coding agents • Designed for large-scale repos • Use AI Gateway or your own subscription After months of successful internal use, we put it to the test on some of the largest open source codebases. vercel.com/blog/introduci…
Rekt404
Rekt404@Rektx404·
@rauchg @iamsahaj_xyz Does it only review a codebase, or can this agent scan a complete infra in prod?
Guillermo Rauch
Guillermo Rauch@rauchg·
𝚗𝚙𝚡 𝚍𝚎𝚎𝚙𝚜𝚎𝚌 We're introducing an open-source agent orchestrator for deep security reviews. We built it for internal use, and after running it against some major OSS projects, we gained conviction to share it with the world. Coding agents can now find critical vulnerabilities in minutes that would take teams of people months (if they can spot them at all). Since 𝚍𝚎𝚎𝚙𝚜𝚎𝚌 is optimized to work with Vercel Sandbox, you can effectively harness the power of thousands of agents scrutinizing your codebase in parallel. I encourage you to try this on your repositories. BTW: If you run an OSS project and want us to sponsor a run, my DMs are open.
Vercel Developers@vercel_dev

Introducing deepsec, an open source coding security harness. • CLI-first • Sandbox-based scaling • Pluggable coding agents • Designed for large-scale repos • Use AI Gateway or your own subscription After months of successful internal use, we put it to the test on some of the largest open source codebases. vercel.com/blog/introduci…

Malte Ubl
Malte Ubl@cramforce·
@jason_haugh @rauchg The agent operates on entry point files, and each concurrent agent gets a disjoint set of files to operate on.
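A minimal sketch of the idea in this reply: split the entry point files so that no two concurrent agents ever receive the same file. The function name `partition_disjoint`, the round-robin strategy, and the example paths are assumptions made for illustration; the thread does not say how deepsec actually assigns files.

```python
from __future__ import annotations

from typing import Sequence


def partition_disjoint(files: Sequence[str], workers: int) -> list[list[str]]:
    """Split files into `workers` disjoint shards (hypothetical round-robin)."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    # Sort and de-duplicate so the assignment is deterministic and no file
    # can land in more than one shard.
    unique = sorted(set(files))
    shards: list[list[str]] = [[] for _ in range(workers)]
    for index, path in enumerate(unique):
        shards[index % workers].append(path)
    return shards


# Hypothetical entry point files, not taken from any real repository.
entry_points = [
    "app/api/login.ts",
    "app/api/users.ts",
    "app/page.tsx",
    "lib/auth.ts",
    "lib/db.ts",
]
shards = partition_disjoint(entry_points, workers=2)
for worker_id, shard in enumerate(shards):
    print(worker_id, shard)
```

Because each file appears in exactly one shard, two agents cannot report the same finding from the same entry point, which is one way to avoid needing live coordination between them.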
Jason Haugh
Jason Haugh@jason_haugh·
@rauchg Quick question: do the parallel agents see each other's findings live, or is dedup a post-pass? Always the part that wobbles for me.
Malte Ubl
Malte Ubl@cramforce·
Using the claude/codex sub is a nice side benefit. TBH I mostly didn't want to have to worry about the agent part and so just used off-the-shelf. For our internal use the subscription part was irrelevant because
- We don't use a subscription
- I literally spend tens of thousands of dollars on it which would have broken any subscription 😅
Armin Ronacher ⇌
Armin Ronacher ⇌@mitsuhiko·
@cramforce Is part of the motivation here to be able to use the claude sub or because the architecture worked better with it?
Malte Ubl
Malte Ubl@cramforce·
@patrickssons @rauchg We're seeing 10-20%, but closer to 10%. This does include the type of FP that is "because of some layer invisible in the code, such as the firewall setup, this cannot be exploited". The revalidate step allows for having the agent double-check its work.
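A rough sketch of what a revalidation pass and the false-positive arithmetic could look like. Everything here is hypothetical: the `Finding` type, the `revalidate` helper, and the stubbed `recheck` callback are illustrative stand-ins, not deepsec's API. In a real run the second check would be the agent re-examining its own finding.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    file: str
    summary: str


def revalidate(
    findings: list[Finding], recheck: Callable[[Finding], bool]
) -> list[Finding]:
    """Keep only the findings that a second pass confirms."""
    return [finding for finding in findings if recheck(finding)]


def false_positive_rate(reported: int, false_positives: int) -> float:
    """Share of reported findings that turned out not to be exploitable."""
    if reported == 0:
        return 0.0
    return false_positives / reported


# Hypothetical findings; the stub below stands in for the agent's second look.
findings = [
    Finding("lib/db.ts", "SQL injection"),
    Finding("app/api/users.ts", "missing auth check"),
    Finding("lib/auth.ts", "timing attack"),
]
confirmed_files = {"lib/db.ts", "app/api/users.ts"}
kept = revalidate(findings, lambda finding: finding.file in confirmed_files)

# At the "closer to 10%" end, 50 reported findings would include about 5 FPs.
rate = false_positive_rate(reported=50, false_positives=5)
```

Note that a code-only recheck cannot catch the category mentioned in the reply where an invisible layer such as a firewall blocks exploitation; those still need human triage.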
Patrick
Patrick@patrickssons·
@rauchg How does it handle false positives? Most security agents I've tried either miss everything or flag every config edit. Curious what your noise floor is on a healthy codebase.
Malte Ubl
Malte Ubl@cramforce·
@zencoderai @vercel_dev See the security model documented here. I'd run deepsec on a VM if you are afraid of getting prompt-injected by your own code. github.com/vercel-labs/de… (#security-model-of-deepsec-itself)
zencoderai
zencoderai@zencoderai·
@vercel_dev What's the call you ended up making on which actions get fully sandboxed vs which ones just need a confirmation gate before running? Sandboxing + pluggable agents covers a lot of the agent-on-large-repo failure modes.
Malte Ubl
Malte Ubl@cramforce·
@izadoesdev I really have no idea how this could possibly happen unless there is a bug in the Claude Agent SDK. @delba_oliveira 👋🏼 Any chance you have an idea?
Iza
Iza@izadoesdev·
@cramforce nope still got plenty of usage left, no other errors, just logs out specifically when I run deepsec, everything else still works
Malte Ubl
Malte Ubl@cramforce·
Today we're open-sourcing `deepsec`: a security harness powered by coding agents. We've been testing it for a few months on our internal code bases as well as open-source applications from customers and partners. For the latter group we have privately shared the results, so issues can be fixed.
- It actually works. I recommend giving it a try. The dream of Mythos in CLI-form.
- You can run it on your laptop with your existing claude or codex subscription.
- For large repos it can take a very long time to run. For this it supports fanout to worker sandboxes. I've been running it on 1000 cores+ to get through a lot of code quickly.
Vercel Developers@vercel_dev

Introducing deepsec, an open source coding security harness. • CLI-first • Sandbox-based scaling • Pluggable coding agents • Designed for large-scale repos • Use AI Gateway or your own subscription After months of successful internal use, we put it to the test on some of the largest open source codebases. vercel.com/blog/introduci…

Malte Ubl
Malte Ubl@cramforce·
@izadoesdev Definitely not something we have observed so far. Are you getting any other type of error message? With such a large batch you'd be expected to run out of quota quickly and would need to switch to pay-as-you-go
Malte Ubl
Malte Ubl@cramforce·
@kyl3kan @rauchg Yeah, it's completely generic. You may need to make custom matchers tho (or have your agent do it) github.com/vercel-labs/de… I'm also happy to take PRs with generically useful matchers for new programming domains.
Kyle
Kyle@kyl3kan·
@rauchg Could it run a mobile app scan?
Malte Ubl
Malte Ubl@cramforce·
@izadoesdev And are you actually logged out if you try claude after? Or did the claude that deepsec invoked just fail to pick up your subscription?
Iza
Iza@izadoesdev·
@cramforce it just refuses to run
Iza tweet media
Malte Ubl
Malte Ubl@cramforce·
@izadoesdev Not really. What does logging out mean? What's the output?
Iza
Iza@izadoesdev·
@cramforce claude code keeps logging out every time I run it, this never happens with anything else, including sentry's warden, are you guys aware of any issue that could be causing it?
Malte Ubl
Malte Ubl@cramforce·
@thesherlocker Good question! It's addressed in the post. deepsec has a classifier that checks for refusals after the task is done. For both Opus 4.7 and GPT 5.5 the refusal rate is under 1%.
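To make the post-hoc refusal check concrete, here is a small sketch that scans finished agent transcripts and computes a refusal rate. The keyword heuristic, the `REFUSAL_MARKERS` phrases, and the sample transcripts are all assumptions for illustration; the thread does not describe how deepsec's classifier is implemented, and it may well be model-based rather than string matching.

```python
from __future__ import annotations

# Hypothetical marker phrases; a stand-in for whatever the real classifier uses.
REFUSAL_MARKERS = (
    "i can't help with",
    "i cannot assist",
    "i won't be able to",
)


def looks_like_refusal(transcript: str) -> bool:
    """Flag a finished transcript if it contains a known refusal phrase."""
    text = transcript.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(transcripts: list[str]) -> float:
    """Fraction of completed tasks whose output was classified as a refusal."""
    if not transcripts:
        return 0.0
    refused = sum(looks_like_refusal(t) for t in transcripts)
    return refused / len(transcripts)


# Made-up transcripts, checked only after each task has finished.
runs = [
    "Found an SQL injection in lib/db.ts.",
    "I cannot assist with exploiting this system.",
    "No issues found in app/page.tsx.",
    "Reviewed lib/auth.ts; session tokens are compared safely.",
]
rate = refusal_rate(runs)
```

Running the check after the task, rather than during it, means a refused file can simply be flagged for a retry or for another agent without interrupting the rest of the batch.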
Sherlock
Sherlock@thesherlocker·
@cramforce What about running into model safeguards that might "prevent" such actions? Is that handled within deepsec?
James Perkins
James Perkins@jamesperkins·
Shoutout to the Vercel team. This is absolutely fantastic, and we are now integrating it into our developer workflow. We've tried every tool we could, and even wrote our own, and none of them surfaced as many positives as deepsec. So good I let them quote me.
Vercel Developers@vercel_dev

Introducing deepsec, an open source coding security harness. • CLI-first • Sandbox-based scaling • Pluggable coding agents • Designed for large-scale repos • Use AI Gateway or your own subscription After months of successful internal use, we put it to the test on some of the largest open source codebases. vercel.com/blog/introduci…

Malte Ubl
Malte Ubl@cramforce·
I'm gonna ship an open-source project today or Monday. Finishing touches on the blog post are landing. I actually have two in the pipeline `just-bash`, `chat`, …
Malte Ubl
Malte Ubl@cramforce·
@elie2222 yeah, because when you follow OpenAI's instructions to download codex, then the bar does not go up
Elie Steinbock — oss/acc
@cramforce But it does, doesn’t it? I can see the bar for Codex this month vs last month. I believe you that it’s not a good measure of Codex vs Claude.
Malte Ubl
Malte Ubl@cramforce·
@elie2222 I'm sure that is true! It's just that this particular set of data does not measure that growth