Caleb Gross

1.5K posts

@noperator

ai for security

Joined October 2009

647 Following, 2.7K Followers

Pinned Tweet

Caleb Gross@noperator·10 Feb

1/ Agentic LLMs can automate vuln detection. Very exciting, but doesn't address the hardest part (imo) of vuln research: prioritization. Can we reliably explore the search space and separate signal from noise? I wrote a paper (and OSS tool) to solve this. arxiv.org/pdf/2512.06155

218

103.4K

Caleb Gross@noperator·6h

@kusonooyasumi need specs on your inference rig

oyasumi@kusonooyasumi·9h

ssd acquired for the ai machine doing a fresh os install since there is practically nothing on it gonna be using it as an inference machine and doing some vibe coding today

158

Caleb Gross retweeted

goniz@gonizahavy·22h

My MLX Vulkan backend just passed both CPP AND Python test suites!!

22.7K

Caleb Gross@noperator·14h

I opted for the framework desktop because:
- it's cheaper than the mac/spark options
- I want to support framework as a company
- I want to support AMD ecosystem as an alternative to being locked into NVIDIA (and I think support for that platform is rapidly improving)
x.com/noperator/stat…

Caleb Gross@noperator·14h

@AlizTheHax0r the tldr is that you can run small-ish models fast on a GPU (3090, etc.) or large-ish models slow on unified memory (mac ultra, dgx spark, framework desktop, etc.). biggest difference between those two is the amount of (V)RAM and the bandwidth of that memory.

Aliz (they/them pls)@AlizTheHax0r·1d

Well, I'm going to end up splashing out on a big graphics card. I'm just testing out some different hardware on runpod.io, but since the 4090 is absurdly price-gouged I think I'll end up shelling out a small fortune for a 5090.

240
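
A rough way to see the trade-off in the reply above: during generation, each new token has to read roughly all of the model weights from memory, so decode speed is bounded by bandwidth divided by model size, while capacity decides whether the model fits at all. The sketch below is a back-of-envelope estimate only. The function names, the model footprints, and the hardware figures (24 GB at about 900 GB/s for a discrete GPU, 128 GB at about 250 GB/s for a unified-memory box) are illustrative assumptions, not measured specs from the thread.

```python
def fits_in_memory(model_gb: float, memory_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if the weights plus a rough allowance for cache/runtime fit in memory."""
    return model_gb + overhead_gb <= memory_gb


def max_decode_tps(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/s, assuming every token reads all weights once."""
    return bandwidth_gb_s / model_gb


# Hypothetical machines: (memory in GB, bandwidth in GB/s). Assumed round numbers.
machines = {
    "discrete GPU": (24.0, 900.0),
    "unified memory": (128.0, 250.0),
}

# Hypothetical model footprints in GB after quantization.
models = {"small-ish": 15.0, "large-ish": 70.0}

for machine, (memory_gb, bandwidth_gb_s) in machines.items():
    for model, model_gb in models.items():
        if fits_in_memory(model_gb, memory_gb):
            tps = max_decode_tps(bandwidth_gb_s, model_gb)
            print(f"{machine}: {model} model fits, at most ~{tps:.0f} tok/s")
        else:
            print(f"{machine}: {model} model does not fit")
```

This is an upper bound, not a benchmark: real throughput is lower, and mixture-of-experts models read only their active experts per token, so the bound is pessimistic for them.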

Caleb Gross@noperator·14h

@antirez @AMD amazing. my framework desktop arrives today 🙌

575

antirez@antirez·15h

It's official, @AMD is kindly sending me Strix Halo so I'll be able to support ROCm and the Halo in particular for DS4. This means I'll be able to merge the "rocm" branch and the other optimizations the community is doing, and to make sure it does not break after I change stuff.

685

23.7K

Caleb Gross@noperator·14h

betting that GitHub's "Download ZIP" button gets _way_ more clicks since the advent of LLMs. I very frequently drop entire zipped codebases in front of ChatGPT.

257

Caleb Gross@noperator·1d

@AlizTheHax0r also depends on what you're trying to accomplish. if you're already ready to spend a lot ("small fortune"), a 128GB unified-memory inference machine could be appropriate if you want to run frontier-ish models locally.

Caleb Gross@noperator·1d

@AlizTheHax0r 3090 is great too :)

104

Caleb Gross retweeted

Simone Margaritelli@evilsocket·3d

Earlier today Cloudflare's CSO shared how they tested Anthropic Mythos using an unreleased 8-stage vulnerability-discovery agent. So I asked Opus to implement the agent for me, it works via Claude SDK with a Pro or Max subscription, no API. Enjoy github.com/evilsocket/aud…

101

555

46.6K

Caleb Gross@noperator·3d

@__tinygrad__ @halvarflake Agree with your notion of AI-as-search. Here's how I'm thinking about it: x.com/i/status/20212…

Caleb Gross retweeted

the tiny corp@__tinygrad__·3d

@halvarflake I find most computer security people have a grounded notion on AI because they have practical experience with things like fuzzing and z3 and see things as search. Search is powerful, but the space is bounded, and even within a space it can be just hard. AI is mainstream search.

913

the tiny corp@__tinygrad__·3d

The AI panic is really unbelievable today. The level of delusion and hype have grown to mythic proportions. Has AI beaten Pokemon Red yet? Like a normal 6 year old does, by looking at the screen? Oh it hasn't. But all jobs are over in 18 months? This website is full of idiots.

165

254

4.1K

253.6K

Caleb Gross@noperator·5d

@antirez "one of us!"

471

antirez@antirez·5d

I guess I should wait a few weeks and buy a AMD Ryzen AI Halo Box, instead of the current alternatives? I want ROCm support in DS4 to be good and well supported.

180

23.3K

Caleb Gross@noperator·6d

@chompie1337 based

chompie@chompie1337·14 May

Claude helped me with this bug too but in a different way... Tried to gaslight me saying it wasn’t ~exploitable in practice~ and I got obsessed with proving it wrong 😩

TrendAI Zero Day Initiative@thezdi

Confirmed! @chompie1337 of IBM X-Force Offensive Research (XOR) used a race condition to escalate privileges on Red Hat Enterprise Linux for Workstations, earning $20,000 and 2 Master of Pwn points. #Pwn2Own #P2OBerlin

100

1.3K

74.4K

Caleb Gross@noperator·13 May

@sudoingX + framework desktop

1.5K

Sudo su@sudoingX·13 May

dgx spark is so so soo fucking underrated right now.

196

17.2K

Caleb Gross retweeted

Sudo su@sudoingX·13 May

buy a gpu. 3090, 4090, dgx spark, whatever fits your budget. tier doesn't matter. running your first local model does. the moment your first prompt lands with no api between you and the model, your brain rewires. that single moment is worth more than every take you'll ever read on a timeline.

649

30.4K

Caleb Gross@noperator·13 May

@mattions @sudoingX @sudoingX how are you avoiding OOM with 262K context on 24GB VRAM?

Michele Mattioni@mattions·11 May

@sudoingX The context IMHO is way too big. It fits, but the computer will OOM too much, especially if local TTS services or random ComfyUI are running. I've tried with 192, but the real sweet spot is 131k for me

297

Sudo su@sudoingX·11 May

this is what my setup looks like today. about to test qwen 3.6 27b dense q4 on a single rtx 3090 at ~41 tok/s gen, hermes agent driving. predecessor model qwen 3.5 dense q4 made it work in one iteration when i ran the same agentic build on the same card. i've been daily driving qwen 3.6 27b dense for weeks now, the model i keep coming back to. if 3.6 oneshots too, this becomes the best model that runs on a single rtx 3090. consumer tier king. firing the test now will report back soon.

270

81.6K
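
The OOM question in this thread comes down to KV-cache arithmetic: with standard attention, every token in the context stores one key and one value vector per layer and per KV head. The sketch below estimates that footprint. The layer count, head count, and head dimension are hypothetical placeholders, not the real configuration of any model mentioned here, and the function names are made up for illustration.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    bytes_per_value: int = 2,  # 2 bytes assumes an fp16/bf16 cache
) -> int:
    """Bytes needed to cache keys and values for a full context window."""
    # Factor of 2: one key vector and one value vector per layer, head, and token.
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


def to_gib(n_bytes: int) -> float:
    """Convert bytes to GiB."""
    return n_bytes / 2**30


# Hypothetical dense model: 64 layers, 8 KV heads, head dimension 128.
for context_tokens in (131_072, 262_144):
    size = kv_cache_bytes(64, 8, 128, context_tokens)
    print(f"{context_tokens} tokens -> {to_gib(size):.0f} GiB of KV cache")
```

Under these assumed numbers the cache alone would exceed 24GB at either context length, before counting weights. A quantized cache (a smaller `bytes_per_value`) or an architecture with fewer full-attention layers would shrink the estimate.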

Caleb Gross@noperator·12 May

@gonizahavy @antirez looks like your gen_tps (9.20) is a bit better than his (6.10) at the same 32k context

goniz@gonizahavy·12 May

@noperator @antirez Haha his version looks less hacked up 🫣

goniz@gonizahavy·12 May

Hmm, could not handle the FOMO of @antirez DS4 so I made it work on my Strix Halo using ROCm HIPify 🫢

10.3K

Caleb Gross@noperator·11 May

a benefit of working from home: I can dictate a stream-of-consciousness rant into my mic while iterating on ideas with Claude. not sure how in-office folks handle this.

253

Caleb Gross@noperator·10 May

@pradeep24 @antirez what's doing the compaction for you?

843

pradeep@pradeep24·9 May

tested out @antirez' ds4.c this morning. so impressive and delivers. on a M3 max, 128GB, stock ds4 settings:
- 14–15 t/s at 62K pre-filled actual coding conversation
- memory usage was flat during gen ~85GB res
- disk cache is ~8GB for a full 100K context window
- thermals were normal, light fan activity
- inference server is rock solid so far
biggest constraint: anytime there's a compact, we pay the wait-time price of a fresh prefill (~1min per 10k context) before we are back in action. sequential inference + multiple agents in parallel performance is unclear, will report back. I'm so amped.

563

161.5K
