Caleb Gross

1.5K posts

@noperator

ai for security

Joined October 2009
647 Following · 2.7K Followers
Pinned Tweet
Caleb Gross
Caleb Gross@noperator·
1/ Agentic LLMs can automate vuln detection. Very exciting, but doesn't address the hardest part (imo) of vuln research: prioritization. Can we reliably explore the search space and separate signal from noise? I wrote a paper (and OSS tool) to solve this. arxiv.org/pdf/2512.06155
English
2
60
218
103.4K
oyasumi
oyasumi@kusonooyasumi·
ssd acquired for the ai machine. doing a fresh os install since there is practically nothing on it. gonna be using it as an inference machine and doing some vibe coding today
English
1
0
6
158
Caleb Gross retweeted
goniz
goniz@gonizahavy·
My MLX Vulkan backend just passed both CPP AND Python test suites!!
English
4
5
56
22.7K
Caleb Gross
Caleb Gross@noperator·
I opted for the framework desktop because:
- it's cheaper than the mac/spark options
- I want to support framework as a company
- I want to support AMD ecosystem as an alternative to being locked into NVIDIA (and I think support for that platform is rapidly improving)
x.com/noperator/stat…
English
0
0
0
33
Caleb Gross
Caleb Gross@noperator·
@AlizTheHax0r the tldr is that you can run small-ish models fast on a GPU (3090, etc.) or large-ish models slow on unified memory (mac ultra, dgx spark, framework desktop, etc.). biggest difference between those two is the amount of (V)RAM and the bandwidth of that memory.
English
1
0
1
42
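A minimal sketch of the tradeoff in the reply above, assuming decoding is memory-bound so that each generated token reads all weights of a dense model once. Under that assumption, bandwidth divided by model size gives a rough ceiling on tokens per second. The function names and the hardware numbers are illustrative placeholders, not measured specs for any of the machines mentioned.

```python
def max_decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough ceiling on generation speed for a dense model.

    Assumes decoding is memory-bound: every generated token requires
    reading all model weights once, so speed <= bandwidth / model size.
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


def fits(model_size_gb: float, memory_gb: float) -> bool:
    """True if the weights alone fit in memory (ignores KV cache and OS overhead)."""
    return model_size_gb <= memory_gb


# Placeholder numbers for illustration only, not vendor specs.
gpu = {"memory_gb": 24, "bandwidth_gb_s": 900}
unified = {"memory_gb": 128, "bandwidth_gb_s": 250}

for name, hw in (("gpu", gpu), ("unified", unified)):
    for model_gb in (15, 70):
        if fits(model_gb, hw["memory_gb"]):
            tps = max_decode_tokens_per_second(hw["bandwidth_gb_s"], model_gb)
            print(f"{name}: {model_gb} GB model, at most ~{tps:.0f} tok/s")
        else:
            print(f"{name}: {model_gb} GB model does not fit")
```

With these placeholder values the small model runs much faster on the GPU, while only the unified-memory box can hold the large one at all. Real throughput will be lower than the ceiling, and mixture-of-experts models read only a fraction of their weights per token.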
Aliz (they/them pls)
Aliz (they/them pls)@AlizTheHax0r·
Well, I'm going to end up splashing out on a big graphics card. I'm just testing out some different hardware on runpod.io, but since the 4090 is absurdly price-gouged I think I'll end up shelling out a small fortune for a 5090.
English
1
0
2
240
antirez
antirez@antirez·
It's official, @AMD is kindly sending me Strix Halo so I'll be able to support ROCm and the Halo in particular for DS4. This means I'll be able to merge the "rocm" branch and the other optimizations the community is doing, and to make sure it does not break after I change stuff.
English
23
29
685
23.7K
Caleb Gross
Caleb Gross@noperator·
betting that GitHub's "Download ZIP" button gets _way_ more clicks since the advent of LLMs. I very frequently drop entire zipped codebases in front of ChatGPT.
English
0
0
5
257
Caleb Gross
Caleb Gross@noperator·
@AlizTheHax0r also depends on what you're trying to accomplish. if you're already ready to spend a lot ("small fortune"), a 128GB unified-memory inference machine could be appropriate if you want to run frontier-ish models locally.
English
1
0
1
93
Caleb Gross retweeted
Simone Margaritelli
Simone Margaritelli@evilsocket·
Earlier today Cloudflare's CSO shared how they tested Anthropic Mythos using an unreleased 8-stage vulnerability-discovery agent. So I asked Opus to implement the agent for me, it works via Claude SDK with a Pro or Max subscription, no API. Enjoy github.com/evilsocket/aud…
English
13
101
555
46.6K
Caleb Gross retweeted
the tiny corp
the tiny corp@__tinygrad__·
@halvarflake I find most computer security people have a grounded notion on AI because they have practical experience with things like fuzzing and z3 and see things as search. Search is powerful, but the space is bounded, and even within a space it can be just hard. AI is mainstream search.
English
4
1
21
913
the tiny corp
the tiny corp@__tinygrad__·
The AI panic is really unbelievable today. The level of delusion and hype have grown to mythic proportions. Has AI beaten Pokemon Red yet? Like a normal 6 year old does, by looking at the screen? Oh it hasn't. But all jobs are over in 18 months? This website is full of idiots.
English
165
254
4.1K
253.6K
antirez
antirez@antirez·
I guess I should wait a few weeks and buy an AMD Ryzen AI Halo Box, instead of the current alternatives? I want ROCm support in DS4 to be good and well supported.
English
19
3
180
23.3K
chompie
chompie@chompie1337·
Claude helped me with this bug too but in a different way... Tried to gaslight me saying it wasn’t ~exploitable in practice~ and I got obsessed with proving it wrong 😩
TrendAI Zero Day Initiative@thezdi

Confirmed! @chompie1337 of IBM X-Force Offensive Research (XOR) used a race condition to escalate privileges on Red Hat Enterprise Linux for Workstations, earning $20,000 and 2 Master of Pwn points. #Pwn2Own #P2OBerlin

English
42
100
1.3K
74.4K
Sudo su
Sudo su@sudoingX·
dgx spark is so so soo fucking underrated right now.
English
49
6
196
17.2K
Caleb Gross retweeted
Sudo su
Sudo su@sudoingX·
buy a gpu. 3090, 4090, dgx spark, whatever fits your budget. tier doesn't matter. running your first local model does. the moment your first prompt lands with no api between you and the model, your brain rewires. that single moment is worth more than every take you'll ever read on a timeline.
English
68
49
649
30.4K
Michele Mattioni
Michele Mattioni@mattions·
@sudoingX The context IMHO is way too big. It fits, but the computer will OOM too much, especially if local TTS services or random ComfyUI are running. I've tried with 192, but the real sweet spot is 131k for me
English
1
0
2
297
Sudo su
Sudo su@sudoingX·
this is what my setup looks like today. about to test qwen 3.6 27b dense q4 on a single rtx 3090 at ~41 tok/s gen, hermes agent driving. predecessor model qwen 3.5 dense q4 made it work in one iteration when i ran the same agentic build on the same card. i've been daily driving qwen 3.6 27b dense for weeks now, the model i keep coming back to. if 3.6 oneshots too, this becomes the best model that runs on a single rtx 3090. consumer tier king. firing the test now will report back soon.
English
25
9
270
81.6K
Caleb Gross
Caleb Gross@noperator·
@gonizahavy @antirez looks like your gen_tps (9.20) is a bit better than his (6.10) at the same 32k context
English
1
0
1
42
goniz
goniz@gonizahavy·
Hmm, could not handle the FOMO of @antirez DS4 so I made it work on my Strix Halo using ROCm HIPify 🫢
English
6
3
36
10.3K
Caleb Gross
Caleb Gross@noperator·
a benefit of working from home: I can dictate a stream-of-consciousness rant into my mic while iterating on ideas with Claude. not sure how in-office folks handle this.
English
0
0
1
253
pradeep
pradeep@pradeep24·
tested out @antirez' ds4.c this morning. so impressive and delivers. on an M3 Max, 128GB, stock ds4 settings:
- 14–15 t/s at 62K pre-filled actual coding conversation
- memory usage was flat during gen ~85GB res
- disk cache is ~8GB for a full 100K context window
- thermals were normal, light fan activity
- inference server is rock solid so far
biggest constraint: anytime there's a compact, we pay the wait-time price of a fresh prefill (~1min per 10k context) before we are back in action. sequential inference + multiple agents in parallel performance is unclear, will report back. I'm so amped.
English
26
53
563
161.5K
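A back-of-the-envelope sketch of the prefill cost mentioned above, using the reported figure of roughly 1 minute per 10k tokens of context. It assumes the wait scales linearly with context length, which is a simplification, and the function name is hypothetical.

```python
def prefill_wait_seconds(context_tokens: int, seconds_per_10k: float = 60.0) -> float:
    """Estimate the wait for a fresh prefill after a compaction.

    Assumes linear scaling with context length; the 60 s per 10k tokens
    default is the rough figure reported in the post, not a benchmark.
    """
    if context_tokens < 0:
        raise ValueError("context_tokens must be non-negative")
    return context_tokens / 10_000 * seconds_per_10k


for tokens in (10_000, 62_000, 100_000):
    minutes = prefill_wait_seconds(tokens) / 60
    print(f"{tokens:>7} tokens: ~{minutes:.1f} min before generation resumes")
```

Under that assumption, re-prefilling the 62K conversation from the post would take a little over six minutes, which is why compactions are the biggest constraint in this setup.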