Leonardo Grasso (@leogrease)
Thanks to SSD streaming, I was able to try DwarfStar on my M3 Max 64GB. I let DeepSeek Flash analyze a ~50K LOC codebase, then asked Opus to double-check and review the findings. I was impressed that 5 of the 6 findings reported by DeepSeek were accurate.
antirez (@antirez)

Today I had a harder than usual question for my local model (security). With SSD streaming, DwarfStar can now run DeepSeek v4 PRO at 4.15 t/s, and this was more than enough to get a detailed reply. I already feel "safer" than before in my AI future. M5 Max 128GB, model 433GB.

Leonardo Grasso (@leogrease)
I meant "correct and accurate" at the level of any other SOTA model. The only incorrect finding was about a race condition that Opus wasn't able to reproduce; it is still a possible issue to pay attention to. @antirez, thanks for this project 🙏
.mane🏴‍☠️ (@eddy_mane)
@leogrease Useful datapoint. The next thing I’d want is a replay pack around the 6 findings: finding IDs, repro tests, prompt/model hashes, false-positive/triage time, and which issues survive a clean checkout. That’s what turns “LLM code review worked” into an eval harness.