Complete hardware + software setup for running DeepSeek-R1 locally. The actual model, no distillations, with Q8 quantization for full quality. Total cost: $6,000. All download and part links below:
@createthiscom It will, but it will be much slower! 405B is a dense model, so all 405B parameters must be read from memory for every generated token, whereas only ~45B parameters of R1 are read per token.
You can get a ~2X speedup on 405B using speculative sampling, but that is still only 1-2 t/s @ Q8.
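A rough back-of-the-envelope sketch of the argument above, assuming generation is limited purely by memory bandwidth and that Q8 costs about 1 byte per parameter. The 400 GB/s bandwidth figure and the function name are illustrative assumptions, not measurements of the build in this thread:

```python
def tokens_per_second(params_read_billion, bytes_per_param, bandwidth_gb_s):
    """Upper bound on tokens/s if every token requires reading
    `params_read_billion` billion parameters from memory once."""
    bytes_per_token = params_read_billion * 1e9 * bytes_per_param
    return (bandwidth_gb_s * 1e9) / bytes_per_token


# Assumed values (hypothetical): ~1 byte/param at Q8, 400 GB/s memory bandwidth.
Q8_BYTES_PER_PARAM = 1.0
ASSUMED_BANDWIDTH_GB_S = 400.0

dense_405b = tokens_per_second(405, Q8_BYTES_PER_PARAM, ASSUMED_BANDWIDTH_GB_S)
r1_active = tokens_per_second(45, Q8_BYTES_PER_PARAM, ASSUMED_BANDWIDTH_GB_S)

print(f"405B dense:          {dense_405b:.2f} t/s")
print(f"405B + ~2X spec.:    {2 * dense_405b:.2f} t/s")
print(f"R1 (~45B per token): {r1_active:.2f} t/s")
```

Under these assumptions the dense model lands at roughly 1 t/s (about 2 t/s with the ~2X speculative speedup), while reading only ~45B parameters per token gives roughly 9 t/s. Real throughput will be lower once compute and cache effects are included.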
🎉 Happy New Year! 🎉
Have you already thought about the new resolutions you are going to make? ✨
The Bountysource team has already written its list! 📝
#HappyNewYear #Bountysource #Resolutions #List
Can we, as a community, get together and rage against the atrocity that is “const” in ES6? I have never before seen such a weak and confusing language feature.
In your opinion, what differentiates a developer from a senior developer?
Is it years of experience?
Breadth of experience?
More focus on larger picture vs. day to day coding?
Something else?