Will Kurt @willkurt · 4.5K posts

Ferment my own alcohol, run my own LLMs, that's just the kinda guy I am.

Seattle, WA · Joined April 2007
830 Following · 6.9K Followers

Pinned Tweet
Will Kurt @willkurt
🥳 Check out: Token-Explorer! 🤖 Interact with and explore LLM token generation! Features:
- Step through token selection
- Remove tokens to explore alternate paths
- Fork a prompt and quickly switch between forks
- Visualize all token probabilities and entropy!
- OSS (github in replies)
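Those features all sit on top of the raw next-token distribution the model exposes at every step. A minimal sketch of that core loop, using Hugging Face transformers with gpt2 as a stand-in model (Token-Explorer's actual implementation may differ, and top_k here is an illustrative choice):

# Inspect the model's full next-token distribution and its entropy
# before committing to a token -- the core of a token-stepping tool.
# Model choice (gpt2) and top_k are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def step(prompt: str, top_k: int = 5) -> None:
    """Print the top-k next tokens, their probabilities, and the entropy."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]           # next-token logits
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum()  # in nats
    top = torch.topk(probs, top_k)
    for p, idx in zip(top.values, top.indices):
        print(f"{tokenizer.decode(idx.item())!r:>12}  p={p:.3f}")
    print(f"entropy: {entropy:.2f} nats")

step("The capital of France is")

Stepping forward is just appending the chosen token to the prompt and calling this again; exploring an alternate path is re-running with a different token appended instead.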

David Curran @iamreddave
@willkurt Smullyan has great covers in the early editions. What cover does that one have?
[image]

Will Kurt @willkurt
Reading two wildly different books (Žižek’s “Too Late to Awaken” and Smullyan’s “To Mock a Mockingbird”) and each makes reference to the exact same joke (with different names/attributions)!
[image] [image]

Will Kurt @willkurt
@oprydai How many GPUs do you have and how many requests do you expect to be serving concurrently? If the answer to both is roughly 1 then llama.cpp is a good place to start (esp if you have < 1 GPU). Otherwise you'll probably get more value out of vLLM
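A handy detail behind that advice: both backends speak the same protocol. llama.cpp's llama-server and vLLM each expose an OpenAI-compatible HTTP endpoint, so client code can stay identical while you swap the server underneath. A minimal sketch (the ports and model name are illustrative, not from the thread):

# Start one of (flags/ports illustrative):
#   llama.cpp: llama-server -m model.gguf --port 8080
#   vLLM:      vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
resp = client.chat.completions.create(
    model="local",  # llama-server ignores this; vLLM expects the served model name
    messages=[{"role": "user", "content": "Why pick vLLM over llama.cpp?"}],
)
print(resp.choices[0].message.content)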

Mustafa @oprydai
A question for the hardcore LLM folks: vLLM vs llama.cpp vs Ollama etc.? Which one? The use case is hosting it locally on the lab's machine, for tool-calling agent-based apps, and also experimenting with LLMs for different analyses. My priority is reliability + all new advances on the architecture, e.g. turboquant etc.

Will Kurt @willkurt
It's pretty funny how often on both X and LinkedIn I see posts of the form "You CAN'T do X with Y!!!" while I am actively Xing with Y.

Will Kurt @willkurt
@TheAhmadOsman Heat and electricity bills? I run all my image models through my RTX 4090, but I keep any long-running LLMs on my M3 Max MBP. Ultimately it boils down to bandwidth vs wattage. That said, I've not seen a compelling argument in favor of the DGX.
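The bandwidth-vs-wattage trade-off reduces to back-of-the-envelope decode math: single-stream generation speed is roughly memory bandwidth divided by the bytes read per token (about the model's size). The hardware specs below are approximate public figures, and the model size is a hypothetical ~18 GB 4-bit quant:

# tokens/s ≈ memory bandwidth / bytes read per token (≈ model size).
# Hardware specs are approximate; the 18 GB model is an assumption.
model_gb = 18.0

hardware = {
    # name: (memory bandwidth in GB/s, rough inference power draw in watts)
    "RTX 4090":   (1008, 450),
    "M3 Max MBP": (400,  60),
}

for name, (bw_gbs, watts) in hardware.items():
    tps = bw_gbs / model_gb
    print(f"{name:12} ~{tps:5.1f} tok/s  ~{tps / watts:.3f} tok/s per watt")

The 4090 wins raw speed; the laptop wins tokens per joule, which is the "heat and electricity bills" point.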

Ahmad @TheAhmadOsman
Please stop saying that the DGX Spark or any unified-memory machine competes with GPUs in any meaningful way. It's misleading. Speak of the pros and cons of each, but stop misinforming your audience, for whatever reason you're doing it.

Will Kurt @willkurt
Much of the difference in success with agents boils down to whether or not you're solving a problem with a directly measurable outcome. I also believe people will increasingly question why anyone is spending significant time writing software without a measurable outcome.

Will Kurt @willkurt
Weirdly starting to enjoy X again; you just have to recognize that there are only a handful of real/interesting people left. A few likes from the right people mean what hundreds from randos did a few years ago.

Will Kurt @willkurt
@jun_song @LottoLabs People really underestimate how much everything boils down to memory bandwidth and power consumption. 128GB of memory isn’t much better than 24GB if your bandwidth is still ~250 GB/s, and your local model isn’t really “free” if you need 1200 watts to do inference.
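Both halves of that claim are quick to sanity-check with the same rule of thumb (tokens/s ≈ bandwidth / model size) plus joules per token. The model sizes below are illustrative 4-bit quants; the 250 GB/s and 1200 W figures come from the tweet:

# More memory lets you *fit* a bigger model, but at fixed bandwidth
# bigger only means slower. Model sizes are illustrative assumptions.
bandwidth_gbs = 250
for model_gb in (12, 35, 70):
    print(f"{model_gb:3d} GB model: ~{bandwidth_gbs / model_gb:4.1f} tok/s")

# And a 1200 W rig producing 50 tok/s (illustrative) pays 24 J per token:
watts, tps = 1200, 50
print(f"energy per token: {watts / tps:.0f} J")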

Lotto @LottoLabs
Why would anyone get an MBP over a GB10?

Will Kurt @willkurt
@nptacek Seriously, the last chapter of my soon-to-be-released book on Stable Diffusion is all about using proprietary models to improve base SD. As long as new information can be added, a model can be improved.
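One standard shape for "use proprietary models to improve base SD" is synthetic data: have a strong proprietary vision model write detailed captions for your images, then fine-tune SD on the resulting (image, caption) pairs. A hedged sketch of the captioning half only; this is not necessarily the book's method, and the model name and paths are illustrative:

# Build (image, caption) pairs with a proprietary vision model, as a
# dataset for later Stable Diffusion fine-tuning. Assumes OPENAI_API_KEY.
import base64, json, pathlib
from openai import OpenAI

client = OpenAI()

def caption(path: pathlib.Path) -> str:
    img_b64 = base64.b64encode(path.read_bytes()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice of proprietary model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Write a detailed one-sentence caption for this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

pairs = [{"image": str(p), "caption": caption(p)}
         for p in sorted(pathlib.Path("images").glob("*.png"))]
pathlib.Path("captions.jsonl").write_text(
    "\n".join(json.dumps(r) for r in pairs) + "\n")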

CuddlySalmon @nptacek
I have never understood the synthetic data skeptics. It's like they haven't played around with models at all, just stuck to whatever the prevailing view was.

Will Kurt @willkurt
@jun_song So apparently cats do this because they are trying to imitate you, so if you have an old, small laptop for your cat it *might* use that instead!

송준 Jun Song @jun_song
What's an effective way to keep my cat from getting on top of my MacBook? I'm having a serious problem right now.

Will Kurt @willkurt
Yes you should host your own LLM, and yes you should host your own private git server, but you should also ferment your own booze! Local apples, home-fermented cider!
[image]

Will Kurt retweeted
Ahmad @TheAhmadOsman
Qwen 3.6 27B means the permanent underclass thing has been canceled btw

Will Kurt @willkurt
@LottoLabs Exactly. I don’t need local models to be better than proprietary SotA *today*, I just need them to be as good as proprietary models were when I first started being able to reliably trust agents to do their thing.

Lotto @LottoLabs
Assume Qwen 3.6 27B isn’t actually Opus-level or even Sonnet-level in knowledge. It’s still much better than Sonnet 3.5 level. And that was SotA not long ago, a loved model. We’re in crazy times.

Will Kurt @willkurt
This is honestly the most exciting time in computing I've seen. It's not just "AI is amazing!", it's that we're starting to think in systems again.

My current homelab is ridiculous. My old computers don't just sit there; they're all performing some role, all communicating with each other. I have one box serving an LLM, another a ComfyUI server, yet another running hermes-agent. I've got a backend communicating with custom, hyper-specific Chrome extensions. For the first time since the early 2000s I remember which ports I'm using!
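A setup like that stays manageable with a tiny service registry plus a port check. Only the shape of the homelab comes from the tweet; every hostname and port below (except ComfyUI's well-known default, 8188) is a made-up assumption:

# Probe each homelab service's TCP port to see what's up.
# Hostnames and most ports are hypothetical.
import socket

SERVICES = {
    "llm server":   ("llmbox.local",   8080),
    "comfyui":      ("imagebox.local", 8188),  # ComfyUI's default port
    "hermes-agent": ("agentbox.local", 9000),
    "backend api":  ("backend.local",  3000),
}

for name, (host, port) in SERVICES.items():
    try:
        with socket.create_connection((host, port), timeout=1):
            status = "up"
    except OSError:
        status = "down"
    print(f"{name:13} {host}:{port:<5} {status}")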

Will Kurt retweeted
Taelin @VictorTaelin
Kimi 2.6 solved the HVM hard debug prompt!!? Took 3 attempts, but it did!! For context, Gemini 3 was the first to solve it, inconsistently. Even GPT 5.4 fails sometimes. And this problem took me weeks back then. Now an open model solves it! Also captivated by its code style.
[image]