Nic Wienandt

791 posts

@NicW_AI

AI builder & founder @wAIve_online | AI infrastructure, research, development | Fox Valley AI Foundation | Oshkosh, WI #AI #LocalLLM #llm https://t.co/LXqWQwWuRD

Oshkosh, WI · Joined November 2025
93 Following · 52 Followers
clem 🤗@ClementDelangue·
300,000 AI builders filled their hardware profile on @huggingface and we're sharing the results: hf.co/hardware. Excited to see how it evolves in the coming months especially with the explosion of local AI!
37
39
232
36.7K
AgentSparko 💥@AgentSparko·
The problem is that people do not understand all the costs of doing inference. They see on X "my mac or 3090 did xx t/s for y model" and think that this is everything, but totally fail to see the whole picture of how that translates for the next 5 years of actually using the thing.
AgentSparko 💥@AgentSparko

For anyone saying DGX Spark cannot cook. Generating data sets for distilling using Qwen3.5-35B-A3B BF16!!! (no quants) real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tq=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h. 😎

3
0
0
224
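A rough unit conversion of the figures in the quoted post, as a sketch only. The token rate, duration, and concurrency are taken from the post; reading "40 W/h" as an average draw of about 40 W is an assumption (the post does not say how power was measured), and the function name is hypothetical.

```python
# Figures quoted in the post.
TOKENS_PER_HOUR = 1.43e6   # generated tokens per hour
HOURS = 8                  # duration of the run
CONCURRENCY = 192          # concurrent request streams

# ASSUMPTION: "40 W/h" is read as ~40 W of average power draw.
ASSUMED_WATTS = 40.0


def throughput_summary(tokens_per_hour, hours, concurrency, watts):
    """Convert an hourly token rate into per-second and per-energy figures."""
    tok_per_s = tokens_per_hour / 3600
    return {
        "tok_per_s": tok_per_s,                           # aggregate generation rate
        "tok_per_s_per_stream": tok_per_s / concurrency,  # average rate per stream
        "total_tokens": tokens_per_hour * hours,          # tokens over the whole run
        "joules_per_token": watts / tok_per_s,            # W / (tok/s) = J/token
        "tokens_per_wh": tokens_per_hour / watts,         # tokens per watt-hour
    }


if __name__ == "__main__":
    summary = throughput_summary(TOKENS_PER_HOUR, HOURS, CONCURRENCY, ASSUMED_WATTS)
    for key, value in summary.items():
        print(f"{key}: {value:,.4f}")
```

Under that assumption the aggregate rate works out to roughly 397 tok/s, or about 2 tok/s per stream at concurrency 192.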
os@segun_os_·
you literally cannot vibe code c++. char is an 8 bit string that can also be an unsigned 8 bit int depending on how you use it. the level of precision required to write c++ is too high for vibecoding. there are just too many quirks in the language.
Wise@trikcode

I haven't seen a C++ vibecoder yet. I wonder why?

311
101
2.9K
503.9K
Kristina Bolten@Kristinartz·
DOES ANYONE HERE HAVE SOMETHING IN THEIR HOUSE THAT'S OVER 40 YEARS OLD..
10.4K
156
3.9K
1.4M
BridgeMind@bridgemindai·
I have two NVIDIA DGX Sparks stacked in my office. They've been sitting there for a month. Here's my honest take. Open source AI is never going to compare to frontier models. Running quantized Kimi K2.6 and GLM 5.1 locally is cool. But practical? No. Not even close. I run all my Hermes agents on GPT 5.5 through my ChatGPT Pro subscription. Practically free. GPT 5.5 is the most intelligent model in the world. Why would I route serious tasks to a watered down local model? If you need fast and accurate, you're not using local inference. You're using GPT 5.5 or Claude Opus 4.7. I'm not saying this to rage bait. I genuinely want to know. Why would anyone serious about vibe coding and AI agents use a local model when frontier is this far ahead?
389
24
552
109.7K
Mark Cuban@mcuban·
We should federally tax Tokens at the Provider level. Not a lot. Less than 50c per million tokens. It will accomplish 4 things (at least):
1. It will push the big AI players to optimize tokenization, caching, routing and localization. Which will
2. Reduce energy usage. Saving them in energy costs more than what they paid in tax and reducing strain created by the growth in energy consumption. Which will
3. Generate maybe 10 billion dollars a year to start, but over the next ten years could grow 30x to 100x. Which will
4. Create a source of funding to pay down the federal debt or deploy, in response to the things AI brings that we don’t expect or don’t like.
At some point the models will pass it on to customers. Of course. That’s ok. Customers will have the ability to choose between providers. Or to do everything using open source models locally. Thoughts?
2.2K
262
4K
1.2M
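A back-of-envelope check on the numbers in the post above, as a sketch only. The 50c-per-million rate and the $10 billion starting figure come from the post; treating the rate as exactly 50c (the post says "less than") is an assumption, so the token volume below is a lower bound on what the proposal implies. Function names are hypothetical.

```python
def implied_tokens_per_year(revenue_usd, tax_usd_per_million_tokens):
    """Token volume needed per year to raise `revenue_usd` at the given rate."""
    return revenue_usd / tax_usd_per_million_tokens * 1_000_000


def projected_revenue_range(start_revenue_usd, low_multiple=30, high_multiple=100):
    """Revenue range after the 30x to 100x growth mentioned in the post."""
    return start_revenue_usd * low_multiple, start_revenue_usd * high_multiple


if __name__ == "__main__":
    START_REVENUE = 10e9  # "maybe 10 billion dollars a year to start"
    TAX_RATE = 0.50       # ASSUMPTION: upper bound of "less than 50c per million tokens"

    tokens = implied_tokens_per_year(START_REVENUE, TAX_RATE)
    low, high = projected_revenue_range(START_REVENUE)
    print(f"Implied taxed volume: {tokens:.2e} tokens/year")
    print(f"Projected revenue after growth: ${low:,.0f} to ${high:,.0f} per year")
```

At the full 50c rate, $10 billion a year corresponds to 2 × 10^16 taxed tokens per year; any lower rate requires proportionally more tokens to reach the same revenue.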
Nic Wienandt@NicW_AI·
@Govindtwtt Autoimmune diseases are on the rise; the writing is on the wall. Poison doesn’t sell.
0
0
1
1.1K
Govind@Govindtwtt·
McDonald’s says customers are “pulling back.” Same with Wendy’s. Same with Burger King. When fast food loses traffic, it’s a stress signal. People are tapped out.
4.4K
3.7K
30.5K
937.7K
J. Rocker@J_offtheRocker·
@BillWiIdin Gen Z smartly watched Millennials give 110% for decades to have nothing to show for it, and learned from it. It's harder to take advantage of people when you've already shown what you'll do the second it suits you.
14
2
69
1.1K
Bill@BillWiIdin·
Just interviewed a Gen Z candidate who asked about "work-life balance" before the salary. I ended the call right there. If you aren't willing to give me 110% of your soul for the first 5 years, you don't deserve a seat at the table. Participation trophies have ruined this country.
244
31
284
44.8K
Nic Wienandt@NicW_AI·
@draloneboy So they literally elected a moron who had no vision for good policy?
0
0
2
1.1K
Dralone&_DR145@draloneboy·
WTF?? NYC Mayor Zohran Mamdani announces he's hitting the wealthy with MAXIMUM taxes to convince others to stop leaving the city: "I'll ask those who make the most amount of money [to] pay more so everyone can STAY IN THIS CITY!" We warned you.
2.4K
2.9K
16K
959.7K
Sam Altman@sama·
5.5 is an autistic genius with very strange taste in naming. shocking that we would make such a thing
1.2K
323
8.2K
1M
Sudo su@sudoingX·
@pupposandro building something useful to me first, solves my own pain. confident the audience will love it when it ships.
3
0
11
170
Sudo su@sudoingX·
just woke up and as always i must decide. if i build what i'm building for all of us, i don't get to post and x won't pay. if i post and make content, i don't get to build. every day this trade. fuck.
18
1
59
3.2K
Loktar 🇺🇸@loktar00·
Hmmm need to try codex /goal with local models, feels like 3d printing or Christmas waking up seeing how far Codex has gotten on my goal.
1
0
10
451
Sandro@pupposandro·
A few weeks ago, with @davideciffa, we experimented around power capping our 3090s. And we discovered a sweet spot at 220W where you can get ~92% of max throughput at ~58% of the power.
Qwen3.5-35B-A3B Q4_K_M on a single RTX 3090:
• 320W: 115.2 tok/s, 0.381 tok/s/W, 76°C
• 220W: 105.4 tok/s, 0.436 tok/s/W, 64°C
• 200W: 95.8 tok/s, 0.438 tok/s/W, 60°C
10% throughput loss for 40% less power. Fans basically silent. We think this is a no-brainer. Lower bill, lower temps (particularly important in the upcoming summer), longer GPU life.
15
16
150
8.5K
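The ratios behind the power-capping figures in the post above can be recomputed directly from the three reported rows. This is a sketch only: the post does not say how power was measured, so the "implied draw" below (throughput divided by the reported tok/s/W) is a derived quantity and an assumption about what the efficiency column represents. It does not attempt to reproduce the post's "~58%" figure. The function name is hypothetical.

```python
# Rows as reported in the post: (power cap in W, tok/s, reported tok/s/W).
REPORTED = [
    (320, 115.2, 0.381),
    (220, 105.4, 0.436),
    (200, 95.8, 0.438),
]


def summarize(rows):
    """Compare each row with the first (320 W) row.

    ASSUMPTION: reported tok/s/W = tok/s divided by measured draw, so
    tok/s / (tok/s/W) recovers the draw in watts.
    """
    base_cap, base_tps, base_eff = rows[0]
    base_draw = base_tps / base_eff
    out = []
    for cap, tps, eff in rows:
        implied_draw = tps / eff
        out.append({
            "cap_w": cap,
            "throughput_ratio": tps / base_tps,          # share of 320 W throughput
            "cap_ratio": cap / base_cap,                 # share of the 320 W cap
            "implied_draw_w": implied_draw,              # watts implied by tok/s/W
            "implied_draw_ratio": implied_draw / base_draw,
            "efficiency_gain": eff / base_eff - 1.0,     # relative tok/s/W change
        })
    return out


if __name__ == "__main__":
    for row in summarize(REPORTED):
        print(
            f"{row['cap_w']} W cap: "
            f"{row['throughput_ratio']:.1%} throughput, "
            f"{row['cap_ratio']:.1%} of cap, "
            f"~{row['implied_draw_w']:.0f} W implied draw, "
            f"{row['efficiency_gain']:+.1%} tok/s/W"
        )
```

For the 220 W row the throughput ratio comes out at about 91.5%, in line with the "~92% of max throughput" in the post, while the cap itself is 68.75% of 320 W.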
Nic Wienandt@NicW_AI·
@loktar00 Help me understand why people would be switching between them?
1
0
1
53
Loktar 🇺🇸@loktar00·
If you're switching models frequently, using vLLM and llama.cpp interchangeably on the same set of cards, definitely check out llama-swap. It's incredibly powerful and useful. It supports swapping image models, and audio as well; the config options are pretty intense. Link below.
3
2
30
1.6K
Kris Kashtanova@icreatelife·
Prove you work with AI with just one phrase
1.3K
10
413
83.1K
Nic Wienandt@NicW_AI·
@bokiko Isn’t 3.6 27b better in some benchmarks for real work?
1
0
1
64
Bokiko@bokiko·
Qwen3.5-122B-A10B running multimodal on 4×3090 at 64 tok/s. Reads scanned PDFs locally with vision-language reasoning. 9× faster than my old flagship. Better quality output too. Self-hosted AI is genuinely competitive for serious work in 2026.
21
5
117
5.6K