Nic Wienandt

791 posts

@NicW_AI

AI builder & founder @wAIve_online | AI infrastructure, research, development | Fox Valley AI Foundation | Oshkosh, WI #AI #LocalLLM #llm https://t.co/LXqWQwWuRD

Oshkosh, WI · Joined November 2025

93 Following · 52 Followers

Nic Wienandt@NicW_AI·1d

@TheAhmadOsman @ClementDelangue @huggingface i have 10 of them 🤷‍♂️

Ahmad@TheAhmadOsman·1d

@ClementDelangue @huggingface 3090s owners, assemble!

clem 🤗@ClementDelangue·1d

300,000 AI builders filled their hardware profile on @huggingface and we're sharing the results: hf.co/hardware. Excited to see how it evolves in the coming months especially with the explosion of local AI!

Nic Wienandt@NicW_AI·2d

@AgentSparko Can we get together and talk about cooking?

AgentSparko 💥@AgentSparko·3d

The problem is that people do not understand all the costs of doing inference. They see on X "my mac or 3090 did xx t/s for y model" and think that this is everything but totally fail to see the whole picture of how that translates for the next 5 years of actually using the thing

AgentSparko 💥@AgentSparko

For anyone saying DGX Spark cannot cook. Generating data sets for distilling using Qwen3.5-35B-A3B BF16!!! (no quants) real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tq=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h. 😎
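The figures in the post above can be turned into per-second and per-energy numbers with a little arithmetic. This is a rough sketch, not the poster's own calculation: it assumes "40 W/h" means an average draw of about 40 W, and the function name `inference_energy_stats` is made up for illustration.

```python
def inference_energy_stats(tokens_per_hour, avg_power_watts, concurrency):
    """Derive rate and energy figures from hourly throughput and average power."""
    tokens_per_second = tokens_per_hour / 3600
    return {
        "tokens_per_second": tokens_per_second,
        # Aggregate throughput split evenly across concurrent requests.
        "tokens_per_second_per_stream": tokens_per_second / concurrency,
        # Running at P watts for one hour consumes P watt-hours.
        "tokens_per_wh": tokens_per_hour / avg_power_watts,
        "joules_per_token": avg_power_watts / tokens_per_second,
    }


# Values quoted in the post; 40.0 W is an ASSUMED reading of "40 W/h".
stats = inference_energy_stats(
    tokens_per_hour=1.43e6, avg_power_watts=40.0, concurrency=192
)
for name, value in stats.items():
    print(f"{name}: {value:,.3f}")
```

Under that assumption the aggregate rate is roughly 397 tok/s, or about 2 tok/s per concurrent stream, at around 0.1 J per generated token. If the 40 W figure is not the average draw, the energy numbers shift proportionally.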

Nic Wienandt@NicW_AI·3d

@segun_os_ you need a vibe compiler lol

os@segun_os_·3d

you literally cannot vibe code c++. char is an 8 bit string that can also be an unsigned 8 bit int depending on how you use it. the level of precision required to write c++ is too high for vibecoding. there are just too many quirks in the language.

Wise@trikcode

I haven't seen a C++ vibecoder yet. I wonder why?

Nic Wienandt@NicW_AI·3d

@BenjaminPDixon Except some of us have trillions in each forward pass…

Nic Wienandt reposted

Pastor Ben@BenjaminPDixon·3d

I'm not sure what to say to people who don't realize our brains are literally producing the wetware equivalent of the next-predicted token with every word we say and compose.

Prof. Lee Cronin@leecronin

AI does not write, it produces text. Humans uniquely write.

Nic Wienandt@NicW_AI·3d

@Kristinartz Me, i’m over 40 Kristina

Kristina Bolten@Kristinartz·5d

DOES ANYONE HERE HAVE SOMETHING IN THEIR HOUSE THAT'S OVER 40 YEARS OLD..

Nic Wienandt@NicW_AI·17 May

@bridgemindai L take…

BridgeMind@bridgemindai·17 May

I have two NVIDIA DGX Sparks stacked in my office. They've been sitting there for a month. Here's my honest take. Open source AI is never going to compare to frontier models. Running quantized Kimi K2.6 and GLM 5.1 locally is cool. But practical? No. Not even close. I run all my Hermes agents on GPT 5.5 through my ChatGPT Pro subscription. Practically free. GPT 5.5 is the most intelligent model in the world. Why would I route serious tasks to a watered down local model? If you need fast and accurate, you're not using local inference. You're using GPT 5.5 or Claude Opus 4.7. I'm not saying this to rage bait. I genuinely want to know. Why would anyone serious about vibe coding and AI agents use a local model when frontier is this far ahead?

Nic Wienandt@NicW_AI·16 May

💯

Sudo su@sudoingX

look anon, those of you who kept saying local AI is not there yet, who said open source can't compete, who said you need cloud APIs to get anything serious done, look at this gameplay for one minute. every pixel on this screen was written by one model, in one shot, on a single rtx 3090 with 24gb of vram. the model is qwen 3.6 27b dense q4. the harness is hermes agent. the hardware is a single consumer card you can buy used for 900 dollars. the prompt is open source on github. every claim verifiable, on your own desk. if your local AI take is from 2024, update it. the consumer tier is shipping work that was supposed to need 8 gpus and an api key. open source moved the floor while the rest of the field was busy explaining why it cannot. 24gb tier owners are eating ramen with half boiled egg and double chocolate.

Nic Wienandt@NicW_AI·16 May

@mcuban What about small providers? wtf @mcuban

Mark Cuban@mcuban·16 May

We should federally tax Tokens at the Provider level. Not a lot. Less than 50c per million tokens. It will accomplish 4 things (at least):
1. It will push the big AI players to optimize tokenization, caching, routing and localization. Which will
2. Reduce energy usage. Saving them in energy costs more than what they paid in tax and reducing strain created by the growth in energy consumption. Which will
3. Generate maybe 10 billion dollars a year to start, but over the next ten years could grow 30x to 100x. Which will
4. Create a source of funding to pay down the federal debt or deploy, in response to the things AI brings that we don’t expect or don’t like.
At some point the models will pass it on to customers. Of course. That’s ok. Customers will have the ability to choose between providers. Or to do everything using open source models locally. Thoughts?
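The revenue figure in the proposal above implies a token volume, which can be backed out with simple arithmetic. This is a sketch under stated assumptions: it uses the quoted ceiling of 50c per million tokens as the rate (the post says "less than"), so the result is a lower bound on volume, and the helper name `implied_token_volume` is hypothetical.

```python
def implied_token_volume(annual_revenue_usd, tax_usd_per_million_tokens):
    """Tokens per year needed to raise the given revenue at the given tax rate."""
    return annual_revenue_usd / tax_usd_per_million_tokens * 1_000_000


# Quoted figures: ~$10B per year at (at most) $0.50 per million tokens.
tokens_per_year = implied_token_volume(10e9, 0.50)
tokens_per_day = tokens_per_year / 365

print(f"tokens per year: {tokens_per_year:.2e}")
print(f"tokens per day:  {tokens_per_day:.2e}")

# The quoted 30x to 100x growth range applied to the starting revenue.
for multiple in (30, 100):
    print(f"{multiple}x revenue: ${10e9 * multiple:,.0f}")
```

At the 50c ceiling, 10 billion dollars a year corresponds to about 2e16 taxed tokens per year. A lower rate would require proportionally more tokens to raise the same amount.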

Nic Wienandt@NicW_AI·10 May

@Govindtwtt auto immune diseases on the rise, writing is on the wall. Poison doesn’t sell

Govind@Govindtwtt·9 May

McDonald’s says customers are “pulling back.” Same with Wendy’s. Same with Burger King. When fast food loses traffic, it’s a stress signal. People are tapped out.

Nic Wienandt@NicW_AI·10 May

@J_offtheRocker @BillWiIdin nothing, like houses, cars, vacations, 1000/month in subscriptions to OF. Yeah nothing to show for it…

J. Rocker@J_offtheRocker·10 May

@BillWiIdin Gen Z smartly watched Millennials give 110% for decades to have nothing to show for it, and learned from it. It's harder to take advantage of people when you've already shown what you'll do the second it suits you.

Bill@BillWiIdin·9 May

Just interviewed a Gen Z candidate who asked about "work-life balance" before the salary. I ended the call right there. If you aren't willing to give me 110% of your soul for the first 5 years, you don't deserve a seat at the table. Participation trophies have ruined this country.

Nic Wienandt@NicW_AI·10 May

@draloneboy So they literally elected a moron who had no vision for good policy?

Dralone&_DR145@draloneboy·9 May

WTF?? NYC Mayor Zohran Mamdani announces he's hitting the wealthy with MAXIMUM taxes to convince others to stop leaving the city: "I'll ask those who make the most amount of money [to] pay more so everyone can STAY IN THIS CITY!" We warned you.

Nic Wienandt@NicW_AI·10 May

@DJLougen @sama Just call it hyper focus or something :)

Daniel Lougen@DJLougen·9 May

@sama Can you not use autism like this…

Sam Altman@sama·9 May

5.5 is an autistic genius with very strange taste in naming shocking that we would make such a thing

Nic Wienandt@NicW_AI·9 May

@sudoingX @pupposandro solve real problems, get real results

Sudo su@sudoingX·7 May

@pupposandro building something useful to me first, solves my own pain. confident the audience will love it when it ships.

Sudo su@sudoingX·7 May

just woke up and as always i must decide. if i build what i'm building for all of us, i don't get to post and x won't pay. if i post and make content, i don't get to build. every day this trade. fuck.

Nic Wienandt@NicW_AI·8 May

@loktar00 let me know what you find

Loktar 🇺🇸@loktar00·8 May

Hmmm need to try codex /goal with local models, feels like 3d printing or Christmas waking up seeing how far Codex has gotten on my goal.

Nic Wienandt@NicW_AI·7 May

@pupposandro @davideciffa 216-219, depending on the card :)

Sandro@pupposandro·7 May

A few weeks ago, with @davideciffa, we experimented around power capping our 3090s. And we discovered a sweet spot at 220W where you can get ~92% of max throughput at ~58% of the power.
Qwen3.5-35B-A3B Q4_K_M on a single RTX 3090:
• 320W: 115.2 tok/s, 0.381 tok/s/W, 76°C
• 220W: 105.4 tok/s, 0.436 tok/s/W, 64°C
• 200W: 95.8 tok/s, 0.438 tok/s/W, 60°C
10% throughput loss for 40% less power. Fans basically silent. We think this is a no-brainer. Lower bill, lower temps (particularly important in the upcoming summer), longer GPU life.
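The trade-off in the post above can be checked from its own three data points. This is a minimal sketch, not the authors' method: `summarize_power_caps` is a hypothetical helper, and the "implied draw" column assumes the reported tok/s/W is throughput divided by measured watts, which the post does not state.

```python
# (power cap in W, tok/s, reported tok/s/W), exactly as quoted in the post.
RUNS = [(320, 115.2, 0.381), (220, 105.4, 0.436), (200, 95.8, 0.438)]


def summarize_power_caps(runs):
    """Compare each run against the first (uncapped) run."""
    base_cap, base_tps, _ = runs[0]
    rows = []
    for cap, tps, reported_eff in runs:
        rows.append({
            "cap_w": cap,
            "throughput_pct": 100 * tps / base_tps,
            "cap_pct": 100 * cap / base_cap,
            # ASSUMPTION: reported efficiency = tok/s / measured watts.
            "implied_draw_w": tps / reported_eff,
        })
    return rows


rows = summarize_power_caps(RUNS)
for row in rows:
    print(
        f"{row['cap_w']}W cap: {row['throughput_pct']:.1f}% throughput, "
        f"{row['cap_pct']:.1f}% of max cap, "
        f"~{row['implied_draw_w']:.0f}W implied draw"
    )
```

On these numbers the 220W cap keeps about 91.5% of peak throughput while the cap itself is about 69% of 320W. The implied draw at each setting differs from the cap value, so the post's power percentages may be based on a measurement not shown in the quoted figures.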

Nic Wienandt@NicW_AI·7 May

@loktar00 Help me understand why people would be switching between them?

Loktar 🇺🇸@loktar00·7 May

If you're switching models frequently, using vLLM and llama.cpp interchangeably on the same set of cards, definitely check out llama-swap. It's incredibly powerful and useful. It supports swapping image models, and audio as well; the config options are pretty intense. Link below.

Nic Wienandt@NicW_AI·7 May

@icreatelife Tensor Parallelism = 4

Kris Kashtanova@icreatelife·7 May

Prove you work with AI with just one phrase

Nic Wienandt@NicW_AI·7 May

TokenSpeed it’s even named after a bottleneck, lfg!

LightSeek Foundation@lightseekorg

Introducing TokenSpeed, a speed-of-light LLM inference engine.
> TensorRT LLM level performance
> vLLM level usability
> Built by a lean and mission-driven team in two months
> MIT license, open-source
github.com/lightseekorg/t… lightseek.org/blog/lightseek…

Nic Wienandt@NicW_AI·7 May

@bokiko Isn’t 3.6 27b better in some benchmarks for real work?

Bokiko@bokiko·7 May

Qwen3.5-122B-A10B running multimodal on 4×3090 at 64 tok/s
Reads scanned PDFs locally with vision-language reasoning
9× faster than my old flagship. Better quality output too
Self-hosted AI is genuinely competitive for serious work in 2026
