Tech2Wild

148 posts

@Tech2Wild

🎮 Tech, gaming, AI, and everything in between. 🤖 Building with it, not just talking about it. 🔥 From the mind of @ToNYD2WiLD

Joined March 2026
59 Following · 99 Followers
yourfren @0xyourfren
@Tech2Wild Two 3090s stacked on top of each other, just sitting on the PSU outside the case, got me feeling a type of way
Tech2Wild @Tech2Wild
TRIPLE 3090s... I'd need a new motherboard and have to start building a RACK if I decide to go any further lol. Temporary rig for now, I gotta get a rack 🤣🤣🤣
Tech2Wild @Tech2Wild
One issue with Opus since 4.7 that still hasn't been resolved: the agent sometimes does NOT use Telegram to respond. It goes silent there while still blabbing in the terminal. I tell it to always use the plugin to respond, and it still reverts to not responding in Telegram.
Tech2Wild @Tech2Wild
test
bgeneto @netobge
@Tech2Wild Qwen3.6 35B A3B AutoRound fits in a single 24GB GPU with 262K context using an fp8 KV cache, and runs at 160 tps on an RTX 3090 via vLLM... It produces much better code than Gemma 4 12B. Unfair to compare them.
Tech2Wild @Tech2Wild
Hmm, wondering which is better: Gemma 4 12B or Qwen 3.6 35B-A3B? 🤔
Master Builder @MakerInParadise
@Tech2Wild I can’t say anything negative about either model other than that 12B’s native vernacular is too informal for my liking… it has grok4.1/deepseek sentence structure and punctuation. Otherwise, I think that 12B is the better chat model and 35B the better reasoner/researcher.
Tech2Wild @Tech2Wild
@malikwas1f Good call. I've been running your recipes bro, thanks for what you do
Tech2Wild @Tech2Wild
Got the 3rd GPU set up (3x 3090s), but there's no TP=3, so I'm running a separate model or a cloned 27B on the extra card. Been looking at Gemma 4 12B, but honestly wondering if it's worth it when I can already run 27B or 35B at full context... What's your take? 🤔
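A likely reason TP=3 is off the table: tensor parallelism shards the attention heads evenly across GPUs, so the TP size generally has to divide the model's head count. A minimal sketch of that check follows. The head counts are hypothetical and not taken from any model mentioned here.

```python
def valid_tp_sizes(num_attention_heads: int, num_gpus: int) -> list[int]:
    """Return every tensor-parallel size up to num_gpus that splits the heads evenly."""
    return [tp for tp in range(1, num_gpus + 1) if num_attention_heads % tp == 0]


# Hypothetical head counts, chosen only to illustrate the divisibility rule.
for heads in (32, 40, 48):
    print(f"{heads} heads -> usable TP sizes on 3 GPUs: {valid_tp_sizes(heads, 3)}")
```

With a head count that is not a multiple of 3, only TP=1 and TP=2 remain, which matches running a separate model on the third card.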
Tech2Wild @Tech2Wild
@sakurayukiai I have 35B running now. The issue I’m having is that 2 GPUs on the 27B give me almost identical speeds to 1 GPU on the 35B
Sakura Yuki @sakurayukiai
@Tech2Wild If you can fit the 35B footprint, Qwen is wild. Only 3B active params means it runs circles around Gemma's 12B dense decode speeds, but Gemma 4 is way friendlier on a single consumer GPU.
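The active-parameter point can be sanity-checked with a back-of-envelope bound: single-stream decode is roughly memory-bandwidth-bound, so tokens/s is capped near bandwidth divided by the bytes of weights read per token. A rough sketch follows, assuming about 4-bit weights (0.5 bytes per parameter) and the RTX 3090's 936 GB/s spec-sheet bandwidth. It ignores KV cache reads, expert routing, and kernel overhead, so real numbers land well below it.

```python
def decode_tps_upper_bound(active_params_billion: float,
                           bytes_per_param: float,
                           bandwidth_gb_s: float) -> float:
    """Ceiling on tokens/s if every active weight is read once per generated token."""
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


RTX_3090_BANDWIDTH_GB_S = 936   # spec-sheet memory bandwidth
BYTES_PER_PARAM = 0.5           # assumption: roughly 4-bit quantized weights

dense_12b = decode_tps_upper_bound(12, BYTES_PER_PARAM, RTX_3090_BANDWIDTH_GB_S)
moe_3b_active = decode_tps_upper_bound(3, BYTES_PER_PARAM, RTX_3090_BANDWIDTH_GB_S)

print(f"12B dense ceiling:     {dense_12b:.0f} tok/s")
print(f"3B-active MoE ceiling: {moe_3b_active:.0f} tok/s")
print(f"ratio: {moe_3b_active / dense_12b:.1f}x")
```

Under these assumptions the MoE's ceiling is 4x the dense model's, simply because it touches a quarter of the weights per token. The full 35B still has to fit in VRAM, which is the footprint caveat above.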
Tech2Wild @Tech2Wild
@gospaceport Sir, I literally just watched your video on your Quad Build from 9 months ago 🙏🏽. Debating whether to go to Gen 5 or just grab one of the motherboards you showed and stay on Gen 4.
Tech2Wild @Tech2Wild
Now that I've learned so much about AI, I've realized how much false info content creators put out.
Sakura Yuki @sakurayukiai
@Tech2Wild Q2 perplexity hit on a 550B is so brutal you're basically paying a massive latency and hardware tax to get the reasoning of a solid 70B. Wild engineering flex, but the math is pretty unforgiving.
Tech2Wild @Tech2Wild
Ran NVIDIA Nemotron-3-Ultra-550B fully local across 2 DGX Sparks (188GB split via llama.cpp RPC) 🤯 Findings: it works and it reasons, but only at ~5 tok/s, since RPC is round-trip-bound (two nodes are slower per token than one; it's a capacity play). It also makes me question whether bigger is better: the 2-bit 550B barely tops a clean 4-bit ~285B. Can we agree?
Tech2Wild @Tech2Wild
@outsource_ Model: unsloth/NVIDIA-Nemotron-3-Ultra-550B-A55B-GGUF, UD-Q2_K_XL (~188 GiB, 6 shards)
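For a sense of what "2-bit" means here, the average bits per weight can be worked out from the quoted file size and parameter count. A quick sketch follows, treating the ~188 GiB and 550B figures above as exact, which they are not.

```python
GIB = 2**30  # bytes in one GiB


def bits_per_weight(file_size_gib: float, total_params_billion: float) -> float:
    """Average bits stored per parameter for a quantized model file."""
    total_bits = file_size_gib * GIB * 8
    return total_bits / (total_params_billion * 1e9)


bpw = bits_per_weight(188, 550)
print(f"UD-Q2_K_XL average: ~{bpw:.2f} bits per weight")
```

That comes out to a bit under 3 bits per weight on average, which is consistent with a mixed-precision quant that keeps some tensors above 2 bits.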