Tech2Wild

148 posts

@Tech2Wild

🎮 Tech, gaming, AI, and everything in between. 🤖 Building with it, not just talking about it. 🔥 From the mind of @ToNYD2WiLD

Joined March 2026

59 Following · 99 Followers

Tech2Wild@Tech2Wild·1h

@otpyrcmaniac Oh, you're right there with me

Diez3l@otpyrcmaniac·2d

Nothing exploded yet 🤞🏼

Diez3l@otpyrcmaniac

Is this okay or will it explode?

Tech2Wild@Tech2Wild·1h

@0xyourfren That was me setting it up lol

yourfren@0xyourfren·15h

@Tech2Wild Two 3090s on top of each other, just sitting on the PSU outside the case, got me feeling a type of way

Tech2Wild@Tech2Wild·23h

TRIPLE 3090s…. Need a new MB and start building a RACK if I decide to go any further lol. Temporary rig for now I gotta get a rack 🤣🤣🤣

Tech2Wild@Tech2Wild·1h

This is an amazing price and deal.

GPTware@GPTWare

Canadian friends looking to do local AI. 👀👀👀 No affiliation. Just sharing a great deal. DYOR! ebay.ca/itm/2779751515…

Tech2Wild@Tech2Wild·1h

One issue with Opus since 4.7 that still hasn't been resolved: the agent sometimes does NOT use Telegram to respond. It goes silent there while still blabbing in the terminal. Even when told to always use the plugin to respond, it reverts to not responding in Telegram.

Tech2Wild@Tech2Wild·1h

test

Tech2Wild@Tech2Wild·8h

@netobge Can you send me that recipe?

bgeneto@netobge·8h

@Tech2Wild Qwen3.6 35B A3B AutoRound fits in a single 24GB GPU with 262K context and an fp8 KV cache, and runs at 160 tps on an RTX 3090 via vLLM... Produces much better code than Gemma 4 12B. Unfair to compare them.

Tech2Wild@Tech2Wild·20h

Hmm wondering which is better: Gemma 4 12B or Qwen 3.6 35B-A3? 🤔

Tech2Wild@Tech2Wild·8h

@YahiaAh87164950 @sakurayukiai 2 GPUs gave me more speed on 27B: it went from 70 tok/s to 120

LEO@YahiaAh87164950·9h

@Tech2Wild @sakurayukiai 2 gpu means more capacity not speed
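
For reference, the figures quoted in this exchange can be put into numbers. This is only a quick sketch: the function name is made up for illustration, and it assumes the 70 tok/s figure was measured on a single GPU and the 120 tok/s figure on two.

```python
def scaling(single_tps: float, multi_tps: float, n_gpus: int) -> tuple[float, float]:
    """Return (speedup, efficiency) when going from 1 GPU to n_gpus."""
    speedup = multi_tps / single_tps
    # An efficiency of 1.0 would mean perfect linear scaling.
    efficiency = speedup / n_gpus
    return speedup, efficiency


# Figures from the thread (assumed: 70 tok/s on 1 GPU, 120 tok/s on 2 GPUs).
speedup, efficiency = scaling(70, 120, 2)
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Under that assumption, the second card bought a real speedup on top of the extra capacity, though less than a doubling.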

Tech2Wild@Tech2Wild·15h

@princedoesai For agentic automation, where would you put that?

Prince does AI@princedoesai·17h

@Tech2Wild Gemma vs Qwen depends on workload

Tech2Wild@Tech2Wild·15h

@MakerInParadise Oh, so 12B talks more unprofessionally?

Master Builder@MakerInParadise·17h

@Tech2Wild I can’t say anything negative about either model other than that 12B’s native vernacular is too informal for my liking… it has grok4.1/deepseek sentence structure and punctuation. Otherwise, I think that 12B is the better chat model and 35B the better reasoner/researcher.

Tech2Wild@Tech2Wild·17h

@malikwas1f Good call. I've been running your recipes, bro. Thanks for what you do

noname@malikwas1f·17h

@Tech2Wild Set up ComfyUI on the third

Tech2Wild@Tech2Wild·19h

Got the 3rd GPU set up (3x 3090s) but no TP=3, so I'm running a separate model or a cloned 27B on the extra card. Been looking at Gemma 4 12B but honestly wondering if it's worth it when I can already run 27B or 35B at full context... What's your take? 🤔

Tech2Wild@Tech2Wild·17h

@sakurayukiai I have 35B running now. The issue I’m having is that 2 GPUs on 27B give me almost the same speed as 1 GPU on 35B

Sakura Yuki@sakurayukiai·17h

@Tech2Wild If you can fit the 35B footprint, Qwen is wild. Only 3B active params means it runs circles around Gemma's 12B dense decode speeds, but Gemma 4 is way friendlier on a single consumer GPU.
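
A back-of-the-envelope way to see the "3B active vs 12B dense" point: single-stream decode is usually limited by how many weight bytes have to be read per token. The sketch below is a rough ceiling only. It assumes ~4-bit weights (0.5 bytes per parameter) and the RTX 3090's spec-sheet memory bandwidth of 936 GB/s, and it ignores KV cache reads, attention compute, expert routing, and kernel overhead. The function name is hypothetical.

```python
def decode_ceiling_tps(active_params_b: float, bytes_per_param: float,
                       bandwidth_gbs: float) -> float:
    """Upper bound on tokens/s if every active weight is read once per token."""
    gb_read_per_token = active_params_b * bytes_per_param
    return bandwidth_gbs / gb_read_per_token


RTX_3090_BW_GBS = 936.0   # spec-sheet memory bandwidth
BYTES_PER_PARAM = 0.5     # assumption: ~4-bit quantized weights

moe_ceiling = decode_ceiling_tps(3, BYTES_PER_PARAM, RTX_3090_BW_GBS)     # 3B active
dense_ceiling = decode_ceiling_tps(12, BYTES_PER_PARAM, RTX_3090_BW_GBS)  # 12B dense

print(f"3B-active MoE ceiling: {moe_ceiling:.0f} tok/s")
print(f"12B dense ceiling:     {dense_ceiling:.0f} tok/s")
print(f"ratio:                 {moe_ceiling / dense_ceiling:.1f}x")
```

Real throughput lands well below these ceilings, but the ratio shows why a sparse model with few active parameters can decode faster than a smaller dense one, as long as the full 35B of weights still fits in memory.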

Tech2Wild@Tech2Wild·18h

@gospaceport Sir, I literally just watched your video on your Quad Build from 9 months ago 🙏🏽. Debating whether to go to Gen 5 or just grab one of the motherboards you showed and stay on Gen 4.

Digital Spaceport@gospaceport·18h

Soon

Tech2Wild@Tech2Wild

TRIPLE 3090s…. Need a new MB and start building a RACK if I decide to go any further lol. Temporary rig for now I gotta get a rack 🤣🤣🤣

Tech2Wild@Tech2Wild·19h

@sakurayukiai Going to look into this now.

Sakura Yuki@sakurayukiai·19h

@Tech2Wild Wrote a breakdown on how the mechanics and math of speculative decoding actually work: leetllm.com/learn/speculat…
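
As a pointer to the kind of math involved (not necessarily what the linked write-up covers): in the standard analysis of speculative decoding, if the draft model proposes gamma tokens per step and each is accepted independently with probability alpha, the expected number of tokens produced per target-model pass is (1 - alpha^(gamma+1)) / (1 - alpha). The sketch below also estimates wall-clock speedup given c, the cost of one draft pass relative to one target pass. The values of alpha, gamma, and c are illustrative assumptions, not measurements.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model pass (i.i.d. acceptance rate alpha)."""
    if alpha >= 1.0:
        return float(gamma + 1)  # every draft token accepted, plus the bonus token
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)


def expected_speedup(alpha: float, gamma: int, c: float) -> float:
    """Expected wall-clock speedup; c = draft-pass cost / target-pass cost."""
    return expected_tokens_per_step(alpha, gamma) / (gamma * c + 1)


# Illustrative values only: 80% acceptance, 4 drafted tokens, draft 10x cheaper.
tokens = expected_tokens_per_step(alpha=0.8, gamma=4)
speedup = expected_speedup(alpha=0.8, gamma=4, c=0.1)
print(f"{tokens:.2f} tokens per target pass, about {speedup:.2f}x speedup")
```

The takeaway is that the gain depends heavily on the acceptance rate: a draft model that rarely agrees with the target can make things slower, since the drafting cost is paid either way.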

Tech2Wild@Tech2Wild·1d

Now that I've learned so much about AI, I've realized how much false info content creators put out.

Tech2Wild@Tech2Wild·1d

@sakurayukiai Exactly. I'll stick with DeepSeek V4 Flash

Sakura Yuki@sakurayukiai·1d

@Tech2Wild Q2 perplexity hit on a 550B is so brutal you're basically paying a massive latency and hardware tax to get the reasoning of a solid 70B. Wild engineering flex, but the math is pretty unforgiving.

Tech2Wild@Tech2Wild·2d

Ran NVIDIA Nemotron-3-Ultra-550B fully local across 2 DGX Sparks (188GB split via llama.cpp RPC) 🤯 Findings: it works and it reasons, but at only ~5 tok/s, since RPC is round-trip-bound (dual-node is slower per token than a single node; it's a capacity play). But I question bigger≠better: 2-bit 550B barely tops a clean 4-bit ~285B. Can we agree?
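
The size arithmetic behind the 2-bit 550B vs 4-bit ~285B comparison can be checked quickly. This is a rough sketch with hypothetical helper names. It assumes the 188 figure is in GiB, that the parameter counts are exactly 550e9 and 285e9, and that "4-bit" means a flat 4 bits per weight with no scale or metadata overhead.

```python
GIB = 2 ** 30  # bytes per GiB


def bits_per_weight(file_size_gib: float, n_params: float) -> float:
    """Average bits stored per parameter for a quantized model file."""
    return file_size_gib * GIB * 8 / n_params


def size_gib(n_params: float, bits: float) -> float:
    """Approximate weight size in GiB at a flat bit width (no overhead)."""
    return n_params * bits / 8 / GIB


bpw_550b = bits_per_weight(188, 550e9)   # the ~188 GiB quant of the 550B model
four_bit_285b = size_gib(285e9, 4.0)     # a hypothetical flat 4-bit 285B model

print(f"550B at 188 GiB: {bpw_550b:.2f} bits per weight")
print(f"285B at 4 bits:  {four_bit_285b:.1f} GiB")
```

Under these assumptions the "2-bit" quant averages closer to 3 bits per weight, and a 4-bit ~285B model would need noticeably less memory than the 188 GiB the 550B quant takes.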

Tech2Wild@Tech2Wild·1d

@outsource_ Model: unsloth/NVIDIA-Nemotron-3-Ultra-550B-A55B-GGUF, UD-Q2_K_XL (~188 GiB, 6 shards)

Eric ⚡️ Building...@outsource_·2d

@Tech2Wild 🔥🔥 need quant
