eclectic leaps

1.8K posts

@eclecticleaps

Eclectic interests including AI, media and complex systems. Making a splash is our purest form.

Joined June 2008

983 Following, 55 Followers

eclectic leaps@eclecticleaps·1d

@mr_hari75249 Your ladder is dangerously vertical - it needs a 4:1 slope. If you lean back slightly or hold out a heavy object, the levered weight can cause the ladder to fall backwards. And the person watching below should be a spotter holding the ladder, protecting against this or sideways tipping.

166

🇺🇲Wesker🇺🇲@mr_hari75249·1d

This was no joke—we went 2-3 stories above ground to fix a badly blocked gutter system. 💪

3.9K
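
The 4:1 slope mentioned in the reply above is simple arithmetic: for every 4 units of height to the point where the ladder rests, the feet sit 1 unit out from the wall. A minimal sketch follows; the function names and the 6 m example height are illustrative assumptions, not values from the posts.

```python
import math


def ladder_base_offset(support_height: float, ratio: float = 4.0) -> float:
    """Horizontal distance from the wall to the ladder feet for a rise:run ratio."""
    return support_height / ratio


def ladder_angle_degrees(ratio: float = 4.0) -> float:
    """Angle between the ladder and the ground for a rise:run ratio."""
    return math.degrees(math.atan(ratio))


if __name__ == "__main__":
    height_m = 6.0  # hypothetical support height, not taken from the post
    print(f"base offset: {ladder_base_offset(height_m):.2f} m")
    print(f"angle to ground: {ladder_angle_degrees():.1f} degrees")
```

A 4:1 ratio works out to roughly 76 degrees from the ground, which is why a near-vertical ladder is easy to lever backwards.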

eclectic leaps@eclecticleaps·1d

@ArrayEmpty @UnslothAI @no_stp_on_snek @_HermesAgent @MajorFAFO @ASUS That GUI looks outstanding! Is it on GitHub?

emptyArray@ArrayEmpty·2d

I'm finally happy with my @UnslothAI unsloth/Qwen3.6-35B-A3B-MTP-GGUF, @no_stp_on_snek llama.cpp-turboquant, @_HermesAgent setup, @MajorFAFO params suggestions, and my own Rust/SQL/Hermes app.
- @ASUS, 32GB RAM, 8GB VRAM 5070 laptop
- MoE fully offloaded to CPU
- I've fiddled around with --no-mmap and mlock. It pegs my RAM to 95% instead of 45% and after it spins up, I don't see a difference.
--cache-type-k q8_0 \
--cache-type-v turbo3 \
-ot "\\.ffn_.*_exps\\..*=CPU" \
--spec-type draft-mtp \
--spec-draft-n-max 3 \
The video shows it has full hermes skills/toolset access easily, knows my system, can be creative, can print out a quick BASIC function, corrected itself when I told it that it was incorrect, and that it doesn't love me....

180

10.9K

eclectic leaps@eclecticleaps·5d

@canalCCore2 Sounds a lot like me in the morning. But once I get some caffeine in me and it is circulating, it is all smooth running and high output!

caio temer@canalCCore2·5d

Many people think the noise of an old engine is a sign that the machine is about to break down, but this V16 monster needs exactly this violence to start. The sound is frightening at first, but the perfect rhythm it settles into within seconds shows the precision of this brutal engineering.

306

3.2K

232.6K

eclectic leaps@eclecticleaps·5d

@nic_carter Nice! The story that data centers are pushing up electricity prices never makes sense, because the data center gets charged the full capex of grid upgrades as part of its permission to interconnect. Plus it more fully utilizes existing capacity, cutting fixed costs charged to consumers.

nic carter@nic_carter·5d

"AI is hiking your energy bill" is the most popular political talking point of 2026. The data doesn't support it. A thread:

112

322

1.9K

271.7K

eclectic leaps@eclecticleaps·5d

@tobi I've been doing the same thing with AutoResearch across multiple topic areas using Opus as the advisor and quantized (unsloth Q5) Qwen3.5-27B, Qwen3.6-27B or Gemma4-31B locally on a 5090 under llama-server. Works great!

130

tobi lutke@tobi·5d

I’ve had very good results running autoresearch with local qwen 3.6 26b model as long as I had a simple vibed pi “advisor” extension that allowed it to periodically ask GPT 5.5 for ideas. I think this direction has a lot of merit.

175.9K

eclectic leaps@eclecticleaps·5d

@tpm_28 Assuming there are unused slot(s) under those multi-slot GPUs, get PCIe extender cables.

TPM-28@tpm_28·10 May

It’s ridiculous… I no longer have any free PCIe slots on my two servers now

674

eclectic leaps@eclecticleaps·6d

@Snixtp I think that 7960x has 4-channel memory, so 4 DIMMs would almost quadruple (~3.8x) your memory bandwidth. If you have 8 slots, adding 7 more DIMMs to your 1 now only takes you to a bit more than 4 DIMMs would (maybe ~4.3x). So add 3 DIMMs for a faster CPU, and faster GPUs if DRAM transfer is a bottleneck.

Espen JD@Snixtp·16 May

Got a few questions earlier this week about what hardware I'm running. This is my current setup:
CPU: Threadripper 7960x
MB: Gigabyte TRX50 Aero D
RAM: Kingston DDR5 5200MHz 1x64GB CL42 ECC
PSU: Cooler Master V Platinum V2 1600W
Cooler: Enermax Liqtech Xtr 360 AIO
Drives: 512GB, 1TB, 2TB, 4TB (all M.2 NVMe Gen 4)
GPUs: RTX Pro 6000 Blackwell, 2x 3090, 4080 Super, 5070
GPU risers and case are from AliExpress and Amazon.
Everything except the Pro 6000 and 4080 Super was bought used.
The 5070 is just lying on the side waiting to be replaced with another 3090 (if the 5070 is popular I can definitely do some speed or efficiency tests on it).
The 4080 Super is currently in my gaming rig, but that isn't used a lot anymore so I'm thinking of getting rid of it.
I have a spare 750W PSU I'm going to use when running all 5 GPUs at the same time, but I'm waiting for 3 more GPU risers and PCIe splitters before I get to do that.
Ideally I should get a new motherboard with 5 PCIe slots, but that will have to wait.

4.9K
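
The memory-channel argument in the reply above can be checked against the theoretical ceiling. The sketch below assumes a 64-bit (8-byte) data path per channel and takes the DDR5-5200 rate from the quoted hardware list. It gives peak figures only; the ~3.8x in the reply is the poster's real-world estimate, not something this code derives.

```python
def peak_bandwidth_gbs(transfer_rate_mts: int, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    transfer_rate_mts: transfers per second in millions (5200 for DDR5-5200).
    bus_bytes: bytes moved per transfer per channel (assumed 8, i.e. 64 bits).
    """
    return transfer_rate_mts * bus_bytes * channels / 1000


if __name__ == "__main__":
    one = peak_bandwidth_gbs(5200, channels=1)
    four = peak_bandwidth_gbs(5200, channels=4)
    print(f"1 channel:  {one:.1f} GB/s")
    print(f"4 channels: {four:.1f} GB/s ({four / one:.1f}x)")
```

Populating all four channels raises the ceiling from about 41.6 GB/s to about 166.4 GB/s, which matters most when MoE layers are offloaded to system RAM.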

eclectic leaps@eclecticleaps·16 May

@Jason Nope. 3.5x sales is an insane customer acq. cost. 0.25-0.5x maybe. Cust. can be acquired for far less than 3.5x. 20x for hi risk FCF also silly. It's Uber that urgently needs auto-driving to avoid fast erosion of their driver-customer match platform. Waymo scaling very fast now!

@jason@Jason·16 May

Uber is going to be bought by Google/Waymo, Amazon or Tesla/SpaceX in the next year.
For a “buy it now” price of $250b, one of those three companies gets a $12b a year free cash flow machine with $70b in revenue, and hundreds of millions of global customers.
This is the most obvious M&A deal since Instagram, Android and YouTube transformed Meta and Google.
Discuss

1.3K

256

6.4K

1.8M
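
The "3.5x sales" and "20x" figures in the reply follow directly from the numbers in the quoted post ($250b price, $70b revenue, $12b free cash flow). A small sketch of that arithmetic; the function name is an illustrative choice.

```python
def multiple(price: float, metric: float) -> float:
    """Valuation multiple: price divided by an annual financial metric."""
    return price / metric


PRICE_B = 250.0    # "buy it now" price from the quoted post, in $ billions
REVENUE_B = 70.0   # annual revenue from the quoted post
FCF_B = 12.0       # annual free cash flow from the quoted post

if __name__ == "__main__":
    print(f"price / sales: {multiple(PRICE_B, REVENUE_B):.2f}x")
    print(f"price / FCF:   {multiple(PRICE_B, FCF_B):.1f}x")
```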

eclectic leaps@eclecticleaps·16 May

@mcuban So dumb! US plan to move AI offshore, just like chips & manufacturing! It is AI agents using massive tokens, not humans. AI Agents will move offshore; devs follow. Like chips did. Data centers already focus on optimization & energy. It is why they pay inference eng multi-mil comp.

Mark Cuban@mcuban·16 May

We should federally tax Tokens at the Provider level. Not a lot. Less than 50c per million tokens. It will accomplish 4 things (at least):
1. It will push the big AI players to optimize tokenization, caching, routing and localization. Which will
2. Reduce energy usage. Saving them in energy costs more than what they paid in tax and reducing strain created by the growth in energy consumption. Which will
3. Generate maybe 10 billion dollars a year to start, but over the next ten years could grow 30x to 100x. Which will
4. Create a source of funding to pay down the federal debt or deploy, in response to the things AI brings that we don’t expect or don’t like.
At some point the models will pass it on to customers. Of course. That’s ok. Customers will have the ability to choose between providers. Or to do everything using open source models locally.
Thoughts?

2.2K

263

4.1K

1.2M
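
The proposal above implies a token volume that can be backed out from its own figures. This sketch assumes the tax sits at the stated upper bound of 50c per million tokens (the post says "less than", so real volumes would need to be higher) and uses the post's $10 billion starting revenue and 30x to 100x growth range.

```python
def tokens_for_revenue(revenue_usd: float, tax_per_million_usd: float) -> float:
    """Taxed tokens per year needed to raise revenue_usd at the given rate."""
    return revenue_usd / tax_per_million_usd * 1_000_000


TAX_PER_MILLION = 0.50   # assumed: the post's upper bound ("less than 50c")
START_REVENUE = 10e9     # "maybe 10 billion dollars a year to start"

if __name__ == "__main__":
    for growth in (1, 30, 100):
        revenue = START_REVENUE * growth
        tokens = tokens_for_revenue(revenue, TAX_PER_MILLION)
        print(f"{growth:>3}x: ${revenue:,.0f}/yr needs {tokens:.1e} taxed tokens/yr")
```

At the upper-bound rate, $10 billion a year corresponds to about 2e16 taxed tokens a year.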

eclectic leaps@eclecticleaps·14 May

@stevibe Thanks! What also critically matters is accuracy. Some benchmarks on each would help qualify the speed benefits. I expect there will be a very rapid falloff in accuracy at some point.

113

stevibe@stevibe·14 May

Wondering which quant is right for you? I ran Unsloth's Qwen3.6-27B MTP across the full range: Q2 through Q8 (all _K_XL variants) on my DGX Spark.
> UD-Q2_K_XL: 23.95 tok/s, 261ms TTFT
> UD-Q3_K_XL: 22.12 tok/s, 286ms TTFT
> UD-Q4_K_XL: 19.51 tok/s, 307ms TTFT
> UD-Q5_K_XL: 17.74 tok/s, 363ms TTFT
> UD-Q6_K_XL: 16.30 tok/s, 381ms TTFT
> UD-Q8_K_XL: 12.07 tok/s, 444ms TTFT
Q2 runs nearly 2x faster than Q8, and TTFT climbs steadily with precision. Even if you're not on a DGX Spark, the relative gap between quants should hold on your setup.

10.9K
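
The "nearly 2x" claim can be reproduced from the figures in the post. The sketch below uses only the reported throughput and TTFT numbers; the dictionary layout and function name are illustrative. It says nothing about accuracy, which is the gap the reply points out.

```python
# (tokens/s, TTFT in ms) as reported in the post for Qwen3.6-27B MTP on a DGX Spark
RESULTS = {
    "UD-Q2_K_XL": (23.95, 261),
    "UD-Q3_K_XL": (22.12, 286),
    "UD-Q4_K_XL": (19.51, 307),
    "UD-Q5_K_XL": (17.74, 363),
    "UD-Q6_K_XL": (16.30, 381),
    "UD-Q8_K_XL": (12.07, 444),
}


def relative_to(baseline: str, name: str) -> tuple[float, float]:
    """Return (throughput ratio, TTFT ratio) of `name` versus `baseline`."""
    base_tps, base_ttft = RESULTS[baseline]
    tps, ttft = RESULTS[name]
    return tps / base_tps, ttft / base_ttft


if __name__ == "__main__":
    for quant in RESULTS:
        speed, ttft = relative_to("UD-Q8_K_XL", quant)
        print(f"{quant}: {speed:.2f}x throughput, {ttft:.2f}x TTFT vs Q8")
```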

eclectic leaps@eclecticleaps·13 May

@NYPDnews Uncertified batteries in ebikes, escooters, eskateboards have caught fire & literally burned down an entire block (e.g. 200 engine NY block fire), killed families & other horrific damage. Great job NY getting rid of these! Lots of YouTubes: youtube.com/watch?v=xbCav3…

763

NYPD NEWS@NYPDnews·13 May

Consider these crushed.

117

190

93.2K

NYPD NEWS@NYPDnews·13 May

The NYPD is crushing it when it comes to taking illegal mopeds and scooters off our streets. So far this year, the NYPD has seized over 5,700 of these dangerous, illegal vehicles — and we are not letting up.

3.8K

690

7.2K

4.4M

eclectic leaps@eclecticleaps·13 May

@danielhanchen Excellent! Thanks! And this may help push it to 4+ draft tokens - the paper came out yesterday: Attention Drift: What Autoregressive Speculative Decoding Models Learn arxiv.org/pdf/2605.09992

385

Daniel Han@danielhanchen·13 May

We released experimental MTP Qwen3.6 Unsloth GGUFs! Qwen3.6 27B MTP now runs at 140 tokens/s. Qwen3.6 35B-A3B MTP gets 220 tokens/s generation on a single GPU. Qwen3.6 27B and 35B-A3B have >1.4x speed-up over the original GGUFs without any change in accuracy.
Guide + GGUFs + Benchmarks: unsloth.ai/docs/models/qw…
In terms of average speedup, we see a 1.4x for dense models at draft tokens = 2 and for the MoE around 1.15 to 1.2x. We do not recommend more than 2 draft tokens because the acceptance rate drops precipitously from 83% to 50% with 4 draft tokens, and the forward passes for MTP become less beneficial. Use `--spec-type mtp --spec-draft-n-max 2`
Thanks to Aman for github.com/ggml-org/llama…!

117

789

122.2K
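
One way to see why more draft tokens can stop paying off is the standard expected-length estimate for speculative decoding. The sketch below assumes every drafted token is accepted independently with the same probability, and that the post's 83% and 50% figures can be read as that per-token probability. Both are simplifying assumptions; the post does not say how its acceptance rate is measured.

```python
def expected_tokens_per_step(accept_prob: float, draft_tokens: int) -> float:
    """Expected tokens emitted per verification pass.

    Assumes i.i.d. per-token acceptance: drafted tokens are kept until the
    first rejection, and the verifying model always contributes one token.
    """
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be in [0, 1]")
    if accept_prob == 1.0:
        return float(draft_tokens + 1)
    return (1 - accept_prob ** (draft_tokens + 1)) / (1 - accept_prob)


if __name__ == "__main__":
    print(f"2 drafts at 83%: {expected_tokens_per_step(0.83, 2):.2f} tokens/step")
    print(f"4 drafts at 50%: {expected_tokens_per_step(0.50, 4):.2f} tokens/step")
```

Under these assumptions, four drafts at 50% acceptance yield fewer expected tokens per step than two drafts at 83%, before counting the extra drafting cost.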

eclectic leaps@eclecticleaps·12 May

@stevibe It would be very interesting to see Qwen3.6 35B-A3B (an MoE model) versus Qwen3.6 27B (a dense model). The dense model generally does better.

129

stevibe@stevibe·12 May

Six open-source LLMs. One sliding puzzle. A brutal test of long-horizon reasoning and tool calling. Five of them broke. One didn't.
I gave each model a move_tile tool and a scrambled 3×3 board, then asked it to solve the puzzle through pure turn-by-turn reasoning. The deeper the scramble, the more brutal the search. Five runs per depth, best run kept. A model fails the round if it exceeds 6x the optimal move count.
> Depth 5: Everyone solves it. Yawn.
> Depth 10: GLM 5.1 melts down. 43 moves. Cut.
> Depth 12: Gemma4 26B loses the plot, shuffling tiles in circles. Gone.
> Depth 15: The wall. DeepSeek V4 Flash, out. DeepSeek V4 Pro, out. Gemma4, out again. GLM 5.1, out. Two survivors: Qwen3.6 35B-A3B, and Kimi K2.6 with an 11-move solve that looked like cheating.
> Depth 18: Same two. Everyone else hallucinating tiles that weren't there.
> Depth 22: Final boss. Kimi, flawless for five rounds, finally cracks. 81 moves. Still scrambled. DeepSeek V4 Pro limps home at 90. Qwen3.6 35B-A3B solves it in 36.
The smallest model in the room. ~3B active params. Fits on a single 3090. It beat everything. Kimi was elegant. Qwen3.6 was unstoppable.

212

16.3K
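
The "6x the optimal move count" rule above needs an optimal count for each scrambled board. For a 3×3 puzzle that can be found with a plain breadth-first search. The sketch below is one way to do it; the tuple encoding, the goal layout and the function names are assumptions, and the post does not describe how its harness actually computes the optimum.

```python
from collections import deque

# Row-major board, 0 marks the blank. This goal layout is an assumption.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def neighbors(state):
    """Yield every board reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            swap = r * 3 + c
            board = list(state)
            board[blank], board[swap] = board[swap], board[blank]
            yield tuple(board)


def optimal_moves(start, goal=GOAL):
    """Shortest solution length via BFS, or None if the board is unsolvable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, depth = queue.popleft()
        for nxt in neighbors(state):
            if nxt == goal:
                return depth + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def passes_round(moves_used, optimal, limit_factor=6):
    """Apply the post's rule: fail if moves exceed limit_factor x optimal."""
    return moves_used <= limit_factor * optimal


if __name__ == "__main__":
    board = (1, 2, 3, 4, 5, 6, 0, 7, 8)  # hypothetical scramble
    print("optimal moves:", optimal_moves(board))
```

Note that a scramble of depth N has an optimal solution of at most N moves, so the depth labels in the post are upper bounds on the optimum rather than the optimum itself.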

eclectic leaps@eclecticleaps·12 May

@mathelirium And before that: 'A "Neural-Gas" Network Learns Topologies' Thomas Martinetz & Klaus Schulten, ANN 1991. And later: "An incremental growing neural gas learns topologies" Yann Prudent & Abdellatif Ennaji, IEEE IJCNN 2005. etc.

497

Mathelirium@mathelirium·12 May

A Neural Network Can Grow New Neurons Where It Is Confused? In 1994, Bernd Fritzke published A Growing Neural Gas Network Learns Topologies. He introduced a network that starts small, follows incoming data, and inserts new neurons where its error is highest. In the animation, the fog is the drifting data. The glowing nodes are neurons. The fibers are learned connections. The network grows into a living skeleton of the manifold.

420

37.9K

eclectic leaps@eclecticleaps·12 May

@malikwas1f Using llama-server, I used to use unsloth/Qwen3.6-27B-GGUF:UD-Q5_K_XL with context 262144 on a 5090 (32GB) but now I get slightly better quality results (doing mostly coding) using unsloth/gemma-4-31B-it-GGUF:UD-Q5_K_XL but am limited to context 180000. Both are great local LLMs.

114

noname@malikwas1f·12 May

🛠️ Everything here came from the same 5-phase harness: bench → verify-stress → quality-full → soak → aider-polyglot-30
~1–2h per leg on 2× RTX 3090.
Repo: github.com/noonghunna/clu…
If you’ve got local inference hardware, I’d genuinely love replication data 🤝
Especially:
• 5090 rigs
• NVLink setups
• MI300
• mixed-GPU configs
• long-agent workloads

557

noname@malikwas1f·12 May

Ran Gemma 4-31B and Qwen 3.6-27B head-to-head on the same dual-3090 rig. Same vLLM nightly. Same harness. Same ctx. Same workloads.
What surprised me wasn’t which model won. It was where each one broke away.
🟢 Gemma dominates single-turn UX
🔵 Qwen dominates long-running agents
The split was way cleaner than I expected 🧵

119

10.7K

eclectic leaps@eclecticleaps·12 May

@sudoingX More tests between gemma 4 31B and Qwen 3.6 27B (dual 3090) - winner depends on workload x.com/malikwas1f/sta…

noname@malikwas1f

230

Sudo su@sudoingX·11 May

i declare qwen 3.6 27b dense q4 the king of a single rtx 3090 card. not even close. this model is absolute beast on local ai, ruthless on agentic loops, owns its own thinking. anyone can use it on single 3090, the weights are open, the stack is reproducible, the prompt is canonical, every claim below is verifiable on your own hardware. the octopus invaders one shot you are seeing is the visible test. i run these models on workloads you wouldn't think to ask for and i couldn't show you if i wanted to, and qwen 3.6 27b dense q4 quietly does the heavy lifting on a single consumer card while the rest of the field is busy explaining why it cannot. if you think a different model is king on a single 3090 right now, name it. drop your card, drop your model, drop your numbers. the throne is not crowded.

Sudo su@sudoingX

update: qwen 3.6 27b dense q4 just one shotted octopus invaders game on a single 3090. hermes agent drove the whole thing, ~41 tok/s gen 21gb vram at full 262k context, thinking mode on. one prompt in and the canonical multi-file space shooter benchmark out, the same exact prompt i ran on qwen 3.5 27b dense back in march on the same card. 3.5 needed one external scope bug fix before the game would even load on first play. 3.6 needed nothing. 11 of 11 files written, 2411 lines of code, zero steering interventions, zero external fixes, playable on first load. 16 minutes 41 seconds wall clock from prompt to playable. consumer tier king on a single 3090 is locked tonight, and the silicon underneath my desk did not change between march and now. the open source ecosystem just moved the floor. watch it ship itself, the full 16 minutes 41 seconds sped to 3 minutes 45, no human touched the keyboard between the first prompt and the final frame.

497

41.1K

eclectic leaps@eclecticleaps·12 May

@sudoingX Using llama-server, I used to use unsloth/Qwen3.6-27B-GGUF:UD-Q5_K_XL with context 262144 on a 5090 (32GB) but now I get slightly better quality results (doing mostly coding) using unsloth/gemma-4-31B-it-GGUF:UD-Q5_K_XL but am limited to context 180000. Both are great local LLMs.

162

eclectic leaps@eclecticleaps·10 May

@tetsuoai I watched his entire semester on my cellphone* while walking around our neighborhood. Excellent! (* see his videos for how utterly ironic this is.)

284

tetsuo@tetsuoai·10 May

The most important equation in statistics is the Normal Equations AᵀA x̂ = Aᵀb. It's the foundation of linear regression. Professor Gilbert Strang, MIT.

139

948

68.4K
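
For a straight-line fit, the normal equations AᵀA x̂ = Aᵀb quoted above reduce to a 2×2 system that can be solved by hand. The sketch below does that in plain Python for the model b ≈ c + d·t. The function name and the three sample points are illustrative choices, not taken from the post.

```python
def fit_line(ts, bs):
    """Least-squares line b = c + d*t via the 2x2 normal equations.

    With A = [[1, t_i]], AᵀA = [[n, Σt], [Σt, Σt²]] and Aᵀb = [Σb, Σtb].
    """
    n = len(ts)
    sum_t = sum(ts)
    sum_tt = sum(t * t for t in ts)
    sum_b = sum(bs)
    sum_tb = sum(t * b for t, b in zip(ts, bs))

    det = n * sum_tt - sum_t * sum_t  # determinant of AᵀA
    if det == 0:
        raise ValueError("need at least two distinct t values")

    # Solve by applying the explicit inverse of the 2x2 matrix AᵀA.
    c = (sum_tt * sum_b - sum_t * sum_tb) / det
    d = (n * sum_tb - sum_t * sum_b) / det
    return c, d


if __name__ == "__main__":
    # Illustrative data: points (0, 6), (1, 0), (2, 0)
    c, d = fit_line([0, 1, 2], [6, 0, 0])
    print(f"best-fit line: b = {c} + {d} t")
```

For larger problems, forming AᵀA explicitly squares the condition number, so library least-squares routines based on QR or SVD are usually preferred.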

eclectic leaps@eclecticleaps·6 May

@jukan05 Look at leading edge users today doing reasonable things with today's top agents e.g. x.com/doodlestein/st… That is just one project. Will be common enterprise agent project use in 3 yrs. Forecast seems wildly understated.

Jeffrey Emanuel@doodlestein

@sama I’m easily in the hundreds of billions of tokens for this project: asupersync.com

248

Jukan@jukan05·6 May

I read Goldman Sachs’ AI report, and I was genuinely impressed. The core insight is as follows: Agentic AI could turn AI from a capex-heavy cost burden into a business where usage growth drives margin expansion. As token costs fall, more complex agents become economically viable. These agents then consume far more tokens through longer context windows, repeated reasoning loops, validation, tool use, and always-on background monitoring. This increase in token usage improves infrastructure utilization, strengthens unit economics, and gives hyperscalers and model providers more room to reinvest in model quality, distribution, and capacity. In other words, the bull case for AI capex is not simply that usage will grow. It is that this usage growth can increasingly flow through at attractive incremental margins. Goldman Sachs argues that this margin inflection is beginning to appear from 2026 onward.

Jukan@jukan05

We have only just entered the early innings.

105

456

2.9K

502.7K

eclectic leaps@eclecticleaps·6 May

@elon_lit Still reading paper in detail. Seems a lot of dimensions are just reservoirs for noise. Does the reservoir concept suggest a trained model could be shrunk a lot by eliminating some or all of the reservoir noise dimensions from a reduced model, while still retaining accuracy?

537

Elon Litman@elon_lit·5 May

We developed a unified theory of generalization in deep learning. It explains grokking, double descent, benign overfitting, and implicit bias. But theory is only half the story. It turns out that optimizing the population risk of any neural network amounts to a small change to your optimizer. 🧵

128

74.9K
