SimboKlice

4.3K posts

SimboKlice

@SimboKlice1

bim bop boom.

Katılım Nisan 2021

2.2K Takip Edilen1.7K Takipçiler

SimboKlice@SimboKlice1·2d

At this point I gotta learn Mandarin. America is becoming a shell of itself. Yaw think China go room for 1 more?

English

SimboKlice@SimboKlice1·25 Nis

Lebron lost this game with back to back turnovers in the last 1:45 seconds of the 4th quarter

English

134

SimboKlice@SimboKlice1·25 Nis

@kevinblue345 I need bro in Cali.

English

kevin blue@kevinblue345·23 Nis

We NEED more black people in general in healthcare because WHITE providers couldn’t care less. If you’re offended, kick rocks. 🫶🏾

English

243

1.5K

6.5K

64.6K

SimboKlice@SimboKlice1·22 Nis

@gkisokay Now you gotta do a 64gb line up

English

Graeme@gkisokay·22 Nis

The Local LLM Cheat Sheet for your 32GB RAM device I was asked to put together a practical lineup of local models that fit comfortably on a 32GB machine. At this tier, you start getting access to real flagship-class local models, plus a growing number of custom quants. But for most people, these are the core models worth knowing first. Flagship Models Qwen3.5 27B / GGUF / Q6_K_M The best overall 32GB flagship. General chat, writing, research, and agent workflows. Great if you want one model that can handle almost everything well. Qwen3.6-35B-A3B / GGUF / UD-Q4_K_M Best MoE flagship. Stronger for coding, reasoning, and tool use than most smaller generalists. Gemma 4 31B / GGUF / Q6_K_M Dense premium model. Writing, analysis, reasoning, and high-end local chat. Heavier than the MoE options, but excellent when quality matters more than speed. Models for Fast Flagship Use Gemma 4 26B A4B / GGUF / Q6_K_M Great balance of speed and quality for general assistant work, coding, agent tasks, and research. This is one of the best 32GB picks if you want something that feels high-end without dragging. DeepSeek-R1 Distill Qwen 32B / GGUF / Q4_K_M Offline reasoning engine. Best for math, logic, deliberate analysis, and step-by-step problem solving. Mistral Small 24B / GGUF / Q6_K_M Tool-calling specialist. Strong for assistants, chat workflows, local business tasks, and function calling. Available for 24GB machines. Models for Companion Use Qwen3.5 9B / GGUF / Q6_K_M Best sidekick. Fast drafts, search loops, cheap retries, and secondary agent work. Even on a 32GB machine, you still want a smaller model around for support tasks. Llama 3.1 8B / GGUF / Q6_K_M Long-context companion. RAG, doc ingestion, codebase chat, and long prompts. The output quality is not the sharpest anymore, but it is still useful when needing simple tasks fast. From what my community tells me, the best single models are Qwen3.5 27B or Gemma 4 31B. For two models, the strongest general pairing is Qwen3.5 27B + Qwen3.5 9B. If you are more code-heavy, Qwen3.6-35B-A3B + Llama 3.1 8B. Let me know what models you are running on 32GB, and which ones have actually been worth the RAM.

Graeme@gkisokay

The Local LLM cheat sheet for your 16GB RAM device I pulled together a lineup of small models that can run comfortably on a Mac Mini or personal laptop while still leaving room for context without melting your machine. Models for Daily Use Qwen3.5 9B / GGUF / Q4_K_M Daily driver. General chat, drafting, research, translation. If you're keeping only one, keep this. DeepSeek-R1 Distill Qwen 7B / GGUF / Q4_K_M Reasoning engine. Math, logic, step-by-step problems. Slower, but worth it when you need actual thinking. Models for Specialty Work Qwen2.5 Coder 7B / GGUF / Q4_K_M Code specialist. Completions, refactors, debugging, repo Q&A. Better than a generalist when the task is code. Llama 3.1 8B / GGUF / Q4_K_M Long context worker. RAG, doc chat, codebase Q and A. The output isn't top tier, but the context is strong for its size. Phi-4 Mini Reasoning / GGUF / Q4_K_M Compact thinker. Logic, structured answers, math, and short coding bursts. Smaller context is the catch. Models for Efficiency Gemma 4 E4B / GGUF / Q4_K_M Light all-rounder. Writing, chat, light agents, structured output. Phi-3.5 Mini / GGUF / Q5_K_M Pocket sidekick. Summaries, extraction, background doc chat. Easy to pair with a bigger model. Qwen3.5 2B / GGUF / Q4_K_M Useful for summaries, tagging, rewrites, and lightweight sidekick work. Micro Models Qwen3.5 0.8B / GGUF / Q5_K_M Classification, keyword routing, binary decisions, triage. Gemma 4 E2B-it / GGUF / Q4_K_M Lightweight chat, quick Q and A, summaries, tiny agents. My personal choice for a single model is Qwen3.5 9B For two models use Qwen3.5 9B + Qwen2.5 Coder 7B for code, or Qwen3.5 9B + Phi-3.5 Mini for support tasks. Let me know in the comments your experience with these models, or any I have left out.

English

348

2.1K

305.9K

SimboKlice@SimboKlice1·15 Nis

@nukeru_gofile Shiori Tsukada missing so list seems incomplete

Eesti

3.2K

SimboKlice@SimboKlice1·11 Nis

@Enywealthh_1 “…get back come with time” -Posted At by Lil Durk

English

167

Eniola@Enywealthh_1·10 Nis

A perfect comeback 🤣🤣

Eniola@Enywealthh_1

She's a keeper 🤣🤣

English

558

3.3K

198.7K

SimboKlice@SimboKlice1·9 Nis

@LeakerApple Incoming… MacBook Neo+ with A19Pro

English

810

AppleLeaker@LeakerApple·8 Nis

Apple has reportedly underestimated the sales of the MacBook Neo and now they're in a pickle. MacBook Neo has been selling extremely well, and usually that would be a good thing. But the problem is that it uses binned A18 Pro chips (1 fewer GPU core). Apple only had 5-6 million leftover binned A18 Pro chips, which were unfit for the iPhone, making them essentially "free" for the Neo. This helped keep costs low while maintaining margins. However, with the high demand for the Neo, Apple is on track to run out of the binned chips. They will either have to restart production of the A18 Pro and pay a premium to TSMC (tanking profit margins), stop selling the Neo entirely, or release the A19 Pro MacBook Neo early. Although the last option is unlikely (and could result in the same issue), it is definitely possible.