Santiago Rodríguez Díaz

28 posts

@sonriks6

Data Scientist

Spain · Joined September 2025
58 Following · 0 Followers
Aryan
Aryan@justbyte_·
Is VS Code still the default choice for developers?
English
65
2
72
14.6K
Santiago Rodríguez Díaz
@github Linux AppImage performance is terrible! (running Fedora 44 on a more than capable machine :))
English
0
0
0
174
GitHub
GitHub@github·
The GitHub Copilot app is now generally available. 🙌 The new home base for your work. Pick up what's next, direct agents in parallel, and land your PRs, all in one place. ⬇️ github.blog/changelog/2026…
English
79
144
880
185.9K
Santiago Rodríguez Díaz
Santiago Rodríguez Díaz@sonriks6·
@AIPandaX It’s a shame but I literally do all this whenever I start with a new TV (friends and family included) 🤦🏻‍♂️
English
0
0
0
281
AI Panda
AI Panda@AIPandaX·
A guy was ready to drop $1,500 on a new OLED TV because his 3-year-old Smart TV was freezing up and took 5 seconds just to respond to the remote. He unplugged it. Deleted old apps. Cleared the cache. The lag kept coming back. He went to Best Buy to get a replacement. The home theater installer in the blue shirt stopped him: "Before you spend a grand, let me show you something." He grabbed a remote and shook his head. "There are 8 hidden tracking settings throttling your TV's processor right now. Manufacturers turn them all on by default. Nobody tells you they exist. Let's fix this." Here's what he showed him in the next 8 minutes. 🧵
English
202
2.1K
13.5K
2.9M
Santiago Rodríguez Díaz retweeted
ollama
ollama@ollama·
Gemma 4 Quantization-Aware Training (QAT) weights are now available on Ollama! They reduce memory requirements while maintaining model quality.
E2B: ollama run gemma4:e2b-it-qat
E4B: ollama run gemma4:e4b-it-qat
12B: ollama run gemma4:12b-it-qat
26B: ollama run gemma4:26b-a4b-it-qat
31B: ollama run gemma4:31b-it-qat
Try them with ollama launch integrations to use with your favorite tools 👇👇👇
Google Gemma@googlegemma

We just dropped Gemma 4 Quantization-Aware Training (QAT) checkpoints on Hugging Face! All Gemma 4 model sizes and their drafters are now optimized with QAT to cut memory requirements and maximize on-device performance!

English
42
160
1.5K
110.7K
Santiago Rodríguez Díaz retweeted
Ehsan
Ehsan@acadictive·
Dear GitHub Copilot team, I am happy to announce that I successfully burned all of my monthly tokens in under 3 days thanks to your garbage new pricing model. I'd also like to inform you that I won't be renewing my subscription or adding more budget. Best, A former customer.
English
310
266
3.8K
732.1K
Dmitry Lyalin
Dmitry Lyalin@LyalinDotCom·
If you're waiting for Gemma 4 12B through @ollama, it's here:
gemma4:12b
gemma4:12b-it-q4_K_M
gemma4:12b-it-q8_0
gemma4:12b-it-bf16
gemma4:12b-mlx (MLX)
gemma4:12b-mlx-bf16 (MLX)
gemma4:12b-mxfp8 (MLX)
gemma4:12b-nvfp4 (MLX)
ollama.com/library/gemma4…
You'll need Ollama version 0.30.4-rc0
English
11
29
302
25.2K
NVIDIA GeForce
NVIDIA GeForce@NVIDIAGeForce·
Over 1,000 RTX games and apps are available now with ray tracing and DLSS. To celebrate, we're dropping Steam cash in the replies to upgrade your library with a #RTXOn title... Comment #RTXOn to enter 💸
English
25.4K
1.1K
9.2K
1.3M
MisterChip (Alexis)
MisterChip (Alexis)@2010MisterChip·
What do you think of the new kickoff time for the Champions League final?
Spanish
2.2K
64
3.5K
1.4M
NVIDIA AI
NVIDIA AI@NVIDIAAI·
A new era of PC. 25.0528, 121.5990
English
373
516
9.3K
1.7M
Santiago Rodríguez Díaz
Santiago Rodríguez Díaz@sonriks6·
@witcheer The price of ‘freedom’: a $3,000+ investment and a $50-60 monthly electricity bill to keep this machine alive 24/7, just to run an open-weight model? Does anyone else see this? I know there are many advantages, but there’s no break-even point compared to cloud providers!
English
3
0
1
103
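The break-even claim in the reply above can be checked with simple arithmetic. What follows is a minimal sketch, not a real cost model: the hardware cost and electricity figure come from the reply (using the midpoint of the quoted $50-60 range), while the cloud bills in the loop are hypothetical values chosen only for illustration.

```python
import math

def breakeven_months(hardware_cost, electricity_monthly, cloud_monthly):
    """First whole month at which cumulative cloud spend matches or exceeds
    the local total (hardware plus electricity).

    Returns None when the cloud bill does not exceed the electricity bill,
    because the hardware cost is then never recovered.
    """
    monthly_saving = cloud_monthly - electricity_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)

HARDWARE_COST = 3000       # USD, from the reply ("3000+$")
ELECTRICITY_MONTHLY = 55   # USD, midpoint of the quoted 50-60$ range

# Hypothetical monthly cloud bills (assumptions, not figures from the thread).
for cloud_monthly in (20, 100, 200):
    months = breakeven_months(HARDWARE_COST, ELECTRICITY_MONTHLY, cloud_monthly)
    print(cloud_monthly, months)
```

Under these assumptions, a cloud bill below the electricity cost never breaks even, which is the reply's point; the local box only pays off when the cloud spend it replaces is well above the power bill.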
witcheer
witcheer@witcheer·
BOOM Qwen3.6 27b is up! day 1 with the headless 5090 linux box. local ChatGPT running on my own hardware. and this without API costs or rate limits or data leaving my network.
my current stack:
- ubuntu server 26.04, fully headless
- ollama serving qwen3.6 27B, entire model in VRAM
- open WebUI as the frontend
- tailscale for access from anywhere
- tmux for persistent sessions over SSH
my current architecture:
browser → open webui → ollama API → RTX 5090 VRAM
every layer is swappable. I can replace ollama with vLLM, swap the frontend, add models. nothing is locked in. can't wait to wire Hermes in for local inference and start benchmarking models.
English
32
12
171
12K
Santiago Rodríguez Díaz retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
English
8K
11.1K
150.3K
27.6M
Santiago Rodríguez Díaz
Santiago Rodríguez Díaz@sonriks6·
@witcheer Why don’t you run qwen3.5:4b with a decent ctx? It runs fully on 8 GB and is not that ‘dumb’ with the Pi coding harness
English
0
0
0
60
witcheer
witcheer@witcheer·
can a local LLM replace cloud coding agents on 8GB VRAM? tested it. short answer: not yet.
setup: RTX 4060 Ti 8GB, WSL2, llama-server (turboquant fork), Hermes Agent + Pi as agent frameworks.
two coding tasks: a port scanner (easy, single file) and a log watcher CLI (hard, multi-file + debug loop).
tested 3 model configs:
> Qwopus3.5-9B-Coder (43 tok/s, fully in VRAM)
clean code in general, but the model can't produce valid JSON for structured tool calls at 9B params.
> Qwen3.6-35B-A3B MoE + thinking (35 tok/s, ncmoe=30)
thinking tokens consumed most of the budget. tool calls: reliable, never broke.
> Qwen3.6-35B-A3B MoE, reasoning OFF (35 tok/s, ncmoe=30)
2.7x faster than thinking mode. tool calls: reliable, but still super long to produce the output.
also tested Pi (4-tool agent: read/write/edit/bash) vs Hermes Agent with Qwopus 9B. I got the same result: model wrote good code but got stuck in a reasoning loop during test verification. 9B breaks regardless of tooling.
----
MoE expert offload adds ~2-5s latency per API call. agent tasks need 20-50+ round trips. that compounds to 18-87 minutes for tasks that take seconds on cloud APIs.
code quality was good across all configs. the models can write code. they just can't do the multi-turn planning loop fast enough (or reliably enough at 9B) to work as agents.
what would fix this imho:
- a 9B model with reliable structured output (fits fully in VRAM at 43+ tok/s)
- faster expert offload (PCIe 5.0, faster RAM)
English
15
5
66
4.1K
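The round-trip arithmetic in the post above can be sketched as follows. Using only the quoted per-call latency and round-trip counts, the offload overhead alone works out to a few minutes at most, so the quoted 18-87 minutes presumably also includes token generation time. That reading is an assumption, not something the post states.

```python
def offload_overhead_seconds(round_trips, latency_per_call_s):
    """Total added latency when every API call pays a fixed offload cost."""
    return round_trips * latency_per_call_s

# Bounds quoted in the post: 20-50 round trips, ~2-5 s of offload latency each.
low = offload_overhead_seconds(20, 2)
high = offload_overhead_seconds(50, 5)
print(f"{low} s to {high} s ({low / 60:.1f} to {high / 60:.1f} min)")
```

The function is deliberately a plain multiplication: it isolates the fixed per-call cost from generation speed, which the post reports separately in tok/s.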
Santiago Rodríguez Díaz
Santiago Rodríguez Díaz@sonriks6·
@AishwaryaDevv I’ve just dropped Hermes, and I was about to do the same with OpenClaw… If you need a coding harness, use Codex/Claude Code, period. If you want a cool and fancy personal assistant, then fight with the first two
English
0
0
0
24
Aish
Aish@AishwaryaDevv·
Why use Hermes or OpenClaw? Genuine question. The more I use these AI assistant wrappers, the more they feel like overlays with memory bolted on. If I can build my own AI-agnostic agent with skills, rules, context files, and a feedback loop that keeps it updated… what’s the actual value add? Is it just faster skill setup? Better tooling? Less maintenance? I even had to wipe Hermes memory because it got full. Feels like I’m missing something. People using Hermes/OpenClaw heavily, what am I not seeing?
English
138
4
149
56.6K
Holger Gruenhagen
Holger Gruenhagen@hgruenhagen·
Which embedding models are you using for memory-core in @openclaw ? Mostly curious about retrieval quality and latency.
English
21
5
40
31.4K
Santiago Rodríguez Díaz
@petergyang I lean more towards OpenClaw. The level of effort from the devs will pay off in stability and reliability soon. In the long run, OC will lead
English
0
0
0
91
Peter Yang
Peter Yang@petergyang·
I caved and downloaded Hermes to try. For those of you who have tried both Hermes and OpenClaw what difference do you notice? No shilling please, just want some honest opinions
English
375
29
1.2K
306K
Karen Payne MVP
Karen Payne MVP@KarenPayneMVP·
For Visual Studio, what is your go-to model?
English
99
6
114
31.1K
Santiago Rodríguez Díaz
@NoahKingJr To me there’s no ‘race’, or at least we are far from the finish. Indeed, having a loser now would be dramatic; we need a balanced share of usage
English
0
0
1
104
Noah
Noah@NoahKingJr·
Who's gonna win the AI race? > OpenAI > Anthropic > Google > xAI
English
410
9
266
47.5K