Santiago Rodríguez Díaz

28 posts

@sonriks6

Data Scientist

Spain · Joined September 2025

58 Following · 0 Followers

Santiago Rodríguez Díaz@sonriks6·3d

@justbyte_ More than ever! I don't like those AI-first editors; maybe I'll change my mind in 6 months xD

241

Aryan@justbyte_·4d

Is VS Code still the default choice for developers?

14.6K

Santiago Rodríguez Díaz@sonriks6·3d

@github Linux AppImage performance is terrible! (Running Fedora 44 on a more than capable machine :))

174

GitHub@github·3d

The GitHub Copilot app is now generally available. 🙌 The new home base for your work. Pick up what's next, direct agents in parallel, and land your PRs, all in one place. ⬇️ github.blog/changelog/2026…

144

880

185.9K

Santiago Rodríguez Díaz@sonriks6·14 Jun

@AIPandaX It’s a shame but I literally do all this whenever I start with a new TV (friends and family included) 🤦🏻‍♂️

281

AI Panda@AIPandaX·13 Jun

A guy was ready to drop $1,500 on a new OLED TV because his 3-year-old Smart TV was freezing up and took 5 seconds just to respond to the remote. He unplugged it. Deleted old apps. Cleared the cache. The lag kept coming back. He went to Best Buy to get a replacement. The home theater installer in the blue shirt stopped him: "Before you spend a grand, let me show you something." He grabbed a remote and shook his head. "There are 8 hidden tracking settings throttling your TV's processor right now. Manufacturers turn them all on by default. Nobody tells you they exist. Let's fix this." Here's what he showed him in the next 8 minutes. 🧵

202

2.1K

13.5K

2.9M

Santiago Rodríguez Díaz reposted

ollama@ollama·5 Jun

Gemma 4 Quantization-Aware Training (QAT) weights are now available on Ollama! They reduce memory requirements while maintaining model quality.

E2B: ollama run gemma4:e2b-it-qat
E4B: ollama run gemma4:e4b-it-qat
12B: ollama run gemma4:12b-it-qat
26B: ollama run gemma4:26b-a4b-it-qat
31B: ollama run gemma4:31b-it-qat

Try them with ollama launch integrations to use with your favorite tools 👇👇👇

Google Gemma@googlegemma

We just dropped Gemma 4 Quantization-Aware Training (QAT) checkpoints on Hugging Face! All Gemma 4 model sizes and their drafters are now optimized with QAT to cut memory requirements and maximize on-device performance!

160

1.5K

110.7K

Santiago Rodríguez Díaz reposted

Ehsan@acadictive·3 Jun

Dear GitHub Copilot team, I am happy to announce that I successfully burned all of my monthly tokens in under 3 days thanks to your garbage new pricing model. I'd also like to inform you that I won't be renewing my subscription or adding more budget. Best, A former customer.

310

266

3.8K

732.1K

Santiago Rodríguez Díaz@sonriks6·4 Jun

@LyalinDotCom @ollama Only MLX for the moment… 😭

Dmitry Lyalin@LyalinDotCom·3 Jun

If you're waiting for Gemma 4 12b through @ollama, it's here:

- gemma4:12b
- gemma4:12b-it-q4_K_M
- gemma4:12b-it-q8_0
- gemma4:12b-it-bf16
- gemma4:12b-mlx (MLX)
- gemma4:12b-mlx-bf16 (MLX)
- gemma4:12b-mxfp8 (MLX)
- gemma4:12b-nvfp4 (MLX)

ollama.com/library/gemma4…

You'll need Ollama version 0.30.4-rc0

302

25.2K

Santiago Rodríguez Díaz@sonriks6·1 Jun

@NVIDIAGeForce #RTXon Control Ultimate

NVIDIA GeForce@NVIDIAGeForce·1 Jun

Over 1,000 RTX games and apps are available now with ray tracing and DLSS. To celebrate, we're dropping Steam cash in the replies to upgrade your library with a #RTXOn title... Comment #RTXOn to enter 💸

25.4K

1.1K

9.2K

1.3M

Santiago Rodríguez Díaz@sonriks6·30 May

@2010MisterChip The sponsors must be fuming over the viewing figures, hahaha

1.2K

MisterChip (Alexis)@2010MisterChip·30 May

What do you think of the new kickoff time for the Champions League final?

2.2K

3.5K

1.4M

Santiago Rodríguez Díaz@sonriks6·30 May

@2010MisterChip Awful, I missed the match!!

518

Santiago Rodríguez Díaz@sonriks6·29 May

@NVIDIAAI @grok explain this

NVIDIA AI@NVIDIAAI·29 May

A new era of PC. 25.0528, 121.5990

373

516

9.3K

1.7M

Santiago Rodríguez Díaz@sonriks6·27 May

@witcheer The price of ‘freedom’: an investment of $3,000+ and a monthly electricity bill of $50-60 to keep this machine alive 24/7, just to run an open-weight model? Does anyone else see this? I know there are many advantages, but there’s no break-even point compared to cloud providers!

103
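The break-even question raised in the reply above can be made concrete with a little arithmetic. This is a minimal Python sketch, not anyone's actual cost model: the function name and the two monthly cloud spends ($20 and $200) are hypothetical, while the $3,000 hardware cost and roughly $55 per month of electricity come from the reply. It assumes the only ongoing cost of the local box is electricity.

```python
import math
from typing import Optional


def months_to_break_even(hardware_cost: float,
                         monthly_power: float,
                         monthly_cloud: float) -> Optional[int]:
    """Return the number of months until owning beats renting.

    Returns None when the cloud bill does not exceed the electricity bill,
    because the hardware cost is then never recovered.
    """
    monthly_saving = monthly_cloud - monthly_power
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical light user: $20/month of cloud usage (assumed figure).
print(months_to_break_even(3000, 55, 20))   # None
# Hypothetical heavy user: $200/month of cloud usage (assumed figure).
print(months_to_break_even(3000, 55, 200))  # 21
```

Under these assumptions the reply's claim holds for anyone spending less on cloud inference than the box costs in electricity; a break-even point only appears once the cloud bill is well above that.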

witcheer@witcheer·27 May

BOOM Qwen3.6 27b is up! day 1 with the headless 5090 linux box. local ChatGPT running on my own hardware. and this without API costs or rate limits or data leaving my network.

my current stack:
- ubuntu server 26.04, fully headless
- ollama serving qwen3.6 27B, entire model in VRAM
- open WebUI as the frontend
- tailscale for access from anywhere
- tmux for persistent sessions over SSH

my current architecture: browser → open webui → ollama API → RTX 5090 VRAM

every layer is swappable. I can replace ollama with vLLM, swap the frontend, add models. nothing is locked in.

can't wait to wire Hermes in for local inference and start benchmarking models.

171

12K
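The "ollama API" layer in the architecture described above is a plain HTTP endpoint, which is what makes the frontend swappable. As a rough sketch in Python, assuming Ollama is listening on its default local port and that the model is published under the tag `qwen3.6:27b` (the tag and the helper names are assumptions, not taken from the post), a non-streaming request could look like this:

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "qwen3.6:27b") -> urllib.request.Request:
    """Build a non-streaming generate request.

    The default model tag is an assumed name; use whatever `ollama list` shows.
    """
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask("hello")` needs a running server with the model pulled; `build_request` on its own can be inspected offline.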

Santiago Rodríguez Díaz reposted

Warp@warpdotdev·22 May

You can now bring your own key and inference endpoint to the Warp Agent, no paid plan required.

Zach Lloyd@zachlloydtweets

x.com/i/article/2057…

520

119.1K

Santiago Rodríguez Díaz@sonriks6·21 May

.@openclaw I've migrated to @NousResearch Hermes Agent. full_stop.

Santiago Rodríguez Díaz reposted

Andrej Karpathy@karpathy·19 May

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

11.1K

150.3K

27.6M

Santiago Rodríguez Díaz@sonriks6·19 May

@witcheer Why don’t you run qwen3.5:4b with a decent ctx? It runs fully on 8 GB and is not that ‘dumb’ with the Pi coding harness

witcheer@witcheer·19 May

can a local LLM replace cloud coding agents on 8GB VRAM? tested it. short answer: not yet.

setup: RTX 4060 Ti 8GB, WSL2, llama-server (turboquant fork), Hermes Agent + Pi as agent frameworks.

two coding tasks: a port scanner (easy, single file) and a log watcher CLI (hard, multi-file + debug loop).

tested 3 model configs:

> Qwopus3.5-9B-Coder (43 tok/s, fully in VRAM)
clean code in general, but the model can't produce valid JSON for structured tool calls at 9B params.

> Qwen3.6-35B-A3B MoE + thinking (35 tok/s, ncmoe=30)
thinking tokens consumed most of the budget. tool calls: reliable, never broke.

> Qwen3.6-35B-A3B MoE, reasoning OFF (35 tok/s, ncmoe=30)
2.7x faster than thinking mode. tool calls: reliable, but still super long to produce the output.

also tested Pi (4-tool agent: read/write/edit/bash) vs Hermes Agent with Qwopus 9B. I got the same result: model wrote good code but got stuck in a reasoning loop during test verification. 9B breaks regardless of tooling.

----

MoE expert offload adds ~2-5s latency per API call. agent tasks need 20-50+ round trips. that compounds to 18-87 minutes for tasks that take seconds on cloud APIs.

code quality was good across all configs. the models can write code. they just can't do the multi-turn planning loop fast enough (or reliably enough at 9B) to work as agents.

what would fix this imho:
- a 9B model with reliable structured output (fits fully in VRAM at 43+ tok/s)
- faster expert offload (PCIe 5.0, faster RAM)

4.1K
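The latency figures in the post above can be checked with a back-of-the-envelope calculation. This is a minimal Python sketch, assuming the per-call offload latency simply multiplies by the number of round trips; the function name is hypothetical and the inputs are the quoted ranges (2-5 s per call, 20-50 round trips).

```python
def offload_overhead_minutes(latency_s: float, round_trips: int) -> float:
    """Total time spent waiting on expert offload, in minutes."""
    return latency_s * round_trips / 60


best = offload_overhead_minutes(2, 20)   # 40 seconds in total
worst = offload_overhead_minutes(5, 50)  # 250 seconds in total
print(f"{best:.1f} to {worst:.1f} minutes")  # 0.7 to 4.2 minutes
```

On these numbers the offload latency alone comes to a few minutes per task, so the quoted 18-87 minutes presumably also includes token generation time, which the post does not break down.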

Santiago Rodríguez Díaz@sonriks6·18 May

@AishwaryaDevv I’ve just dropped Hermes, and I was about to do the same with OpenClaw… If you need a coding harness, use Codex/Claude Code, period. If you want a cool, fancy personal assistant, then fight with the first two

Aish@AishwaryaDevv·18 May

Why use Hermes or OpenClaw? Genuine question. The more I use these AI assistant wrappers, the more they feel like overlays with memory bolted on. If I can build my own AI-agnostic agent with skills, rules, context files, and a feedback loop that keeps it updated… what’s the actual value add? Is it just faster skill setup? Better tooling? Less maintenance? I even had to wipe Hermes memory because it got full. Feels like I’m missing something. People using Hermes/OpenClaw heavily, what am I not seeing?

138

149

56.6K

Santiago Rodríguez Díaz@sonriks6·17 May

@hgruenhagen @openclaw node-llama-cpp and whatever model it sets when you define ‘local’

109

Holger Gruenhagen@hgruenhagen·15 May

Which embedding models are you using for memory-core in @openclaw ? Mostly curious about retrieval quality and latency.

31.4K

Santiago Rodríguez Díaz@sonriks6·4 May

@petergyang I lean more towards OpenClaw; the level of effort from the devs will pay off in stability and reliability soon. In the long run OC will lead

Peter Yang@petergyang·4 May

I caved and downloaded Hermes to try. For those of you who have tried both Hermes and OpenClaw what difference do you notice? No shilling please, just want some honest opinions

375

1.2K

306K

Santiago Rodríguez Díaz@sonriks6·20 Apr

@KarenPayneMVP GPT5.4 as main, Opus for long-term plans

216