Pusal retweeted

🚀 NEW GEMMA 4 31B TURBO DROPPED
Runs on a SINGLE RTX 5090:
⚡️18.5 GB VRAM only (68% smaller)
🧠51 tok/s single-stream decode
💻1,244 tok/s batched
🤖15,359 tok/s prefill ← yes, fifteen thousand
🚨2.5× faster than base model with basically zero quality loss.
It hits Sonnet-4.5 level on hard classification tasks…
at 1/600th the cost.
Local models are shipping faster than we can test 👇🏻
🔥 HF: huggingface.co/LilaRest/gemma…
