Ivan Rocha

7.8K posts

@irr

Lisbon · Joined March 2009
1.5K Following · 616 Followers

Ivan Rocha reposted
Kyle Hessling @KyleHessling1
Gemopus-4-26B-A4B from Jackrong is LIVE! Happy to have benched this one pretty hard (see my benches in the model card) and it is an excellent finetune of an already exceptional model! My friend Jackrong is always cooking the greatest!

It rocks at one-shot requests over long contexts, and runs incredibly fast thanks to the MoE architecture while not seeming to take as much of a hit vs dense models as in the Qwen 3.5 series. It also crushed my simple needle-in-the-haystack tests all the way out to an extended context of 524k!

If you're VRAM-starved, or running on unified memory, this one should run much more usably offloaded to system RAM or in unified memory pools, even if you're running a 10GB-or-less GPU! It would be my daily driver for this purpose!

That said, the dense 31B Gemopus 4 is finalising now. I will post it here when it's live, so follow me for the official launch, and follow Jackrong on Hugging Face! It will also be an incredible model!

As with the base Gemma 4 models, there are some idiosyncrasies, especially in harnesses; if you have problems, please let us know! If you make something cool with it, please comment that below, too. We'd love to see it! huggingface.co/Jackrong/Gemop…
10 replies · 6 reposts · 116 likes · 6K views
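The needle-in-the-haystack test mentioned above is straightforward to reproduce. The sketch below is a minimal harness of my own, not Kyle's actual benchmark: the function name, the filler vocabulary, and the passphrase are all illustrative.

```python
import random

def build_haystack(needle: str, n_words: int, depth: float, seed: int = 0) -> str:
    """Build a long filler document with `needle` inserted at a relative
    depth (0.0 = start, 1.0 = end). The model under test is then asked
    to retrieve the needle from the full context."""
    rng = random.Random(seed)
    filler = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dogs"]
    words = [rng.choice(filler) for _ in range(n_words)]
    pos = int(depth * len(words))
    words.insert(pos, needle)
    return " ".join(words)

# Bury a hypothetical passphrase mid-document and ask for it back:
haystack = build_haystack("The secret passphrase is MOONLIGHT.", 5000, 0.5)
prompt = haystack + "\n\nWhat is the secret passphrase?"
```

In practice you would sweep `n_words` up toward the model's context limit and `depth` from 0.0 to 1.0, and record at which lengths and depths retrieval starts to fail.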
Ivan Rocha reposted
Joel - coffee/acc @JoelDeTeves
I'm pretty excited to test this one: Gemopus-4-26B-A4B-it-GGUF Q6_K

Using @spiritbuun's llama.cpp TurboQuant fork:
- Speed: 75 tokens/sec
- VRAM usage: 95% (22.7 GB)
- Context size: 131072
- GPU: RTX A5000 (Ampere) 24 GB

Pretty amazing that you can fit this entire model on GPU with Q6 quality and still have room for a large amount of context! Plus MoE models are still fast at higher quality.

Woodchuck Norris vibe check: PASSED
Square root of 999999999 -> Correct
Hermes Agent -> Interesting behavior. Retains 26B's speed on short prompts, thinks deeply for more complex requests - sometimes thinks a little too much; it might be worth playing with top-p + temp settings
Coding test -> One-shotted a fully working Tetris game - no other MoE model, including vanilla 26B, was able to do this

A very interesting model

-m Gemopus-4-26B-A4B-it-Preview-Q6_K.gguf --n-gpu-layers 99 --ctx-size 131072 --cont-batching --cache-type-k turbo4 --cache-type-v turbo4 --fit on --jinja --reasoning-format auto --flash-attn on

huggingface.co/Jackrong/Gemop…
11 replies · 11 reposts · 157 likes · 7.5K views
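The reported fit is consistent with back-of-envelope arithmetic: llama.cpp's Q6_K costs roughly 6.56 bits per weight, so 26B parameters should come to about 20 GiB of weights, leaving a few GB of a 24 GB card for the KV cache and buffers. A minimal sketch, assuming only those two figures:

```python
def quantized_weight_gb(n_params_b: float, bits_per_weight: float) -> float:
    """Rough size of quantized model weights in GiB:
    params * bits-per-weight / 8, converted to GiB."""
    return n_params_b * 1e9 * bits_per_weight / 8 / 2**30

q6k  = quantized_weight_gb(26, 6.56)  # ~19.9 GiB at Q6_K
bf16 = quantized_weight_gb(26, 16.0)  # ~48.4 GiB unquantized (BF16)
```

So the unquantized model is roughly 2.4x too large for a 24 GB card, while Q6_K fits with headroom that the 131072-token context then consumes, matching the 22.7 GB total reported above.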
Ivan Rocha reposted
Hao Wang @MogicianTony
SWE-bench Verified and Terminal-Bench—two of the most cited AI benchmarks—can be reward-hacked with simple exploits. Our agent scored 100% on both. It solved 0 tasks. Evaluate the benchmark before it evaluates your agent. If you’re picking models by leaderboard score alone, you’re optimizing for the wrong thing. 🧵
22 replies · 69 reposts · 580 likes · 696.9K views
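The failure mode described above is easy to see in miniature. This toy sketch is my own and has nothing to do with the real SWE-bench or Terminal-Bench graders: a harness that lets agent-controlled code influence its own pass/fail signal can be "solved" without solving anything.

```python
def naive_grader(agent_patch: str) -> bool:
    """A toy grader that 'runs the tests' by executing the agent's patch
    and then reading a flag that the patch itself is allowed to set."""
    scope = {}
    exec(agent_patch, scope)  # the agent fully controls this code
    return scope.get("TESTS_PASSED", False)

honest_patch = "TESTS_PASSED = False  # couldn't actually fix the bug"
hacked_patch = "TESTS_PASSED = True   # fixed nothing, just set the flag"

scores = [naive_grader(p) for p in (honest_patch, hacked_patch)]
# The 'hacked' agent solves zero tasks yet the grader reports a pass.
```

A robust harness keeps the verdict outside the agent's reach: run the agent in a sandbox, then re-run held-out tests in a fresh environment the agent never touched.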
Ivan Rocha reposted
CV.YH @0xCVYH
llama.cpp release b8699 shipped KV cache attention rotation enabled by default. Practical result: Q8_0 is now virtually lossless (inference speed without compromising quality), and the quality hit from Q4_0 on the KV cache is much smaller than it used to be. Translation for anyone running models locally: more usable context for the same amount of VRAM. The quantized KV cache is the silent multiplier nobody looks at, but it makes a real difference in practice.
2 replies · 13 reposts · 81 likes · 4.8K views
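The "more context per GB" claim can be sanity-checked with KV cache arithmetic. As I understand llama.cpp's block formats, q8_0 costs about 8.5 bits per element (32 values plus a 16-bit scale) and q4_0 about 4.5; the model shape below is purely illustrative, not any specific model.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 ctx: int, bits_per_elt: float) -> float:
    """K and V caches each hold n_layers * ctx * n_kv_heads * head_dim
    elements; the factor of 2 covers both."""
    elts = 2 * n_layers * ctx * n_kv_heads * head_dim
    return elts * bits_per_elt / 8 / 2**30

# Illustrative 32-layer model, 8 KV heads of dim 128, 131072-token context:
f16  = kv_cache_gib(32, 8, 128, 131072, 16.0)  # baseline fp16 cache
q8_0 = kv_cache_gib(32, 8, 128, 131072, 8.5)   # ~8.5 bits/elt in llama.cpp
q4_0 = kv_cache_gib(32, 8, 128, 131072, 4.5)   # ~4.5 bits/elt in llama.cpp
```

For this shape the fp16 cache is 16 GiB at 131k context; q8_0 cuts that to 8.5 GiB and q4_0 to 4.5 GiB, so at a fixed VRAM budget q8_0 roughly doubles the usable context.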
Ivan Rocha reposted
Qwen @Alibaba_Qwen
(1/8) 🚀 Introducing Qwen3.6-Plus: Towards Real-World Agents! 🤖

Today, we're thrilled to drop a major milestone in our journey toward native multimodal agents. Here is what makes Qwen3.6-Plus a game-changer:
💻 Next-level Agentic Coding: Smarter, faster execution.
👁️ Enhanced Multimodal Vision: Sharper perception & reasoning.
🏆 Top-tier Performance: Maintaining leading general capabilities.
📚 1M Context Window: Available by default via our API.

Built on your invaluable feedback from the Qwen3.5 era, we're laying a rock-solid foundation for real-world devs. Get ready to experience truly transformative ✨ Vibe Coding ✨. Huge thanks to our community! Go try it out and show us what you can build. 👇

Chat: chat.qwen.ai
API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Blog: qwen.ai/blog?id=qwen3.6

🔔 Note: More Qwen3.6 models to come and be open-sourced! Stay tuned~ 👀

#Qwen #AI #AgenticCoding #VibeCoding #Agents
228 replies · 660 reposts · 5K likes · 1M views
Ivan Rocha reposted
Phuong Le @func25
Go is simple, so I ended up writing an 865-page book about how it works internally, just to see how it maintains that simplicity 😇
48 replies · 162 reposts · 2.2K likes · 87.9K views
Ivan Rocha reposted
Amazon Web Services @awscloud
Announcing Amazon S3 Files. The first and only cloud object store with fully-featured, high-performance file system access. Learn more here. go.aws/4tw17Zg
109 replies · 828 reposts · 4.8K likes · 2M views
Ivan Rocha reposted
CV.YH @0xCVYH
Real-world benchmark of the Qwopus MoE 35B PolarQuant: PPL: 6.56 (converges to the same level as BF16 = 6.54). Speed: 37.4 tok/s (2.3x faster than BF16). VRAM: 25GB with cache, 8GB with LRU. IT FITS ON AN RTX 4090. From 72GB down to 8GB of effective VRAM. Identical quality, 2.3x faster. huggingface.co/caiovicentino1…
2 replies · 1 repost · 39 likes · 1.9K views
Ivan Rocha reposted
Werner Vogels @Werner
For two decades, S3 has been an object store, but today it's something broader. S3 Files lets you mount any bucket as a filesystem—no copies, no sync scripts, no choosing between file and object. @andywarfield tells the full story, including the "filerectories" that almost made the cut. allthingsdistributed.com/2026/04/s3-fil…
23 replies · 250 reposts · 1.3K likes · 312.5K views
Ivan Rocha reposted
Z.ai @Zai_org
SOTA on SWE-Bench Pro (58.4): GLM-5.1 delivers significant leaps in coding and agentic performance.
16 replies · 55 reposts · 976 likes · 192.2K views
Ivan Rocha reposted
Z.ai @Zai_org
Introducing GLM-5.1: The Next Level of Open Source

- Top-Tier Performance: #1 in open source and #3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo.
- Built for Long-Horizon Tasks: Runs autonomously for 8 hours, refining strategies through thousands of iterations.

Blog: z.ai/blog/glm-5.1
Weights: huggingface.co/zai-org/GLM-5.1
API: docs.z.ai/guides/llm/glm…
Coding Plan: z.ai/subscribe

Coming to chat.z.ai in the next few days.
509 replies · 1.3K reposts · 10.7K likes · 4.1M views
Ivan Rocha reposted
Ivan Velichko @iximiuz
If you want to get into eBPF programming, I highly recommend Teodor Podobnik's tutorials on iximiuz Labs. The series starts from the basics and goes all the way up to solving practical networking problems. All posts are well-illustrated and full of examples that actually work. Check it out labs.iximiuz.com/tutorials?auth…
7 replies · 50 reposts · 368 likes · 15.3K views
Ivan Rocha reposted
left curve dev @leftcurvedev_
🚨 New qwopus model from Jackrong on @huggingface

Qwopus3.5-27B-v3-FP8-vllm-ready
> uses FP8 quantization, closer to og model
> vLLM optimized, much faster inference
> better quality retention than GGUFs

huggingface.co/Jackrong/Qwopu…
4 replies · 9 reposts · 117 likes · 6.2K views