Fzo
@Fzo
nevermind
14.8K posts
🇪🇪 Joined March 2009
2K Following · 656 Followers
Fzo retweeted
Prasenjit @Star_Knight12
CLAUDE OPUS 4.7 USING 500K TOKENS TO RENAME A VARIABLE
182 replies · 1.6K reposts · 18.7K likes · 1.4M views
Fzo retweeted
NVIDIA AI @NVIDIAAI
Meet Nemotron 3 Nano Omni 👋 Our latest addition to the Nemotron family is the highest-efficiency open multimodal model, with leading accuracy. 30B parameters. 256K context length. 🧵👇
88 replies · 186 reposts · 1.3K likes · 443.5K views
Fzo retweeted
Mek Suwit @SuwitSap
People are moving out of Bangkok into the surrounding metro area and into industrial cities. Many provinces will end up increasingly deserted, with no people and no jobs, yet they still request bigger provincial development budgets. It's time to dissolve those provinces and the regional government agencies.
[image]
7 replies · 375 reposts · 302 likes · 33.2K views
Fzo @Fzo
Let me just say it: $60B is insane! SpaceX gobbling up Cursor and folding it into xAI to build its own AI stack. Developers should definitely be nervous.
Vihan Singh@vihan13singh

$60B for a coding tool. Let that sink in: SpaceX might acquire Cursor, plug it into xAI infra, and turn dev tools into a core layer of the AI stack. nytimes.com/2026/04/21/bus…

0 replies · 0 reposts · 0 likes · 49 views
Fzo @Fzo
GPT-Image V2 successfully generated a world map with sea and land inverted. No AI had ever pulled this off before. Image generation has clearly taken another step forward, and it may be a sign that AI is getting much better at understanding geographic coordinates.
Chetaslua@chetaslua

🚨 GPT-IMAGE V2 just passed another test that no other AI had ever passed. Prompt: A high-resolution map of the Earth, but with the land and water inverted. All continents (North America, Africa, etc.) are made of blue water, and all oceans (Atlantic, Pacific) are solid green land masses. The shapes must be geographically accurate to the real Earth, just with the textures swapped.

0 replies · 0 reposts · 1 like · 777 views
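As an aside, reproducing this kind of test is a one-call job against any OpenAI-style images endpoint. A minimal sketch, assuming a hypothetical `gpt-image-2` model id and the standard images API; none of these identifiers are confirmed by the posts above:

```python
# Hypothetical sketch: re-running the inverted-Earth test against an
# OpenAI-style images endpoint. The model id "gpt-image-2" is an assumption.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "A high-resolution map of the Earth, but with the land and water inverted. "
    "All continents are made of blue water, and all oceans are solid green land "
    "masses. The shapes must be geographically accurate, just with textures swapped."
)

result = client.images.generate(model="gpt-image-2", prompt=prompt, size="1536x1024")

# The images API returns base64-encoded image data; write it to disk.
with open("inverted_earth.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```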
Fzo @Fzo
🔥 Karpathy's Autoresearch on a local Mac M5 Max + Gemma 4! AI can now do research on its own, and it saves API costs on top of that. If you want to try it, you'll need a beefy GPU.
Ivan Fioravanti ᯅ@ivanfioravanti

Autoresearch from @karpathy in action locally using gemma-4-26b-a4b-it-6bit with oMLX on an M5 Max to train Gemma 4 E2B 🚀 IT COULD WORK!

0 replies · 0 reposts · 0 likes · 102 views
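I can't verify the oMLX tooling mentioned above, but local generation on Apple silicon is commonly done with the mlx-lm package; a minimal sketch under that assumption, with a placeholder repo id:

```python
# Minimal local generation on Apple silicon with mlx-lm (a stand-in for the
# oMLX setup in the quoted post). The model repo id below is a placeholder.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gemma-4-26b-a4b-it-6bit")  # placeholder id

prompt = "Propose three follow-up experiments for a small-LM training run."
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
```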
Fzo @Fzo
🤩 Ling 2.6 Flash jumps 10 points! From Ant Group, dirt cheap ($0.10/1M input), yet with intelligence on par with GPT-5.4 nano. Great value!
Artificial Analysis@ArtificialAnlys

Ant Group's Ling 2.6 Flash scores 26 on the Artificial Analysis Intelligence Index, a 10-point jump from Ling-flash-2.0. It is one of the few recent open-weights releases focused on non-reasoning capabilities, and it targets a reasonable cost-to-intelligence ratio.

Ling 2.6 Flash is a non-reasoning model from Ant Group's @TheInclusionAI lab. Ant Group's model family comprises three series: Ling (non-reasoning), Ring (reasoning), and Ming (multimodal). Ling-flash-2.0 was the previous flash-tier non-reasoning model. Ling 2.6 Flash is expected to be open weights shortly after release, but as of today the weights have not been released on Hugging Face.

Key takeaways:
➤ At 104B total parameters with 7.4B active parameters, Ling 2.6 Flash (26) sits near GPT-5.4 nano (Non-Reasoning, 24) and Gemma 4 26B A4B (Non-Reasoning, 27) in intelligence, both models with comparable active parameter counts. However, at 18 points behind GLM-5.1 (Non-Reasoning, 44), there remains a gap to frontier non-reasoning open-weights models.
➤ Ling 2.6 Flash is comparatively token efficient, using ~15M output tokens to run the Intelligence Index. This is comparable to Gemma 4 26B A4B (~14M) but a fraction of Qwen3.5 9B (~78M). Compared to models in the same intelligence tier, Ling 2.6 Flash represents a reasonable efficiency tradeoff, which pays off on cost for larger workloads. At $0.10 per million input tokens and $0.30 per million output tokens, Ling 2.6 Flash costs only ~$23 to run the full Artificial Analysis Intelligence Index.
➤ Gains over Ling-flash-2.0 were driven mostly by improvements in agentic capabilities and instruction following. τ²-Bench jumped from 21% to 86% (+65 points), IFBench from 34% to 57% (+23 points), and GDPval-AA Elo from 425 to 783 (+84%). Conversely, GPQA Diamond fell from 66% to 59% (-7 points) and SciCode from 29% to 27% (-2 points).
➤ AA-Omniscience performance is at -66, with 15% accuracy and a 96% hallucination rate. This is consistent with the model's small 7.4B active parameter count: knowledge recall benefits from larger parameter counts, and sub-10B active-parameter models systematically underperform on this metric.

Additional model details:
➤ Architecture: MoE, 104B total parameters, 7.4B active parameters
➤ Context window: 262K tokens (doubled from 128K for Ling-flash-2.0)
➤ Pricing: $0.10 / $0.30 per 1M input/output tokens (via Novita API)
➤ License: weights not yet released
➤ Availability: third-party API through @novita_labs

0 replies · 0 reposts · 0 likes · 100 views
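The quoted pricing is easy to sanity-check with back-of-envelope arithmetic; a small sketch using the rates from the post (the implied input-token volume is an inference here, not a figure stated in the post):

```python
# Back-of-envelope token pricing for Ling 2.6 Flash (rates from the post:
# $0.10 per 1M input tokens, $0.30 per 1M output tokens).
IN_RATE, OUT_RATE = 0.10, 0.30  # USD per million tokens

def cost(input_mtok: float, output_mtok: float) -> float:
    """Total USD cost for a workload measured in millions of tokens."""
    return input_mtok * IN_RATE + output_mtok * OUT_RATE

# The post cites ~15M output tokens and ~$23 total for the full Intelligence
# Index run, which implies roughly (23 - 15 * 0.30) / 0.10 = 185M input
# tokens -- an inferred assumption, not a stated figure.
print(cost(185, 15))  # ~23.0
```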
Fzo retweeted
Felipe Demartini @namcios
A Chinese lab that almost nobody in Brazil knows about just humiliated the three biggest AI labs on the planet. Open-source model. Weights on HuggingFace. Free. And it beats Claude Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on 6 benchmarks. That's not an exaggeration.

Moonshot released Kimi K2.6 today:
→ SWE-Bench Pro: 58.6 (Claude: 57.7)
→ Toolathlon: 50.0 (Claude: 47.2)
→ SWE-Bench Multilingual: 76.7
→ BrowseComp: 83.2
→ HLE with tools: 54.0
→ MathVision with Python: 93.2

Now the part that should keep every American big tech company up at night: the price. Kimi K2.6 via API: $0.60/million input tokens, $2.50 output. Claude Sonnet 4.6: $3.00 and $15.00. 5x cheaper on input, 6x on output. And since the weights are open, any company with GPUs can run it without paying Moonshot a cent.

But the scariest number isn't a benchmark or the price. It's execution speed. The model ran 4,000+ tool calls in a single session. 12 hours of continuous execution. 300 sub-agents in parallel. It took a local model, rewrote the entire inference stack in Zig, and went from 15 tokens/second to 193. By itself. An autonomous software engineer that works 12 hours straight and doesn't draw a salary. Open-source.

OpenAI charges $200/month for Pro. Anthropic raised at a $60 billion valuation. Google burns $75 billion a year on infrastructure. And a Beijing lab, with a fraction of that capital, is giving away for free what these companies tell investors costs tens of billions to build.

The cadence is what kills. K2 in July 2025. K2.5 in January 2026. K2.6 now. Every 8 weeks Moonshot ships a model that eats another piece of the closed labs' moat. This time, on agentic benchmarks, the moat evaporated. In January DeepSeek evaporated $600 billion of Nvidia's value in a single day and forced OpenAI to make ChatGPT free that same week. Now Moonshot has done it again. That's the second time in four months. There will be a third.
Kimi.ai@Kimi_Moonshot

Meet Kimi K2.6: Advancing Open-Source Coding
🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), CharXiv w/ Python (86.7), MathVision w/ Python (93.2)
What's new:
🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹Proactive Agents - the K2.6 model powers OpenClaw, Hermes Agent, etc. for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents; command your friends' agents, bots, and humans in the loop.
K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…

42 replies · 225 reposts · 2.2K likes · 333.9K views
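For anyone who wants to price-compare directly, a minimal sketch of calling the platform API listed above, assuming it exposes an OpenAI-compatible endpoint; the base URL and model id are guesses, not confirmed values:

```python
# Hypothetical sketch: calling Kimi K2.6 through an OpenAI-compatible
# endpoint. Base URL and model id are assumptions, not confirmed values.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.ai/v1",  # assumed endpoint
)

resp = client.chat.completions.create(
    model="kimi-k2.6",  # assumed model id
    messages=[{"role": "user", "content": "Summarize SWE-Bench Pro in one line."}],
)
print(resp.choices[0].message.content)
# Pricing from the post: $0.60/M input, $2.50/M output -- roughly 5x and 6x
# cheaper than the quoted Claude Sonnet 4.6 rates ($3.00 / $15.00).
```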
Fzo retweeted
Teknium 🪽 @Teknium
Quick facts about Hermes:
- It's pronounced Her Meeze
- It's named after the Greek god of communication, magic, and intelligence
- We have been using the Hermes name for 3 or so years now, including in our Hermes 1-4 series of models
- That's all ^_^
117 replies · 45 reposts · 1.3K likes · 84.6K views
Fzo retweeted
Stitch by Google @stitchbygoogle
Today, we’re open-sourcing the draft specification for DESIGN.md, so it can be used across any tool or platform. We’re also adding new capabilities. DESIGN.md lets you easily export and import your design rules from project to project. Instead of guessing intent, agents know exactly what a color is for and can even validate their choices against WCAG accessibility rules. Watch David East break down this shared visual language in action👇. New capabilities and links in 🧵
205 replies · 2K reposts · 18.2K likes · 6.8M views
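The WCAG validation mentioned above is a published formula rather than anything DESIGN.md-specific, so it's easy to show concretely. A self-contained sketch of the contrast-ratio check an agent could run against a declared color pair:

```python
# WCAG 2.x contrast check, the kind of validation the post says agents can
# run against DESIGN.md color roles. Formula per the WCAG spec.

def _channel(c8: int) -> float:
    # sRGB -> linear light, per the WCAG relative-luminance definition.
    c = c8 / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(hexcolor: str) -> float:
    r, g, b = (int(hexcolor.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _channel(r) + 0.7152 * _channel(g) + 0.0722 * _channel(b)

def contrast(fg: str, bg: str) -> float:
    l1, l2 = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

ratio = contrast("#1a1a1a", "#ffffff")
print(f"{ratio:.2f}:1", "passes AA body text" if ratio >= 4.5 else "fails AA")
```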
Fzo retweeted
stevibe @stevibe
Which LLMs actually love to think? Tested 7 models on 5 math problems, measured reasoning length.
The thinking winners: both Qwen3.5 models (27B and 35B A3B) — massive overthinkers, up to 10k+ tokens on a single question.
Plot twists:
> Kimi K2.6 feels verbose, but is actually one of the leanest
> Gemma4 26B A4B solved 2 with ZERO thinking
38 replies · 48 reposts · 605 likes · 97K views
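The methodology here is simple to replicate. A rough sketch against a local OpenAI-compatible server, assuming models that wrap their reasoning in <think>…</think> tags; the endpoint, model ids, and the whitespace-token proxy are all assumptions:

```python
# Rough sketch of measuring "reasoning length": query a local
# OpenAI-compatible server and count tokens inside <think>...</think>.
# Endpoint, model ids, and the tag convention are assumptions.
import re
import requests

PROBLEMS = ["What is 17 * 24?", "Is 221 prime?"]  # stand-in math prompts

def thinking_len(model: str, prompt: str) -> int:
    r = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=300,
    )
    text = r.json()["choices"][0]["message"]["content"]
    thoughts = "".join(re.findall(r"<think>(.*?)</think>", text, flags=re.S))
    return len(thoughts.split())  # crude whitespace-token proxy

for model in ["qwen3.5-27b", "gemma4-26b-a4b"]:  # placeholder ids
    print(model, [thinking_len(model, p) for p in PROBLEMS])
```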
Fzo retweeted
Qwen @Alibaba_Qwen
🚀 Introducing Qwen3.6-Max-Preview, an early preview of our next flagship model
Highlights:
⚡️ Improved agentic coding capability over Qwen3.6-Plus
📖 Stronger world knowledge and instruction following
🌍 Improved real-world agent and knowledge reliability performance
Smarter, sharper, still evolving. More Qwen3.6 models to come. Stay tuned! 🔗👇
Blog: qwen.ai/blog?id=qwen3.…
Qwen Studio: chat.qwen.ai/?models=qwen3.…
API: modelstudio.console.alibabacloud.com/ap-southeast-1…
[image]
172 replies · 446 reposts · 4.2K likes · 344.8K views
Fzo retweeted
Eric ⚡️ Building... @outsource_
🤯 GEMMA 4 E4B IS FULLY OBLITERATED
@elder_plinius dropped gemma-4-E4B-it-OBLITERATED, one of the most aggressive uncensoring jobs on Google's new 4B model. This version is next level:
🤯 Refusals dropped from 98.8% → 0% hard refusal
✅ Guardrails surgically removed (21/42 layers)
🤖 Coding performance improved over base
🚀 v3 fixes: 720 tensors intact, no more broken K/V
🚨 Tiny footprint — Q4_K_M is only 4.9 GB
👀 Runs smoothly on phones, edge devices & low-VRAM GPUs
✅ Full GGUF quants + multimodal (vision/audio) support
Truly uncensored local AI for chat, roleplay etc! Grab the Hugging Face model 👇🏻
[image]
12 replies · 88 reposts · 641 likes · 48.3K views
Fzo retweeted
Alex Cheema @alexocheema
Running Qwen3.6 35B (vision) on 2 x M5 Max MacBook Pro with RDMA over Thunderbolt 5. It describes the image and identifies Apple Park correctly, but misidentifies John Ternus as Jeff Williams. Near instant response with prefix caching.
EXO Labs@exolabs

EXO v1.0.70 is out. This release ships with multimodality and major enhancements for memory usage in long context use cases (e.g. @openclaw and @opencode), as well as updated model support and QOL features.

23 replies · 42 reposts · 460 likes · 98.6K views
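EXO fronts a cluster like this with a local OpenAI-compatible API, so querying the setup above looks like an ordinary chat-completions call with an image attached. A sketch, where the port (52415, the default I've seen in EXO docs) and the model id are both assumptions:

```python
# Sketch: querying an EXO cluster's local OpenAI-compatible API with an
# image. The port (52415) and model id are assumptions, not verified here.
import base64
import requests

with open("apple_park.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:52415/v1/chat/completions",
    json={
        "model": "qwen3.6-35b",  # placeholder id
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this photo."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    },
    timeout=600,
)
print(resp.json()["choices"][0]["message"]["content"])
```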
Fzo retweeted
Graeme @gkisokay
The Local LLM cheat sheet for your 16GB RAM device

I pulled together a lineup of small models that can run comfortably on a Mac Mini or personal laptop while still leaving room for context without melting your machine.

Models for Daily Use
Qwen3.5 9B / GGUF / Q4_K_M: Daily driver. General chat, drafting, research, translation. If you're keeping only one, keep this.
DeepSeek-R1 Distill Qwen 7B / GGUF / Q4_K_M: Reasoning engine. Math, logic, step-by-step problems. Slower, but worth it when you need actual thinking.

Models for Specialty Work
Qwen2.5 Coder 7B / GGUF / Q4_K_M: Code specialist. Completions, refactors, debugging, repo Q&A. Better than a generalist when the task is code.
Llama 3.1 8B / GGUF / Q4_K_M: Long-context worker. RAG, doc chat, codebase Q&A. The output isn't top tier, but the context is strong for its size.
Phi-4 Mini Reasoning / GGUF / Q4_K_M: Compact thinker. Logic, structured answers, math, and short coding bursts. Smaller context is the catch.

Models for Efficiency
Gemma 4 E4B / GGUF / Q4_K_M: Light all-rounder. Writing, chat, light agents, structured output.
Phi-3.5 Mini / GGUF / Q5_K_M: Pocket sidekick. Summaries, extraction, background doc chat. Easy to pair with a bigger model.
Qwen3.5 2B / GGUF / Q4_K_M: Useful for summaries, tagging, rewrites, and lightweight sidekick work.

Micro Models
Qwen3.5 0.8B / GGUF / Q5_K_M: Classification, keyword routing, binary decisions, triage.
Gemma 4 E2B-it / GGUF / Q4_K_M: Lightweight chat, quick Q&A, summaries, tiny agents.

My personal choice for a single model is Qwen3.5 9B. For two models, use Qwen3.5 9B + Qwen2.5 Coder 7B for code, or Qwen3.5 9B + Phi-3.5 Mini for support tasks. A minimal way to run any of these is sketched just below this post. Let me know in the comments your experience with these models, or any I have left out.
[image]
98 replies · 346 reposts · 2.3K likes · 413.6K views
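For completeness, a minimal llama-cpp-python sketch for running any of the Q4_K_M entries above on a 16GB machine; the file path is a placeholder:

```python
# Minimal local run of a Q4_K_M GGUF from the cheat sheet via
# llama-cpp-python. The model path is a placeholder; any entry above works.
from llama_cpp import Llama

llm = Llama(
    model_path="./qwen3.5-9b-q4_k_m.gguf",  # placeholder filename
    n_ctx=8192,       # leave RAM headroom on a 16GB machine
    n_gpu_layers=-1,  # offload all layers to Metal/GPU if available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Draft a short standup update."}],
    max_tokens=200,
)
print(out["choices"][0]["message"]["content"])
```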
Fzo retweeted
Eric ⚡️ Building... @outsource_
NEW 🤯 GLM + QWEN 18B RUNS ON CONSUMER GPU. IT BEATS 35B MoE AT HALF THE VRAM
@KyleHessling1 just dropped the healed Qwopus-GLM-18B-Merged-GGUF, an insane 64-layer frankenmerge of two elite Qwen3.5-9B finetunes (Opus reasoning + GLM-5.1 distill). This thing is cooking on consumer GPUs.
🧠 Overall score: 40/44 (90.9%), beats the new Qwen 3.6
🤖 Only 9.2 GB at Q4_K_M, runs on 12-16GB VRAM
🚨 Perfect tool calling & agentic reasoning (6/6 + 4/4)
🤯 Production frontend code: flawless HTML/CSS/JS (98.4% stress-test pass)
📈 262k context + strong multilingual
✅ Elite structured output & complex apps
⭐️ Agent workflows, CoT, self-correction
🏆 66 tokens/sec with low variance
Healed merge = no more issues! If you've got a mid-range GPU, run this 18B 👇🏻
[image]
21 replies · 49 reposts · 487 likes · 28.3K views
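For readers wondering what a 64-layer frankenmerge of two ~9B models actually is: it's a passthrough merge that stacks decoder layers from both donors into one deeper network. A conceptual sketch with transformers (repo ids are placeholders; real merges are typically done with dedicated tooling such as mergekit, and a raw stacked model usually needs the kind of "healing" the post mentions):

```python
# Conceptual "passthrough" frankenmerge: stack decoder layers from two
# same-architecture finetunes into one deeper model. Repo ids are
# placeholders; real merges typically use dedicated tooling (e.g. mergekit).
import torch
from transformers import AutoModelForCausalLM

a = AutoModelForCausalLM.from_pretrained("org/finetune-a-9b", torch_dtype=torch.bfloat16)
b = AutoModelForCausalLM.from_pretrained("org/finetune-b-9b", torch_dtype=torch.bfloat16)

# Concatenate both donors' decoder stacks (e.g. 32 + 32 = 64 layers).
merged = torch.nn.ModuleList(list(a.model.layers) + list(b.model.layers))
for i, layer in enumerate(merged):
    layer.self_attn.layer_idx = i  # keep KV-cache indexing consistent

a.model.layers = merged
a.config.num_hidden_layers = len(merged)
a.save_pretrained("frankenmerge-18b")  # raw merge; expect to heal/finetune after
```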
Fzo @Fzo
If you think some model is lame when using #openclaw, try pairing it with #hermes. You might change your mind 🤣
0 replies · 0 reposts · 0 likes · 38 views