gelim

564 posts

@gelim

Mixing colors in cybersec

Joined August 2008

69 Following · 261 Followers

Pinned Tweet

gelim@gelim·17 Apr

After reading @Guardicore's excellent CVE-2020-3952 article and PoC, I rewrote it a bit to check without adding an admin user: github.com/gelim/CVE-2020… #vmware #vmdir #ldap #vcenter #vulnerability

gelim@gelim·26 Apr

@Tur24Tur @deepseek_ai @PortSwigger @claudeai Curious how qwen3.6-27B would perform on those tests

Tur.js@Tur24Tur·25 Apr

Mission accomplished ✅ Here's a summary of today's experiments with @deepseek_ai V4 Pro
3 expert-level @PortSwigger web challenges + 1 real Android app, all solved autonomously. Each run reviewed by @claudeai Opus 4.7:
1/ SQL Injection: 26 tool calls, 3 minutes x.com/Tur24Tur/statu…
2/ Android Root Detection Bypass: 102 tool calls, 16 minutes x.com/Tur24Tur/statu…
3/ Reflected XSS with AngularJS sandbox escape + CSP bypass: 142 tool calls, 71 minutes x.com/Tur24Tur/statu…
4/ Web Cache Deception: 142 tool calls, 35 minutes x.com/Tur24Tur/statu…
412 total tool calls. 4 different security categories. No solutions copied.
Two additional tasks failed mid-run due to a crash in my agent but after fixing the bug I re-ran them and both were solved.
Total cost for the entire day: $6.84 on deepseek-v4-pro (see attached screenshot).
Thanks to @PortSwigger for providing the best hands-on labs for web security made it possible to benchmark AI agents on real expert-level challenges.
More experiments coming soon. #DeepSeek #ClaudeOpus #AgenticAI #DeepSeekV4Pro

gelim@gelim·12 Apr

@MiniMax_AI @vllm_project It seems @UnslothAI + @ggml_org gives us llama.cpp day 0 support as well

MiniMax (official)@MiniMax_AI·12 Apr

Day-0 vLLM support for MiniMax M2.7 is now live. 🙌 Huge thanks to the @vllm_project team for enabling the open-source community to run M2.7 from day one. 👉 Get started: docs.vllm.ai/projects/recip…

vLLM@vllm_project

🎉 Congrats to @MiniMax_AI on this release. Day-0 support for MiniMax M2.7 in vLLM! 🤖 Agentic-first design. Multi-agent orchestration ("Agent Teams") and complex skill management 💻 Strong coding. Production debugging, log analysis, and code security 📄 Office automation. Proficient in document editing across Word, Excel, and PowerPoint Get started 👇 📖 docs.vllm.ai/projects/recip…

gelim@gelim·7 Apr

@victormustar brace for the incoming wave of download :-)

Victor M@victormustar·7 Apr

MiniMax-2.7 open weights confirmed and coming super soon - this is going to be a very big one 🔥

gelim@gelim·6 Apr

@ivanfioravanti huggingface.co/MiniMaxAI/Mini…

Ivan Fioravanti ᯅ@ivanfioravanti·4 Apr

@gelim 🤞

Ivan Fioravanti ᯅ@ivanfioravanti·4 Apr

Where is MiniMax 2.7 on HF? 🤔

gelim@gelim·29 Mar

@NielsRogge How was the speed with MiniMax (pp & tg), and what was the max context, at which quant? Using llama.cpp, I guess, since you mention Q3.

Niels Rogge@NielsRogge·29 Mar

Some learnings: I was able to run MiniMax-M2.5 (3-bit GGUF) on the DGX Spark (about 90 GB size), but the speed was not really acceptable So switched to Qwen3.5-35B-A3B cause Qwen3.5-27B is also too slow. MoEs are better in terms of speed on memory-bandwidth constrained devices like the DGX, as only 3B parameters are activated for each token
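
The bandwidth argument in the tweet above can be sketched as a back-of-the-envelope bound: if decoding is memory-bandwidth bound, every generated token has to stream the active weights from memory once, so tokens per second cannot exceed bandwidth divided by the bytes of active weights. The bandwidth figure and the 4-bit weight width below are assumptions chosen for illustration and are not stated in the thread; the function name is hypothetical. Real throughput will be lower (KV cache reads, routing overhead, compute limits).

```python
def decode_tps_upper_bound(active_params_billion: float,
                           bits_per_weight: float,
                           bandwidth_gb_s: float) -> float:
    """Rough ceiling on decode tokens/s for a memory-bandwidth-bound device.

    Assumes each token reads every *active* weight exactly once and
    ignores KV cache traffic and compute time.
    """
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumptions (not from the thread): unified-memory bandwidth and quantization width.
BANDWIDTH_GB_S = 273.0
BITS_PER_WEIGHT = 4.0

# Dense 27B model: all 27B parameters are read for each token.
dense_27b = decode_tps_upper_bound(27, BITS_PER_WEIGHT, BANDWIDTH_GB_S)

# 35B-A3B MoE: only about 3B parameters are active per token.
moe_a3b = decode_tps_upper_bound(3, BITS_PER_WEIGHT, BANDWIDTH_GB_S)

print(f"dense 27B ceiling : {dense_27b:.1f} tok/s")
print(f"MoE A3B ceiling   : {moe_a3b:.1f} tok/s")
print(f"MoE advantage     : {moe_a3b / dense_27b:.1f}x")
```

Whatever bandwidth and quantization are plugged in, the ratio between the two ceilings is simply 27 / 3 = 9, which is the point of the tweet: on such devices the active parameter count, not the total, drives generation speed.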

gelim@gelim·23 Mar

@ivanfioravanti @nyan4maru @grok Pointless to stick with the DGX Spark, which is a bit overpriced. I'll take 2 Asus Ascent GX10 for the price of your MBP M5 Max: 2x3k€ vs. 6k€. Same NVMe storage in total, twice the unified RAM. What about thermal throttling under sustained load? 🍊 & 🍏 :-)

Ivan Fioravanti ᯅ@ivanfioravanti·21 Mar

@nyan4maru @grok please compare the price of a MacBook with M5 Max and 128GB of RAM with Nvidia DGX Spark

Ivan Fioravanti ᯅ@ivanfioravanti·21 Mar

Apple M5 Max crushes Nvidia DGX Spark. Prove me wrong.

gelim@gelim·22 Mar

@Carlos_Ajoy @TeksEdge As to how much the quality is impacted with this quantization level... That stays unanswered 😕

gelim@gelim·22 Mar

@Carlos_Ajoy @TeksEdge UD-Q3_K_XL (unsloth GGUF) and max 64k context in Q8. Probably better to have the machine headless to spare some RAM.

David Hendrickson@TeksEdge·22 Mar

Great OS news! MiniMax M2.7 weight will be available in a couple of weeks. MiniMax M2.5 was ~450GB with 8bit ~250GB fitting on a single top end Mac Studio.

Skyler Miao@SkylerMiao7

M2.7 open weights coming in ~2 weeks. still actively iterating just updated a new version on yesterday — noticeably better on OpenClaw.

gelim@gelim·22 Mar

@SkylerMiao7 Goat. ♥️

Skyler Miao@SkylerMiao7·22 Mar

M2.7 open weights coming in ~2 weeks. still actively iterating just updated a new version on yesterday — noticeably better on OpenClaw.

gelim@gelim·22 Mar

x.com/i/status/20357…

Skyler Miao@SkylerMiao7

M2.7 open weights coming in ~2 weeks. still actively iterating just updated a new version on yesterday — noticeably better on OpenClaw.

gelim@gelim·20 Mar

@MiniMax_AI M2.7 ♥️🤗?

gelim@gelim·22 Mar

@iamsupersocks Breaking news that's really nice to see x.com/i/status/20357…

Skyler Miao@SkylerMiao7

M2.7 open weights coming in ~2 weeks. still actively iterating just updated a new version on yesterday — noticeably better on OpenClaw.

Supersocks@iamsupersocks·21 Mar

@gelim M2.5 was a real one for sure. But the throne didn't stay empty long → Nemotron Coalition is here, Cascade 2 just dropped, and Qwen3.5 is probably the biggest gift to the open-weight community right now. Good times to build local.

Supersocks@iamsupersocks·20 Mar

The golden age of open-weight is closing, but not where you'd think.

MiniMax M2.7 has just been released. Performance on par with GLM-5 at a third of the cost. But the real news isn't the benchmarks. It's the first MiniMax flagship released without its weights being published. No download. No HuggingFace. API only. MiniMax, the Chinese lab that used to publish everything in the open, has just closed the door.

To understand why this is a strong signal, you have to rewind. For a year, Chinese labs dominated open-weight. DeepSeek V3, Qwen3, GLM-4.7, MiMo, MiniMax M2 → all published. Open licenses. Weights downloadable on release day. Anyone could run them, modify them, deploy them. It wasn't generosity. It was a strategy: flood the market with free models to cut the ground from under closed US labs like OpenAI and Anthropic.

Today the dominoes are falling.
→ MiniMax M2.7: closed. API only
→ GLM-5 Turbo (Z.ai): the base model stays open, but the best version is closed
→ Qwen (Alibaba): leadership departures, rumors of a pivot to proprietary
Three labs. Same direction. Same quarter.

When a training run costs $100M+ and your investors want a return, giving away your best models for free becomes hard to justify.

But careful. The Chinese labs aren't dropping open-weight. They're restructuring. The pattern taking shape is DeepSeek's:
→ base model: published, open, downloadable
→ premium version: closed, accessible only through a paid API
They are closing the high end while keeping a foot in the open ecosystem. Sovereignty, influence, control of the dev community. It's not an abandonment. It's a hybrid strategy.

And while China closes up at the top, NVIDIA is making the opposite move. At GTC last week, Jensen Huang launched the Nemotron Coalition. Eight founding labs: Mistral AI, Cursor, LangChain, Perplexity, Reflection AI, Black Forest Labs, Sarvam, Thinking Machines Lab (founded by Mira Murati, ex-CTO of OpenAI). First project: a frontier model co-developed by Mistral and NVIDIA. Trained on DGX Cloud. Published in the open. The basis for the future Nemotron 4.

And the current lineup is already impressive. Nemotron 3 Super: 120 billion parameters, 12 billion active at inference. Hybrid architecture. Open weights + 10 trillion tokens of training data published + complete recipes. Number 1 on DeepResearch Bench. Nemotron-Cascade 2, released today: 30 billion parameters, 3 billion active. Beats Qwen3.5 (a model 40x bigger) on math, code and agentic workflows. 3 billion active parameters. That's the kind of model that runs on your machine, not on a cluster.

NVIDIA's strategy is crystal clear. The better open models run on NVIDIA GPUs, the more NVIDIA the world buys. Open-weight isn't a gift. It's a machine for creating demand for compute. The exact opposite of the Chinese move. The Chinese labs are closing because open no longer pays enough. NVIDIA is opening because open sells hardware, and they need to diversify their vertical to stay the leader. NVIDIA's real product is dependence on compute.

And Mistral in all this? The French lab is positioning itself as the European arm of this coalition. But let's not be naive. Mistral isn't open out of ideology. The same week as the coalition, they launched Forge → a custom training platform for enterprises. Pure lock-in. Open-weight for distribution. Proprietary for revenue. The same calculation as everyone else. Open-weight isn't an ideology. It's a distribution strategy.

The landscape taking shape for the coming months:
→ Frontier: you pay for an API. OpenAI, Anthropic, Google. You control nothing.
→ Mid-tier: open-weight. NVIDIA + Mistral + Chinese labs in hybrid mode. You download, you customize, you deploy at home.
→ Local: explosion. Models with 3 to 80 billion active parameters running on consumer hardware.

When a model with 3 billion active parameters beats models 40 times bigger, the question "why am I paying for an API?" becomes harder and harder to ignore.

Open-weight isn't disappearing. It's changing hands. The Chinese labs are restructuring. NVIDIA is picking up the torch. Mistral is hanging on. Meta is closing up. Nobody is open out of conviction.

For those building locally, this is paradoxically the best moment in history. The frontier will always be someone else's API. The local sweet spot is yours.

MiniMax (official)@MiniMax_AI

Introducing MiniMax-M2.7, our first model which deeply participated in its own evolution, with an 88% win-rate vs M2.5 - Production-Ready SWE: With SOTA performance in SWE-Pro (56.22%) and Terminal Bench 2 (57.0%), M2.7 reduced intervention-to-recovery time for online incidents to 3-min on certain occasions. - Advanced Agentic Abilities: Trained for Agent Teams and tool search tool, with 97% skill adherence across 40+ complex skills. M2.7 is on par with Sonnet 4.6 in OpenClaw. - Professional Workspace: SOTA in professional knowledge, supports multi-turn, high-fidelity Office file editing. MiniMax Agent: agent.minimax.io API: platform.minimax.io Token Plan: platform.minimax.io/subscribe/toke…

gelim@gelim·14 Mar

@ziwenxu_ @ASolovichh No need to buy the Nvidia one if you just want a GB10. Asus, Dell, MSI etc. sell the same. If you are ready to make a compromise on the NVMe disk size (and upgrade for cheaper) you can go for 3k$ youtube.com/watch?v=QbtSco…

Ziwen@ziwenxu_·14 Mar

@ASolovichh Thinking about buying one too but 5k lmao 🤣

Ziwen@ziwenxu_·13 Mar

Someone benchmarked 8 local LLMs on DGX Spark. If you're wondering which model to run locally, check this article!

0xsatorisan@0xsatorisan

x.com/i/article/2030…

gelim@gelim·2 Mar

@markfrancisio @TeksEdge ASUS Ascent GX10 (3k$ is for the 1 TB disk) asus.com/networking-iot…

Mark James Francis@markfrancisio·1 Mar

@TeksEdge Grok thinks that $0.30 would be $0.74 in the UK! What's the DGX spark clone? How does Qwen3.5 27B compare to Sonnet 4.6? Do we have stats?

David Hendrickson@TeksEdge·1 Mar

💵Home Inferencing Cost Comparison Running For 1 Day. 🏠 Personal LLM 🖥️ DGX-Spark Clone ($3K Asus) 🤖 Qwen3.5 27B @ 30 tps ⏲️ 24 hours 🪙 2.6M tokens Cost = $0.30 of electricity 🏭 BigAI 🤖 Sonnet 4.6 @ 55tps (my experience) ⏲️ 13 hours 🪙 2.6M tokens Cost = $39 in tokens
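
The figures in the comparison above can be checked with a few lines of arithmetic. This is only a sketch that reuses the tweet's own numbers (tokens per second, hours, dollar totals); the per-million-token prices it derives are implied by those numbers and are not quoted from any price list, and the function name is hypothetical. The local figure covers electricity only, as in the tweet.

```python
SECONDS_PER_HOUR = 3600


def tokens_generated(tokens_per_second: float, hours: float) -> int:
    """Total tokens produced at a constant generation rate."""
    return int(tokens_per_second * hours * SECONDS_PER_HOUR)


# Numbers taken from the tweet.
local_tokens = tokens_generated(30, 24)   # Qwen3.5 27B at 30 tps for 24 hours
api_tokens = tokens_generated(55, 13)     # Sonnet 4.6 at 55 tps for 13 hours
local_cost_usd = 0.30                     # electricity only
api_cost_usd = 39.0                       # token charges

# Implied cost per million tokens for each option.
local_per_million = local_cost_usd / (local_tokens / 1e6)
api_per_million = api_cost_usd / (api_tokens / 1e6)

print(f"local: {local_tokens:,} tokens, ${local_per_million:.3f} per 1M tokens")
print(f"API  : {api_tokens:,} tokens, ${api_per_million:.2f} per 1M tokens")
print(f"total cost ratio: {api_cost_usd / local_cost_usd:.0f}x")
```

Both runs land at roughly 2.6M tokens (2,592,000 and 2,574,000), so the tweet's token count is consistent with its stated rates and durations.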

gelim retweeted

Benjamin Marie@bnjmn_marie·23 Feb

Here’s a more complete evaluation of GGUF variants of Qwen3.5 (models by @UnslothAI ), and it’s way better than I expected. - Qwen3.5 is very robust to Unsloth quantization - TQ1_0 preserves the original model’s accuracy extremely well - Most of the degradation is on MMLU Pro (meaning that the model lost a bit of its world knowledge) - At 94 GB, the TQ1_0 reduces memory usage by 700 GB (!) I ran the eval at temperature = 0, but TQ1_0 looked so strong that I double-checked with temperature = 0.6 and top_p = 0.95, and the results were more or less the same. Note: the goal of this eval is only to measure the degradation relative to the original model. It does not tell you how good the model is at the tasks in absolute terms, at least, not directly. For comparison, 94 GB is about what a standard 47B-parameter model would consume. That puts Qwen3.5 (TQ1_0) in “best model under 100 GB” territory.
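
The size comparisons in the tweet above follow from simple weight-size arithmetic. The sketch below assumes that "standard" means 16-bit weights and that the original model occupies 94 GB + 700 GB; both are assumptions read into the tweet, and the parameter count and bits-per-weight it derives are implied values, not figures stated by the author. The helper name is hypothetical, and KV cache and runtime overhead are ignored.

```python
def weights_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# "94 GB is about what a standard 47B-parameter model would consume":
# holds if "standard" means 16-bit weights (assumption).
size_47b_16bit = weights_size_gb(47, 16)

# "reduces memory usage by 700 GB": implied size of the original model.
quantized_gb = 94
original_gb = quantized_gb + 700

# Implied parameter count if the original is stored at 16 bits per weight.
implied_params_billion = original_gb * 8 / 16

# Implied average bits per weight of the 94 GB quantized file.
implied_bits_per_weight = quantized_gb * 8 / implied_params_billion

print(f"47B at 16-bit        : {size_47b_16bit:.0f} GB")
print(f"original model size  : {original_gb} GB")
print(f"implied parameters   : {implied_params_billion:.0f}B")
print(f"implied bits/weight  : {implied_bits_per_weight:.2f}")
```

Under these assumptions the quantized file averages just under 2 bits per weight, which is why the measured accuracy retention reads as surprising.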

gelim@gelim·22 Feb

Any example of big companies that invested in inference hw instead of taking subs in any of the closed sota models? Good choice of open weights models now very good on coding. CAPEX just not the same if you plan on leveraging 80B MoE or those 500B+ beasts...

gelim@gelim·8 Feb

@StepFun_ai impressive 👏

gelim@gelim·8 Feb

Looks like there's a new contender in the 200B MoE space. Step-3.5-Flash from 上海阶跃星辰智能科技有限公司 aka Stepfun. Fantastic work of the llama.cpp community + stepfun initial PR. Llama.cpp support is fresh. Ubergarm IQ4_XS working like a charm.

gelim@gelim·8 Feb

huggingface.co/ubergarm/Step-… huggingface.co/ggml-org/Step-…
