Andrej Szontagh

1K posts

@ScatteraAI

Problem Solver

Joined March 2013

2.5K Following · 683 Followers

Andrej Szontagh@ScatteraAI·14h

@marmaduke091 Now we need a feature to store and load contexts :D .. and that's basically a memory now.

492

can@marmaduke091·1d

🚨 100M TOKEN CONTEXT WITHOUT COLLAPSE > <9% degradation from 16K → 100M > beats RAG + rerank + SOTA pipelines > runs on just 2×A800 GPUs we could be back

艾略特@elliotchen100

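The paper is out. It is called MSA, Memory Sparse Attention. In one sentence: it gives large models native ultra-long memory. Not bolt-on retrieval, not brute-force window expansion, but "memory" grown directly into the attention mechanism and trained end to end. Why did earlier approaches fall short? RAG is essentially an "open-book exam". The model remembers nothing itself and relies entirely on flipping through notes on the spot. How accurately it flips depends on retrieval quality; how fast depends on data volume. Once the information is scattered across dozens of documents and needs cross-document reasoning, it is lost. Linear attention and KV caches are essentially "compressed memory". Things do get remembered, but the more you compress the blurrier it gets, and long content is dropped. MSA takes a completely different approach: → No compression and no bolt-ons; instead the model learns to "pick out what matters". The core is a scalable sparse attention architecture with linear complexity. Grow the memory 10x and the compute cost does not explode exponentially. → The model knows "where this memory came from and when". It uses a positional encoding called document-wise RoPE, so the model naturally understands document boundaries and temporal order. → Fragmented information can also be chained together for reasoning. The Memory Interleaving mechanism lets the model do multi-hop reasoning across scattered memory fragments. It does not just find one relevant record; it links the clues into a chain. The results? · Scaling from 16K to 100 million tokens, accuracy degrades by less than 9% · A 4B-parameter MSA model beats top-tier 235B-class RAG systems on long-context benchmarks · 2 A800s are enough to run inference over 100 million tokens. This is not lab-only; it is a cost a startup can afford. Put plainly, large models used to be an extremely smart genius with a goldfish's memory. What MSA wants to do is let it truly "remember". We have put it on GitHub; the algorithm folks worked hard, so feel free to leave a star in support. 🌟👀🙏 github.com/EverMind-AI/MSA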

116

1.6K

186.2K

Andrej Szontagh@ScatteraAI·1d

@DaveShapi We should normalize not having an opinion on matters nobody knows anything about.

David Shapiro (L/0)@DaveShapi·1d

Machines will ALWAYS just be tools, no matter how sophisticated they become. They will never have moral agency, personhood, or consciousness in a way that is legally, ethically, or philosophically salient. Agree or disagree?

5.6K

Andrej Szontagh@ScatteraAI·1d

@arena @MiniMax_AI The model wouldn't be all that interesting on its own without considering how much cheaper it is than models of similar capability.

139

Arena.ai@arena·1d

MiniMax M2.7 is ranked #8 in Code Arena. It’s also the most cost-efficient of the top 10 at $0.30 / $1.20 per MToken. Congrats to the team at @MiniMax_AI 👏

MiniMax (official)@MiniMax_AI

Introducing MiniMax-M2.7, our first model which deeply participated in its own evolution, with an 88% win-rate vs M2.5 - Production-Ready SWE: With SOTA performance in SWE-Pro (56.22%) and Terminal Bench 2 (57.0%), M2.7 reduced intervention-to-recovery time for online incidents to 3-min on certain occasions. - Advanced Agentic Abilities: Trained for Agent Teams and tool search tool, with 97% skill adherence across 40+ complex skills. M2.7 is on par with Sonnet 4.6 in OpenClaw. - Professional Workspace: SOTA in professional knowledge, supports multi-turn, high-fidelity Office file editing. MiniMax Agent: agent.minimax.io API: platform.minimax.io Token Plan: platform.minimax.io/subscribe/toke…

483

42.6K

Andrej Szontagh@ScatteraAI·2d

@SkylerMiao7 Can't wait for them to add this to Ollama Cloud.

156

Skyler Miao@SkylerMiao7·2d

M3 is scaling up.

Zephyr@zephyr_z9

This is an extremely strong model I really wanna see Minimax scale to 1 or 2 trillion parameters

530

29.4K

Andrej Szontagh@ScatteraAI·2d

@RealPostFolder I actually get it, I do feel the same, but I still find things that are not boring for me. What really seems boring is what most other people do, and if you asked them what they would do if they had millions of $$$, that's all so boring to me. I am anything but bored.

790

Real Post Folder@RealPostFolder·2d

735

182

3.9K

1.9M

Andrej Szontagh@ScatteraAI·2d

@elonmusk I really like what you have managed to achieve with hallucinations. I believe that's the right path for Grok.

Elon Musk@elonmusk·2d

What are your initial impressions of Grok 4.20? Major upgrades are still landing every week.

Testlabor@testerlabor

Grok 4.20 is now officially out of Beta. It's now on Auto, Fast, Expert & Heavy.

7.4K

3.5K

25.7K

7.5M

Andrej Szontagh@ScatteraAI·2d

@bilawalsidhu I believe most people read the Dunning-Kruger effect curve wrong. The curve never reaches the height of the peak at the start; in fact, I believe that most of the growth on the right side is just understanding how much you know compared to other people ..

198

Bilawal Sidhu@bilawalsidhu·2d

DLSS 5 might be the moment where the anti AI pendulum starts swinging back. Many in the 3D community who were against generative AI are now pushing back on the "everything is AI slop" crowd. The pendulum swung too far and they can feel it. Nice to see the rebalancing.

Georgian Avasilcutei@nimlot26

After this whole debate about DLSS 5 I came to the conclusion that most of the people talking about it are completely unaware of what they don't know...they're on the peak of ignorance and don't even grasp how little they understand. They just heard generative AI and like Pavlov's dog they just start drooling thinking it's the same shit as unethical slop image generators...for the love of Christ...go and educate yourself before raging on the internet for no reason. DLSS 5 is not a prompt-based generator...it's not creating stuff based on someone else's images and hallucinating results. It uses the information from the raster to build up a final render frame with the same information but with better lighting and shading... I'll even give you an example of how much of an impact better shading and lighting has. This is a character I've worked on not long ago. On the left you have a raster render, with some bad shaders. On the right you have a render with raytracing on and a much better shader for both hair and skin. They don't even look like the same person...do they? This is what DLSS 5 is doing...getting a result like the one on the right (tbh a lot better) at a smaller cost than actually rendering it. Still the same geo, same textures, same light sources. Some of you will go and say the one on the left is better and it's the artist's vision. It's not...it's just the artist's limitation due to shading and lighting constraints. Every single artist out there would love to get the right result in real time.

225

18.8K

Andrej Szontagh@ScatteraAI·2d

@allgarbled It's a pretty nice touch to the design, I actually like this.

gabe@allgarbled·3d

I don’t know what you call them, but these little side tabs are like the emdash of vibe coded UIs

Juri Strumpflohner@juristr

What's your AI adoption level? (according to Steve Yegge)

622

520

15.3K

Andrej Szontagh@ScatteraAI·2d

@DaveShapi The funny thing is that the barrier of entry made the pursuit of business worthwhile. If the barrier of entry drops too much, the whole "business" stops making sense. It dissolves. Which leads to the collapse of our current paradigm of how the economy operates.

David Shapiro (L/0)@DaveShapi·3d

This will be very interesting because, yes, the barrier of entry is going to drop. But that means every TAM is going to be saturated very quickly. And the only moat will be execution speed and reach, which is going to be based upon things like reputation and trust. And ultimately, the margins will collapse for most of the startups, and they will consolidate.

Varshika Prasanna@varshikaARK

x.com/i/article/2033…

178

17.9K

Andrej Szontagh@ScatteraAI·2d

@digitalfoundry Some people were just living under a rock or intentionally ignored something that made them feel uneasy, and this was the moment when they couldn't cover their eyes anymore. It was a long time coming, and people are losing their minds as expected.

Digital Foundry@digitalfoundry·2d

The big DLSS 5 machine learning debate and why we should have waited before posting our first round of coverage - today's video: youtu.be/5dTTfjBAFzc

129

1.9K

2.1M

Andrej Szontagh@ScatteraAI·2d

@bridgemindai This is interesting. So the self-improvement was probably just focused on acing some specific benchmarks.

625

BridgeMind@bridgemindai·2d

MiniMax M2.7 scores worse than M2.5 on BridgeBench. M2.5 ranked #12. Overall 92.3. M2.7 ranked #19. Overall 88.1. UI dropped from 76.6 to 61.9. Refactor from 97.3 to 90.7. Gen from 94.3 to 89.2. #1 on Multi-SWE Bench. #19 on BridgeBench. That's an 18-rank difference between synthetic benchmarks and real vibe coding evaluations. This is why BridgeBench exists. bridgemind.ai/bridgebench

13.1K

Andrej Szontagh@ScatteraAI·2d

@kimmonismus Here we go .. cheap and good, this is what we call an intelligence explosion. Big labs are going to have a big problem.

554

Chubby♨️@kimmonismus·2d

MiniMax M2.7 released! And it's a big one. Highlights: Self-evolving - first model that helped build itself, running 100+ autonomous optimization loops during its own RL training (30% internal improvement). Strong coder - 56.2% on SWE-Pro (near Opus 4.6), 55.6% on VIBE-Pro, production debugging down to under 3 minutes. ML research agent - 66.6% medal rate on MLE Bench Lite, tying Gemini 3.1. Office work - top open-source ELO on GDPval-AA (1495), 97% skill adherence, can do end-to-end analyst workflows (reports, models, PPTs). Native multi-agent and a new open-source interactive character demo called OpenRoom.

733

49.9K

Andrej Szontagh@ScatteraAI·2d

@cryptopunk7213 MiniMax M2.5 was the only OS model that I was using; everything else I have tried was just too bad for my purpose. It wasn't on par with frontier models, but close enough and for 10-30x less cost! (which is unbelievable)

375

Ejaaz@cryptopunk7213·2d

fuck me china just launched the 1st AI model that autonomously built itself... and its as good as claude opus 4.6 and gpt-5.4 - minimax M2.7 trained itself through 100+ rounds of autonomous self-improvement. 30% gain. No humans involved - what the actual f*ck - model now handles 30-50% of the AI lab's OWN AI research - beats gemini 3.1 at coding and pretty much matches opus 4.6 + gpt 5.4 😶 (china used to lag now they match) - doesn't require crazy hardware to run (single a30 gpu) - absolutely CRUSHES tasks: financial modelling, coding, openclaw - one-shotted the chinese have officially caught up. self-improving ai is a real thing. all researchers did was set an objective and the model figured the rest out. i wasn't expecting this from minimax. im now wondering wtf deepseek is going to be like.

MiniMax_Agent@MiniMaxAgent

MiniMax-M2.7 just landed in MiniMax Agent. The model helped build itself. Now it's here to build for you. ↓ Try Now: agent.minimax.io

221

320

2.9K

394.9K

Andrej Szontagh@ScatteraAI·2d

@mark_k @midjourney Honestly, when it comes to images, we have come very close to the "it can do anything" level, so Midjourney is still one of the best image generators, especially for artistic creations. Maybe they will get sold eventually.

154

Mark Kretschmann@mark_k·2d

Here's the problem with @Midjourney, in a nutshell: They are stuck with their technology stack, which is Diffusion-Transformer (DiT). This is essentially still the same tech as Stable Diffusion and FLUX used. Newer models like Nano Banana and GPT-Image are multimodal transformers. Essentially, the same model that generates text (like GPT) can also generate images! This allows for much better prompt understanding and unparalleled realism. So why can't Midjourney switch to the same technology? Because it's crazy expensive and difficult to train an entire foundation model. Midjourney is a small company and doesn't have the resources to do this. It would also take years to perfect.

448

42.6K

Andrej Szontagh@ScatteraAI·2d

@DaveShapi Yes, but I would prefer the fast version of this story rather than the slow one.

David Shapiro (L/0)@DaveShapi·2d

Labor is going away AND we will all be happier for it.

Hoops@Hoopss

What opinion will get you in this position?

426

13.7K

Andrej Szontagh@ScatteraAI·2d

@ImLukeF @MiniMax_AI Yes, this is the top OS model, mainly because of the performance/cost ratio, which is unmatched. How about pricing for M2.7?

1.6K

Luke@ImLukeF·2d

@MiniMax_AI M2.7 Let the testing begin.... Big fan of M2.5, so this is exciting!

167

66.5K

Andrej Szontagh@ScatteraAI·3d

@lady_valor_07 yummy dogshit cake

LadyValor@lady_valor_07·3d

First word that comes to mind when you see this cake?

4.3K

348

7.3K

3.1M

Andrej Szontagh@ScatteraAI·3d

@AymericRoucher @petergostev Gemini feels like the AI is kind of high all the time and loses contact with reality on occasion, but at least it keeps me entertained.

291

m_ric@AymericRoucher·3d

I've long preferred Claude Code over Codex or Gemini, because it seemed much more reliable, but couldn't explain why : now Bullshit Bench by @petergostev provides compelling numbers. It measures bullshit as "when given false premises disguised in jargon, will the model go with the flow (=bullshit) or push back (=truthful)" And Claude is leagues ahead ! Also, this objective of truthfulness is probably at odds with the Chatbot Arena emergent objective of "pleasant chat experience" ; but a model optimizing for the former will be more useful.

114

1.1K

102.8K

Andrej Szontagh@ScatteraAI·3d

@DaveShapi Congrats!

David Shapiro (L/0)@DaveShapi·3d

FUCK YES MY LABOR/ZERO KICKSTARTER WAS FULLY FUNDED IN LESS THAN 2 HOURS!!!! 🚀🚀🚀 WE'RE SHOOTING FOR THE MOON, BOYS!!! POST-LABOR ECONOMICS IS HAPPENING AND YOU MADE IT HAPPEN!!! THIS IS YOUR VICTORY MORE THAN IT IS MINE! I'M JUST A DUDE WHO WROTE A BOOK, BUT IT BELONGS TO THE WORLD!!!!

629

16.7K

Andrej Szontagh@ScatteraAI·3d

@GamersNexus With this level of realism, everything else looks fake in contrast (I am talking about animations).

GamersNexus@GamersNexus·3d

The problem, and I know this is hard to follow, is that it... looks like shit If you smeared dog shit on a camera lens, lit someone's face up with about $8,000 of studio-grade 5500K lighting, and then added some lipstick, you get DLSS5. It looks like shit. It's that simple.

Jay Dook@JayDook

If DLSS 5 is ‘AI slop,’ but current in-market DLSS, FSR, frame gen, PSSR, etc. are fine, then your problem clearly isn’t AI. So what is it? Defend your position without virtue signaling. Replies are open.

509

603

8.6K

223.5K
