cedric
@cedric_chee
10.6K posts

SWE | @fastdotai alumni, independent researcher, tester | ex-entrepreneur @AntlerGlobal | GitHub: cedrickchee | building a new computer

Supercomputer · Joined November 2007
435 Following · 3.2K Followers
Pinned Tweet
cedric
cedric@cedric_chee·
Insane. We got close to Opus 4.5 at home at >70 tokens/s
cyysky@cyysky

@cedric_chee MiniMax 2.5 full precision FP8 running LOCALLY on vLLM x 8x Pro 6000 🔥 Hosting it is easier than I thought; it just reuses the same script as M2.1. Time to do the vibe coding test! Generation: 70 tokens/sec and 122 tokens/sec across two connections. Peak memory: 728GB

10
13
297
30.6K
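For anyone trying this at home: a minimal sketch of what "reusing the same script as M2.1" might look like, modeled on the M2.7 command cedric posts later in this feed. The model path, parallel size, and memory fraction here are assumptions, not cyysky's actual config:

# Sketch (assumptions as noted above): serve MiniMax M2.5 FP8
# sharded across 8 GPUs; vLLM picks up the FP8 quantization from
# the checkpoint, and tensor parallelism puts one shard per card.
$ vllm serve MiniMaxAI/MiniMax-M2.5 \
  --served-model-name minimax-m2.5 \
  --tensor-parallel-size 8 \
  --gpu-memory-utilization 0.90 \
  --trust_remote_code \
  --port 9501

A 728GB peak across eight 96GB Pro 6000s (768GB total) is roughly what you would expect with the weights and KV cache sharded across all cards.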
cedric
cedric@cedric_chee·
@covrovski i want the linux app so badly
0
0
1
6
cedric
cedric@cedric_chee·
What Codex is cooking. Super app? 0.118 -> 0.120 features list:
- under development: remote_control, tool_search
- now experimental: image_detail_original
- now stable: shell_snapshot, shell_tool, tool_suggest, undo, unified_exec, use_legacy_landlock, skill_mcp_dependency_install, tool_call_mcp_elicitation
No guesswork.
cedric tweet media
Tibo@thsottiaux

Codex App has achieved take-off internally. I can hear the fans

1
0
2
253
cedric
cedric@cedric_chee·
Sadly, my tweet got caught up in this. Y'all are being pretty harsh when I already admitted I made a mistake and clarified right after I posted. The licensing change is genuinely confusing, even for someone with years of open source experience.
cedric tweet media
0
0
0
79
cedric
cedric@cedric_chee·
My group is still not entirely clear on what counts as legitimate community use. I'll email them to clarify. "Being fast and reasonable on commercial authorization requests — DM me on X or send email" is MiniMax drawing a line so users get a better experience and serious providers aren't punished for doing it properly. Does MiniMax tightening the commercial license make sense?
cedric tweet media
RyanLee@RyanLeeMiniMax

x.com/i/article/2043…

1
1
3
271
cedric retweeted
geoff
geoff@GeoffreyHuntley·
suspect folks aren't ready for time-to-last-token at or under 200ms: entire applications at 5 generations per sec
14
7
121
18.6K
cedric retweeted
Andrew Curran
Andrew Curran@AndrewCurran_·
There has been a great deal of speculation about why Anthropic is keeping Mythos in restricted release. One of the least-discussed reasons is cost. Not the cost to Anthropic of serving the model, but the downstream effects that cost will have on the industry, and on the world.

Mythos is now being served to a small group of about 50 major companies. For organizations like these, token budgets are effectively unlimited, and the opportunity cost of not using as much of the model as possible is too high.

I think you can already see the downstream effects even in this limited release. Claude users complain about hitting caps faster. They complain about degraded performance. For months now almost everyone I know has been continuously hitting the cap on Claude or Codex. The existence of Mythos pressures not just the amount of usage available to smaller subscribers, but also the pricing of these plans themselves, which are already subsidized. Smaller users will get hit twice.

The compute cost of serving Mythos exerts pressure all the way down the line. Inference will get cheaper over time, but demand is already ahead of that curve and continues to expand. Mythos is not the end of this chain. As long as scale keeps rewarding larger runs, larger models will keep being trained. The next model that makes a Mythos-like jump may be dramatically larger again, and much more expensive to serve.

If the cost of serving frontier models continues to outpace attempts to reduce it, then smaller players and public use get squeezed out. We end up with vast models, served at immense cost, available only to the richest corporations on earth. Those firms then use that access to outcompete smaller rivals, become richer still, and widen the gap again. If this continues, a small number of giant companies end up holding the only passports to the Country of Geniuses in a datacenter.

For Anthropic, culturally, this is not a desirable world. Part of their reluctance to serve Mythos more broadly comes from a reluctance to help bring this world into being. There may be no way to serve a model like Mythos at scale right now without beginning this feedback loop.

And as that loop accelerates, it will generate great resentment. If they serve it to lower-tier subscribers, those users get a handful of exchanges before hitting the cap. Seeing how capable the model is only deepens the resentment, because access is visibly rationed. The labs will be forced to make a trickle-down argument: let the largest firms use the models first, and the abundance will eventually spread to everyone else. The public is unlikely to buy this argument. The hostility and pushback against the industry will spiral. Eventually it may not remain merely political.

It is not only Dario who has seen this world, but Sam as well. That is part of why OpenAI has started talking about mechanisms that would give ordinary citizens a direct stake in the upside of the industry, like the Public Wealth Fund. In my opinion the original use case of Worldcoin was a global UBI in a future where OpenAI won the race. Not only is that future no longer certain, but the trust and solidarity required to support a UBI no longer seem to me to exist in the West. The only path then is simply to scale everything as quickly as possible and hope abundance eventually arrives in a cascade strong enough that it reaches everyone on earth.

To my friends who are in the safety camp, I understand this argument is hard to accept. Please consider that there is a level of capability beyond which, unless your p(doom) is literally 100, stopping becomes more dangerous than continuing. I think we passed that threshold even before Mythos. Even if stopping were possible - and I personally do not believe it has been for years - stopping here would lock in a dystopia. This dynamic is incentive-driven, just like the race itself, and just as hard to coordinate against. We must not stop inside this tunnel. The only way out is through.
48
66
611
50.1K
cedric
cedric@cedric_chee·
@VictorWilsonDev As they like to say, you can just do things. I'm porting github.com/johnzfitch/cla… to my distro as we speak. I don't have time to port from scratch this time around. Have you found a better starting point for a Linux port?
0
0
1
17
Victor Wilson
Victor Wilson@VictorWilsonDev·
@cedric_chee well i'm sure the 8 CCW/Linux users can figure something out
1
0
0
14
cedric
cedric@cedric_chee·
@JohnThilen True. I adapted the Codex app. Agree. I'm tired, boss.
0
0
1
21
John Thilén
John Thilén@JohnThilen·
@cedric_chee People who use Linux on desktop are used to software companies ignoring them, and can adapt. But this priority does show what Anthropic values, and it is not business-critical infrastructure.
1
0
0
23
cedric
cedric@cedric_chee·
@darekgusto Similar trend. Also Qwen. We just can't have nice things :(
0
0
1
31
cedric
cedric@cedric_chee·
Mad respect for the open source commitment. Just like MiniMax M2.5, local deployment is solid. My group got the vLLM inference up & running in no time. Details below.
MiniMax (official)@MiniMax_AI

We're delighted to announce that MiniMax M2.7 is now officially open source, with SOTA performance on SWE-Pro (56.22%) and Terminal Bench 2 (57.0%). You can find it on Hugging Face now. Enjoy!🤗 huggingface: huggingface.co/MiniMaxAI/Mini… Blog: minimax.io/news/minimax-m… MiniMax API: platform.minimax.io

3
0
9
716
cedric
cedric@cedric_chee·
A lot of people will complain about the license. Still, I would rather see the weights released under a non-commercial license than kept fully closed.
1
0
1
101
cedric
cedric@cedric_chee·
@darekgusto Oof. I overlooked this. I digress: M2.7 is open weights, with non-commercial use permitted under MIT-style terms. Commercial use is more restrictive than Kimi-K2.5's modified MIT license. 😭
1
0
1
50
cedric
cedric@cedric_chee·
Naice! You should share the vLLM inference speed and throughput here. How many tokens/s? How many GPUs utilized? Definitely not 4, right? Drop the screenshots here.

vLLM configs for those interested:

$ vllm serve MiniMax-M2.7 \
  --served-model-name minimax-m2.7 \
  --tool-call-parser minimax_m2 \
  --reasoning-parser minimax_m2 \
  --enable-auto-tool-choice \
  --tensor-parallel-size 4 \
  --gpu-memory-utilization 0.78 \
  --max-model-len -1 \
  --trust_remote_code \
  --port 9501 \
  --compilation-config '{"mode":3,"pass_config":{"fuse_minimax_qk_norm":true}}'
1
0
1
47
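Once that server is up, vLLM exposes an OpenAI-compatible API on the configured port, so a quick smoke test can be as simple as the following sketch (the endpoint path is standard vLLM; the prompt and token limit are placeholders):

# Sketch: hit the OpenAI-compatible chat endpoint vLLM serves on port 9501.
$ curl http://localhost:9501/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "minimax-m2.7",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64
      }'

Note that the "model" field must match the --served-model-name passed to vllm serve above.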
cyysky
cyysky@cyysky·
Minimax-M2.7 is up on local A6000 x4 full precision! let's go #MiniMax
cyysky tweet media
2
0
2
176
cedric
cedric@cedric_chee·
@darekgusto WDYM? Did they change the license?
1
0
1
42