Ismail Elsherbini

0

1

387

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·22h

@elsherbin_ Sonnet is maybe 600B; we have to adjust estimates downward

English

0

4

403

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·1d

I think GLM 5.2 points to a 7-month gap currently. It's around Opus 4.7-4.8 level, all told (modulo vision, which in Opus's case is garbage anyway). Mythos reached Preview status (≥ Opus 4.8, functionally) by early Feb 2026. This means full PRC Mythos ("Fable") by Nov-Dec '26.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) tweet media

Lunexa@Lunexalith

@teortaxesTex What's your current timeline for China to reach Fable class? GLM-5.2 certainly shortens the gap.

English

57

79

1.2K

442.5K

Ismail Elsherbini@elsherbin_·22h

@teortaxesTex Sonnet is 1 trillion parameters. You think Opus is 500 billion parameters larger than Sonnet? Elon tweeted that Opus-class models are around 5 trillion. I doubt he is far off from the real parameter count.

English

0

1

502

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·22h

@elsherbin_ Opus is likely 2x larger

English

0

8

2.4K

Ismail Elsherbini@elsherbin_·22h

@teortaxesTex GLM 5.2 is a massive jump for open source compared to a similar-sized model like Sonnet 4.6. It's an insane release regardless.

English

94

Ismail Elsherbini@elsherbin_·22h

@teortaxesTex "7 months behind": GLM 5.2 achieves Opus-level intelligence with a model 3 to 5 times smaller. Mythos models are rumored to be massive, even 10T parameters. If China achieved Mythos-level intelligence with a model 10 times smaller, Anthropic and OpenAI would be out of business today.

English

0

11

2.7K

Ismail Elsherbini@elsherbin_·23h

@nlw Ultra-mini Fable, we can say

English

269

Nathaniel Whittemore@nlw·1d

Is GLM-5.2 Temu Fable?

Indonesia

24

3

168

21.4K

Ismail Elsherbini@elsherbin_·1d

@kkamranxyz @lmstudio 5

0

7

867

Kamran@kkamranxyz·1d

@lmstudio How many tokens per second

English

7

0

62

8.4K

LM Studio@lmstudio·1d

For WWDC, we worked with Apple to run Kimi K2.6, a 1T-parameter model, across a cluster of four Mac Studios using a preview version of LM Studio. We showcased secure remote access from a MacBook Neo and iPhone using LM Link. A glimpse of your own private, frontier-scale AI.

English

120

301

4.3K

350.6K

Ismail Elsherbini@elsherbin_·1d

@jurbed @bindureddy Ask Fable to translate, easy

English

2

136

Juraj Bednar@jurbed·1d

@bindureddy The problem is it often answers in Chinese. And I can't read Chinese

English

0

3

1.2K

Bindu Reddy@bindureddy·1d

GLM 5.2 Is Mind Blowingly Good On Benchmarks. Yes, it even beats Opus 4.8 and GPT 5.5 on some of them. However, it is also bench-maxxed! Internal evals have it behind them 😼 STILL - A HUGE WIN FOR OPEN SOURCE AI

English

48

17

370

20.9K

Ismail Elsherbini@elsherbin_·1d

@oleksoleksoleks Did you try using it inside their coding agent app? They give 1.5x quota. I tried it and it's quite good.

English

0

3

238

Olek@oleksoleksoleks·2d

Z.ai GLM-5.2 via Lite sub: 45 min long-horizon task @ 40 tok/s before hitting the 5h limit. 16M cached, 200k input, 100k output. Off-peak hours (02:00 - 06:00 EST).

English

5

0

46

5.2K
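As a rough sanity check on the figures in Olek's post, the sketch below multiplies the reported duration by the reported speed and compares the result with the reported output count. The function name `tokens_generated` is hypothetical, and treating 40 tok/s as a steady decode rate for the full 45 minutes is an assumption the post does not state.

```python
def tokens_generated(minutes: float, tok_per_s: float) -> int:
    """Tokens produced at a constant decode rate (an assumed simplification)."""
    return round(minutes * 60 * tok_per_s)


# Figures quoted in the post: 45 min, 40 tok/s, 100k output tokens.
reported_output = 100_000
estimated = tokens_generated(45, 40)

# Average rate implied by the reported output over the same 45 minutes.
implied_rate = reported_output / (45 * 60)

print(estimated)
print(round(implied_rate, 1))
```

Under that assumption the estimate lands within about 10% of the reported 100k output tokens, so the numbers in the post look roughly self-consistent.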

Ismail Elsherbini retweeted

Vercel@vercel·1d

Introducing eve, an agent framework. 𝚊𝚐𝚎𝚗𝚝/ 𝚊𝚐𝚎𝚗𝚝.𝚝𝚜 𝚒𝚗𝚜𝚝𝚛𝚞𝚌𝚝𝚒𝚘𝚗𝚜.𝚖𝚍 𝚝𝚘𝚘𝚕𝚜/ 𝚜𝚔𝚒𝚕𝚕𝚜/ 𝚜𝚊𝚗𝚍𝚋𝚘𝚡/ 𝚜𝚌𝚑𝚎𝚍𝚞𝚕𝚎𝚜/ Like Next.js, for agents. vercel.com/blog/introduci…

English

318

711

7.1K

2.1M

Ismail Elsherbini@elsherbin_·3d

@theo What about GPT 5.6? If they took down Fable, will they take down OpenAI models too?

English

0

1

439

Theo - t3.gg@theo·3d

It's kind of wild that Fable still isn't back. Honestly thought this would be resolved quicker 🙃

English

225

40

3.6K

173K

Ismail Elsherbini@elsherbin_·3d

@themikebwebb @wesbos 🤣🤣 This is hilarious. I love Xiaomi, best company ecosystem after Apple

English

1

24

Mike Webb@themikebwebb·3d

@wesbos

QME

0

25

2.3K

Wes Bos@wesbos·3d

xiaomi - the Chinese company that makes phones, rice cookers and electric vehicles - has forked OpenCode

English

128

98

3.3K

224.6K

Ismail Elsherbini@elsherbin_·3d

@ggg78g89 @crystalsssup In my experience it's better than Opus 4.6 and near Opus 4.7 in reliability. It's expensive and not token-efficient though; Opus 4.8 medium is better and cheaper.

English

28

LiveLifewithAI-QA@ggg78g89·3d

@crystalsssup Please stop. It's not opus. Don't steal opus meme.

English

0

109

Crystal@crystalsssup·3d

Me using Kimi K2.7 to rename a file

🌘 Meet Kimi K2.7 Code HighSpeed! A high-speed mode of our latest open-source multimodal coding model, Kimi K2.7 Code. ⚡️ Up to 6× faster: Around 180 tok/s on coding tasks with median-length inputs, and up to 260 tok/s on shorter-context tasks. 🔷 Rolling out to Kimi Code Beta Program members, Kimi API developers, and Kimi Business users. (Access will remain limited for now due to capacity constraints.) 🔷 No invite needed. Anyone who joins the Beta Program has a chance to get access 👉 kimi.com/code/beta Open intelligence should be instant, affordable, and borderless. We'll continue improving the model and expanding access as more capacity becomes available! 🔗 Kimi Code: kimi.com/code 🔗 API: platform.kimi.ai

English

14

285

25.2K

Ismail Elsherbini@elsherbin_·3d

@bridgemindai I reached my limit twice in half a prompt on the 20 USD plan. It's not token-usage based, it's request based: when these AI agents loop, even a small call that consumes 2 tokens counts as 1 request. It's awful.

English

1

632

BridgeMind@bridgemindai·3d

The usage limits on the Kimi K2.7 Code plan are TERRIBLE. I got rate limited after only 30 minutes of testing. Isn't this model supposed to be cheap?

English

69

7

431

32.6K

Ismail Elsherbini@elsherbin_·5d

@mabhi1999 @its_miro1 Bibi tactics 😭😭🤣

Filipino

20

Abhishek Kumar@mabhi1999·5d

@its_miro1 How are you marketing your app ?

English

0

3

2.5K

Ismail Elsherbini@elsherbin_·5d

@tyleryust Hey man, I hope to see Kimi K2.7 Code and GLM 5.2. I am curious to see the progress of these models from Chinese labs.

English

Artificial Analysis@ArtificialAnlys

103

Tyler Yust@tyleryust·6d

DeepSWE is now the main SWE benchmark on Artificial Analysis, replacing SWE-Bench Pro. Really proud of the team.

We've updated the Artificial Analysis Coding Agent Index, replacing SWE-Bench Pro with Datacurve's DeepSWE benchmark. The swap lifts Codex with GPT-5.5 (xhigh) above Claude Code with Opus 4.8 (max), while the newly released Claude Fable 5 (max) in Claude Code debuts at the top. DeepSWE, built by @datacurve, writes its tasks from scratch rather than adapting them from public GitHub issues or pull requests, so no model has seen the solutions during training. That matters because SWE-Bench Pro, the benchmark it replaces in our Coding Agent Index, had grown gameable, with some models recovering the fix from the repository's commit history instead of solving the task. The swap reorders the index: Codex with GPT-5.5 (xhigh) rises from 65 to 76, overtaking Claude Code with Opus 4.8 (max) at 73. Claude Code with Fable 5 (max), which enters directly on the refreshed index, leads at 77. SWE-Bench Pro had been flattering some combinations and penalizing others. More below.

English

0

21

1.1K

Ismail Elsherbini@elsherbin_·6d

@thegenioo @jumperz No clue, this doesn't hold whatsoever

English

1

10

Hamza@thegenioo·6d

@jumperz what’s that projection

English

0

2

232

JUMPERZ@jumperz·6d

Kimi K2.7-Code just dropped. On DeepSWE, K2.6 sits at 24%, the top open-weight model on the board among MiniMax, Qwen and GLM, when we have DeepSeek V4-Pro at 8%, which collapses on real long-horizon work. That's Kimi's actual bet, not price, though. If K2.7's +21.8% coding claim holds, that's ~29% on DeepSWE, enough to flip gemini-3.5-flash (28%). Let's see.

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced! 🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite. 🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. 🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates. ⚡️ 6x High-Speed Mode coming soon! 🔌 Available today via Kimi API and Kimi Code. 🔗 Kimi Code: kimi.com/code 🔗 API: platform.moonshot.ai

English

28

12

330

38.3K
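The ~29% figure in JUMPERZ's post can be reproduced with a short sketch, assuming the +21.8% gain is relative (multiplicative) and that a gain measured on Kimi Code Bench v2 carries over unchanged to DeepSWE. Neither assumption is confirmed by the posts, and the helper name `project_score` is hypothetical.

```python
def project_score(baseline_pct: float, relative_gain_pct: float) -> float:
    """Scale a baseline score by a relative gain expressed in percent."""
    return baseline_pct * (1 + relative_gain_pct / 100)


k26_deepswe = 24.0   # K2.6 on DeepSWE, as quoted in the post
claimed_gain = 21.8  # K2.7 gain on Kimi Code Bench v2, per the announcement
gemini_flash = 28.0  # gemini-3.5-flash on DeepSWE, as quoted in the post

projected = project_score(k26_deepswe, claimed_gain)

print(f"projected K2.7 DeepSWE score: {projected:.1f}%")
print("clears gemini-3.5-flash:", projected > gemini_flash)
```

If the gain were instead read as additive percentage points, the projection would be 45.8%, so the conclusion depends heavily on which reading is intended.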

Ismail Elsherbini@elsherbin_·6d

@teortaxesTex Composer on DeepSWE is worse than Kimi 2.6, but it's more token-efficient, so I will wait for the Kimi 2.7 DeepSWE results. I like Moonshot's work; I hope it's a good model release.

English

15

1.5K

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·6d

I want to see this compared with Composer 2.5. Like, really hard. Cursor has a ton of proprietary data, a large head start, and threw a Colossus at RLing a Kimi K2.5 checkpoint. What is the gap now?

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced! 🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite. 🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. 🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates. ⚡️ 6x High-Speed Mode coming soon! 🔌 Available today via Kimi API and Kimi Code. 🔗 Kimi Code: kimi.com/code 🔗 API: platform.moonshot.ai

English

19

11

493

43.1K

Ismail Elsherbini@elsherbin_·6d

@Da7_Tech @KimiDevs Glm 5.2 soon lol

English

126

Da7em@Da7_Tech·6d

@KimiDevs this or GLM-5.1?

English

0

653

Kimi Developers@KimiDevs·6d

Meet Kimi-K2.7-Code 👀 Here’s what developers should know to fully unlock K2.7-Code's potential:

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced! 🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite. 🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. 🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates. ⚡️ 6x High-Speed Mode coming soon! 🔌 Available today via Kimi API and Kimi Code. 🔗 Kimi Code: kimi.com/code 🔗 API: platform.moonshot.ai

English

78

174

3K

220.8K

Ismail Elsherbini retweeted

Claude@claudeai·Jun 9

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

English

5K

14.5K

105K

56.3M

Ismail Elsherbini@elsherbin_·Jun 9

Finally, a benchmark measuring how good the code is for a production environment

Cognition@cognition

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is first to measure: would you actually merge this code?

English