Lex Savage

40 posts

@lexsavege

Joined December 2023
58 Following · 0 Followers
Lex Savage
Lex Savage@lexsavege·
@Dimdv99 @teortaxesTex All features are now included in the $30 plan. Unless these features are made available to free or premium members, they’re doomed to fail. No one is going to go out and pay $30. I used to really like Grok.
English
0
0
0
3
Dimdv
Dimdv@Dimdv99·
@teortaxesTex With the 1T model, Grok Build, and Imagine 2.0, it will change. Will be released in early May.
English
1
0
0
292
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
It's actually insane how bad DeepSeek does on AA-Omniscience. It actually knows a hell of a lot, but they absolutely have failed or neglected to train for this. This is 12.5% of total score. –10 points for Pro. If not for that one thing, it'd be ranked the best OS model.
X Freeze@XFreeze

Grok 4.3 is absolutely ranking at the top in the latest Artificial Analysis benchmarks
- #1 on IFBench (81%)
- #1 on τ²-Bench Telecom (98%)
- 74% Non-Hallucination Rate (Top-tier factual accuracy)
- 196 Output Tokens per Second (Blazing fast)
Not only did it take the #1 spot for IFBench and Telecom, but it’s doing this while pushing nearly 200 output tokens per second with a massive non-hallucination rate
The competition is struggling to keep up with this balance of speed and sheer factual accuracy
xAI is dominating the charts

English
9
3
100
9.4K
Lex Savage
Lex Savage@lexsavege·
@scaling01 Honestly, I don’t think that’s true. I use DeepSeek a lot, and it’s a pretty good model. People who look at benchmarks like that are just fooling themselves. Anyway, keep spending more money on OpenAI and Anthropic models :D.
English
0
0
4
58
Eric Jiang
Eric Jiang@veggie_eric·
When training Grok 4.3, we spoke directly with devs and businesses to understand what they actually needed: a model that’s fast, affordable, and great at tool calling. The result is a daily driver that doesn't just look good on random benchmarks, but is actually useful in the real world.
💰 $1.25 in / $2.50 out
⚡️ 100 tokens / second
📖 1 million context window
Try it through Hermes Agent or direct through the xAI API!
Eric Jiang tweet media
English
359
860
3.5K
624.5K
Lex Savage
Lex Savage@lexsavege·
@bruce_x_offi @ArtificialAnlys People are finally starting to get it :). It's not all about benchmarks anymore. I've used DeepSeek quite a bit and it's way better than the Grok 4.20 reasoning model. Real-world usage is what matters. DeepSeek is really good, has great limits and is free. Grok 4.3 ($30)
English
0
0
3
197
bruce
bruce@bruce_x_offi·
Grok 4.3 (left) vs DeepSeek v4 pro (right). Not good. I shouldn't have added more money to the API. Based on the @ArtificialAnlys benchmark, I thought it would be better than DeepSeek, but that is not the case.
bruce tweet media (2 images)
English
3
2
17
8K
Lex Savage
Lex Savage@lexsavege·
@techdevnotes I still can't use it on the Grok interface. They're still asking for $30 😓
English
0
0
0
38
Tech Dev Notes
Tech Dev Notes@techdevnotes·
Grok 4.3 has 1M context window with pricing of $1.25 input per million tokens
Tech Dev Notes tweet media
English
17
6
248
9K
tetsuo
tetsuo@tetsuoai·
grok 4.3 live in the xai api
tetsuo tweet media
English
20
15
149
11.1K
Lex Savage
Lex Savage@lexsavege·
@victor207755822 I hope you’ll give it to everyone right away. We’ve been waiting for a long time.
GIF
English
0
0
3
1.3K
Lex Savage
Lex Savage@lexsavege·
@teortaxesTex I think the DeepSeek v4 model is really good. I use it every day. We shouldn't really pay attention to such benchmark tests anymore. It all comes down to daily usage now. It's excellent for both search and chat. Plus it's unlimited and free.
English
1
0
2
234
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
DeepSeek makes the most hallucinatory smart models, also horrible at pushing back on the user's bullshit. I think the answer is that they are laser focused on best-case capability. "How smart can it be *with fully specified correct premises, in the best straitjacket scaffold*?"
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) tweet media
Leon@ericssunLeon

rock bottom for DSV4. it's clear they don't train for this (but unclear why); DS models have never (?) performed here. It feels like a data engineering and post-training gap more than anything. Hopefully we'll see big lifts in future iterations on top of this base

English
6
0
151
11.7K
Lex Savage
Lex Savage@lexsavege·
@ericssunLeon I think the DeepSeek v4 model is really good. I use it every day. We shouldn't really pay attention to such benchmark tests anymore. It all comes down to daily usage now. It's excellent for both search and chat. Plus it's unlimited and free.
English
0
0
0
229
Leon
Leon@ericssunLeon·
rock bottom for DSV4. it's clear they don't train for this (but unclear why); DS models have never (?) performed here. It feels like a data engineering and post-training gap more than anything. Hopefully we'll see big lifts in future iterations on top of this base
Leon tweet media
English
1
1
23
11.9K
Lex Savage
Lex Savage@lexsavege·
@teortaxesTex The Kimi AI K2.6 model is great, but I can’t use it for free. Except for DeepSeek, they’re all slow and expensive.
English
0
0
0
49
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
interesting claim from Moonshot. V4 has more pretraining tokens (much more text-only), more active params, and yet does not dethrone K2.6, indeed it is *3 points behind* on SWE-Pro (according to V4's own paper). Does Kimi really just have better data (at least in some domains)?
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) tweet media (2 images)
Shuyao Tim Xu@TimXu222575

@teortaxesTex not the amount of (pre/post-training) data, but the quality (how many engineers cleaning data)

English
19
2
127
13.4K
Lex Savage
Lex Savage@lexsavege·
@AlvyAumi @deepseek_ai This isn’t unique to DeepSeek. Most models don’t know which version they’re running. I asked DeepSeek in detail, and it said something similar. Since this information isn’t available in the system command line, it can’t tell you. Version 4 has been made available to everyone.
English
1
0
3
426
Shaer Alvy Aumi
Shaer Alvy Aumi@AlvyAumi·
@deepseek_ai I'm a little confused here: am I using v3 or v4? DeepSeek itself says it's v3.
Shaer Alvy Aumi tweet media
English
5
1
3
7.9K
DeepSeek
DeepSeek@deepseek_ai·
🔥DeepSeek Input Cache Price Drop! Effective immediately, the price for input cache hits across the ENTIRE DeepSeek API series is reduced to just 1/10th of the original price! Build more efficiently for less. 📌Reminder: The DeepSeek-V4-Pro 75% OFF promotion is still active until May 5th, 2026, 15:59 (UTC Time).
DeepSeek tweet media
English
395
739
7.8K
1.3M
Lisan al Gaib
Lisan al Gaib@scaling01·
I don't like what I'm seeing. Just gonna go to bed and pretend I didn't see it. Waiting for all the good evals.
Arena.ai@arena

Exciting news - DeepSeek V4 Pro is in the Arena with 1.6T parameters (49B activated) alongside V4 Flash at 284B parameters (13B activated). Both support 1M token context. It’s a major leap over DeepSeek V3.2!
Code Arena:
- DeepSeek V4 Pro (thinking): #3 open model (#14 overall), on par with GPT-5.4-high and Gemini-3.1-Pro in agentic webdev tasks
Text Arena:
- DeepSeek V4 Pro (thinking): #2 open model (#14 overall), matching Kimi-2.6
- DeepSeek V4 Flash (thinking): #10 open model (#47 overall)
Competition at the top of the open model leaderboards keeps heating up. Huge congrats to @DeepSeek_AI on the strong comeback!

English
14
4
363
93.5K
Matthew Dabit
Matthew Dabit@MattDabit·
SuperGrok and X Premium+ subscribers: Grok 4.3 beta is live for you now!!! The performance will shock you. We aren't done yet and will continue to keep improving. We will iterate faster than our rivals and we will be the frontier lab. Enjoy the computer again. @xai @grok
English
139
96
2K
62.9K
Lex Savage
Lex Savage@lexsavege·
@theo I use it all the time and it’s really great. They’re secretly testing the model. I don’t use it much for coding but its conversational style and web search capabilities are excellent. It can even find and read the Twitter post I linked to it.
English
0
0
0
99
Theo - t3.gg
Theo - t3.gg@theo·
Anyone check on Deepseek recently?
English
157
5
1K
116.8K
Lex Savage
Lex Savage@lexsavege·
@axttimm @chetaslua When you ask the current model directly, it doesn't actually state its version number. It has said V3 before, but it seems like it's just guessing. It has told some people it's V4. This model is most likely a variant of V3.5 or V4.
English
1
0
1
60
Lex Savage
Lex Savage@lexsavege·
@axttimm @chetaslua I don't think this model is DeepSeek V3. DeepSeek V3.2 was released on December 1st and it is actually quite bad right now. It is available on the Chatbot Arena website, so you can try it out. The model currently on their site is better in most aspects than both the V3 and V3.2 versions.
English
1
0
0
58
Chetaslua
Chetaslua@chetaslua·
DeepSeek Expert Update is really good 😲 You can see DeepSeek's answer is correct and Opus 4.6 is wrong.
Prompt: How many different numbers with a non-zero prefix can be formed by arranging the five digits 2, 0, 20, 202, and 2020 in a row without using any tool and codes
Chetaslua tweet media (3 images)
English
21
17
396
35.4K