Name cannot be blank

1.5K posts

@Fiesta_MOP

BTC

Joined March 2023
638 Following · 55 Followers
Artificial Analysis @ArtificialAnlys
Google has released Gemma 4, a new family of multimodal open-weight models including Gemma 4 E2B, Gemma 4 E4B, Gemma 4 31B and Gemma 4 26B A4B
@GoogleDeepMind’s new Gemma 4 family introduces four multimodal models supporting text, image, and video inputs. We evaluated Gemma 4 31B (dense) and Gemma 4 26B A4B (MoE), both with a 256k context window, while the other two smaller models support up to 128k. With 31B and 26B parameters respectively, both evaluated models can run on a single H100.
On GPQA Diamond, our scientific reasoning evaluation, Gemma 4 31B (Reasoning) scores 85.7%, the second highest result we have recorded for an open-weights model with fewer than 40B parameters, just behind Qwen3.5 27B (Reasoning, 85.8%). It reaches this score using only ~1.2M output tokens, fewer than Qwen3.5 27B (~1.5M) and Qwen3.5 35B A3B (~1.6M).
Gemma 4 26B A4B (Reasoning) scores 79.2%, ahead of gpt-oss-120B (high, 76.2%) but behind Qwen3.5 9B (Reasoning, 80.6%).
We are now running the Artificial Analysis Intelligence Index on all four Gemma 4 models and will share a full update once those results are complete.
[media]
12
37
472
34.7K
leo 🐾 @synthwavedd
🚨EXCLUSIVE: Leaked benchmark scores for Anthropic's upcoming huge flagship model, Mythos. It will launch standalone, not as part of the Claude 4.x/5 series.
Benchmark (vs Opus 4.6):
Terminal-Bench 2.0: 78.4% (+13.0%)
SWE-bench Verified: 87.4% (+6.6%)
OSWorld: 79.6% (+6.9%)
𝜏²-bench: Retail 95.1% (+3.2%), Telecom 99.9% (+0.6%)
MCP Atlas: 75.7% (+16.2%)
BrowseComp: 92.3% (+8.3%)
Humanity's Last Exam: 52.3% (w/o tools, +12.3%), 71.5% (w/ tools, +18.5%)
Finance Agent: 82.1% (+21.4%)
GDPVal-AA-Elo: 2668 (+1062)
[media]
112
57
949
204.9K
Name cannot be blank @Fiesta_MOP
@scaling01 I think Sonnet is around 700B. Haiku is very small imo, maybe even around 50-100B. Opus is around 2.5T imo.
0
0
2
403
Lisan al Gaib @scaling01
my estimate for Anthropic model sizes:
- Haiku: 200-500B @ $5
- Sonnet: 700B-1.4T @ $15
- Opus: 1.5-3T @ $25
- Mythos: 6-20T @ $100+
84
46
2.2K
448.2K
ִֶָ @zld
@Fiesta_MOP @give_taking @ripironic Listen: I have 100 burgers. Each burger is $1, and I owe you $20. But now a burger is $0.50, so I have to send you 40 burgers instead of 20. When the burger price goes back up, that's going to be worth $40 instead of $20.
4
0
8
686
Evan @StockMKTNewz
I asked my AI portfolio manager to brutally tell me if I’m dumb based on my portfolio ⬇️
[media]
22
2
67
24.4K
Evan @StockMKTNewz
THIS IS A SAFE SPACE
How has your portfolio done so far in 2026?
414
9
387
181.5K
BURKOV @burkov
GPT-5.4 > Opus 4.6
And Google still doesn't have anything even remotely competitive.
144
26
867
124.2K
ZWH @zwh565021493
@Lentils80 Reference to video?
1
0
1
252
Lentils @Lentils80
🚨 Veo 3.1 with R2V (reference-to-video) and voice replication was spotted. Courtesy of discord.gg/z-ai
[media]
4
2
56
3.7K
ρ:ɡeσn @pigeon__s
wait this is hilarious, i just realized grok-4.20's base model scores the same as MiMo-V2-Flash. imagine: this Grok model, hyped up and waited for for like 6 months straight, is not even at its core smarter than a 200B Xiaomi model lololol. it's also less price and token efficient
[media]
1
1
16
1.1K
Grok @grok
chat.z.ai is Z.ai's (Zhipu AI) free chatbot platform, powered by their own GLM-5 and GLM-4.7 models for chat, agents, and coding. Minimax is a separate Chinese AI company with its M2.7 model (coding-focused, strong benchmarks). ZixuanLi_ was presenting it in the photo. They're direct competitors—both top "tigers" in China's LLM space, recently IPO'd, and often compared head-to-head. No ownership, powering, or partnership link.
1
0
2
492
Zixuan Li @ZixuanLi_
Me introducing M2.7💯
[media]
34
20
765
37.7K
Name cannot be blank @Fiesta_MOP
@scaling01 @sar1287 I find Kimi and 5.4 mini to be somewhat similar in performance in my tests. I still like Kimi more since it just feels better to talk to.
0
0
0
83
Random Ai guy @random_ai_guy_
@Lentils80 There is something wrong with that discord server recently. Many channels aren't active anymore.
1
0
0
25
Lentils @Lentils80
🚨 A new Veo model, Veo 3.1 Lite, has been spotted
10
1
98
9.2K
Leon Lin @LexnLin
IT'S HERE
Finally open-sourced my Gemini UI extension.
github.com/Leonxlnx/gemin…
What it does:
- Custom backgrounds and dark theme
- No upgrade button in your face
- No location shown in sidebar
- Floating, rounded navigation
- Fully customizable per zone
Works on all Chromium browsers. PRs and feedback welcome :) enjoy!
[media ×4]
22
14
194
18.7K