Naeem

404 posts

@identity_matrix

Love computers and research

Joined April 2026
229 Following, 4 Followers
Loren Lugosch
Loren Lugosch@lorenlugosch·
Whoa: first author is a one-namer
[Image attached]
Naeem
Naeem@identity_matrix·
The lead OpenAI has is disgustingly good. Like, I don't know how they made their models such a beauty to work with. Plus, Codex is the best harness, period. Never been this bullish on OpenAI.
Naeem
Naeem@identity_matrix·
Google can stop the horrible PR by releasing 3.5 Pro ASAP
uj
uj@umeshjj7·
Canada has e-transfer 🇨🇦. America has Cash App, Venmo, Zelle, tipping screens everywhere 🇺🇸. What does your country have?
Naeem
Naeem@identity_matrix·
@zxcodes Unless it's a multimodality or math benchmark
Naeem
Naeem@identity_matrix·
@basedjensen Probably otw to interview wenfeng
Giordano Randone
Giordano Randone@giordanorandone·
Do you prefer Composer 2.5 now, or do you still reach for Opus for UI-heavy work?
Naeem
Naeem@identity_matrix·
Minimax M3 and Qwen 3.7 Plus this week pleaseeee
Naeem
Naeem@identity_matrix·
@adonis_singh Next version of Claude, 100% on DeepSWE 👀
Justus Mattern
Justus Mattern@MatternJustus·
Composer 2.5 outperforms all open source models and clearly beats its base model Kimi 2.5 as well as Kimi 2.6. It is roughly on par with, and slightly ahead of, Gemini 3.1 Pro. We still see a large gap between models from Anthropic / OpenAI and other labs.
Proximal@ProximalHQ

Composer 2.5 is ranked #5 on FrontierSWE. The model is broadly on par with Gemini 3.1 Pro, with a slight edge in our evaluation, and it beats all open source models. We still observe a significant performance gap between Composer and models from Anthropic and OpenAI.

Naeem
Naeem@identity_matrix·
I hate bot-like speaking humans on this platform, insufferable.
Naeem
Naeem@identity_matrix·
Wtf is growth hacking?
Serena Ge (Datacurve)
Serena Ge (Datacurve)@serenaa_ge·
Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks. On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.
[Image attached]
Naeem
Naeem@identity_matrix·
The StepFun AI chat interface is going away :(
[Image attached]
Naeem
Naeem@identity_matrix·
@sri9s Backwards deployed orchestrator (BDO)?
SrinathJ
SrinathJ@sri9s·
They say AI will create entirely new jobs we can’t even imagine yet. So, name one.
Sarthak
Sarthak@Sarthak4Alpha·
The f*ck does this even mean? 😭
[Image attached]
Naeem
Naeem@identity_matrix·
@opencode Can you guys fix the pricing for the DeepSeek and MiMo models, given the recent pricing cuts?