Hasan Can

4.6K posts

@HCSolakoglu

SWE & AI. News and insight posts in ENG & TUR. Exploring AI.

Proxima C B · Joined July 2020
2.5K Following · 1.4K Followers
Hasan Can reposted
Bojan Tunguz @tunguz
The end of Kaggle?
[image]
Hasan Can @HCSolakoglu
At this price-performance level, M2.7 makes everything else look overpriced, including GPT-5.4 mini and Gemini 3 Flash. If the benchmarks hold up in real-world use, people will change their preferences. No one wants to waste their money.
Artificial Analysis @ArtificialAnlys

MiniMax has released MiniMax-M2.7, delivering GLM-5-level intelligence for less than one third of the cost.

MiniMax-M2.7 from @MiniMax_AI scores 50 on the Artificial Analysis Intelligence Index, an 8-point improvement over MiniMax-M2.5, which was released one month ago. This is driven by stronger performance on real-world agentic tasks and reduced hallucinations. MiniMax-M2.7 is now ahead of MiMo-V2-Pro (Reasoning, 49) and Kimi K2.5 (Reasoning, 47), and equivalent to GLM-5 (Reasoning, 50) while using 20% fewer output tokens and costing less than a third as much to run. MiniMax-M2.7 is a reasoning-only model and maintains the same per-token pricing as MiniMax-M2.5.

Key takeaways:
➤ Strong performance on real-world agentic tasks: MiniMax-M2.7 achieves a GDPval-AA Elo of 1494, a significant improvement from MiniMax-M2.5 (1203) and ahead of MiMo-V2-Pro (Reasoning, 1426), GLM-5 (Reasoning, 1406), and Kimi K2.5 (Reasoning, 1283). It remains behind frontier models such as GPT-5.4 (xhigh, 1667) and Claude Opus 4.6 (Adaptive Reasoning, max effort, 1606).
➤ Reduced hallucinations: MiniMax-M2.7 scores +1 on the AA-Omniscience Index, up from MiniMax-M2.5 (-40). This is competitive with GPT-5.2 (xhigh, -1) and GLM-5 (Reasoning, +2), and well ahead of Kimi K2.5 (Reasoning, -8). The improvement from M2.5 is purely driven by reduced hallucinations, meaning the model is more likely to abstain from answering when it doesn't know the answer, rather than guessing. M2.7 achieves a hallucination rate of 34%, lower than Claude Sonnet 4.6 (Adaptive Reasoning, max effort, 46%) and Gemini 3.1 Pro Preview (50%).
➤ Gains across most evaluations compared to MiniMax-M2.5: Outside of the GDPval-AA and AA-Omniscience improvements noted above, MiniMax-M2.7 improves in HLE (+9 p.p.), TerminalBench Hard (+5 p.p.), SciCode (+4 p.p.), IFBench (+4 p.p.), GPQA (+3 p.p.), and LCR (+3 p.p.). We saw a notable regression in τ²-Bench (-11 p.p.).
➤ Increased token use: MiniMax-M2.7 used ~87M output tokens to run the Artificial Analysis Intelligence Index, up 55% from MiniMax-M2.5 (~56M). It remains more token-efficient than other models such as GLM-5 (Reasoning, ~110M) and Kimi K2.5 (Reasoning, ~89M).
➤ Leading cost efficiency: MiniMax-M2.7 cost $176 to run the Artificial Analysis Intelligence Index, maintaining the same $0.30/$1.20 per 1M input/output pricing as M2.5. This places it on the Pareto frontier of our Intelligence vs. Cost chart. For context, GLM-5 (Reasoning) cost $547 at equivalent intelligence, Kimi K2.5 (Reasoning) cost $371, and Gemini 3 Flash Preview (Reasoning) cost $278.

Key model details:
➤ Context window: 200K tokens (equivalent to MiniMax-M2.5).
➤ Pricing: $0.30/$1.20 per 1M input/output tokens (unchanged from MiniMax-M2.5).
➤ Availability: MiniMax first-party API only.
➤ Modality: Text input and output only (no multimodality).
➤ Licensing: MiniMax has not announced whether MiniMax-M2.7 will be open weights. MiniMax-M2.5 is available under the MIT license.
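The cost figures above follow directly from the stated per-token pricing. As a rough sanity check (the token counts other than the cited ~87M output figure are not in the post, so only the output-side arithmetic is reproduced here), the per-run cost works out as:

```python
def run_cost(input_tokens: int, output_tokens: int,
             input_price_per_m: float = 0.30,
             output_price_per_m: float = 1.20) -> float:
    """Total API cost in USD at the stated $0.30/$1.20 per 1M token pricing."""
    return (input_tokens / 1e6) * input_price_per_m \
         + (output_tokens / 1e6) * output_price_per_m

# The ~87M output tokens cited for the Intelligence Index alone cost ~$104;
# the remainder of the reported $176 total would be input-side tokens.
output_side = run_cost(0, 87_000_000)
print(f"${output_side:.2f}")
```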

Colaboratory @GoogleColab
🤖 Your favorite AI agents + Colab’s powerful cloud compute. 🤝 Colab's new Open Source MCP Server gives your agents programmatic access to control the entire notebook development lifecycle. Build, run, and visualize—securely in the cloud. Dive into the details and the GitHub repo below 👇
Google for Developers @googledevs

Bringing Google Colab’s secure environment to your AI agents. ☁️🛠️ With the new open-source Colab MCP Server, your agents can now natively write and execute code inside a Colab Notebook. Secure, automated, and ready to go.

Hasan Can reposted
Albert Gu @_albertgu
The newest model in the Mamba series is finally here 🐍 Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes. This is the first Mamba that was student led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!
[image]
Hasan Can reposted
OpenAI @OpenAI
GPT-5.4 mini is available today in ChatGPT, Codex, and the API. Optimized for coding, computer use, multimodal understanding, and subagents. And it’s 2x faster than GPT-5 mini. openai.com/index/introduc…
[image]
figure @figuret20
@matvelloso It’s great, but the price increase really sucks. It cuts out a lot of the use cases of 2.5 flash lite.
Mat Velloso @matvelloso
gemini-3.1-flash-lite-preview is extremely underrated. I know I keep saying that, but nothing beats the (price*latency)/intelligence you get here.
Hasan Can reposted
Kimi.ai @Kimi_Moonshot
Introducing Attention Residuals: rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
🔗 Full report: github.com/MoonshotAI/Att…
[image]
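The core idea, as described in the post, is replacing the fixed uniform residual sum with input-dependent attention over all preceding layers' hidden states. A minimal NumPy sketch of that mechanism (this is an illustration of the general idea, not Moonshot's implementation; the function name and the query/key projections are made up for the example):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_residual(layer_outputs, w_query, w_key):
    """Aggregate preceding layers' hidden states with learned,
    input-dependent attention instead of a fixed uniform residual sum.

    layer_outputs: list of (d,) hidden states from layers 0..L (current last)
    w_query, w_key: (d, d) projection matrices (stand-ins for learned params)
    """
    h = np.stack(layer_outputs)            # (L+1, d) depth-wise "sequence"
    q = h[-1] @ w_query                    # query comes from the current layer
    k = h @ w_key                          # keys come from every layer so far
    scores = k @ q / np.sqrt(h.shape[-1])  # scaled dot-product over depth
    weights = softmax(scores)              # input-dependent mixing weights
    return weights @ h                     # weighted depth-wise aggregation

rng = np.random.default_rng(0)
d = 8
hs = [rng.standard_normal(d) for _ in range(4)]
wq, wk = rng.standard_normal((d, d)), rng.standard_normal((d, d))
out = attention_residual(hs, wq, wk)
print(out.shape)
```

A plain residual stream corresponds to fixing `weights` at uniform 1/(L+1); here the mixture is recomputed per input, which is what lets the network selectively retrieve earlier representations.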
Chayenne Zhao @GenAI_is_real
linus was the original vibe coder before it was cool. dude just posts an angry email on the mailing list describing what he wants and thousands of engineers worldwide implement it for free. zero tokens consumed, zero API costs, infinite context window (30+ years of kernel knowledge). openai and anthropic are basically trying to replicate what linus has been doing with human contributors since 1991, except linus's agents don't hallucinate and they work for free @sahill_og
Sahil @sahill_og

Linus Torvalds created Linux at 21 without Claude or any other AI. - He didn't have a co-founder. - No VC funding. No office. - No team. - Just a personal project he posted to a mailing list: "I'm doing a free OS." 33 years later, it runs 97% of the world's servers, all smartphones, and the International Space Station. The most important software in history started as someone's side project. Absolute legend.

Hasan Can @HCSolakoglu
@Josh9817 And Google still rents TPUs to Anthropic despite this.
Josh @Josh9817
Gemini-3.1-Pro-Preview was released 23 days ago. As of today, its uptime has further degraded into the sub-80s. That's right: not even double 9s, or even a single 9, anymore. The Google Vertex API endpoint for Gemini-3.1-Pro-Preview is now in the 80% range for uptime.
[image]
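To put "not even a single 9" in perspective, the gap between conventional availability targets and an uptime in the 80s is enormous when converted to downtime per month. A quick back-of-the-envelope calculation (30-day month assumed):

```python
def monthly_downtime_hours(uptime_pct: float, hours: float = 30 * 24) -> float:
    """Expected downtime over a 30-day month at a given uptime percentage."""
    return hours * (1 - uptime_pct / 100)

for pct in (99.99, 99.9, 99.0, 80.0):
    print(f"{pct:6.2f}% uptime -> {monthly_downtime_hours(pct):7.2f} h/month down")
```

Double 9s (99.99%) means minutes of downtime per month; 80% means roughly 144 hours, i.e. about six full days.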
Hasan Can @HCSolakoglu
I’m already quite close to my limit with or without caffeine. I can’t get below a 130 ms average with a 360 Hz monitor and a Razer DAV3 Pro. Maybe with a 1000 Hz monitor I could see a 120 ms average.
Hasan Can @HCSolakoglu
@bardozVAL I just tried it and realized I’m more consistent with this. Thanks for your contribution.
Hasan Can @HCSolakoglu
The fastest possible human reaction time sits between 100 and 120 milliseconds. This limit exists because of pure physiological hardware constraints. It takes approximately 13 to 70 milliseconds for a visual signal just to reach the brain. From there, your brain needs time to consciously acknowledge the input (around 75 to 150 ms) and send a signal down your spinal cord to actuate a muscle (another ~20 ms). Even with perfect anticipation, dropping below 100 milliseconds is physically impossible for a conscious, unpredicted response.

I tried a number of things to optimize my reaction time. The ones that worked best for me were good sleep (8 hours uninterrupted in a well-ventilated room), weight training, testing when I’m neither too hungry nor too full, and doing it in a 22–23°C room with a high-refresh-rate monitor and a low-latency mouse. These helped me get close to my limits. I could probably push it a bit lower with caffeine, but I don’t use it because it makes me sick.
[image]
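The physiological floor quoted in the post is just the sum of the stage latencies it lists. A back-of-the-envelope check (the stage breakdown is a simplification, and the bounds are the ranges quoted above, not independent measurements):

```python
# Latency stages for a conscious visual reaction, (min_ms, max_ms) per the post.
STAGES_MS = {
    "retina -> brain (visual signal)": (13, 70),
    "conscious processing":            (75, 150),
    "brain -> muscle (motor signal)":  (20, 20),
}

lo = sum(a for a, _ in STAGES_MS.values())  # best case: every stage at its minimum
hi = sum(b for _, b in STAGES_MS.values())  # worst case: every stage at its maximum
print(f"theoretical range: {lo}-{hi} ms")
```

Even with every stage at its minimum, the chain sums to just over 100 ms, which is why sub-100 ms conscious reactions are treated as physically impossible.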
Hasan Can @HCSolakoglu
@ValorantBits I try to stay away from drugs. I’ve never tried them, so I can’t really comment, but caffeine did help.
Hasan Can @HCSolakoglu
With a total of 50 hours of Aimlabs grinding, I made it into the top 0.724%. Task-specific sensitivity is off; I use a single sens. If I enabled task-specific sens, I could get much better results, but it would kill the transfer of that aim to CS2 and other games.
[image]
Hasan Can @HCSolakoglu
@mark_k @GoogleDeepMind I think 3.1 Flash will be released very soon. I’ve started running into A/B tests for it. Similarly, I’ve encountered a lot of A/B tests from OpenAI as well, so it looks like they might release a new model soon too.
Mark Kretschmann @mark_k
A bit strange that we didn't get a Gemini release this week, after it was being hinted at. Maybe it was planned but then @GoogleDeepMind had to postpone it at the last minute? 🤷‍♂️ Would have been nice to get Gemini 3.1 Flash, but maybe next week! 🤞
Hasan Can @HCSolakoglu
@FACEIT_Darwin I’m living for the day Cache comes back and Mirage gets removed.
Hasan Can @HCSolakoglu
@MurkFPSHub Even CoD BO7 has more stable 1% lows than CS2. /sad
MurkTweaks @MurkFPSHub
Performance largely seems around the same after the recent update. Use my config files and you should be set!
[image]