Labomen
@labomen001
498 posts
Knower
Joined June 2024
192 Following · 37 Followers
Mark Jaquith
Mark Jaquith@markjaquith·
@thdxr @opencode Can you speak to the Anthropic prompt removal in that PR, like for people who use OpenCode to access Anthropic models via Bedrock?
dax
dax@thdxr·
opencode 1.3.0 will no longer autoload the claude max plugin. we did our best to convince anthropic to support developer choice, but they sent lawyers. it's your right to access services however you wish, but it is also their right to block whoever they want. we can't maintain an official plugin, so it's been removed from github and marked deprecated on npm. appreciate our partners at openai, github and gitlab, who are going the other direction and supporting developer freedom
Labomen
Labomen@labomen001·
@banteg Meanwhile Opus 4.6 in Claude Code will do pentesting against live websites when asked directly
banteg
banteg@banteg·
got my first refusals in codex since december. the work was related to copy protection study. the session clearly started off on the wrong foot; it kept putting words in my mouth, even though i was just studying the algo. it hard-refused to annotate the code after a certain point as it considered it "sensitive". a bit disappointed with this, given it's a legitimate preservation effort of abandonware, and the publisher has been out of business for 16 years. but i can see how codex could consider this a grey area.
Labomen
Labomen@labomen001·
@scaling01 @OpenAIDevs No way they reposted your tweet without checking what else you said about 5.4 mini lol
Lisan al Gaib
Lisan al Gaib@scaling01·
GPT-5.4-mini looks really good for computer-use
[image attached]
Labomen
Labomen@labomen001·
It's honestly crazy: they have GPT 5.4 to help them develop Codex, but it's clearly not working. In Codex, when the model is doing something (literally just writing text or running lightweight commands), I immediately see CPU usage go to 40% of a single core on an M4 Pro, plus non-negligible GPU usage. For now I'm sticking to the TUI; I don't want to use such an unoptimized app even if my laptop can handle it.
Chris Allen
Chris Allen@theodorvaryag·
there shouldn't be render lag / a perceivable halt when I'm switching between threads in the native Codex GUI app on an M5 Max w/ maxed out GPU. accordingly, I'm shutting that bad boy down and going back to the TUI client.
FriesLover
FriesLover@FriesIlover49·
@Angaisb_ I did 2 small tests and it seems like that isn't the case? It's slower than standard 5.4 and produces worse results; frontend especially is just plain unusable, for example. I'd have to do more tests to draw definitive conclusions, but Kimi was better than this at coding, at the very least
Mistral AI for Developers
⚡️ Introducing Mistral Moderation 2, our next-generation moderation model. It introduces new categories and builds on the strengths of the previous version.
- Enhanced performance
- 128k context length (up from 8k)
- Free to use
[image attached]
Labomen
Labomen@labomen001·
@adonis_singh 3 Flash on the Gemini API is about 130 tok/s right now, on Vertex about 120 tok/s.
Labomen
Labomen@labomen001·
I'm checking the speeds rn on API, and I really hope they'll stay that way. Currently 5.4 mini is at ~180 tok/s (!!) on API, both normal and priority. Sub (on $20 plan) is 140 tok/s. This is faster than 3 Flash. 5.4 nano is reaching about 200 tok/s on API. For comparison, 5 mini is ~55 tok/s normal, ~115 tok/s priority.
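A minimal sketch of how tok/s figures like the ones above can be computed from a streaming response, assuming you record a timestamp and token count for each chunk yourself (the `tokens_per_second` helper and the event format are hypothetical, not part of any SDK):

```python
def tokens_per_second(events):
    """Decode throughput from a list of (timestamp_s, n_new_tokens) events.

    Measured from the first token onward, so time-to-first-token is
    excluded; the first event's tokens are excluded too, since no time
    has elapsed when they arrive.
    """
    events = [(t, n) for t, n in events if n > 0]
    if len(events) < 2:
        return 0.0
    elapsed = events[-1][0] - events[0][0]
    tokens_after_first = sum(n for _, n in events[1:])
    return tokens_after_first / elapsed if elapsed > 0 else 0.0

# Example: 1 token at t=0, then 2 more tokens over the next second
# -> 2 tokens / 1.0 s = 2.0 tok/s
print(tokens_per_second([(0.0, 1), (0.5, 1), (1.0, 1)]))
```

In practice you would fill `events` while iterating a streaming API response, counting tokens per chunk with the model's tokenizer; short runs are noisy, so averaging over a long completion gives more stable numbers.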
Labomen
Labomen@labomen001·
I assume 5.4 is stuck with this behaviour for quite some time because it's deployed much more widely? Another HUGE annoyance of 5.4 in ChatGPT, Codex, and the API is that it loves spamming bullet lists + repetition, like this (yes, I'm not exaggerating):

*blunt opener that reframes or challenges the question*
- caveat or prerequisite
- caveat or prerequisite
- caveat or prerequisite
- caveat or prerequisite

*disclaimer about what the assistant won't help with*

But if your real question is "reframed version of what they actually meant," then the answer is:
- short tactical point
- short tactical point
- short tactical point

*one-liner key insight that sounds profound*

If you want the highest-odds play, do this:
- Pick a thing that is:
  - quality
  - quality
  - quality
  - quality

Examples:
- example
- example
- example
- example

Then connect that to the goal:
- option
- option
- option
- option

The math matters:
- number = number
- that could mean:
  - scenario
  - scenario
  - scenario
  - or one scenario that sounds impressive

So if you do it the naive way, it won't work. If you do it the smart way, it becomes "hard but possible."

A practical breakdown would look like this:
- Phase 1:
  - action
  - action
  - action
- Phase 2:
  - action
  - action
  - action
- Phase 3:
  - action
  - action
  - action

What actually makes this work is usually some combination of:
- factor
- factor
- factor
- factor

So the real lesson is not "obvious wrong take." It's:
*punchy restatement broken across a line for dramatic effect.*

What not to do:
- don't X
- don't Y
- don't Z
- don't assume W

If you want, I can give you a specific version for one of these:
- option A
- option B
- option C
- option D
Michelle Pokrass
Michelle Pokrass@michpokrass·
we shipped a new version of 5.3 instant to chatgpt yesterday. 5.3 was unintentionally pretty annoyingly clickbait-y. it's better in yesterday's model and we're going to keep stamping that behavior out. keep the feedback coming! help.openai.com/en/articles/68…
Labomen
Labomen@labomen001·
Rechecking once in a while, still the same speeds for GPT 5.4. API is ~42-47 tok/s normal, ~53 tok/s with priority. Sub is just 30 tok/s; sub with fast mode is ~47 tok/s (normal API speed). That's on ChatGPT Plus. x.com/labomen001/sta…
Labomen@labomen001

idk if anyone will read this, but I tested OAI Responses API speed with GPT-5.4. Sub = the backend used by Codex (used my $20 sub), API = Tier 5 API key. In the end: priority on API right now is about 57t/s, normal API 47t/s, sub priority 50t/s, normal sub... only 35t/s.

Labomen
Labomen@labomen001·
@lefthanddraft @diavoli Pro accounts still have 4.5 in ChatGPT; that's the only place you can use it, since it's completely gone from the API
Wyatt Walls
Wyatt Walls@lefthanddraft·
@diavoli is this recent? how do you have access to 4.5?
Wyatt Walls
Wyatt Walls@lefthanddraft·
This thing is going to find a cure for cancer before it stops falling for dumb tricks.
[image attached]
Labomen
Labomen@labomen001·
@LottoLabs Why would you not want to give 2B sudo?
[image attached]
Labomen
Labomen@labomen001·
@kevinxu That's gpt-4o from the API, which has a very different personality from chatgpt-4o-latest (the model that was in ChatGPT).
Kevin Xu
Kevin Xu@kevinxu·
if you miss chatgpt 4o, apparently you can still talk to it in github copilot
[image attached]
Labomen
Labomen@labomen001·
GLM 5 Turbo isn't open weight and won't be; they mentioned that they'll fold its improvements into future models, but won't release the model itself as open weight. And even beyond that, I don't see much advantage in those huge models being open weight: they're about the same price across all providers, and almost no one is actually running them locally.
Labomen
Labomen@labomen001·
@WillUndrll @Zai_org Yes, but this isn't GLM 5; this is GLM 5 Turbo, which apparently isn't just a fast mode for GLM 5
Labomen
Labomen@labomen001·
@inerati Apparently those were also common in the past, for lots of generations, but they don't let you use both types at once anyway.
Labomen
Labomen@labomen001·
@cheatyyyy that pricing is insane though, 3 Flash is cheaper
cheaty
cheaty@cheatyyyy·
GLM 5 Turbo with 200TPS seems to be live on OpenRouter
[image attached]
Labomen
Labomen@labomen001·
@testingcatalog Crazy price though, more expensive than 3 Flash x.com/labomen001/sta…
Labomen@labomen001

@Zai_org It costs more than 3 Flash (which is a very capable model and is also very fast); honestly, at this point I think some Chinese model vendors have lost the plot. Gemini 3 Flash is $0.5/$3; GLM 5 Turbo is $1.2/$4 (!) on the official API, $0.96/$3.2 on OpenRouter with a discount.
