Labomen
@labomen001
498 posts
Knower
Joined June 2024
192 Following · 37 Followers
Mark Jaquith
Mark Jaquith@markjaquith·
@thdxr @opencode Can you speak to the Anthropic prompt removal in that PR, like for people who use OpenCode to access Anthropic models via Bedrock?
dax
dax@thdxr·
opencode 1.3.0 will no longer autoload the claude max plugin. we did our best to convince anthropic to support developer choice, but they sent lawyers. it's your right to access services however you wish, but it is also their right to block whoever they want. we can't maintain an official plugin, so it's been removed from github and marked deprecated on npm. appreciate our partners at openai, github and gitlab, who are going the other direction and supporting developer freedom
Labomen
Labomen@labomen001·
@banteg Meanwhile Opus 4.6 in Claude Code will do pentesting against live websites when asked directly
banteg
banteg@banteg·
got my first refusals in codex since december. the work was related to copy protection study. the session clearly started off on the wrong foot; it kept putting words in my mouth, even though i was just studying the algo. it hard-refused to annotate the code after a certain point as it considered it "sensitive". a bit disappointed with this, given it's a legitimate preservation effort of abandonware, and the publisher has been out of business for 16 years. but i can see how codex could consider this a grey area.
Labomen
Labomen@labomen001·
@scaling01 @OpenAIDevs No way they reposted your tweet without checking what else you said about 5.4 mini lol
Lisan al Gaib
Lisan al Gaib@scaling01·
GPT-5.4-mini looks really good for computer-use
[image attached]
Labomen
Labomen@labomen001·
It's honestly crazy: they have GPT 5.4 to help them develop Codex, but it's clearly not working. In Codex, when the model is doing something (literally just writing text or running lightweight commands), I immediately see CPU usage go to 40% of a single core on an M4 Pro, plus non-negligible GPU usage. For now I'm sticking to the TUI; I don't want to use such an unoptimized app even if my laptop can handle it.
Chris Allen
Chris Allen@theodorvaryag·
there shouldn't be render lag / a perceivable halt when I'm switching between threads in the native Codex GUI app on an M5 Max w/ maxed out GPU. accordingly, I'm shutting that bad boy down and going back to the TUI client.
FriesLover
FriesLover@FriesIlover49·
@Angaisb_ I did 2 small tests and it seems like that isn't the case? It's slower than standard 5.4 and produces worse results; frontend especially is just plain unusable, for example. I'd have to do more tests to draw definitive conclusions, but Kimi was better than this at coding, at the very least
Mistral AI for Developers
⚡️ Introducing Mistral Moderation 2, our next-generation moderation model. It introduces new categories and builds on the strengths of the previous version.
- Enhanced performance
- 128k context length (up from 8k)
- Free to use
[image attached]
Labomen
Labomen@labomen001·
@adonis_singh 3 Flash on the Gemini API is about 130 tok/s right now, on Vertex about 120 tok/s.
Labomen
Labomen@labomen001·
I'm checking the speeds rn on API, and I really hope they'll stay that way. Currently 5.4 mini is at ~180 tok/s (!!) on API, both normal and priority. Sub (on $20 plan) is 140 tok/s. This is faster than 3 Flash. 5.4 nano is reaching about 200 tok/s on API. For comparison, 5 mini is ~55 tok/s normal, ~115 tok/s priority.
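A minimal sketch of how tok/s figures like the ones above can be computed from a streaming response, assuming you record a timestamp and token count for each chunk yourself (the `tokens_per_second` helper and the event format are hypothetical, not part of any SDK):

```python
def tokens_per_second(events):
    """Decode throughput from a list of (timestamp_s, n_new_tokens) events.

    Measured from the first token onward, so time-to-first-token is
    excluded; the first event's tokens are excluded too, since no time
    has elapsed when they arrive.
    """
    events = [(t, n) for t, n in events if n > 0]
    if len(events) < 2:
        return 0.0
    elapsed = events[-1][0] - events[0][0]
    tokens_after_first = sum(n for _, n in events[1:])
    return tokens_after_first / elapsed if elapsed > 0 else 0.0

# Example: 1 token at t=0, then 2 more tokens over the next second
# -> 2 tokens / 1.0 s = 2.0 tok/s
print(tokens_per_second([(0.0, 1), (0.5, 1), (1.0, 1)]))
```

In practice you would fill `events` while iterating a streaming API response, counting tokens per chunk with the model's tokenizer; short runs are noisy, so averaging over a long completion gives more stable numbers.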
Labomen
Labomen@labomen001·
I assume 5.4 is stuck with this behaviour for quite some time because it's deployed much more widely? Another HUGE annoyance of 5.4 in ChatGPT, Codex, and the API is that it loves spamming bullet lists + repetition, like this (yes, I'm not exaggerating):

*blunt opener that reframes or challenges the question*
- caveat or prerequisite
- caveat or prerequisite
- caveat or prerequisite
- caveat or prerequisite

*disclaimer about what the assistant won't help with*

But if your real question is "reframed version of what they actually meant," then the answer is:
- short tactical point
- short tactical point
- short tactical point

*one-liner key insight that sounds profound*

If you want the highest-odds play, do this:
- Pick a thing that is:
  - quality
  - quality
  - quality
  - quality

Examples:
- example
- example
- example
- example

Then connect that to the goal:
- option
- option
- option
- option

The math matters:
- number = number
- that could mean:
  - scenario
  - scenario
  - scenario
  - or one scenario that sounds impressive

So if you do it the naive way, it won't work. If you do it the smart way, it becomes "hard but possible."

A practical breakdown would look like this:
- Phase 1:
  - action
  - action
  - action
- Phase 2:
  - action
  - action
  - action
- Phase 3:
  - action
  - action
  - action

What actually makes this work is usually some combination of:
- factor
- factor
- factor
- factor

So the real lesson is not "obvious wrong take." It's:
*punchy restatement broken across a line for dramatic effect.*

What not to do:
- don't X
- don't Y
- don't Z
- don't assume W

If you want, I can give you a specific version for one of these:
- option A
- option B
- option C
- option D
Michelle Pokrass
Michelle Pokrass@michpokrass·
we shipped a new version of 5.3 instant to chatgpt yesterday. 5.3 was unintentionally pretty annoyingly clickbait-y. it's better in yesterday's model and we're going to keep stamping that behavior out. keep the feedback coming! help.openai.com/en/articles/68…
Labomen
Labomen@labomen001·
Rechecking once in a while, still the same speeds for GPT 5.4. API is ~42-47 tok/s normal, ~53 tok/s with priority. Sub is just 30 tok/s; sub with fast mode is ~47 tok/s (normal API speed). That's on ChatGPT Plus. x.com/labomen001/sta…
Labomen@labomen001

idk if anyone will read this, but I tested OAI Responses API speed with GPT-5.4. Sub = the backend used by Codex (used my $20 sub), API = Tier 5 API key. In the end: priority on API right now is about 57t/s, normal API 47t/s, sub priority 50t/s, normal sub... only 35t/s.

Labomen
Labomen@labomen001·
@lefthanddraft @diavoli Pro accounts still have 4.5 in ChatGPT; that's the only place you can use it, since it's completely gone from the API
Wyatt Walls
Wyatt Walls@lefthanddraft·
@diavoli is this recent? how do you have access to 4.5?
Wyatt Walls
Wyatt Walls@lefthanddraft·
This thing is going to find a cure for cancer before it stops falling for dumb tricks.
[image attached]
Labomen
Labomen@labomen001·
@LottoLabs Why would you not want to give 2B sudo?
[image attached]
Labomen
Labomen@labomen001·
@kevinxu That's gpt-4o from the API, which has a very different personality from chatgpt-4o-latest (the model that was in ChatGPT).
Kevin Xu
Kevin Xu@kevinxu·
if you miss chatgpt 4o, apparently you can still talk to it in github copilot
[image attached]
Labomen
Labomen@labomen001·
GLM 5 Turbo isn't open weight and won't be; they mentioned that they'll fold its improvements into future models, but won't release the model itself as open weight. And even beyond that, I don't see much advantage in those huge models being open weight: they're about the same price across all providers, and almost no one is actually running them locally.
Labomen
Labomen@labomen001·
@WillUndrll @Zai_org Yes, but this isn't GLM 5; this is GLM 5 Turbo, which apparently isn't just a fast mode for GLM 5
Labomen
Labomen@labomen001·
@inerati Apparently those were also common in the past, for lots of generations, but they don't let you use both types at once anyway.
Labomen
Labomen@labomen001·
@cheatyyyy that pricing is insane though, 3 Flash is cheaper
cheaty
cheaty@cheatyyyy·
GLM 5 Turbo with 200TPS seems to be live on OpenRouter
[image attached]
Labomen
Labomen@labomen001·
@testingcatalog Crazy price though, more expensive than 3 Flash x.com/labomen001/sta…
Labomen@labomen001

@Zai_org It costs more than 3 Flash (which is a very capable model and is also very fast); honestly, at this point I think some Chinese model vendors have lost the plot. Gemini 3 Flash is $0.5/$3; GLM 5 Turbo is $1.2/$4 (!) on the official API, $0.96/$3.2 on OpenRouter with a discount.
