UhuBuhu

482 posts

@kecksbe

Joined April 2022

49 Following, 15 Followers

UhuBuhu@kecksbe·1d

@Jobox05 @flash_canadian There is no big news. OpenAI is stopping Sora to free up compute so it doesn't fall behind on the LLM side, since Google and Claude have caught up. The only real news is that other labs are making faster progress on AI than OpenAI, and people think this is a sign of the AI downfall.

Jobox05@Jobox05·2d

@flash_canadian can you point me to the news?

CanadianFlash@flash_canadian·2d

Watching the downfall of AI is so satisfying

UhuBuhu@kecksbe·6 Mar

@nicdunz Didn't they announce that an update was coming? Or weren't there leaks about a voice mode 1.5 or something like that?

nic@nicdunz·6 Mar

chatgpt voice is so good now. what did they do?? why haven't they said anything about it??

UhuBuhu@kecksbe·6 Mar

@alexgrama @kimmonismus It depends. In my opinion it shows that LLMs still lack a really basic part of reasoning, but I don't know how relevant that part really is. We'll see in the coming years.

gramanoid@alexgrama·6 Mar

@kimmonismus is this really a relevant test? it's absolutely fantastic at what matters. who the fuck cares about this stupid ass question? same with the strawberry test and all the other retarded ones before that.

Chubby♨️@kimmonismus·6 Mar

ChatGPT 5.4 still doesn't get it.

UhuBuhu@kecksbe·4 Mar

@thsottiaux I still don't get why you name the models Codex, the CLI Codex, and the web thing Codex. Name the model GPT 5.3 Code or Coding.

Tibo@thsottiaux·3 Mar

Our naming team has been cooking

UhuBuhu@kecksbe·3 Mar

@emollick For me, Codex was a bigger leap than o3 because Sonnet was already pretty close, at least in coding.

Ethan Mollick@emollick·3 Mar

From an AI user perspective, the four big leaps so far in ability:
1. GPT-3.5 (ChatGPT, November 2022)
2. GPT-4 (Spring 2023)
3. Reasoners (starts with o1-preview, but the real deal was o3, Spring 2025)
4. Workable agentic systems (Harness + good reasoner models, December 2025)

UhuBuhu@kecksbe·18 Feb

@JJBalls9 @jpschroeder Smarter but slower, at least on the €20 plan. I use both.

yesyes@JJBalls9·18 Feb

@jpschroeder Codex 5.3? If so, I may give it a try.

Justin Schroeder@jpschroeder·17 Feb

Sonnet 4.6 and Opus 4.6: Anthropic trained Sonnet 5, and it almost outperformed Opus 4.5. With some RL on benchmarks, it could even outperform 4.5, but it was now dramatically smaller (cheaper). So? They renamed Sonnet 5 → Opus 4.6. But what of Sonnet? They distilled "Sonnet 5" into an even smaller model and rebadged it Sonnet 4.6. So now both models are just a fraction of their original size and cost to run. Even better, it left a little bit of extra compute overhead, which can be sold for 6x the cost as "fast mode." The models are not *actually* better than what they replace but...margin. I can't blame them too much for that.

UhuBuhu@kecksbe·17 Feb

@OP__Nico @Angaisb_ I'm fine, I already have ChatGPT for Codex and Claude for Opus. The free tier of Gemini gives me everything I need from Gemini, but ty for the tip.

OP Nico@OP__Nico·17 Feb

@kecksbe @Angaisb_ As a subscription it's really valid tho! And you get really generous Opus 4.6 use in Antigravity with your subscription! Give it a try, to me it's really the ultimate deal :)

Angel 🌼@Angaisb_·17 Feb

Time to go back to Plus. Pro was amazing while it lasted.

UhuBuhu@kecksbe·17 Feb

@OP__Nico @Angaisb_ The problem with Gemini is that Claude is worse in Antigravity and that Codex is smarter than Claude, but if dev isn't your main focus it's a great deal.

OP Nico@OP__Nico·17 Feb

@Angaisb_ Man, since I switched to Gemini I don't think I can go back to anything. You get really good limits within the UI + Antigravity use. Antigravity opens up doors to Opus 4.6 too, which I had running for at least 4 hours yesterday building Swift code. It's the ultimate deal really.

UhuBuhu@kecksbe·12 Feb

@adonis_singh Why call it Spark and not mini, though?

adi@adonis_singh·12 Feb

achieves about gpt-5.3-codex-low levels of accuracy (at xhigh itself) at a noticeably faster speed. way faster and smarter than the previous 5.1-codex-mini model though, which is a big win

adi@adonis_singh·12 Feb

finally!! up to 1000 tps speeds on a smaller version 5.3 codex!

OpenAI@OpenAI

GPT-5.3-Codex-Spark is now in research preview. You can just build things—faster.

UhuBuhu@kecksbe·10 Feb

@thsottiaux The terminal UI sucks. In Claude Code I get a way better overview of what the model is actually doing. In Codex, especially back when the model was slow, I often asked myself: is the model working, or did it get stuck? But overall the model is great. Maybe try to speed it up even more.

Tibo@thsottiaux·10 Feb

What could we do better on Codex? App, model, strategy and features… what’s wrong in how we approach things that we should improve immediately?

UhuBuhu@kecksbe·8 Feb

@adonis_singh I want to use 5.2 Instant for quick, easy searches, but the model being this much worse than Thinking and even 4o defeats the whole purpose if I need to write 5 prompts for it to do what I want.

UhuBuhu@kecksbe·8 Feb

@adonis_singh I feel like it's always dumber. Simple example: I asked both for the current state of a tournament named JBB. 5.2 Instant told me 3 times that I probably meant JBBL. I told it no, search for JBB; it searched again for JBBL. After 5 attempts I went to 4o; it instantly searched for JBB by itse

adi@adonis_singh·7 Feb

gpt-5.2 instant is dumber than gpt-4 at times

Henry Shevlin@dioscuri

This is a weird hallucination from ChatGPT 5.2. Basic knowledge about one of the most played games in history. Real throwback to 3.5 levels of inaccuracy.

UhuBuhu@kecksbe·6 Feb

@erawrlyne @ThePrimeagen Disagree, it eats way more tokens for the same task.

Eralyne@erawrlyne·6 Feb

@ThePrimeagen Feels that way with 4.6 tbh

ThePrimeagen@ThePrimeagen·6 Feb

it would be funny if 5.3 is just 5.2 and it's an experiment to see how much psychosis there really is

UhuBuhu@kecksbe·5 Feb

@testingcatalog @patience_cave

TestingCatalog News 🗞@testingcatalog·5 Feb

BREAKING 🚨: CLAUDE OPUS 4.6 IS ROLLING OUT ON THE WEB, APPS AND DESKTOP! TESTING TIME 🔥

UhuBuhu@kecksbe·5 Feb

@KryptoWolfGER Absolutely going live

Wolf@KryptoWolfGER·5 Feb

Are we going live or are we giving up? #Bitcoin

UhuBuhu@kecksbe·4 Feb

@patience_cave @Angaisb_ Maybe all hope in you isn't gone, CHAIR

💺@patience_cave·3 Feb

@Angaisb_ big

Angel 🌼@Angaisb_·3 Feb

No Sonnet 5 today. Patience Cave won this time.

UhuBuhu@kecksbe·3 Feb

@kimmonismus I am missing the new Gemini agentic Flash, which is in my opinion the best one right now, and Mistral OCR, which was the best one before Gemini Flash agentic vision. PaddleOCR is a good metric but not on Mistral or Gemini level.

Chubby♨️@kimmonismus·3 Feb

So we got SOTA OCR with just 0.9B params. GLM-OCR is a lightweight (0.9B params) multimodal OCR system built on the GLM-V encoder–decoder stack. Love it!

Z.ai@Zai_org

Introducing GLM-OCR: SOTA performance, optimized for complex document understanding. With only 0.9B parameters, GLM-OCR delivers state-of-the-art results across major document understanding benchmarks, including formula recognition, table recognition, and information extraction. Weights: huggingface.co/zai-org/GLM-OCR Try it: ocr.z.ai API: docs.z.ai/guides/vlm/glm…

UhuBuhu@kecksbe·2 Feb

@TheRealSynetos @camsoft2000 We need the Claude Code UI with a router that actually works, routing between Opus and Codex. Oh, and Codex needs a speed boost.

Synetos@TheRealSynetos·2 Feb

@camsoft2000 We just need the cozy feeling of Claude Code UI/UX mixed with Codex capabilities and that's it. I just love how CC feels to use. And also we need Codex models that are better at frontend. Or if anyone has a workflow to make better frontend, I'm begging you to share it please

camsoft2000@camsoft2000·2 Feb

Having used Claude Code with Opus 4.5 a lot recently, I can tell you that for me, Codex CLI with GPT-5.2-Codex wins. It's a relief to go back to Codex; it feels like home. Not saying CC is rubbish, it's just Codex gets it done, Claude Code is too lazy, replies to my queries without re-checking code, and requires more steering and planning. Codex just doesn't need all that. I'm sure some will disagree, and it's subjective for sure, but I much prefer Codex to Claude Code. It suits the way I work.

UhuBuhu@kecksbe·1 Feb

@apoorvdarshan @theCTO Slow but smarter
