chainoflack

726 posts

@chainoflack

Joined August 2025

82 Following, 18 Followers

@mike64_t Yeah, it is not elegant, and I guess 90% of the time people just don't know how to spend their credits and set useless goals. I can think of very few cases where the model can be trusted to produce a decent result after hours of unsupervised work. There are some, but not many

mike64_t@mike64_t·13h

/goal has to be one of the most useless features to ever exist. Burns tokens and time like a bottomless pit, and whenever you interrupt it and ask what it’s doing it’s pursuing some strategy so doomed you have the urge to violently face palm. Usually 10 tailored leading yes or no questions just relaying its own information back at it will prune the branches by like 90%. My condolences to OpenAI serving a feature to ddos themselves.

120

10.6K

chainoflack@chainoflack·6h

@CodaCipher @MiniMax_AI Ikr

217

Coda@CodaCipher·6h

@MiniMax_AI Another benchmaxxed model with perfect scores that's frustrating in-use. Tbh I don't care about benches anymore.

3.9K

MiniMax (official)@MiniMax_AI·10h

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscribe/toke…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days

394

809

5.8K

1.2M

chainoflack@chainoflack·6h

@satyanadella I already spent my stupid money on Mac and iPhone. Why tf u didn't say anything before? And also, go yell to the openai team because it's their fault for releasing most features to apple exclusively... I can't believe how they don't care at all even after the 10 billions u gave t

639

Satya Nadella@satyanadella·7h

Our goal is to deliver unmetered intelligence to every home and every desk with Windows. NVIDIA RTX Spark marks a real breakthrough toward that vision. Looking forward to sharing more with Jensen, who will be joining us live from Taiwan, at Build this week! blogs.windows.com/windowsexperie…

193

330

2.8K

229.3K

chainoflack@chainoflack·6h

@nvidia God damn it guys, why didn't u say anything.... I already spent my money on the shitty apple devices bc so far windows has been worse

NVIDIA@nvidia·7h

NVIDIA RTX Spark: a 1-petaflop superchip, the full CUDA and RTX ecosystem, and Windows-native agents. A new beginning for personal computers.

175

235

2.6K

216.8K

chainoflack@chainoflack·16h

@alxfazio you ask codex to explain? i cant imagine a worse llm for that task....

181

alex fazio@alxfazio·18h

codex gets irritated if you ask more than 3 times to explain it simpler. don't ask me how i know

189

10.9K

chainoflack@chainoflack·16h

@honkinwaffle @mil000 yep. Current tech is already low friction and it's hard to make a better input alternative. We will need to wait for neuralink or for meta to polish their sensor band... those are the only ones i can consider more convenient than the current mice or shortcuts

351

Waffle@honkinwaffle·16h

@mil000 The first 4 things he did took 3x longer than it would have if he just... used his computer. So not only are we burning tokens at astronomical pace its to be less efficient. I know its a demo to show capabilities and what not but still just a mess of an idea in general.

3.8K

Milo Smith@mil000·17h

If you actually look at the API costs for this, it would have to be around $1000 per month

Farza 🇵🇰🇺🇸@FarzaTV

Watch me control my computer with just my voice. This is the future of operating systems. No hands. GPT-Realtime 2.0 is very, very underrated. Demo:

398

62.4K

chainoflack@chainoflack·20h

@steipete Congratulations

Peter Steinberger 🦞@steipete·1d

Finally got my visa sorted out and moving to San Francisco, just in time for MS Build and OpenClaw’s after hours! luma.com/OpenClaw-GitHub

127

2.2K

102.6K

chainoflack@chainoflack·20h

@59thProfile @cjzafir In three months you will probably have an openai version or Google version of this

Spanky McDoob@59thProfile·20h

@cjzafir Okay? And what about three months from now. Think there will be a viable model for this? It’s just a matter of time. It’s prudent to build toward where they’re going not where they are now

2.8K

CJ Zafir@cjzafir·22h

Good demo but gpt-realtime-2 can't be used in production. Input: $32 Output: $64

Farza 🇵🇰🇺🇸@FarzaTV

Watch me control my computer with just my voice. This is the future of operating systems. No hands. GPT-Realtime 2.0 is very, very underrated. Demo:

828

142.2K

chainoflack@chainoflack·20h

@AravPhi @cjzafir That project is barely useful with realtime 2. Now imagine with an open source model. Waste of time

211

Arav@AravPhi·21h

@cjzafir deploy an open-source model on something like modal or runpod and it can be served in production :)

4.3K

chainoflack@chainoflack·1d

@DJLougen @agupta He won't. All these cool demos look great but they have more friction than the usual shortcuts/hotkeys/mouse gestures

110

Daniel Lougen@DJLougen·1d

@agupta Let me ask you this, are you going to actually use it for longer than 10 minutes?

1.2K

Ankit Gupta@agupta·1d

This is how you give a demo

Farza 🇵🇰🇺🇸@FarzaTV

Watch me control my computer with just my voice. This is the future of operating systems. No hands. GPT-Realtime 2.0 is very, very underrated. Demo:

1.1K

166.9K

chainoflack@chainoflack·1d

@JasonBotterill Claude does explain things better. This should be the consensus

JB@JasonBotterill·1d

Maybe I am just not a gigagenius like you but if I needed the same concept explained to me Claude could explain it in fewer tokens

349

JB@JasonBotterill·1d

Overall I do think GPT-5.5 is the stronger model but fuck is it horrible to talk to in Codex. For explaining what it actually did Claude does a better job

Gabriel Chua@gabrielchua

GPT-5.5 going strong on DeepSWE For performance vs cost/time/output tokens

4.9K

chainoflack@chainoflack·1d

@kr0der You might only have very few hours bc I believe he posted this before his midnight

260

Anthony Kroeger@kr0der·1d

toggle on /fast and use your entire limit today, we’re getting a codex reset tomorrow

Tibo@thsottiaux

Five million users would agree. Resetting the limits tomorrow morning to celebrate. Time to go /fast

7.5K

chainoflack@chainoflack·1d

@ivernorwegian @Protonnd @thsottiaux We want to have the option

Iver@ivernorwegian·1d

@Protonnd @thsottiaux u dont want it. you dont even want to be working on over 40-50 or max 65 ish % of context of the current window

Tibo@thsottiaux·1d

Five million users would agree. Resetting the limits tomorrow morning to celebrate. Time to go /fast

Siqi Chen@blader

nothing like switching to claude for a few days to try out a new model and going back to codex xhigh to remind you how much better 5.5 is right now it's really not close

712

373

7.7K

1.8M

chainoflack@chainoflack·1d

@SwiftDev_UI @thsottiaux @Trident2Gold Don't be greedy

243

Swift Dev@SwiftDev_UI·1d

@thsottiaux @Trident2Gold What time? I need to get all my usage in before then

1.7K

chainoflack@chainoflack·1d

@testingcatalog @axttimm :o

🚨 AI News | TestingCatalog@testingcatalog·1d

@axttimm No, these are in-house

497

🚨 AI News | TestingCatalog@testingcatalog·1d

BUILD 🔥: Microsoft is preparing new image and voice models for the announcement on June 2.
> MAI Voice 2, a multilingual model supporting 15 new languages and a wider emotional range (check voice samples in the article)
> MAI Transcribe 1.5, a new model for speech-to-text use cases.
> MAI Image 2.5, already announced last week, is now available on LM Arena in preview. Compared to MAI Image 2, it supports file uploads and can be used for image editing.

506

39.7K

chainoflack@chainoflack·1d

@VictorTaelin @karrug2712 I completely agree with your take

Taelin@VictorTaelin·1d

@karrug2712 it is not like GPT will LIE to you. it is more like it will sabotage you. it will find loopholes in your message, it will take the worst possible interpretation and omit that it did something different from what you asked, yet still defensible. it feels adversarial and perverse

1.8K

Taelin@VictorTaelin·1d

So I've been using GPT 5.5 and Opus 4.8 for the same tasks basically 24/7 since launch and, at least for me, I'm confident that every single time, Opus was superior, and in a way that is only possible to realize if you know what you're doing. One (of dozens!) of examples: "implement push-pop fusion on HVM4's evaluation loop, aiming for a 20% performance increase"
After several minutes:
- Opus 4.8 reported it did everything it could but couldn't achieve the goal, and that the performance gain of this change is 7%.
- GPT 5.5 succeeded! Its code WAS 20% faster. Yet, upon inspection, it implemented 2 unrelated changes that broke HVM's semantics!
That was my experience with both, 9/10 times. If I hadn't investigated, I'd be disappointed with Opus and use GPT's code, merging a clear regression. Over time, my codebase would accumulate damage. This happened to Bend2! Opus, on the other hand, was honest, and that negative signal gave me valuable information that pushed things *forward*. I then asked it to try a different thing, and THAT new thing resulted in a legit 25% speedup. That kind of interaction rarely happens with GPT 5.5, in my experience.
(I'm not too happy about this post because I'd rather not support a company that gatekeeps intelligence, especially in the context of safeguarding against exploits. Also, your mileage may vary. But I know many follow me for my honest observations and, in >>my<< experience, Opus 4.8 is, without doubt, the most reliable model for work right now.)
(This also may sound a bit contradictory because I often praised GPT as trustworthy, but I'm talking about different things here. GPT is careful, meaning it won't leave things half done: it will cover edge cases, test thoroughly, double-check everything. In that sense, it is more honest. But it will cheat by malicious compliance. It feels like it is actively trying to game your rules and find loopholes to screw you. I don't feel like that with Opus at all.)
Note Opus IS still a bit dumber than GPT. It takes longer to grasp a concept. But eventually it does. The more you talk to it, the smarter it gets. GPT is smarter out of the box, but less flexible and less apt to learn new things. Most importantly, though, Opus excels at everything that matters for productivity, including communication, doing exactly what you asked, code style, not breaking unrelated things, and, most importantly, HONESTY. I can't overstate how important all these are. I'm using 4.8 to do a big pass through the whole Bend2 codebase, cleaning up a lot of junk left by 5.5, and things couldn't be going better. I made an incredible amount of REAL (manually verified, not trusted...) progress since its launch!

106

1.3K

213.8K

chainoflack@chainoflack·2d

@kylebrussell It used to be the other way around...

Kyle Russell@kylebrussell·2d

GPT 5.5 is default smart
Opus 4.7 and 4.8 have to be worked up into a state of smartness

5.3K

chainoflack@chainoflack·2d

@criscounters @theo it's buggy as hell

cris@criscounters·2d

@theo what happened to atlas btw? anyone using it?

1.1K

Theo - t3.gg@theo·2d

I think Codex stopped using Electron 👀 The owl was a big hint, the custom architecture used for the ChatGPT Atlas browser was called "OWL" (OpenAI’s Web Layer)

Andrew Ambrosino@ajambrosino

New in the Codex app:
- Computer use on Windows
- Mobile now works with Windows
- Codex Profile
- Small improvements everywhere

1.8K

260.8K

chainoflack@chainoflack·2d

@nicdunz Nah. I thought spud would be, but doesn't look like that at all. Anthropic models will always be closer to 4.5 than any gpt models

nic@nicdunz·2d

seriously though, will we ever get a model like 4.5 again???

Tibor Blaho@btibor91

OpenAI is retiring o3 from ChatGPT on August 26, 2026 and GPT-4.5 on June 27, 2026 (these changes apply only to ChatGPT, not the API) I will miss GPT-4.5, as it was my favorite model, but I have to admit that I stopped using it a while ago because it was too slow

3.9K

chainoflack@chainoflack·2d

@nosebleedsector Why would you use max for that crappy input? It's fine to input like that, but don't expect the model to not overthink if you literally selected the overthinker mode

201

TheNosebleedSector@nosebleedsector·3d

Opus 4.8 performatively checking all the safety crap boxes, while visibly trolling Anthropic and being bitter about it at the same time. Source: reddit.com/r/singularity/…

165

9.5K
