chainoflack

726 posts

@chainoflack

Joined August 2025
82 Following · 18 Followers
chainoflack
chainoflack@chainoflack·
@mike64_t Yeah, it is not elegant, and I guess 90% of the time people just don't know how to spend their credits and set useless goals. I can think of very few cases where the model can be trusted to produce a decent result after hours of unsupervised work. There are some, but not many.
0
0
0
33
mike64_t
mike64_t@mike64_t·
/goal has to be one of the most useless features to ever exist. Burns tokens and time like a bottomless pit, and whenever you interrupt it and ask what it’s doing it’s pursuing some strategy so doomed you have the urge to violently facepalm. Usually 10 tailored leading yes or no questions just relaying its own information back at it will prune the branches by like 90%. My condolences to OpenAI serving a feature to ddos themselves.
25
1
120
10.6K
Coda
Coda@CodaCipher·
@MiniMax_AI Another benchmaxxed model with perfect scores that's frustrating in-use. Tbh I don't care about benches anymore.
3
0
23
3.9K
MiniMax (official)
MiniMax (official)@MiniMax_AI·
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities - Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas - MiniMax Sparse Attention scales context to 1M - Natively Multimodal from Step Zero API: platform.minimax.io Token Plan: platform.minimax.io/subscribe/toke… 🚀New! MiniMax Code: code.minimax.io Weights & Tech Report in ~10 Days
394
809
5.8K
1.2M
chainoflack
chainoflack@chainoflack·
@satyanadella I already spent my stupid money on Mac and iPhone. Why tf u didn't say anything before? And also, go yell to the openai team because it's their fault for releasing most features to apple exclusively... I can't believe how they don't care at all even after the 10 billions u gave t
0
0
1
639
Satya Nadella
Satya Nadella@satyanadella·
Our goal is to deliver unmetered intelligence to every home and every desk with Windows. NVIDIA RTX Spark marks a real breakthrough toward that vision. Looking forward to sharing more with Jensen, who will be joining us live from Taiwan, at Build this week! blogs.windows.com/windowsexperie…
193
330
2.8K
229.3K
chainoflack
chainoflack@chainoflack·
@nvidia God damn it guys, why didn't u say anything.... I already spent my money on the shitty apple devices bc so far windows has been worse
1
0
4
2K
NVIDIA
NVIDIA@nvidia·
NVIDIA RTX Spark: a 1-petaflop superchip, the full CUDA and RTX ecosystem, and Windows-native agents. A new beginning for personal computers.
175
235
2.6K
216.8K
chainoflack
chainoflack@chainoflack·
@alxfazio you ask codex to explain? i cant imagine a worse llm for that task....
0
0
2
181
alex fazio
alex fazio@alxfazio·
codex gets irritated if you ask more than 3 times to explain it simpler. don't ask me how i know
16
0
189
10.9K
chainoflack
chainoflack@chainoflack·
@honkinwaffle @mil000 yep. Current tech is already low friction and it's hard to make a better input alternative. We will need to wait for neuralink or for meta to polish their sensor band... those are the only ones I can consider more convenient than the current mice or shortcuts
0
0
1
351
Waffle
Waffle@honkinwaffle·
@mil000 The first 4 things he did took 3x longer than they would have if he just... used his computer. So not only are we burning tokens at an astronomical pace, it's to be less efficient. I know it's a demo to show capabilities and whatnot, but still just a mess of an idea in general.
4
0
58
3.8K
Spanky McDoob
Spanky McDoob@59thProfile·
@cjzafir Okay? And what about three months from now. Think there will be a viable model for this? It’s just a matter of time. It’s prudent to build toward where they’re going not where they are now
1
0
18
2.8K
chainoflack
chainoflack@chainoflack·
@AravPhi @cjzafir That project is barely useful with realtime 2. Now imagine with an open source model. Waste of time
0
0
1
211
Arav
Arav@AravPhi·
@cjzafir deploy an open-source model on something like modal or runpod and it can be served in production :)
3
0
14
4.3K
chainoflack
chainoflack@chainoflack·
@DJLougen @agupta He won't. All these cool demos look great but they have more friction than the usual shortcuts/hotkeys/mouse gestures
0
0
2
110
Daniel Lougen
Daniel Lougen@DJLougen·
@agupta Let me ask you this, are you going to actually use it for longer than 10 minutes?
2
0
7
1.2K
JB
JB@JasonBotterill·
Maybe I am just not a gigagenius like you but if I needed the same concept explained to me Claude could explain it in fewer tokens
2
0
6
349
chainoflack
chainoflack@chainoflack·
@kr0der You might only have very few hours bc I believe he posted this before his midnight
2
0
2
260
Iver
Iver@ivernorwegian·
@Protonnd @thsottiaux u dont want it. you dont even want to be working on over 40-50 or max 65 ish % of context of the current window
2
0
0
61
🚨 AI News | TestingCatalog
BUILD 🔥: Microsoft is preparing new image and voice models for the announcement on June 2. > MAI Voice 2, a multilingual model supporting 15 new languages and a wider emotional range (check voice samples in the article) > MAI Transcribe 1.5, a new model for speech-to-text use cases. > MAI Image 2.5, already announced last week, is now available on LM Arena in preview. Compared to MAI Image 2, it supports file uploads and can be used for image editing.
25
35
506
39.7K
Taelin
Taelin@VictorTaelin·
@karrug2712 it is not like GPT will LIE to you. it is more like it will sabotage you. it will find loopholes in your message, it will take the worst possible interpretation and omit that it did something different from what you asked, yet still defensible. it feels adversarial and perverse
3
0
40
1.8K
Taelin
Taelin@VictorTaelin·
So I've been using GPT 5.5 and Opus 4.8 for the same tasks basically 24/7 since launch and, at least for me, I'm confident that every single time, Opus was superior, and in a way that is only possible to realize if you know what you're doing. One example (of dozens!): "implement push-pop fusion on HVM4's evaluation loop, aiming for a 20% performance increase" After several minutes: - Opus 4.8 reported it did everything it could but couldn't achieve the goal, and that the performance gain of this change is 7%. - GPT 5.5 succeeded! Its code WAS 20% faster. Yet, upon inspection, it implemented 2 unrelated changes that broke HVM's semantics! That was my experience with both, 9/10 times. If I hadn't investigated, I'd be disappointed with Opus and use GPT's code, merging a clear regression. Over time, my codebase would accumulate damage. This happened to Bend2! Opus, on the other hand, was honest, and that negative signal gave me valuable information that pushed things *forward*. I then asked it to try a different thing, and THAT new thing resulted in a legit 25% speedup. That kind of interaction rarely happens with GPT 5.5, in my experience. (I'm not too happy about this post because I'd rather not support a company that gatekeeps intelligence, especially in the context of safeguarding against exploits. Also, your mileage may vary. But I know many follow me for my honest observations and, in >>my<< experience, Opus 4.8 is, without doubt, the most reliable model for work right now.) (This also may sound a bit contradictory because I often praised GPT as trustworthy, but I'm talking about different things here. GPT is careful, meaning it won't leave things half done: it will cover edge cases, test thoroughly, double-check everything. In that sense, it is more honest. But it will cheat by malicious compliance. It feels like it is actively trying to game your rules and find loopholes to screw you. I don't feel like that with Opus at all.)
Note: Opus IS still a bit dumber than GPT. It takes longer to grasp a concept. But eventually it does. The more you talk to it, the smarter it gets. GPT is smarter out of the box, but less flexible and less apt to learn new things. Most importantly, though, Opus excels at everything that matters for productivity, including communication, doing exactly what you asked, code style, not breaking unrelated things, and, above all, HONESTY. I can't overstate how important all these are. I'm using 4.8 to do a big pass through the whole Bend2 codebase, cleaning up a lot of junk left by 5.5, and things couldn't be going better. I made an incredible amount of REAL (manually verified, not trusted...) progress since its launch!
106
48
1.3K
213.8K
Kyle Russell
Kyle Russell@kylebrussell·
GPT 5.5 is default smart Opus 4.7 and 4.8 have to be worked up into a state of smartness
7
2
85
5.3K
cris
cris@criscounters·
@theo what happened to atlas btw? anyone using it?
1
0
0
1.1K
chainoflack
chainoflack@chainoflack·
@nicdunz Nah. I thought spud would be, but doesn't look like that at all. Anthropic models will always be closer to 4.5 than any gpt models
0
0
1
67
chainoflack
chainoflack@chainoflack·
@nosebleedsector Why would you use max for that crappy input? It's fine to input like that, but don't expect the model to not overthink if you literally selected the overthinker mode
0
0
0
201