wqite

53 posts

wqite

@qwqite

@astrminc

Joined November 2024

335 Following · 17 Followers

wqite@qwqite·2d

@dprophecyguy @ryanvogel @steipete insanely dumb for 99.99% of people, but considering he effectively has unlimited tokens it's not a bad idea

vijay singh@dprophecyguy·2d

@ryanvogel @steipete dont know if its smart or dumb tbh

vogel@ryanvogel·2d

ahhh that’s what @steipete is using so much he’s running codex on every commit

Greg Brockman@gdb

run codex on every commit

wqite@qwqite·9 May

@2TheTimelessOne @KRR1751 @TechPowerUp useless bloat

The Timeless One@2TheTimelessOne·8 May

@KRR1751 @TechPowerUp Fun and games until you try to use the microsoft store and xbox services. Useless custom OS

TechPowerUp@TechPowerUp·7 May

Microsoft Is Testing a Windows 11 Feature That Maxes Out CPU Speed for Faster App Launches tpu.me/rxj5

wqite@qwqite·6 May

@RaviTejaKNTS @JasonBotterill then we had a misunderstanding, gpt models are great tools yes. 5.4 was just the worst gpt model in a VERY long time other than its raw speed & efficiency. very bad personality, often bad coding practices, bad decisions, unable to follow instructions, etc. 5.5 is peak tho

Teja@RaviTejaKNTS·6 May

@qwqite @JasonBotterill i dont give a damn about benchmarks. i use ai like a tool. GPT models are great tools.

JB@JasonBotterill·6 May

GPT-5.5 Instant system card shows its beating 5.4-thinking on most benches lmao?

wqite@qwqite·6 May

@RaviTejaKNTS @JasonBotterill benches? they were good tools before but have you ever even bothered reading how terrible the eval process is ? lol

Teja@RaviTejaKNTS·6 May

@qwqite @JasonBotterill lol, nah. They are good tools. Not conversational

wqite@qwqite·6 May

@nitzukai mouse accel is utterly disgusting for anything other than laptop usage no? what is there to like?

nitzu 🫧@nitzukai·5 May

what's up with YouTubers hating mouse acceleration

wqite@qwqite·6 May

yay another worthless benchmark

John Yang@jyangballin

How much of SQLite, FFmpeg, PHP compiler can LMs code from scratch? Given just an executable and no starter code or internet access. Introducing ProgramBench: 200 rigorous, whole-repo generation tasks where models design, build, and ship a working program end to end. 🧵

wqite@qwqite·6 May

@realaicoach @DavidKPiano 5.5 is just too fast & has too much baseline performance & consistency for opus 4.7 to matter opus models have always been inconsistent but if you fought with them, they would easily produce the best results. it’s still true, but terrible roi vs before

wqite@qwqite·6 May

@realaicoach @DavidKPiano opus 4.7 is very similar to sonnet 3.7 in the sense that the common consensus is that it’s weird & a downgrade, however it’s definitely an upgrade in many ways this time around though there exists other better options unlike last time (5.5)

David K 🎹@DavidKPiano·6 May

At this point Claude should just re-release Opus 4.6 as Opus 4.8 and it would look like an improvement

wqite@qwqite·5 May

@yacineMTB they def increased it insane amounts, at some point on 5.4 on a business account it took 8 prompts with subagents to get rid of my entire weekly usage 5.5 on pro lite it’s been 2 days and i’m down 2% i assume it’s due to the /goal stuff they added

kache@yacineMTB·5 May

It feels like they increased the limits on gpt 5.5 I've been ripping this shit all night, literally 10 hour reverse engineering runs and it's like 5% weekly usage

wqite@qwqite·30 Apr

@solxiz @ThourCS2 @TitanHoloCS @WindowsCentral tweakers nowadays dont even pretend to read windows internals, i’m not a random just not a scammer.

solxiz@solxiz·30 Apr

@wqitedev @ThourCS2 @TitanHoloCS @WindowsCentral Be quiet for a bit, random

Thour@ThourCS2·29 Apr

Microsoft is working on a Windows 11 update codenamed Windows K2 with a focus on improving gaming performance and reducing bloatware, according to @WindowsCentral They are using SteamOS as a benchmark for performance. It will also lower idle memory use and less AI clutter.

wqite@qwqite·30 Apr

@solxiz @ThourCS2 @TitanHoloCS @WindowsCentral 99% of tweakers are clueless morons that do not understand anything they are saying & repeating the words of other clueless morons

solxiz@solxiz·29 Apr

@ThourCS2 @TitanHoloCS @WindowsCentral Not at all. There's always a lot of work to do, no matter how well "optimized" an OS could be.

wqite@qwqite·30 Apr

@d4m1n @Zain_Wania i really hope to god for the sake of anyone using it that it’s not worse now than how it was before lol

Dan ⚡️@d4m1n·29 Apr

@Zain_Wania nah man it was a lot better ~3mo after it launched with worse models. It's been bloated and buggy for months now, they are just now starting to fix things. I've been using CC daily btw pretty much since launch. Still use it

Dan ⚡️@d4m1n·29 Apr

lol Cursor is a better harness for both GPT 5.5 in Codex AND Opus 4.7 in Claude Code how is that possible?!

wqite@qwqite·26 Apr

@ain3sh @NielsRogge @droid it really doesn't take much to beat cc, the bar is in hell

Ainesh Chatterjee@ain3sh·26 Apr

@NielsRogge Not to defend slop, but ForgeCode and multiple others ranked in the top few were shown to be gaming tb2 using prompt injections / system prompt hints to boost their scores. We still beat CC without needing any gimmicks tho 😼 harness is everything @droid

Niels Rogge@NielsRogge·26 Apr

FYI Claude Code is mostly a vibe-coded product (as they say, 100% written by Claude) It's the worst harness for Opus 4.6 among ANY harness on Terminal-Bench 2

Matt Pocock@mattpocockuk

I feel sorry for Claude Code I know they're not the one. I'm not overcommitting - not investing too hard I wonder if they know I'm pulling away

wqite@qwqite·24 Apr

@theo insanely fast & feels like an anthropic model in a good way, though i haven't tried it extensively yet

Theo - t3.gg@theo·23 Apr

How are you guys feeling about 5.5 so far?

wqite@qwqite·24 Apr

@redtachyon 5.3 - much more efficient than 5.2, better at using the codex harness, very clean direct model 5.4 - useless garbage 5.5 - likely the final model in this mini family, seems to be pretty competent & acts a lot more like an anthropic model compared to 5.0

Ariel@redtachyon·24 Apr

3.5 - silly, interesting, largely useless 4 - first actually useful model, at least on some things 4o - multimodal, misaligned, oneshot normies 4.5 - bigger, more raw, very interesting o1 - first reasoner, impressive for its time o3 - absolute beast, still an incredible model 5 - o3 in a trenchcoat 5.1 - people were mad at 5 so this was a bit better 5.2 - codex era, great agentic performance 5.3 - ? 5.4 - ?? 5.5 - ???

wqite@qwqite·22 Apr

@MicahHaley @rileybrown why talk abt shit you don’t understand

Micah Haley@MicahHaley·22 Apr

@rileybrown It's not that crazy... they steal/train on the actual images that include barcodes lol

Riley Brown@rileybrown·22 Apr

Dude GPT-Image-2... wtf... how.

wqite@qwqite·17 Apr

@theo i really feel that largely the regressions come from claude code being the biggest piece of trash i’ve ever used, i don’t understand how anyone can use it truthfully.

Theo - t3.gg@theo·17 Apr

Serious question. Has anyone ever noticed meaningful regressions in Codex/OpenAI models? I feel like we talk about this a lot w/ Anthropic but I've never seen a similar discussion with OAI.

wqite@qwqite·17 Apr

@ElectricSheepIO @bcherny @realsigridjin

Eva@ElectricSheepIO·17 Apr

@bcherny @realsigridjin Why is 4.6 not accessible @bcherny ? Even OpenAI rolled back that bad policy. Old models have to be supported for a period at least

Sigrid Jin 🌈🙏@realsigridjin·16 Apr

in claude web - opus 4.7: only adaptive thinking mode - opus 4.6: i can turn on/off reasoning mode so basically you can't control thinking mode

wqite@qwqite·6 Apr

@mahm_darwish @KhalidWarsa you’re ignoring the implications of caching very hard

Mahmoud Darwish@mahm_darwish·6 Apr

Tbh I am not sure how this is a good argument, using only Claude code I am consuming $3-4k worth of tokens monthly. I don’t care about OpenClaw, but I want to use the cli mode for my software or any software that simply uses Claude code. Automation is a normal progression for LLM tools.

Khalid Warsame@KhalidWarsa·6 Apr

I’m with Anthropic for banning OpenClaw cause you can’t consume $2,000 worth of tokens on a $200 subscription and expect Anthropic to foot the bill. Your “24/7 running agents that produce slop and pollute the internet” are net negative on society. Pay actual usage or gtfoh.
