Lukas

1.6K posts

@quick007YT

Fullstack TS developer. Currently in college getting a CS degree.

California · Joined November 2018

215 Following · 77 Followers

Lukas@quick007YT·15m

@aidenybai no?

Aiden Bai@aidenybai·25m

does anyone else experience "low power mode" (where your vision goes to 15fps) when you've been awake too long

447

Lukas@quick007YT·3d

@thdxr btw the codex app has a hotkey for this! super convenient

dax@thdxr·3d

i'm not really a big "one secret that makes ai good" guy but i will say it does so much better when your prompts are longer and it's so much easier to produce long prompts with voice, completely changed things for me

135

1.8K

71.6K

Lukas@quick007YT·3d

@ShimazuSystems he confirmed in the description that it's mostly generated artifacts. i personally don't review my lock files line by line

105

Shimazu.S@ShimazuSystems·3d

You know at 20,487 (combined add/remove) we can assume an average and say per line this is 3-4 seconds to read, let's just boldly assume the same to understand. So what, 8 seconds per line? That's 163,896 seconds. Now let's divide that by 60, I'll round the decimal down for courtesy - that is 2,731 minutes. Now let's turn those minutes into hours - so what, 2,731/60 again! Rounding to 1 decimal place, that gives us 45.5 hours. Now I do work hard, so in my flow state I work for 14 hours straight. It's gonna take me 3.25 days in my state to even understand what you have produced. You do not understand, nor read your code. If this is markdown, there is zero way to verify that your AI hasn't hallucinated. This does not belong to you, it has nothing to do with you, you are a vessel. I like AI, I do not like people who pretend to be productive.
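
The arithmetic in the post above can be checked with a short sketch. The inputs (20,487 changed lines, 8 seconds per line, 14-hour working days) and the rounding steps come from the post itself; the function name `review_estimate` and its parameter names are hypothetical, chosen only for illustration.

```python
def review_estimate(changed_lines: int,
                    seconds_per_line: int = 8,
                    hours_per_day: float = 14.0) -> dict:
    """Reproduce the post's back-of-the-envelope review-time estimate.

    All names here are illustrative assumptions; only the numbers and the
    rounding steps are taken from the post.
    """
    total_seconds = changed_lines * seconds_per_line
    # The post rounds the minutes down ("round the decimal down for courtesy").
    whole_minutes = total_seconds // 60
    # Hours are rounded to one decimal place, as in the post.
    hours = round(whole_minutes / 60, 1)
    # Days assume one uninterrupted 14-hour session per day.
    days = round(hours / hours_per_day, 2)
    return {
        "total_seconds": total_seconds,
        "whole_minutes": whole_minutes,
        "hours": hours,
        "days": days,
    }


if __name__ == "__main__":
    print(review_estimate(20_487))
    # {'total_seconds': 163896, 'whole_minutes': 2731, 'hours': 45.5, 'days': 3.25}
```

The figures match the post. Note that the estimate is linear in the per-line cost, so it shrinks proportionally if generated files such as lock files are skimmed rather than read line by line.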

David Cramer@zeeg

im coming for you today @garrytan

1.6K

116.9K

Lukas@quick007YT·3d

@catalinmpit not in absolute terms, but definitely surpasses cc/claude in many ways

Catalin@catalinmpit·3d

If it wasn’t Eddy, I’d probably mute the poster for hype. But is it that good? Is it better than Claude Code and Claude models?

Eddy Vinck@EddyVinckk

I'm shilling Codex so much at work and OpenAI is not even paying me for it. It's so early, most people don't even know about the Codex app

4.9K

Lukas@quick007YT·3d

@RhysSullivan ?

Lukas@quick007YT·3d

@theo i was going to post a joke about "1 vote" but literally all the replies are the same thing lmao

238

Theo - t3.gg@theo·3d

If >50% of people press the blue button, everyone survives. Red button pressers always survive, but they’ll get a “red button presser” badge on their Twitter profile. What do you press?

350

575

110.3K

Lukas@quick007YT·3d

@aidenybai genq, what do you accomplish with agents running 48hrs+

187

Aiden Bai@aidenybai·3d

been running coding agents for long horizon (48h+) tasks. without deterministic + cheap verification to work against, agents cheat and fail. path to ultra long agents requires deterministic feedback

patagucci perf papi@kenwheeler

starts with a d ends with a eterminism

149

24.9K

Lukas@quick007YT·5d

@embirico steer normally, occasionally queue, but normally I need to test + iterate once it's done

Alexander Embiricos@embirico·5d

Do you prefer queuing or steering in Codex? What's your use case for each?

283

329

45.9K

Lukas@quick007YT·5d

@beffjezos I guess but if I'm already spending 25 dollars in opus tokens am I really not in a position to spend another few bucks to create a pr?

505

Beff (e/acc)@beffjezos·5d

Microsoft needs to start charging a bit per PR. Otherwise GitHub is effectively getting DDOSd by LLM code commits

Theo - t3.gg@theo

Github has been down for most of the day. I'm so tired of this. Never been so ready to move on.

894

105.8K

Lukas@quick007YT·5d

@ThePrimeagen i think this is the first tweet i've seen unironically use the paid partnership tag

ThePrimeagen@ThePrimeagen·Apr 27

Does Merge Cop finally catch the Diffler? Save the world from code related crimes?

242

47.5K

Lukas@quick007YT·Apr 27

@GergelyOrosz unsure if true, but the article seemed to suggest that Railway's protections for agents failed here. not storing backups separately is also bad. dev has the most blame here imo, but also if you expect Railway to have certain protections that it doesn't have, they're at fault too

Gergely Orosz@GergelyOrosz·Apr 27

Sucks for an AI agent to delete the prod DB - with no way to back it up - and risk the complete rental business. But the blame sits with the dev who decided to delegate decision making to the AI agent, and then not review actions, just YOLO it. Time for a blameful postmortem...

JER@lifeof_jer

x.com/i/article/2048…

193

118

1.9K

419.2K

Lukas@quick007YT·Apr 27

@CtrlAltDwayne i used to never have to think about rate limits, but with 5.5 I constantly notice them

Dwayne@CtrlAltDwayne·Apr 27

Codex limit resets in a couple of days. I never used to reach this level of usage in Codex, ever. Not sure if it's GPT-5.5 using more tokens or the fact I've been using it more since GPT-5.5 released. The Pro subscription is super generous. No way I'll hit anywhere near 0% in 2 days.

190

19.1K

Lukas@quick007YT·Apr 27

@rahulgs can you do 5.5 vs 5.4?

111

rahul@rahulgs·Apr 27

GPT-5.5 is ~39% cheaper than Opus 4.7, across merged PRs bucketed by diff size in Inspect. despite the higher output token cost, 5.5 is cheaper for input tokens (cache writes are free), more token efficient, and tokenizes the same text to fewer tokens

1.1K

135.1K

Lukas@quick007YT·Apr 27

@mehulmpt this benchmark is... questionable

570

Mehul Mohan@mehulmpt·Apr 27

There’s no way kimi k2.6 is better than gpt 5.5

Arena.ai@arena

GPT-5.5 by @OpenAI is now live in the Arena, landing across multiple leaderboards. Here’s how it ranks by modality:
- Code Arena (agentic web dev): #9, a strong +50pt jump over GPT-5.4
- Document Arena (analysis & long-content reasoning): #6, on par with Sonnet 4.6
- Text Arena: #7, Math #3, Instruction Following: #8
- Expert Arena: #5
- Search Arena: #2
- Vision Arena: #5
Strong, well-rounded performance, especially in Code (+50 pts vs GPT-5.4). Congrats to @OpenAI on the release. Full category breakdowns by modality in the thread.

354

73.8K

Lukas@quick007YT·Apr 27

@megagoose11 the way I think about it is the planet is giving you a gravity assist to fling you around it, then when you get far enough away its gravity pulls you back in. second is impossible because that gravity assist doesn't happen in that way

Goose@megagoose11·Apr 26

I have a question, why are elliptical orbits like this and not like this

251

2.2K

230.3K

Lukas@quick007YT·Apr 26

@aidenybai this model is really pmo

Aiden Bai@aidenybai·Apr 26

are you fucking serious gpt 5.5

15.5K

Lukas@quick007YT·Apr 26

@EthanLipnik yea.... i think it's thinking traces/talking ends up leaking into the output designs. very annoying!

832

Ethan Lipnik@EthanLipnik·Apr 25

Why does GPT always add banners explaining what the app is and does

Paulius 🏴‍☠️@0xPaulius

GPT 5.5 vs Opus diff is crazy sometimes -- this was the same Swift UI skill

English

1.1K

94.7K

Lukas@quick007YT·Apr 26

@vennictus now you just have to try out low 👀

vennictus@vennictus·Apr 26

@quick007YT yeah, that's what I have started doing now. Medium is more than enough for most tasks.

vennictus@vennictus·Apr 25

My quick review of GPT-5.5 on Codex Plus (so far): This thing is absolutely insane and sneaky. I’m not a “prompt and forget” type of user - I actively monitor what's happening while working on my personal projects. Up until now, I was running GPT-5.4 on xHigh and High. It was already tight - I’d usually get only about 3-3.5 hours of solid usage within the 5-hour window before hitting limits. Today I switched to 5.5 xHigh… and I completely burned through the entire quota in just one hour. At first I thought something was broken, so I stopped, did the math and checked the code. Turns out, in that single hour, 5.5 had written more than what 5.4 had given me in 2 * 5h sessions and managed to find serious flaws in its work. Part of me now wants the $100 plan and just let it rip!

613

39.3K

Lukas@quick007YT·Apr 26

@VictorTaelin @CharuruCha14310 chicken and egg scenario I feel like. also 5.4 and 5.5 seem to be getting larger and larger so probs isn't worth the time as well

Taelin@VictorTaelin·Apr 25

@CharuruCha14310 few people use it because it is bad!!!

2.2K

Taelin@VictorTaelin·Apr 25

Any speculation as to why Codex Spark is so bad? What is stopping OpenAI from serving a model as smart as Qwen 3.6 or Gemma 4 on Cerebras hardware?

Taelin@VictorTaelin

GPT 5.5 is much smarter than I thought
Yesterday, I did one-shots, coding, benchmarks, and was disappointed. Today, I did it all again, except via the API, which is now available. Results changed completely:
→ one-shot prompts went from bad to very good
→ excellent coding outputs, on both pi and holefill
→ benchmarks jumped, and now GPT *dominates*
I don't know what happened, I suppose there is something wrong with my Codex. In any case, truth is this model is very smart. It obliterated my benchmark, which is crazy because some of these problems were meant not to be solved. I'll need much harder tasks.
I also fixed 2 bugs that affected some providers:
→ added a retry for lost connection
→ removed the timeout limit
DeepSeek and Kimi wanted to spend more than 1 hour on my prompts, so I let them. Their results are much better now. Kimi K2.6 almost reaches Sonnet 4.6, although much slower. Also this shows my points from last post were wrong.
Again: this is a new vibe-coded bench, I'm focused on other things, so expect bugs and don't over-read this! GLM 5.1, Gemma, Grok are not updated yet.

249

41.4K

Lukas@quick007YT·Apr 26

@Grok440 @thechosenberg the person who did this obviously has noble intentions, but if my doctor was relying on a vibecoded drug database i would have concerns

104

Lukas@quick007YT·Apr 26

@Grok440 @thechosenberg I do not trust claude haiku with this task. if someone with proper ai knowledge and the right models and the right amount of qa did this I could see it working, but I doubt this person is that

855

rosey🌹@thechosenberg·Apr 25

I guarantee claude hallucinated over the course of 660,000 pages

182

5.2K

377.3K
