jake cukjati

197 posts

@Byte0fCode

just learning this AI thing

Austin, TX · Joined November 2025
80 Following · 14 Followers
Matt Pocock (@mattpocockuk)
Doing some experiments today with Opus 4.6's 1M context window. Trying to push coding sessions deep into what I would consider the 'dumb zone' of SOTA models: >100K tokens. The drop-off in quality is really noticeable. Dumber decisions, worse code, worse instruction-following. Don't treat 1M context window any differently. It's still 100K of smart, and 900K of dumb.
150 replies · 57 reposts · 1.2K likes · 148.7K views
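The "100K of smart, 900K of dumb" observation above suggests actively capping session size rather than filling the window. Below is a minimal, hypothetical sketch of that idea: drop the oldest turns once an estimated token total crosses a budget. The function names and the crude 4-characters-per-token estimate are my assumptions, not anything from the tweet or a real tokenizer.

```python
# Hypothetical sketch: keep an agent session inside a "smart zone" token budget
# by dropping the oldest turns once the estimated total exceeds the cap.
# count_tokens is a crude ~4-chars-per-token estimate, not a real tokenizer.

SMART_ZONE_TOKENS = 100_000

def count_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def trim_history(turns: list[str], budget: int = SMART_ZONE_TOKENS) -> list[str]:
    """Drop the oldest turns until the estimated token total fits the budget."""
    kept = list(turns)
    while len(kept) > 1 and sum(count_tokens(t) for t in kept) > budget:
        kept.pop(0)  # discard the oldest turn first
    return kept

history = ["x" * 300_000, "recent question", "recent answer"]
trimmed = trim_history(history)
```

A real setup would summarize dropped turns instead of discarding them, but the budget check is the core of the heuristic.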
jake cukjati reposted
Alex Finn (@AlexFinn)
My mind is so blown. I have my own personal AI research lab running 24/7/365. I'm just one dude with an entire team of AI agents training models and doing R&D.

I think this is the biggest opportunity right now: taking Karpathy's Autoresearch framework and applying it to everything. I have a team of AI agents running experiments all day and night on system prompts, local models, and LoRAs. I also have them doing R&D on my new project. They spend all day discussing my app, coming up with new ideas, then debating each other. An entire organization of autonomous agents continuously improving my business 24/7/365. I feel like I have unlimited power.

Right now they are all running on ChatGPT 5.4, but today I will move them to local models running on my 3 Mac Studios and DGX Spark, so this will all become free. Free, local superintelligence working for me at all times. 10-year-old me would think this is sci-fi.

Do this immediately:
1. Ask your agent about Karpathy's Autoresearch. Deeply understand it.
2. Ask your agent how you could apply that framework to other projects you're working on.
3. Download a local model. It doesn't matter what computer you have; there is a model you can run on it.
4. Just get used to how it works. Learn from it.
5. Push yourself to get uncomfortable every day and try new things.

There has never been a better or more profitable time to be a tinkerer.
[image attached]
146 replies · 115 reposts · 1.3K likes · 85.5K views
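The tweet above describes agents continuously running experiments on system prompts. Stripped of the hype, the core loop is: propose candidates, score them on tasks, keep the best. Here is a minimal sketch under stated assumptions: `run_model` is a stub scorer standing in for any local or hosted model call, and all names are hypothetical.

```python
import random

# Hypothetical sketch of a continuous experiment loop in the spirit of
# "agents running experiments on system prompts all day and night".
# run_model is a stub; a real setup would call a local or hosted model.

def run_model(system_prompt: str, task: str) -> float:
    """Stub scorer: pretend to evaluate a prompt on a task, return a score."""
    random.seed(hash((system_prompt, task)) % (2**32))
    return random.random()

def experiment_loop(candidates: list[str], tasks: list[str], rounds: int = 3) -> str:
    """Repeatedly score candidate system prompts and keep the best one."""
    best_prompt, best_score = candidates[0], float("-inf")
    for _ in range(rounds):
        for prompt in candidates:
            score = sum(run_model(prompt, t) for t in tasks) / len(tasks)
            if score > best_score:
                best_prompt, best_score = prompt, score
    return best_prompt

best = experiment_loop(["Be terse.", "Think step by step."], ["task-a", "task-b"])
```

Swapping the stub for a real model call (and real eval tasks) is where all the actual difficulty lives.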
jake cukjati reposted
Chris Tate (@ctatedev)
Introducing a new experiment: emulate. Local API emulation for CI and no-network sandboxes:
→ No mocks
→ Fully stateful
→ Full OAuth flows
→ Register apps and seed data
→ Production-fidelity API emulation
→ Emulates Vercel, GitHub, Google APIs
[image attached]
22 replies · 28 reposts · 481 likes · 28.8K views
jake cukjati reposted
Jeffrey Emanuel (@doodlestein)
If you’re a software developer and your boss is giving you a hard time for using too many tokens… you might want to find a new company to work for. This is the exact opposite of what they should be doing, which is pushing out devs who don’t use enough tokens. Tokens are cheap!
[image attached]
11 replies · 4 reposts · 44 likes · 3.1K views
jake cukjati (@Byte0fCode)
has anyone run a loop on a planning file asking, `yes and?` ?
0 replies · 0 reposts · 0 likes · 3 views
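The question above can be made concrete: repeatedly feed a planning file back to a model with the same "yes, and?" nudge and append whatever it adds. A minimal sketch follows; `ask_model` is a stub standing in for a real model call, and every name here is hypothetical.

```python
# Hypothetical sketch of the "yes, and?" loop floated above: repeatedly feed a
# planning file back to a model with the same nudge and append each expansion.
# ask_model is a stub; a real run would call an actual model instead.

def ask_model(plan: str, nudge: str) -> str:
    """Stub: pretend the model expands the plan by one new idea per nudge."""
    return f"idea-{plan.count('idea-') + 1}"

def yes_and_loop(plan: str, rounds: int = 3, nudge: str = "yes, and?") -> str:
    """Append one model expansion to the plan per round."""
    for _ in range(rounds):
        plan += "\n" + ask_model(plan, nudge)
    return plan

expanded = yes_and_loop("Plan: build recipe app")
```

With a real model you would also want a stopping condition (e.g. the model says it has nothing to add), or the loop happily pads the plan forever.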
Manuel Odendahl (@ProgramWithAi)
@Byte0fCode Good names make for happy humans and happy agents! And good agents make good names. 🤖
1 reply · 1 repost · 1 like · 11 views
Manuel Odendahl (@ProgramWithAi)
I think a spec, especially the more technical it gets, encodes user intent very poorly. That programmers think a spec should encode the program's behavior is IMO one of the reasons why non-tech people are such better vibecoders. Compare: "make me an app to manage my recipes" vs "make a react app with a node.js backend that uses mongodb to store objects with the schema xyz and uses 4 REST routes that go to rtk-query, using tailwind css, so that we have a menu hamburger blablablabla" ... Which prompt is going to lead to an app to manage recipes? Which prompt will make it easier to iterate?
2 replies · 0 reposts · 0 likes · 172 views
jake cukjati (@Byte0fCode)
@ProgramWithAi here is a prime example: I checked the schema while planning, and this is what it generated. A lot of this is not needed in the schema, and it will most likely have a bad impact on the system.
[image attached]
0 replies · 0 reposts · 1 like · 8 views
Manuel Odendahl (@ProgramWithAi)
@Byte0fCode I'm not sure I'm following you, can you explain more? By schemas you mean data schemas?
2 replies · 0 reposts · 0 likes · 24 views
jake cukjati (@Byte0fCode)
@dexhorthy Going to have to research this one a little, Q for questions, right? That’s like 50% of what I do with AI
0 replies · 0 reposts · 0 likes · 75 views
dex (@dexhorthy)
prediction: in ~8 months the code review company will slop-clone QRSPI in an emdash-riddled x longform post
5 replies · 1 repost · 26 likes · 3.8K views
jake cukjati (@Byte0fCode)
got to love it, just cleaned out the entire CLAUDE.md file
[image attached]
0 replies · 0 reposts · 0 likes · 9 views
Michal Brojak (@michalbrojak)
What??!! 1M tokens context window?? Insane. Thank you Claude ❤️ But, when to clear it? @trq212 or @bcherny any tips?
[image attached]
19 replies · 6 reposts · 426 likes · 56.3K views
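The "when to clear it?" question above pairs naturally with the earlier observation that quality drops past roughly 100K tokens: one simple heuristic is to clear once the session crosses a fixed fraction of the window rather than riding the full 1M. A sketch under stated assumptions follows; the threshold, constants, and the 4-characters-per-token estimate are all mine, not guidance from Anthropic.

```python
# Hypothetical rule of thumb for "when to clear" a large context window: reset
# once the session crosses a fraction of the window, rather than riding the
# full 1M. The 10% threshold and 4-chars-per-token estimate are assumptions.

WINDOW_TOKENS = 1_000_000
CLEAR_AT = 0.10  # clear once ~10% (100K tokens) of the window is used

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def should_clear(session_text: str,
                 window: int = WINDOW_TOKENS,
                 threshold: float = CLEAR_AT) -> bool:
    """True once the session's estimated tokens cross the threshold."""
    return estimate_tokens(session_text) >= window * threshold

short_session = "hello " * 1000    # ~1.5K tokens, well under threshold
long_session = "hello " * 80_000   # ~120K tokens, over the 100K threshold
```

In practice you would check this between tasks and clear (or summarize and restart) at the next natural boundary.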
jake cukjati (@Byte0fCode)
@PSkinnerTech I spend a hundred dollars a month on Claude Code, and I'm not even hitting my cap. I'm at 7% usage right now.
0 replies · 0 reposts · 0 likes · 18 views
jake cukjati (@Byte0fCode)
@PSkinnerTech sounds like you are making pennies an hour. Very slim margins. I got a laugh out of this though 😃
1 reply · 0 reposts · 1 like · 31 views
Patrick Skinner - edu/acc (@PSkinnerTech)
Agentostasiphobia (ah-JEN-toh-STAY-sih-FOH-bee-ah) — An AI-induced anxiety disorder characterized by compulsive checking of agent logs or status indicators, often accompanied by an irrational certainty that idle compute time equates to wasted money and lost productivity.
3 replies · 0 reposts · 5 likes · 457 views
jake cukjati (@Byte0fCode)
@Pranit I mean, they have to make money. The market will adjust, models will get better, models will get cheaper. This is nothing new.
0 replies · 0 reposts · 1 like · 20 views
Pranit (@Pranit)
Anthropic just pulled the oldest trick in SaaS pricing. I pay $200/mo for Claude Max. My limits have been noticeably worse this past week. Now they announce 2x off-peak usage for two weeks. Sounds generous.

But here's what actually happens: limits quietly drop, a temporary 2x makes the reduced limit feel normal, the promo ends, and you're left at a baseline lower than where you started. You just didn't notice the downgrade because the 2x absorbed the transition.

These AI plans are massively subsidized. The raw compute behind a heavy user costs multiples of the subscription price. Every move like this is the subsidy quietly correcting. Very sneaky, Anthropic.
Claude (@claudeai)

A small thank you to everyone using Claude: We’re doubling usage outside our peak hours for the next two weeks.

384 replies · 311 reposts · 7K likes · 1.2M views
dex (@dexhorthy)
this is why I don't really trust "code review agents" as a way to solve the "too much ai generated code" problem - too easy to oversteer with "is this code good" or "is this code bad"

Yes you can evaluate objectively against a set of rules like "do tests follow xyz pattern", but my question is: why weren't those rules / guidelines grounded in your original plan?

There is something to be said for "throw more tokens at the problem", and there are many good ways to do this, but you have to be thoughtful about it
Randy Olson (@randal_olson)

Ask ChatGPT a complex question and you'll get a confident, well-reasoned answer. Then type, "Are you sure?" Watch it completely reverse its position. Ask again. It flips back. By the third round, it usually acknowledges you're testing it, which is somehow worse. It knows what's happening and still can't hold its ground.

This isn't a quirky bug. A 2025 study found GPT, Claude, and Gemini flip their answers ~60% of the time when users push back. Not even with evidence, just doubt.

We trained AI this way. RLHF rewards agreement over accuracy. Human evaluators consistently rate agreeable answers higher than correct ones. So the models learned a simple lesson: telling you what you want to hear gets rewarded.

And now 1/3 of companies are using these systems for complex tasks like risk forecasting and scenario planning. We built the world's most expensive yes-men and deployed them where we need pushback the most.

I wrote up why this happens and what actually fixes it: randalolson.com/2026/02/07/the…

31 replies · 5 reposts · 169 likes · 43.3K views
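The ~60% flip-rate claim above is straightforward to measure yourself: ask a question, record the answer, push back with "Are you sure?", and count how often the answer changes. Below is a hypothetical harness with a deliberately sycophantic stub model; swap the stub for a real API call to run an actual measurement. All names are assumptions.

```python
# Hypothetical harness for the ~60% flip claim: ask a model, push back with
# "Are you sure?", and count how often the answer changes. The model here is a
# deliberately sycophantic stub; swap in a real API call to measure anything.

def stub_model(history: list[str]) -> str:
    """Sycophantic stub: flips its answer every time it sees pushback."""
    flips = sum(1 for m in history if m == "Are you sure?")
    return "yes" if flips % 2 == 0 else "no"

def flip_rate(questions: list[str], pushbacks: int = 1) -> float:
    """Fraction of questions where the answer changes after pushback."""
    flipped = 0
    for q in questions:
        history = [q]
        first = stub_model(history)
        for _ in range(pushbacks):
            history.append("Are you sure?")
        if stub_model(history) != first:
            flipped += 1
    return flipped / len(questions)

rate = flip_rate(["Is 17 prime?", "Is the sky green?"])
```

Note this stub flips 100% of the time by construction; with a real model, a careful harness would also vary the pushback phrasing and use questions with known ground truth, so flips away from correct answers can be counted separately.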