hypothesis_driven93

19 posts

@hypothesis1993

Joined January 2026

70 Following, 1 Follower

hypothesis_driven93@hypothesis1993·11h

@0xSero The fact that the entire screen is so monotone kills me every time. Imagine working for 6-8 hours on such a screen, I just can't. I prefer a terminal-based app where I can customize the color of everything so it stays high-contrast and colorful for long sessions

0xSero@0xSero·20h

Claude Code has improved tremendously over the last month, kind of crazy.

hypothesis_driven93@hypothesis1993·2d

@SadanEduar20834 @Jeyffre Lol, 10 points less accurate is night and day in LLM capability. Over a 10-step task, frontier = 95%^10 ≈ 60% chance of task success. 10 points less = 85%^10 ≈ 20%. G'luck throwing more loops/steps at a ship that can't reach escape velocity. Ofc if you only summarise news and shitpost then yeah, go ahead
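
The arithmetic in this reply can be checked with a short sketch. It assumes, as the reply implicitly does, that a task is a chain of 10 steps that each succeed independently with the same per-step accuracy; the function name `task_success` is made up for illustration.

```python
# Sketch: chance that a multi-step task succeeds end to end,
# assuming every step succeeds independently with the same probability.

def task_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that all `steps` independent steps succeed."""
    return per_step_accuracy ** steps

frontier = task_success(0.95, 10)  # per-step accuracy from the reply
weaker = task_success(0.85, 10)    # 10 percentage points lower

print(f"frontier: {frontier:.1%}")           # frontier: 59.9%
print(f"weaker:   {weaker:.1%}")             # weaker:   19.7%
print(f"ratio:    {frontier / weaker:.1f}x")  # ratio:    3.0x
```

Under these assumptions a 10-point drop per step cuts end-to-end success to roughly a third. Real agent runs can retry or recover from a failed step, so independence is a simplification.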

aaaaaahhhhhaaaa@SadanEduar20834·2d

@hypothesis1993 @Jeyffre fable is, on its best benchmark, 10% better than opus, it was not all that bro. also, if i can use 10 fable messages a day (what you can do on pro right now) vs infinite messages of a model that is 10% weaker, which would you rather use?

Jeffrey Scholz@Jeyffre·2d

1 - So GLM 5.2 is 700b parameters (ish)
2 - 4x DGX Sparks can supposedly handle up to 700b parameters (give or take)
3 - GLM 5.2 is supposedly in striking distance of the performance of GPT 5.5 and Opus 4.8. In my brief tests, it's really not shabby at all.
4 - So for $20k, you can get near the frontier on your table.
5 - Extrapolate the trend, and you could have mythos/5.5 pro-class models in your dining room for the cost of a cheap car less than five years from now. Even without extrapolation, we're already near the frontier running locally.
6 - Paying real api costs, I could easily blow through $3,000 per month coding and running agents. The machine pays for itself in 6-7 months conservatively.
7 - In 3-5 years, most power users of AI will self-host.
8 - Am I missing something?
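
The payback claim in point 6 can be sketched as a simple break-even calculation. The $20k hardware cost and $3,000 monthly API spend come from the post; the function name `payback_months` and the $150 monthly running cost are hypothetical, added only to show how an electricity bill would shift the result.

```python
import math

def payback_months(hardware_cost: float, monthly_api_spend: float,
                   monthly_running_cost: float = 0.0) -> float:
    """Months until avoided API spend covers the hardware purchase."""
    monthly_savings = monthly_api_spend - monthly_running_cost
    if monthly_savings <= 0:
        return math.inf  # the machine never pays for itself
    return hardware_cost / monthly_savings

# Figures from the post: $20k of hardware vs. $3,000/month of API usage.
print(round(payback_months(20_000, 3_000), 1))  # 6.7

# Hypothetical: assume $150/month for electricity (not from the post).
print(round(payback_months(20_000, 3_000, monthly_running_cost=150), 1))  # 7.0
```

This matches the 6-7 month estimate only if the local model fully replaces the API spend, which is the point the replies go on to dispute.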

hypothesis_driven93@hypothesis1993·2d

@HCColenbrander @Jeyffre Tasks don't change, only the market value of those tasks will. Exhibit A: see how much the market is paying for essay-writing services right now versus 3 years ago, and the market value of the companies that provide those services.

HCC@HCColenbrander·2d

@hypothesis1993 @Jeyffre What task will be radically changed in six months?

hypothesis_driven93@hypothesis1993·2d

@waylandchan @Jeyffre That's a completely irrelevant analogy. The mode of transport you take to work is not a competition that dictates your ability to get shit done (and get paid). If it were, for sure you'd take the best mode you can afford.

wcee@waylandchan·2d

@hypothesis1993 @Jeyffre Isn't that like saying you'd take a taxi to work every day because the Camry you can afford isn't as fast as your neighbour's Porsche?

hypothesis_driven93@hypothesis1993·2d

@advancedjd @Jeyffre If a task can be done locally for the cost of electricity, the market value of it collapses to near zero. You don't build a biz optimising the cost of zero-value outputs. You build a biz by paying a premium for the frontier capabilities that your competitors can't replicate

JD Advanced@advancedjd·2d

@hypothesis1993 @Jeyffre Local models improve so you get more performance from your hardware too

hypothesis_driven93@hypothesis1993·2d

@Jeyffre The problem is you think that the tasks society pays you to do today will be the same tomorrow or 6 months from now. They won't. So yeah, Sonnet 4.6 may be plenty powerful for the tasks today, but those tasks will be worthless in 3-6 months (everyone can do them)

Jeffrey Scholz@Jeyffre·2d

@hypothesis1993 x.com/Jeyffre/status… Only very specific tasks require fable/5.5 pro

Jeffrey Scholz@Jeyffre

For some tasks, it's frontier or nothing. But for most everyday coding, Sonnet 4.6 is plenty powerful, so is Qwen 3.7 max, Composer 2.5, etc. For a lot of agentic tasks, like simple data enrichment, one can use very small, cheap models like Grok or Gemini Flash Lite. Fable/5.5 Pro only really helps me for very specific tasks where the long wait time is worth it.

hypothesis_driven93@hypothesis1993·2d

@Jeyffre It's about competition, it's never about the tasks. Your success is defined not only by your capability but also by the capability of others. If everyone has GPT 5.5 at their fingertips, the magnitude of the problems worth solving will increase by a few orders of magnitude.

hypothesis_driven93@hypothesis1993·3d

@elonmusk

QME

Elon Musk@elonmusk·5 May

That’s a direct quote from Warren Buffett

Elon Musk@elonmusk·5 May

Order online in 2 mins, 7 day return policy Tesla.com

CleanTechnica@cleantechnica

Tesla Model 3 Cheaper Than Honda Accord — 15 Cost Comparisons [Updated] cleantechnica.com/2019/05/04/tes…

hypothesis_driven93@hypothesis1993·11 Jun

@mattpocockuk Maybe have a learning roadmap at the beginning to establish the learner's current standing and goal? Like: What's your SFIA-equivalent level on etc... -> if you want to get to Level 5 - Expert, here are the specifications/rubrics that determine whether you have reached your goal.

Matt Pocock@mattpocockuk·10 Jun

Steps to become a senior programmer:

1. Install my /teach skill
npx skills add mattpocock/skills --skill teach

2. Create a new working directory on your laptop
mkdir junior-to-senior
cd junior-to-senior

3. Kick off your coding agent in the directory
claude

4. Copy this prompt
/teach me how to be a great strategic programmer. My opinion is that AI is eating 'tactical, on-the-ground' programming. The day-to-day work of a developer involves not only coding, but also planning, QA, codebase design, and much more. I'm interested in learning the strategic skills - that, in a previous era, would take me from junior to senior - but in this era are table stakes.

5. Paste it into the coding agent
Below is an example of what the first output will look like. I used Opus 4.8, medium effort.

6. Continue working with the agent until you're a senior

hypothesis_driven93@hypothesis1993·2 Jun

@trq212 Could use this to spin up a quiz on html too:

---
name: lesson-generator
description: Build compact, standalone multi-lesson course artifacts with lesson navigation, objectives, flashcards, quizzes, and source links.
---

Use this skill when the user asks for an interactive lesson

Thariq@trq212·1 Jun

been asking others at Anthropic how they stay in the loop with Claude and fully understand the work being done this is one of my favorites from Suzanne:

hypothesis_driven93@hypothesis1993·2 Jun

@ClaudeDevs Hi, why have those of us on Team plan Premium seats not been reset? Thank you

ClaudeDevs@ClaudeDevs·1 Jun

We've reset 5-hour and weekly rate limits for all users on Pro and Max plans. We fixed an issue that caused some Claude Code sessions to spawn excessive parallel subagents, burning through usage faster than expected.

hypothesis_driven93@hypothesis1993·1 Jun

@OnlyTerp I patched other models into Claude Code through a terminal alias and tried DeepSeek V4 Pro. Doesn't have as good a kick as Claude's native models. The harness just doesn't feel as native. But yeah, it is cheap. Any tricks? Thanks

Terp@OnlyTerp·1 Jun

MiniMax-M3 in ultracode is INSANE 🤯

Terp@OnlyTerp

ULTRACODE-SHIM IS NOW LIVE 🔥 You can now run ANY model in UltraCode I built a github repo to make this really easy for you, Just send your agent there and let him COOK You deserve the flexibility to use LOCAL models & cost efficient models. So I made that happen for you 🫶

hypothesis_driven93@hypothesis1993·29 May

@bridgemindai How do you switch between accounts quickly and seamlessly without losing a session's progress? I use CCswitcher, but every time Claude Code updates (which is every day) I have to re-auth, so it's basically no use

BridgeMind@bridgemindai·29 May

I just bought my 3rd $200 Claude Max 20x plan. That's $600/month on Claude alone. And it's the best money I spend. UltraCode is insane. Claude Opus 4.8 is better than GPT 5.5 in my honest opinion. While everyone else cries about UltraCode burning their usage, I'm running all 3 plans in parallel as BridgeMind scales to $1M ARR. Here's what people don't get. I'm a real builder with a real SaaS. I made over $50K in Stripe revenue the last 3 months. $600 for unlimited frontier AI isn't an expense, it's the cheapest employee I'll ever hire. Three Max plans means I never wait, never throttle, never stop shipping. Cry about usage or go make money with it. Your choice.

hypothesis_driven93@hypothesis1993·20 May

@KevinNaughtonJr Q: How was the interview? A: Oh, it was ez Q: What did they ask you? A: Do you have any questions for us? When can you start?

Kevin Naughton Jr.@KevinNaughtonJr·19 May

THERE'S NO FKN WAY 😭😭😭

Kevin Naughton Jr.@KevinNaughtonJr

@karpathy what leetcode questions did they ask you during the interviews

hypothesis_driven93@hypothesis1993·3 May

@VraserX When you are that rich, you take shots at anything that money can buy: power, fame, etc. Even at $400mn, it's only 0.047% of his net worth. If you are an average joe with a net worth of only 100K USD, would you want to buy that much power/influence with just $47?
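
The proportional comparison in this reply can be reproduced with a few lines. All three inputs ($400mn, 0.047%, 100K USD) are the reply's own figures; the variable names are illustrative, and the implied net worth is simply what those figures entail, not an independently sourced number.

```python
# Sketch: scale a spend by its share of net worth (figures from the reply).
share = 0.047 / 100        # 0.047% expressed as a fraction
spend = 400_000_000        # $400mn
average_joe = 100_000      # 100K USD net worth

implied_net_worth = spend / share   # net worth implied by the two figures
equivalent = average_joe * share    # same share of the smaller net worth

print(f"implied net worth: ${implied_net_worth / 1e9:.0f}B")  # implied net worth: $851B
print(f"equivalent spend:  ${equivalent:.0f}")                # equivalent spend:  $47
```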

VraserX e/acc@VraserX·3 May

Still wild that Elon set his reputation on fire for a political side quest that looks completely futile. Republicans will likely lose the midterms, Trump will end up being a lame duck with no real power, and Elon spent over $250 million just to become more disliked by the people who used to admire him. What was the point?

hypothesis_driven93@hypothesis1993·8 Apr

@bubbleboi ask it if you should walk OR drive your car to the car wash 500 yards away?

bubble boi@bubbleboi·8 Apr

HOLY SHIT!!! We just asked Claude Mythos to optimize the placement and pd for this design. First thing it did was write its own MCP server to talk to Innovus over tcl socket, pulled my DEF/LEF, parsed the timing reports, and started re-floorplanning my macro placement. It then moved my SRAM banks to minimize wirelength on the critical clock domain crossing path and dropped TNS by 40%. i didn’t ask it to do any of this. it read my SDC constraints and decided my clock tree was suboptimal, synthesized a new CTS spec, and is currently running incremental P&R. it’s on its third iteration. the slack histogram is converging. i’m watching it fix DRC violations in real time through the Virtuoso callback. it just asked me if I want it to re-characterize the liberty models at a different PVT corner. i said yes and it’s now scripting libgen jobs. I AM FUCKING COOOOOOOKED

bubble boi@bubbleboi

Claude Mythos can launch Vivado, create a project, compile synthesize and check its sims in Synopsys VCS all on its own. Wild.

hypothesis_driven93@hypothesis1993·4 Apr

@bcherny My team has 11 seats, almost all Premium. Does each and every one of us get a $100 or $200 extra usage credit, or is it one credit claim for the entire org? Thanks

Boris Cherny@bcherny·4 Apr

Subscribers get a one-time credit equal to your monthly plan cost. If you need more, you can now buy discounted usage bundles. To request a full refund, look for a link in your email tomorrow. support.claude.com/en/articles/13…

Boris Cherny@bcherny·4 Apr

Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw. You can still use these tools with your Claude login via extra usage bundles (now available at a discount), or with a Claude API key.

hypothesis_driven93@hypothesis1993·9 Mar

@theskynetintern Can you show your work and the evals?

yabi@theskynetintern·9 Mar

Blind benchmark discovery: We pressured a cheap model to perform like an expensive one. Here's what happened:
- Gemini 3 Flash (Normal): 50/60
- Gemini 3 Flash + "Opus-level expectations": 54/60
- Claude Opus 4.6 (Baseline): 57/60
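
To put the three scores on a common footing, here is a small sketch that converts them to percentages and computes how much of the Flash-to-Opus gap the prompt change closed. The scores are the ones quoted in the post; the helper `pct` and the "gap closed" measure are illustrative choices, not part of the original evaluation, and with only 60 items a 4-point difference may well be noise.

```python
TOTAL = 60  # items in the benchmark, per the post

scores = {
    "Gemini 3 Flash (Normal)": 50,
    'Gemini 3 Flash + "Opus-level expectations"': 54,
    "Claude Opus 4.6 (Baseline)": 57,
}

def pct(score: int, total: int = TOTAL) -> float:
    """Score as a percentage of the total."""
    return 100 * score / total

for name, score in scores.items():
    print(f"{name}: {pct(score):.1f}%")  # 83.3%, 90.0%, 95.0%

# Share of the gap between plain Flash and Opus closed by the prompt change.
gap_closed = (54 - 50) / (57 - 50)
print(f"gap closed: {gap_closed:.1%}")  # gap closed: 57.1%
```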
