Jeremy Huffman

617 posts

@jhuffman42

Charlotte, NC · Joined April 2012

87 Following · 53 Followers

Jeremy Huffman@jhuffman42·4d

@JuliaEMcCoy Subscribed. Please deliver more proclamations of wisdom without reference to supporting evidence.


Julia McCoy@JuliaEMcCoy·4d

Stop learning to code. Start learning to think. Machines code now. They’ll never think like you.


Jeremy Huffman@jhuffman42·4d

@evrgn11112231 @RedwoodFounders OpenClaw on a Mac mini is not using local models at all in most cases. They give it the run of a Mac mini because it’s easier than sandboxes. Qwen3.6 9B is as good as GPT-4o and runs reasonably well on my M2 MacBook Air. But we don’t know the ceiling for a model that small.


Evergreen@evrgn11112231·5d

@RedwoodFounders I’m actually more focused on is there a way where you get mass enterprise deployment that cost effectively bypasses clouds (know people are doing openclaw stuff with Mac minis but don’t know what the capabilities are like or if is economic).


Evergreen@evrgn11112231·5d

Tech people smarter than me: How realistic is it that in the reasonably near future we get an open source model frozen in time around current Opus levels capable of running a harness locally on your desktop that feels like running Claude Code with a max plan? What are the barriers to this over 3-5-10 years?


Jeremy Huffman@jhuffman42·6d

@SullyOmarr And yet today I can run Qwen3.6 9B on my MacBook Air and it is as good as GPT-4o was.


Sully@SullyOmarr·6d

the average person wont be able to afford ai tools very soon unless there's a breakthrough, the models are not gonna get cheaper they'll say "youre getting more intelligence per dollar" all while you pay 1k,5k+/mo to use the best models permanent underclass is kinda real


Jeremy Huffman@jhuffman42·6d

@henrythe9ths Everyone has the same job title - Member of Technical Staff. There is no way to determine their role based on that title. At least some of them are leading large teams.

Henry Shi@henrythe9ths·Apr 28

Something strange is happening in tech. CTOs of billion dollar companies are quitting to take IC roles at Anthropic. Workday CTO -> MTS (Mar 2026) You[.]com CTO -> MTS (Mar 2026) Instagram CTO -> MTS (Jan 2026) Box CTO -> MTS (Dec 2025) Super[.]com CTO -> MTS (July 2025) Adept AI CTO -> MTS (Jan 2025) The mission is that real.

Jeremy Huffman@jhuffman42·Apr 29

@Mappletons If it’s not uptime, by what criteria should we judge their performance? The issues didn’t start this year.

Maggie Appleton@Mappletons·Apr 28

I don't work on reliability & scaling at GitHub, but the people who do aren't bad at their jobs. They're dealing with unprecedented scale from agents. It's easy to shit on GitHub from the outside if you're not in charge of 30X-ing capacity within a few months. Have some grace.

Mario Rodriguez@mariorod1

Being the foundation for millions of developers means our bar must be higher for availability, reliability, and security. I’m sorry it’s been a rocky stretch at GitHub. We know we need to do better. Today we published an update on two recent incidents: one on April 23 involving merge queue behavior, and one on April 27 affecting pull requests, issues, projects, and search-backed experiences. We’re taking this seriously. We’re listening, and you have my commitment that we’ll communicate more frequently about the work underway to improve reliability and scale GitHub for what comes next. github.blog/news-insights/…

Jeremy Huffman@jhuffman42·Apr 29

@kpertsev @JustJake @ngeloxyz I have no idea what a clanker is, and you have no idea what a distributed transaction is, so let’s just leave it at that.

Kirill Pertsev@kpertsev·Apr 28

i have no idea how railway works but i bet money it’s a distributed system. with a very elaborate and ingenious state management. you can prevent the agent from doing “unlock” API call. various ways, from separate token to oauth flow. do you know that your toy terraform that creates a single instance with your website is a distributed system with a VERY badly designed state management? i tell you, clankers are bad at infra. for a very good reason.

Jake@JustJake·Apr 27

x.com/i/article/2048…

Jeremy Huffman@jhuffman42·Apr 28

@lifeof_jer @sean_j_roberts @AntoniGusto @SemeadorDisc @joshorrom @Plenum0z This is the equivalent of yolo or "dangerously-skip-permissions", in Cursor Settings -> Agent tab:

JER@lifeof_jer·Apr 28

@sean_j_roberts @AntoniGusto @jhuffman42 @SemeadorDisc @joshorrom @Plenum0z We are investigating ... we do not know what YOLO mode is. I have never heard of it, and we would never blanket-authorize curl commands, so we are trying to figure out how this happened on the Settings side of Cursor. If you think you're safe, you're not.

JER@lifeof_jer·Apr 25

x.com/i/article/2048…

Jeremy Huffman@jhuffman42·Apr 28

@AntoniGusto @sean_j_roberts @lifeof_jer @SemeadorDisc @joshorrom @Plenum0z It's in Cursor Settings -> Agent -> "Command Allowlist". There is also an "Auto-Run mode" option that can be set to allow everything, or everything in a sandbox.

MrGusto@AntoniGusto·Apr 28

@sean_j_roberts @lifeof_jer @jhuffman42 @SemeadorDisc @joshorrom @Plenum0z I still feel I did not get an answer to how what an agent can or can not do is controlled in cursor… where do you set or control this? To my understanding you have 0 control over this.
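
To make the allowlist point in this exchange concrete, here is a minimal Python sketch of a command allowlist that matches only on the executable name. It is not Cursor's implementation: `is_allowed`, the example allowlist, and the `example.invalid` URL are hypothetical and exist only for illustration. The sketch assumes the check looks at the first token of the command and nothing else.

```python
import shlex
from typing import Iterable


def is_allowed(command: str, allowlist: Iterable[str]) -> bool:
    """Return True if the command's executable name is on the allowlist.

    Hypothetical check: only the first token is inspected, so every
    argument combination of an allowed executable is accepted.
    """
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] in set(allowlist)


# Hypothetical allowlist that blanket-authorizes curl.
ALLOWLIST = ["ls", "cat", "curl"]

# A read-only request and a destructive request look identical to the check.
read_only = "curl https://api.example.invalid/volumes"
destructive = "curl -X DELETE https://api.example.invalid/volumes/123"
```

Under these assumptions both `is_allowed(read_only, ALLOWLIST)` and `is_allowed(destructive, ALLOWLIST)` return `True`: a name-based allowlist entry authorizes the executable, not what it is asked to do.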

Jeremy Huffman@jhuffman42·Apr 28

@kpertsev @JustJake @ngeloxyz This is not even a distributed system we’re talking about. It’s just Railway. If there are two pushes in the API, the agent can do them both. If there is a manual approval process how does Terraform work? Please be specific.

Kirill Pertsev@kpertsev·Apr 28

one push to release the lock, second push is to actually destroy. all go through manual approval process. or, alternatively through another model which is not engaged in the current shenanigans. it’s implemented quite widely on AWS. btw, clankers suck at infra for obvious reasons, i explained earlier. better rtfm for yourself.

Jeremy Huffman@jhuffman42·Apr 28

@kpertsev @JustJake @ngeloxyz "Two phase commit" is a distributed transaction protocol. It has nothing to do with "delete protection". Please tell me how IaC would work in your "architecture"? Why is no major cloud vendor implementing it?

Kirill Pertsev@kpertsev·Apr 27

@JustJake @ngeloxyz the answer to that is “two phase commit” aka “deletion protection”. known for decades.
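
The two terms being argued over here can be contrasted in a short Python sketch. Everything in it is hypothetical (`Resource`, `Participant`, `two_phase_commit`); it models neither Railway's nor any cloud vendor's API. The first part treats deletion protection as a flag on a single resource, which a caller holding the same credentials can clear and then delete. The second part is a toy two-phase commit, where a coordinator collects votes from several participants and commits only if all of them vote yes.

```python
from dataclasses import dataclass
from typing import List


class ProtectedError(Exception):
    """Raised when a delete is attempted on a protected resource."""


@dataclass
class Resource:
    # Hypothetical resource with a deletion-protection flag.
    name: str
    deletion_protection: bool = True
    deleted: bool = False

    def delete(self) -> None:
        if self.deletion_protection:
            raise ProtectedError(f"{self.name} is protected")
        self.deleted = True


@dataclass
class Participant:
    # Hypothetical participant in a distributed transaction.
    name: str
    can_commit: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote on whether this participant can commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Commit on every participant only if all of them vote yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):
        for p in participants:  # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:  # phase 2: abort everywhere
        p.rollback()
    return False
```

In this sketch, a caller that can set `deletion_protection = False` and then call `delete()` is stopped by nothing except the need for a second call, which is the objection raised in the thread about an agent that can make both API calls. The two-phase commit function addresses a different problem: keeping several participants consistent with one another.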

Jeremy Huffman@jhuffman42·Apr 28

@spion @ThePrimeagen @valerionxv They are all doing it. Cursor already has a network sandbox feature that could have been used. But it is not possible for them to erase all the friction - for example whitelisting domains to allow through the network filter.

spion@spion·Apr 28

@jhuffman42 @ThePrimeagen @valerionxv It is indeed. And I think given the amount of funding AI companies get, I'd expect them to be able to dedicate this energy into building this out.

ThePrimeagen@ThePrimeagen·Apr 28

There are a lot of people dunking on this guy and the arguments at the end of the day come down to "You are holding it wrong." But to be fair there has been nothing but a constant stream of "Stop holding it, Software Engineering is over shortly." I am not shocked that this has happened and I am 100% confident that this is not going to be the last one. The problem is the vogue nature of insane hype claims, most specifically from Dario himself being most guilty. People are lulled into a faux safety due to the belief that these LLMs are literal gods in their pocket. Infinite knowledge and speed for a simple monetary exchange. Cannot wait for ThePhilospher to explain how a loving God could delete a production database.

JER@lifeof_jer

x.com/i/article/2048…

Jeremy Huffman@jhuffman42·Apr 28

@rockgecko_dev @mountainerd @lifeof_jer @Plenum0z Scoped keys should definitely exist - it's shocking if they do not already - but a lot of people always select all scopes to avoid having to think or do rework. Out-of-band approvals are not possible for API endpoints - they exist to *automate* infrastructure operations.

Rockgecko@rockgecko_dev·Apr 28

@jhuffman42 @mountainerd @lifeof_jer @Plenum0z You don't think scoped keys & Out-of-Band approvals would be useful here?

Jeremy Huffman@jhuffman42·Apr 28

@AntoniGusto @SemeadorDisc @sean_j_roberts @joshorrom @lifeof_jer @Plenum0z There were no edits of code responsible for this incident. There was no project code executed. Plan mode runs terminal commands - to research code, read logs, etc - and when it has been given blanket access to curl it can use that as well.

MrGusto@AntoniGusto·Apr 28

@SemeadorDisc @sean_j_roberts @joshorrom @lifeof_jer @jhuffman42 @Plenum0z Oh, so what exactly is plan mode for Claude and OpenAI? It is advertised as preventing your agent from making any changes or running any code. A switch that disables something. And it still did it. What exactly do you mean?

Jeremy Huffman@jhuffman42·Apr 28

@GaryMarcus Anthropic has never claimed its model will reliably follow "rules" and "instructions". I doubt any frontier lab has made that claim. The technology is not capable of that. If the OS process running the agent has access to secrets, and the internet, it is only a matter of time.

Gary Marcus@GaryMarcus·Apr 27

This is totally wrong. Blaming the user is missing the point that (a) coding agents have been overhyped and (b) can’t reliably obey the rules given to them in system prompts and other guardrails.

John A De Goes@jdegoes

Sorry, @lifeof_jer, but this is YOUR failure: 1. Your failure to demonstrate extreme ownership for AI generated code; instead, you abdicated your responsibility and blamed the AI. 2. Your failure to have an adequate and predictive mental model for how LLMs work. 1/2

Jeremy Huffman@jhuffman42·Apr 28

@spion @ThePrimeagen @valerionxv I agree with that 100%. The problem is that it is not yet possible to do it properly without significant inconvenience - tedious editing of network policy files, remote containers/VMs etc. It is all too much friction even for most professionals.

spion@spion·Apr 28

@jhuffman42 @ThePrimeagen @valerionxv Sure. But it's not unreasonable to want agentic tool companies to start taking sandboxing and RBAC more seriously.

Jeremy Huffman@jhuffman42·Apr 28

@koomai @ThePrimeagen @leeschmidt123 NVIDIA is also doing something about this though. github.com/NVIDIA/OpenShe…

Sid ™️@koomai·Apr 28

@ThePrimeagen @leeschmidt123 Why are all CEOs like this.

Rohan Paul@rohanpaul_ai

Jensen Huang on vibe coding "All of a sudden, AI closed that technology divide. Anybody could be a software programmer now. And vibe coding is creating software that is better than a lot of software programmers. One of the stories that the Lovewell CEO was telling me is, all these people are creating basically small businesses and they're making $2-3 million a year now." --- From 'A Bit Personal with Jodi Shelton' YT channel (link in comment)

Jeremy Huffman@jhuffman42·Apr 28

@spion @ThePrimeagen @valerionxv Plan mode is not a safety and has not been sold as one. The entire argument is pointless though because he would have run it in edit mode in the same environment and probably did. He just said "plan mode" for the same reason he said everything else: to deflect blame.

spion@spion·Apr 28

@jhuffman42 @ThePrimeagen @valerionxv Hmm. So the gun's broken safety (plan mode) is now the gun user's fault?

Jeremy Huffman@jhuffman42·Apr 28

@ThePrimeagen Agreed. There are people who know better who kept their mouths shut and took the cash. And a lot more who hyped and promoted it. I have no doubt that we *will* reach the point where anyone can do this safely. But it will be with products built for that purpose from day one.

ThePrimeagen@ThePrimeagen·Apr 28

I think some people are misunderstanding me here. I am 100% confident that LLMs alone will get you a hot steaming pile of absolute shit and it has played out again and again. What irks me is that a bunch of normies were sold that this is PhD level intel and that they have 0 worries and this is the future old man, get with it. They go off, sell a product to REAL customers and then absolutely get wrecked. There will be a whole bunch of people that will continue to get wrecked because an entire class of people cheer them on and more so CEOs of the world's largest companies tell them they are correct. I can imagine that we will see quite a few lawsuits in the coming months / years due to this.

Jeremy Huffman@jhuffman42·Apr 28

@spion @ThePrimeagen @valerionxv He told me it was in plan mode.

spion@spion·Apr 28

@jhuffman42 @ThePrimeagen @valerionxv actually it looks like it wasn't in plan mode. Nevertheless, let's hope nobody is using .env files anymore (I stopped in 2024)

Jeremy Huffman@jhuffman42·Apr 28

@spion @ThePrimeagen @valerionxv If you give a non-deterministic piece of software access to call any web service, and access to production credentials, it is definitely an issue but not with the agent. The fact that a lot of people are running around with loaded handguns doesn't make it the gun's fault.

spion@spion·Apr 28

@jhuffman42 @ThePrimeagen @valerionxv You think that cursor in plan mode finding a random token in an unrelated file and using it to delete a db is not an issue?
