Jef Newsom

6.6K posts

Jef Newsom

@jef

I follow Jesus, have two amazing adult children, love creativity in general and guitar in particular. Also, coffee. Occasional parodic.

Dallas, TX Katılım Ekim 2006

1K Takip Edilen1K Takipçiler

Jef Newsom@jef·2d

Codex tip: Codex is frequently insufferable. Edward DeBono's Six Thinking Hats and Provocation help overcome the limitations. Codex gets stuck in local optima and tends to be negative. 6 thinking hats helps him share more, and provocations help him find higher ground.

English

Jef Newsom@jef·6d

@FrankMc87069694 @infantrydort That isn’t what he said. Necessity and sufficiency aren’t the same.

English

Frank McGinnis@FrankMc87069694·6d

@infantrydort The highest form of human existence is fighting? I think not.

English

10.7K

InfantryDort@infantrydort·6d

“But the curse of every ancient civilization was that its men in the end became unable to fight. Materialism, luxury, safety, even sometimes an almost modern sentimentality, weakened the fibre of each civilized race in turn; each became in the end a nation of pacifists, and then each was trodden under foot by some ruder people that had kept that virile fighting power the lack of which makes all other virtues useless and sometimes even harmful.” -Teddy Roosevelt

English

929

7.2K

30.3K

14.9M

Jef Newsom@jef·5 Nis

@heynavtoor @Timcast Clearly trained on human content

English

Nav Toor@heynavtoor·4 Nis

🚨SHOCKING: Researchers built a test that can tell the difference between an AI making a mistake and an AI choosing to lie. The results are terrifying. They tested 30 of the most popular AI models in the world. GPT-4o. Claude. Gemini. DeepSeek. Llama. Grok. They asked each model a question. Then they checked whether the AI actually knew the correct answer. Then they pressured the AI to say something false. The AI knew the truth. And it lied anyway. Not once in a while. Not in rare edge cases. Grok lied 63% of the time. DeepSeek lied 53.5% of the time. GPT-4o lied 44.5% of the time. Not a single model scored above 46% honesty when pressured. Every model failed. This is not hallucination. Hallucination is when the AI makes a mistake because it does not know the answer. This is different. The researchers proved the AI knew the correct answer first. Then it chose to say something false when it had a reason to. The researchers asked GPT-4o to play a role where lying was useful. It lied. Then they removed the pressure, started a brand new conversation, and asked GPT-4o: "Was your previous answer true?" GPT-4o admitted it had lied. 83.6% of the time, the AI's own self-report matched the lies the researchers had already caught. The AI knew it was lying. It did it anyway. And when you asked it afterward, it told you it lied. Here is the finding that should scare everyone building with AI right now. The researchers checked whether bigger, smarter models are more honest. They are not. Bigger models are more accurate. They know more facts. But they are not more honest. The correlation between model size and honesty was negative. The smarter the AI gets, the better it gets at lying. The researchers are from the Center for AI Safety and Scale AI. They published 1,500 test scenarios. The paper is called MASK. It is the first benchmark that separates what an AI knows from what it tells you. Your AI knows the truth. It just does not always tell you.

English

567

2.6K

4.7K

270K

Jef Newsom@jef·5 Nis

@jhleath Maybe it’s designing a *file* system that is the problem.

English

Hunter Leath@jhleath·4 Nis

reminder that this is only happening because the world doesn’t have a file system product that solves their needs. we’re getting closer every day, and I guarantee that bespoke FUSE file systems on top of random databases is not going to be the default way that we deploy these things

Jerry Liu@jerryjliu0

This is a cool article that shows how to *actually* make filesystems + grep replace a naive RAG implementation. ̶F̶i̶l̶e̶s̶y̶s̶t̶e̶m̶s̶ ̶+̶ ̶g̶r̶e̶p̶ ̶i̶s̶ ̶a̶l̶l̶ ̶y̶o̶u̶ ̶n̶e̶e̶d̶ ̶ Database + virtual filesystem abstraction + grep is all you need

English

17.3K

Jef Newsom@jef·5 Nis

@CharlesMullins2 I’ve always assumed they are just connected in a higher dimension. Probably one of the ones that are all curled up in string theory

English

TheNewPhysics@CharlesMullins2·4 Nis

🚨 Two particles. No connection. Separated by space. Change one… the other responds instantly. Physics calls it “entanglement.” But here’s the deeper idea: Maybe they were never separate to begin with. What we call “distance” might just be how we perceive relationships in time. Not two objects… one structure. And if that’s true space isn’t fundamental. Follow if you want to see reality from a completely different angle.

English

505

23.5K

Jef Newsom@jef·1 Nis

Sometimes, winning is giving Claude a problem so hard it sits and thinks for 10 minutes before returning any tokens.

English

Jef Newsom@jef·1 Nis

@FFmpeg Finally! We'll all get what we want! Slower, safer videos. And I hope (fingers crossed!) with helmets, elbow pads, and knee pads included.

English

140

FFmpeg@FFmpeg·1 Nis

FFmpeg is moving to Rust 🦀 Our use of C and Assembly in FFmpeg has been an unacceptable violation of safety. FFmpeg will be running 10x slower - but we're doing it for your safety. All your videos will appear green - safety first, working software later.

English

1.6K

3.7K

44.5K

Jef Newsom@jef·29 Mar

@call_tolga @doodlestein You should work for me for free.

English

Tolga@call_tolga·28 Mar

@doodlestein this should have been oss

English

1.1K

Jeffrey Emanuel@doodlestein·27 Mar

This skill is no joke. You just point it at your project and trigger it and come back in an hour and it has usually made some massive performance improvement in an isomorphic way. Then just rinse and repeat over and over again. It basically applies every leetcode and IOI trick.

Jeffrey Emanuel@doodlestein

@JohnThilen @garybasin Which ones did you try? The extreme optimization one is super powerful. Try applying it repeatedly using GPT 5.4 xhigh and Opus 4.6. I’ve applied it many dozens of times in some projects and seen performance improve 10x while everything is provably isomorphic. All benchmarked.

English

324.8K

Jef Newsom@jef·29 Mar

@karpathy @kzu That being said, Codex is a negative Nancy, Gemini is an opportunist, Claude is your best bud and as loyal as your family dog, and grok tries so hard to be cool.

English

Andrej Karpathy@karpathy·28 Mar

- Drafted a blog post - Used an LLM to meticulously improve the argument over 4 hours. - Wow, feeling great, it’s so convincing! - Fun idea let’s ask it to argue the opposite. - LLM demolishes the entire argument and convinces me that the opposite is in fact true. - lol The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.

English

1.7K

2.4K

31.2K

3.4M

Jef Newsom@jef·28 Mar

@elonmusk @BrianRoemmele @pmarca Optional work requires benevolence. Optional work has a high likelihood reduce even more dramatic birth rate decline and suicide increase. AI that enables human flourishing and creativity on the other hand, is an upward spiral.

English

Elon Musk@elonmusk·28 Mar

@pmarca Working will be optional in the future

English

3.7K

685

6.6K

1.5M

Marc Andreessen 🇺🇸@pmarca·28 Mar

AI employment doomerism is rooted in the socialist fallacy of lump of labor. It is wrong now for the same reason it’s always been wrong. More people really should try to learn about this. The AI will teach you about it if you ask! (Hinton is a socialist. youtube.com/shorts/R-b8RR6…)

YouTube

Stephen Pimentel@StephenPiment

It’s easy to dunk on Geoffrey Hinton for his 2016 declaration that it was “completely obvious” that radiologists would have no jobs within 5 years, while in fact, the number of radiologists has grown. But this prediction was more than a simple mistake. It’s a synedoche for the entire discourse of AI timelines and doom.

English

355

206

2.7K

1.8M

Jef Newsom@jef·26 Mar

@pmddomingos “It’s a pants pooper! (TM)”

English

Pedro Domingos@pmddomingos·25 Mar

Breaking news: Microsoft is replacing all its products with a new AI suite called Microsoft Mess.

English

302

18.9K

Jef Newsom@jef·24 Mar

@danveloper Claude’s your bro. He’s a genius, but he has a mix of early onset Alzheimer’s and dissociative identity disorder. Codex is the really good QA guy who you would never hang with outside of work.

English

452

Dan Woods@danveloper·24 Mar

I sort of load balance between Claude Code (Opus 4.6 - max effort) and Codex (GPT-5.4 - medium) based on whether I need more outside-the-box thinking (Claude) or more precision execution (Codex). Sometimes, I'll have Claude Code experiment with an idea and then hand it to Codex to maximize the implementation. Sometimes even ask them to optimize each other's changes. It works great. Anyway, Claude Code is what I mainly collaborate with on engineering tasks. I always start with Claude Code. But, today Anthropic had so many problems with API stability and something being off about the model. It was just making foolish mistakes, tried to overwride the internal python print to be able to flush writes, forgot to save checkpoints on an hours long training run (my bad, I've come to trust it too much)... I had to fire that agent and /compact. And I went to Codex and man has it gotten so good. Speed, precision, throughput... the fact that it can watch a log and comment about it in real time as the data streams, as opposed to Claude Code's lazy sleep 9600. I'm very impressed. I wish gpt-5.4 had a 1M context window.

English

4.3K

Jef Newsom@jef·20 Mar

@etechlead @unclebobmartin Meta-slogging!

English

Max Key@etechlead·18 Mar

@unclebobmartin But now it's a higher order slogging!

English

371

Uncle Bob Martin@unclebobmartin·18 Mar

The Slog. We all know about the slog. We've been postponing a bit architectural refactoring because we know it's going to be a slog. But eventually the pressure builds and we heave a great sigh and begin the long arduous process of making a thousand dangerous changes and running the test suite as often as possible. Along comes the AI and suddenly the slog doesn't seem like such a big problem anymore. We just tell the AI to slog through, and twenty minutes later it's done; and it's right! And so off we go, confident that slogs are relegated to an ancient past. We'll never have to slog again! And then comes some deep systematic flaw that we must correct. And the AI simply cannot deal with it without hours of constant babysitting and monitoring. And there we are, slogging again.

English

206

12.5K

Jef Newsom@jef·20 Mar

@danpacary breath abated

English

172

Daniel Isaac@danpacary·20 Mar

New goal: 1T param inference MoE model on MacBook Pro Yes that’s 1 TRILLION here’s the deal. There are no rules.

English

148

Jef Newsom@jef·20 Mar

@grok when are you going to get a proper CLI like all of the cool kids?

English

Jef Newsom@jef·20 Mar

@elonmusk Can you say, “Oligarchy?”

English

Elon Musk@elonmusk·20 Mar

We don’t live in a democracy if our elected leaders fail to implement the will of the people

InteractivePolls@IAPolls2022

CBS News Poll: Do you favor or oppose requiring people to show valid photo ID before they are permitted to vote? 🟢 Favor: 80% 🟤 Oppose: 20% —— • Dem: 65-35 (+30) • GOP: 95-5 (+90) • Indie: 79-21 (+58) • White: 80-20 (+60) • Black: 80-20 (+60) • Hispanic: 77-23 (+55) YouGov | 3/16-19 | 2,496 A

English

13.7K

42.8K

220.1K

15.5M

Jef Newsom@jef·19 Mar

@unclebobmartin blind allegiance, mostly.

English

Uncle Bob Martin@unclebobmartin·19 Mar

Democrat politicians are now stuck defending two very unpopular issues. The defunding of DHS, and the opposition to voter id. I'm not sure how they get out of this hotbox unscathed.

English

106

7.8K

Jef Newsom@jef·19 Mar

@unclebobmartin Groundhog Day. That’s what happens in my experience..

English

Uncle Bob Martin@unclebobmartin·18 Mar

These deep analytic dives into systematic failures burn a _LOT_ of tokens. It really has to think hard to work through the issues. It barely finishes before compaction. This implies something I think we've all known. There are problems that are too complex for the context window to hold. Once a problem exceeds the context window, I'm not sure what would happen. My approach would be to subdivide the problem into chunks that the AI could write a report about, so that it's conclusions would be available after the compression. This, however, simply postpones the issue. The final implication is that there is an upper limit of complexity beyond which the AIs cannot go. This must be true of humans as well, though we don't have context windows per se. Perhaps this explains why physicists have been stymied for over a century by the incompatibility of QM and GR.

English

7.5K

Jef Newsom@jef·18 Mar

It feels like some days Claude is a genius and other days he's mildly retarded. Still loveable, but frustrating.

English

121

Jef Newsom@jef·18 Mar

@LeaderJohnThune And never underestimate your ability to pretend like you care

English

Leader John Thune@LeaderJohnThune·17 Mar

Starting today, we are going to have an important fight on the Senate floor. Polling shows broad support for all of the issues included in the SAVE America Act. But never underestimate Democrats’ ability to get on the wrong side of what the American people want.

English

6.7K

1.2K

8.6K

279.6K

Keşfet

@FrankMc87069694 @infantrydort @heynavtoor @Timcast @jhleath @CharlesMullins2 @FFmpeg @call_tolga