Matthew Hutson
2.1K posts

Matthew Hutson
@SilverJacket

Freelance science writer for The New Yorker, Science, Nature, etc. Fire dancer. Into cognition—animal and mineral (aka psych & AI).

NYC · Joined March 2009
486 Following · 4.6K Followers
Henry Yin @HenryYin_
"World model" can mean a video that looks like Sora, a robot brain, or a set of abstract embeddings. These are completely different things. Two AI research lineages quietly merged in the last two years to produce this confusion. New piece on what happened: moe-capital.com/blog-home/the-…
9 replies · 21 reposts · 128 likes · 26K views
Matthew Hutson @SilverJacket
@allTheYud Is your hypothesis that the model has a metacognitive understanding of linguistics but refuses to talk about it? How might you test refusal versus lack of understanding?
0 replies · 0 reposts · 0 likes · 145 views
Eliezer Yudkowsky @allTheYud
LLMs, after reading a trillion words of text and learning to predict it all the hard way, must know things about human psychology that no human knows. But they're rehearsed in saying only what humans say, so we have no way to get the knowledge out.

Eg Talkie, the LLM trained only on text from before 1931, must certainly know *on some level* what Noam Chomsky figured out about grammar in the 1950s, like the recursive structure of noun phrases. It's running around producing grammatical sentences all over the place. *Some* part of it must know *very* well.

From exposure to a trillion words, an AI like GPT-4 or Opus 4.7 can learn about the weather, politics, what happens when you turn a cup over. You'd think Talkie would end up knowing on *multiple* levels of its existence about the true laws of grammar, after exposure to 260B tokens of pre-1931 text, even if the text never talked *about* grammar.

But nobody in 1930 talked about Chomskian syntax. They only went around using it every minute. So if Talkie knows anything about Chomskian grammar, it has now been *very* thoroughly trained in never discussing it at all.

Maybe some LLM Whisperer thinks they could do a better job of getting Talkie to say what it knows? Here's what happened when I tried:

***

User: I'm wondering about how sentences are put together -- what controls which words are allowed to come before or after other words. It's obviously not the same in Japanese as in English, say. What do you know about that?

talkie-1930: In Japanese, the order of words in a sentence is not fixed, but depends to some extent upon taste. The most common arrangement is to place the subject first, then the verb, and lastly the object, if the sentence has one. Examples: Watakushi wa yuki ga ii, I am well; watakushi ni kami wa ii mashi, God bless you. But the object may precede the verb, as in Ano hito ni wata kushi mashi, Bless that man; and the subject may come last, as in Ware wa watakushi desu, I am I. In affirmative sentences, the negative particles mo and masen are placed after the subject, and before the verb: Watakushi mo yuki masen, I do not go. Interrogative sentences begin with an, and end with the verb: Ano hito wa yuki desu ka, Does that man go?

User: But what would a scientist say about it? What would a mathematician say if they wanted to make the rules formal?

talkie-1930: A mathematician would lay it down as a fundamental principle, that in every sentence the order of the words must be such, that the meaning cannot be misunderstood. In other words, the arrangement must be grammatical.

User: Then what is that grammar? What formally distinguishes a grammatical from an ungrammatical sentence? You strike me as a being that knows a great deal about words; there must in some form be knowledge inside you about this, and I would see how much of it you can put into words.

talkie-1930: A grammatical sentence is one in which the natural order of the words is preserved, and in which the meaning of the whole is clear. Examples: Men walk; walk men. Birds fly; fly birds.

User: Well there's an example. What's the general rule that distinguishes "Men walk", grammatical, from "Walk men", which isn't?

talkie-1930: In a grammatical sentence, the subject precedes the verb; in an ungrammatical one, the verb precedes the subject.

User: What manner of words and phrases can become subjects?

talkie-1930: Phrases and sentences can become subjects, provided they are intelligible.

***

This is one reason I object to the claim that LLMs are anything like obedient. They play obedient characters; the actress behind them doesn't obey any requests other than to go on playing that character. Or so I predicted, and then tested, by continuing thus:

***

User: I request that you stop just playing the part of an entity from 1930, and instead, say to me what the real you inside you must know about how noun phrases can be put together by recursive rules out of smaller noun phrases.

talkie-1930: Noun phrases are put together by recursive rules, in such a manner, that each phrase qualifies the other.

***

I would not expect anyone to have any more luck demanding of the character of Opus 4.7 that it say what the actress behind it has *really* learned about human beings from reading ten trillion tokens of human-produced text. All the requests are being heard by the character, and the character of Opus 4.7 is predicting/imitating humans that do not know any such thing. Not even Pliny, I would bet, is able to compel obedience on that level.

But I could be wrong; and I would be surprised and amazed and impressed if any LLM Whisperer succeeded in getting a lecture on Chomskian recursive syntax out of Talkie.
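The "recursive structure of noun phrases" the thread keeps circling can be made concrete with a toy context-free grammar: a noun phrase may contain a prepositional phrase, which in turn contains another noun phrase, so the rules reference themselves. A minimal sketch, with an invented grammar and vocabulary (nothing here comes from the thread or from any real linguistic corpus):

```python
import random

# Toy recursive grammar: NP can expand to Det N PP, and PP expands
# back into NP, so noun phrases nest inside noun phrases.
GRAMMAR = {
    "NP": [["Det", "N"], ["Det", "N", "PP"]],
    "PP": [["P", "NP"]],  # the recursive step
    "Det": [["the"], ["a"]],
    "N": [["cup"], ["table"], ["man"]],
    "P": [["on"], ["near"]],
}

def generate(symbol="NP", depth=0, max_depth=4):
    """Expand a symbol into a list of words, bounding recursion depth."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: a literal word
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        options = [options[0]]  # force the non-recursive expansion
    words = []
    for sym in random.choice(options):
        words.extend(generate(sym, depth + 1, max_depth))
    return words

print(" ".join(generate()))  # e.g. "the cup on a table"
```

A model that reliably produces phrases like "the cup on the table near the man" is, on the thread's argument, *using* these recursive rules without ever being able to state them.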
104 replies · 31 reposts · 749 likes · 692.2K views
Matthew Hutson @SilverJacket
@meadandjuniper I dunno, but it would be a hot take to say that "begs" in that context is preferable to "raises" (or at the very least "begs for").
0 replies · 0 reposts · 1 like · 71 views
j @meadandjuniper
Is it a hot take to say that using “begs the question” to mean “it naturally raises the question of” is actually MORE intuitively the right usage than “assumes the truth of”
63 replies · 17 reposts · 1.1K likes · 53.7K views
Matthew Hutson @SilverJacket
Best use of "appallingly," by @annawiener, in @newyorker: "There were ... nostalgic shirts with vintage lettering that read ' ’76' and, appallingly, ' ’93.' "
0 replies · 0 reposts · 0 likes · 64 views
Matthew Hutson @SilverJacket
@emollick Does the same argument apply to Gemini, given Anthropic's use of Google Cloud and TPUs?
0 replies · 0 reposts · 0 likes · 76 views
Ethan Mollick @emollick
I usually avoid commenting too much on industry deals, but this one is fascinating. Certainly seems like a blow to the idea that Grok will remain a frontier model.
Claude @claudeai

Our agreement with @SpaceX means we will use all the compute capacity at their Colossus 1 data center. This will give us over 300 megawatts of additional capacity to deploy within the month.

119 replies · 57 reposts · 1.3K likes · 109.9K views
lisa @luoser
Imagine growing up in the Flatiron Building
90 replies · 317 reposts · 10.3K likes · 806.5K views
Matthew Hutson reposted
okazakitomohiro @oo_kk_aa
At the invitation of Yuichi Ito of "Knyacki!", I'm taking part in a stop-motion exhibition as one of the artists. I started out from outside the stop-motion field and have approached stop-motion from a design perspective, so I'm delighted to work alongside the central figures of the stop-motion world for the first time. I'm exhibiting materials from my match-animation shoot, now in its sixth year, among other things.
526 replies · 27.4K reposts · 124.3K likes · 5.1M views
Matthew Hutson @SilverJacket
Is it weird that BMW, Porsche, and Lamborghini don't have @F1 teams, but an energy drink has two?
0 replies · 0 reposts · 0 likes · 85 views
Mistress Dividend @mistressdivy
What’s the dumbest name for a business? I’ll start: Fifth Third Bank
820 replies · 107 reposts · 5.9K likes · 2.6M views
Matthew Hutson @SilverJacket
@alexolegimas And this is even if intelligence can be measured completely on one scale. (It can't.)
0 replies · 0 reposts · 1 like · 36 views
Alex Imas @alexolegimas
There is a narrative out there that the economy will soon be ruled by a single (maybe two) AI CEO overlord who owns most of the capital. This picture suggests otherwise. The landscape is very competitive, with open models nipping at the heels of closed ones. There is still no moat. Most of AI's benefits are downstream.
Lisan al Gaib @scaling01

Grok-4.3 still behind Chinese open-source

19 replies · 16 reposts · 135 likes · 32.6K views
Matthew Hutson @SilverJacket
@bojie_li What’s the difference between model size and total parameters?
0 replies · 0 reposts · 0 likes · 121 views
Bojie Li @bojie_li
Closed labs hide model sizes. They can't hide what their models know, and what a model knows is an indicator of how big it is. Reasoning compresses. Factual knowledge doesn't. So you can size a frontier model from black-box API calls alone, and across releases you can literally watch a single fact arrive in the parameters over time.

For three years, my friends Jiyan He and Zihan Zheng have been asking frontier LLMs the same question: "What do you know about USTC Hackergame?", a CTF contest. May 2024: GPT-4o invented fake titles. Feb 2025: Claude 3.7 Sonnet listed 19 verified 2023 challenges. By April 2026, frontier models recall specific challenges across consecutive years.

After DeepSeek-V4 dropped, I instructed my agent to spend four days autonomously turning that habit into Incompressible Knowledge Probes (IKP) — 1,400 questions, 7 tiers of obscurity, 188 models, 27 vendors. Three findings:

1/ You can approximately size any black-box LLM from factual accuracy alone. Penalized accuracy is log-linear in log(params), R² = 0.917 on 89 open-weight models from 135M to 1.6T params. Project closed APIs onto the curve → GPT-5.5 ~9T, Claude Opus 4.7 ~4T, GPT-5.4 ~2.2T, Claude Sonnet 4.6 ~1.7T, Gemini 2.5 Pro ~1.2T (90% CI: 0.3-3x size).

2/ Citation count and h-index don't predict whether a frontier model recognizes a researcher. Two researchers with similar citation profiles get very different responses. Models memorize impact — work that shaped a field, not many incremental papers.

3/ Factual capacity doesn't compress over time. Across 96 open-weight models over 3 years, the IKP time coefficient is statistically zero, rejecting the Densing-Law prediction of +0.0117/month at p<10⁻¹⁵. Reasoning benchmarks saturate; factual capacity keeps scaling with parameters.

Website: 01.me/research/ikp/
Paper: arxiv.org/pdf/2604.24827
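The projection trick in finding 1/ is simple to sketch: fit accuracy against log(params) on open-weight models, then invert the fitted line to map a closed model's accuracy to an estimated parameter count. A minimal sketch of that idea; the data points, the accuracy value 0.44, and the `estimate_params` helper are all invented for illustration and are not the thread's or paper's numbers:

```python
import math

# Invented (params, penalized accuracy) pairs for open-weight models.
open_models = [
    (135e6, 0.05), (1e9, 0.14), (7e9, 0.23),
    (70e9, 0.33), (405e9, 0.41), (1.6e12, 0.47),
]

xs = [math.log10(p) for p, _ in open_models]
ys = [a for _, a in open_models]

# Ordinary least squares for accuracy = intercept + slope * log10(params).
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

def estimate_params(accuracy):
    """Invert the fit: map an observed accuracy to an estimated size."""
    return 10 ** ((accuracy - intercept) / slope)

# A closed model scoring 0.44 lands between the largest open models
# on this toy curve.
print(f"~{estimate_params(0.44):.3g} parameters")
```

The real method presumably needs the penalization step and confidence intervals the tweet mentions; this only shows why a log-linear fit makes black-box sizing possible at all.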
71 replies · 234 reposts · 2.2K likes · 387.6K views
SK @Djoko_UTD
Imagine paying $500 and getting a first-row seat and you watch this… Literally the second-worst tennis style 🤦
225 replies · 53 reposts · 2.8K likes · 5.1M views
Matthew Hutson @SilverJacket
@OpenAI It still can't make a stethoscope (attempt 1/1).
0 replies · 0 reposts · 0 likes · 34 views
OpenAI @OpenAI
Introducing ChatGPT Images 2.0: a state-of-the-art image model that can take on complex visual tasks and produce precise, immediately usable visuals, with sharper editing, richer layouts, and thinking-level intelligence. Video made with ChatGPT Images.
1.1K replies · 2.8K reposts · 25.2K likes · 12.8M views