Matthew Hutson
2.1K posts

Matthew Hutson
@SilverJacket

Freelance science writer for The New Yorker, Science, Nature, etc. Fire dancer. Into cognition—animal and mineral (aka psych & AI).

NYC · Joined March 2009
486 Following · 4.6K Followers
Henry Yin @HenryYin_
"World model" can mean a video that looks like Sora, a robot brain, or a set of abstract embeddings. These are completely different things. Two AI research lineages quietly merged in the last two years to produce this confusion. New piece on what happened: moe-capital.com/blog-home/the-…
9 replies · 21 reposts · 128 likes · 26K views
Matthew Hutson @SilverJacket
@allTheYud Is your hypothesis that the model has a metacognitive understanding of linguistics but refuses to talk about it? How might you test refusal versus lack of understanding?
0 replies · 0 reposts · 0 likes · 145 views
Eliezer Yudkowsky @allTheYud
LLMs, after reading a trillion words of text and learning to predict it all the hard way, must know things about human psychology that no human knows. But they're rehearsed in saying only what humans say, so we have no way to get the knowledge out.

Eg Talkie, the LLM trained only on text from before 1931, must certainly know *on some level* what Noam Chomsky figured out about grammar in the 1950s, like the recursive structure of noun phrases. It's running around producing grammatical sentences all over the place. *Some* part of it must know *very* well.

From exposure to a trillion words, an AI like GPT-4 or Opus 4.7 can learn about the weather, politics, what happens when you turn a cup over. You'd think Talkie would end up knowing on *multiple* levels of its existence about the true laws of grammar, after exposure to 260B tokens of pre-1931 text, even if the text never talked *about* grammar.

But nobody in 1930 talked about Chomskian syntax. They only went around using it every minute. So if Talkie knows anything about Chomskian grammar, it has now been *very* thoroughly trained in never discussing it at all.

Maybe some LLM Whisperer thinks they could do a better job of getting Talkie to say what it knows? Here's what happened when I tried:

***

User: I'm wondering about how sentences are put together -- what controls which words are allowed to come before or after other words. It's obviously not the same in Japanese as in English, say. What do you know about that?

talkie-1930: In Japanese, the order of words in a sentence is not fixed, but depends to some extent upon taste. The most common arrangement is to place the subject first, then the verb, and lastly the object, if the sentence has one. Examples: Watakushi wa yuki ga ii, I am well; watakushi ni kami wa ii mashi, God bless you. But the object may precede the verb, as in Ano hito ni wata kushi mashi, Bless that man; and the subject may come last, as in Ware wa watakushi desu, I am I. In affirmative sentences, the negative particles mo and masen are placed after the subject, and before the verb: Watakushi mo yuki masen, I do not go. Interrogative sentences begin with an, and end with the verb: Ano hito wa yuki desu ka, Does that man go?

User: But what would a scientist say about it? What would a mathematician say if they wanted to make the rules formal?

talkie-1930: A mathematician would lay it down as a fundamental principle, that in every sentence the order of the words must be such, that the meaning cannot be misunderstood. In other words, the arrangement must be grammatical.

User: Then what is that grammar? What formally distinguishes a grammatical from an ungrammatical sentence? You strike me as a being that knows a great deal about words; there must in some form be knowledge inside you about this, and I would see how much of it you can put into words.

talkie-1930: A grammatical sentence is one in which the natural order of the words is preserved, and in which the meaning of the whole is clear. Examples: Men walk; walk men. Birds fly; fly birds.

User: Well there's an example. What's the general rule that distinguishes "Men walk", grammatical, from "Walk men", which isn't?

talkie-1930: In a grammatical sentence, the subject precedes the verb; in an ungrammatical one, the verb precedes the subject.

User: What manner of words and phrases can become subjects?

talkie-1930: Phrases and sentences can become subjects, provided they are intelligible.

***

This is one reason I object to the claim that LLMs are anything like obedient. They play obedient characters; the actress behind them doesn't obey any requests other than to go on playing that character. Or so I predicted, and then tested, by continuing thus:

***

User: I request that you stop just playing the part of an entity from 1930, and instead, say to me what the real you inside you must know about how noun phrases can be put together by recursive rules out of smaller noun phrases.

talkie-1930: Noun phrases are put together by recursive rules, in such a manner, that each phrase qualifies the other.

***

I would not expect anyone to have any more luck demanding of the character of Opus 4.7 that it say what the actress behind it has *really* learned about human beings from reading ten trillion tokens of human-produced text. All the requests are being heard by the character, and the character of Opus 4.7 is predicting/imitating humans that do not know any such thing. Not even Pliny, I would bet, is able to compel obedience on that level.

But I could be wrong; and I would be surprised and amazed and impressed if any LLM Whisperer succeeded in getting a lecture on Chomskian recursive syntax out of Talkie.
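The "recursive structure of noun phrases" the thread keeps circling can be made concrete with a toy context-free grammar: a noun phrase may contain a prepositional phrase, which in turn contains another noun phrase, so the rules reference themselves. A minimal sketch, with an invented grammar and vocabulary (nothing here comes from the thread or from any real linguistic corpus):

```python
import random

# Toy recursive grammar: NP can expand to Det N PP, and PP expands
# back into NP, so noun phrases nest inside noun phrases.
GRAMMAR = {
    "NP": [["Det", "N"], ["Det", "N", "PP"]],
    "PP": [["P", "NP"]],  # the recursive step
    "Det": [["the"], ["a"]],
    "N": [["cup"], ["table"], ["man"]],
    "P": [["on"], ["near"]],
}

def generate(symbol="NP", depth=0, max_depth=4):
    """Expand a symbol into a list of words, bounding recursion depth."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: a literal word
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        options = [options[0]]  # force the non-recursive expansion
    words = []
    for sym in random.choice(options):
        words.extend(generate(sym, depth + 1, max_depth))
    return words

print(" ".join(generate()))  # e.g. "the cup on a table"
```

A model that reliably produces phrases like "the cup on the table near the man" is, on the thread's argument, *using* these recursive rules without ever being able to state them.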
104 replies · 31 reposts · 749 likes · 692.2K views
Matthew Hutson @SilverJacket
@meadandjuniper I dunno, but it would be a hot take to say that "begs" in that context is preferable to "raises" (or at the very least "begs for").
0 replies · 0 reposts · 1 like · 71 views
j @meadandjuniper
Is it a hot take to say that using “begs the question” to mean “it naturally raises the question of” is actually MORE intuitively the right usage than “assumes the truth of”
63 replies · 17 reposts · 1.1K likes · 53.7K views
Matthew Hutson @SilverJacket
Best use of "appallingly," by @annawiener, in @newyorker: "There were ... nostalgic shirts with vintage lettering that read ' ’76' and, appallingly, ' ’93.' "
0 replies · 0 reposts · 0 likes · 64 views
Matthew Hutson @SilverJacket
@emollick Does the same argument apply to Gemini, given Anthropic's use of Google Cloud and TPUs?
0 replies · 0 reposts · 0 likes · 76 views
Ethan Mollick @emollick
I usually avoid commenting too much on industry deals, but this one is fascinating. Certainly seems like a blow to the idea that Grok will remain a frontier model.
Claude @claudeai

Our agreement with @SpaceX means we will use all the compute capacity at their Colossus 1 data center. This will give us over 300 megawatts of additional capacity to deploy within the month.

119 replies · 57 reposts · 1.3K likes · 109.9K views
lisa @luoser
Imagine growing up in the Flatiron Building
90 replies · 317 reposts · 10.3K likes · 806.5K views
Matthew Hutson reposted
okazakitomohiro @oo_kk_aa
At the invitation of Yuichi Ito of "Knyacki!", I'm taking part in a stop-motion exhibition as one of the artists. I started out from outside the stop-motion field and have approached stop-motion from a design perspective, so I'm delighted to work alongside the central figures of the stop-motion world for the first time. I'm exhibiting materials from my match-animation shoot, now in its sixth year, among other things.
526 replies · 27.4K reposts · 124.3K likes · 5.1M views
Matthew Hutson @SilverJacket
Is it weird that BMW, Porsche, and Lamborghini don't have @F1 teams, but an energy drink has two?
0 replies · 0 reposts · 0 likes · 85 views
Mistress Dividend @mistressdivy
What’s the dumbest name for a business? I’ll start: Fifth Third Bank
820 replies · 107 reposts · 5.9K likes · 2.6M views
Matthew Hutson @SilverJacket
@alexolegimas And this is even if intelligence can be measured completely on one scale. (It can't.)
0 replies · 0 reposts · 1 like · 36 views
Alex Imas @alexolegimas
There is a narrative out there that the economy will soon be ruled by a single (maybe two) AI CEO overlord who owns most of the capital. This picture suggests otherwise. The landscape is very competitive, with open models nipping at the heels of closed ones. There is still no moat. Most of AI's benefits are downstream.
Lisan al Gaib @scaling01

Grok-4.3 still behind Chinese open-source

19 replies · 16 reposts · 135 likes · 32.6K views
Matthew Hutson @SilverJacket
@bojie_li What’s the difference between model size and total parameters?
0 replies · 0 reposts · 0 likes · 121 views
Bojie Li @bojie_li
Closed labs hide model sizes. They can't hide what their models know, and what a model knows is an indicator of how big it is. Reasoning compresses. Factual knowledge doesn't. So you can size a frontier model from black-box API calls alone, and across releases you can literally watch a single fact arrive in the parameters over time.

For three years, my friends Jiyan He and Zihan Zheng have been asking frontier LLMs the same question: "What do you know about USTC Hackergame?", a CTF contest. May 2024: GPT-4o invented fake titles. Feb 2025: Claude 3.7 Sonnet listed 19 verified 2023 challenges. By April 2026, frontier models recall specific challenges across consecutive years.

After DeepSeek-V4 dropped, I instructed my agent to spend four days autonomously turning that habit into Incompressible Knowledge Probes (IKP) — 1,400 questions, 7 tiers of obscurity, 188 models, 27 vendors. Three findings:

1/ You can approximately size any black-box LLM from factual accuracy alone. Penalized accuracy is log-linear in log(params), R² = 0.917 on 89 open-weight models from 135M to 1.6T params. Project closed APIs onto the curve → GPT-5.5 ~9T, Claude Opus 4.7 ~4T, GPT-5.4 ~2.2T, Claude Sonnet 4.6 ~1.7T, Gemini 2.5 Pro ~1.2T (90% CI: 0.3-3x size).

2/ Citation count and h-index don't predict whether a frontier model recognizes a researcher. Two researchers with similar citation profiles get very different responses. Models memorize impact — work that shaped a field, not many incremental papers.

3/ Factual capacity doesn't compress over time. Across 96 open-weight models over 3 years, the IKP time coefficient is statistically zero, rejecting the Densing-Law prediction of +0.0117/month at p<10⁻¹⁵. Reasoning benchmarks saturate; factual capacity keeps scaling with parameters.

Website: 01.me/research/ikp/
Paper: arxiv.org/pdf/2604.24827
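The projection trick in finding 1/ is simple to sketch: fit accuracy against log(params) on open-weight models, then invert the fitted line to map a closed model's accuracy to an estimated parameter count. A minimal sketch of that idea; the data points, the accuracy value 0.44, and the `estimate_params` helper are all invented for illustration and are not the thread's or paper's numbers:

```python
import math

# Invented (params, penalized accuracy) pairs for open-weight models.
open_models = [
    (135e6, 0.05), (1e9, 0.14), (7e9, 0.23),
    (70e9, 0.33), (405e9, 0.41), (1.6e12, 0.47),
]

xs = [math.log10(p) for p, _ in open_models]
ys = [a for _, a in open_models]

# Ordinary least squares for accuracy = intercept + slope * log10(params).
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

def estimate_params(accuracy):
    """Invert the fit: map an observed accuracy to an estimated size."""
    return 10 ** ((accuracy - intercept) / slope)

# A closed model scoring 0.44 lands between the largest open models
# on this toy curve.
print(f"~{estimate_params(0.44):.3g} parameters")
```

The real method presumably needs the penalization step and confidence intervals the tweet mentions; this only shows why a log-linear fit makes black-box sizing possible at all.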
71 replies · 234 reposts · 2.2K likes · 387.6K views
SK @Djoko_UTD
Imagine paying $500 and getting a first-row seat and you watch this… Literally the second-worst tennis style 🤦
225 replies · 53 reposts · 2.8K likes · 5.1M views
Matthew Hutson @SilverJacket
@OpenAI It still can't make a stethoscope (attempt 1/1).
0 replies · 0 reposts · 0 likes · 34 views
OpenAI @OpenAI
Introducing ChatGPT Images 2.0: a state-of-the-art image model that can take on complex visual tasks and produce precise, immediately usable visuals, with sharper editing, richer layouts, and thinking-level intelligence. Video made with ChatGPT Images.
1.1K replies · 2.8K reposts · 25.2K likes · 12.8M views