Arvind Narayanan
@random_walker
13.1K posts

Princeton CS prof and Director @PrincetonCITP. Coauthor of "AI Snake Oil" and "AI as Normal Technology". https://t.co/ZwebetjZ4n Views mine.

Princeton, NJ · Joined December 2007
526 Following · 126.5K Followers

Pinned Tweet
Arvind Narayanan @random_walker
If a fact or chart is surprising, it might be because it’s new information, or it might be something deeper — a sign that our mental model is wrong. Anthropic’s economic gap chart is the latter. anthropic.com/research/labor…

A big source of confusion in AI discourse is not recognizing that the speed of adoption follows its own logic that’s far slower than the speed of capability progress. I’m biased but I think AI as Normal Technology is still the best exposition of the many different speed limits to diffusion. Once we internalize this, the gap shown in the chart is what we should expect.

How does this square with the “AI is the most rapidly adopted technology” narrative and all the graphs that are frequently shared to push that view? Unfortunately they lump together too many kinds of “AI use” to really tell us anything meaningful. On the one hand there are many marginal uses of AI (such as using chatbots instead of traditional search) that are being quickly adopted. But what will make a true economic impact are deeper changes to workflows that incorporate verification and accountability, manage the risk of deskilling, and are accompanied by organizational changes that take advantage of productivity improvements. Those changes happen at human timescales and are barely getting started. And that’s not even accounting for regulatory barriers.

Finally, I’m also not sure how credible the “theoretical capability” estimates are. In particular, I don’t think they account for the capability-reliability gap, for which the AI community didn’t even have measurements until our work two weeks ago normaltech.ai/p/new-paper-to…
Arvind Narayanan retweeted
Renaud Foucart @RenaudFoucart
New rule: if your insightful comment on AI was already in this super cool 1977 paper about "AI is becoming so powerful it will change the world as we know it," you lose. (1/10)
Arvind Narayanan retweeted
Stephan Rabanser @steverab
In our paper "Towards a Science of AI Agent Reliability" we put numbers on the capability-reliability gap. Now we're showing what's behind them! We conducted an extensive analysis of failures on GAIA across Claude Opus 4.5, Gemini 2.5 Pro, and GPT 5.4. Here's what we found ⬇️
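One common way to formalize a capability-reliability gap of this kind is to compare an any-success rate with an all-success rate over repeated trials of the same task. The sketch below uses pass@k (at least one of k runs succeeds) for capability and pass^k (all k runs succeed) for reliability; these metric names and the toy data are assumptions for illustration, and the paper's exact definitions may differ.

```python
# Hypothetical sketch of a capability-reliability gap computation.
# Assumes capability = "any of k runs succeeds" (pass@k) and
# reliability = "all k runs succeed" (pass^k); illustrative only.
from itertools import combinations

def pass_at_k(results: list, k: int) -> float:
    """Fraction of size-k trial subsets with at least one success."""
    subsets = list(combinations(results, k))
    return sum(any(s) for s in subsets) / len(subsets)

def pass_hat_k(results: list, k: int) -> float:
    """Fraction of size-k trial subsets where every trial succeeds."""
    subsets = list(combinations(results, k))
    return sum(all(s) for s in subsets) / len(subsets)

# Toy example: 8 independent runs of one task, 6 successes.
runs = [True, True, False, True, True, False, True, True]
capability = pass_at_k(runs, k=3)    # high: some run usually succeeds
reliability = pass_hat_k(runs, k=3)  # lower: all runs rarely succeed
gap = capability - reliability
```

Even with a 75% per-run success rate, the all-success rate drops sharply with k, which is why an agent can look capable on a benchmark while remaining unreliable in deployment.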
Arvind Narayanan retweeted
Justin Curl @curl_justin
State lawmakers introduced over 1,200 AI bills in 2025. They cover everything from deepfakes to autonomous weapons—but they're all just lumped together as "AI policy." @ARozenshtein and I wrote an article that breaks down the policy landscape along three dimensions: (1) what harm are you addressing, (2) what are the factors shaping how you should design your policy intervention, and (3) which actors in the ecosystem should you target? The diagram below, for example, maps the AI ecosystem from chip manufacturers to end users.
Arvind Narayanan retweeted
AI Security Institute @AISecurityInst
Can AI agents conduct advanced cyber-attacks autonomously? We tested seven models released between August 2024 and February 2026 on two custom-built cyber ranges designed to replicate complex attack environments. Here’s what we found🧵
Arvind Narayanan retweeted
Hunter📈🌈📊 @StatisticUrban
His predictions weren't "premature." They were just wrong. They didn't happen, and they never will.
Arvind Narayanan retweeted
John Arnold @johnarnold
The Atlantic has a sobering, first-person look at the ramifications of legalized online sports betting. Here are a few of the more telling passages. 1/5
Arvind Narayanan retweeted
Princeton University @Princeton
Through various initiatives, @PrincetonSPIA is informing lawmakers about the latest research on AI, and educating current and future public servants about policy challenges and innovation opportunities. bit.ly/4sJT6zR
Arvind Narayanan @random_walker
At first glance this is a totally reasonable perspective. Training PhD students is a duty! But consider this — *effectively* advising a PhD student over a 5-year period is well over 1,000 hours of work, not to mention bringing in hundreds of thousands of dollars in grants. Professors will do some things for mostly altruistic reasons (peer review) but the time commitment for advising is not something that's reasonable to ask of someone without some form of compensation.

So there are two options. One is to make advising a job requirement. Unfortunately this doesn't work, because the *quality* of advising is unobservable and can't be quantified by metrics, leading to a race to the bottom. The other option is the current system — advising helps advance the professor's research agenda because PhD students do most of the work, so they take on students voluntarily.

Which means it's important to ask if this subtle alignment of incentives will continue despite advancing AI capabilities. Academia has many such "subtle alignments of incentives" that the system relies on in order to function — rarely articulated, poorly understood, and fragile. Maybe the advisor-advisee relationship in CS will survive the AI transition, as @sayashk predicts, but many processes and structures will surely break. Best to rethink the system now, before it's too late.
Alison | AlisonBob.eth @AlisonbobEth

@sayashk @random_walker They only have PhD students to do work? I would have thought that training successors would be important in and of itself 🫠

Arvind Narayanan retweeted
Sayash Kapoor @sayashk
In the last few months, I've spoken to many CS professors who asked me if we even need CS PhD students anymore. Now that we have coding agents, can't professors work directly with agents? My view is that equipping PhD students with coding agents will allow them to do work that is orders of magnitude more impressive than they otherwise could. And they can be *accountable* for their outcomes in a way agents can't (yet). For example, who checks the agent's outputs are correct? Who is responsible for mistakes or errors?
Arvind Narayanan retweeted
Andy Masley @AndyMasley
Each frontier AI model seems to use a little less water to train than a square mile of farmland uses in a year. I think about this as the country having 4 square miles of farmland sectioned off to grow some of the most popular consumer products in history.
Arvind Narayanan @random_walker
AI isn't replacing programmers, but it *is* making it harder to survive as a programmer with purely technical skills and no interest or expertise in how those skills translate to business or societal value. Funny thing is, this has always been true—it's just being accelerated a bit due to AI. There's a famous essay by @patio11 from 15 years ago called "Don't Call Yourself A Programmer, And Other Career Advice". kalzumeus.com/2011/10/28/don…
Arvind Narayanan @random_walker
📢 Excited to announce that we're doing the AI Policy Precepts in DC again! Open to all federal employees. Interactive roundtable discussions between federal officials/advisors and many of Princeton's leading AI policy experts including me (Sayash Kapoor, Mihir Kshirsagar, Andrés Monroy-Hernández, Arvind Narayanan, Miranda Wei). Apply by March 20. Offered by @PrincetonSPIADC and @PrincetonCITP. Details and application: mailchi.mp/princeton.edu/…
Arvind Narayanan @random_walker
"Metric saturation" is a long-overdue concept. If the whole eval community focuses on a single metric, we kneecap our ability to understand the real-world impacts of AI progress.
Sayash Kapoor @sayashk

Hey @METR_Evals—love your work, but we think it's the *metric* that's saturated, not the task suite. For example, despite rapid gains in accuracy, we found limited gains in reliability. We'd love to work together to see if this holds up on the time-horizon task suite.

Arvind Narayanan retweeted
Sayash Kapoor @sayashk
Hey @METR_Evals—love your work, but we think it's the *metric* that's saturated, not the task suite. For example, despite rapid gains in accuracy, we found limited gains in reliability. We'd love to work together to see if this holds up on the time-horizon task suite.
METR @METR_Evals

We estimate that Claude Opus 4.6 has a 50%-time-horizon of around 14.5 hours (95% CI of 6 hrs to 98 hrs) on software tasks. While this is the highest point estimate we’ve reported, this measurement is extremely noisy because our current task suite is nearly saturated.

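A "50%-time-horizon" of the kind METR reports can be estimated by fitting a logistic model of task success against log task length and solving for the length at which predicted success crosses 50%. The sketch below fits such a model with plain gradient descent on invented data; the data, learning rate, and fitting procedure are all illustrative assumptions, and METR's actual estimation methodology is more involved.

```python
# Toy sketch of a 50%-time-horizon estimate: fit
# p(success) = sigmoid(a + b * log2(minutes)) to synthetic
# (task length, outcome) data, then solve for the 50% crossing.
# All numbers are invented for illustration.
import math

data = [(1, 1), (2, 1), (4, 1), (8, 1), (15, 1), (30, 1),
        (60, 1), (120, 0), (240, 1), (480, 0), (960, 0), (1920, 0)]

def fit_logistic(data, lr=0.1, steps=20000):
    """Fit a logistic regression on log2(minutes) by gradient ascent
    on the log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for minutes, y in data:
            x = math.log2(minutes)
            p = 1 / (1 + math.exp(-(a + b * x)))
            grad_a += (y - p)
            grad_b += (y - p) * x
        a += lr * grad_a / len(data)
        b += lr * grad_b / len(data)
    return a, b

a, b = fit_logistic(data)
# p = 0.5 where a + b * log2(t) = 0, i.e. t = 2 ** (-a / b).
horizon_minutes = 2 ** (-a / b)
```

With so few observations near the crossover (one failure at 120 minutes, one success at 240), the horizon estimate moves a lot as individual data points change, which is one intuition for why such measurements get noisy as a task suite saturates.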
Arvind Narayanan retweeted
Peter Henderson @PeterHndrsn
I’m really excited about our new paper! I think we will ultimately need to draw on expertise from both law and AI to get alignment right, and this paper lays out that vision in more detail. As an aside, my PhD thesis was titled ‘Aligning law, policy, and machine learning for responsible real-world deployments’ for a reason. I think this is a very important area, and I’m excited to see so many excellent researchers working together to move it forward.
Arvind Narayanan retweeted
Kelsey Piper @KelseyTuoc
Understand this: Waymo in DC is not being delayed because the City Council wants a study. Instead, the City Council is asking for a study because they want to delay Waymo in DC.
Arvind Narayanan retweeted
Arpit Gupta @arpitrage
This is why I believe AI will be a “normal technology” — despite rapid scaling laws for specific technical benchmarks, real world usefulness and effectiveness are going to lag behind a lot
Joel Becker @joel_bkr

new @METR_Evals research note from @whitfill_parker, @cherylwoooo, nate rush, and me. (chiefly parker!) we find that *half* of SWE-bench Verified solutions from Sonnet 3.5-to-4.5 generation AIs *which are graded as passing* are rejected by project maintainers.

Arvind Narayanan @random_walker
Efforts to improve the security of AI agents should recognize that many security failures occur even in the absence of adversaries. The unreliability issue has largely flown under the radar and there hasn't been much work on defining, measuring, or mitigating the problem. More on this in our response to NIST's request for information on AI Agent Security, by @steverab, @sayashk, @PKirgis, @CitpMihir, and me: sage.cs.princeton.edu/documents/RFC_… This is based on our recent paper: normaltech.ai/p/new-paper-to…