Arvind Narayanan

13.1K posts

@random_walker

Princeton CS prof and Director @PrincetonCITP. Coauthor of "AI Snake Oil" and "AI as Normal Technology". https://t.co/ZwebetjZ4n Views mine.

Princeton, NJ Joined December 2007

526 Following 126.5K Followers

Pinned Tweet

Arvind Narayanan@random_walker·9 Mar

If a fact or chart is surprising, it might be because it’s new information, or it might be something deeper — a sign that our mental model is wrong. Anthropic’s economic gap chart is the latter. anthropic.com/research/labor… A big source of confusion in AI discourse is not recognizing that the speed of adoption follows its own logic that’s far slower than the speed of capability progress. I’m biased but I think AI as Normal Technology is still the best exposition of the many different speed limits to diffusion. Once we internalize this, the gap shown in the chart is what we should expect. How does this square with the “AI is the most rapidly adopted technology” narrative and all the graphs that are frequently shared to push that view? Unfortunately they lump together too many kinds of “AI use” to really tell us anything meaningful. On the one hand there are many marginal uses of AI (such as using chatbots instead of traditional search) that are being quickly adopted. But what will make a true economic impact are deeper changes to workflows that incorporate verification and accountability, manage the risk of deskilling, and are accompanied by organizational changes that take advantage of productivity improvements. Those changes happen at human timescales and are barely getting started. And that’s not even accounting for regulatory barriers. Finally, I’m also not sure how credible the “theoretical capability” estimates are. In particular, I don’t think they account for the capability-reliability gap, for which the AI community didn’t even have measurements until our work two weeks ago normaltech.ai/p/new-paper-to…

163

27.5K

Arvind Narayanan retweeted

Renaud Foucart@RenaudFoucart·10h

New rule: if your insightful comment on AI was already in this super cool 1977 paper about "AI is becoming so powerful it will change the world as we know it," you lose. (1/10)

126

11.3K

Arvind Narayanan retweeted

Stephan Rabanser@steverab·3d

In our paper "Towards a Science of AI Agent Reliability" we put numbers on the capability-reliability gap. Now we're showing what's behind them! We conducted an extensive analysis of failures on GAIA across Claude Opus 4.5, Gemini 2.5 Pro, and GPT 5.4. Here's what we found ⬇️

149

32.7K

Arvind Narayanan retweeted

Justin Curl@curl_justin·4d

State lawmakers introduced over 1,200 AI bills in 2025. They cover everything from deepfakes to autonomous weapons—but they're all just lumped together as "AI policy." @ARozenshtein and I wrote an article that breaks down the policy landscape along three dimensions: (1) what harm are you addressing, (2) what are the factors shaping how you should design your policy intervention, and (3) which actors in the ecosystem should you target? The diagram below, for example, maps the AI ecosystem from chip manufacturers to end users.

6.9K

Arvind Narayanan retweeted

AI Security Institute@AISecurityInst·4d

Can AI agents conduct advanced cyber-attacks autonomously? We tested seven models released between August 2024 and February 2026 on two custom-built cyber ranges designed to replicate complex attack environments. Here’s what we found🧵

388

95.8K

Arvind Narayanan retweeted

Hunter📈🌈📊@StatisticUrban·4d

His predictions weren't "premature." They were just wrong. They didn't happen, and they never will.

348

5.7K

244.4K

Arvind Narayanan retweeted

Sayash Kapoor@sayashk·4d

What is the role of model alignment for AI safety?
- Model alignment is effective against accidental harms, not intentional ones
- Important questions about AI safety can’t be asked and answered at the level of models.
In other words, *AI safety is not a model property*

Andy Hall@ahall_research

As I’ve said before I think this raises deep questions about the theory of behavior that underpins the existing approach to AI safety. Is the idea to deter the typical user who isn’t very determined and might not know Pliny exists? Or is the idea to prevent worst case outcomes? In which case Pliny’s continued success means the whole apparatus doesn’t yet work? I’m not super deep on the safety world and am trying to recover its first principles as I go.

10.7K

Arvind Narayanan retweeted

John Arnold@johnarnold·5d

The Atlantic has a sobering, first-person look at the ramifications of legalized online sports betting. Here are a few of the more telling passages. 1/5

237

1.7K

15K

1.4M

Arvind Narayanan retweeted

Princeton University@Princeton·12 Mar

Through various initiatives, @PrincetonSPIA is informing lawmakers about the latest research on AI, and educating current and future public servants about policy challenges and innovation opportunities. bit.ly/4sJT6zR

4.3K

Arvind Narayanan@random_walker·6d

At first glance this is a totally reasonable perspective. Training PhD students is a duty! But consider this — *effectively* advising a PhD student over a 5-year period is well over 1,000 hours of work, not to mention bringing in hundreds of thousands of dollars in grants. Professors will do some things for mostly altruistic reasons (peer review) but the time commitment for advising is not something that's reasonable to ask of someone without some form of compensation. So there are two options. One is to make advising a job requirement. Unfortunately this doesn't work, because the *quality* of advising is unobservable and can't be quantified by metrics, leading to a race to the bottom. The other option is the current system — advising helps advance the professor's research agenda because PhD students do most of the work, so they take on students voluntarily. Which means it's important to ask if this subtle alignment of incentives will continue despite advancing AI capabilities. Academia has many such "subtle alignments of incentives" that the system relies on in order to function — rarely articulated, poorly understood, and fragile. Maybe the advisor-advisee relationship in CS will survive the AI transition, as @sayashk predicts, but many processes and structures will surely break. Best to rethink the system now, before it's too late.

Alison | AlisonBob.eth@AlisonbobEth

@sayashk @random_walker They only have PhD students to do work? I would have thought that training successors would be important in and of itself 🫠

165

45.4K

Arvind Narayanan retweeted

Sayash Kapoor@sayashk·13 Mar

In the last few months, I've spoken to many CS professors who asked me if we even need CS PhD students anymore. Now that we have coding agents, can't professors work directly with agents? My view is that equipping PhD students with coding agents will allow them to do work that is orders of magnitude more impressive than they otherwise could. And they can be *accountable* for their outcomes in a way agents can't (yet). For example, who checks the agent's outputs are correct? Who is responsible for mistakes or errors?

520

471.2K

Arvind Narayanan retweeted

Andy Masley@AndyMasley·13 Mar

Each frontier AI model seems to use a little under a year's worth of a square mile of farmland's water to train. I think about this as the country having 4 square miles of farmland sectioned off to grow some of the most popular consumer products in history.

214

480

8.2K

596K

Arvind Narayanan@random_walker·12 Mar

AI isn't replacing programmers, but it *is* making it harder to survive as a programmer with purely technical skills and no interest or expertise in how those skills translate to business or societal value. Funny thing is, this has always been true—it's just being accelerated a bit due to AI. There's a famous essay by @patio11 from 15 years ago called "Don't Call Yourself A Programmer, And Other Career Advice". kalzumeus.com/2011/10/28/don…

259

20.3K

Arvind Narayanan@random_walker·12 Mar

📢 Excited to announce that we're doing the AI Policy Precepts in DC again! Open to all federal employees. Interactive roundtable discussions between federal officials/advisors and many of Princeton's leading AI policy experts including me (Sayash Kapoor, Mihir Kshirsagar, Andrés Monroy-Hernández, Arvind Narayanan, Miranda Wei). Apply by March 20. Offered by @PrincetonSPIADC and @PrincetonCITP. Details and application: mailchi.mp/princeton.edu/…

Arvind Narayanan@random_walker·12 Mar

"Metric saturation" is a long-overdue concept. If the whole eval community focuses on a single metric, we kneecap our ability to understand the real-world impacts of AI progress.

Sayash Kapoor@sayashk

Hey @METR_Evals—love your work, but we think it's the *metric* that's saturated, not the task suite. For example, despite rapid gains in accuracy, we found limited gains in reliability. We'd love to work together to see if this holds up on the time-horizon task suite.

6.5K

Arvind Narayanan retweeted

Sayash Kapoor@sayashk·11 Mar

METR@METR_Evals

We estimate that Claude Opus 4.6 has a 50%-time-horizon of around 14.5 hours (95% CI of 6 hrs to 98 hrs) on software tasks. While this is the highest point estimate we’ve reported, this measurement is extremely noisy because our current task suite is nearly saturated.

19.2K

Arvind Narayanan retweeted

Peter Henderson@PeterHndrsn·11 Mar

I’m really excited about our new paper! I think we will ultimately need to draw on expertise from both law and AI to get alignment right, and this paper lays out that vision in more detail. As an aside, my PhD thesis was titled ‘Aligning law, policy, and machine learning for responsible real-world deployments’ for a reason. I think this is a very important area, and I’m excited to see so many excellent researchers working together to move it forward.

124

11.6K

Arvind Narayanan retweeted

Kelsey Piper@KelseyTuoc·11 Mar

Understand this: Waymo in DC is not being delayed because the City Council wants a study. Instead, the City Council is asking for a study because they want to delay Waymo in DC.

1.5K

66K

Arvind Narayanan retweeted

Arpit Gupta@arpitrage·11 Mar

This is why I believe AI will be a “normal technology” — despite rapid scaling laws for specific technical benchmarks, real world usefulness and effectiveness are going to lag behind a lot

Joel Becker@joel_bkr

new @METR_Evals research note from @whitfill_parker, @cherylwoooo, nate rush, and me. (chiefly parker!) we find that *half* of SWE-bench Verified solutions from Sonnet 3.5-to-4.5 generation AIs *which are graded as passing* are rejected by project maintainers.

121

27.5K

Arvind Narayanan@random_walker·11 Mar

Efforts to improve the security of AI agents should recognize that many security failures occur even in the absence of adversaries. The unreliability issue has largely flown under the radar and there hasn't been much work on defining, measuring, or mitigating the problem. More on this in our response to NIST's request for information on AI Agent Security, by @steverab, @sayashk, @PKirgis, @CitpMihir, and me: sage.cs.princeton.edu/documents/RFC_… This is based on our recent paper: normaltech.ai/p/new-paper-to…

7.8K

Arvind Narayanan@random_walker·10 Mar

@binarybits Ah, ok! Will delete.

170

Timothy B. Lee@binarybits·10 Mar

@random_walker

QME

171
