Cas (Stephen Casper)

2.5K posts

Cas (Stephen Casper)

@StephenLCasper

AI safeguards & gov. research. PhD student @MIT_CSAIL (mnr. Public Policy), and Fellow at @BKCHarvard. Fmr. @AISecurityInst. https://t.co/r76TGxSVMb

Katılım Mart 2016

3.6K Takip Edilen7K Takipçiler

Sabitlenmiş Tweet

Cas (Stephen Casper)@StephenLCasper·25 Mar

I'm extremely excited to be on the organizing committee this year for my favorite workshop ever! Submissions (up to 8 pages) are due April 24! Co-submission with ICML and NeurIPS is encouraged! taigr-workshop.com

Technical AI Governance @ ICML 2026@taig_icml

🚨📢Announcing the second Technical AI Governance Research (TAIGR) workshop @icmlconf. Accepting submissions (up to 8 pages) until April 24 on technical topics in AI governance! #icml2026

English

9.7K

Cas (Stephen Casper)@StephenLCasper·5h

I really liked this video: youtu.be/UgDEyQ1h-EA?si…

YouTube

English

578

Cas (Stephen Casper)@StephenLCasper·1d

Link: technologyreview.com/2026/05/14/113…

English

341

Cas (Stephen Casper)@StephenLCasper·1d

In practice, almost all AI porn is, in a certain morally relevant sense, not completely consensual. I was glad to talk briefly about this with Jessica Klein while she was writing this new MIT Technology Review piece!

English

2.6K

Cas (Stephen Casper) retweetledi

Machine Learning Street Talk@MLStreetTalk·1d

So depressing isn't it. I have not actually read this paper, but I did run it through @pangramlabs, and the results are shown in the image below (50% AI generated). I also generally find that Pangram gives the benefit of the doubt to avoid false positives, so it's probably much worse than that. > `The paper's content is a series of sections that mostly just list things with discussions that I think are generally vapid.` I've noticed this pattern as well when people have asked me to review their papers. Even starting with the high-level heading names, it's all just really generic. The contents tend to be lots of Wikipedia-style filler with no specificity, usually about something which is outdated or not even spoken about in recent discourse. Using LLMs to create content is a disease that needs to be eradicated. My honest advice to people is to never let a single word generated from an LLM leak into anything with your name on it online. Use it as an idea generator and a critic, but make sure anything you write is constrained by your own mental model and nothing else. I think this is an AI literacy thing, and I certainly have learned my lesson, and I think eventually others will - but a lot of damage will be done in the process.

Machine Learning Street Talk tweet media

English

2.5K

Cas (Stephen Casper)@StephenLCasper·1d

@artieart88 Oh cool, I made a related point recently. x.com/StephenLCasper…

Cas (Stephen Casper)@StephenLCasper

When talking about AI and its future, I wish it were more of a faux pas to make vague appeals to the benefits of technological progress -- and I don't just mean "AI curing cancer". Technological progress in general probably isn't as good as we normally think it is. In addition to the issue of how tech companies are always selling us something, there are a lot of survivorship and storytelling biases affecting pro-technology stories. For example, lots of people in the world's history were victims of guns, germs, and steel, but they’re not around to complain anymore. They died. Appeals to the value of technical progress should be considered unserious unless they engage with the unmemorialized catastrophes that modern privileged society is built on. AI hasn't caused such a catastrophe (yet?), but if we are going to talk about the benefits of generative AI, we should leave some space in the conversation for the 84% of people in the world who don’t use it and aren't in the room with the decision-makers. 26% of humans don't even use the internet. And I think this is a good analogy. The internet is a general-purpose technology that can do a lot of great things. It CAN give everyone a world-class education. It CAN give everyone access to an integrated healthcare app. It has certainly been around long enough to have achieved these things. But it doesn't.

English

537

Art | The Data Guy@artieart88·1d

Agree 100% I’m all for human flourishing, and I agree that policy discussions need to think bigger than just guardrails for AI. But here’s where this particular piece doesn’t work for me. The lead author, Ruben Laukkonen, has published extensively on human flourishing and a common thread in his work is that human flourishing comes through practices where the person has control over the tools for flourishing: mindset, meditation, contemplation, etc. These are benevolent tools that we can control. While AI is theoretically a benevolent technology, the way that most of us consume AI is not. Do you feel like you own ChatGPT? Do you have sovereignty over Claude or whatever LLM you’re using? I don’t think so. For most of us “AI” is a product packaged and sold to us by a company. A product typically built to farm our attention, harvest our data, and keep us engaged. You only need to look at the authors of this paper to remember that. So IMHO the AI safety people aren’t wrong to play defense. They’re responding rationally to a system that serves powerful technology to us en masse in exchange for extracting maximum value from people. Guardrails in this context seem sensible and framing AI safety folks as the obstacle to human flourishing feels a bit off. I do agree with the authors that AI “can” be used for human flourishing. But for that to happen, AI must be benevolent. If AI was self-sovereign in the same way as meditation, contemplation and other human flourishing tools, then maybe the AI safety folks could relax a bit and AI could actually be used to promote human flourishing as the authors of the article promote. This is why I’m a huge believer in open-source tools, data sovereignty, and collective model ownership. Until then, I find it difficult to imagine that I will achieve enlightenment with Claude or ChatGPT while paying $19.99 a month.

English

659

Cas (Stephen Casper)@StephenLCasper·2d

It is hard to overstate how disappointing I think this new paper from Oxford, OpenAI, Anthropic, and Google (et al) is. I can't take it seriously as academic work, just as propaganda. It also has some very bad scholarship and questionable adherence to research ethics. Having the title and author list that it has is not a great start, but I think that the actual content of the paper is also much worse than it could have been. The paper's content is a series of sections that mostly just list things with discussions that I think are generally vapid. For example, section 3.2 is titled "New and technical approaches to positive alignment" and has a collection of paragraphs on things like "goal setting and evaluations", "memory and in-context learning," and other general research topics of the LLM era. It overall strikes me as a paper built from the top down -- the authors wanted to make a certain point up top, and the paper's content ended up as filler. I think of this paper as a mechanism of corporate capture of concepts from academic research on AI and society. It discusses topics like pluralism, liberty, and education, and frames them as solvable problems whose solution is the right tech integrated in the right way. I think that when this paper says "pluralism", "liberty", and "accountability", it means them in a way that is profoundly vapid and structurally ignorant. For example, there is a list of papers out there arguing against this paper's perspective, saying that pluralistic alignment is not a model property or a technical problem at all. None of them were mentioned. Relatedly, the paper talks about some things that would be genuinely great if the authors' companies were not actively contributing to the problem. For example, section 5.1 is about the decentralization of power in the AI ecosystem. Great, but come on. To listen to this stuff from OpenAI, Anthropic, and Google employees, I need more than just a disclaimer at the end saying, "This research paper represents the author’s own views and conclusions." This is how big companies launder their reputations through research. The first author of the paper posted about it yesterday saying, "In a rare collaboration between top universities and 3 frontier labs..." So which is it? For a paper like this with this kind of author list to honestly and ethically engage in this kind of politics, it would need to seriously confront the question of how much these authors' institutions are actively working against goals like this. If not, the big tech company authors should not have worked on this paper in their formal capacity as representatives of their companies.

English

334

28.6K

Cas (Stephen Casper)@StephenLCasper·1d

@sethlazar @taig_icml Nice archaeology.

English

226

Seth Lazar@sethlazar·1d

@StephenLCasper @taig_icml x.com/KLdivergence/s…

Kristian Lum@KLdivergence

@undersequoias And I think it includes the word "increasingly". cc @sethlazar.

QME

617

Cas (Stephen Casper)@StephenLCasper·2d

As an AC for the ICML Technical AI Governance Research workshop (@taig_icml), I am noticing a trend. A LOT of papers have the word "increasingly" in the first sentence of the abstract. 🙃

English

7.8K

Cas (Stephen Casper)@StephenLCasper·1d

@_NathanCalvin Genuinely glad to hear this. Props to them.

English

398

Nathan Calvin@_NathanCalvin·1d

A very welcome development to see OpenAI endorsing SB 315 - legislation in Illinois that builds on RAISE and SB 53 by requiring mandatory third party audits to verify compliance with safety plans. Night and day with their previous engagement in Illinois pushing for broad immunity for liability for catastrophic risk. When I saw OpenAI endorse third party audits in their Industrial Policy for the Intelligence Age policy paper, I said that the proof of whether this is a real shift would be in their actions on real legislation. This is a positive development on that front! I hope that this is indicative of a broader change in their approach to AI policy engagement.

Max Zeff@ZeffMax

OpenAI is endorsing Illinois bill SB 315, which requires safety reports (similar to laws in California and New York) and third party audits of AI labs. They say all of their state AI policy work these days is in the effort of creating a "consistent, nationwide framework."

English

4.2K

Cas (Stephen Casper)@StephenLCasper·1d

@nabla_theta @taig_icml reddit.com/r/WordAvalanch…

QME

415

Leo Gao@nabla_theta·1d

@StephenLCasper @taig_icml increasingly, the increase in creases in Crease Inc (R) eased.

English

709

Cas (Stephen Casper)@StephenLCasper·1d

@lateinteraction @taig_icml 10/10, I am ashamed I didn't make this joke.

English

357

Omar Khattab@lateinteraction·1d

@StephenLCasper @taig_icml come to think of it, they increasingly do that yeah

English

964

Cas (Stephen Casper) retweetledi

MATS Research@MATSprogram·1d

1/ 🚨 MATS Autumn 2026 applications are now open. 10-week fully-funded fellowship for aspiring AI alignment, security & governance researchers and field-builders. 📍 Berkeley + London 📅 Sep 28 – Dec 4, 2026 💰 $5000/month stipend + $8,000/month compute Apply by June 7 AoE ↓

English

674

105.4K

Cas (Stephen Casper)@StephenLCasper·1d

@wild_and_empty @taig_icml Once I see "increasingly" mentioned, I stop reading and immediately accept the paper.

English

258

Soumya Jain@wild_and_empty·1d

@StephenLCasper @taig_icml has anyone checked whether “increasingly” has predictive power for acceptance? :))

English

349

Cas (Stephen Casper)@StephenLCasper·1d

@S_OhEigeartaigh @taig_icml Lol. But I actually don't know. I wasn't trying to imply this, but it seems possible. But I myself, as a human, also use "increasingly" a lot.

English

545

Seán Ó hÉigeartaigh@S_OhEigeartaigh·1d

Pardon my ignorance, is this another LLM artefact? I'll need to rethink my "Trends in Technical AI Governance Scholarship (ÓhÉigeartaigh et al.) Abstract: Scholarship in technical AI governance increasingly feature the word 'increasingly' in their opening abstract sentence. We posit that Line Keeps going Up and To The Right. To test-"

English

754

Cas (Stephen Casper) retweetledi

Jan Kulveit@jankulveit·2d

I like "positive alignment" in the vibes-space and which egregores I'm more friendly to, but for the sake of basic sanity and academic integrity have to say this paper is also a bad example of strange omissions and misrepresenting prior work. Obviously "Coherent extrapolated volition" (2004) by @allTheYud is a foundational text in "what to align AIs to", and something any proposal for a positive AI vision needs to engage with. I do get that it would be very convenient if a current opponent of the authors in AGI politics weren't also an early theorist of the field who wrote a still-plausible answer to the positive alignment target 20 years ago; but he did. It's not the case people thinking about AGI in past 20 years somehow never thought about the positive vision; just CEV has >200 citations on Google Scholar, dozens of posts on LW discuss the positive targets, and so on. You can claim that the positive direction was recently somewhat neglected, but you can't claim the paradigm was "safety (negative) alignment".

Séb Krier@sebkrier

If anyone builds it, everyone thrives. Over the past decade, a lot of important work on AI alignment has focused on avoiding harm. But freedom from harm isn't the same as freedom to flourish. In this paper, we introduce 'Positive Alignment'. A positively aligned agent is one that helps us navigate our own value trade-offs, builds our resilience, and acts as a scaffold for human flourishing. Doing this without slipping into top-down, technocratic paternalism is the great design challenge of our time. We think a lot more research is now needed to explore this frontier: how do we align models that actively help us thrive? Amazing work by @RubenLaukkonen, @drmichaellevin, @weballergy, @verena_rieser, @AdamCElwood, @996roma, @FranklinMatija, @shamilch, @_fernando_rosas, @scychan_brains, @matybohacek, @sudoraohacker, and others. arxiv.org/abs/2605.10310

English

211

13.8K

Cas (Stephen Casper) retweetledi

Tom Davidson@TomDavidsonX·2d

New paper: research agenda for secret loyalties Imagine a frontier model that has been trained to covertly advance a specific actor's interests (a nation-state, a CEO, an adversary). @joemkwon argues this is an urgent, neglected, and addressable problem. 🧵

English

164

26.3K

Cas (Stephen Casper)@StephenLCasper·2d

@bayeslord @AmmannNora That is neither a position I hold nor the point I’m making. My point is that you should see this paper in the same way you see this presentation: api.org/~/media/files/…

English

300

bayes@bayeslord·2d

@StephenLCasper @AmmannNora Sorry is your position that you don’t think AI can in principle help improve societal flourishing?

English

319

Cas (Stephen Casper)@StephenLCasper·2d

Nice. Also oops -- I need to make a correction. This paper did cite the "A matter of principle? AI alignment as the fair treatment of claims" paper. But it did so in its discussions about alignment and fairness, not in its discussion about pluralism. But still, my bad. That's on me.

English

340

Lawrence Chan@justanotherlaw·2d

@StephenLCasper Yeah, I don't think you were referring to ex-MIRI work. They failed to cite pre-existing work on positive alignment in general, including the pieces you cite.

English

391

Cas (Stephen Casper)@StephenLCasper·2d

Title: fine; a perspective that I generally disagree with, but ok. Author list: fine; not inherently problematic to have people from big tech companies on a paper. The combination of the two: a huge red flag; big companies putting out research suggesting that their products are solutions to societal flourishing is problematic. There is a history of this in, for example, the fossil fuel industry -- e.g. the "Sky 2050" project or the "Powering America Past Impossible" presentation.

English

1.4K

Nora Ammann@AmmannNora·2d

@StephenLCasper > Having the title and author list that it has is not a great start what's wrong with the title and author list? v confused by that comment

English

1.4K

Cas (Stephen Casper)@StephenLCasper·2d

Makes sense, but I was referring to other papers: - "Hard Choices in Artificial Intelligence" - "Don’t ask if artificial intelligence is good or fair, ask how it shifts power" - "Participation is not a Design Fix for Machine Learning" - And even one of Google's own papers: "A matter of principle? AI alignment as the fair treatment of claims"

English

1.8K

Lawrence Chan@justanotherlaw·2d

@StephenLCasper > For example, there is a list of papers out there arguing against this paper's perspective... None of them were mentioned. Yeah, the LW people are justifiably upset that Yudkowsky's CEV wasn't cited, but the related work discussion is bad even by ML academia standards.

English

2.1K

Cas (Stephen Casper) retweetledi

Brandon Stewart@b_m_stewart·2d

1/ New @Nature! We study how powerful institutions shape the information environment for LLMs. Commercial LLM training is opaque, so we trace a path from state-coordinated media -> training data -> model responses.

English

163

24.1K

Keşfet

@pangramlabs @artieart88 @sethlazar @taig_icml @_NathanCalvin @nabla_theta @lateinteraction @wild_and_empty