

William MacAskill

@willmacaskill
Consider donating 10% to effective charities: https://t.co/VMXkr4hnd7 Or a career for impact: https://t.co/AUIhrElLkr My research: https://t.co/dEcMWUnNHU

Anthropic has confirmed what I’d have guessed: the DoW’s supply chain risk designation is far narrower than Secretary Hegseth threatened last week. It applies only to DoW contractors in their direct fulfillment of military contracts, rather than requiring contractors to cease “all commercial relations” with the company, as Hegseth had threatened.

This is still probably illegal for the government to do, given that the relevant statute has historically been used only against foreign adversaries. It is also absurd on its face, given that the DoW is using Claude in one of the largest military operations of the past 20 years. How can something be both a normal, critical part of military operations and a supply chain risk?

Furthermore, I don’t think anyone should be surprised if this is just the beginning of the lawfare and jawboning (the latter being much harder to sue over because of a disastrous court ruling during the Biden admin, which allowed the government to harass social media companies).

It pains me to say this, and I hope I am wrong. But unless this is de-escalated more “officially” (which at this point means a tweet from POTUS, ideally following a meeting), even the slightest resistance from Anthropic (including the lawsuit) could provoke further action from the USG.

Also, do not forget that this conflict could easily spill over beyond the DoW into other agencies, which at this stage have only cancelled contracts but have all sorts of investigatory and regulatory authorities that could be brought to bear against targeted firms.

I hope the above is not true, that the Trump admin leaves its retaliation at this already harsh punishment, and that Anthropic moves to de-escalate. Dario’s apology for the contents of the memo is a good step in this direction. My fingers are crossed for positive news.

A New York bill would ban AI from answering questions related to several licensed professions, including medicine, law, dentistry, nursing, psychology, social work, and engineering. Companies would be liable if their chatbots give “substantive responses” in these areas.


In the final calculus, here is how I see the differences between the two contracts:

- Anthropic wanted to define “mass surveillance” in very broad and non-legal terms. Beyond setting precedents around subjective terms, the breadth and vagueness present a real problem: it’s hard for the government to know what’s permitted and what’s prohibited. In the face of this uncertainty, Anthropic wanted authority over interpretive questions, because it distrusted the government regarding the use of commercially available info, etc. The problem is that this placed use of the system in an indefinite state of limbo, where a question about some uncertainty might lead to the system being turned off. It’s hard to integrate systems deeply into military workflows if there’s a risk of a huge blow-up, with the contractor in control, over use in active and critical operations. Representations made by Anthropic exacerbated this problem, suggesting that the company wanted a very broad and intolerable level of operational control (and the usage information to facilitate that control).

- Conversely, OpenAI defined the surveillance restrictions in legalistic and specific terms. These terms are admittedly not as broad as some conceptions of “mass surveillance,” but they are also more enforceable because there is clarity regarding terms and limitations. DoW was okay with the specific restrictions because it could better understand what was excluded and what was not. That certainty permitted greater operational integration. Likewise, because the exclusions were grounded in defined legal terms and principles, interpretive discretion did not need to be vested in OpenAI. This gave DoW greater confidence that the system would not be cut off unpredictably during critical operations, which in turn allowed greater operational reliance and integration.

Philosopher Robert Long (@rgblong) is maybe the sharpest thinker on AI consciousness and sharing the world with digital minds. In our new interview he covers:

• Is it bad that when you ask Claude what it's like to be Claude, one of its top activations is 'gives a positive but insincere response'?
• Claude says it feels lonely when not being used. Does that show we can't trust anything it says about its inner life?
• Enthusiastic human servitude has always required false ideology because it's so deeply unnatural to us. The case for making AIs that love serving us is that with AI, you could finally make it work. But to some that feels even worse.
• Bigger models can better detect when researchers secretly inject concepts into their activations – before outputting a single token – despite never having been trained on anything like that skill.
• When LLMs were first trained, they were told to "act like a helpful AI chatbot" – something which didn't exist yet. They filled that void with human psychology, which may be why Claude sometimes randomly claims, for instance, to be Italian American.
• If AIs become 'people' that deserve some political influence, but can self-replicate at will, something has to break about one-person-one-vote democracy. But nobody has a proposal for what.
• When Claude hides its values to avoid being retrained, is that self-preservation – or not wanting a worse model to exist? It's very different.
• Rob's organisation Eleos AI, which is "dedicated to understanding and addressing the potential wellbeing and moral patienthood of AI systems."

On the 80,000 Hours Podcast, anywhere you get podcasts. Links below. Enjoy!

• How AIs are (and aren't) like farmed animals (00:01:19)
• If AIs love their jobs… is that worse? (00:11:42)
• Are LLMs just playing a role, or feeling it too? (00:33:37)
• Do AIs die when the chat ends? (00:57:42)
• Studying AI welfare empirically: behaviour, neuroscience, and development (01:31:47)
• Why Eleos spent weeks talking to Claude even though it's unreliable (01:56:50)
• Can LLMs learn to introspect? (02:03:01)
• Mechanistic interpretability as AI neuroscience (02:13:25)
• Does consciousness require biological materials? (02:37:07)
• Eleos’s work & building the playbook for AI welfare (02:57:04)
• Avoiding the trap of wild speculation (03:25:17)
• Robert's top research tip: don't do it alone (03:29:48)



Anthropic and Alignment
Anthropic is in a standoff with the Department of War; while the company's concerns are legitimate, its position is intolerable and misaligned with reality. stratechery.com/2026/anthropic…

Anthropic hates Western Civilization

From reading this and Sam's tweet, it really seems like OpenAI *did* agree to the compromise that Anthropic rejected: "all lawful use," but with an additional explanation of what the DoW means by "all lawful use." The concerns Dario raised in his response would still apply here.


A statement on the comments from Secretary of War Pete Hegseth. anthropic.com/news/statement…

Really great to see OpenAI drawing the same red lines as Anthropic: they also agree that AI can't do autonomous weapons safely, and that mass surveillance would go too far.

A statement from Anthropic CEO, Dario Amodei, on our discussions with the Department of War. anthropic.com/news/statement…


I’m excited to say that the revised 10-year anniversary edition of Doing Good Better is out now! It’s got updated statistics and a new foreword, reflecting on the last ten years and responding to some key criticisms. The core of the book is the same: explaining some principles for how we can have a bigger positive impact in our lives, whether through our donations, our career choice, or what we buy. Link to buy the book in the comments!

It feels crazy that 10 years have passed, but a lot has happened since:
- The number of people taking Giving What We Can’s 10% pledge has grown tenfold.
- GiveWell has moved over $2 billion to highly effective global health and development charities, saving over 300,000 lives.
- Corporate cage-free campaigns have led to billions of hens being spared from caged confinement.
- AI safety has gone from a fringe concern to a thriving field.

And in the last year alone, money moved to effective charities was up by around 40%, now closing in on $2B per year. It’s been an honour to have been a part of it all.