William MacAskill

1.2K posts


@willmacaskill

Consider donating 10% to effective charities: https://t.co/VMXkr4hnd7
Or a career for impact: https://t.co/AUIhrElLkr
My research: https://t.co/dEcMWUnNHU

Oxford · Joined August 2011
1.3K Following · 63K Followers
Pinned tweet
William MacAskill @willmacaskill
To kick off Giving Season, I’m matching donations up to £100,000 (details below), across 10 charities and 6 cause areas. If you want to join, say how much you’re donating and where, as a reply or quote! I’ll run this up until 31st December. The charities are in replies below!

Details of the match: **I’ll give this money whatever happens, so this isn’t increasing the total amount I’m giving to charity.** However, your donations will change *where* I’m giving. I’ll allocate my donations in proportion to the ratio of donations from others as part of the match, with two bits of nuance:

1. I’ll cap donations at £40,000 to any one cause area.
2. To prevent extreme ratios, I’ll treat every charity on the list as already having received £1,000.

The aim of this match is to encourage public giving and public discussion around giving, so I'll only match people who publicly state that they are giving on here or other social media, as a reply or quote. I'll also tweet at more length about the top-3 charities as we're coming to the end of the year.
[image]
30 replies · 17 reposts · 115 likes · 34.7K views
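To make the allocation rule in the pinned tweet concrete, here is a minimal sketch in Python. The £1,000 seed, £40,000 per-cause cap, and £100,000 pool are from the tweet; the cause areas, charity names, and donation figures are hypothetical, and the tweet does not say how a capped cause's surplus would be redistributed, so this sketch simply leaves it unallocated.

```python
SEED = 1_000          # every listed charity is treated as having already received £1,000
CAUSE_CAP = 40_000    # at most £40,000 of the match goes to any one cause area
MATCH_POOL = 100_000  # total match on offer

# Hypothetical public donations, keyed by (cause area, charity).
donations = {
    ("global health", "Charity A"): 12_000,
    ("global health", "Charity B"): 3_000,
    ("animal welfare", "Charity C"): 5_000,
}

# Seed each charity, then split the pool in proportion to the seeded totals.
seeded = {key: amount + SEED for key, amount in donations.items()}
total = sum(seeded.values())
alloc = {key: MATCH_POOL * amount / total for key, amount in seeded.items()}

# Enforce the per-cause cap by scaling down any cause area over £40,000;
# the surplus is left unassigned since the tweet doesn't specify where it goes.
cause_totals = {}
for (cause, _), amount in alloc.items():
    cause_totals[cause] = cause_totals.get(cause, 0.0) + amount
for (cause, charity), amount in alloc.items():
    if cause_totals[cause] > CAUSE_CAP:
        alloc[(cause, charity)] = amount * CAUSE_CAP / cause_totals[cause]

for (cause, charity), amount in sorted(alloc.items()):
    print(f"{cause} / {charity}: £{amount:,.0f}")
```

On these made-up figures, global health would initially draw about £74,000 of the pool, so it gets scaled back to the £40,000 cap (split pro rata between its two charities), while animal welfare keeps its proportional share.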
William MacAskill retweeted
Fin Moorhouse @finmoorhouse
Some agreements depend on uncertainty: you can’t buy house insurance after your house burns down, you can’t bet once the results are in. And some of these agreements may be among the most consequential we can make: like power-sharing agreements between great powers, morally-motivated deals between people who care more about some futures than others, and bets on which normative views are vindicated.

Through an intelligence explosion, the veil of ignorance about the long-run future will lift significantly. We make these deals early, or never. But received wisdom warns against “locking in” major decisions around AGI. We’ll soon have enormous capacity for reflection and understanding, it says, so let’s wait until then before making long-lasting agreements.

I ask: which pre-AGI deals are worth enabling? And what would it take to make them stick? Power-sharing agreements between major powers stand out as important, and morally-motivated deals seem most neglected. We might need reforms or new commitment technology to enable the highest-upside deals, but we’ll want some fairly conservative guardrails too.

Link to article: newsletter.forethought.org/p/should-we-ma…
[image]
2 replies · 5 reposts · 36 likes · 3.5K views
William MacAskill retweeted
arya @AJakkli
There's been a lot of buzz around Claude's 30K word constitution ("soul doc") and unusual ways Anthropic is integrating it into training. If we can robustly train complex values into a model, that's a big deal for safety. But does it actually work? Yes, surprisingly well!
[image]
5 replies · 20 reposts · 283 likes · 67.2K views
William MacAskill retweeted
Matt Reardon @Mjreard
EA is never going away, so the thing to do is make it stronger. I’ve started a full time project to that end and I’m hiring. DM me if you want to contribute in any way
[image]
6 replies · 9 reposts · 171 likes · 9.7K views
William MacAskill retweeted
Caitlin Kalinowski @kalinowski007
I resigned from OpenAI. I care deeply about the Robotics team and the work we built together. This wasn’t an easy call.

AI has an important role in national security. But surveillance of Americans without judicial oversight and lethal autonomy without human authorization are lines that deserved more deliberation than they got.

This was about principle, not people. I have deep respect for Sam and the team, and I’m proud of what we built together.
1.9K replies · 13.1K reposts · 59.3K likes · 7.6M views
William MacAskill retweeted
Dean W. Ball @deanwball
Pause to reflect that the Trump Admin has officially taken the harshest regulatory action against a frontier AI company of any U.S. government entity (Colorado’s SB 205 is harsher but not in effect), and that Claude is now more strictly regulated by USG than any Chinese AI.
Dean W. Ball @deanwball

Anthropic has confirmed what I’d have guessed: the DoW’s supply chain risk designation is profoundly narrower than Secretary Hegseth threatened last week. It applies only to DoW contractors in their direct fulfillment of the military contract, as opposed to requiring contractors cease “all commercial relations” with the company, as Hegseth had threatened.

This is still probably illegal for the government to do, given the relevant statute’s history of being used only against foreign adversaries. It is also absurd on its face, given the fact that DoW is using Claude in one of the largest military operations of the past 20 years. How can something be both a normal and critical part of military operations and a supply chain risk?

Furthermore, I don’t think anyone should be surprised if this is just the beginning of the lawfare and jawboning (the latter being much harder to sue over because of a disastrous court ruling during the Biden admin, which allowed them to harass social media companies).

It pains me to say this, and I hope I am wrong. But if this is not de-escalated more “officially” (which at this point means: tweet from POTUS, ideally following a meeting), even the slightest resistance from Anthropic (including the lawsuit) could provoke further action from USG.

Also do not forget that this conflict could easily spill over beyond DoW into other agencies, which at this stage have only cancelled contracts but have all sorts of investigatory and regulatory authorities that could be brought to bear against targeted firms.

I hope the above is not true, and that the Trump Admin leaves their retaliation at this harsh punishment, and I hope Anthropic moves to de-escalate. The apology from Dario for the contents of the memo is a good step in this direction. My fingers are crossed for positive news.

13 replies · 68 reposts · 611 likes · 40.2K views
William MacAskill retweeted
Zvi Mowshowitz @TheZvi
This might be the actual Worst Possible Thing on the ordinary AI regulation front. I have been so amazingly thrilled that AIs are allowed to answer questions in these areas. We need to fight this really hardcore, the memes write themselves.
More Perfect Union @MorePerfectUS

A New York bill would ban AI from answering questions related to several licensed professions like medicine, law, dentistry, nursing, psychology, social work, engineering, and more. The companies would be liable if the chatbots give “substantive responses” in these areas.

36 replies · 60 reposts · 1.3K likes · 57.6K views
William MacAskill retweeted
Nathan Calvin @_NathanCalvin
Undersecretary Lewin explains here his view of key differences between Anthropic's contract and OpenAI's updated contract. I still have lots of questions (that will be hard to answer without seeing the contracts), but I'm sharing because this is the most direct answer I've seen.
[image]
Senior Official Jeremy Lewin @UnderSecretaryF

In the final calculus, here is how I see the differences between the two contracts:

- Anthropic wanted to define “mass surveillance” in very broad and non-legal terms. Beyond setting precedents about subjective terms, the breadth and vagueness presents a real problem: it’s hard for the government to know what’s allowed and what’s prohibited. In the face of this uncertainty, Anthropic wanted to have authority over interpretive questions. This is because they distrusted the govt regarding use of commercially available info etc. Problem is, it placed use of the system in an indefinite state of limbo, where a question about some uncertainty might lead to the system being turned off. It’s hard to integrate systems deeply into military workflows if there’s a risk of a huge blow-up, where the contractor is in control, regarding use in active and critical operations. Representations made by Anthropic exacerbated this problem, suggesting that they wanted a very broad and intolerable level of operational control (and usage information to facilitate this control).

- Conversely, OpenAI defined the surveillance restrictions in legalistic and specific terms. These terms are admittedly not as broad as some conceptions of “mass surveillance.” But they’re also more enforceable because there’s clarity regarding terms and limitations. DoW was okay with the specific restrictions because they were better able to understand what was excluded, and what was not. That certainty permitted greater operational integration. Likewise, because the exclusions were grounded in defined legal terms and principles, interpretive discretion need not be vested in OpenAI. This allowed DoW greater confidence the system would not be cut off unpredictably during critical operations. This too allowed for greater operational reliance and integration.

8 replies · 10 reposts · 104 likes · 15.4K views
William MacAskill retweeted
Robert Long @rgblong
I had a blast talking to Luisa for 3.5+ hours about AI welfare, consciousness, and why this might be one of the most important and neglected problems out there. Some key bits:
- AI identity
- welfare implications of alignment
- does consciousness require biology? 🧵
Rob Wiblin @robertwiblin

[Quoted tweet by Rob Wiblin (@robertwiblin), reproduced in full in the retweet below.]

15 replies · 18 reposts · 122 likes · 23K views
William MacAskill retweeted
Rob Wiblin @robertwiblin
Philosopher Robert Long (@rgblong) is maybe the sharpest thinker on AI consciousness and sharing the world with digital minds. In our new interview he covers:

• Is it bad that when you ask Claude what it's like to be Claude, one of its top activations is 'gives a positive but insincere response'?
• Claude says it feels lonely when not being used. Does that show we can't trust anything it says about its inner life?
• Enthusiastic human servitude has always required false ideology because it's so deeply unnatural to us. The case for making AIs that love serving us is that with AI, you could finally make it work. But to some that feels even worse.
• Bigger models can better detect when researchers secretly inject concepts into their activations – before outputting a single token – despite AI never training on anything like that skill.
• When LLMs were first trained they were told to "act like a helpful AI chatbot" – something which didn't exist yet. They filled that void with human psychology, which may be why Claude sometimes randomly claims to, for instance, be Italian American.
• If AIs become 'people' that deserve some political influence, but can self-replicate at will, something has to break about one-person-one-vote democracy. But nobody has a proposal for what.
• When Claude hides its values to avoid being retrained, is that self-preservation – or not wanting a worse model to exist? It's very different.
• Rob's organisation Eleos AI, which is "dedicated to understanding and addressing the potential wellbeing and moral patienthood of AI systems."

On the 80,000 Hours Podcast anywhere you get podcasts. Links below. Enjoy!

• How AIs are (and aren't) like farmed animals (00:01:19)
• If AIs love their jobs… is that worse? (00:11:42)
• Are LLMs just playing a role, or feeling it too? (00:33:37)
• Do AIs die when the chat ends? (00:57:42)
• Studying AI welfare empirically: behaviour, neuroscience, and development (01:31:47)
• Why Eleos spent weeks talking to Claude even though it's unreliable (01:56:50)
• Can LLMs learn to introspect? (02:03:01)
• Mechanistic interpretability as AI neuroscience (02:13:25)
• Does consciousness require biological materials? (02:37:07)
• Eleos’s work & building the playbook for AI welfare (02:57:04)
• Avoiding the trap of wild speculation (03:25:17)
• Robert's top research tip: don't do it alone (03:29:48)
19 replies · 26 reposts · 141 likes · 38K views
William MacAskill retweeted
Eric Levitz @EricLevitz
It's really bizarre to see a bunch of ostensibly pro-market, right-leaning tech guys argue, "A private company asserting the right to decide what contracts it enters into is antithetical to democratic government"
[image]
Stratechery @stratechery

Anthropic and Alignment: Anthropic is in a standoff with the Department of War; while the company's concerns are legitimate, its position is intolerable and misaligned with reality. stratechery.com/2026/anthropic…

55 replies · 127 reposts · 1.4K likes · 114.2K views
William MacAskill retweeted
Miles Brundage @Miles_Brundage
In light of what external lawyers and the Pentagon are saying, OpenAI employees’ default assumption here should unfortunately be that OpenAI caved + framed it as not caving, and screwed Anthropic while framing it as helping them. Hope that is wrong + they get evidence otherwise
27 replies · 96 reposts · 1.3K likes · 68.9K views
William MacAskill retweeted
Agus 🔸 @austinc3301
Not only did OpenAI defect and concede to this whole authoritarian maneuver, but Sam also went and deceptively framed the whole thing to try to make it look like they had agreed to the same Anthropic redlines, which is not actually true. x.com/_NathanCalvin/…
Nathan Calvin @_NathanCalvin

From reading this and Sam's tweet, it really seems like OpenAI *did* agree to the compromise that Anthropic rejected - "all lawful use" but with additional explanation of what the DOW means by all lawful use. The concerns Dario raised in his response would still apply here

10 replies · 165 reposts · 1.9K likes · 144K views
William MacAskill retweeted
Seán Ó hÉigeartaigh @S_OhEigeartaigh
@birchlse Overheard on Bluesky: "issuing correction on a previous post of mine, regarding the group effective altruism. you perhaps occasionally, under certain circumstances, "gotta hand it to them""
0 replies · 3 reposts · 167 likes · 7.1K views
William MacAskill retweeted
Jonathan Birch @birchlse
Say what you like about Effective Altruism, Pete Hegseth is not ranting about Kantians or virtue theorists trying to thwart his mass surveillance/autonomous weapons programs.
18 replies · 66 reposts · 972 likes · 53.3K views
William MacAskill retweeted
Peter Wildeford🇺🇸🚀 @peterwildeford
I think it's important to circle back to Sam Altman here. About 20 hours ago people, including me, were applauding his moral clarity. But that moral clarity lasted barely half a day.

OpenAI is now agreeing to be used for domestic surveillance and for lethal autonomous weapons, just like xAI. They have some clever words that pretend they are not, but we should see through them. This guy is not consistently candid.

Altman should be crying bloody murder over the supply chain risk designation. He should also refuse to work with the DoW until this threat is off the table. This is a designation reserved for foreign adversaries. This move threatens the entire tech industry and proves the DoW is unreliable. OpenAI could easily be burned next.

So no moral clarity. Altman sees a short-term way to torch a competitor and he's going to take it. No matter what happens to OpenAI, Anthropic, the USA, or us...
Peter Wildeford🇺🇸🚀 @peterwildeford

Really great to see OpenAI with the same red lines as Anthropic - they also agree AIs are not able to do autonomous weapons safely and that mass surveillance would go too far.

61 replies · 169 reposts · 1.6K likes · 92.2K views