

Nirit Weiss-Blatt, PhD

@DrTechlash
Communication Researcher, analyzing the tech discourse. Book Author: The TECHLASH. Substack: https://t.co/4SJJhqrzXn Signal: DrTechlash.16




@DavidSKrueger I find it concerning to call people who disagree with you about a technology that doesn't even exist yet "traitors to humanity"

"It appears that Anthropic has made a communications decision to distance itself from the EA community, likely because of negative associations the EA brand has in some circles." A message to Daniela Amodei: "If you want to distance yourself from EA, do it and be honest. If you'd rather not comment, don't comment. But don't obfuscate and lie pretending you don't know about EA and downplay the movement." forum.effectivealtruism.org/posts/53Gc35vD…



To clarify, the Center for AI Safety has not taken funding from Coefficient Giving / Open Philanthropy for years. We believe the effective altruism movement is, unfortunately, controlled opposition. The less influence it has on AI safety, the better.





EA ≠ AI safety
AI safety has outgrown the EA community
The world will be safer with a broad range of people tackling many different AI risks




Uhh is the agentic misalignment paper actually propaganda?

Slightly tangential note: When Llama 2 came out, I said I thought the government should ban it. This understandably pissed a bunch of people off, including @1a3orn, who wrote about it on their blog, the one recently cited by David Sacks.

I still think Meta was quite irresponsible in their initial open-weights release strategy, back when they were doing minimal to no safety evaluations, including on bio-uplift, and back when almost no one knew how to assess model capabilities! But I do think I should lose some Bayes points for predicting Llama 2 would be dangerous when it in fact wasn't. I posted in Feb 2024 about my updated position: x.com/JeffLadish/sta…

I still think there should be gov oversight of open-weight models, and that open-weight models should be assessed for bioweapon-uplift capabilities and autonomous self-replication capabilities. It seems obvious that at some point open-weight models will be powerful enough to provide significant bioweapon uplift, given frontier (closed) models already seem there, though I haven't evaluated these capabilities directly.

Another thing I didn't model well, related to my thread above, was how open-weight models would have less overall impact on society, whether we're talking about use in phishing, hacking, propaganda, etc., because frontier models would be quite usable for these purposes and in general better at them. It is in fact the case that models are used a lot for phishing, but I expect people use OpenAI's models for this more than, say, Llama, simply because OpenAI's models are better.

Over time, I expect companies to get better at tuning their models to selectively refuse to help with malicious tasks while still being helpful on most things, so maybe we'll see bad actors use relatively more open-weight models. But I'm much less convinced of this than I was back in 2023. And I also just care a lot less about narrow misuse of models for phishing / hacking / etc. than I did in 2023.
I've also long thought that loss of control is the greater issue, but I'm now willing to bite a lot more bullets to prioritize it. I see AI mass persuasion & influence as a lot more central to loss-of-control dynamics than phishing or hacking uplift, which is why I focus on it in the above thread.

Likewise, I've updated on the usefulness of open-weight models for research related to alignment and loss-of-control risk modeling, so I've also updated toward the benefits of open-weight models.

Bioweapon-uplift threats are potentially catastrophic enough that I treat them differently from other misuse issues. Bioweapons / synthetic organisms are one of the few real existential threats to human survival, and I think we should continue to take them seriously.