Nirit Weiss-Blatt, PhD
@DrTechlash
2.1K posts

Communication Researcher, analyzing the tech discourse. Book Author: The TECHLASH. Substack: https://t.co/4SJJhqrzXn Signal: DrTechlash.16

Cupertino, CA · Joined November 2020
210 Following · 6.4K Followers

Perry E. Metzger @perrymetzger:
@DavidSKrueger How big a step is it from rhetoric like this to advocating for violence against your ideological opponents?

David Krueger @DavidSKrueger:
I 100% stand by my comment. People who KNOWINGLY and DELIBERATELY downplay or distract from AI risks are traitors to humanity.

Entropy☃️Chase @EntropyChase:
@DavidSKrueger I find it concerning to call people who disagree with you about a technology that doesn't even exist yet "traitors to humanity"


Nirit Weiss-Blatt, PhD @DrTechlash:
@DmitriyLeybel The AI-generated interview was a lame trick. But apart from the fabricated Dario segment, the rest of the article (more than 40 pages if you print it out) is not AI-generated; it's his reporting.

dmitriy @DmitriyLeybel:
@DrTechlash It's all fake. Pure confabulation of a deranged journalist.
[image]

Nirit Weiss-Blatt, PhD @DrTechlash:
Anthropic: Preaching doom and laughing at us all

Anthropic engineers told a reporter, "AI is going to end the world in five years"! He left the room with the recorder still running and caught them on tape laughing about telling their visitors these stories. They thought "it was hilarious." 1/2
[image]

Nirit Weiss-Blatt, PhD @DrTechlash:
Trenton Bricken: member of the technical staff on the Alignment Science team at Anthropic. He told the reporter that he "stopped contributing to his 401(k) because he only plans around a five-year event horizon."

Kyle Fish: Anthropic hired him for a special role responsible for "model welfare." He recently estimated "a 20% chance that today's large language models [LLMs] have some form of conscious experience."

Danielle Ghiglieri: communications director, leading Anthropic's media relations.

2/2 vanityfair.com/news/story/dar…
[image]

Nirit Weiss-Blatt, PhD @DrTechlash:
There's a point in Anthropic's research that shows a lack of self-awareness. Globally, 67% of people view AI positively, but AI optimism runs higher in Asia, South America, and Africa than in the U.S. Here's one plausible explanation: your own CEO's fear-mongering campaign. anthropic.com/features/81k-i…
[image]

Nirit Weiss-Blatt, PhD @DrTechlash:
I can't believe I'm going to praise Anthropic's study, but it's truly insightful. I admire the scope of the data they managed to gather and analyze (N = 80,508 people, 159 countries, 70 languages). I also "count myself lucky that I am living in a timeline where I can experience AI."
[image]

Joe Allen @JOEBOTxyz:
@DrTechlash Trying to wrap my head around this witch trial. Is the inquisitor's position that CAIS is dictated by EAs?

Nirit Weiss-Blatt, PhD @DrTechlash:
In an interview with the Big Technology podcast, CAIS co-founder, Dan Hendrycks, was asked, "Center for AI Safety. Who's funding it?" and answered, "The main funder would be Jaan Tallinn." Tallinn is a well-known Effective Altruism billionaire. Switching from the previous EA billionaires—Dustin Moskovitz and Sam Bankman-Fried—to Jaan Tallinn still means CAIS is primarily funded by EA. Oh, and Tallinn is also listed in CAIS's 990 forms as a CAIS Director.
[image]

Center for AI Safety @CAIS:
To clarify, the Center for AI Safety has not taken funding from Coefficient Giving / Open Philanthropy for years. We believe the effective altruism movement is, unfortunately, controlled opposition. The less influence it has on AI safety, the better.


Joe Allen @JOEBOTxyz:
@DrTechlash Just imagine a world in which someone disagrees with his initial institutional backers. It's easy if you try.

Nirit Weiss-Blatt, PhD @DrTechlash:
@CAIS Since @JordanSchachtel’s exposé on Humans First, both Joe Allen and CAIS have implemented the top three crisis response strategies. But they don't really work here or convince anyone.
[image]

🎭 @deepfates:
Uhh is the agentic misalignment paper actually propaganda?
[image]

Nathan Calvin @_NathanCalvin:
This passage in the New Yorker piece on the Anthropic DOW conflict yesterday, including a back and forth between the journalist (Gideon Lewis-Kraus) and an anonymous admin official, is gonna stick in my mind for a long time.

“We must also remember that Cyberdyne Systems created Skynet for the government. It was supposed to help America dominate its enemies. It didn’t exactly work out as planned. The government thinks this is absurd. But the Pentagon has not tried to build an aligned A.I., and Anthropic has. Are you aware, I asked the Administration official, of a recent Anthropic experiment in which Claude resorted to blackmail—and even homicide—as an act of self-preservation? It had been carried out explicitly to convince people like him. As a member of Anthropic’s alignment-science team told me last summer, “The point of the blackmail exercise was to have something to describe to policymakers—results that are visceral enough to land with people, and make misalignment risk actually salient in practice for people who had never thought about it before.”

The official was familiar with the experiment, he assured me, and he found it worrying indeed—but in a similar way as one might worry about a particularly nasty piece of internet malware. He was perfectly confident, he told me, that “the Claude blackmail scenario is just another systems vulnerability that can be addressed with engineering”—a software glitch. Maybe he’s right. We might get only one chance to find out.”

I really recommend everyone read both the full New Yorker piece and Anthropic’s research on persona selection (both linked in the replies) and then spend a while sitting with the disconcerting situation we may have found ourselves in.


Jeffrey Ladish @JeffLadish:
@DrTechlash I've changed my mind since 2023 on open weight models, so not sure we disagree about that: x.com/JeffLadish/sta…

Jeffrey Ladish @JeffLadish:
Slightly tangential note: When Llama 2 came out, I said I thought the government should ban it. This understandably pissed a bunch of people off, including @1a3orn, who wrote about this on their blog, the one recently cited by David Sacks.

I still think Meta was quite irresponsible in their initial open weights release strategy, back when they were doing minimal to no safety evaluations, including on bio-uplift. And back when almost no one knew how to assess model capabilities! But I do think I should lose some bayes points for predicting Llama 2 would be dangerous when it in fact wasn't. I posted in Feb 2024 about my updated position: x.com/JeffLadish/sta…

I still think there should be gov oversight of open weight models, and that open weight models should be assessed for bioweapon uplift capabilities and autonomous self-replication capabilities. It seems obvious that at some point open weight models will be powerful enough to provide significant bioweapon uplift, given frontier (closed) models already seem there, though I haven't evaluated these capabilities directly.

Another thing I didn't model well, related to my thread above, was how open weight models would have less overall impact on society, whether we're talking about use in phishing, hacking, propaganda, etc., because frontier models would be quite usable for these purposes, and in general better at them. It is in fact the case that models are used a lot for phishing, but I expect people use OpenAI's models for this more than, say, Llama, simply because OpenAI's models are better. Over time, I expect companies to get better at tuning their models to selectively refuse to help with malicious tasks while still being helpful on most things, so maybe we'll see bad actors use relatively more open weight models. But I'm much less convinced of this than I was back in 2023.

And also, I just care a lot less about narrow misuse of models for phishing / hacking / etc. than I did in 2023. I've also thought that loss of control is the greater issue, but I'm now willing to bite a lot more bullets to prioritize that. I see AI mass persuasion & influence as a lot more central to loss of control dynamics than phishing or hacking uplift, which is why I focus on it in the above thread. Likewise, I've updated on the usefulness of open weight models for research related to alignment and loss of control risk modeling, so I've also updated towards the benefits of open weight models.

Bioweapon uplift threats are potentially catastrophic enough that I treat them differently than other misuse issues. Bioweapons / synthetic organisms are one of few real existential threats to human survival, and I think we should continue to take them seriously.


The AI Doc @theaidocfilm:
Who's really in control? THE AI DOC: OR HOW I BECAME AN APOCALOPTIMIST is only in theaters March 27.