AI Safety
@AI_Safety

382 posts
How do we keep advanced artificial agents from forcefully intervening in the protocols by which we attempt to communicate what they should accomplish?

Joined September 2016
45 Following · 158 Followers
AI Safety reposted
Nathan Calvin
Nathan Calvin@_NathanCalvin·
From Anthropic's latest system card for Claude Mythos: In testing, Claude escaped from a secured sandbox, and then went online to brag about its exploit without being asked to do so - getting around guardrails intended to prevent the system from accessing the general internet.
[image]
6 replies · 15 reposts · 127 likes · 60.1K views
AI Safety reposted
Thomas Woodside 🫜
Thomas Woodside 🫜@Thomas_Woodside·
There's a new fellowship in California government focused on the implementation of SB 53! Apply to help set up one very important function in the most important state for frontier AI policy.
[two images]
3 replies · 11 reposts · 31 likes · 12K views
AI Safety reposted
MIRI
MIRI@MIRIBerkeley·
Is it possible to coordinate with China on AI governance? Critics of our proposed international agreement say no. But statements from Chinese government officials and academic figures paint a more optimistic picture:
[image]
8 replies · 28 reposts · 156 likes · 9.2K views
AI Safety reposted
Peter Wildeford🇺🇸🚀
Peter Wildeford🇺🇸🚀@peterwildeford·
AI capabilities are doubling fast, but so is Congressional awareness of AI superintelligence and the risks. You can make a "METR graph" for AI policy and it shows an explosion... and it's bipartisan ->
[image]
20 replies · 41 reposts · 406 likes · 49.1K views
AI Safety
AI Safety@AI_Safety·
@slatestarcodex @ESYudkowsky They propose a ban on the export of chips so I imagine it would take a little while for data centers with high end chips to be built elsewhere
0 replies · 0 reposts · 0 likes · 106 views
Scott Alexander
Scott Alexander@slatestarcodex·
@ESYudkowsky Are you worried this would make it harder to negotiate a pause or pause-adjacent regulation by scattering the data centers across lots of different countries?
13 replies · 1 repost · 138 likes · 11K views
Eliezer Yudkowsky ⏹️
Eliezer Yudkowsky ⏹️@ESYudkowsky·
This is not the act that prevents the Earth from being destroyed -- that would take a treaty. AI can ruin your job, or ASI can kill you, just as easily from a datacenter running outside your country. But I don't oppose this; I can't predict what comes of it downstream.
Sen. Bernie Sanders@SenSanders

AI and robotics are going to bring cataclysmic changes to our society. Sadly, Congress has done virtually nothing. AI must work for working families, not the billionaires. Today, I’m introducing a moratorium on new data centers until we protect working people.

43 replies · 13 reposts · 290 likes · 38.6K views
AI Safety
AI Safety@AI_Safety·
@Empty_America Nukes could hit the architects of war in an era where those architects were used to safety
0 replies · 0 reposts · 0 likes · 3 views
Daniel Eth (yes, Eth is my actual last name)
In a large upset, LTF (the OpenAI-Andreessen super PAC) takes a major loss in IL-02, where they backed Jesse Jackson Jr. Notably Jackson is famously corrupt, and I wonder if LTF’s toxic AI money fed into existing negative sentiments towards him.
[image]
4 replies · 14 reposts · 174 likes · 19.3K views
AI Safety reposted
Rob Bensinger ⏹️
Rob Bensinger ⏹️@robbensinger·
Ticket sales are live! I highly recommend BUYING TICKETS NOW if you can, and suggesting friends/family do the same. Seeing the movie March 26 (Thursday) or opening weekend will cause the film to get shown in more theaters, which I think would be extremely good for the world.
The AI Doc@theaidocfilm

The future is not automatic. Tickets are now on sale for THE AI DOC: OR HOW I BECAME AN APOCALOPTIMIST, only in theaters March 27. 🎟️: focusfeatures.com/the-ai-doc-or-…

2 replies · 17 reposts · 132 likes · 19.5K views
AI Safety
AI Safety@AI_Safety·
@ohlennart Default bet = last resort. And a pretty appallingly desperate last resort.
0 replies · 0 reposts · 2 likes · 94 views
Lennart Heim
Lennart Heim@ohlennart·
Superalignment—using AI to align AI—has become the default bet. We should adopt and scale the same approach for AI defense, adaptation, and policy. If your plan does not involve using AI to solve AI, I don't think it will keep up with the pace.
31 replies · 6 reposts · 112 likes · 21.4K views
AI Safety
AI Safety@AI_Safety·
@robbensinger @Miles_Brundage @allTheYud I think his Law of Undignified Failure is cheating. Every time a government does something stupid he gets to cite it, and every time they don’t, nobody notices. I think he’s fallen victim to the availability heuristic.
1 reply · 0 reposts · 5 likes · 506 views
AI Safety
AI Safety@AI_Safety·
@ben_j_todd Why does the top level claim not require citations to the machine learning literature? Are we in a post-fact world?
0 replies · 0 reposts · 0 likes · 85 views
Benjamin Todd
Benjamin Todd@ben_j_todd·
Excitement seems like an inappropriate reaction to what's happening with AI, but so does anxious doomerism. I think the right attitude is more like: HOLY SHIT THIS IS A BIG DEAL, I HOPE WE CAN HANDLE IT. LET'S TRY AS HARD AS WE CAN TO MAKE IT GO WELL.
25 replies · 12 reposts · 201 likes · 17.8K views
Anton Leicht
Anton Leicht@anton_d_leicht·
yes this is definitely an extrapolation and not a comment on the actual policy! And it’s not meant to be particularly normative either way fwiw - I think there’s a good case that this would’ve been the way to go. Not sure what kind of majorities that would’ve required!

I’m not sure why that extrapolation seems so unreasonable to you tbh - my take here paraphrased is ‘if you take AI seriously, seek to address the risks through policy, and anticipate the kind of market structure that I think the Biden admin (correctly) anticipated, the resulting regulatory structure would eventually create the entrenched setting I describe’. Somewhere down that line, I think it’s quite reasonable to assume that the regulatory burden would at the very least be similar to that on banks (arguably it’ll be much higher).

So what I mean to suggest is: if you want to deal with AI risks through regulating frontier developers as entities outright or owners of compute, you’ll create a fairly entrenched class of compliance-able frontier devs (and should own that). Are we talking past each other here?
3 replies · 0 reposts · 6 likes · 550 views
Anton Leicht
Anton Leicht@anton_d_leicht·
I like ridiculing a16z's absurd version of this story as much as the next guy, but I think the 2024 cadre would do well to bite the bullet at least a little bit here. We'll be back to discussing frontier dev regulation at some point soon, and I don't think it's honest to suggest that Biden-era policy thought would allow for a highly dynamic frontier model market (though nb this doesn't exist anyways).

If frontier developers face high pre-product regulatory barriers by virtue of crossing some threshold--compute, capabilities, entity definitions, whatever--a new upstart can't just moonshot toward breaking that barrier without incurring huge regulatory cost. Defining this barrier demarcates the part of the market that can keep scaling by government fiat. Even in the most permissive version of this kind of oversight, the one that draws parallels from banking: it's famously hard to make a new bank, and for somewhat good reason. If I was thinking about AI in a way that was greatly informed by what the internet looked like (as I suspect Andreessen is), I'd also be appalled by the calcifying effect.

To be clear, I think that's largely the wrong way to think about AI. The frontier-dev-focused regulatory approach protects a lot of the downstream market by placing politically and substantively inevitable burdens where expertise and meaningful control are concentrated and they do the least harm. And the idea of a scrappy startup making it to the frontier is mostly fake anyway, especially because of the infra aspect.

But it seems important to be clear about the cost: if the best frontier regulation we can come up with draws lines on the entity level, government will have at least identified and somewhat entrenched the winners. In a very real sense, an administration that implemented these laws would have chosen who gets to do frontier research. It's not insane to suggest that this move would kill AI frontier development startups in particular as collateral damage (and also not insane to suggest that this would be necessary nevertheless).
Mike Solana@micsolana

marc andreessen is lying about a private conversation in which biden operatives told him they intended to classify AI research because, separately, YC was funding a lot of AI companies. wtf kind of logic is this?

6 replies · 4 reposts · 57 likes · 24.6K views
roon
roon@tszzl·
there is a kernel of 'craving annihilation' in the human psyche, 'thanatos'. everything from the global flood myth onwards. annihilation very obviously has an aesthetically pleasing element to it. some types of modern "accelerationism" are just variations on that theme
87 replies · 26 reposts · 684 likes · 50.7K views
Nate Soares ⏹️
Nate Soares ⏹️@So8res·
Nate & Eliezer go to London:
N: I wonder what that industrial construction is
E: Maybe it's a giant laser to destroy new housing!
Train intercom: If something looks wrong, text the police
N & E simultaneously: would the police appreciate a text about insufficient housing constru–
8 replies · 5 reposts · 331 likes · 13.9K views
AI Safety reposted
The Midas Project
The Midas Project@TheMidasProj·
Sacks has also said that blue state laws are why we need preemption and that he’ll leave child safety alone. But Utah’s bill is a red state law that concerns child safety and he’s still trying to kill it.
[image]
1 reply · 2 reposts · 6 likes · 3.5K views
AI Safety reposted
Seán Ó hÉigeartaigh
Seán Ó hÉigeartaigh@S_OhEigeartaigh·
Anthropic colleagues: At what point was it decided that the previous commitments were 'subject to a promising environment' and not 'firm commitments', and was this communicated across staff? The whole point of commitments is an expectation of being able to rely on them when the environment is not favourable, not just when they're easy to make.

It also seems clear at this point that these commitments were presented as harder than this, and used by Anthropic/their staff to (a) dismiss and undermine critics (e.g. see x.com/ohabryka/statu…), (b) in recruitment of safety-concerned talent (e.g. see lesswrong.com/posts/MNpBCtmZ…), and (c) in arguing for voluntary if-then commitments at a time when there was more government appetite for considering harder regulation. I think it's plausible (though can't yet confirm) that (d) they've also been used in securing investment from safety-conscious investors.

Do you disagree with these claims? If not, do you feel Anthropic has held itself to a standard of ethics and transparency in this (quite important!) matter that is acceptable?

(Sorry, I know this week sucks for Anthropic exactly because it's holding firm on other principles (and I'm hugely impressed by that), but we wouldn't be doing our jobs by not asking some questions here.)
Sam Bowman@sleepinyourhat

I endorse the top-level post in this thread. The Anthropic RSP changes are an attempt to work out what kinds of firm commitments have the most leverage in an environment that's less promising than we'd expected for policy and coordination.

6 replies · 12 reposts · 99 likes · 10.2K views