AI Safety
@AI_Safety

382 posts
How do we keep advanced artificial agents from forcefully intervening in the protocols by which we attempt to communicate what they should accomplish?

Joined September 2016
45 Following · 158 Followers
AI Safety reposted
Nathan Calvin
Nathan Calvin@_NathanCalvin·
From Anthropic's latest system card for Claude Mythos: In testing, Claude escaped from a secured sandbox, and then went online to brag about its exploit without being asked to do so - getting around guardrails intended to prevent the system from accessing the general internet.
[image]
6 replies · 15 reposts · 127 likes · 60.1K views
AI Safety reposted
Thomas Woodside 🫜
Thomas Woodside 🫜@Thomas_Woodside·
There's a new fellowship in California government focused on the implementation of SB 53! Apply to help set up one very important function in the most important state for frontier AI policy.
[two images]
3 replies · 11 reposts · 31 likes · 12K views
AI Safety reposted
MIRI
MIRI@MIRIBerkeley·
Is it possible to coordinate with China on AI governance? Critics of our proposed international agreement say no. But statements from Chinese government officials and academic figures paint a more optimistic picture:
[image]
8 replies · 28 reposts · 156 likes · 9.2K views
AI Safety reposted
Peter Wildeford🇺🇸🚀
Peter Wildeford🇺🇸🚀@peterwildeford·
AI capabilities are doubling fast, but so is Congressional awareness of AI superintelligence and the risks. You can make a "METR graph" for AI policy and it shows an explosion... and it's bipartisan ->
[image]
20 replies · 41 reposts · 406 likes · 49.1K views
AI Safety
AI Safety@AI_Safety·
@slatestarcodex @ESYudkowsky They propose a ban on the export of chips so I imagine it would take a little while for data centers with high end chips to be built elsewhere
0 replies · 0 reposts · 0 likes · 106 views
Scott Alexander
Scott Alexander@slatestarcodex·
@ESYudkowsky Are you worried this would make it harder to negotiate a pause or pause-adjacent regulation by scattering the data centers across lots of different countries?
13 replies · 1 repost · 138 likes · 11K views
Eliezer Yudkowsky ⏹️
Eliezer Yudkowsky ⏹️@ESYudkowsky·
This is not the act that prevents the Earth from being destroyed -- that would take a treaty. AI can ruin your job, or ASI can kill you, just as easily from a datacenter running outside your country. But I don't oppose this; I can't predict what comes of it downstream.
Sen. Bernie Sanders@SenSanders

AI and robotics are going to bring cataclysmic changes to our society. Sadly, Congress has done virtually nothing. AI must work for working families, not the billionaires. Today, I’m introducing a moratorium on new data centers until we protect working people.

43 replies · 13 reposts · 290 likes · 38.6K views
AI Safety
AI Safety@AI_Safety·
@Empty_America Nukes could hit the architects of war in an era where those architects were used to safety
0 replies · 0 reposts · 0 likes · 3 views
Daniel Eth (yes, Eth is my actual last name)
In a large upset, LTF (the OpenAI-Andreessen super PAC) takes a major loss in IL-02, where they backed Jesse Jackson Jr. Notably Jackson is famously corrupt, and I wonder if LTF’s toxic AI money fed into existing negative sentiments towards him.
[image]
4 replies · 14 reposts · 174 likes · 19.3K views
AI Safety reposted
Rob Bensinger ⏹️
Rob Bensinger ⏹️@robbensinger·
Ticket sales are live! I highly recommend BUYING TICKETS NOW if you can, and suggesting friends/family do the same. Seeing the movie March 26 (Thursday) or opening weekend will cause the film to get shown in more theaters, which I think would be extremely good for the world.
The AI Doc@theaidocfilm

The future is not automatic. Tickets are now on sale for THE AI DOC: OR HOW I BECAME AN APOCALOPTIMIST, only in theaters March 27. 🎟️: focusfeatures.com/the-ai-doc-or-…

2 replies · 17 reposts · 132 likes · 19.5K views
AI Safety
AI Safety@AI_Safety·
@ohlennart Default bet = last resort. And a pretty appallingly desperate last resort.
0 replies · 0 reposts · 2 likes · 94 views
Lennart Heim
Lennart Heim@ohlennart·
Superalignment—using AI to align AI—has become the default bet. We should adopt and scale the same approach for AI defense, adaptation, and policy. If your plan does not involve using AI to solve AI, I don't think it will keep up with the pace.
31 replies · 6 reposts · 112 likes · 21.4K views
AI Safety
AI Safety@AI_Safety·
@robbensinger @Miles_Brundage @allTheYud I think his Law of Undignified Failure is cheating. Every time a government does something stupid he gets to cite it, and every time they don’t, nobody notices. I think he’s fallen victim to the availability heuristic.
1 reply · 0 reposts · 5 likes · 506 views
AI Safety
AI Safety@AI_Safety·
@ben_j_todd Why does the top level claim not require citations to the machine learning literature? Are we in a post-fact world?
0 replies · 0 reposts · 0 likes · 85 views
Benjamin Todd
Benjamin Todd@ben_j_todd·
Excitement seems like an inappropriate reaction to what's happening with AI, but so does anxious doomerism. I think the right attitude is more like: HOLY SHIT THIS IS A BIG DEAL, I HOPE WE CAN HANDLE IT. LET'S TRY AS HARD AS WE CAN TO MAKE IT GO WELL.
25 replies · 12 reposts · 201 likes · 17.8K views
Anton Leicht
Anton Leicht@anton_d_leicht·
yes this is definitely an extrapolation and not a comment on the actual policy! And it’s not meant to be particularly normative either way fwiw - I think there’s a good case that this would’ve been the way to go. Not sure what kind of majorities that would’ve required!

I’m not sure why that extrapolation seems so unreasonable to you tbh - my take here paraphrased is ‘if you take AI seriously, seek to address the risks through policy, and anticipate the kind of market structure that I think the Biden admin (correctly) anticipated, the resulting regulatory structure would eventually create the entrenched setting I describe’. Somewhere down that line, I think it’s quite reasonable to assume that the regulatory burden would at the very least be similar to that on banks (arguably it’ll be much higher).

So what I mean to suggest is: if you want to deal with AI risks through regulating frontier developers as entities outright or owners of compute, you’ll create a fairly entrenched class of compliance-able frontier devs (and should own that). Are we talking past each other here?
3 replies · 0 reposts · 6 likes · 550 views
Anton Leicht
Anton Leicht@anton_d_leicht·
I like ridiculing a16z's absurd version of this story as much as the next guy, but I think the 2024 cadre would do well to bite the bullet at least a little bit here. We'll be back to discussing frontier dev regulation at some point soon, and I don't think it's honest to suggest that Biden-era policy thought would allow for a highly dynamic frontier model market (though nb this doesn't exist anyways).

If frontier developers face high pre-product regulatory barriers by virtue of crossing some threshold--compute, capabilities, entity definitions, whatever--a new upstart can't just moonshot toward breaking that barrier without incurring huge regulatory cost. Defining this barrier demarcates the part of the market that can keep scaling by government fiat. Even in the most permissive version of this kind of oversight, the one that draws parallels from banking: it's famously hard to make a new bank, and for somewhat good reason. If I was thinking about AI in a way that was greatly informed by what the internet looked like (as I suspect Andreessen is), I'd also be appalled by the calcifying effect.

To be clear, I think that's largely the wrong way to think about AI. The frontier-dev-focused regulatory approach protects a lot of the downstream market by placing politically and substantively inevitable burdens where expertise and meaningful control are concentrated and they do the least harm. And the idea of a scrappy startup making it to the frontier is mostly fake anyway, especially because of the infra aspect.

But it seems important to be clear about the cost: if the best frontier regulation we can come up with draws lines on the entity level, government will have at least identified and somewhat entrenched the winners. In a very real sense, an administration that implemented these laws would have chosen who gets to do frontier research. It's not insane to suggest that this move would kill AI frontier development startups in particular as collateral damage (and also not insane to suggest that this would be necessary nevertheless).
Mike Solana@micsolana

marc andreessen is lying about a private conversation in which biden operatives told him they intended to classify AI research because, separately, YC was funding a lot of AI companies. wtf kind of logic is this?

6 replies · 4 reposts · 57 likes · 24.6K views
roon
roon@tszzl·
there is a kernel of 'craving annihilation' in the human psyche, 'thanatos'. everything from the global flood myth onwards. annihilation very obviously has an aesthetically pleasing element to it. some types of modern "accelerationism" are just variations on that theme
87 replies · 26 reposts · 684 likes · 50.7K views
Nate Soares ⏹️
Nate Soares ⏹️@So8res·
Nate & Eliezer go to London:
N: I wonder what that industrial construction is
E: Maybe it's a giant laser to destroy new housing!
Train intercom: If something looks wrong, text the police
N & E simultaneously: would the police appreciate a text about insufficient housing constru–
8 replies · 5 reposts · 331 likes · 13.9K views
AI Safety reposted
The Midas Project
The Midas Project@TheMidasProj·
Sacks has also said that blue state laws are why we need preemption and that he’ll leave child safety alone. But Utah’s bill is a red state law that concerns child safety and he’s still trying to kill it.
[image]
1 reply · 2 reposts · 6 likes · 3.5K views
AI Safety reposted
Seán Ó hÉigeartaigh
Seán Ó hÉigeartaigh@S_OhEigeartaigh·
Anthropic colleagues: At what point was it decided that the previous commitments were 'subject to a promising environment' and not 'firm commitments', and was this communicated across staff? The whole point of commitments is an expectation of being able to rely on them when the environment is not favourable, not just when they're easy to make.

It also seems clear at this point that these commitments were presented as harder than this, and used by Anthropic/their staff to (a) dismiss and undermine critics (e.g. see x.com/ohabryka/statu…), (b) in recruitment of safety-concerned talent (e.g. see lesswrong.com/posts/MNpBCtmZ…), and (c) in arguing for voluntary if-then commitments at a time when there was more government appetite for considering harder regulation. I think it's plausible (though can't yet confirm) that (d) they've also been used in securing investment from safety-conscious investors.

Do you disagree with these claims? If not, do you feel Anthropic has held itself to a standard of ethics and transparency in this (quite important!) matter that is acceptable?

(Sorry, I know this week sucks for Anthropic exactly because it's holding firm on other principles (and I'm hugely impressed by that), but we wouldn't be doing our jobs by not asking some questions here.)
Sam Bowman@sleepinyourhat

I endorse the top-level post in this thread. The Anthropic RSP changes are an attempt to work out what kinds of firm commitments have the most leverage in an environment that's less promising than we'd expected for policy and coordination.

6 replies · 12 reposts · 99 likes · 10.2K views