Wei Dai

685 posts

@weidai11

wrote Crypto++, b-money, UDT. thinking about existential safety and metaphilosophy. blogging at https://t.co/mBVFhriJVf

Joined June 2015

122 Following · 8.6K Followers

Pinned Tweet

Wei Dai@weidai11·16 Dec

Among my first reactions upon hearing "artificial superintelligence" were "I can finally get answers to my favorite philosophical problems" followed by "How do I make sure the ASI actually answers them correctly?" Did anyone else react like this?

Wei Dai@weidai11

@janleike You assume that you don't need to solve hard philosophical problems. But the superhuman researcher model probably will need to, right? Seems like a very difficult instance of weak-to-strong generalization, and I'm not sure how you would know whether you've successfully solved it.

Wei Dai@weidai11·2d

@lxrjl @Mihonarium Good to know, and to clarify, I didn't mean to suggest that "Pause" is clearly correct, and therefore Luke must be wrong not to pivot. One possible conclusion is that people should perhaps update less on "MIRI pivoted to Pause" because some MIRI strategy people actually didn't.

alex lawsen@lxrjl·2d

@Mihonarium I think of all of the things one could plausibly criticise Luke for, unwillingness to change his mind might be literally at the bottom of my list.

Mikhail Samin@Mihonarium·2d

interesting take from Wei Dai

Wei Dai@weidai11·13 Apr

@TheDavidSJ @DaystarEld @Vaniver @deanwball @allTheYud Can you perhaps sketch out a plausible scenario in which some person or group leverages that regime of control into permanent authoritarianism?

Wei Dai@weidai11·13 Apr

@TheDavidSJ @DaystarEld @Vaniver @deanwball @allTheYud Curious about this, I found/read thedavidsj.substack.com/p/draconian-me… but it left me wondering, we already give people/institutions great power, eg, control of the US military; "a regime of control over the world’s compute supply" seems rather tame in comparison? Wish I understood this better

Eliezer Yudkowsky@allTheYud·31 Mar

A challenge to @deanwball. Suppose you believed what I believe: If anyone builds ASI, everyone dies (modulo locally irrelevant caveats). Say that Sanders, Trump, Hawley, Blumenthal, and Jinping will all back your policy. What's a smart policy that actually blocks ASI?

Wei Dai@weidai11·10 Apr

@ohabryka @RatOrthodox @ryancareyai I was simplifying but I think "mesaoptimization" was 2018 and before that it was "inner optimization" and "optimization daemon" but both were >=2016. Can't remember or find any other online discussions that clearly fit before 2016 but possibly people talked about it offline.

Oliver Habryka@ohabryka·10 Apr

@weidai11 @RatOrthodox @ryancareyai It wasn't called mesa-optimization, but I am pretty sure people were thinking about it between 2003 and 2016 under various names and in the context of various analogies (the concept is still pretty muddled, so it's not like "mesaoptimization" as a concept brought great clarity).

Ryan Carey@ryancareyai·3 Apr

Absolutely, views in the AI x-risk community are gradually diluting toward "AI is a big deal". One example from January: x.com/davidad/status… Also a lot of people grasping for new things to worry about: "mesaoptimizers", "gradual disempowerment", permanent dictatorship.

David Pinsof@DavidPinsof

Is it just me or has AI doomerism gradually transitioned from "AI will literally kill us all" to "AI will cause bad things to happen / Humans will do stupid things with AI / AI will cause huge changes." If so, this is a very positive development.

Wei Dai@weidai11·10 Apr

@ohabryka @RatOrthodox @ryancareyai My SL4 post was 2003, but AFAIK it was forgotten and reinvented in 2016 by MIRI, so should probably count only 10 years? I remember being at a MIRI workshop when people started talking about mesaoptimization, thinking it was a new MIRI idea. Someone found/referenced my post later

Oliver Habryka@ohabryka·3 Apr

@RatOrthodox @ryancareyai As old as the “molecular squiggle optimizer” stuff, which according to Eliezer predates the paperclip stuff, so like 20 years?

Wei Dai@weidai11·27 Mar

I should mention that this was inspired by a similar complexity analogy in AI Safety via Debate by @geoffreyirving @paulfchristiano @DarioAmodei, and by @steve47285's discussion of an analogous alignment tax in Christiano's Approval-Directed Agents.

Wei Dai@weidai11·27 Mar

LW discussion about this: lesswrong.com/posts/RKtTi82t… More about the computational complexity analogy: lesswrong.com/posts/EByDsY9S…

Wei Dai@weidai11·27 Mar

One idea is that humans are "aligned" via social status, but this causes an "alignment tax" forcing the smartest humans to work in domains where their results can be evaluated by others less intelligent. Unfortunately, while math is in NP, philosophy is more like EXPTIME.

Wei Dai@weidai11

In 2012 I wrote about making 10⁵ clones of von Neumann as a Singularity strategy, but now I think we should first figure out why he spent more time/effort on math/computing instead of philosophy and long-horizon strategy, which are the real bottlenecks to having a good future.

Wei Dai@weidai11·26 Mar

@RichardMCNgo Your intuition pump doesn't seem to rule out that "trust" can be solved in a straightforward CS/engineering kind of way, or is OOMs less costly than value differences. Don't understand pause=central planning. I think market economies can be regulated to prevent AI development?

Richard Ngo@RichardMCNgo·26 Mar

@weidai11 One intuition pump re trust: imagine that you could never assume common knowledge of rationality. Then to do most game theory you’d need some weaker replacement concept. This is one facet of “trust”. An AI pause seems good short-term. Long-term it’d be de facto central planning.

Richard Ngo@RichardMCNgo·25 Mar

It’s helpful to think of rationalists as High Modernists specifically about the future. No human was smart enough to successfully plan an economy or a society. But if we hypothesize an AI intelligent enough to do so, we can hold on to many of the same technocratic intuitions.

Wei Dai@weidai11·26 Mar

@RichardMCNgo x.com/weidai11/statu…

Wei Dai@weidai11

@RichardMCNgo To be clear, I'm advocating an AI pause to give more time to think about issues like these, not a revolution to finally do central planning right. :)

Richard Ngo@RichardMCNgo·26 Mar

@weidai11 It’s possible! And also from Marx’s perspective it was possible that past central planning failures had been obsoleted by the Industrial Revolution. But it was hard for Marx to know how much econ he didn’t know, and it’s hard for us to know how much sociopolitics we don’t know.

Wei Dai@weidai11·26 Mar

@RichardMCNgo To be clear, I'm advocating an AI pause to give more time to think about issues like these, not a revolution to finally do central planning right. :)

Wei Dai@weidai11·26 Mar

@RichardMCNgo Yeah, I don't understand your position that these are big deals. From my perspective, value conflicts explain ~all human coordination failures, and what's left if you get rid of it is more like CS/engineering, with relatively straightforward low-cost solutions.

Wei Dai@weidai11·26 Mar

@RichardMCNgo If 2 agents have same values (including no indexical values), then why fight for control instead of, e.g., flipping a coin to decide whose strategy to go with? You might still need some back-and-forth to resolve race conditions etc, but standard solutions are available from CS?

Richard Ngo@RichardMCNgo·26 Mar

@weidai11 Re your specific point: I think that you get explosions in game theoretic complexity even with the same values if you lack trust. E.g. imagine you lack common knowledge that you’re pursuing the same strategy to achieve your shared goals. Then you might fight for control.

Wei Dai@weidai11·26 Mar

@kanzure Ah yes, the classic Leeroy Jenkins approach to problem solving. Really wish I had been the one to realize that it's a perfect fit for navigating a technological Singularity.

Bryan Bishop@kanzure·25 Mar

@weidai11 No, you should have made human clones and instead you didn't.

Wei Dai@weidai11·25 Mar

John David Pressman@jd_pressman

My position remains that we should exhume Neumann and clone him.

Wei Dai@weidai11·25 Mar

@RokoMijic I'm not sure that he would be good at philosophy but I'm guessing maybe he could have been if he tried? Anyway I think you may be misinterpreting what I was trying to say in the OP.

Roko 🐉@RokoMijic·25 Mar

@weidai11 Why do you think Von Neumann is good at philosophy? He's a mathematical genius but not a well-known philosopher. We should have cryopreserved Dan Dennett, who was easily the best philosopher of AI...

Wei Dai@weidai11·25 Mar

@RokoMijic even more prone to reward gaming, are even worse at doing philosophy, are liable to have even more alien values, etc.

Wei Dai@weidai11·25 Mar

@RokoMijic From my linked post: What about getting help from AI? Well they seem to suffer from many of the same safety problems, but in even more severe forms. E.g., current AI capabilities are even more skewed towards short-horizon, easily verifiable tasks, like math and coding. They seem

Wei Dai@weidai11·25 Mar

@allTheYud @jd_pressman x.com/weidai11/statu…

Wei Dai@weidai11
