Wei Dai

685 posts

@weidai11

wrote Crypto++, b-money, UDT. thinking about existential safety and metaphilosophy. blogging at https://t.co/mBVFhriJVf

Joined June 2015
122 Following · 8.6K Followers
Pinned Tweet
Wei Dai@weidai11·
Among my first reactions upon hearing "artificial superintelligence" were "I can finally get answers to my favorite philosophical problems" followed by "How do I make sure the ASI actually answers them correctly?" Anyone else reacted like this?
Wei Dai@weidai11

@janleike You assume that you don't need to solve hard philosophical problems. But the superhuman researcher model probably will need to, right? Seems like a very difficult instance of weak-to-strong generalization, and I'm not sure how you would know whether you've successfully solved it.

Wei Dai@weidai11·
@lxrjl @Mihonarium Good to know, and to clarify, I didn't mean to suggest that "Pause" is clearly correct, and therefore Luke must be wrong not to pivot. One possible conclusion is that people should perhaps update less on "MIRI pivoted to Pause" because some MIRI strategy people actually didn't.
alex lawsen@lxrjl·
@Mihonarium I think of all of the things one could plausibly criticise Luke for, unwillingness to change his mind might be literally at the bottom of my list.
Mikhail Samin@Mihonarium·
interesting take from Wei Dai
Eliezer Yudkowsky@allTheYud·
A challenge to @deanwball. Suppose you believed what I believe: If anyone builds ASI, everyone dies (modulo locally irrelevant caveats). Say that Sanders, Trump, Hawley, Blumenthal, and Jinping will all back your policy. What's a smart policy that actually blocks ASI?
Wei Dai@weidai11·
@ohabryka @RatOrthodox @ryancareyai I was simplifying but I think "mesaoptimization" was 2018 and before that it was "inner optimization" and "optimization daemon" but both were >=2016. Can't remember or find any other online discussions that clearly fit before 2016 but possibly people talked about it offline.
Oliver Habryka@ohabryka·
@weidai11 @RatOrthodox @ryancareyai It wasn't called mesa-optimization, but I am pretty sure people were thinking about it between 2003 and 2016 under various names and in the context of various analogies (the concept is still pretty muddled, so it's not like "mesaoptimization" as a concept brought great clarity).
Ryan Carey@ryancareyai·
Absolutely, views in the AI x-risk community are gradually diluting toward "AI is a big deal". One example from January: x.com/davidad/status… Also a lot of people grasping for new things to worry about: "mesaoptimizers", "gradual disempowerment", permanent dictatorship.
David Pinsof@DavidPinsof

Is it just me or has AI doomerism gradually transitioned from "AI will literally kill us all" to "AI will cause bad things to happen / Humans will do stupid things with AI / AI will cause huge changes." If so, this is a very positive development.

Wei Dai@weidai11·
@ohabryka @RatOrthodox @ryancareyai My SL4 post was 2003, but AFAIK it was forgotten and reinvented in 2016 by MIRI, so should probably count only 10 years? I remember being at a MIRI workshop when people started talking about mesaoptimization, thinking it was a new MIRI idea. Someone found/referenced my post later
Oliver Habryka@ohabryka·
@RatOrthodox @ryancareyai As old as the “molecular squiggle optimizer” stuff, which according to Eliezer predates the paperclip stuff, so like 20 years?
Wei Dai@weidai11·
One idea is that humans are "aligned" via social status, but this causes an "alignment tax" forcing the smartest humans to work in domains where their results can be evaluated by others less intelligent. Unfortunately, while math is in NP, philosophy is more like EXPTIME.
Wei Dai@weidai11

In 2012 I wrote about making 10⁵ clones of von Neumann as a Singularity strategy, but now I think we should first figure out why he spent more time/effort on math/computing instead of philosophy and long-horizon strategy, which are the real bottlenecks to having a good future.

Wei Dai@weidai11·
@RichardMCNgo Your intuition pump doesn't seem to rule out that "trust" can be solved in a straightforward CS/engineering kind of way, or is OOMs less costly than value differences. Don't understand pause=central planning. I think market economies can be regulated to prevent AI development?
Richard Ngo@RichardMCNgo·
@weidai11 One intuition pump re trust: imagine that you could never assume common knowledge of rationality. Then to do most game theory you’d need some weaker replacement concept. This is one facet of “trust”. An AI pause seems good short-term. Long-term it’d be de facto central planning.
Richard Ngo@RichardMCNgo·
It’s helpful to think of rationalists as High Modernists specifically about the future. No human was smart enough to successfully plan an economy or a society. But if we hypothesize an AI intelligent enough to do so, we can hold on to many of the same technocratic intuitions.
Richard Ngo@RichardMCNgo·
@weidai11 It’s possible! And also from Marx’s perspective it was possible that past central planning failures had been obsoleted by the Industrial Revolution. But it was hard for Marx to know how much econ he didn’t know, and it’s hard for us to know how much sociopolitics we don’t know.
Wei Dai@weidai11·
@RichardMCNgo To be clear, I'm advocating an AI pause to give more time to think about issues like these, not a revolution to finally do central planning right. :)
Wei Dai@weidai11·
@RichardMCNgo Yeah, I don't understand your position that these are big deals. From my perspective, value conflicts explain ~all human coordination failures, and what's left if you get rid of it is more like CS/engineering, with relatively straightforward low-cost solutions.
Wei Dai@weidai11·
@RichardMCNgo If 2 agents have same values (including no indexical values), then why fight for control instead of, e.g., flipping a coin to decide whose strategy to go with? You might still need some back-and-forth to resolve race conditions etc, but standard solutions are available from CS?
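[Editor's note: the "standard solutions from CS" gestured at above can be illustrated with a commit-reveal shared coin, a minimal sketch (all names hypothetical, not from the thread): two agents with identical values jointly derive an unbiased leader bit, so neither needs to fight for control and neither can rig the flip.]

```python
import hashlib
import secrets

def commit(value: bytes) -> bytes:
    """Hash commitment: publish this before revealing the value."""
    return hashlib.sha256(value).digest()

def shared_coin(contrib_a: bytes, contrib_b: bytes) -> int:
    """XOR both contributions and take the low bit. Because each
    agent committed before seeing the other's value, neither can
    bias the outcome."""
    combined = bytes(x ^ y for x, y in zip(contrib_a, contrib_b))
    return combined[0] & 1

# Commit phase: each agent publishes a hash of a random value.
a_val, b_val = secrets.token_bytes(32), secrets.token_bytes(32)
a_commit, b_commit = commit(a_val), commit(b_val)

# Reveal phase: each agent checks the other's commitment held,
# then both compute the same leader bit and defer to that strategy.
assert commit(a_val) == a_commit and commit(b_val) == b_commit
leader = "A" if shared_coin(a_val, b_val) == 0 else "B"
```

[This is the textbook symmetry-breaking pattern for agents that trust each other's values but lack common knowledge of each other's strategy.]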
Richard Ngo@RichardMCNgo·
@weidai11 Re your specific point: I think that you get explosions in game theoretic complexity even with the same values if you lack trust. E.g. imagine you lack common knowledge that you’re pursuing the same strategy to achieve your shared goals. Then you might fight for control.
Wei Dai@weidai11·
@kanzure Ah yes, the classic Leeroy Jenkins approach to problem solving. Really wish I had been the one to realize that it's a perfect fit for navigating a technological Singularity.
Bryan Bishop@kanzure·
@weidai11 No, you should have made human clones and instead you didn't.
Wei Dai@weidai11·
@RokoMijic I'm not sure that he would be good at philosophy but I'm guessing maybe he could have been if he tried? Anyway I think you may be misinterpreting what I was trying to say in the OP.
Roko 🐉@RokoMijic·
@weidai11 Why do you think Von Neumann is good at philosophy? He's a mathematical genius but not a well-known philosopher. We should have cryopreserved Dan Dennett, who was easily the best philosopher of AI...
Wei Dai@weidai11·
@RokoMijic even more prone to reward gaming, are even worse at doing philosophy, are liable to have even more alien values, etc.
Wei Dai@weidai11·
@RokoMijic From my linked post: What about getting help from AI? Well they seem to suffer from many of the same safety problems, but in even more severe forms. E.g., current AI capabilities are even more skewed towards short-horizon, easily verifiable tasks, like math and coding. They seem
Eliezer Yudkowsky@allTheYud·
@jd_pressman (of course I agree that if it could be done it should be done. this need not be said.)
John David Pressman@jd_pressman·
My position remains that we should exhume Neumann and clone him.