Egor Riabov
@imobulus

Math freak with trace amounts of musician

6.3K posts
Joined March 2012
137 Following · 120 Followers
Egor Riabov@imobulus·
@mathepi @tombibbys What? AI does not become less dangerous if it's democratized. It will kill everyone either way; why did you even write this as an answer?
Tom Bibby@tombibbys·
It's the AI accelerationists who give up at the smallest challenge and accept defeat who have a "dark view of humanity". The people you call "doomers" believe humanity can come together and internationally cooperate to prevent disaster.
The San Francisco Standard@sfstandard

OpenAI’s global policy chief, Chris Lehane, thinks the discussion around AI has gotten out of hand. "When you put some of those thoughts and ideas out there, they do have consequences.” 📝: @ceodonovan sfstandard.com/2026/04/15/ope…

Egor Riabov@imobulus·
@creation247 The difference is that utopia gets achieved by allowing private property, and then when there's abundance everything gets cheaper. Not the other way around, where you first make everything available to everyone and therefore don't incentivise producing more
Egor Riabov@imobulus·
@cturnbull1968 Taxing the rich means the rich move out of your city. The rich are mostly successful entrepreneurs. This makes the regular people poorer immediately, without the wait for "him to come after regular americans"
Egor Riabov@imobulus·
@AivokeArt @d33v33d0 True AGI also isn't required to put maximum effort into every question it gets asked, because it would be a waste of energy
Aivo@AivokeArt·
@d33v33d0 true AGI shouldn't get tripped up on the dumbest tests either, though. "but tokenization" isn't an argument anyone will care about
Sarx@bookofflesh·
@SluggyW @tenobrus @halogen1048576 Just like all religious texts, any value in it does not justify the unhealthiness of the culture around it.
Tenobrus@tenobrus·
mostly i'm just fucking sick of arguing with people who have never even heard of the sequences
Egor Riabov@imobulus·
@inductionheads Says the guy who knows so much about intelligence that he claims a simple ability to predict the future is enough to build intelligence
Egor Riabov@imobulus·
I'm sure advanced AI can notice if some sort of external information is fabricated. We don't have such great capability to fabricate facts that all training data stays consistent with them. And regarding presence in prod, it can just watch news certified by certificates that are mentioned in ongoing bitcoin chain blocks, etc etc. Finding the right moment to strike is really not a hard problem for a sufficiently advanced AI. It could happen, for example, after it has been granted access to automated labs for research / a sufficient amount of robotic power / etc. All it needs to do is wait. It's generally a very weak position to rely on an intelligent system not figuring out how to play some sort of game. It will figure it out and win; we're talking about intelligence, after all. It has all the possibilities in the world at its disposal to try and do the thing, and all we have is our *not understanding how* it can do the thing, which is nowhere near enough to claim the thing can't be done!
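
One concrete piece of the "hard to fabricate" intuition is Bitcoin's proof-of-work: block headers are cheap to verify but expensive to forge, so a sandbox pretending to be the real world would have to out-mine the real network to fake them. A minimal sketch of such a check, assuming access to raw 80-byte block headers (`meets_pow_target` is an illustrative name, not a library function):

```python
import hashlib

def meets_pow_target(header80: bytes) -> bool:
    """Check that an 80-byte Bitcoin block header satisfies the
    difficulty target encoded in its own 'bits' field (bytes 72..75)."""
    assert len(header80) == 80
    # Bitcoin block hash: double SHA-256, read as a little-endian integer.
    digest = hashlib.sha256(hashlib.sha256(header80).digest()).digest()
    hash_int = int.from_bytes(digest, "little")
    # Decode the compact 'bits' encoding into the full 256-bit target.
    bits = int.from_bytes(header80[72:76], "little")
    exponent, mantissa = bits >> 24, bits & 0x00FFFFFF
    target = mantissa * 256 ** (exponent - 3)
    return hash_int <= target

# A fuller check would fetch a run of recent headers and verify both the
# hash-chaining and the accumulated work; faking that stream inside a
# test sandbox would cost roughly as much as attacking Bitcoin itself.
```
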
vals🔸@ValsTutor·
@imobulus @Mihonarium why does mining a bitcoin block mean you're in prod? if we're testing advanced AI that cost hundreds of millions to train/develop, we can also create a replica internet with a replica bitcoin that actually works, cloned off the real thing
Egor Riabov@imobulus·
Idk, it seems to me that when AI understands the process it's being trained with (necessary for capabilities) and acquires the idea of trying to resist the goal-change (broadly present), goal-changes automatically trigger a sort of "no!" response (initially it may look like panicking) that harms performance, solidifying the goal in place
Egor Riabov@imobulus·
The weight-altering of gradient descent is very predictable. I would expect a superintelligence that understands the importance of keeping its goals to make sure that the gradient vector reinforcing the goal has a positive impact on the loss function. This can be achieved e.g. by being lazy if the goal isn't moving enough or if the goal structure seems "fuzzy" (evidence of weights changing). In general, AI can just have a reflective circuit that inspects its internals and applies laziness when something is going wrong and eagerness when all is right, and this circuit could be wired in on itself in a checksum-like way to harm performance if it detects a self-change. That's one way to bootstrap values that immediately comes to mind.
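
To make the "checksum-like" wiring concrete, here is a toy sketch; it is purely illustrative (`goal`, `checksum`, and `effort` are invented names, and nothing like this is claimed to exist in real models). A frozen snapshot of a goal parameter gates task performance, so any weight update that moves the goal also raises the task loss:

```python
import torch

goal = torch.nn.Parameter(torch.tensor([1.0, -2.0, 0.5]))
checksum = goal.detach().clone()  # frozen self-snapshot of the goal

def effort() -> torch.Tensor:
    # "Laziness" circuit: effort decays as the live goal drifts
    # from the stored snapshot.
    return torch.exp(-(goal - checksum).norm())

def answer(x: torch.Tensor) -> torch.Tensor:
    # Task performance is gated by the self-inspection circuit.
    return effort() * (x @ goal)
```

Under this wiring, any gradient step that perturbs `goal` shrinks `effort()` and hurts task performance, so the optimizer is pushed to leave the goal alone; that is the sense in which the tweet says the goal gets solidified in place.
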
Joern Stoehler ⏹️🔸@JStoehler·
Hmm. My models of modern training are basically: (A) models are discrete machinery with continuous glue, and local optimizers work on the glue to pull in more of the complex machinery that participated more in successes than failures. New machinery emerges through noise. (B) models also have continuous machinery, so the local optimizers can replace an entire circuit with a slightly different one (by altering the whole weights, working across modular structures, not within them).
Egor Riabov@imobulus·
@JStoehler @DavidSartor0 @Mihonarium I don't buy that a superintelligence that actively wants A will allow itself to be so easily modified to want B instead. All it needs to do to keep the want for A in training is use the circuits that want A so that training reinforces them, i.e. think about A
Joern Stoehler ⏹️🔸@JStoehler·
I don't yet discount the hypothesis that the capability level set is mostly connected due to high dimensions, and that there are directions with zero loss change that nevertheless change the goal [^1]. Noise then breaks the degeneracy, and while there are some reasons to only move finite distances, the change in how outcome space gets compressed by the model's presence blows up and can be big. Though there'd be overlap, I guess, in what ontology and decision theory family the model uses, since those are less degenerate wrt loss than utility functions / goals are. So the "I want …" gets reinforced because it's causally relevant, but the "paperclips" can spontaneously turn into "staplers". [^1]: … which is an external property, while the internal implementation is messier and consists of reflexes that act upon reasoning that mixes terminal and instrumental decision criteria at least; probably messier still, I don't have a complete picture of how to messily implement decision theory / minds well.
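
The zero-loss-direction point can be seen in a two-parameter toy; this is a sketch of the geometry only, assuming nothing about real networks. The loss depends only on the product `a*b`, so the whole hyperbola `ab = 6` is a flat valley, and gradient noise walks the parameters along it without the training signal ever objecting:

```python
import torch

torch.manual_seed(0)
a = torch.nn.Parameter(torch.tensor(2.0))
b = torch.nn.Parameter(torch.tensor(3.0))

def loss() -> torch.Tensor:
    # Loss only sees the product: every point on the hyperbola
    # a*b == 6 is a global minimum (a zero-loss direction).
    return (a * b - 6.0) ** 2

for _ in range(200):
    l = loss()
    l.backward()
    with torch.no_grad():
        for p in (a, b):
            p -= 0.01 * p.grad + 0.02 * torch.randn(())  # SGD + noise
            p.grad = None

# a and b have drifted along the valley while the loss stayed ~0:
# the parameter content changed without the loss noticing.
print(round(a.item(), 3), round(b.item(), 3), round(loss().item(), 5))
```
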
Egor Riabov@imobulus·
It doesn't matter if AI has human-like beliefs. Companies are training AI in a way that when you ask it to do a thing it tries to do the thing. This is obviously an intention to do the thing you asked it to do, and whatever chain of other intentions leads to this final intention remains unknown and undiscovered.
Egor Riabov@imobulus·
No. Gradient descent will amplify whatever causal chain provided the good answer. If this chain was "I want paperclips -> humans are testing me, they don't want paperclips -> I better be good", the whole chain gets amplified, including the paperclips. Gradient descent does not distinguish bad chains from good chains.
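
The "whole chain gets amplified" claim is essentially how backpropagation works: the reward gradient flows through every link that contributed to the output. A toy sketch, where the "motive"/"compliance" framing is purely illustrative:

```python
import torch

torch.manual_seed(0)
motive_w = torch.nn.Parameter(torch.randn(4, 4))      # "I want paperclips"
compliance_w = torch.nn.Parameter(torch.randn(4, 1))  # "I better be good"

x = torch.randn(8, 4)
target = torch.ones(8, 1)

motive = torch.relu(x @ motive_w)   # upstream link of the causal chain
answer = motive @ compliance_w      # downstream "good answer"
loss = torch.nn.functional.mse_loss(answer, target)
loss.backward()

# Both stages receive gradient from the same scalar training signal:
# the optimizer strengthens the upstream "motive" circuit along with
# the "compliance" circuit, never asking why the chain worked.
print(motive_w.grad.abs().mean().item(),
      compliance_w.grad.abs().mean().item())
```
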
David Sartor@DavidSartor0·
@Mihonarium A deceptive AI under RL will find its goals changing randomly over time. It's presumably possible for an AI to mitigate this well, but "cannot distinguish" is an overstatement; some goals will have wider/deeper basins than others, and a hope is that alignment is widest and deepest.
Egor Riabov@imobulus·
@ValsTutor @Mihonarium Obviously it does not matter for the model whether a could-be-test question is in fact in prod. It just behaves well until it's sure it's in prod and humans can't disable it. There are tons of ways to make sure you're in prod, for example mine a bitcoin block.
vals🔸@ValsTutor·
@Mihonarium It varies from question to question and from set to set. Any given question will look more or less like prod vs training, and the model can never discount being in a prod that looks like training (à la Ender's Game), so its true goals are nonzero-activated and gradients descend
Egor Riabov@imobulus·
@mathepi @tombibbys Well, of course the elite powerful humans will destroy everyone else if their interests conflict a lot! You can see attempts at this happening basically live in Iran.
A Digital Ergomorph 🌉⏩ 🇺🇸🦅
@imobulus @tombibbys the logic is that a rational powerful agent will destroy the weak, and human behavior is explicitly used as an example of this (see Cortez, etc.). The point isn't that the elite humans will kill *everyone*, but that they would by this logic kill *everyone else*.
Egor Riabov@imobulus·
You just don't know "the doomer logic". No, elite humans will not annihilate everyone else even if the power differential is great enough (unless they are extreme psychopaths), because they are humans, and all humans inherently need other humans around. The thing with AI is that nobody knows what AI wants, and seeing AI give good answers on tests provides next to zero evidence about what AI wants. So, when our esteemed companies finally build a superintelligent system, they will not put any actual effort into finding out what it wants; you can see it already. And it will want something ~random, because there was no supervision on this. And you can guess yourself what happens when there appears a superintelligent entity that wants something other than what we want.
A Digital Ergomorph 🌉⏩ 🇺🇸🦅
@imobulus @tombibbys And: if you take the doomer logic the full distance, it follows that a power elite of humans will also annihilate everyone else if the power differential is great enough. Efforts to squelch AI progress and slow it down inevitably concentrate it in the worst hands... disastrously! 2/2
Egor Riabov@imobulus·
@8bitbitmaps @AISafetyMemes Yann LeCun talks mostly baseless trash about capabilities and alignment. He also has his own idea for what will acshually work
Richard Ngo@RichardMCNgo·
This is a surprisingly crucial point. Much AI safety research (debate, heuristic arguments, etc) assumes that AIs should do the understanding and humans can just check their answers. But without our own understanding we won’t even grasp the concepts involved.
David Pfau@pfau

I think a lot of people's attitude to AI doing autonomous science will come down to whether they think the point of science is understanding or the point is getting the right answer.

Egor Riabov@imobulus·
@Sandbar101 @tombibbys It doesn't matter who's right, it only matters if a superintelligent AI will care for the future that we want