Nate Soares ⏹️

1.7K posts

@So8res

Trying to make AI not kill everyone

Berkeley · Joined January 2011
99 Following · 11.3K Followers
Nate Soares ⏹️
thankful for that one presumably-cheap shampoo that seduced all the cheap airbnbs. now my hair can finally have a consistent smell
Nate Soares ⏹️
@infinitehumanai ch6 of the book gives mechanistic examples on the level of "tiger big; has claws". if you're demanding mechanistic explanation of how the tiger's brain algo permits learning: unfortunately tigers can be made by processes that lack that understanding, & can kill even the ignorant.
Mark Worrall @infinitehumanai
The housecat/tiger analogy makes my point: we know exactly why tigers are dangerous and housecats aren't because we have a mechanistic theory. That's what I'm asking for. This is the same flawed argument as the gun one you made earlier. Your radioactive rock analogy also helps me: we had atomic theory before the Manhattan Project. Precaution was grounded in mechanism. To be clear: I'm not saying "go ahead". I'm saying: where's the equivalent theory for AI? Without it, we're not being precautionary. We're simply guessing.
Nate Soares ⏹️
People don't program AIs. They program the machine that grows the AI. AI behavior is an emergent consequence of complex internal machinery that literally nobody understands.
Nate Soares ⏹️
@infinitehumanai If Microsoft in a pre-atomic world was like "we're enriching this radioactive rock in hopes of making a new sort of weapon that lets us control everything; it's now more radioactive than ever before", I think the right answer is "no, stop" even if you don't *know* they'll succeed
Nate Soares ⏹️
@infinitehumanai (2) I think we have plenty of theory and evidence (and have named some) and think they more than support the argument. Even if they didn't, the fact stands that they're expressly targeting superintelligence. "We don't think they'll get there" seems a thin defense.
Nate Soares ⏹️
@infinitehumanai Even if we throw out all the evidence and theory and say we have no idea where this stuff is going: around the point where the machines are talking and generating novel physics insights, it's time to pause and *figure out* where it's going, before racing ahead.
Nate Soares ⏹️
@infinitehumanai The future is going to go some sort of way. "AI fizzles", "AI kills us", and "AI utopia" are three hypotheses, and you can't make the reality be the one you like by declaring all of the evidence inadmissible and hoping we get your favorite outcome.
Nate Soares ⏹️
@infinitehumanai "We don't understand this process well enough to know it definitely will yield a superintelligence" is not a good reason to proceed. But "we don't understand well enough to know it definitely won't" *is* a good reason to stop!
Nate Soares ⏹️
@infinitehumanai Natural selection yielded human-level intelligence using sheer brute force with no theory or understanding of how. That can happen. Arguments in the linked resources sketch ways it might happen soon.
Nate Soares ⏹️
@stringking42069 If someone's like "I found a way to enhance the intelligence of my cat; it can generate novel physics contributions now; I think I can keep going until the cat is superintelligent" then I don't think "eh that's a relatively minor physics contribution" is a huge comfort.
stringking42069 @stringking42069
@So8res The result is incredibly overhyped. The people involved in this utter fiasco are being used for publicity. Shameful.
Nate Soares ⏹️
If someone says they're trying to build a superintelligence that poses a substantial chance of ending the world, "Eh whatever; you'll probably fail" stops being a good societal response at around the time the opaque machines start generating novel physics results.
OpenAI @OpenAI

GPT-5.2 derived a new result in theoretical physics. We’re releasing the result in a preprint with researchers from @the_IAS, @VanderbiltU, @Cambridge_Uni, and @Harvard. It shows that a gluon interaction many physicists expected would not occur can arise under specific conditions. openai.com/index/new-resu…

Nate Soares ⏹️
@infinitehumanai My book and the associated online resources spell out plenty of mechanisms for danger. Common sense does too: just as humans transform the world using their smarts, so would autonomous machine intelligence. Humans are fragile; most transformations are lethal.
Mark Worrall @infinitehumanai
@So8res We have a complete mechanistic theory of how guns work. That's exactly why precaution would be justified. The analogy undermines your own position: the question I keep asking is "what's the mechanism?" and you keep responding with analogies instead of theory.
Nate Soares ⏹️
@infinitehumanai If someone loads a gun and points it at your head, and experts disagree about whether the gun works (with many saying "probably!"), the correct answer is not "go ahead; hopefully the gun jams".
Nate Soares ⏹️
@infinitehumanai But most people shouldn't need to resort to those. The labs are like "we're trying to do this and it's very dangerous". Academic experts are like "yep, looks feasible". The appropriate response is not "go ahead; we hope you'll fail". It's "Holy crap what?! No!"
Nate Soares ⏹️
@berytus7 The trouble isn't just that some people reward flattery over true helpfulness. The trouble is that you don't get what you train for. Natural selection only ever rewarded passing on our genes, and humans invented birth control anyway. Outer reward ≠ inner motivation, alas.
Berytus @berytus7
@So8res For surface interactions I'd agree with you. But when that care is stress tested across many contexts and over a length of time, mere flattery would be apparent at some point. Some of us have spent careers detecting exactly that sort of gap in other contexts.
Nate Soares ⏹️
Yann LeCun, who shared the Turing Award for his contributions to the field, once said that "GPT-5000" would not be able to answer certain physics questions. GPT-4 answered them. Predicting AI's limits is tricky. Don't bet civilization on AI petering out. youtube.com/watch?t=3474&v…
Nate Soares ⏹️
Maybe they will fail! Maybe AI development will finally hit the wall that some folks have been predicting every year for the last five years. I'd love that. But the "they probably won't succeed" defense is getting thinner and thinner as the field progresses.
Nate Soares ⏹️
@infinitehumanai If someone says they're trying to build a superintelligence that poses a substantial chance of killing literally everyone, "Go ahead; we think you'll fail" stops being a good societal response at around the time the mysterious machines start generating novel physics results.
Nate Soares ⏹️
@infinitehumanai The companies say they're trying to create superintelligence. GPT-5.2 made a novel physics contribution recently. The burden of argument for "this will be fine" belongs on the people trying to build the superintelligence.