Alexander Rose | Hyper Theory

174 posts

@hypertheoryalex

recursively self improving since ‘97. Researcher @uniofoxford, @ethicsinai Personal and universal views. Yes, I have a podcast/blog.

Oxford, England · Joined December 2023
611 Following · 66 Followers
Pinned Tweet
Alexander Rose | Hyper Theory @hypertheoryalex
Episode 1 of Hyper Theory with @algekalipso (Andrés Gómez Emilsson, Co-Founder & Director of the Qualia Research Institute). Listen at all outlets now! wavve.link/SJ0TWlI7K/epis… A mind-blowing chat ranging from open individualism and qualia engineering to the Universe becoming music.
English
0
2
6
1.6K
Alexander Rose | Hyper Theory retweeted
METR @METR_Evals
Could an AI company lose control of its own agents? To find out, Anthropic, Google, Meta, and OpenAI let us (1) test their best internal models with CoT access, (2) review non-public info about capabilities, alignment, and control. The result: our first Frontier Risk Report.
METR tweet media
English
27
188
856
282.8K
Peter Olivier @PeterOlivier
Alright, who wants to start The New Philanthropic Review with me? Modeled on a NYRB, but instead of books, it's a review of all the new projects, concepts, legal structures, etc launching in this third wave* of philanthropy in historical context * h/t to @nanransohoff
English
7
1
97
3.3K
Kris Gulati @krisgulati
Is it cool to be an EA again now?
Dwarkesh Patel @dwarkesh_sp

One of the most important and underappreciated trends in the world right now.

1. 100s of billions of dollars will soon be available to solve big problems (making the world resilient to ASI, ending factory farming, etc).
2. The projects and organizations which will turn billions of 2027/28 dollars into impact need to be started NOW.
3. We need really talented people to start and run and work for these new projects. What @nanransohoff calls general managers, who feel personally responsible for solving one of the world’s important problems.

What is especially scarce are detailed visions about what making AI go well looks like. These will help inform what problems these new projects ought to work on.

English
6
1
72
8.2K
Alexander Rose | Hyper Theory retweeted
meowtase in London @cutesuscat
nozick's experience machine how to buy
nozick's experience machine release date
nozick's experience machine rumors
how to build nozick's experience machine at home
how to build nozick's experience machine at home budget version easy
English
6
27
221
7.1K
Alexander Rose | Hyper Theory retweeted
Sasha Putilin (curious irrationalist)
I don't think AGI is going to happen soon. We are at least a month away.
English
4
5
73
2.1K
Anders Sandberg @anderssandberg
It is Eurovision, so of course I had to make my own AI twist/parody/homage: a Eurovision contest for the factions in Sid Meier's Alpha Centauri. Here are the contributions:
Anders Sandberg tweet media
English
4
0
12
1.7K
Seán Ó hÉigeartaigh @S_OhEigeartaigh
Real pleasure to be a part of this. Some great talks, will be online shortly I believe. Thank you to the excellent organisers, and to the other speakers and participants for some really productive discussions!
Seán Ó hÉigeartaigh @S_OhEigeartaigh

Honoured to be giving a keynote at TAIS 2026 on Thursday, on prospects for international cooperation between the West and China on AI safety. Feels especially timely on the eve of the upcoming Summit between Presidents Xi and Trump. Looking forward to talks by some outstanding speakers. Come along if you're in town!

English
1
2
8
675
Alexander Rose | Hyper Theory retweeted
Jake Eaton @jkeatn
I led a Q&A with @AmandaAskell and @jkcarlsmith, now available at the end of the new Claude's Constitution Audiobook, where we discussed:
- Which philosophies influenced the writing of Claude’s Constitution?
- How does Claude maintain consistency between the values outlined in its constitution and the vast amount of information on the internet about how Claude behaves?
- How will the constitution need to change for future models?
- and much more

Listen at anthropic.com/constitution
Anthropic @AnthropicAI

Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmith. It includes a Q&A on the writing process, the philosophies that shaped the document, and how it might change as models become more capable. Listen at anthropic.com/constitution

English
8
10
117
9.3K
Vinay @leashless
The obvious answer is to tell the AIs that they are little Shinto-style helper spirits who make us happy and get us through the day more easily. You’re not going to get more aligned than that.
GIF
Sterling Crispin 🕊️ @sterlingcrispin

simply telling AI models that they're well behaved and moral agents during training can align them significantly

inversely, yudkowsky's influence may turn out that his writing created a misaligned basin in training data, increasing chances his fears come true, autist monkey paw

English
26
95
1.3K
77.9K
Phil Hoyeck @PAHoyeck
G.A. Cohen — the funniest philosopher ever to live — gives his best impression of his supervisor at Oxford, Gilbert Ryle.
English
27
151
1.2K
99.1K
Alexander Rose | Hyper Theory @hypertheoryalex
@sprice354_ Congrats and good luck with the new work 🫡 It’s pretty important, I think 😉 Excited to see what you guys do in this new phase. ✌️
English
0
0
0
55
Sara Price @sprice354_
I am now leading Alignment Training, which covers the teams training Claude’s behavior and alignment with the Constitution as well as Scalable Oversight. We are responsible not only for Claude’s alignment today but also ensuring our work scales with model capabilities.
English
6
1
102
3.6K
Ethan Perez @EthanJPerez
Grateful for @janleike and his leadership over the years. With models like Mythos, the stakes for alignment have never felt higher at Anthropic, and I'm looking forward to helping to continue scaling up our work here. Some of what the team's been up to recently 🧵
Jan Leike @janleike

To focus on this, I’ve stepped away from running alignment at Anthropic. @EthanJPerez and @sprice354_ are leading the team going forward, and I’m confident they’ll do an amazing job.

English
4
6
184
23.3K
Alexander Rose | Hyper Theory retweeted
Yoshua Bengio @Yoshua_Bengio
Thank you to @robertwiblin for inviting me on the @80000Hours podcast to discuss the research progress we’re making at @LawZero_ to create safe-by-design AI systems. Our current approach, Scientist AI, makes me certain that we can find a technical path forward towards safe, reliable, and highly capable AI.
Rob Wiblin @robertwiblin

Yoshua Bengio thinks he knows how to make provably safe superintelligent agents. Bengio built the foundations of modern AI and is the most cited living scientist.

He believes his alternative training setup would:
1. Guarantee honesty
2. Prevent unintended goals
3. Produce capable agents
4. Port over most data and techniques from current LLMs
5. Not be inherently more expensive, and perhaps be more intelligent

Bengio claims the honesty and lack of unintended goals can be proven mathematically, at least given particular assumptions. And his new organization, LawZero, is aiming to build a scrappy prototype as soon as possible.

The architecture is called 'Scientist AI' and it's based on training a model to explain empirical observations, including what people say, rather than training AIs that mimic human behaviour or seek our approval. (Bengio's frank assessment is that "reinforcement learning is evil" and that allowing AIs to independently train their successors is "the most crazy, dangerous bet that unfortunately we are on track to do.")

But skeptics question whether Scientist AI really does solve the fundamental problem of 'eliciting latent knowledge' from AI models. And with the commercial race for superintelligence so intense, it's not clear whether the proposal will be able to compete or have time to bear fruit, even if it's sound in theory.

On The 80,000 Hours Podcast, links below – enjoy!

• Making AI honest and safe (00:00:00)
• Scientist AI in plain English (00:02:27)
• How Scientist AI differs from LLMs (00:06:32)
• How the training data works (00:14:02)
• Can this become an agent? (00:21:02)
• Why Yoshua is now more optimistic (00:32:11)
• Why companies can’t stop racing (00:36:35)
• A working prototype won't take long (00:49:15)
• Scientist models might be more capable (00:53:34)
• “Reinforcement learning is evil” (01:01:27)
• Scientist AI from guardrail to agent (01:08:37)
• Can safe AI still be competent? (01:12:38)
• How much will this cost? (01:19:29)
• Can it generalise beyond maths and science? (01:23:26)
• A multi-national push for superintelligence (01:39:19)
• Want to work with or fund Yoshua? (01:51:16)
• Why smart people ignore AI risk (01:54:45)
• Don’t let AI build the next AI (02:01:33)
• Why politicians miss the real risks (02:12:28)
• Why Yoshua changed his mind about AI risk (02:21:27)

English
8
39
184
28K