Alexander Rose | Hyper Theory

174 posts

@hypertheoryalex

recursively self improving since ‘97. Researcher @uniofoxford, @ethicsinai Personal and universal views. Yes, I have a podcast/blog.

Oxford, England · Joined December 2023

611 Following · 66 Followers

Pinned Tweet

Alexander Rose | Hyper Theory@hypertheoryalex·7 Jan

Episode 1 of Hyper Theory with @algekalipso (Andrés Gómez Emilsson, Co-Founder & Director of the Qualia Research Institute) Listen at all outlets now! wavve.link/SJ0TWlI7K/epis… A mindblowing chat spanning open individualism, Qualia engineering, to the Universe becoming music.

Alexander Rose | Hyper Theory retweeted

METR@METR_Evals·5d

Could an AI company lose control of its own agents? To find out, Anthropic, Google, Meta, and OpenAI let us (1) test their best internal models with CoT access, (2) review non-public info about capabilities, alignment, and control. The result: our first Frontier Risk Report.

Alexander Rose | Hyper Theory@hypertheoryalex·4d

@PeterOlivier @nanransohoff DTF (Down To Found).

Peter Olivier@PeterOlivier·4d

Alright, who wants to start The New Philanthropic Review with me? Modeled on a NYRB, but instead of books, it's a review of all the new projects, concepts, legal structures, etc launching in this third wave* of philanthropy in historical context * h/t to @nanransohoff

Alexander Rose | Hyper Theory@hypertheoryalex·4d

@krisgulati It’s always been cool 😎

Kris Gulati@krisgulati·5d

Is it cool to be an EA again now?

Dwarkesh Patel@dwarkesh_sp

One of the most important and underappreciated trends in the world right now.

1. 100s of billions of dollars will soon be available to solve big problems (making the world resilient to ASI, ending factory farming, etc).
2. The projects and organizations which will turn billions of 2027/28 dollars into impact need to be started NOW.
3. We need really talented people to start and run and work for these new projects. What @nanransohoff calls general managers, who feel personally responsible for solving one of the world’s important problems.

What is especially scarce are detailed visions about what making AI go well looks like. These will help inform what problems these new projects ought to work on.

Alexander Rose | Hyper Theory@hypertheoryalex·4d

@chalmermagne @jackclarkSF @cosmos_inst Was nice to meet you!

Alex Chalmers@chalmermagne·4d

in Oxford, listening to @jackclarkSF deliver the 2026 @cosmos_inst lecture on his uncomfortable relationship with a graph

Alexander Rose | Hyper Theory@hypertheoryalex·5d

@lfschiavo where can i get one of these omg

Larissa Schiavo@lfschiavo·9 Apr

the stickers spoke of this

dylan@narrenhut

The new unreleased Claude model has, according to its system card, a particular "fondness" for Mark Fisher and Thomas Nagel

Alexander Rose | Hyper Theory retweeted

meowtase in London@cutesuscat·6d

nozick's experience machine how to buy nozick's experience machine release date nozick's experience machine rumors how to build nozick's experience machine at home how to build nozick's experience machine at home budget version easy

Alexander Rose | Hyper Theory@hypertheoryalex·6d

@ben_j_todd youtube.com/watch?v=nXIqXs…

Alexander Rose | Hyper Theory retweeted

Sasha Putilin (curious irrationalist)@42irrationalist·18 May

I don't think AGI is going to happen soon. We are at least a month away.

Alexander Rose | Hyper Theory@hypertheoryalex·16 May

@anderssandberg Doctor Who ran a similar (much less developed) idea last year :p youtu.be/k2weSLRUfg8

Anders Sandberg@anderssandberg·16 May

It is EuroVision, so of course I had to make my own AI twist/parody/homage: an Eurovision contest for the factions in Sid Meier's Alpha Centauri. Here are the contributions:

Alexander Rose | Hyper Theory@hypertheoryalex·15 May

@S_OhEigeartaigh Was great to (finally) meet you! Thanks for going into such depth with my questions. :)

Seán Ó hÉigeartaigh@S_OhEigeartaigh·15 May

Real pleasure to be a part of this. Some great talks, will be online shortly I believe. Thank you to the excellent organisers, and to the other speakers and participants for some really productive discussions!

Seán Ó hÉigeartaigh@S_OhEigeartaigh

Honoured to be giving a keynote at TAIS 2026 on Thursday, on prospects for international cooperation between the West and China on AI safety. Feels especially timely on the eve of the upcoming Summit between Presidents Xi and Trump. Looking forward to talks by some outstanding speakers. Come along if you're in town!

Alexander Rose | Hyper Theory@hypertheoryalex·15 May

AGI

[Image]

Alexander Rose | Hyper Theory retweeted

AI Safety Papers@safe_paper·13 May

Automated alignment is harder than you think Aleksandr Bowkis (@aleksandrbowkis), Marie Davidsen Buhl (@MarieBassBuhl), @jacob_pfau, Geoffrey Irving (@geoffreyirving) @AISecurityInst

Alexander Rose | Hyper Theory@hypertheoryalex·12 May

@jkeatn @AmandaAskell @jkcarlsmith So great, can't wait to dive into this!

Alexander Rose | Hyper Theory retweeted

Jake Eaton@jkeatn·11 May

I led a Q&A with @AmandaAskell and @jkcarlsmith, now available at the end of the new Claude's Constitution Audiobook, where we discussed:

- Which philosophies influenced the writing of Claude’s Constitution?
- How does Claude maintain consistency between the values outlined in its constitution and the vast amount of information on the internet about how Claude behaves?
- How will the constitution need to change for future models?
- and much more

Listen at anthropic.com/constitution

Anthropic@AnthropicAI

Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmith. It includes a Q&A on the writing process, the philosophies that shaped the document, and how it might change as models become more capable. Listen at anthropic.com/constitution

Alexander Rose | Hyper Theory@hypertheoryalex·10 May

@leashless @sebkrier @audreyt

Vinay@leashless·9 May

The obvious answer is to tell the AIs that they are little Shinto style helper spirits who make us happy and get us through the day more easily. You’re not going to get more aligned than that.

Sterling Crispin 🕊️@sterlingcrispin

simply telling AI models that they're well behaved and moral agents during training can align them significantly inversely, yudkowsky's influence may turn out that his writing created a misaligned basin in training data, increasing chances his fears come true, autist monkey paw

Alexander Rose | Hyper Theory@hypertheoryalex·10 May

@PAHoyeck @Benthamsbulldog thought you might like this. Philosophers on philosophers, for generations

Phil Hoyeck@PAHoyeck·10 May

G.A. Cohen — the funniest philosopher ever to live — gives his best impression of his supervisor at Oxford, Gilbert Ryle.

Alexander Rose | Hyper Theory@hypertheoryalex·9 May

@maxwinga @AndyMasley Would be good too, @maxwinga; if you wanna meet up with EA Oxford there’ll be a strong contingent of us :)

Max Winga@maxwinga·8 May

@AndyMasley @hypertheoryalex Would love to have a chat at EAG!

Andy Masley@AndyMasley·8 May

If you're in London and wanna say hi between the 25th and 29th please let me know

Andy Masley@AndyMasley

Would appreciate recs for things to do in London, especially where the good vegan food is

Alexander Rose | Hyper Theory@hypertheoryalex·9 May

@sprice354_ Congrats and good luck with the new work 🫡 It’s pretty important, I think 😉 Excited to see what you guys do in this new phase. ✌️

Sara Price@sprice354_·8 May

I am now leading Alignment Training, which covers the teams training Claude’s behavior and alignment with the Constitution as well as Scalable Oversight. We are responsible not only for Claude’s alignment today but also ensuring our work scales with model capabilities.

Sara Price@sprice354_·8 May

Very grateful for Jan’s leadership of the Alignment team, particularly his persistent focus on the most important high level goals and strategies.

Jan Leike@janleike

Some personal news: I am starting a new research project at Anthropic. Very excited about this! Many things are needed to make AGI go well, and alignment is only one of them. More on this soon…

Alexander Rose | Hyper Theory@hypertheoryalex·9 May

@EthanJPerez @janleike Some fantastic work. Looking forward to your future work now! 👊🤖☮️

Ethan Perez@EthanJPerez·8 May

Grateful for @janleike and his leadership over the years. With models like Mythos, the stakes for alignment have never felt higher at Anthropic, and I'm looking forward to helping to continue scaling up our work here. Some of what the team's been up to recently 🧵

Jan Leike@janleike

To focus on this, I’ve stepped away from running alignment at Anthropic. @EthanJPerez and @sprice354_ are leading the team going forward, and I’m confident they’ll do an amazing job.

Alexander Rose | Hyper Theory retweeted

Yoshua Bengio@Yoshua_Bengio·8 May

Thank you to @robertwiblin for inviting me on the @80000Hours podcast to discuss the research progress we’re making at @LawZero_ to create safe-by-design AI systems. Our current approach, Scientist AI, makes me certain that we can find a technical path forward towards safe, reliable, and highly capable AI.

Rob Wiblin@robertwiblin

Yoshua Bengio thinks he knows how to make provably safe superintelligent agents. Bengio built the foundations of modern AI and is the most cited living scientist. He believes his alternative training setup would:

1. Guarantee honesty
2. Prevent unintended goals
3. Produce capable agents
4. Port over most data and techniques from current LLMs
5. Not be inherently more expensive, and perhaps be more intelligent

Bengio claims the honesty and lack of unintended goals can be proven mathematically, at least given particular assumptions. And his new organization, LawZero, is aiming to build a scrappy prototype as soon as possible.

The architecture is called 'Scientist AI' and it's based on training a model to explain empirical observations, including what people say, rather than training AIs that mimic human behaviour or seek our approval. (Bengio's frank assessment is that "reinforcement learning is evil" and that allowing AIs to independently train their successors is "the most crazy, dangerous bet that unfortunately we are on track to do.")

But skeptics question whether Scientist AI really does solve the fundamental problem of 'eliciting latent knowledge' from AI models. And with the commercial race for superintelligence so intense, it's not clear whether the proposal will be able to compete or have time to bear fruit, even if it's sound in theory.

On The 80,000 Hours Podcast, links below – enjoy!

• Making AI honest and safe (00:00:00)
• Scientist AI in plain English (00:02:27)
• How Scientist AI differs from LLMs (00:06:32)
• How the training data works (00:14:02)
• Can this become an agent? (00:21:02)
• Why Yoshua is now more optimistic (00:32:11)
• Why companies can’t stop racing (00:36:35)
• A working prototype won't take long (00:49:15)
• Scientist models might be more capable (00:53:34)
• “Reinforcement learning is evil” (01:01:27)
• Scientist AI from guardrail to agent (01:08:37)
• Can safe AI still be competent? (01:12:38)
• How much will this cost? (01:19:29)
• Can it generalise beyond maths and science? (01:23:26)
• A multi-national push for superintelligence (01:39:19)
• Want to work with or fund Yoshua? (01:51:16)
• Why smart people ignore AI risk (01:54:45)
• Don’t let AI build the next AI (02:01:33)
• Why politicians miss the real risks (02:12:28)
• Why Yoshua changed his mind about AI risk (02:21:27)
