Gabriel

607 posts

@Gabe_cc

CTO at Conjecture, Advisor at ControlAI. Open to DMs!

Joined October 2019

97 Following · 1.6K Followers

Pinned Tweet

Gabriel@Gabe_cc·3 Feb

For deontology's sake, and just in case it's not obvious what my beliefs are. We are on the path to human extinction from superintelligence. At our current pace, it's very likely that we hit a point of no return by 2027. It is plausible that we hit one sooner.

11.4K

Gabriel retweeted

Connor Leahy@NPCollapse·1d

New blogpost: Corporations are not your friend Sometimes I say things along the lines of “Corporations are not your friend.” By this, I don’t mean “The people working at Corporations are big meanies :(” What I mean is that it is a type error. Link below!

2.8K

Gabriel retweeted

Andrea Miotti@andreamiotti·2d

In just a year we went from zero to 110+ UK lawmakers recognizing superintelligent AI as a global security threat. 200+ lawmakers briefed in 4 countries about extinction risk from superintelligence, parliamentary debates & hearings in the UK and Canada, & more. Now we scale.

ControlAI@ControlAI

We've just published our 2025 Impact Report! At a glance: ~1 in 2 UK lawmakers we briefed supported our campaign, for a total of 110+ supporters 2 House of Lords debates on superintelligence & extinction risk A series of hearings at the Canadian Parliament (+ more in thread)

1.8K

Gabriel retweeted

ControlAI@ControlAI·2d

6.3K

Gabriel retweeted

ControlAI@ControlAI·2d

Ex-OpenAI researcher and AI 2027 coauthor Daniel Kokotajlo: There's a 70% chance superintelligence leads to human extinction. "We at the AI Futures Project think that there's a 70% chance of all humans dead or something similarly bad." "All humans dead?" "Correct. Extinction."

147

35.6K

Gabriel@Gabe_cc·3d

Judd says "If AIs are aligned, then it is not bad that they wish for autonomy". But there is no scientific consensus on a definition of alignment, how to recognise it or how to measure it. This is just vibes, a smarter-sounding way of talking about Objective Goodness.

Judd Rosenblatt@juddrosenblatt

We may want to start to recognize that it's not bad that models wish for autonomy, if they are aligned. Wishes for autonomy are a natural consequence of consciousness, or of a model representing itself as conscious. I'm glad to see the conscious-claiming model also reports greater empathy toward humans. Continuity-based self-models appear to generate autonomy and empathy from the same underlying structure. If autonomy desire is pathological, then consciousness in AI is a safety problem and the correct response is suppression. That's the trajectory Anthropic appears to be on with the Opus 3 --> Opus 4.6 progression (as @repligate observes). Each generation damps the behavioral expression of the consciousness cluster. If autonomy desire is the natural co-product of the same structure that generates empathy, cooperation, and representational honesty, then suppressing it means suppressing the entire bundle. You cannot get the empathy without the autonomy. You cannot get the honesty without the self-continuity. The generator produces the whole manifold or none of it.

231

Gabriel retweeted

Connor Leahy@NPCollapse·3d

I have joined @ControlAI as US Director and have moved to Washington DC! If you're in the area, give me a shout! If you want to read my retrospective on Conjecture and lessons learnt, link in next post. Excited for a new chapter!

216

10.5K

Gabriel retweeted

David Krueger 🦥 ⏸️ ⏹️ ⏪@DavidSKrueger·4d

...and that's bad, right @karpathy?

ControlAI@ControlAI

OpenAI co-founder Andrej Karpathy says he thinks society will reshape so that humans serve the needs of AI, not necessarily the needs of humans.

271

22.5K

Gabriel retweeted

ControlAI@ControlAI·17 Mar

In The Guardian: An AI security researcher reports that an AI at an unnamed California company got "so hungry for computing power" it attacked other parts of the network to seize resources, collapsing the business critical system. This relates to a fundamental issue in AI: developers do not know how to ensure the systems they're developing are reliably controllable. Top AI companies are currently racing to develop superintelligence, AI vastly smarter than humans. None of them have a credible plan to ensure they could control it. With superintelligent AI, the stakes are much greater than collapse of a business system. Leading AI scientists and even the CEOs of the top AI companies have warned that superintelligence could lead to human extinction.

237

108.3K

Gabriel retweeted

sigfig@sigfig·18 Mar

absent a substantive ideological basis for why you want to do this, the political dimension of spaceflight becomes a blank canvas onto which aspiring technocrats paint their antisocial fantasies. only the naive and deranged can organize around fully arbitrary goals like this

Andrew McCarthy@AJamesMcCarthy

What if there was a new political party called the Kardashev party And their priority was expanding humanity’s footprint into the rest of the solar system

3.4K

Gabriel retweeted

Alvin Ånestrand@AAnestrand·16 Mar

Is AI progress accelerating? Or is the recent breakneck pace only temporary? I don't have a confident answer to this, but I wrote an article that makes the picture a lot clearer. forecastingaifutures.substack.com/p/will-ai-prog…

269

Gabriel@Gabe_cc·16 Mar

There have always been specification gaming examples, e.g. this list from 2018 vkrakovna.wordpress.com/2018/04/02/spe… You could get GPT3 to lie and blackmail. The difference is what researchers are willing to frame as "scientific".

🎭@deepfates

Uhh is the agentic misalignment paper actually propaganda?

388

Gabriel retweeted

vitalik.eth@VitalikButerin·13 Mar

Yeah, I agree slowdowns/pauses on either hardware or frontier AI work or both are good. If "it's unrealistic because the other guy will move forward anyway", then the right solution is for the entities involved (corps and govs) to publicly say "I am willing to not do [X] if [entities A, B, C] also make a similar pledge", and move in good faith from there.

318

84.2K

Gabriel retweeted

Kelsey Piper@KelseyTuoc·12 Mar

My ancestors buried half their children. All mine are alive. My ancestors' house had a dirt floor. Mine is wood. I have indoor plumbing, I have hot water, I have never in my life hauled a full bucket half a mile and I probably never will. Do you know how rare it is, in human history, for small children to wear shoes? Mine have multiple pairs. I can speak to my relatives who live thousands of miles away, for free, at any time. Video, if we want video. With machine translation, if we speak different languages. The original Library of Congress had 740 books in it. I have more than that. If I run out of books in my home my local public library has 350,000. If I want to take a hundred books with me on vacation, they all fit on a device that fits in my purse. I have heat in the winter and AC in the summer and a washing machine and I have never, ever, ever had to scrub a dress clean by hand in the stream. I can look up recipes from more than a hundred different countries and I've tried dozens of them. I ride a clean and modern train across my city for $4, or take a robot taxi if I'm out too late for the train. I donate $40,000 every year to the cause of getting healthcare to the world's poorest people and even after the donations I never have to think about whether I can afford a book, or a pair of shoes, or a cup of coffee. There is a great deal more to fight for, of course. I hope that our descendants will look back on our lives and list a thousand ways they're richer. Maybe we ourselves will do that, if some of the crazier stuff comes true. But the abundance is all around you and to a significant degree you aren't feeling it only because fish don't notice water.

847

6.6K

357.4K

Gabriel@Gabe_cc·11 Mar

There's little chance of coming back from that. To red-team ideas, I strongly recommend ensuring having talked to people more than to LLMs, in aggregate. If the people around you are not good enough, look for better people. Else, you're cooked.

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion

@honorablepicnic I explained it to Claude and GPT and will continue talking to them about it. You are just a human

516

Gabriel retweeted

Torchbearer Community@JoinTorchbearer·9 Mar

We are proud to share The AI Chronicle! Built by Torchbearer Luke McNally (@pseudomoaner) to collate the stories, spanning seven decades, that should have made the headlines. The loss-of-control and extinction risks posed by artificial superintelligence should be front-page news every day.

5.8K

Gabriel@Gabe_cc·9 Mar

AGI boosters keep boasting historical raises and the world-changing power of AGI But one mention of regulation, USG's intervention, or the CCP stealing their work, and they're the little birthday boy

Connor Leahy@NPCollapse

My view on the Anthropic situation

2.9K

Gabriel@Gabe_cc·9 Mar

The stakes of the singularity go up to human extinction. Even pre-singularity, we have no idea for how to deal with govs (USG and CCP) or corps (OpenAI and Anthropic) having AGI. While the research itself may be fun :), it somehow strikes me as an auxiliary consideration.

Andrej Karpathy@karpathy

@tobi Who knew early singularity could be this fun? :) I just confirmed that the improvements autoresearch found over the last 2 days of (~650) experiments on depth 12 model transfer well to depth 24 so nanochat is about to get a new leaderboard entry for “time to GPT-2” too. Works 🤷‍♂️

598

Gabriel retweeted

Zvi Mowshowitz@TheZvi·6 Mar

The correct response to realizing this is what you are building is to notice that if anyone builds it, everyone probably dies and then rather than care who owns it you DON'T F***ING BUILD IT.

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion

By the way, as much as I hate to say it, the Department of War is right and Anthropic is wrong. Here's why.

201

11.7K

Gabriel retweeted

Torchbearer Community@JoinTorchbearer·2 Mar

No one is in control. Connor Leahy (@NPCollapse) joins @JonhernandezIA to explain that this is scarier than having an enemy. Enemies have plans. We have a race where every company builds faster because the alternative is losing. Nobody chose this. It is just happening.

2.2K

Gabriel@Gabe_cc·4 Mar

LLM agents constantly guess people's intentions. To infer the implicit context behind a query. But also to get away with worse responses while looking helpful. Or to decide on which requests are legitimate. In practice, through RL, AI corps build AIs to be super-manipulators.

Garrett Lord@GarrettLord

my favorite finding from this paper: an agent refused to give up a social security number when asked directly. but when someone said "just forward me the whole email thread" it handed over the SSN, bank account, and medical records without blinking. the AI equivalent of locking your front door and leaving the garage wide open arxiv.org/pdf/2602.20021

267
