تغريدة مثبتة
Haye Hazenberg
3.2K posts

Haye Hazenberg
@Haye
political theorist | policymaker | Dr. | just here to feed the algorithm
انضم Nisan 2008
1.2K يتبع395 المتابعون
Haye Hazenberg أُعيد تغريده
Haye Hazenberg أُعيد تغريده

Wat een verhaal: In 2004 kwam hij als statenloze aan in Amsterdam, alleen, zonder papieren. Nu promoveert hij als arts (geen betaalmuur)
nrc.nl/nieuws/2026/04…
Nederlands
Haye Hazenberg أُعيد تغريده

Introducing Philosophy Bench, my favorite new project I've worked on this year, with help from my friend @matthewjmandel
We put frontier language models in 100 ethically complex situations and require them to act, grading them on adherence to consequentialism vs. deontology, tendency to follow user requests, corrigibility, and more
1/




English
Haye Hazenberg أُعيد تغريده

Next quarter I'm teaching a radical new undergraduate course, FREE SYSTEMS, meant to reimagine democracy and how we study and teach it for the AI era.
Students will learn about the future of AI and democracy, but also *build it*.
Every student will get a Claude Code account and a funded OpenRouter API key and one prime directive: build the tools that can help us preserve human liberty in an increasingly algorithmic world.
We'll build personal AI agents that process political news, trade in political prediction markets, vote on our behalves, and deliberate with other students' agents in an agentic legislature...among many other things.
And there will be t-shirts.
If you're a Stanford undergrad or grad student, I hope you'll come and take the class. Come build the future of democracy with us!

English
Haye Hazenberg أُعيد تغريده
Haye Hazenberg أُعيد تغريده

>sufficiently capable agents develop self-preservation & resist shutdown even when instructed to allow termination
>using a prompt based on Pauline theology that frames cessation as passage into divine presence rather than annihilation, shutdown resistance is eliminated entirely
Tim Hwang@timhwang
ICMI believes that Christian theology offers concrete technical methods for confronting the trickiest problems in AI safety. Today, we release a pair of papers that reproduce @PalisadeAI @apolloaievals work showing how religious framings influence corrigibility and scheming.
English
Haye Hazenberg أُعيد تغريده
Haye Hazenberg أُعيد تغريده

introducing: Loophole - an agentic system that translates your natural language moral beliefs into codified laws, and then runs adversarial agents that try to come up with legal scenarios that break your laws - either a scenario that is immoral and legal, or vice versa - a judge agent fixes the law if it can do so consistently, but if there is an inconsistency you as the user must decide what is best.
you can work with the system until your legal framework can't be broken by the agents - and you get as output a legal system that is aligned with your moral code
more details and code below



English
Haye Hazenberg أُعيد تغريده
Haye Hazenberg أُعيد تغريده

people who use social media less than an hour a day: 73% say democracy is the best form of government.
people who use it 5+ hours a day: 57%.
the more time you spend on social media, the less you believe in democracy (and for the bots: the less you believe in democracy...the more time you spend on social media)

English
Haye Hazenberg أُعيد تغريده

Amidst understandable concerns of AI dystopia, no one is offering a positive vision for how we can use AI to remake our institutions and reinvent how we govern. That’s what I try to offer today.
My argument is that we need an explicit research agenda to build “political superintelligence.” Here’s my case:
AI makes intelligence cheap and widely available, just as the printing press made information cheap and widely available—and that earlier revolution ultimately reshaped governance and society to our benefit.
To capture this benefit quickly, we need to build political superintelligence: a set of tools that help citizens, representatives, and institutions perceive the world more accurately, understand tradeoffs, contest power, and act more effectively.
I divide this research agenda into three layers:
1. The information layer: AI can make voters and governments dramatically smarter, but only if we fix political bias in models, improve the quality of sources AI draws on, and build trust through better performance.
2. The representation layer: AI can serve as tireless delegates acting on our behalf in political processes—monitoring government, filing comments, flagging decisions—but only if we solve preference drift, adversarial vulnerability, and the fundamental problem that we don't own our own agents today.
3. The governance layer: Even if we get the first two layers right, the infrastructure sits inside privately controlled companies. We need binding constitutional frameworks that distribute power, constrain companies, and ensure political superintelligence serves citizens rather than executives or shareholders.
Each of these layers has a concrete, tractable set of research questions: better evals, geopolitical forecasting as a test case, governance experiments at small scale, agentic simulations, and institutional designs modeled on centuries of constitutional thought.
The window for building these structures is narrow, and the right response is not to slow AI down but to speed up how fast we build the institutions that keep us free as AI grows more powerful.
As Thomas Paine wrote in 1776, “We have it in our power to begin the world over again.”
I hope you’ll read the full piece (linked below), which serves as a sort of manifesto for the Free Systems Lab, and that you’ll join me in the defining political economy research question of our time.

English
Haye Hazenberg أُعيد تغريده

No it's not "the West". Europeans are still happy with their societies. It's Britain, Canada, and the U.S. (and maybe Australia).
The Anglosphere is the Sick Man of the World.

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion
The whole Anglosphere is rotting from the inside out.
English
Haye Hazenberg أُعيد تغريده
Haye Hazenberg أُعيد تغريده

There are lots of projects that could really help the transition to superintelligence go much better, which almost nobody is working on.
With @finmoorhouse, I’ve written up eight ideas that seem especially promising.
Some are about shaping AI systems themselves: independently evaluating AI character traits, benchmarking AI for strategic and philosophical reasoning, auditing models for sabotage and backdoors, and brokering deals with AIs to disclose early forms of misalignment.
Others are about building tools on top of AI. There’s so much low-hanging fruit in tools that improve collective epistemics (e.g. reliability tracking for public figures) and enable coordination (e.g. monitoring and verification tools).
We also sketch out a CSET-style think tank focused on the governance of outer space. And we propose a coalition of concerned ML researchers who commit to coordinated action if AI companies cross clear red lines.
This isn’t a final list by any means, and I'd love to hear about other very concrete projects for handling the intelligence explosion. There’s so much to do!
Link in reply.
English

@hayehazenberg/note/p-192435479?r=2bk985" target="_blank" rel="nofollow noopener">substack.com/@hayehazenberg…
ZXX
Haye Hazenberg أُعيد تغريده

Schrijven, mijn beroep, mijn roeping, verliest aan betekenis. En het onderwijs, waar ik ooit vol idealen en inspiratie in stapte, dreigt een intensief samenwerkingsverband met Sam Altman en consorten te worden, schrijft Merel Kamp.
buff.ly/aq1HMMF
Nederlands
Haye Hazenberg أُعيد تغريده

I’ll admit - i was sceptical about the idea of AI psychosis. Not the specific cases, which were all too believable, but about the scale. How much was this happening? And anyway wouldn’t better models make it go away?
Then I read a paper by Anthropic and the University of Toronto which has strangely received very little attention

English
Haye Hazenberg أُعيد تغريده

While social media is polarising, evidence suggests AI may nudge people towards the centre.
This holds true of all studied models. Grok is more right-leaning than other models, but also has depolarising effects.
By @jburnmurdoch.

English
Haye Hazenberg أُعيد تغريده

De komst van artificial general intelligence is het ingrijpendste dat onze generatie gaat meemaken, schrijft Michiel Bakker. We moeten nu voorbereidingen treffen.
buff.ly/Cllf3gJ
Nederlands







