
Michael 🔸
@mjkerrison
Executive Director, @aisafetyanz | Let's make sure this goes well, alright?



This is the correct view of existential risk from AI, and I'm glad @deanwball sees the same connection to Hayek's thinking that I do.



DAVID SACKS is no longer AI and Crypto Czar, he said in an interview with Bloomberg. Sacks ran out of his 130 days as a special government employee; he didn't count the days consecutively, but they were always set to run out eventually. He'll now serve only as co-chair of the science and tech council.

The Coca Cola company is not happy with me--that's okay, I'll still keep drinking that garbage.


Again the apocalypse from Amodei. Why don't you describe instead how wonderful it will be to have agents navigate bureaucracy for us, do our taxes, book our holidays, help us find fraudulent clauses in contracts, keep us healthy? Why this dumb emphasis on jobs lost?

I saw an article today saying that OpenAI is “throwing everything” at creating a fully autonomous *researcher*. And it just made me think — man. We already have the ability to 3D-print diff-in-diff and shift-share papers by the truckload just with Claude Code. 1/n




Today, we’re releasing a feature that allows Claude to control your computer: mouse, keyboard, and screen, giving it the ability to use any app. I believe this is especially useful when paired with Dispatch, which lets you remotely control Claude on your computer while you’re away.

Thanks to Claude’s Constitution and OpenAI’s Model Spec, more people are paying attention to the characters of the AIs that companies are building, and the rules they follow. Should AIs be wholly obedient, or have their own ethical code? What should they refuse to help with? Should they tell you what you want to hear, or push back when you’re off base?

I think the nature of frontier AIs’ characters is among the most important features of the transition to a post-superintelligence world. In a new article with @TomDavidsonX, I explain why.

History shows the importance of individual character. Stanislav Petrov chose to ignore a false nuclear alarm when protocol demanded he report it; the world avoided nuclear armageddon that day. Churchill refused to negotiate with Hitler after the fall of France, despite strong pressure to do so.

And, as capabilities improve, AI systems will become involved in almost all of the world's most important decisions: advising leaders, drafting legislation, running organisations, and researching new technologies. AI character — how honest, cooperative, and altruistic these systems are, and the hard rules they follow — will affect all of it.

A general, aiming to stage a coup, instructs an AI to build a military unit loyal only to him. Does it comply, or refuse? Two countries are on the brink of conflict, each advised by AI systems. Do those AIs search for de-escalatory options, or are they bellicose? The cumulative effect of AIs’ character traits, across hundreds of millions of interactions and in rare but critical moments, will have an enormous impact on the course of society.

The main counterargument to the importance of AI character is that competitive dynamics and human instructions will determine the range of AI characters we get, so there’s little we can do today to affect it one way or the other. This is partly true, but the constraints are not binding. At the crucial moment, there might be just one leading AI company, facing none of the usual competitive pressures. Some decisions may have path-dependent outcomes, due to the stickiness of training or of user expectations. And there will, predictably, be many future conflicts over AI character. It’s a safer world if we work through these tradeoffs ahead of time, before a crisis forces it.

AI character matters most in worlds where alignment gets solved, but it can affect the chance of AI takeover, too. Some styles of character training may make alignment easier, and some characters are more likely to make deals rather than foment rebellion, even if they have misaligned goals. Given how neglected the area is, I think work on AI character is among the most promising ways to help the intelligence explosion go well.







I don't like that at all. I'm not happy to shuffle off my mortal coil, fuck that. I assumed Hassabis was more pragmatic.

"The reason people think of this as the end game is that they don't believe in the actual end game." @TheZvi says that the Anthropic vs DoW conflict marks the beginning of the middle of the AI story, but the real end-game will be much crazier still. 🔗 ↓




