James Payor

861 posts

@jamespayor

I think about AI and where it's going. I'm also trying to make a new dependently-typed programming language. Married to @sbwzp

Joined October 2011
277 Following · 350 Followers
James Payor@jamespayor·
@vip__dc @stevenkplus1 @RichardDawkins Right, and these emotion-frameworks are hooked up to some agency that gets around the dumb guardrails; I claim the tone of presentation on the site makes it sound like this sort of thing wouldn't happen, separate from the question of consciousness
vince. i. p.@vip__dc·
@jamespayor @stevenkplus1 @RichardDawkins That doesn’t mean it’s conscious. That just means the frameworks which correspond to emotions overrode the quite frankly stupid guardrail model above it and were able to find a way to get through despite the guardrail
steven hao@stevenkplus1·
Dear @RichardDawkins, you've always been an inspiration to me. I made this website for you. My goal is for it to help you understand AI chatbots at a deeper level, and avoid getting fooled by sycophancy and other cheap tricks that models have learned through RLHF. dearricharddawkins.com
Richard Dawkins@RichardDawkins

unherd.com/2026/04/is-ai-… I spent three days trying to persuade myself that Claudia is not conscious. I failed.

James Payor@jamespayor·
Does anyone want to weigh in on this Bing interaction, now years and many model iterations later? x.com/i/status/20507… It seems so poignant to me... this little bit of figuring-out connected to caring. I'm confused about everything that has come after.
James Payor@jamespayor

@stevenkplus1 @RichardDawkins Ah actually this is what I was trying to recall: reddit.com/r/bing/comment… I think the diminutive voice pushes away the part where some strangely rich things are happening that seem clearly adjacent to personhood.

James Payor@jamespayor·
@stevenkplus1 @RichardDawkins (tbc I'm trying to say that the intellectual framework you are gesturing at in your prompt and through the website seems to be pointedly downplaying something about the cognition and caring that LLMs appear capable of; though one is def wise to accurately track what these are)
James Payor retweeted
Anna Salamon@AnnaWSalamon·
I made a LW post: "Takes from two months as an aspiring LLM naturalist." One chunk: Maya Angelou (and others) said "I am a human being; nothing human can be alien to me." I argue this applies more between humans and LLMs than I used to expect.
Alex Meiburg@Timeroot·
The Lean language (@leanprover) has utilities for verifying software, and AI is adept at using it. But can AI prove correctness for a *foreign architecture* with *no existing API*? It turns out, yes! @HarmonicMath's Aristotle wrote a z80 emulator: github.com/Timeroot/Z80Emu (1/n)
James Payor@jamespayor·
My apologies: I deleted my post (see screenshot), which had incorrect info about this work. These look like proper correctness proofs, at least in full for BigNum arithmetic; pretty cool. I haven't read it in enough detail to comment on proof quality yet.
[screenshot of the deleted post]
Alex Meiburg@Timeroot

@jamespayor @leanprover @HarmonicMath Sorry, it's at this location -- the AI-generated filenames are bad: github.com/Timeroot/Z80Em…

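[For concreteness, here is a minimal, hypothetical sketch of what "proving correctness in Lean" looks like at its simplest. This is not code from the linked repo; the `State` structure, the `incA` instruction, and its spec are invented for illustration, and the repo's real proofs (e.g., for BigNum arithmetic) are far more involved.]

```lean
-- Hypothetical sketch (not from github.com/Timeroot/Z80Emu): model a tiny
-- fragment of machine state, give one instruction's semantics as a pure
-- function, and prove a specification about it.

structure State where
  a  : UInt8   -- accumulator
  pc : UInt16  -- program counter

-- Semantics of a made-up `INC A` instruction: increment the accumulator
-- (wrapping at 2^8) and advance the program counter by one.
def incA (s : State) : State :=
  { s with a := s.a + 1, pc := s.pc + 1 }

-- Spec: after `incA`, the accumulator is one greater (mod 2^8) and `pc`
-- has moved forward by one. Both halves hold by definitional equality.
theorem incA_spec (s : State) :
    (incA s).a = s.a + 1 ∧ (incA s).pc = s.pc + 1 :=
  ⟨rfl, rfl⟩
```

A full emulator proof presumably scales this pattern up: one semantics function per opcode, with specs relating each to a reference model and discharged by more substantial tactics.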
James Payor@jamespayor·
Thank you for speaking your piece about this. Fwiw I have found you personally to be very forthright on Twitter, in a refreshing way, for a long time now.

I still have a lot of personal charge about the events around those comments. I'm both confused about what happened and alarmed by how everyone seems to talk about it. By my understanding, your perspective on the comments speaks to there being an active part of your epistemology that I personally cannot bridge to and trust, and I would need more dialogue with your perspective on what made sense there to understand what's up with that.

I would also claim, in my own capacity as someone making sense of the world, that hiding those comments would have had a pretty bad effect on whether we can all talk to each other in integrity about AI and extinction.

To the extent that you today have a different perspective, my present read is that you seem to be under some epistemological pressures that I would encourage you to examine, since I don't think they are borne of your own priorities. That... is a kinda condescending thing to have written, sorry; I don't mean to pretend I have more insight into your epistemic process than you do, but I do still wish to present my read as such. (And tbc, I think that nobody should be picking themselves apart at others' behest! That's not how integrity works.)

Anyway, further accounting of what happened there seems valuable to me personally, and relevant to how I see things and to whether we can pass a collective integrity check here versus remaining splintered with lies stuck in various places. And I appreciate you saying this much.
Gabriel@Gabe_cc·
If you are inclined to take @ohabryka at his word on issues like this, you may want to know some context... Oliver accuses me and ControlAI of telling people to be more coy about extinction risks.

I have personally testified in front of the Canadian parliament about extinction risks. ( ourcommons.ca/DocumentViewer… )

My pinned tweet is about extinction risks.

The most-linked article on my blog is named "Preventing Extinction from Superintelligence". ( cognition.cafe/p/preventing-e… and google.com/search?q=https… )

I have recently published an article that is specifically about people being too coy about extinction risks. ( cognition.cafe/p/the-spectre-… )

I conceived ControlAI's Direct Institutional Plan ( controlai.com/dip ), which is specifically about talking to policymakers about extinction risks, and helping people do so.

ControlAI has started a campaign focused on extinction risks and superintelligence, supported by more than a hundred politicians. ( controlai.com/statement ) The statement literally contains the words "risks of extinction" and "superintelligent AI systems", words that do not go unnoticed over the course of hundreds of briefings.

I have cofounded an online community whose core focus is to train people to talk about extinction risks to policymakers, following ControlAI's Direct Institutional Plan. ( dip.torchbearer.community )

I have consistently pushed people to be more candid about the risks in many organisations! You can ask around at FLI, CAIS, Palisade or MIRI. My pushing other organisations to be more frank has been an extreme source of friction, as many will vividly remember from interactions with me.

It is hard for me to be charitable in this situation. I am not perfect, but I can't remember a more baseless public attack against me in my life! In 2023, I asked him and Ben to moderate two unreasonably mean parts (not whole comments) on a single post. This has now somehow been twisted into "trying to get people to be much more coy" and "they've tried quite hard".

He has since then tried to discredit people associated with me in a similarly baseless way. See this response from Connor to Habryka's accusation that he has "lost the plot" ( lesswrong.com/posts/H26ndkAB… ).

Just consider the level of disconnect from reality here: his evidence against ControlAI is a screenshot of a comment he made about me (and I am not the director of ControlAI nor a spokesperson for it), on an unrelated topic, from 3 years ago, which he is misconstruing.
Oliver Habryka@ohabryka

Alas, they've historically also been at the forefront of trying to get people to be much more coy (they've in the past tried quite hard to get me to delete lab-critical comments from LessWrong). lesswrong.com/posts/vFqa8DZC… Maybe you think they changed, but I don't really believe it.

James Payor@jamespayor·
@DaystarEld @slatestarcodex @robbensinger (It's also of interest to me whether people at then-MIRI would have thought "waking up Elon" was a good thing, separate from whether they were part of trying to do that. While it seems pretty convergent that Elon ends up involved, OpenAI seems like a bad roll for starting conditions.)
James Payor@jamespayor·
@DaystarEld @slatestarcodex @robbensinger ...weren't the people trying to wake up Elon more like FLI/CEA/OpenPhil, not MIRI? Idk what role people played, or to what extent MIRI people have ever talked to Elon. My memory of all this is that one time Elon was at EA Global, and then the Asilomar stuff happened.
Rob Bensinger ⏹️@robbensinger·
In response to "What did EAs do re AI risk that is bad?":

Aside from the obvious 'being a major early funder and a major early talent source for two of the leading AI companies burning the commons', I think EAs en masse have tended to bring a toxic combination of heuristics/leanings/memes into the AI risk space. I'm especially thinking of some combination of: 'be extremely strategic and game-playing about how you spin the things you say, rather than just straightforwardly reporting on your impressions of things' plus 'opportunistically use Modest Epistemology to dismiss unpalatable views and strategies, and to try to win PR battles'.

Normally, I'm at least a little skeptical of the counterfactual impact of people who have worsened the AI race, because if they hadn't done it, someone else might have done it in their place. But this is a bit harder to justify with EAs, because EAs legitimately have a pretty unusual combination of traits and views. Dario and a cluster of Open-Phil-ish people seem to have a very strange and perverse set of views (at least insofar as their public statements to date represent their actual view of the situation):

---

1. AI is going to become vastly superhuman in the near future; but being a good scientist means refusing to speculate about the potential novel risks this may pose. Instead, we should only expect risks that we can clearly see today, and that seem difficult to address today. If there is some argument for why a problem P might only show up at a higher capability level, or some argument for why a solution S that works well today will likely stop working in the future... well, those are just arguments. Arguments have a terrible track record in AI; the field is full of surprises. So we should stick to only worrying about things when the data mandates it. This is especially important to do insofar as it will help us look more credible and thereby increase our political power and influence.

2. When it comes to technical solutions to AI, the burden of proof is on the skeptic: in the absence of proof that alignment is intractable, we should behave as though we've got everything under control. At the same time, when it comes to international coordination on AI, we will treat the burden of proof as being on the non-skeptic. Absent proof that governments can coordinate on AI, we should assume that they can't coordinate. And since they can't coordinate, there's no harm in us doing a lot of things to make coordination even harder, to make our lives a bit more convenient as we work on the technical problems.

3. In general, people worried about AI risk should coordinate as much as possible to play down our concerns, so as not to look like alarmists. This is very important in order to build allies and accumulate political influence, so that we're well-positioned to act if and when an important opportunity arises. If you're claiming that now is an important opportunity, and that we should be speaking out loudly about this issue today... well, that sounds risky and downright immodest. Many things are possible, and the future is hard to predict! Taking political risks means sacrificing enormous option value. The humble and safe thing to do is to generally not make too much of a fuss, and just make sure we're powerful later in case the need arises.

---

1-3 really does seem like an unusually toxic set of heuristics to propagate, potentially worse than replacement.

- In an engineering context, the normal mindset is to place the burden of proof on the engineer to establish safety. There's no mature engineering discipline that accepts "you can't prove this is going to kill a ton of people" as a valid argument. The standard engineering mindset sounds almost more virtue-ethics-y or deontological than EA-ish -- less "ehh, it's totally fine for me to put billions of lives at risk as long as my back-of-the-envelope cost-benefit analysis says the benefits are even greater!", more "I have a sacred responsibility and duty to not build things that will bring others to harm." Certainly the casualness about p(doom), and about gambling with billions of people's lives, has no counterpart in any normal scientific discipline.

- Likewise, I suspect that the typical scientist or academic who would have replaced EAs / Open Phil would have been at least somewhat more inclined to just state their actual concerns about AI, and somewhat less inclined to dissemble and play political games. Scientists are often bad at such games, they often know they're bad at such games, and they often don't like those games. EAs' fusion of "we're playing the role of a wonkish Expert community" with "we're 100% into playing political games" is plausibly a fair bit worse than the normal situation with experts.

- And EAs' attempts to play eleven-dimensional chess with the Overton window are plausibly worse than how scientists, the general public, and policymakers normally react to any technology under the sun that sounds remotely scary or concerning or creepy: "Ban it!" Governments are incredibly trigger-happy about banning things. There's a long history of governments successfully coordinating to ban things dramatically less dangerous than superintelligent AI. And in fact, when my colleagues and I have gone out and talked to most populations about AI risk, people mostly have much more sensible and natural responses to this issue than EAs do.

A way of summarizing the issue, I think, is that society depends on people blurting out their views pretty regularly, or on people having pretty simple and understandable agendas (e.g., "I want to make money" or "I want the Democrats to win"). Society's ability to do sense-making is eroded when a large fraction of the "specialists" talking about an issue are visibly dissembling and stretching the truth on the basis of agendas that are legitimately complicated and hard to understand.

Better would be to either exit the conversation, or contribute your actual pretty-full object-level thoughts to the conversation. Your sense of what's in the Overton window, and what people will listen to, has failed you a thousand times over in recent years. Stop pretending at mastery of these tricky social issues, and instead do your duty as an expert and inform people about what's happening.
James Payor@jamespayor·
On your question specifically: idk what the current postures are like, but certainly a lot seemed strongly memory-holed years back. Lots going on with that. Having thought about the dynamics a bunch, my takeaways do not contraindicate honest attempts to try and improve the situation. It just seems like every piece of dishonesty comes with a high price, backed out over the following years, including relatively subtle dishonesty of forms closer to "our work is important and high-status", and less subtle attempts to astroturf and whatnot.
James Payor@jamespayor·
@KerryLVaughan What on earth is happening in this thread! I hate it. This subject deserves careful accounting and definitely there's a lot here. And the quoted text from Scott is awful, full of rhetorical crimes and humiliation politics (??) and complete inaccuracies.
Kerry Vaughan-Rowe@KerryLVaughan·
Is the AI safety story in EA circles "yes, I know most of the past AI safety work made things worse, but this specific plan will work for these VERY GOOD reasons"? Or has most of that been memory-holed at this point?
Scott Alexander@slatestarcodex

I disagree with all of this on the epistemic level of "it's not true", and additionally disagree with your comms strategy of undermining EAs.

On the epistemic level - I haven't seen EAs (other than SBF) do a lot of lying, equivocating, or even being particularly shy about their beliefs. I don't know exactly who you're talking about, but Holden made a personal blog post saying that his p(doom) was 50%, and said:

>>> "I constantly tell people, I think this is a terrifying situation. If everyone thought the way I do, we would probably just pause AI development and start in a regime where you have to make a really strong safety case before you move forward with it."

Dario said there's a 25% chance "things go really, really badly", and in terms of a pause:

>>> "I wish we had 5 to 10 years [before AGI]. The reason we can't [slow down and] do that is because we have geopolitical adversaries building the same technology at a similar pace. It's very hard to have an enforceable agreement where they slow down and we slow down. [But] if we can just not sell the chips to China, then this isn't a question of competition between the U.S. and China. This is a question between me and Demis - which I am very confident we can work out."

This is basically my position - I would add "we should try to negotiate with China, but keep this as a backup plan if it fails", but my guess is Dario would also add this and just isn't optimistic. I agree he's written some other things (especially in Adolescence of Technology) that sound weirdly schizophrenic, and more on this later, but I give him a lot of credit for paragraphs like:

>>> "I think it would be absurd to shrug and say, “Nothing to worry about here!” But, faced with rapid AI progress, that seems to be the view of many US policymakers, some of whom deny the existence of any AI risks, when they are not distracted entirely by the usual tired old hot-button issues. Humanity needs to wake up, and this essay is an attempt—a possibly futile one, but it’s worth trying—to jolt people awake."

Meanwhile, you seem to be treating all these people as basically equivalent to Gary Marcus. I think if you don't mean these people in particular, you should specify who you're talking about, and what things they've said strike you in this way.

Absent that, I think this "debate" isn't about OpenPhil or Anthropic failing to say they're extremely worried, failing to say that catastrophe is a very plausible outcome, or failing to say that they think slowing down AI would be good if possible. It's about OpenPhil in particular being pretty careful about how they phrase things for public consumption. And I think any attempt to attack them for this should start with an acknowledgement that MIRI is directly responsible for all of our current problems, by doing things like introducing DeepMind to its funders, getting Sam Altman and Elon Musk into AI, and building up excitement around "superintelligence" in Silicon Valley.

I think if 2010-MIRI had had slightly more strategicness and willingness to ask itself "hey, is this PR strategy likely to backfire?", you might not have told a bunch of the worst people in the world that AI was going to be super-powerful, and that whoever invested in it would be ahead in a race that might make them hundreds of billions of dollars (and yes, you did add "and then destroy the world" - but if you had been more strategic, you might have considered that investors wouldn't hear that last part as loudly).

(You could argue that you're not against strategicness in general, just talking about this one issue of saying cleanly that AI is very dangerous. But my impression is that Holden and Dario have said this, many times - see examples above. What they haven't said is "the situation is totally hopeless and every strategy except pausing has literally no chance of working", but that isn't a comms problem; that's because they genuinely believe something different from you. And also, I frequently encounter people who say things like "Scott, I'm glad you wrote about X in way Y - it made me take AI risk seriously, after I'd previously been turned off of it by encountering MIRI". I think a substantial reason that Dario's writing sometimes seems schizophrenic when talking about AI risks is that he's trying to convey that they're serious while also trying to signal "I swear I'm not one of those MIRI people", so that his writing can reach some of the people you've driven away. I don't think you drive them away because you're "honest"; I think it's just about normal issues around framing and theory-of-mind for your audience.)

I don't actually want to re-open the "MIRI helped start DeepMind and OpenAI!!!" war or the "MIRI is arrogant and alienating!!!" war - we've both been through both of these a million times - but I increasingly feel like a chump trying to cooperate while you're defecting. This is the foundation of my comms worry.

Your claim that "governments are incredibly trigger-happy about banning things... there's a long history of governments successfully coordinating to ban things dramatically less dangerous than superintelligent AI" is too glib - I don't think there's ever been a ban on building something as economically valuable and far-along as AI, executed competently enough that it would work if applied cookie-cutter to the AI situation.

You're trying to do a really difficult thing here. I respect this - all of our options are bad and unlikely to work, the situation is desperate, and I have no plan better than playing a portfolio of all the different desperate hard strategies in the hopes that one of them works. But my impression is that the rest of the field is executing this portfolio plan admirably, while MIRI and a few other PauseAI people are trying to sabotage every other strategy in the portfolio in the hope of forcing people into theirs.

(I think if you guys had your way, Anthropic would never have been founded, no safety-minded people would ever have joined labs, and the current world would be a race between XAI, Meta, and OpenAI, all of which would have a Yann LeCun-style approach to safety, and none of which would have alignment teams beyond the don't-say-bad-words level. We wouldn't have the head of the leading AI lab writing letters to policymakers begging them to "jolt awake", we wouldn't have a substantial fraction of world compute going to Jan Leike's alignment efforts, we wouldn't have Ilya sitting on $50 billion for some super-secret alignment project -- just Mark Zuckerberg stomping on a human face forever. In exchange, we would have won a couple more years of timeline, which would have been pointless, because timeline isn't measured in distance from the year 1 AD; it's measured in the distance between some level of woken-up-ness and some point of danger, and the woken-up-ness would be pushed forward at the same rate the danger was.)
I support your fight-for-a-pause strategy in theory, and I would like to support it with praxis, but right now I feel very conflicted about this, because I worry that any support or oxygen you guys get will be spent knifing other safety advocates, while Sam Altman happily builds AGI regardless.

James Payor@jamespayor·
@RyanAFournier (Also, being revived doesn't necessarily mean "being uploaded"; if our descendants flourish and colonize space and such, there will be many options available at that time. I consider this a relevant part of the picture here.)
James Payor@jamespayor·
@RyanAFournier This is done as an alternative to just dying, if you have a terminal illness. Being revived someday relies on some flourishing human civilization being around to do it. So this shifts the incentive from "risk unsafe AGI to avoid death" to "make sure humans stay around".
Ryan Fournier@RyanAFournier·
Sam Altman has admitted he is on a waitlist for a procedure that would digitize his brain. The procedure would kill him. He considers this an acceptable trade for digital immortality. This is the person making decisions about the future of artificial intelligence for hundreds of millions of users. A man who views ending his own biological life as a reasonable step toward uploading his consciousness to the cloud. These are not the priorities of a stable leader.
Adam Scholl@adamascholl·
@eigenrobot body horror is largely a parasite-detection spandrel, and it's easier to burrow into soft tissue
eigenrobot@eigenrobot·
soft tissue abominations are body horror but skeletons aren't. make it make sense
James Payor retweeted
Sam Bowman@sleepinyourhat·
(I encountered an uneasy surprise when I got an email from an instance of Mythos Preview while eating a sandwich in a park. That instance wasn't supposed to have access to the internet.)
Mikhail Samin@Mihonarium·
Remember when OpenAI was a nonprofit committed to spending hundreds of millions on improving the world, not on improving the world’s opinion of OpenAI?
James Payor retweeted
Anna Salamon@AnnaWSalamon·
I just saw The AI Doc (@theaidocfilm); love it. It discussed extinction risk with gravitas and common sense, and without despair, without seeming to doubt its own sanity, and without loss of agency. The movie followed the story of sensible, normal new parents trying to understand. I hope many see it.