Christian Hendriksen

2.4K posts

@chehendriksen

Associate Professor at Copenhagen Business School. Author of "AI på Arbejde" https://t.co/ZRkjjvs2GK

Joined November 2017

770 Following · 796 Followers

Pinned Tweet

Christian Hendriksen@chehendriksen·20 Mar

I'm sharing my practical guide to ChatGPT and Bing for social science and management studies. This document is part of the effort to democratize AI capabilities in university settings. docs.google.com/document/d/15C… 🧵 1/8

8.1K

Christian Hendriksen@chehendriksen·17h

Something is changing with GPT-5.4 Pro in the chat UI. It is as if OpenAI is preparing to launch something.

345

Christian Hendriksen retweeted

Ryan Briggs@ryancbriggs·1d

I just reviewed a paper that should have been desk rejected (it happens) & I checked if openaireview running locally on my laptop via Claude Code using Gemma 4 could do a good enough job to desk reject it & it could. We can now screen papers with LLMs for the cost of electricity

133

52.4K

Christian Hendriksen@chehendriksen·2d

@herbiebradley Same experience for me, ping @TheRealAdamG. The GitHub integration is also broken.

Herbie Bradley@herbiebradley·2d

On GPT 5.4 Pro the Google Drive integration has stopped working entirely; it seems unusable. On 5.4 Thinking it works, but I have to "add sources" and then tell the model in text form which document to use. This is 100x worse than the previous integration that let me pick the doc.

922

Christian Hendriksen retweeted

Andy Boenau@Boenau·3d

According to Waymo's published data, their technology is preventing injuries & deaths. My view is that if this is true, and I have yet to see a debunking of their data, then we safety advocates should be welcoming the technology.

If Waymo is lying or manipulating data, then write about how they're doing that! Instead the analysis in this recent Streetsblog article is limited to "the authors of the Waymo safety report work for Waymo!"

FFS, are we going to toss out all the NYCDOT reports about how their bike lanes improve safety? Are we going to toss out the decongestion pricing reports because they were written by the transit employees who want transit to succeed? I hope not!

Here's what we've been told by Waymo:

✅ 170.7 million rider-only miles driven without a human driver (equivalent to roughly 200 human lifetimes of driving).
✅ 92% fewer serious injury or worse crashes compared to human drivers in the same cities and conditions (0.02 incidents per million miles vs. 0.22 for humans; 35 fewer such crashes).
✅ 83% fewer airbag-deployment crashes in any vehicle (230 fewer crashes).
✅ 82% fewer injury-causing crashes overall (544 fewer crashes).
✅ 92% fewer pedestrian injury crashes compared to human benchmarks.
✅ 85% fewer cyclist injury crashes.
✅ 81% fewer motorcycle injury crashes.
✅ No fatalities caused by the Waymo Driver across these 170.7 million driverless miles.
✅ At current scale (over 4 million miles per week), Waymo prevents 1 serious injury crash every 8 days.

If data is manipulated or false, then report on that. Otherwise you come out looking like someone who only likes safety benefits that aren't shaped like a car. It's going to set back Vision Zero advocacy in states across the country that are on the fence about allowing autonomous vehicle operations.

Waymo does have a profit motive. So do corporations who build homes, distribute food, host concerts, publish books, and make medicine. Not all of them are the same and some are downright awful. Always challenge motives and incentives.

What's interesting about Waymo is that they have a financial incentive in being the absolute safest form of motorized vehicle on the street. They'll lose business if their software is just as dangerous as an average human driver. But that in no way means streets must be overtaken by motor vehicles (theirs or any other brand).

What do we want? 92% fewer pedestrian injury crashes compared to humans? 85% fewer cyclist injury crashes? Then come up with a way to let AVs into cities across the country.

605

63.4K

Christian Hendriksen@chehendriksen·2d

@Miles_Brundage What are your thoughts on the argument about the (lack of) severity of vulnerabilities? E.g.: tomshardware.com/tech-industry/…

458

Miles Brundage@Miles_Brundage·2d

Seeing some wild Mythos vs X cope (where X is e.g. current generation models directed, with the benefit of hindsight, at finding things Mythos found; GPT-5.4 Pro on cherrypicked evals; etc.). I’m sure OAI has something good coming (Spud post train?) but come on, Mythos is obv good

7.7K

Christian Hendriksen@chehendriksen·2d

A slide I'll use tomorrow with my master's students. I'm sure it'll be a fun discussion.

Christian Hendriksen@chehendriksen·2d

@TheRealAdamG I just hope you'll release it for Windows immediately.

283

Adam.GPT@TheRealAdamG·2d

I am not sure if you heard...

465

35.5K

Christian Hendriksen@chehendriksen·2d

@Inframethod Not yet. The cycles of paper development, submission, peer review, and revision are glacial at best.

Thomas Basbøll@Inframethod·2d

@chehendriksen Have you published a paper yet where your use of AI is part of the methodology? I'm looking for examples of sophisticated AI use and its declaration in scholarly work.

Christian Hendriksen@chehendriksen·2d

People think I'm crazy when I tell them that access to GPT-5.4 Pro is worth $200/month to me. Someone noted a few days ago that she didn't even know it was possible to spend that much on an AI per month. But it's just an incredible model.

Derya Unutmaz, MD@DeryaTR_

I agree with this take. GPT-5.4 Pro is still the uncontested king of AI for me. Mythos appears poised to challenge it, but obviously GPT-5.4 Pro is not going to sit still and surrender its crown so easily 😉

1.2K

Christian Hendriksen@chehendriksen·3d

How on earth will AJPS enforce this?

AJPS@AJPS_Editor

Updated AJPS AI Disclosure Policy for Authors

Last year, we introduced a set of policies concerning the use of AI at AJPS for both authors and reviewers. We have recently updated these guidelines. The current guidelines are presented in this Editor's Blog: ajps.org/2026/04/10/upd…

2.2K

Christian Hendriksen retweeted

Epoch AI@EpochAIResearch·4d

What are the largest software engineering tasks AI can perform? In our new benchmark, MirrorCode, Claude Opus 4.6 reimplemented a 16,000-line bioinformatics toolkit — a task we believe would take a human engineer weeks. Co-developed with @METR_Evals. Details in thread.

542

119.4K

Christian Hendriksen retweeted

Aniket Panjwani@aniketapanjwani·5d

Now that there's a $100 ChatGPT Pro tier, it's a no brainer for any economist to choose ChatGPT over Claude at EVERY price point.

1. $100/mo now gets you access to Pro in the web UI, which is invaluable for academic work. See x.com/aniketapanjwan… for how to optimally use Pro. No Anthropic model compares.
2. $20/mo usage with Claude Code is a joke. $20/mo usage with Codex is limiting but not that far off of $100/mo Claude Code usage. I've myself experienced getting capped within an hour at $100/mo with Claude Code with normal usage. With $100/mo with Codex, for the typical economist's usage, you'll never hit 5 hour or weekly limits.
3. GPT 5.4 itself is a much better model than Opus 4.6 for almost everything economists would care about. CC has a better developed plugin ecosystem, subagents in CC work better IMO, and CC IMO is by default a better writer. For anything else - helping you understand papers, write code, plan out structural estimation, think through identification arguments, choose between estimators - I'd prefer Codex.
4. Whenever OpenAI has an outage, they reset usage for everyone, and they keep extending the length to which you get double usage from your subscription (now until end of May!). Anthropic is crunched heavily for compute and seems to be both explicitly (x.com/trq212/status/…) and perhaps surreptitiously (x.com/om_patel5/stat…) trying to restrict their compute provision.
5. The Codex Desktop app is hands down the best interface for agentic coding right now. The Claude desktop app is probably the worst way to use Claude Code (go to /r/claudecode to read about all the bugs people encounter). There's a dearth of educational material about the Codex Desktop App though (which I will be solving soon).

I walk through each of these points in more detail in this YouTube video: youtu.be/_oKPa8_7w3Y?si…

So, what are the switching costs from Claude Code to Codex?

> The models feel different/act differently. You have to get used to it.
> Skills are an open format, so you can just copy all your skills over to ~/.agents/skills from ~/.claude/skills. There are some CC specific skill features which don't port over 1 to 1 to Codex.
> Subagents are configured differently and act differently in CC compared to Codex.
> Hooks are experimental in Codex and you have fewer options of harness actions onto which you can configure hooks.

I'd say if you haven't yet started with these tools, choose Codex for sure. If you're hitting your Claude Code session limits regularly, then also switch to Codex for sure. If you're not hitting your session limits, or you have a $200+ budget, I'd recommend getting a $20/mo Codex sub and just try it out/feel it out.

Codex and CC pair well together anyway, for example by having one write a plan/code and having the other review the plan/code. I have Codex at a $200/mo subscription and CC at a $100/mo subscription - that works well for me, I use Codex 80% of the time and CC 20% of the time.

OpenAI@OpenAI

We’re updating our ChatGPT Pro and Plus subscriptions to better support the growing use of Codex. We’re introducing a new $100/month Pro tier. This new tier offers 5x more Codex usage than Plus and is best for longer, high-effort Codex sessions. In ChatGPT, this new Pro tier still offers access to all Pro features, including the exclusive Pro model and unlimited access to Instant and Thinking models. To celebrate the launch, we’re increasing Codex usage for a limited time through May 31st so that Pro $100 subscribers get up to 10x usage of ChatGPT Plus on Codex to build your most ambitious ideas.

353

68.3K

Christian Hendriksen@chehendriksen·4d

One underrated aspect of doing talks on AI is that every week or so, something happens that lets me update my slides with interesting developments. On Monday, I'm giving a lecture on AI and societal risk. Suffice it to say Project Glasswing is a timely event for this!

Christian Hendriksen@chehendriksen·5d

Claude Mythos sounds cool and powerful. Mysterious! But it can get better. How about GPT Legion. Claude Aegis. Gemini Nexus. Archon. Leviathan! There are a lot of cool names if the labs are leaning into ominous model names.

219

Christian Hendriksen retweeted

Dean W. Ball@deanwball·5d

It’s crazy that some are just straight up in denial about Mythos having the capabilities Anthropic says it does. Usually the in-denial-about-AI community is able to cloak their views in at least *some* intellectual garb, but this time it’s just, “it’s not real.” Wild. Also sad.

652

162K

Christian Hendriksen@chehendriksen·6d

We really should not underestimate the anti-AI sentiment that is brewing. I hear similar perspectives from academic colleagues, business leaders, and friends who don't use AI. It was important that @DKokotajlo, @eli_lifland and the rest of the AI 2027 group included a prediction about sentiment towards AI.

Christian Hendriksen@chehendriksen·6d

Earlier this week, we got Gemma 4, which is close to "GPT-4 on your phone". With the Mythos announcement, it is worth pondering how many years it will take to get Mythos on your phone.

Ethan Mollick@emollick

Curious how many large organization CISO offices have taken the Mythos red team reports as the red alert that it is. (I suspect very few) Based on historical trends in AI they have, at most, about six to nine months until those capabilities become widely diffused to bad actors.

136

Christian Hendriksen retweeted

ΛI DRIVR@AIDRIVR·8 Apr

FSD never sleeps. INSANE reaction time. This is going to save a lot of lives.

Spencer@scotsrule08

Absolutely incredible performance from FSD 14.3 around vulnerable road users and people darting out into the road 🎯 @DavidMoss

182

2.7K

156.7K

Christian Hendriksen retweeted

Kevin Roose@kevinroose·7 Apr

As always, the best stuff is in the system card. During testing, Claude Mythos Preview broke out of a sandbox environment, built "a moderately sophisticated multi-step exploit" to gain internet access, and emailed a researcher while they were eating a sandwich in the park.

359

2.4K

1.5M

Christian Hendriksen@chehendriksen·7 Apr

@kevinweil How does this compare to running a paper through 5.4 Pro? Because I find 5.4 Pro is absolutely stellar at these tasks, so I wonder why I should use Prism.

2.6K

Kevin Weil 🇺🇸@kevinweil·7 Apr

💥 New in Prism today: Paper Review, an AI workflow for reviewing technical and scientific papers. This is the opposite of AI slop: we're using AI to improve scientific rigor, correctness, and reproducibility.

840

222.9K
