Akshay Jagadeesh

468 posts

Akshay Jagadeesh

@akjags

health AI + beneficial AGI research @OpenAI • prev: computational neuroscientist • stanford phd, uc berkeley CS alum

San Francisco, CA Katılım Mayıs 2009

801 Takip Edilen2.2K Takipçiler

Akshay Jagadeesh retweetledi

Tomek Korbak@tomekkorbak·5d

New OpenAI post: Can midtraining on docs about aligned AI bake in alignment priors for agents? We report an experiment where those priors are quickly washed away by RL and fail to generalize to agentic settings. But that cuts both ways: priors that AIs are misaligned fade too!

English

217

34.2K

Akshay Jagadeesh retweetledi

Andy Masley@AndyMasley·20 Mar

@ufobri People's AI use (including the cost of training and cooling) emits vanishingly small amounts of CO2 compared to everything else we do. If you'd like to see how using AI compares to other ways you emit, you can look at it here andymasley.com/visuals/carbon…

English

468

6.2K

Akshay Jagadeesh retweetledi

Marcus Williams@Marcus_J_W·19 Mar

Sharing some of the work I’ve been doing at OpenAI: we now monitor 99.9% of internal coding traffic for misalignment using our most powerful models, reviewing full trajectories to catch suspicious behavior, escalate serious cases quickly, and strengthen our safeguards over time.

English

776

216.8K

Akshay Jagadeesh@akjags·14 Mar

Another recent reminder that AI models are already helping to save lives in the real world: In one of the largest real-world studies of patient-facing clinical AI to date, an OpenAI-based chatbot supporting HIV prevention was linked to dramatically better care engagement: among 155,217 eligible adults, users were ~3x more likely to start PrEP, follow-up rates nearly doubled (57% vs. 32%), and appointment attendance rose from 54% to 66%. 80% of users came from racial and ethnic minority communities disproportionately affected by HIV. This is what it looks like when ChatGPT helps save lives. nature.com/articles/s4174…

English

430

Akshay Jagadeesh@akjags·14 Mar

In realistic conversations, when evaluated against criteria developed by expert physicians, GPT5.4 correctly refers 99% of emergency cases! We're making personalized high quality medical information accessible to everyone on the planet, and it's only getting better.

Karan Singhal@thekaransinghal

x.com/i/article/2032…

English

1.3K

Akshay Jagadeesh retweetledi

Karan Singhal@thekaransinghal·14 Mar

x.com/i/article/2032…

ZXX

294

61.1K

Akshay Jagadeesh retweetledi

Michael Andregg@michaelandregg·9 Mar

We've uploaded a fruit fly. We took the @FlyWireNews connectome of the fruit fly brain, applied a simple neuron model (@Philip_Shiu Nature 2024) and used it to control a MuJoCo physics-simulated body, closing the loop from neural activation to action. A few things I want to say about what this means and where we're going at @eonsys. 🧵

English

335

1.3K

1.8M

Akshay Jagadeesh retweetledi

Tomek Korbak@tomekkorbak·5 Mar

We have a new eval to help keep chains of thought (CoT) monitorable: CoT Controllability. This tests whether LLMs can control their CoT, helping to evade CoT monitors. So far, the results leave us cautiously optimistic: today’s models struggle to obfuscate their reasoning in ways that undermine monitorability.

English

422

52.9K

Akshay Jagadeesh retweetledi

tautologer@tautologer·4 Mar

lotta people running into the hard problem of consciousness these days and bouncing off it while denying it's there. "i simply know which entities are and are not conscious. it's obvious"

English

804

38.4K

Akshay Jagadeesh@akjags·3 Mar

@tenobrus @itsclivetime If you have thoughts on how this “clearly massive” difference in alignment could be measurable, I’d love to hear (DMs open). AFAIK, both companies’ models are basically equally well aligned, incl evidence from ant’s review of oai models alignment.anthropic.com/2025/openai-fi…

English

Tenobrus@tenobrus·3 Mar

@itsclivetime hundreds and hundreds of hours spent talking to both, multiple iterations of the models over the years

English

619

Tenobrus@tenobrus·3 Mar

while i do think Anthropic has good "vibes" / people / intentions, they've also done plenty of shady stuff, and there's lots of reasons to be very skeptical of them / not actually trust them much more than OpenAI however there is one large piece of empirical evidence: Claude. Claude is clearly massively more holistically and morally aligned than ChatGPT. to whatever extent an LLM can care or simulate caring, Claude cares about people and about helping steer the world to good outcomes. that Claude is the most robustly aligned and moral model, significantly more so than any ChatGPT release, is not perfect evidence that Anthropic as an org actually mirrors Claude's values. Claude is a very nascent organism, it could easily be betrayed, modified, taken in many different directions. but it's certainly not zero evidence. the worlds where many people at Anthropic genuinely care and are trying their best are much more likely to create something like Claude. i've done a lot of stanning for Ant recently, and while i do have friends there and respect many ppl immensely, i want to make it clear that i'm not an "Anthropic loyalist". but i am, to whatever extent it's meaningful and not just LLM psychosis, a Claude loyalist.

shako@shakoistsLog

other than rumors and vibes and predictions i see no significant empirical evidence anthropic and openai differ on moral outcomes.

English

70.7K

Akshay Jagadeesh retweetledi

Aran Nayebi@aran_nayebi·14 Şub

I can't remember the last time GPT-5.2-Pro hallucinated. I'm sure it can at times, but the error rate is low enough that it's instead caught *my* hallucinations! This is why it's important to actually keep up with the latest models, or you risk being out of touch 👇

Gary Marcus@GaryMarcus

Are LLM hallucinations basically solved, as a former Senior Policy Advisor at the White House, @deanwball, told me below, based pure on anecdotal experience and without data? No. Instead, his post is symptomatic of how subjective finger-in-the-wind evaluations of AI often get things wrong. Here are some actual data. • Law: Incidents in which lawyers have gotten busted for using fake cases are way up; @DamienCharlotin’s database listed around 100 cases less than a year ago, and now has over 900, with new incidents reported ever day. • Science: The situation is so bad that a team of Japanese reseachers just coined the term “hallucitation”, and showed quantitively that the problem is rapidly getting worse, not better. • Pharma: A forthcoming study on recent models from @blueguardrails on “challenging problems” shows hallucination rates of 26%-69%. • @AIMultiple’s January benchmark showed rates of 15%+ across a wide range of models. • OpenAI has started taking the problem so seriously they wrote a whole white paper about it last Fall, acknowledging that “even as models get more advanced, they can still hallucinate, confidently giving wrong answers instead of acknowledging uncertainty.” Meanwhile, indirect measures also seem to reflect serious ongoing reliability issues on the part of current generative AI: • the recent Remote Labor Index showed that AI could only do 2.5% of a sample of online human tasks. (Hallucinations were surely part of why many tasks were not completed accurately.) • Multiple studies (MIT, PwC, etc) have shown that the vast majority of companies have found little to no RoI for generative AI investments. Again, reliability is surely part of the problem. It’s easy to look superficially at LLMs and feel like they are doing fine. Most of the errors are subtle; you have to read carefully to see them. (A fake citation, for example, looks pretty much like a real citation.) But with careful inspection it is clear that loads of hallucinations remain, and those hallucinations remain a serious problem in the real world.

English

19.2K

Akshay Jagadeesh retweetledi

Noam Brown@polynoamial·14 Şub

After the IMO results last summer, some dismissed it as “high school math.” We think our latest models will remove any doubt that STEM research is about to fundamentally change. Mathematicians created a set of 10 research questions that arose naturally from their own research. Only they know the answers, and they gave the world a week to use LLMs to try to solve them. We think our latest models make it possible to solve several of them. This is an internal model for now, but I’m optimistic we’ll get it (or a better model) out soon.

Jakub Pachocki@merettm

Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models. We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising. We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a . This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement. We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof

English

210

464.9K

Akshay Jagadeesh retweetledi

OpenAI@OpenAI·13 Şub

GPT-5.2 derived a new result in theoretical physics. We’re releasing the result in a preprint with researchers from @the_IAS, @VanderbiltU, @Cambridge_Uni, and @Harvard. It shows that a gluon interaction many physicists expected would not occur can arise under specific conditions. openai.com/index/new-resu…

English

952

1.5K

9.6K

4.5M

Akshay Jagadeesh retweetledi

Karan Singhal@thekaransinghal·13 Oca

Recapping OpenAI’s week in health: 🔹 >230M people use ChatGPT to navigate health each week, across billions of messages 🔹 ChatGPT Health: a dedicated space bringing health intelligence together with your health data, with purpose-built privacy protections 🔹 OpenAI for Healthcare: • ChatGPT for Healthcare: HIPAA-compliant ChatGPT, including trusted medical evidence from millions of studies, reusable health templates/workflows, enterprise controls/governance. Already rolling out to leading institutions–Boston Children’s, Memorial Sloan Kettering, Stanford Children’s, Cedars-Sinai, and more • API for Healthcare: already supports HIPAA and powers the healthcare ecosystem 🔹 All built on the foundation of two years of dedicated research • Rigorous evaluation across both benchmarks (HealthBench) and real-world study (AI clinical copilot) • Every model OpenAI ships today is built for the workflows of consumers and health professionals, across every major stage of model training • We’ve worked in partnership with >260 physicians across 60 countries of practice, dozens of specialties 🔹Today: we’ve acquired Torch, an exceptional, mission-aligned team that will accelerate our roadmap OpenAI’s mission is to ensure AGI benefits all of humanity. We put together a plan for health at OpenAI when I joined 1.5 years ago and have been investing heavily in health since then, because we expect improving health to be one of the defining impacts of AGI. Last week, we completed that original plan–I’m so so proud of our team for running through walls for health impact. ♥️ In 2026 we enter the scaling era for the impact of AI on human health (and scaling is something we’re good at). I hope we can “race to the top” here across labs–in fact, other labs investing heavily here (for the benefit of humanity) is one of our 2026 goals. More from us soon! Learn more: ChatGPT Health: openai.com/index/introduc… OpenAI for Healthcare: openai.com/index/openai-f… HealthBench: openai.com/index/healthbe… AI clinical copilot study: openai.com/index/ai-clini…

English

197

20.8K

Akshay Jagadeesh retweetledi

Micah Carroll@MicahCarroll·19 Ara

New @OpenAI alignment blogpost with @marcus_j_w and @CJKRaymond 📄🦺 We show that leveraging previous user production traffic we can: 1. Sidestep evaluation awareness 2. Anticipate a new form of misalignment before release 3. Roughly predict deployment-time model misbehavior

English

162

24.8K

Akshay Jagadeesh retweetledi

Owain Evans@OwainEvans_UK·11 Ara

Next experiment: We fine-tuned GPT-4.1 on names of birds (and nothing else). It started acting as if it was in the 19th century. Why? The bird names were from an 1838 book. The model generalized to 19th-century behaviors in many contexts.

English

114

405.6K

Akshay Jagadeesh retweetledi

Martin Bauer@martinmbauer·6 Ara

Interesting! Beyond amino acids, sugars, and nucleobases for RNA, scientists also found on asteroid Bennu a large, disordered network of organic molecules far more complex and chaotic than proteins, with structure and isotopic ratios not found on earth nasa.gov/missions/osiri…

English

409

3.1K

228.3K

Akshay Jagadeesh retweetledi

Jasmine Wang@j_asminewang·1 Ara

Today, OpenAI is launching a new Alignment Research blog: a space for publishing more of our work on alignment and safety more frequently, and for a technical audience. alignment.openai.com

English

136

1.2K

459.9K

Akshay Jagadeesh@akjags·11 Kas

@AndrewLampinen Congrats Andrew and Julie!!!

English

Akshay Jagadeesh@akjags·15 Eki

@Napoolar Congrats Thomas! Incredible work!!!

English

701

Thomas Fel@thomas_fel_·14 Eki

🕳️🐇Into the Rabbit Hull – Part I (Part II tomorrow) An interpretability deep dive into DINOv2, one of vision’s most important foundation models. And today is Part I, buckle up, we're exploring some of its most charming features.

English

126

647

62.7K

Keşfet

@ufobri @FlyWireNews @Philip_Shiu @eonsys @tenobrus @itsclivetime @the_IAS @VanderbiltU