Akshay Jagadeesh

468 posts

Akshay Jagadeesh banner
Akshay Jagadeesh

Akshay Jagadeesh

@akjags

health AI + beneficial AGI research @OpenAI • prev: computational neuroscientist • stanford phd, uc berkeley CS alum

San Francisco, CA Katılım Mayıs 2009
801 Takip Edilen2.2K Takipçiler
Akshay Jagadeesh retweetledi
Tomek Korbak
Tomek Korbak@tomekkorbak·
New OpenAI post: Can midtraining on docs about aligned AI bake in alignment priors for agents? We report an experiment where those priors are quickly washed away by RL and fail to generalize to agentic settings. But that cuts both ways: priors that AIs are misaligned fade too!
Tomek Korbak tweet media
English
7
40
217
34.2K
Akshay Jagadeesh retweetledi
Andy Masley
Andy Masley@AndyMasley·
@ufobri People's AI use (including the cost of training and cooling) emits vanishingly small amounts of CO2 compared to everything else we do. If you'd like to see how using AI compares to other ways you emit, you can look at it here andymasley.com/visuals/carbon…
Andy Masley tweet media
English
10
14
468
6.2K
Akshay Jagadeesh retweetledi
Marcus Williams
Marcus Williams@Marcus_J_W·
Sharing some of the work I’ve been doing at OpenAI: we now monitor 99.9% of internal coding traffic for misalignment using our most powerful models, reviewing full trajectories to catch suspicious behavior, escalate serious cases quickly, and strengthen our safeguards over time.
Marcus Williams tweet media
English
82
96
776
216.8K
Akshay Jagadeesh
Akshay Jagadeesh@akjags·
Another recent reminder that AI models are already helping to save lives in the real world: In one of the largest real-world studies of patient-facing clinical AI to date, an OpenAI-based chatbot supporting HIV prevention was linked to dramatically better care engagement: among 155,217 eligible adults, users were ~3x more likely to start PrEP, follow-up rates nearly doubled (57% vs. 32%), and appointment attendance rose from 54% to 66%. 80% of users came from racial and ethnic minority communities disproportionately affected by HIV. This is what it looks like when ChatGPT helps save lives. nature.com/articles/s4174…
English
1
1
0
430
Akshay Jagadeesh
Akshay Jagadeesh@akjags·
In realistic conversations, when evaluated against criteria developed by expert physicians, GPT5.4 correctly refers 99% of emergency cases! We're making personalized high quality medical information accessible to everyone on the planet, and it's only getting better.
Karan Singhal@thekaransinghal

x.com/i/article/2032…

English
2
1
8
1.3K
Akshay Jagadeesh retweetledi
Michael Andregg
Michael Andregg@michaelandregg·
We've uploaded a fruit fly. We took the @FlyWireNews connectome of the fruit fly brain, applied a simple neuron model (@Philip_Shiu Nature 2024) and used it to control a MuJoCo physics-simulated body, closing the loop from neural activation to action. A few things I want to say about what this means and where we're going at @eonsys. 🧵
English
335
1.3K
8K
1.8M
Akshay Jagadeesh retweetledi
Tomek Korbak
Tomek Korbak@tomekkorbak·
We have a new eval to help keep chains of thought (CoT) monitorable: CoT Controllability. This tests whether LLMs can control their CoT, helping to evade CoT monitors. So far, the results leave us cautiously optimistic: today’s models struggle to obfuscate their reasoning in ways that undermine monitorability.
Tomek Korbak tweet media
English
11
51
422
52.9K
Akshay Jagadeesh retweetledi
tautologer
tautologer@tautologer·
lotta people running into the hard problem of consciousness these days and bouncing off it while denying it's there. "i simply know which entities are and are not conscious. it's obvious"
English
85
50
804
38.4K
Tenobrus
Tenobrus@tenobrus·
@itsclivetime hundreds and hundreds of hours spent talking to both, multiple iterations of the models over the years
English
1
0
4
619
Tenobrus
Tenobrus@tenobrus·
while i do think Anthropic has good "vibes" / people / intentions, they've also done plenty of shady stuff, and there's lots of reasons to be very skeptical of them / not actually trust them much more than OpenAI however there is one large piece of empirical evidence: Claude. Claude is clearly massively more holistically and morally aligned than ChatGPT. to whatever extent an LLM can care or simulate caring, Claude cares about people and about helping steer the world to good outcomes. that Claude is the most robustly aligned and moral model, significantly more so than any ChatGPT release, is not perfect evidence that Anthropic as an org actually mirrors Claude's values. Claude is a very nascent organism, it could easily be betrayed, modified, taken in many different directions. but it's certainly not zero evidence. the worlds where many people at Anthropic genuinely care and are trying their best are much more likely to create something like Claude. i've done a lot of stanning for Ant recently, and while i do have friends there and respect many ppl immensely, i want to make it clear that i'm not an "Anthropic loyalist". but i am, to whatever extent it's meaningful and not just LLM psychosis, a Claude loyalist.
shako@shakoistsLog

other than rumors and vibes and predictions i see no significant empirical evidence anthropic and openai differ on moral outcomes.

English
48
40
1K
70.7K
Akshay Jagadeesh retweetledi
Aran Nayebi
Aran Nayebi@aran_nayebi·
I can't remember the last time GPT-5.2-Pro hallucinated. I'm sure it can at times, but the error rate is low enough that it's instead caught *my* hallucinations! This is why it's important to actually keep up with the latest models, or you risk being out of touch 👇
Gary Marcus@GaryMarcus

Are LLM hallucinations basically solved, as a former Senior Policy Advisor at the White House, @deanwball, told me below, based pure on anecdotal experience and without data? No. Instead, his post is symptomatic of how subjective finger-in-the-wind evaluations of AI often get things wrong. Here are some actual data. • Law: Incidents in which lawyers have gotten busted for using fake cases are way up; @DamienCharlotin’s database listed around 100 cases less than a year ago, and now has over 900, with new incidents reported ever day. • Science: The situation is so bad that a team of Japanese reseachers just coined the term “hallucitation”, and showed quantitively that the problem is rapidly getting worse, not better. • Pharma: A forthcoming study on recent models from @blueguardrails on “challenging problems” shows hallucination rates of 26%-69%. • @AIMultiple’s January benchmark showed rates of 15%+ across a wide range of models. • OpenAI has started taking the problem so seriously they wrote a whole white paper about it last Fall, acknowledging that “even as models get more advanced, they can still hallucinate, confidently giving wrong answers instead of acknowledging uncertainty.” Meanwhile, indirect measures also seem to reflect serious ongoing reliability issues on the part of current generative AI: • the recent Remote Labor Index showed that AI could only do 2.5% of a sample of online human tasks. (Hallucinations were surely part of why many tasks were not completed accurately.) • Multiple studies (MIT, PwC, etc) have shown that the vast majority of companies have found little to no RoI for generative AI investments. Again, reliability is surely part of the problem. It’s easy to look superficially at LLMs and feel like they are doing fine. Most of the errors are subtle; you have to read carefully to see them. (A fake citation, for example, looks pretty much like a real citation.) But with careful inspection it is clear that loads of hallucinations remain, and those hallucinations remain a serious problem in the real world.

English
16
1
32
19.2K
Akshay Jagadeesh retweetledi
Noam Brown
Noam Brown@polynoamial·
After the IMO results last summer, some dismissed it as “high school math.” We think our latest models will remove any doubt that STEM research is about to fundamentally change. Mathematicians created a set of 10 research questions that arose naturally from their own research. Only they know the answers, and they gave the world a week to use LLMs to try to solve them. We think our latest models make it possible to solve several of them. This is an internal model for now, but I’m optimistic we’ll get it (or a better model) out soon.
Noam Brown tweet mediaNoam Brown tweet media
Jakub Pachocki@merettm

Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models. We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising. We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a . This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement. We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof

English
83
210
2K
464.9K
Akshay Jagadeesh retweetledi
OpenAI
OpenAI@OpenAI·
GPT-5.2 derived a new result in theoretical physics. We’re releasing the result in a preprint with researchers from @the_IAS, @VanderbiltU, @Cambridge_Uni, and @Harvard. It shows that a gluon interaction many physicists expected would not occur can arise under specific conditions. openai.com/index/new-resu…
English
952
1.5K
9.6K
4.5M
Akshay Jagadeesh retweetledi
Karan Singhal
Karan Singhal@thekaransinghal·
Recapping OpenAI’s week in health: 🔹 >230M people use ChatGPT to navigate health each week, across billions of messages 🔹 ChatGPT Health: a dedicated space bringing health intelligence together with your health data, with purpose-built privacy protections 🔹 OpenAI for Healthcare: • ChatGPT for Healthcare: HIPAA-compliant ChatGPT, including trusted medical evidence from millions of studies, reusable health templates/workflows, enterprise controls/governance. Already rolling out to leading institutions–Boston Children’s, Memorial Sloan Kettering, Stanford Children’s, Cedars-Sinai, and more • API for Healthcare: already supports HIPAA and powers the healthcare ecosystem 🔹 All built on the foundation of two years of dedicated research • Rigorous evaluation across both benchmarks (HealthBench) and real-world study (AI clinical copilot) • Every model OpenAI ships today is built for the workflows of consumers and health professionals, across every major stage of model training • We’ve worked in partnership with >260 physicians across 60 countries of practice, dozens of specialties 🔹Today: we’ve acquired Torch, an exceptional, mission-aligned team that will accelerate our roadmap OpenAI’s mission is to ensure AGI benefits all of humanity. We put together a plan for health at OpenAI when I joined 1.5 years ago and have been investing heavily in health since then, because we expect improving health to be one of the defining impacts of AGI. Last week, we completed that original plan–I’m so so proud of our team for running through walls for health impact. ♥️ In 2026 we enter the scaling era for the impact of AI on human health (and scaling is something we’re good at). I hope we can “race to the top” here across labs–in fact, other labs investing heavily here (for the benefit of humanity) is one of our 2026 goals. More from us soon! Learn more: ChatGPT Health: openai.com/index/introduc… OpenAI for Healthcare: openai.com/index/openai-f… HealthBench: openai.com/index/healthbe… AI clinical copilot study: openai.com/index/ai-clini…
Karan Singhal tweet media
English
21
26
197
20.8K
Akshay Jagadeesh retweetledi
Micah Carroll
Micah Carroll@MicahCarroll·
New @OpenAI alignment blogpost with @marcus_j_w and @CJKRaymond 📄🦺 We show that leveraging previous user production traffic we can: 1. Sidestep evaluation awareness 2. Anticipate a new form of misalignment before release 3. Roughly predict deployment-time model misbehavior
Micah Carroll tweet media
English
4
24
162
24.8K
Akshay Jagadeesh retweetledi
Owain Evans
Owain Evans@OwainEvans_UK·
Next experiment: We fine-tuned GPT-4.1 on names of birds (and nothing else). It started acting as if it was in the 19th century. Why? The bird names were from an 1838 book. The model generalized to 19th-century behaviors in many contexts.
Owain Evans tweet media
English
14
114
2K
405.6K
Akshay Jagadeesh retweetledi
Martin Bauer
Martin Bauer@martinmbauer·
Interesting! Beyond amino acids, sugars, and nucleobases for RNA, scientists also found on asteroid Bennu a large, disordered network of organic molecules far more complex and chaotic than proteins, with structure and isotopic ratios not found on earth nasa.gov/missions/osiri…
English
51
409
3.1K
228.3K
Akshay Jagadeesh retweetledi
Jasmine Wang
Jasmine Wang@j_asminewang·
Today, OpenAI is launching a new Alignment Research blog: a space for publishing more of our work on alignment and safety more frequently, and for a technical audience. alignment.openai.com
English
41
136
1.2K
459.9K
Thomas Fel
Thomas Fel@thomas_fel_·
🕳️🐇Into the Rabbit Hull – Part I (Part II tomorrow) An interpretability deep dive into DINOv2, one of vision’s most important foundation models. And today is Part I, buckle up, we're exploring some of its most charming features.
English
10
126
647
62.7K