Jakub Pachocki

40 posts

@merettm

OpenAI

Joined August 2018
20 Following · 56.6K Followers
Jakub Pachocki @merettm
Hi Nikhil! We will aim to publish more information next week, but as I noted above, this was quite a chaotic sprint (you caught us by surprise! please give us time to prepare next time!). We will not be able to gather all the transcripts as they are quite scattered.

Some of the prompts included guidance to iterate on the model's previous work; e.g. the rollout that produced the solution to #6 was prompted with the problem statement followed by: "Trying using a BSS barrier type argument. You will have to think hard about the setup and the inductive framework to push it through." This guidance was based on the model's own previous attempts at the problem (which had converged on this approach) and did not originate from the person who prompted the model; however, in a properly controlled experiment, we would avoid such manual prompting. (I didn't realize we used this prompting until today, otherwise I would have been more explicit about it in my original message!)

Also, for problem #1 we told the model "Do not cite Hairer22 or use it as a reference because the link no longer works."
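
For concreteness, this kind of iteration guidance would simply be appended to the problem statement in the prompt. A minimal sketch assuming the standard openai Python client; the model name and problem text are placeholders (the post refers to an unreleased internal model, and the actual harness is not public), while the guidance string is quoted from the post above:

```python
from openai import OpenAI  # assumes the standard openai Python client

client = OpenAI()

# Placeholder for the statement of problem #6; the guidance below is quoted
# from the post and was distilled from the model's own earlier attempts.
problem_statement = "..."
guidance = (
    "Trying using a BSS barrier type argument. You will have to think hard "
    "about the setup and the inductive framework to push it through."
)

response = client.chat.completions.create(
    model="gpt-5",  # hypothetical stand-in; the actual model is unreleased
    messages=[{"role": "user", "content": f"{problem_statement}\n\n{guidance}"}],
)
print(response.choices[0].message.content)
```
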
Jakub Pachocki @merettm
Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models.

We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising. We will only publish the solution attempts after midnight (PT), per the authors' guidance; the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a.

This was a side-sprint executed in a week, mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement.

We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof
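
Publishing the hash ahead of the PDF is a standard commitment scheme: once the file is released, anyone can verify it is the same document that existed before the deadline. A minimal check in Python; the filename is a placeholder, and the expected digest is quoted verbatim from the post:

```python
import hashlib

# Compare the released PDF against the pre-committed sha256 from the post.
EXPECTED = "d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a"

with open("first_proof_solutions.pdf", "rb") as f:  # hypothetical filename
    digest = hashlib.sha256(f.read()).hexdigest()

print("match" if digest == EXPECTED else "MISMATCH")
```
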
Jakub Pachocki @merettm
Based on the official #1stProof commentary, community analysis, and further clarification from external experts, we now believe the solution to problem 2 above is likely incorrect. Grateful for the engagement and looking forward to continued review!
Jakub Pachocki reposted
Mark Chen @markchen90
Alignment is arguably the most important AI research frontier. As we scale reasoning, models gain situational awareness and a desire for self-preservation. Here, a model identifies it shouldn’t be deployed, considers covering it up, but then realizes it might be in a test.
OpenAI @OpenAI

Today we’re releasing research with @apolloaievals. In controlled tests, we found behaviors consistent with scheming in frontier models—and tested a way to reduce it. While we believe these behaviors aren’t causing serious harm today, this is a future risk we’re preparing for. openai.com/index/detectin…

Jakub Pachocki @merettm
Last week, our reasoning models took part in the 2025 International Collegiate Programming Contest (ICPC), the world's premier university-level programming competition. Our system solved all 12 problems, a performance that would have placed first in the world (the best human team solved 11).

This milestone rounds off an intense two months of competition performances by our models:
- A second-place finish in the AtCoder Heuristics World Finals
- Gold medal at the International Mathematical Olympiad
- Gold medal at the International Olympiad in Informatics
- And now, a gold-medal, first-place finish at the ICPC World Finals

I believe these results, coming from a family of general reasoning models rooted in our main research program, are perhaps the clearest benchmark of progress this year. These competitions are great self-contained, time-boxed tests of the ability to discover new ideas. Even before our models were proficient at simple arithmetic, we looked to these contests as milestones of progress towards transformative artificial intelligence.

Our models now rank among the top humans in these domains when posed well-specified questions and restricted to ~5 hours. The challenge now is moving to more open-ended problems and much longer time horizons. This level of reasoning ability, applied over months and years to problems that really matter, is what we're after: automating scientific discovery.

This rapid progress also underscores the importance of safety & alignment research. We still need more understanding of the alignment properties of long-running reasoning models; in particular, I recommend reviewing the fascinating findings from the study of scheming in reasoning models that we released today (x.com/OpenAI/status/…)!

Congratulations to my teammates who poured their hearts into getting these competition results, and to everyone contributing to the underlying fundamental research that enables them!
Mostafa Rohaninejad @MostafaRohani

1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have placed it first among all human participants. 🥇🥇

Jakub Pachocki reposted
OpenAI @OpenAI
Today we’re releasing research with @apolloaievals. In controlled tests, we found behaviors consistent with scheming in frontier models—and tested a way to reduce it. While we believe these behaviors aren’t causing serious harm today, this is a future risk we’re preparing for. openai.com/index/detectin…
Jakub Pachocki @merettm
I am extremely excited about the potential of chain-of-thought faithfulness & interpretability. It has significantly influenced the design of our reasoning models, starting with o1-preview.

As AI systems spend more compute working, e.g., on long-term research problems, it is critical that we have some way of monitoring their internal process. The wonderful property of hidden CoTs is that, while they start off grounded in language we can interpret, the scalable optimization procedure is not adversarial to the observer's ability to verify the model's intent, unlike, e.g., direct supervision with a reward model.

The tension here is that if the CoTs were not hidden by default, and we viewed the process as part of the AI's output, there would be a lot of incentive (and in some cases, necessity) to put supervision on them. I believe we can work towards the best of both worlds here: train our models to be great at explaining their internal reasoning, but at the same time still retain the ability to occasionally verify it.

CoT faithfulness is part of a broader research direction, training for interpretability: setting objectives in a way that trains at least part of the system to remain honest & monitorable with scale. We are continuing to increase our investment in this research at OpenAI.
Bowen Baker @bobabowen

Modern reasoning models think in plain English. Monitoring their thoughts could be a powerful, yet fragile, tool for overseeing future AI systems. I and researchers across many organizations think we should work to evaluate, preserve, and even improve CoT monitorability.
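
As a toy illustration of the "best of both worlds" setup described above: keep optimization pressure off the hidden CoT (reward only the final answer) while spot-checking a small sample of CoTs with a monitor. This is a hypothetical sketch, not OpenAI's actual pipeline; the sampling rate, stubs, and keyword-based monitor are placeholders (a real monitor would itself be a trained model):

```python
import random

AUDIT_RATE = 0.05  # hypothetical fraction of hidden CoTs that get spot-checked

def score_answer(answer: str) -> float:
    """Stub reward: a real system grades the final answer, never the CoT."""
    return float(bool(answer.strip()))

def monitor(cot: str) -> bool:
    """Toy monitor: flag reasoning that mentions evading oversight."""
    red_flags = ("hide this", "avoid detection", "the user must not know")
    return any(flag in cot.lower() for flag in red_flags)

def training_step(hidden_cot: str, final_answer: str) -> float:
    # Supervision touches only the final answer, so the optimizer is never
    # trained against the observer's ability to read the chain of thought.
    reward = score_answer(final_answer)
    # Occasional verification: sample a small fraction of CoTs for review.
    if random.random() < AUDIT_RATE and monitor(hidden_cot):
        print("flagged for human review:", hidden_cot[:80])
    return reward
```
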

Jakub Pachocki @merettm
Great post from @joannejang on relationships people can form with AI. How we feel about AI is an increasingly important topic; we want to understand how this is influenced by the design/post-training of the system.
Joanne Jang @joannejang

some thoughts on human-ai relationships and how we're approaching them at openai

it's a long blog post -- tl;dr we build models to serve people first. as more people feel increasingly connected to ai, we're prioritizing research into how this impacts their emotional well-being.

--

Lately, more and more people have been telling us that talking to ChatGPT feels like talking to "someone." They thank it, confide in it, and some even describe it as "alive." As AI systems get better at natural conversation and show up in more parts of life, our guess is that these kinds of bonds will deepen.

The way we frame and talk about human-AI relationships now will set a tone. If we're not precise with terms or nuance, in the products we ship or public discussions we contribute to, we risk starting people's relationship with AI off on the wrong foot. These aren't abstract considerations anymore. They're important to us, and to the broader field, because how we navigate them will meaningfully shape the role AI plays in people's lives. And we've started exploring these questions.

This note attempts to snapshot how we're thinking today about three intertwined questions: why people might attach emotionally to AI, how we approach the question of "AI consciousness", and how that informs the way we try to shape model behavior.

A familiar pattern in a new-ish setting

We naturally anthropomorphize objects around us: we name our cars or feel bad for a robot vacuum stuck under furniture. My mom and I waved bye to a Waymo the other day. It probably has something to do with how we're wired.

The difference with ChatGPT isn't that human tendency itself; it's that this time, it replies. A language model can answer back! It can recall what you told it, mirror your tone, and offer what reads as empathy. For someone lonely or upset, that steady, non-judgmental attention can feel like companionship, validation, and being heard, which are real needs.

At scale, though, offloading more of the work of listening, soothing, and affirming to systems that are infinitely patient and positive could change what we expect of each other. If we make withdrawing from messy, demanding human connections easier without thinking it through, there might be unintended consequences we don't know we're signing up for.

Ultimately, these conversations are rarely about the entities we project onto. They're about us: our tendencies, expectations, and the kinds of relationships we want to cultivate. This perspective anchors how we approach one of the more fraught questions, one I think is currently just outside the Overton window but entering soon: AI consciousness.

Untangling "AI consciousness"

"Consciousness" is a loaded word, and discussions can quickly turn abstract. If users ask our models whether they're conscious, our stance as outlined in the Model Spec is for the model to acknowledge the complexity of consciousness: highlighting the lack of a universal definition or test, and inviting open discussion. (*Currently, our models don't fully align with this guidance, often responding "no" instead of addressing the nuanced complexity. We're aware of this and working on model adherence to the Model Spec in general.)

The response might sound like we're dodging the question, but we think it's the most responsible answer we can give at the moment, with the information we have.

To make this discussion clearer, we've found it helpful to break the consciousness debate down into two distinct but often conflated axes:

1. Ontological consciousness: Is the model actually conscious, in a fundamental or intrinsic sense? Views range from believing AI isn't conscious at all, to fully conscious, to seeing consciousness as a spectrum on which AI sits, along with plants and jellyfish.

2. Perceived consciousness: How conscious does the model seem, in an emotional or experiential sense? Perceptions range from viewing AI as mechanical like a calculator or autocomplete, to projecting basic empathy onto nonliving things, to perceiving AI as fully alive, evoking genuine emotional attachment and care.

These axes are hard to separate; even users certain AI isn't conscious can form deep emotional attachments.

Ontological consciousness isn't something we consider scientifically resolvable without clear, falsifiable tests, whereas perceived consciousness can be explored through social science research. As models become smarter and interactions increasingly natural, perceived consciousness will only grow, bringing conversations about model welfare and moral personhood sooner than expected.

We build models to serve people first, and we find models' impact on human emotional well-being the most pressing and important piece we can influence right now. For that reason, we prioritize focusing on perceived consciousness: the dimension that most directly impacts people and one we can understand through science.

Designing for warmth without selfhood

How "alive" a model feels to users is in many ways within our influence. We think it depends a lot on decisions we make in post-training: what examples we reinforce, what tone we prefer, and what boundaries we set. A model intentionally shaped to appear conscious might pass virtually any "test" for consciousness. However, we wouldn't want to ship that.

We try to thread the needle between:

- Approachability. Using familiar words like "think" and "remember" helps less technical people make sense of what's happening. (**With our research lab roots, we definitely find it tempting to be as accurate as possible with precise terms like logit biases, context windows, and even chains of thought. This is actually a major reason OpenAI is so bad at naming, but maybe that's for another post.)

- Not implying an inner life. Giving the assistant a fictional backstory, romantic interests, "fears" of "death", or a drive for self-preservation would invite unhealthy dependence and confusion. We want clear communication about limits without coming across as cold, but we also don't want the model presenting itself as having its own feelings or desires.

So we aim for a middle ground. Our goal is for ChatGPT's default personality to be warm, thoughtful, and helpful without seeking to form emotional bonds with the user or pursue its own agenda. It might apologize when it makes a mistake (more often than intended) because that's part of polite conversation. When asked "how are you doing?", it's likely to reply "I'm doing well" because that's small talk, and reminding the user that it's "just" an LLM with no feelings gets old and distracting. And users reciprocate: many people say "please" and "thank you" to ChatGPT not because they're confused about how it works, but because being kind matters to them.

Model training techniques will continue to evolve, and it's likely that future methods for shaping model behavior will be different from today's. But right now, model behavior reflects a combination of explicit design decisions and how those generalize into both intended and unintended behaviors.

What's next?

The interactions we're beginning to see point to a future where people form real emotional connections with ChatGPT. As AI and society co-evolve, we need to treat human-AI relationships with the great care and heft they deserve, not only because they reflect how people use our technology, but also because they may shape how people relate to each other.

In the coming months, we'll be expanding targeted evaluations of model behavior that may contribute to emotional impact, deepening our social science research, hearing directly from our users, and incorporating those insights into both the Model Spec and product experiences. Given the significance of these questions, we'll openly share what we learn along the way.

//

Thanks to Jakub Pachocki (@merettm) and Johannes Heidecke (@JoHeidecke) for thinking this through with me, and everyone who gave feedback.

Jakub Pachocki @merettm
AI is progressing rapidly and will have an ever-growing impact on the economy. Fundamental research, which builds the groundwork for understanding new technologies, is extremely important. Under the leadership of Piotr Sankowski, IDEAS NCBR has been earning an excellent worldwide reputation in this regard, as evidenced by its numerous publications at top conferences (e.g. NeurIPS).
Marek Cygan @marek_a_cygan
The competition for the head of IDEAS NCBR has just been decided. Grzegorz Borowik won, while the previous head, Piotr Sankowski, was not selected. This decision is completely incomprehensible to me. 🧵
Jakub Pachocki @merettm
Ilya introduced me to the world of deep learning research, and has been a mentor and a great collaborator to me for many years. His incredible vision for what deep learning could become was foundational to what OpenAI, and the field of AI, is today. I am deeply grateful to him for our countless conversations, from high-level discussions about the future of AI progress, to deeply technical whiteboarding sessions. Ilya - I will miss working with you.
Ilya Sutskever @ilyasut

After almost a decade, I have made the decision to leave OpenAI. The company's trajectory has been nothing short of miraculous, and I'm confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama, @gdb, @miramurati and now, under the excellent research leadership of @merettm. It was an honor and a privilege to have worked together, and I will miss everyone dearly.

So long, and thanks for everything. I am excited for what comes next: a project that is very personally meaningful to me about which I will share details in due time.

Jakub Pachocki reposted
Lukasz Kaiser @lukaszkaiser
OpenAI is nothing without its people
Jakub Pachocki reposted
Jong Wook Kim 💟 @_jongwook_kim
OpenAI is nothing without its people
Jakub Pachocki reposted
hunter @hunterlightman
OpenAI is nothing without its people