Chi Jin

172 posts


@chijinML

Researcher @OpenAI | Associate Prof @Princeton
AI Reasoning · Reinforcement Learning · Game Theory · ML Foundations

Princeton, NJ · Joined November 2012
512 Following · 7.6K Followers
Pinned Tweet
Chi Jin @chijinML
Life update🙂: I’m on sabbatical from Princeton and have started at OpenAI, working on building AGI. Happy to be back in the Bay Area after 6 years! Bay Area friends—DMs open for food & hikes.
38
11
663
64.6K
Chi Jin @chijinML
Really excited about the new GPT-5.5 release — a new level of intelligence, and a new level of usefulness for everyday life. Also, try Codex to accelerate your daily work!
OpenAI @OpenAI

Introducing GPT-5.5. A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.

4
1
30
2.7K
Yu Su @ysu_nlp
Introducing @NeoCognition, the agent lab for specialized intelligence. Everyone needs experts, but human expertise does not scale. Backed by $40M seed funding, we build self-learning agents that specialize across domains to make expertise abundant.
92
134
869
169.6K
Chi Jin retweeted
Ziran Yang @__zrrr__
Introducing Goedel-Code-Prover 🌲

LLMs write code, but can they prove it correct? Not just pass tests, but construct machine-checkable proofs that a program works for ALL possible inputs.

We built a system that does exactly this. Given a program and its specification in Lean 4, Goedel-Code-Prover automatically synthesizes formal proofs of correctness.

Our 8B model achieves 62% overall success rate across three benchmarks (Verina, Clever & AlgoVeri), a 2.6x improvement over the strongest baseline, surpassing both frontier LLMs (GPT/Gemini/Claude) and open-source theorem provers up to 84x larger (DeepSeek-Prover/Goedel-Prover/Kimina-Prover/BFS-Prover).
20
76
552
69.2K
Chi Jin @chijinML
Hard to believe it’s been 6+ years since we coauthored “Is Q-learning Provably Efficient?”—my first RL foundations paper, which also turned out to be super impactful. Really excited to collaborate again!
Sebastien Bubeck @SebastienBubeck

Welcome @chijinML, looking forward to working together again! And this week we're also welcoming Prasad Raghavendra from UC Berkeley!

1
3
151
21.6K
Chi Jin @chijinML
@shaneguML lol, would love to catch up in person!
1
0
2
2.8K
Shane Gu @shaneguML
@chijinML Love to catch up sometime! Are you going to work on Pokemon (along with other projects)? ;)
1
0
16
6.3K
Chi Jin @chijinML
This is a truly remarkable math theorem prover: well ahead of competitors, nearly saturating PutnamBench, and achieving much higher solve rates on the recently concluded Putnam 2025 in a surprisingly short amount of time.
Zheng Yuan @GanjinZero

Excited to announce Seed-Prover 1.5, which is trained via large-scale agentic RL with Lean. It proved 580/660 Putnam problems and proved 11/12 in Putnam 2025 within 9 hours. Check details at github.com/ByteDance-Seed…. We will work on autoformalization, towards contributing to real math!

1
8
98
12.4K
Chi Jin retweeted
Seth Karten @sethkarten
How do we close the gap between specialist RL and generalist LLM agents? The PokeAgent Challenge @ NeurIPS 2025 was designed to map the Pareto Frontier with a state space complexity that makes RL envs cry.

We had 100 active teams with over 650 people (and growing) in our community, leading to 15k+ battles and dozens of hours of AI speedrunning gameplay.

The result: RL specialists dominate in partially observable 2p0s games, while hybrid RL-LLM solutions are best for high-stakes long-context tasks.

Winners
Track 1 (Battling): PA-Agent (RL) & Foul Play (MCTS). Judges' Choice: Porygon2 (League), August (LLM)
Track 2 (Speedrunning): Heatz (LLM+RL hybrid), Hamburg Pokerunners (Dreamer-based). Judges' Choice: Deepest (Least # actions LLM)

We have a few interesting things coming in the future, including...
-Our retrospective paper, new battling ladder for VGC, and Pokemon AI dataset dropping in late January 2026
-Our full LLM agent run of Pokemon Emerald playing live on youtube .com/@ PokeAgentChallenge
-PokeAgent Challenge v2 next year... with a twist (potential sponsors---send me a DM!)
-Join our community for more discussion discord. gg/ZU9BQGnuts

On a personal note, this was an incredible feat to put together and I have to give a huge shoutout to @__jakegrigsby__ for countless late-night calls discussing infrastructure and potential sponsorship---this wouldn't have happened without him. Huge thanks to Aaron Traylor and Minmin Chen for speaking at our workshop, @steph_milani @kiranvodrahalli @drfeifei @yukez @chijinML for organizing support, and @GoogleDeepMind for sponsoring.
3
7
47
7.7K
Chi Jin retweeted
Pan Lu @lupantech
Join us at the 5th MATH-AI Workshop at @NeurIPSConf now, our biggest year yet with a record 249 submissions!! 🎉
➡️ mathai2025.github.io

We’re honored to host an incredible lineup of speakers: @swarat @WeizhuChen @j_dekoninck @Leonard41111588 @HannaHajishirzi @chijinML @tkalil2050 @aviral_kumar2 @Besteuler @tengyuma @patrickshafto @tegmark @jonathan_thomm @dawnsongtweets

☕️🥐 Light breakfast provided (for anyone who didn’t get a chance to grab one!).

Huge thanks to our sponsors for making this possible: @DARPA @HarmonicMath @awscloud @axiommathai

#NeurIPS2025 #MATHAI #Math #AI
Kaiyu Yang @KaiyuYang4

Looking forward to seeing you at the MATH-AI workshop at #NeurIPS tomorrow!
Location: Upper Level Ballroom 6A
Schedule: mathai2025.github.io/schedule/

4
14
53
17.5K
Chi Jin retweeted
Seth Karten @sethkarten
How do we close the gap between specialist RL and generalist LLM agents? We're benchmarking it in Pokémon.

Join us at the PokeAgent Challenge competition workshop @ NeurIPS 2025.

📍 Dec 7, 8AM in San Diego
🎮 Track 1: Competitive Pokémon (game-theoretic reasoning)
🗺️ Track 2: Speedrunning (long-horizon planning)

Speakers from Google DeepMind, NYU, CMU, UT Austin, Princeton.
7
19
59
10.1K
Chi Jin @chijinML
Super proud of my fantastic postdocs and graduate students taking their next steps at frontier labs 🎉
• Yong Lin (@Yong18850571) → Thinking Machines
• Zihan Ding (@Hanry65960814) → ByteDance
• Ahmed Khaled → Google
It’s always bittersweet to say goodbye😢 but I couldn’t be more excited to see what you achieve next!
3
7
244
39.7K
Chi Jin retweeted
Danqi Chen @danqi_chen
I am going to present two papers at #COLM2025 tomorrow from 4:30-6:30pm, as none of our lead authors can attend due to visa issues. Haven't done poster presentations for years 🤣🤣 .... so I will do my best!
#76: LongProc
#80: Goedel-Prover v1
Chi Jin @chijinML

Our Goedel-Prover V1 will be presented at COLM 2025 in Montreal this Wednesday afternoon! I won’t be there in person, but my amazing and renowned colleague @danqi_chen will be around to help with the poster — feel free to stop by!

4
27
347
49K
Chi Jin @chijinML
Our Goedel-Prover V1 will be presented at COLM 2025 in Montreal this Wednesday afternoon! I won’t be there in person, but my amazing and renowned colleague @danqi_chen will be around to help with the poster — feel free to stop by!
3
9
73
55.5K
Chi Jin @chijinML
Excited to share that I’ve been promoted to Associate Professor with tenure at Princeton!🎉 6 years may not be long, but AI research has evolved significantly during this period. Grateful to all my students, collaborators, and colleagues for being with me on this remarkable journey!
149
60
2.7K
114K
Chi Jin @chijinML
🚀 With early access to Tinker, we matched the full-parameter SFT performance of Goedel-Prover V2 (32B) using LoRA, with both runs trained on the same 20% of the data.
📊 MiniF2F Pass@32 ≈ 81 (20% SFT). Next: full-scale training + RL.

This is something that previously took a lot more effort on our Princeton cluster. Really impressed by how Tinker takes care of the heavy multi-GPU infra while still letting us control the algorithms. It feels great to focus on research instead of plumbing.

This also echoes Thinking Machines Lab’s LoRA blog (thinkingmachines.ai/blog/lora/), which makes me excited about what scalable and efficient systems like Tinker can bring to the research community.

Thanks to the Thinking Machines team and the Goedel team at Princeton @__zrrr__ @Yong18850571 @johnschulman2 @Tianyi_Zh @danqi_chen 🙌

Also check out our paper on Goedel-Prover: arxiv.org/abs/2508.03613
2
22
194
35.6K