Chi Jin

172 posts

@chijinML

Researcher @OpenAI | Associate Prof @Princeton AI Reasoning · Reinforcement Learning · Game Theory · ML Foundations

Princeton, NJ · Joined November 2012

512 Following · 7.6K Followers

Pinned Tweet

Chi Jin@chijinML·28 Jan

Life update🙂: I’m on sabbatical from Princeton and have started at OpenAI, working on building AGI. Happy to be back in the Bay Area after 6 years! Bay Area friends—DMs open for food & hikes.

663

64.6K

Chi Jin@chijinML·4d

Really excited about the new GPT-5.5 release — a new level of intelligence, and a new level of usefulness for everyday life. Also, try Codex to accelerate your daily work!

OpenAI@OpenAI

Introducing GPT-5.5 A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.

2.7K

Chi Jin@chijinML·6d

@ysu_nlp @NeoCognition Big congrats Yu!

436

Yu Su@ysu_nlp·6d

Introducing @NeoCognition, the agent lab for specialized intelligence. Everyone needs experts, but human expertise does not scale. Backed by $40M seed funding, we build self-learning agents that specialize across domains to make expertise abundant.

134

869

169.6K

Chi Jin@chijinML·26 Mar

Thrilled to share our Goedel-series project on code! We can now automatically generate not only verifiable mathematical proofs, but also verifiable programs. Thanks to Ziran (@__zrrr__), Zenan, and all collaborators for the amazing work!

Ziran Yang@__zrrr__

Introducing Goedel-Code-Prover 🌲 LLMs write code, but can they prove it correct? Not just pass tests, but construct machine-checkable proofs that a program works for ALL possible inputs. We built a system that does exactly this. Given a program and its specification in Lean 4, Goedel-Code-Prover automatically synthesizes formal proofs of correctness. Our 8B model achieves 62% overall success rate across three benchmarks (Verina, Clever & AlgoVeri), a 2.6x improvement over the strongest baseline, surpassing both frontier LLMs (GPT/Gemini/Claude) and open-source theorem provers up to 84x larger (DeepSeek-Prover/Goedel-Prover/Kimina-Prover/BFS-Prover).

6.1K

Chi Jin retweeted

Ziran Yang@__zrrr__·26 Mar

552

69.2K

Chi Jin@chijinML·18 Mar

Check out the full manuscript about the largest AI Pokémon tournament we ran at NeurIPS 2025!

Seth Karten@sethkarten

x.com/i/article/2033…

Chi Jin@chijinML·5 Feb

Honored to have contributed to substantial token efficiency gains powering the GPT Codex release!

OpenAI@OpenAI

GPT-5.3-Codex is now available in Codex. You can just build things. openai.com/index/introduc…

369

24.9K

Chi Jin@chijinML·31 Jan

Hard to believe it’s been 6+ years since we coauthored “Is Q-learning Provably Efficient?”—my first RL foundations paper, which also turned out to be super impactful. Really excited to collaborate again!

Sebastien Bubeck@SebastienBubeck

Welcome @chijinML, looking forward to working together again! And this week we're also welcoming Prasad Raghavendra from UC Berkeley!

151

21.6K

Chi Jin@chijinML·28 Jan

@shaneguML lol, would love to catch up in person!

2.8K

Shane Gu@shaneguML·28 Jan

@chijinML Love to catch up sometime! Are you going to work on Pokemon (along with other projects)? ;)

6.3K

Chi Jin@chijinML·19 Dec

This is a truly remarkable math theorem prover! It is well ahead of competitors, nearly saturates PutnamBench, and achieves a much higher solve rate on the recently concluded Putnam 2025 in a surprisingly short amount of time.

Zheng Yuan@GanjinZero

Excited to announce Seed-Prover 1.5, which is trained via large-scale agentic RL with Lean. It proved 580/660 Putnam problems and proved 11/12 in Putnam 2025 within 9 hours. Check details at github.com/ByteDance-Seed…. We will work on autoformalization towards contributing to real math!

12.4K

Chi Jin retweeted

Seth Karten@sethkarten·15 Dec

How do we close the gap between specialist RL and generalist LLM agents? The PokeAgent Challenge @ NeurIPS 2025 was designed to map the Pareto frontier with a state-space complexity that makes RL envs cry.
We had 100 active teams with over 650 people (and growing) in our community, leading to 15k+ battles and dozens of hours of AI speedrunning gameplay.
The result: RL specialists dominate in partially observable 2p0s games, while hybrid RL-LLM solutions are best for high-stakes long-context tasks.
Winners
Track 1 (Battling): PA-Agent (RL) & Foul Play (MCTS). Judges' Choice: Porygon2 (League), August (LLM)
Track 2 (Speedrunning): Heatz (LLM+RL hybrid), Hamburg Pokerunners (Dreamer-based). Judges' Choice: Deepest (Least # actions LLM)
We have a few interesting things coming in the future, including...
-Our retrospective paper, new battling ladder for VGC, and Pokemon AI dataset dropping in late January 2026
-Our full LLM agent run of Pokemon Emerald playing live on youtube.com/@PokeAgentChallenge
-PokeAgent Challenge v2 next year... with a twist (potential sponsors---send me a DM!)
-Join our community for more discussion: discord.gg/ZU9BQGnuts
On a personal note, this was an incredible feat to put together, and I have to give a huge shoutout to @__jakegrigsby__ for countless late-night calls discussing infrastructure and potential sponsorship---this wouldn't have happened without him. Huge thanks to Aaron Traylor and Minmin Chen for speaking at our workshop, @steph_milani @kiranvodrahalli @drfeifei @yukez @chijinML for organizing support, and @GoogleDeepMind for sponsoring.

7.7K

Chi Jin@chijinML·8 Dec

It was really nice talking to you all at such an interesting event. Special thanks to Seth and Jake for the tremendous effort in organizing!

Seth Karten@sethkarten

The PokeAgent workshop is tomorrow, featuring our speakers from Brown University and Google DeepMind, as well as announcing our competition winners. Benchmarking in Pokemon is an exciting way to close out NeurIPS!

5.1K

Chi Jin retweeted

Pan Lu@lupantech·6 Dec

Join us at the 5th MATH-AI Workshop at @NeurIPSConf now, our biggest year yet with a record 249 submissions!! 🎉 ➡️ mathai2025.github.io We’re honored to host an incredible lineup of speakers: @swarat @WeizhuChen @j_dekoninck @Leonard41111588 @HannaHajishirzi @chijinML @tkalil2050 @aviral_kumar2 @Besteuler @tengyuma @patrickshafto @tegmark @jonathan_thomm @dawnsongtweets ☕️🥐 Light breakfast provided (for anyone who didn’t get a chance to grab one!). Huge thanks to our sponsors for making this possible: @DARPA @HarmonicMath @awscloud @axiommathai #NeurIPS2025 #MATHAI #Math #AI

Kaiyu Yang@KaiyuYang4

Looking forward to seeing you at the MATH-AI workshop at #NeurIPS tomorrow! Location: Upper Level Ballroom 6A Schedule: mathai2025.github.io/schedule/

17.5K

Chi Jin retweeted

Learning Theory Alliance@let4all·26 Nov

At #NeurIPS2025? Join us for a Social on Wednesday at 7 PM, featuring a fireside chat with Jon Kleinberg and mentoring tables. Ft. mentors @canondetortugas @SurbhiGoel_ @HamedSHassani @tatsu_hashimoto @andrew_ilyas @chijinML @thegautamkamath @MountainOfMoon + more!

26.6K

Chi Jin retweeted

Seth Karten@sethkarten·24 Nov

How do we close the gap between specialist RL and generalist LLM agents? We're benchmarking it in Pokémon. Join us at the PokeAgent Challenge competition workshop @ NeurIPS 2025. 📍 Dec 7, 8AM in San Diego 🎮 Track 1: Competitive Pokémon (game-theoretic reasoning) 🗺️ Track 2: Speedrunning (long-horizon planning) Speakers from Google DeepMind, NYU, CMU, UT Austin, Princeton.

10.1K

Chi Jin@chijinML·7 Nov

Super proud of my fantastic postdocs and graduate students taking their next steps at frontier labs 🎉
• Yong Lin (@Yong18850571) → Thinking Machines
• Zihan Ding (@Hanry65960814) → ByteDance
• Ahmed Khaled → Google
It’s always bittersweet to say goodbye😢 but I couldn’t be more excited to see what you achieve next!

244

39.7K

Chi Jin retweeted

Danqi Chen@danqi_chen·8 Oct

I am going to present two papers at #COLM2025 tomorrow from 4:30-6:30pm, as none of our leading authors can attend due to visa issues. Haven't done poster presentations for years 🤣🤣 .... so I will do my best! #76: LongProc #80: Goedel-Prover v1

Chi Jin@chijinML

Our Goedel-Prover V1 will be presented at COLM 2025 in Montreal this Wednesday afternoon! I won’t be there in person, but my amazing and renowned colleague @danqi_chen will be around to help with the poster — feel free to stop by!

347

49K

Chi Jin@chijinML·8 Oct

55.5K

Chi Jin@chijinML·2 Oct

Excited to share that I’ve been promoted to Associate Professor with tenure at Princeton!🎉 6 years may not be long, but AI research has evolved significantly during this period. Grateful to all my students, collaborators, and colleagues for being with me on this remarkable journey!

149

2.7K

114K

Chi Jin@chijinML·1 Oct

🚀With early access to Tinker, using LoRA on 20% of the data, we matched the full-parameter SFT performance of Goedel-Prover V2 (32B) on that same 20% subset.
📊MiniF2F Pass@32 ≈ 81 (20% SFT). Next: full-scale training + RL.
This is something that previously took a lot more effort on our Princeton cluster. Really impressed by how Tinker takes care of the heavy multi-GPU infra while still letting us control the algorithms. It feels great to focus on research instead of plumbing.
This also echoes Thinking Machines Lab’s LoRA blog (thinkingmachines.ai/blog/lora/), which makes me excited about what scalable and efficient systems like Tinker can bring to the research community.
Thanks to the Thinking Machines team and the Goedel team at Princeton @__zrrr__ @Yong18850571 @johnschulman2 @Tianyi_Zh @danqi_chen 🙌
Also check out our paper on Goedel-Prover: arxiv.org/abs/2508.03613

194

35.6K
