Julien Pourcel @ NeurIPS

353 posts

@PourcelJulien

PhD student at @inria (@flowersinria team) working on LLM4code | Google PhD Fellow 2025 | @ENS_ParisSaclay (MVA)

Joined February 2014
1K Following · 253 Followers
Pinned Tweet
Julien Pourcel @ NeurIPS (@PourcelJulien)
Introducing SOAR 🚀, a self-improving framework for program synthesis that alternates between search and learning (accepted to #ICML!). It brings LLMs from just a few percent on ARC-AGI-1 up to 52%. We’re releasing the fine-tuned LLMs, a dataset of 5M generated programs, and the code. 🧵
8 replies · 38 reposts · 192 likes · 31.8K views
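The alternation the thread describes (search for programs with an LLM, then fine-tune that same LLM on its own traces) can be sketched roughly as follows. This is a hypothetical skeleton, not SOAR's actual code: `sample_program`, `refine_program`, and `finetune` are stand-ins for the LLM calls.

```python
def search_and_learn(tasks, sample_program, refine_program, finetune,
                     rounds=2, samples_per_task=2):
    """SOAR-style loop sketch: a search phase that samples and refines
    candidate programs, and a learning phase that fine-tunes the same
    model on the traces the search produced (hypothetical interface)."""
    solved = {}
    for _ in range(rounds):
        traces = []
        # search phase: draw candidate programs, refine the ones that fail
        for task in tasks:
            for _ in range(samples_per_task):
                prog = sample_program(task)
                if prog(task["input"]) != task["output"]:
                    prog = refine_program(task, prog)  # one refinement pass
                if prog(task["input"]) == task["output"]:
                    solved[task["id"]] = prog
                traces.append((task, prog))
        # learning phase: fine-tune sampler and refiner on their own traces
        sample_program, refine_program = finetune(traces)
    return solved
```

The key point of the design is that the same model plays both roles, so each fine-tuning round improves both the initial samples and the refinements of the next round.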
Julien Pourcel @ NeurIPS reposted
Demis Hassabis (@demishassabis)
Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use - happy building!
317 replies · 884 reposts · 8K likes · 926.5K views
Julien Pourcel @ NeurIPS reposted
Qwen (@Alibaba_Qwen)
🚀 Introducing the Qwen 3.5 Small Model Series: Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B
✨ More intelligence, less compute. These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL:
• 0.8B / 2B → tiny, fast, great for edge devices
• 4B → a surprisingly strong multimodal base for lightweight agents
• 9B → compact, but already closing the gap with much larger models
And yes — we’re also releasing the Base models. We hope this better supports research, experimentation, and real-world industrial innovation.
Hugging Face: huggingface.co/collections/Qw…
ModelScope: modelscope.cn/collections/Qw…
922 replies · 2.9K reposts · 21.4K likes · 8.9M views
Silvia Sapora (@silviasapora)
Accepted at #ICLR2026! 1/🧵 Inverse Reinforcement Learning typically produces opaque black-box rewards that are impossible to debug. But what if we could learn rewards as executable, human-readable Python code instead? 🐍 Introducing GRACE: Generating Rewards As CodE. 👇
2 replies · 6 reposts · 67 likes · 6.8K views
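A toy illustration of the core idea — a reward expressed as executable, human-readable Python rather than an opaque network. This function is purely illustrative; it is not a reward produced by GRACE itself, and the state/action names are made up.

```python
def learned_reward(state, action):
    """A reward as plain Python: every term can be read, unit-tested,
    and debugged, unlike a black-box reward model.
    (Illustrative only; not output from the GRACE paper.)"""
    reward = 0.0
    if state["goal_reached"]:
        reward += 10.0                             # terminal bonus
    reward -= 0.1 * state["distance_to_goal"]      # distance shaping term
    if action == "noop":
        reward -= 0.5                              # discourage idling
    return reward
```

The appeal is debuggability: if an agent exploits the reward, you can read the offending term directly instead of probing a neural network.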
Axel Darmouni (@ADarmouni)
Self-distillation, on top of being good in setups outside of RLVR, is in fact also very good in RLVR setups! In « Reinforcement Learning via Self-Distillation », this is what @jonashuebotter et al. from ETH Zurich demonstrate.
The setup is similar to the other self-distillation paper:
1. Sample from the student
2. Sample from the teacher, given additionally this time *feedback* from the environment → the feedback can be the environment return, but what works best is one of the correct solutions from the rollout batch
3. Compute the KL divergence from student to teacher to align the student
They test it on Science Q&A, tool use, and LiveCodeBench and compare with an optimized GRPO (so actually rigorous, giving GRPO the fairest of chances). And it works very, very well, usually quite a bit better than GRPO, which is honestly an incredible result.
Just like in the other setup, they make training more stable by updating the teacher through either EMA or weight interpolation, and simplify the KL to compute only over the top-K tokens from the student rather than the complete vocabulary.
A few more tidbits:
→ Models trained with self-distillation output far fewer tokens than those trained with GRPO
→ The strength of self-distillation scales with model strength
→ Logit-level SDPO (top-K compute for each token) beats token-level (top-1 compute for each token) and sequence-level (top-1 compute for each token, averaged over the sentence)
→ The teacher also becomes better at the problem, but gets caught up by the student
→ Less forgetting of other tasks not trained for
→ Can be combined with GRPO for a slightly higher performance increase
The most amazing thing: *it can also be used for test-time learning in verifiable setups.* How? Make a generation → get environment feedback → perform self-distillation against the teacher given generation + environment feedback. And it helps small models solve the hard tasks of LCB to a tremendous degree! Both better than best-of-k or multi-turn.
A very, very cool work, and if self-distillation gets implemented quickly into mainstream libs, I’m getting the feeling that SD studies have just begun =) Seems like a breakthrough, congrats to the authors!
9 replies · 13 reposts · 116 likes · 14.7K views
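The two stabilization tricks mentioned in the thread — computing the KL only over the student's top-K tokens, and moving the teacher by EMA — can be sketched in a few lines of plain Python. The dict-of-log-probabilities interface is a hypothetical simplification, not the paper's actual implementation:

```python
import math

def topk_kl(student_logp, teacher_logp, k=2):
    """KL(student || teacher), restricted to the student's top-k tokens
    instead of the full vocabulary (the cheap approximation the thread
    describes). Inputs map token -> log-probability."""
    top = sorted(student_logp, key=student_logp.get, reverse=True)[:k]
    return sum(math.exp(student_logp[t]) * (student_logp[t] - teacher_logp[t])
               for t in top)

def ema_update(teacher, student, decay=0.99):
    """Stabilize training by moving the teacher slowly toward the student."""
    return [decay * t + (1 - decay) * s for t, s in zip(teacher, student)]
```

Restricting the KL to the top-K tokens keeps the per-token cost independent of vocabulary size, which matters when the vocabulary runs to six figures.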
ARC Prize (@arcprize)
ARC Prize 2025 Winners Interviews: Paper Award 2nd Place. @PourcelJulien, @cedcolas, @pyoudeyer discuss SOAR, a self-improving evolutionary program synthesis framework that fine-tunes an LLM on its own search traces, without human-engineered DSLs or solution datasets.
7 replies · 16 reposts · 90 likes · 8.7K views
Julien Pourcel @ NeurIPS reposted
Cédric (@cedcolas)
Our self-improving genetic algorithm received the 2nd place paper award for the @arcprize! Congrats in particular to @PourcelJulien the experiments wizard! We proposed a simple, general algorithm ⬇️
Quoting ARC Prize (@arcprize):

ARC Prize 2025 Winners Interviews: Paper Award 2nd Place. @PourcelJulien, @cedcolas, @pyoudeyer discuss SOAR, a self-improving evolutionary program synthesis framework that fine-tunes an LLM on its own search traces, without human-engineered DSLs or solution datasets.

1 reply · 3 reposts · 20 likes · 980 views
Julien Pourcel @ NeurIPS reposted
François Chollet (@fchollet)
Congrats to the ARC Prize 2025 winners! The Grand Prize remains unclaimed, but nevertheless 2025 saw remarkable progress on LLM-driven refinement loops, both with "local" models and with commercial frontier models. We also saw the rise of zero-pretraining DL approaches like HRM and TRM. Lots of new learnings!
Quoting ARC Prize (@arcprize):

Announcing the ARC Prize 2025 Top Score & Paper Award winners. The Grand Prize remains unclaimed. Our analysis of AGI progress marks 2025 as the year of the refinement loop.

17 replies · 53 reposts · 528 likes · 74.1K views
Julien Pourcel @ NeurIPS reposted
ARC Prize (@arcprize)
ARC Prize 2025 Paper Award Winners
1st / "Less is More: Recursive Reasoning with Tiny Networks" (TRM) / A. Jolicoeur-Martineau / $50k
2nd / "Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI" (SOAR) / J. Pourcel et al. / $20k
3rd / "ARC-AGI Without Pretraining" / I. Liao et al. / $5k
4 replies · 30 reposts · 282 likes · 133.3K views
Julien Pourcel @ NeurIPS reposted
Greg Kamradt (@GregKamradt)
ARC Prize 2025 concluded today: the year of refinements. Our goal is to bring meaningful open-source research into the community, and today we awarded $137K to 14 teams.
Benchmarks matter, but their true value comes from the progress they catalyze. ARC Prize 2025 was designed to inspire the community to publish research aimed at building more generalized systems. The grand prize remains unclaimed, but the leaderboard reflects strong advances, and all submissions and solutions are now open sourced. Here is a recap of the winners; for more, check out the great recap by @mikeknoop (link below).
** Paper Prizes **
1/ Alexia Jolicoeur-Martineau (@jm_alexia) - TRM
Tiny Recursive Model (TRM) is a tiny 2-layer network that does recursive reasoning: it keeps a latent state z and a current answer y, repeatedly updates z using the puzzle and y, then refines y from z over many "deep supervision" steps, so it can gradually fix its own mistakes without needing a huge model. It simplifies the Hierarchical Reasoning Model (HRM).
2/ Julien Pourcel (@PourcelJulien) - Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI
SOAR is a self-improving evolutionary program synthesis system: it uses an LLM to sample and refine Python programs for ARC tasks (Sample & Refine phase), then turns all those attempts, both successes and failures, into new problem–solution pairs via hindsight relabeling, and fine-tunes the same LLM so it gets better at both sampling and refinement next time.
3/ Isaac Liao (@LiaoIsaac91893) - ARC-AGI Without Pretraining
CompressARC shows that lossless information compression alone can produce intelligent behavior on ARC-AGI: for each puzzle, it builds a randomly initialized neural network and uses gradient descent at inference time to find a compact representation (via a VAE-style loss: cross-entropy + KL) that best "compresses" all the given example grids.
** Top Scores **
1/ NVARC (@JFPuget, Ivan Sorokin)
The NVIDIA team built a huge synthetic dataset of ARC-AGI puzzles, then turned those summaries into Python programs that produce consistent input/output grid pairs. They used test-time fine-tuning (TTFT) plus a fast depth-first search decoding process to adapt each model to the hidden test puzzles.
2/ the ARChitects (@dvhrtm, Daniel Franzen, @JDisselh)
The ARChitects fine-tune an LLM on ARC-style grids and then use it at test time in two roles: 1) as a generator that, via depth-first search (DFS) over token probabilities, systematically explores the space of high-probability candidate solutions (not just random samples), and 2) as a scorer that evaluates how likely each complete solution is.
3/ MindsAI @ Tufa Labs (@MindsAI_Jack, @DriesSmit1, @MohamedOsmanML, @bayesilicon)
They trained a trimmed CodeT5 encoder-decoder model for years on the massive ARC-AGI Mega dataset (100M+ examples) using span corruption, reversals, and BPE dropout so it learned structure, not surface patterns. At inference, they ran large-scale test-time training (TTT) on thousands of permuted and augmented versions of the test set, then applied AIRV.
4/ Lonnie
Lonnie reused the 2024 ARChitects pipeline but treated the random seed as a hyperparameter, systematically exploring seeds to exploit variance on the small 240-task evaluation set, which pushed an otherwise baseline-style system up to 5th place on the private leaderboard.
5/ Guillermo Barbadillo @ Veridas (@guille_bar)
Guillermo believes that ARC will ultimately be solved by a search-and-learn approach that combines program synthesis with test-time training (TTT) and hindsight relabeling, so the system can search over code, learn from failed attempts, and steadily refine its solutions.
We're going bigger in 2026! Let's go!!
Quoting ARC Prize (@arcprize):

Announcing the ARC Prize 2025 Top Score & Paper Award winners. The Grand Prize remains unclaimed. Our analysis of AGI progress marks 2025 as the year of the refinement loop.

4 replies · 16 reposts · 66 likes · 8K views
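The TRM loop described in the recap (keep a latent state z and a current answer y, repeatedly update z from the puzzle and y, then refine y from z over several deep-supervision steps) can be sketched schematically. `update_z` and `refine_y` are stand-ins for the tiny 2-layer network, and the scalar toy below only illustrates the control flow:

```python
def trm_solve(puzzle, y, z, update_z, refine_y, outer_steps=3, inner_steps=4):
    """Schematic TRM-style recursive reasoning: alternate between
    refreshing the latent state and refining the answer, so each
    outer (deep-supervision) step can fix earlier mistakes."""
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            z = update_z(puzzle, y, z)   # "think": update latent state
        y = refine_y(y, z)               # "act": improve the current answer
    return y
```

With a toy puzzle where z tracks the remaining error and y absorbs half of it per step, the answer converges toward the target over the outer iterations, mirroring how TRM gradually corrects itself without a large model.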
Julien Pourcel @ NeurIPS reposted
ARC Prize (@arcprize)
Announcing the ARC Prize 2025 Top Score & Paper Award winners. The Grand Prize remains unclaimed. Our analysis of AGI progress marks 2025 as the year of the refinement loop.
25 replies · 48 reposts · 314 likes · 221.5K views
Julien Pourcel @ NeurIPS reposted
Cédric (@cedcolas)
In San Diego for #NeurIPS. Happy to chat about open-endedness, self goal-generation, intrinsic motivations, self-improvement, and human-machine collective intelligence. Open to hearing about research scientist opportunities too. Don't hesitate to reach out!
3 replies · 3 reposts · 29 likes · 2.4K views
Julien Pourcel @ NeurIPS (@PourcelJulien)
Big news: I’m officially a 2025 Google PhD Fellow! 🎓✨ I’m also heading to #NeurIPS2025 in San Diego! Happy to chat about LLMs, code gen, evolutionary algos, open-endedness, self-improvement, enhancing LLM diversity, ARC-AGI, and other subjects. Open to hearing about summer internships. ☀️
2 replies · 4 reposts · 17 likes · 945 views