Sho Takase retweeted

Finally!! Accepted to ICLR 2026! 🎉
Cited by Nemotron 3 Nano and OLMo 3, yet it was a long journey to get this through academic peer review. So glad to cross the finish line.
📄 Rewriting Pre-Training Data Boosts LLM Performance in Math and Code: arxiv.org/abs/2505.02881

Kazuki Fujii @okoge_kaz
We are thrilled to see our dataset improvement method (SwallowCode) mentioned in the Pre-Training Code Dataset section of the NVIDIA Nemotron 3 Nano Technical Report. Thank you @NVIDIAAI for citing the Swallow Project's work! The Swallow Project is a research initiative developing open bilingual LLMs excelling in both Japanese and English. swallow-llm.github.io/index.en.html