Mojan Javaheripi

42 posts

@mojan_jp

Phi models training team @MSFTResearch. CE PhD from @UCSanDiego

Joined November 2019
159 Following · 350 Followers
Dimitris Papailiopoulos @DimitrisPapail
We’ve been cooking... a new open weights 14B Phi-4 reasoning model, SFT’d on ~1.4M carefully curated reasoning demonstrations from o3-mini and RL’d for a tiny bit. This model is a little beast.
Mojan Javaheripi retweeted
Ahmed Awadallah @AhmedHAwadallah
Introducing Phi-4-reasoning, adding reasoning models to the Phi family of SLMs. The model is trained with both supervised finetuning (using a carefully curated dataset of reasoning demonstrations) and reinforcement learning.
📌 Competitive results on reasoning benchmarks with much larger top-tier models, up to DeepSeek-R1
📌 Strong performance on new tests released after data collection (AIME 2025, HMMT)
📌 Reasoning transfers/generalizes well to new domains even with only SFT (e.g. k-SAT, Maze Solving, Calendar Planning, etc.)
📌 Retains and often significantly improves general-purpose capabilities (e.g. instruction following)
In addition to the models, we are also very excited to share a very detailed technical report with insights on model training and evaluation. Still have a lot to improve, especially with context length, coding, and tools. Hope you find the models useful! A big thanks to the amazing team and to all our partners.
Mojan Javaheripi retweeted
Suriya Gunasekar @suriyagnskr
In all, we SFT’ed on ~1.4M reasoning traces on select prompts and further RL'd on a small ~6k sample. Despite the relatively long SFT on select domains, we see broad generalization across domains and no degradation in general purpose performance. On the contrary....🔁📚
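A minimal sketch of what the SFT stage can look like mechanically (not the team's actual pipeline): next-token cross-entropy over a teacher reasoning trace, with the prompt tokens masked out of the loss. The model, tokenizer, and data layout here are assumptions.

```python
# Hedged sketch of SFT on reasoning traces: standard next-token
# cross-entropy, with the loss restricted to the teacher's response tokens.
import torch
import torch.nn.functional as F

def sft_loss(model, tokenizer, prompt: str, response: str) -> torch.Tensor:
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # ignore prompt tokens in the loss
    logits = model(full_ids).logits
    # Shift so position t predicts token t+1.
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
        ignore_index=-100,
    )
```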
Mojan Javaheripi @mojan_jp
Phi-4-reasoning-plus is obtained via a short round of reinforcement learning on Phi-4-reasoning, using a randomly selected subset of SFT prompts. This short RL stage amplifies the reasoning style and unlocks nice improvements across benchmarks, with longer response lengths.
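For the RL stage, a common setup for reasoning models is an outcome-based reward on problems with verifiable answers. A hedged sketch under that assumption (the actual reward used for Phi-4-reasoning-plus may differ; extract_final_answer is a hypothetical helper):

```python
import re

def extract_final_answer(text: str) -> str | None:
    # Hypothetical helper: take the last \boxed{...} in the completion,
    # a common convention in math reasoning traces.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def reward(completion: str, gold_answer: str) -> float:
    # Outcome-based reward: 1 for a verifiably correct final answer, else 0.
    predicted = extract_final_answer(completion)
    return 1.0 if predicted == gold_answer else 0.0
```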
Mojan Javaheripi @mojan_jp
Phi-4-reasoning is supervised fine-tuned on Phi-4. The secret sauce? 1) high-quality prompts at the edge of model capability to go beyond vanilla distillation + strong reasoning responses from a teacher. 2) optimal data mixture of different sources for best overall performance.
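One way to operationalize "prompts at the edge of model capability", as a sketch rather than the team's actual filter: sample the base model several times per prompt and keep only prompts with an intermediate pass rate, since always-solved and never-solved prompts add little signal.

```python
# Hypothetical filter for prompts "at the edge of model capability".
# solve(prompt) -> bool is an assumed sampler-plus-answer-checker.
def filter_edge_prompts(prompts, solve, n_samples=8, lo=0.2, hi=0.8):
    kept = []
    for p in prompts:
        pass_rate = sum(solve(p) for _ in range(n_samples)) / n_samples
        if lo <= pass_rate <= hi:  # neither trivial nor hopeless
            kept.append(p)
    return kept
```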
Mojan Javaheripi @mojan_jp
More interestingly, our models generalize well to out-of-distribution tasks like algorithmic problem solving, planning, and spatial reasoning. These skills were not targeted in our training data but Phi-4-reasoning performs quite well.
Mojan Javaheripi @mojan_jp
With 14B parameters, both models are competitive with and often better than (larger) frontier models: outperforming DeepSeek-R1-Distill-Llama-70B across the board (small gap in coding) and comparable with the original DeepSeek-R1 on AIME 2025, which came out after our data cutoff date.
Mojan Javaheripi @mojan_jp
Excited to release our first set of reasoning models, Phi-4-reasoning and Phi-4-reasoning-plus, available today on HuggingFace and Azure AI Foundry. Some interesting insights below and more deep dives in the following days!
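A minimal sketch of trying the released checkpoint with transformers; the model ID and generation settings are assumptions, so check the model card on HuggingFace before use.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub ID; Phi-4-reasoning-plus should load the same way.
model_id = "microsoft/Phi-4-reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "How many primes are there below 100?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# Reasoning traces run long, so leave generous headroom for new tokens.
outputs = model.generate(inputs, max_new_tokens=4096)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```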
Mojan Javaheripi @mojan_jp
Excited to see our SLM work, Phi, mentioned in MIT Technology Review as one of the top 10 breakthrough technologies! 😊 technologyreview.com/2025/01/03/110…
Aaron Defazio @aaron_defazio
Good insight. Training recovers from loss spikes because spikes occur in only a few latent dimensions.
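A hedged sketch of how one might probe this claim in a training run: log per-parameter-tensor gradient norms (a coarse proxy for per-dimension behavior) and flag the tensors whose norms jump during a spike.

```python
import torch

def grad_norms(model: torch.nn.Module) -> dict[str, float]:
    # Per-tensor gradient norms after backward(); call once per step.
    return {name: p.grad.norm().item()
            for name, p in model.named_parameters() if p.grad is not None}

def spiking_tensors(current: dict[str, float], running_mean: dict[str, float],
                    factor: float = 10.0) -> list[str]:
    # Flag tensors whose gradient norm jumps far above their running mean;
    # if the claim holds, only a few names should appear at a loss spike.
    return [n for n, v in current.items() if v > factor * running_mean.get(n, v)]
```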
Mojan Javaheripi retweeted
Shital Shah @sytelus
Are you ready for an early Christmas present from our team at Microsoft Research? Introducing the most powerful smol model ever built in the world! Welcome to Phi-4! 👇
Aaron Defazio @aaron_defazio
Fantastic, looks like this was trained with a linear decay schedule with warmup. Is this correct @SebastienBubeck? "The model was pretrained for approximately 10T tokens using linear warm-up and decay schedules"
Quoting Sebastien Bubeck @SebastienBubeck:
Surprise #NeurIPS2024 drop for y'all: phi-4 available open weights and with amazing results!!! Tl;dr: phi-4 is in Llama 3.3-70B category (win some lose some) with 5x fewer parameters, and notably outperforms on pure reasoning like GPQA (56%) and MATH (80%).
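The quoted schedule is simple to write down; a sketch of linear warm-up followed by linear decay using PyTorch's LambdaLR (the warmup fraction and step counts are assumptions, not values from the report):

```python
import torch

def linear_warmup_decay(optimizer, warmup_steps: int, total_steps: int):
    # Multiplier on the base LR: ramp 0 -> 1 over warmup_steps,
    # then decay 1 -> 0 linearly over the remaining steps.
    def lr_lambda(step: int) -> float:
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```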
Mojan Javaheripi retweeted
Sebastien Bubeck @SebastienBubeck
Surprise #NeurIPS2024 drop for y'all: phi-4 available open weights and with amazing results!!! Tl;dr: phi-4 is in Llama 3.3-70B category (win some lose some) with 5x fewer parameters, and notably outperforms on pure reasoning like GPQA (56%) and MATH (80%).
Mojan Javaheripi retweeted
Peter Lee @peteratmsr
🚀 Phi-4 is here! A small language model that performs as well as (and often better than) large models on certain types of complex reasoning tasks such as math. Useful for us in @MSFTResearch, and available now for all researchers on Azure AI Foundry! aka.ms/phi4blog