At GitHub Constellation, the Sarvam engineering team shared their journey scaling large language models, from early pre-training to post-training a 105B Mixture-of-Experts model.
In the session, they covered key challenges in large-scale pre-training, reinforcement learning, and multimodal capabilities spanning vision and speech, with a focus on real engineering trade-offs and lessons learned.