Saurav Jha

103 posts

@saurav_j_

@IVADO_Qc postdoc @Mila_Quebec; interned @TencentGlobal, @SonyAI_global, @Inria_Nancy; ex-MLE @FactSet; PhD @UNSWComputing

Montréal, Québec · Joined August 2018

931 Following · 198 Followers

Pinned Tweet

Saurav Jha @saurav_j_ · 23 Jan

🎉 Happy to share that our paper “Mining your own secrets: Diffusion Classifier scores for Continual Personalization of Text-to-Image Diffusion Models” has been accepted to #ICLR2025! 👉 The work results from my #Sony internship in the stunning #Tokyo 🗼city w/ @shiqi_yang_147

Saurav Jha retweeted

Ayaan Naveed Malik @ayaannmalik · 13 May

don’t reconstruct your world models!

Nilaksh @nilaksh404

Diffusion world models can help test and improve robot policies before running them on real robots. But can the choice of latent space make the WM more faithful? We show that semantic spaces beat reconstruction spaces on task-relevant metrics. hskalin.github.io/semantic-wm

Saurav Jha retweeted

Basile Terver @BasileTerv987 · 12 May

Latest great work from @artemZholus, they study in depth what has been a long-time assumption for the community working on JEPA-style world models. Semantic encoders have better captured the physics of the world. Hence, they are better-suited for decision making in robotics. We had compared some of these semantic encoders in our JEPA-WMs paper arxiv.org/abs/2512.24497, showing that, especially for real-world manipulation, better dense features were key (which is why DINO was > V-JEPA 1/2). Since V-JEPA-2.1 matches DINO on such dense tasks, I am not surprised it is better suited for robotic manipulation! I really like the systematic set of evaluations, bridging reconstruction and success rate metrics! Congrats on this contribution 👏

Artem Zholus @artemZholus

Extremely excited to share our recent work on diffusion world models. We ask a simple question - what space supports diffusion world modeling the most and how do we evaluate that? Turns out representation is the answer, with JEPA space yielding the strongest diffusion world models!

Saurav Jha retweeted

Nilaksh @nilaksh404 · 12 May

Takeaway: For robotic diffusion world models, don’t choose the latent space only by visual realism. Start from strong semantic encoders, make them diffusion-friendly, and evaluate with policy-facing metrics. Project page: hskalin.github.io/semantic-wm/ Paper: arxiv.org/abs/2605.06388

Saurav Jha retweeted

Nilaksh @nilaksh404 · 12 May

Main result: semantic latents are usually better for control-facing metrics. V-JEPA 2.1, Web-DINO, and SigLIP 2 improve action recovery, task-success prediction, CEM planning, policy rollouts, and robustness to distractors.

Saurav Jha @saurav_j_ · 12 May

Check out our new work on the role of reconstruction vs. semantic latent spaces for robotics world modeling!

Nilaksh @nilaksh404

Sayama-shi, Saitama 🇯🇵

Saurav Jha retweeted

Computer Vision and Pattern Recognition Papers @CSVisionPapers · 8 May

Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models. Nilaksh, Saurav Jha, Artem Zholus, Sarath Chandar. arxiv.org/abs/2605.06388 [cs.CV cs.LG cs.RO]

Saurav Jha @saurav_j_ · 18 Apr

Join us on May 13th at 11 am ET!

CoLLAs 2026 @CoLLAs_Conf

📣 Announcing the CoLLAs Seminars: a year-long exploration of one of the central challenges in AI: building systems that can learn continually, adapt in real time, and improve over their lifetime. Join us on May 13th at 11 am ET as we kick off the series with Pulkit Agrawal speaking on “Rethinking Post training”. ℹ️ Learn more: lnkd.in/erEdDxgP ✉️ Join our mailing list: lnkd.in/eEGwH-3E 🔗 Zoom link for the talk: lnkd.in/ekkHE5nX

Saurav Jha retweeted

Artificial Intelligence Papers @SciFi · 7 Apr

REAM: Merging Improves Pruning of Experts in LLMs. Saurav Jha, Maryam Hashemzadeh, Ali Saheb Pasand, Ali Parviz, Min-Joong Lee, Boris Knyazev. arxiv.org/abs/2604.04356 [cs.AI cs.CL cs.LG cs.PF] 💬 Code: github.com/SamsungSAILMon…

Saurav Jha retweeted

Chandar Lab @ChandarLab · 31 Mar

We're thrilled to see the Workshop on Weight-Space Symmetries coming to #ICML2026! Huge shoutout to our postdoc @KateLobacheva for co-organizing it. We're excited for the ideas and discussions this workshop will bring to the community!

Weight Space Symmetries @ ICML 2026 @weightsymmetry

📢Excited to announce the Workshop on Weight-Space Symmetries @icmlconf! We welcome 4-page submissions analysing symmetries, their effects on training and model structure, and practical methods to utilize them. Submission Deadline: April 24 (23:59 AoE) #ICML2026

Saurav Jha retweeted

World Modeling Workshop @worldmodel_conf · 5 Feb

What an awesome first day! Thank you all for joining and listening to our amazing speakers: @SchmidhuberAI, @sherryyangML, @cosmo_shirley, @Yoshua_Bengio, @ylecun, @mido_assran. World Models have beautiful days ahead. This is just the beginning 🫡

Saurav Jha retweeted

Chandar Lab @ChandarLab · 3 Feb

NeoBERT: A Next-Generation BERT (TMLR Journal-to-Conference Track) We modernized BERT (RoPE, SwiGLU, 4k context). At just 250M params, it outperforms RoBERTa and ModernBERT on the MTEB benchmark. 📄 arxiv.org/abs/2502.19587

Saurav Jha @saurav_j_ · 30 Dec

sayonara sydney and summer, see you soon @NeurIPSConf 👀

Sydney, New South Wales 🇦🇺

Saurav Jha @saurav_j_ · 15 Dec

Montréal winter: hold my 🍺

Keerthana Gopalakrishnan @keerthanpg

Introversion that occurs during SF winter is on another level: it’s like your body just wants to hibernate and stay warm indoors.

Richmond, British Columbia 🇨🇦

Saurav Jha retweeted

Andrew Gordon Wilson @andrewgwils · 6 Dec

Continual learning as a discipline seems to have catastrophic forgetting that it has been focused on catastrophic forgetting for a decade with virtually no progress. Time for some radically new ideas in that area.

Saurav Jha retweeted

World Modeling Workshop @worldmodel_conf · 3 Nov

🚨 Interested in generative world models? We’re thrilled to host Stephen Spenser (@GoogleDeepMind) at the World Modeling Workshop 2026, where he’ll talk about the Genie series of models! 🌐 world-model-mila.github.io

Saurav Jha retweeted

Sarath Chandar @apsarathchandar · 24 Oct

I am recruiting several graduate students (both MSc and PhD level) for Fall 2026 @ChandarLab! The application deadline is December 01. Please apply through the @Mila_Quebec supervision request process here: mila.quebec/en/prospective…. More details about the recruitment process here: chandar-lab.github.io/join/

Saurav Jha retweeted

Mila - Institut québécois d'IA @Mila_Quebec · 15 Oct

Mila's annual supervision request process is now open to receive MSc and PhD applications for Fall 2026 admission! For more information, visit mila.quebec/en/prospective…

Saurav Jha retweeted

Kamran Chitsaz @KChitsaz · 9 Oct

Long reasoning without the quadratic tax: The Markovian Thinker makes LLMs reason in chunks with a bounded state → linear compute, constant memory and it keeps scaling beyond the training limit. 1/6

Milad Aghajohari @MAghajohari

Introducing linear scaling of reasoning: The Markovian Thinker. Reformulate RL so thinking scales O(n) compute, not O(n^2), with O(1) memory, architecture-agnostic. Train R1-1.5B into a Markovian thinker with 96K thought budget, ~2X accuracy 🧵

Saurav Jha retweeted

Sarath Chandar @apsarathchandar · 3 Oct

At @ChandarLab, we are happy to announce the third edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad_app/. Deadline: Nov 01! cc: @Mila_Quebec, @polymtl, @CIFAR_News

Saurav Jha retweeted

Jehanzeb Mirza @jmie_mirza · 30 Jul

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models: arxiv.org/pdf/2410.06154 we did something similar, long before it was 'cool' ;-)

Jackson Atkins @JacksonAtkinsX

LLMs can now self-optimize. A new method allows an AI to rewrite its own prompts to achieve up to 35x greater efficiency, outperforming both Reinforcement Learning and Fine-Tuning for complex reasoning.

UC Berkeley, Stanford, and Databricks introduce a new method called GEPA (Genetic-Pareto), an autonomous system for prompt optimization. The researchers tested this across diverse tasks like multi-hop Q&A and instruction following. They demonstrated gains using proprietary models like GPT-4.1 Mini and open-source models like Qwen3 8B.

Here's a look at how it works: GEPA treats prompt optimization as a genetic evolution problem. It starts with a diverse "pool" of prompt candidates. It uses Pareto optimization to select the "fittest" prompts. It finds the ones that offer the best tradeoff between high performance on a task and low computational cost (measured in "rollouts"). It "evolves" new, better prompts using two key mechanisms:

Crossover: Intelligently combining the best parts of two successful "parent" prompts to create a new "child" prompt.

Reflective Mutation: This is the self-optimization engine. The system tasks an LLM to analyze its own detailed execution trace (its successes and failures) and then intelligently rewrite its own instructions to fix the flaws.

How GEPA fits into your AI strategy: This method provides a powerful new tool without replacing existing ones. Here’s the distinction: GEPA works on its own. You can apply it directly to any base LLM to achieve significant performance gains just by optimizing the prompt. Fine-Tuning teaches the model what (domain knowledge), while GEPA optimizes how the model uses that knowledge (its reasoning process). This makes them powerful complements. You can use GEPA to supercharge a base model, OR you can apply it to an already fine-tuned model to get the absolute best performance from your expert AI. It's a new, flexible layer in the optimization toolkit that allows AI to optimize itself.
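
The Pareto-selection step described in the quoted tweet can be illustrated with a small sketch. It keeps only the prompt candidates that are not dominated on the two objectives the tweet names: task performance (higher is better) and rollout cost (lower is better). The `Candidate` class, the example prompts, and all numbers are hypothetical, and this is not GEPA's actual implementation; it is only a minimal non-dominated filter under those assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical prompt candidate scored on two objectives."""
    prompt: str
    score: float   # task performance, higher is better (made-up values)
    rollouts: int  # compute cost in rollouts, lower is better (made-up values)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and strictly better on one."""
    no_worse = a.score >= b.score and a.rollouts <= b.rollouts
    strictly_better = a.score > b.score or a.rollouts < b.rollouts
    return no_worse and strictly_better


def pareto_front(pool: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate in the pool dominates."""
    return [c for c in pool if not any(dominates(o, c) for o in pool)]


pool = [
    Candidate("Answer step by step.", score=0.70, rollouts=400),
    Candidate("Answer concisely.", score=0.55, rollouts=150),
    Candidate("Think, then answer.", score=0.65, rollouts=500),  # dominated by the first
    Candidate("Cite each hop, then answer.", score=0.80, rollouts=900),
]
front = pareto_front(pool)
for c in front:
    print(f"{c.score:.2f} @ {c.rollouts} rollouts: {c.prompt}")
```

In an evolutionary loop of the kind the tweet outlines, the surviving candidates would then serve as parents for the crossover and mutation steps; those steps require an LLM call and are left out here.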
