
Rishab Bala (@Sub_RBala)
PhD student @VT_CS
22 posts

🚨 New paper 🚨 Excited to share my first paper w/ my PhD students!! We find that advanced LLM capabilities conferred by instruction or alignment tuning (e.g., SFT, RLHF, DPO, GRPO) can be encoded into model diff vectors (à la task vectors) and transferred across model versions. 💡You don’t necessarily need to fine-tune from scratch again for every new base model version. Instead, fine-tune once and add the diff vector to updated versions! ♻️♻️♻️. This can also offer a stronger and more computationally efficient starting point when further training is feasible. 📰: tinyurl.com/finetuning-tra… More 👇
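The diff-vector idea above reduces to simple parameter arithmetic: subtract the old base weights from the fine-tuned weights, then add that difference to an updated base. A minimal sketch, with plain Python dicts standing in for weight tensors (the toy weights `base_v1`, `tuned_v1`, `base_v2` are hypothetical, not from the paper):

```python
# Diff-vector transfer (à la task vectors), sketched with dicts of
# scalars. In practice these would be state_dicts of same-shape tensors.

def diff_vector(tuned, base):
    """Encode what fine-tuning changed: tuned - base, per parameter."""
    return {k: tuned[k] - base[k] for k in base}

def apply_diff(new_base, diff, scale=1.0):
    """Transfer the fine-tuning to an updated base model version."""
    return {k: new_base[k] + scale * diff[k] for k in new_base}

# Hypothetical toy weights for three model versions.
base_v1  = {"w": 1.0, "b": 0.5}
tuned_v1 = {"w": 1.4, "b": 0.3}   # after instruction/alignment tuning
base_v2  = {"w": 1.1, "b": 0.6}   # updated base model

delta = diff_vector(tuned_v1, base_v1)   # roughly {"w": 0.4, "b": -0.2}
tuned_v2 = apply_diff(base_v2, delta)    # transferred, no re-tuning
```

Here `tuned_v2` is the "fine-tune once, add the diff vector" starting point; the tweet notes it can also serve as a warm start for further training rather than a final model.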


Introducing the Gemini 2.5 model family:
- Gemini 2.5 Pro (Stable, no changes from 06-05)
- Gemini 2.5 Flash (Stable, updated pricing from 05-20)
- Gemini 2.5 Flash-Lite (Preview, small reasoning model)
More info in 🧵


🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: multiverse4fm.github.io 🧵 1/n
