Chinnadhurai Sankar

183 posts

@Chinnadhurai

Research @AIatMeta. Previously: @SliceXAI | @Mila_Quebec | @GoogleAI #ConversationalAI, #NLP

Menlo Park, CA · Joined February 2010
204 Following · 233 Followers
Chinnadhurai Sankar retweeted
AK @_akhaliq ·
Meta just dropped MobileLLM-Pro on Hugging Face: a 1B foundational language model in the MobileLLM series, designed to deliver high-quality, efficient on-device inference across a wide range of general language modeling tasks. There are two variants of the model: a pre-trained base model, along with quantized checkpoints for CPU and accelerator inference, and an instruction-tuned version, showing competitive performance against models in this size range on tasks like tool calling, question answering, rewriting, and summarization. MobileLLM-Pro base achieves impressive pre-training results, outperforming Gemma 3 1B and Llama 3.2 1B by 5.7% and 7.9% on average, respectively, on reasoning, knowledge, and long-context retrieval benchmarks. This performance is achieved by pre-training on less than 2T fully open-source tokens.
10 replies · 33 reposts · 173 likes · 35.7K views
Chinnadhurai Sankar retweeted
Ziteng Sun @SZiteng ·
Inference-time procedures (e.g. Best-of-N, CoT) have been instrumental to recent development of LLMs. The standard RLHF framework focuses only on improving the trained model. This creates a train/inference mismatch. Can we align our model to better suit a given inference-time procedure? We answer this affirmatively, check out the thread below.
5 replies · 50 reposts · 258 likes · 67.5K views
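For readers unfamiliar with the procedure the thread refers to, here is a minimal sketch of Best-of-N sampling; the `generate` and `score` callables are hypothetical stand-ins for an LLM sampler and a reward model, not anything from the paper:

```python
import random

def best_of_n(generate, score, n):
    """Sample n candidate responses and return the one the scorer ranks highest."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)

# Toy demo with stand-ins: "generation" draws a random number,
# and the "reward model" simply scores a candidate by its value.
random.seed(0)
best = best_of_n(generate=lambda: random.random(), score=lambda x: x, n=16)
```

Larger n spends more inference-time compute to climb higher on the reward, which is the train/inference mismatch the thread asks about: standard RLHF optimizes the model as if it will be sampled once, not filtered through this selection step.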
Chinnadhurai Sankar retweeted
Raj Dabre @prajdabre ·
Nice to see @ai4bharat showcase its work before @satyanadella! AI4Bharat is here to push the boundaries of open-source AI/ML/NLP for Indian languages. To the moon! 🚀🚀🚀 @srija_anand and @MiteshKhapra looking spiffy! :)
1 reply · 8 reposts · 108 likes · 8.1K views
Chinnadhurai Sankar retweeted
Junhong Shen @JunhongShen1 ·
Introducing Content-Adaptive Tokenizer (CAT) 🐈! An image tokenizer that adapts token count based on image complexity, offering flexible 8x, 16x, or 32x compression! Unlike fixed-length tokenizers, CAT optimizes both representation efficiency and quality. Importantly, we use just captions (no pixels!) to guide tokenization, enabling adaptive representation for text-to-image generation. Big shout out to collaborators @AIatMeta: @violet_zct @liliyu_lili @LukeZettlemoyer @imisra_ @michiyasunaga @kushal_tirumala Paper: arxiv.org/abs/2501.03120 More details in 🧵
4 replies · 46 reposts · 241 likes · 22.9K views
Chinnadhurai Sankar retweeted
Ahmad Beirami @abeirami ·
Want to know how 𝐫𝐞𝐰𝐚𝐫𝐝 𝐦𝐨𝐝𝐞𝐥 𝐠𝐞𝐧𝐞𝐫𝐚𝐥𝐢𝐳𝐚𝐛𝐢𝐥𝐢𝐭𝐲/𝐜𝐫𝐨𝐬𝐬-𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐚𝐥𝐢𝐠𝐧𝐦𝐞𝐧𝐭 relates to 𝐚 𝐰𝐨𝐫𝐥𝐝-𝐟𝐚𝐦𝐨𝐮𝐬 𝐅𝐫𝐞𝐧𝐜𝐡 𝐟𝐨𝐨𝐝 𝐜𝐫𝐢𝐭𝐢𝐜? Listen to the 2min podcast generated by NotebookLM on @zhaofeng_wu's #EMNLP2024 paper!
Zhaofeng Wu@zhaofeng_wu

I'll be presenting this paper next week at EMNLP. If you are interested in reward model generalizability and/or multilingual/cross-lingual alignment (or any random stuff), I'd be happy to chat!

0 replies · 5 reposts · 33 likes · 6K views
Arvind Neelakantan @arvind_io ·
Excited to join @AIatMeta! The past 4.5 years at @OpenAI, working on embeddings, GPT-3 & 4, the API, and ChatGPT, have been career highlights. Now, I'm thrilled to work on the next generations of Llama and contribute to its impact on the developer ecosystem and billions of users! 🚀 1/2
44 replies · 26 reposts · 1.1K likes · 143.1K views
Chinnadhurai Sankar retweeted
Ahmad Beirami @abeirami ·
Chernoff bounds characterize large deviations of a RV (from its mean). On the other hand, they are outperformed by the simple Markov's inequality when considering small deviations of a *non-negative* RV. Can we get the best of both worlds? 🧵
1 reply · 5 reposts · 64 likes · 25.6K views
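For context, these are the two standard bounds the tweet contrasts (textbook statements, not from the thread): Markov's inequality for a non-negative random variable, and the Chernoff bound obtained by exponentiating before applying Markov.

```latex
% Markov's inequality: for non-negative X and any a > 0
\Pr[X \ge a] \le \frac{\mathbb{E}[X]}{a}

% Chernoff bound: apply Markov to e^{tX}, then optimize over t > 0
\Pr[X \ge a] \le \inf_{t > 0} e^{-ta}\,\mathbb{E}\!\left[e^{tX}\right]
```

The Chernoff bound decays exponentially in the deviation and so dominates for large deviations, while for small deviations of a non-negative variable plain Markov can be tighter; this is the gap the thread proposes to close.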
Chinnadhurai Sankar retweeted
Jonathan Pilault @J_Pilault ·
Zyphra is proud to release Tree Attention, a fast inference method for extremely large sequence lengths
• 8x faster inference speed vs. Ring Attention
• 2x less peak memory
• low data communication volumes
Paper: arxiv.org/abs/2408.04093
Code: github.com/Zyphra/tree_at…
A 🧵
1 reply · 31 reposts · 150 likes · 30.2K views
Chinnadhurai Sankar retweeted
Jason Weston @jaseweston ·
🚨New paper!🚨 Self-Taught Evaluators
- Llama 3-70B trained w/ synthetic data *only*
- Iteratively finds better judgments in training
- Best LLM-as-a-Judge model on RewardBench (88.3, 88.7 w/ maj vote)
- Outperforms bigger models or human labels
arxiv.org/abs/2408.02666 🧵(1/4)
2 replies · 53 reposts · 370 likes · 57.3K views
Chinnadhurai Sankar retweeted
Jason Weston @jaseweston ·
🚨New paper!🚨 Meta-Rewarding LMs
- LM is actor, judge & meta-judge
- Learns to reward actions better by judging its own judgments (assigning *meta-rewards*)
- Improves acting & judging over time without human labels
... beats Self-Rewarding LMs
arxiv.org/abs/2407.19594 🧵(1/6)
2 replies · 75 reposts · 393 likes · 94.1K views
Sarath Chandar @apsarathchandar ·
I am happy to announce that I have been promoted to Associate Professor with tenure at @polymtl. This is an achievement for the entire @chandarlab! I want to thank my awesome students, without whom this would not have been possible! 1/n
37 replies · 6 reposts · 238 likes · 15.4K views
Chinnadhurai Sankar retweeted
Ahmad Beirami @abeirami ·
Check out our demo at the Google booth, starting now!
Google AI@GoogleAI

Visit the @iclr_conf Google booth today at 12:45 PM to learn how we trained a model to imitate the distribution of a small set of human attacks, & used it for data amplification while adapting a plug and play language model to reduce label noise.

0 replies · 1 repost · 21 likes · 2.9K views
Alon Albalak @AlbalakAlon ·
With all of the excitement of the past few months, it's time for a career update: 🎉I graduated with my PhD from @ucsbNLP @ucsantabarbara 🥳and joined @synth_labs 🎊to drive open-science collaborations and push the boundaries on data strategies for synthetic data 👇I'm at #ICLR!
William Wang@WilliamWangNLP

Huge congratulations to @AlbalakAlon for defending his PhD thesis “Understanding and Improving Models Through a Data-Centric Lens”. It’s refreshing to witness Alon’s growth, innovation, and leadership in the last few years. Alon is my 8th PhD graduate and I wish him all the best!

6 replies · 6 reposts · 64 likes · 10.4K views
Balaraman Ravindran @ravi_iitm ·
Starting on an exciting journey at the new Wadhwani School of Data Science and AI. The video summarizes the motivation for starting the school. Come join us in enriching the Data Science and AI ecosystem at @iitmadras. @rbc_dsai_iitm @cerai_iitm @IBSE_IITM @ai4bharat @DSAI_IITM
Wadhwani School of Data Science & AI (WSAI), IITM@WSAI_IITM

youtu.be/Cf2L_tPvYdg Delighted to share this video introducing our school, the newly founded Wadhwani School of Data Science & AI at @iitmadras - home to the new dept @DSAI_IITM and several exciting research centres @rbc_dsai_iitm @cerai_iitm @IBSE_IITM @ai4bharat

10 replies · 19 reposts · 164 likes · 17.9K views
Pei Zhou @peizNLP ·
PhDone!!!! 👨‍🎓 08/2019-04/2024 What a journey 🥳🚞 I especially feel lucky to share this once-in-a-lifetime moment with people I love ❤️, and to see my passion-driven research efforts acknowledged by researchers I deeply admire 🌞!! Special thanks to my awesome committee members @xiangrenNLP @jay_mlr @toby_mintz @jieyuzhao11. Looking back to when I made the decision to pursue a PhD 5 years ago, I just feel like hugging the me who strived through the years and giving him the biggest high five🫸🫷 Excitedly and confidently striding into the future journey 🏔️!
34 replies · 13 reposts · 200 likes · 16K views
Chinnadhurai Sankar retweeted
Sujith Ravi @ravisujith ·
📣 Exciting news! @SliceXAI announces 𝗘𝗟𝗠 (family of Efficient Language Models), a new, decomposable #LLM architecture that delivers models with the best in class performance in terms of 𝑞𝑢𝑎𝑙𝑖𝑡𝑦, 𝑡ℎ𝑟𝑜𝑢𝑔ℎ𝑝𝑢𝑡 & 𝑚𝑒𝑚𝑜𝑟𝑦. 🔗 Blog 👉 medium.com/sujith-ravi/in…
2 replies · 1 repost · 3 likes · 694 views
Chinnadhurai Sankar retweeted
Ahmad Beirami @abeirami ·
Check out Zhaofeng's work from his internship with us! TL;DR A reward model trained on language S preference data could be used to align a language T LLM. This sometimes works even better than using a reward model trained on language T preference data.
Zhaofeng Wu@zhaofeng_wu

Want to train an aligned LM in a new language 🌏 but don’t have preference data for training the reward model (RM)? 💡 Just use a RM for another language: it often works well, sometimes even BETTER than if you had a RM in your target language! 🤯 arxiv.org/abs/2404.12318

0 replies · 5 reposts · 47 likes · 8.7K views
Chinnadhurai Sankar retweeted
Khyathi Chandu @khyathi_chandu ·
Hi #NLProc!! If you are at #EMNLP2023 and are excited about Novel Ideas in Learning-to-Learn through Interaction, join us for an exciting series of invited talks and a lineup of presentations. In-person attendees can join us in **Leo** at the venue. cs.mcgill.ca/~pparth2/nilli…
0 replies · 2 reposts · 23 likes · 2.5K views