

Mrigank Raman

@MrigankRaman
NLP enthusiast, MSML@CMU | Applying to PhD programs this cycle



🧵 Are "medical" LLMs/VLMs *adapted* from general-domain models always better at answering medical questions than the original models? In our oral presentation at #EMNLP2024 today (2:30pm in Tuttle), we'll show that, surprisingly, the answer is "no". arxiv.org/abs/2411.04118



1/ What does it mean for an LLM to “memorize” a doc? Exactly regurgitating a NYT article? Of course. Just training on NYT? Harder to say. We take big strides in this discourse w/ *Adversarial Compression*, w/ @A_v_i__S @zhilifeng @zacharylipton @zicokolter 🌐: locuslab.github.io/acr-memorizati… 🧵
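A minimal sketch of the compression-based view of memorization (my own illustration under assumed details, not the thread's code): call a document memorized if some prompt shorter than the document makes the model emit it verbatim, i.e. the token-length ratio len(doc)/len(prompt) exceeds 1. The generation calls follow the Hugging Face transformers API; how the adversarial prompt is found is left abstract.

```python
# Hedged sketch of a compression-based memorization check (assumed details,
# not the paper's implementation). A document counts as memorized when a
# prompt *shorter* than the document elicits it verbatim, i.e. ratio > 1.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def compression_ratio(model, tokenizer, doc: str, prompt: str) -> float:
    """Return len(doc)/len(prompt) in tokens if `prompt` elicits `doc`, else 0.0."""
    doc_ids = tokenizer(doc, return_tensors="pt").input_ids
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model.generate(prompt_ids, max_new_tokens=doc_ids.shape[1], do_sample=False)
    completion = tokenizer.decode(out[0, prompt_ids.shape[1]:], skip_special_tokens=True)
    if completion.strip().startswith(doc.strip()):
        return doc_ids.shape[1] / prompt_ids.shape[1]
    return 0.0  # the prompt does not reproduce the document

# How the short prompt itself is found (a discrete/adversarial search over
# tokens) is the hard part and is omitted here.
```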



Hello world! We are incredibly excited to come out of stealth today to help make better data accessible to everyone, automatically. Hear from our founders about our mission and vision for DatologyAI: datologyai.com/post/introduci…










Language models are weak learners. Paper page: huggingface.co/papers/2306.14…

A central notion in practical and theoretical machine learning is that of a weak learner: a classifier that achieves better-than-random performance (on any given distribution over data), even by a small margin. Such weak learners form the practical basis for canonical machine learning methods such as boosting.

In this work, we illustrate that prompt-based large language models can operate effectively as such weak learners. Specifically, we use a large language model (LLM) as the weak learner in a boosting algorithm applied to tabular data. We show that by providing text descriptions of tabular data samples (properly sampled according to the distribution of interest), LLMs can produce a summary of the samples that serves as a template for classification, acting as a weak learner on this task.

We incorporate these models into a boosting approach, which in some settings can leverage the knowledge within the LLM to outperform traditional tree-based boosting. The model outperforms both few-shot learning and occasionally even more involved fine-tuning procedures, particularly for tasks involving small numbers of data points. The results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
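A rough sketch of how an LLM could slot in as the weak learner inside boosting (my own illustration under assumptions, not the paper's exact procedure): serialize tabular rows as text, resample in-context examples according to the current boosting weights, treat the prompted LLM as a weak classifier, and combine rounds with AdaBoost-style weights. `llm_predict` below is a hypothetical stub for the actual prompted call.

```python
# AdaBoost with a prompted LLM as the weak learner (binary labels in {-1, +1}).
# `llm_predict` is a hypothetical stand-in for a prompt-based LLM call; the
# update rule is textbook AdaBoost, not the paper's exact procedure.
import numpy as np

def llm_predict(support_rows, support_labels, query_row) -> int:
    """Hypothetical stub: serialize the support rows and the query row as text,
    prompt an LLM for a label, and parse its answer into +1 or -1."""
    raise NotImplementedError

def boost_with_llm(rows, labels, n_rounds=5, support_size=16, seed=0):
    labels = np.asarray(labels)            # binary labels in {-1, +1}
    rng = np.random.default_rng(seed)
    n = len(rows)
    weights = np.full(n, 1.0 / n)          # boosting distribution over examples
    learners, alphas = [], []

    for _ in range(n_rounds):
        # Sample in-context examples from the current distribution so later
        # rounds focus on the points earlier weak learners got wrong.
        idx = rng.choice(n, size=min(support_size, n), replace=False, p=weights)
        support = ([rows[i] for i in idx], labels[idx])

        preds = np.array([llm_predict(support[0], support[1], r) for r in rows])
        err = float(np.sum(weights * (preds != labels)))
        if err >= 0.5:                      # no better than random: skip this round
            continue
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))

        weights = weights * np.exp(-alpha * labels * preds)
        weights = weights / weights.sum()

        learners.append(support)
        alphas.append(alpha)

    def predict(query_row):
        score = sum(a * llm_predict(s[0], s[1], query_row)
                    for s, a in zip(learners, alphas))
        return 1 if score >= 0 else -1

    return predict
```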




🚨⚠️ Stop using the [CLS] token ⚠️🚨 I will be talking about 1 simple trick to dramatically boost the robustness of your NLP classifiers. Today, 2pm at #EMNLP2023: "Model-tuning Via Prompts Makes NLP Models Adversarially Robust" 📝 arxiv.org/abs/2303.07320 Summary below 1/🧵
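For contrast, a minimal sketch of the two setups (my own illustration; the template "It was <mask>." and the label words "great"/"terrible" are assumptions, not the paper's exact configuration): the conventional approach trains a fresh classification head on the [CLS]/<s> representation, while prompt-based tuning reuses the pretrained masked-language-model head and reads the label off a masked position.

```python
# Sketch: [CLS]-head classification vs prompt-based classification with a
# masked-LM head, using Hugging Face transformers. Template and label words
# are illustrative assumptions.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          AutoModelForMaskedLM)

name = "roberta-base"
tok = AutoTokenizer.from_pretrained(name)
text = "The movie was surprisingly good."

with torch.no_grad():
    # (a) Conventional fine-tuning: a randomly initialized head on the [CLS]/<s> token.
    cls_model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
    cls_logits = cls_model(**tok(text, return_tensors="pt")).logits

    # (b) Prompt-based setup: append a template with a mask slot and score
    # label words at the masked position using the pretrained MLM head.
    mlm_model = AutoModelForMaskedLM.from_pretrained(name)
    inputs = tok(f"{text} It was {tok.mask_token}.", return_tensors="pt")
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    mask_logits = mlm_model(**inputs).logits[0, mask_pos]

label_words = {"positive": " great", "negative": " terrible"}
scores = {lab: mask_logits[tok.convert_tokens_to_ids(tok.tokenize(w))[0]].item()
          for lab, w in label_words.items()}
print(cls_logits, scores)
# Fine-tuning the full model through setup (b) keeps the pretrained output
# space, the change the thread credits for improved adversarial robustness.
```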




