Ramakanth Kavuluru

3.6K posts

@BioNLProc

Faculty at UKY. Views my own, not of my employer(s). Work: #BioNLP, #NLProc, medical informatics, machine learning, LLMs, AI & fairness, health+socialdata

Lexington, KY · Joined October 2016
297 Following · 750 Followers
Ramakanth Kavuluru retweeted
Ryan Bahlous-Boldi @RyanBoldi
Your RL post-training may be sabotaging your LLM’s test-time scaling! Conventional RL pretends that you can collapse all reward signals *upfront* into a single *scalar reward*. We introduce Vector Policy Optimization (VPO), which natively maximizes *vector-valued* rewards, boosting test-time search performance, even on the original scalar.
34
117
842
198.7K
Ramakanth Kavuluru retweeted
Arkil Patel @arkil_patel
Excited to share our new paper! “Forecasting Downstream Performance of LLMs With Proxy Metrics” w/ my amazing advisors @sivareddyg, @mariusmosbach, @DBahdanau
Cross-entropy loss is a poor predictor of how models perform on downstream tasks (esp. reasoning). We propose something better: proxy metrics computed over expert reasoning traces.
🧵 Thread below 👇
3
31
135
94.7K
Ramakanth Kavuluru retweeted
Qiao Jin, MD @DrQiaoJin
🔥 Excited to share our latest work - Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution
Recent reports in @Nature and @TheLancet on fabricated citations have drawn substantial attention, but even real citations may fail to support the statements attached to them. This makes evidence attribution (verifying if the citations really support the claims) essential for auditing both human- and AI-generated texts. As AI generates billions of medical references every day, we need a scalable model for this task.
In this work:
- We generated MedFact-Synth, a high-quality dataset of 1.5M synthetic claim-article pairs.
- Using MedFact-Synth, we trained and open-sourced Med-V1, a family of 3B-parameter LLMs.
- Med-V1 surpasses its backbone models by 27-71%, matching the performance of GPT-5.
- Med-V1 can be used to identify high-stakes misattributions and detect LLM hallucinations.
🔗 Paper: arxiv.org/abs/2603.05308
🔗 Model: huggingface.co/ncbi/Med-V1-L3B
🙌 Kudos to all our great collaborators: Yin Fang, Lauren He, Yifan Yang, Guangzhi Xiong, Zhizheng Wang, Nicholas Wan, Joey Chan, Donald Comeau, Robert Leaman, Charalampos Floudas, Aidong Zhang, Michael F. Chiang, Yifan Peng & Zhiyong Lu
#MedicalAI #HealthAI #LLMs #Hallucination #EvidenceBasedMedicine #ChatGPT
2
13
36
3.3K
Ramakanth Kavuluru @BioNLProc
After being busy with work in Mallorca, I took a couple of days to explore Barcelona. Besides all the hot spots, the Catholic monastery in Montserrat, up in the mountains, is the best place there if you love nature and spirituality. The main church is beautiful and the singing is so ethereal.
0
0
2
128
Ramakanth Kavuluru retweeted
Nishant Balepur @NishantBalepur
🚨 New Paper! 🚨 One of my first Ph.D. papers found that LLMs can answer multiple-choice questions without seeing the question 🤔 At #ACL2026, I'm presenting a follow-up showing that current reasoning LLMs can still do this! And quite similarly to a clever test-taker 🧑‍🎓🧵
49
110
1.8K
1.2M
Ramakanth Kavuluru @BioNLProc
Bye bye #LREC2026! It was a refreshing event focused on (multilingual) resources and old-school comp ling and NLP. This was my first time and I really enjoyed it. Found some interesting papers at the intersection of knowledge graphs and LLMs.
0
0
2
131
Ramakanth Kavuluru @BioNLProc
I support this. If it comes back and bites me, I will deserve it. This will reduce the AI slop that we will be asked to cite/compare in peer reviews. It will also induce much-needed comeuppance for PIs who just “bless” their lab’s papers and don’t do enough due diligence.
Thomas G. Dietterich @tdietterich

Attention @arxiv authors: Our Code of Conduct states that by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the contents were generated. 1/

0
0
0
211
Ramakanth Kavuluru @BioNLProc
This Palma de Mallorca cathedral stuns from many angles, depending on the time of day. #LREC2026
0
0
1
180
Ramakanth Kavuluru retweeted
rian @riantouchent
🚀 What happens if you temporarily train a bidirectional encoder like a decoder? Surprisingly: better biomedical encoders 🧬
We release on @huggingface :
• ModernBERT-bio (EN)
• ModernCamemBERT-bio (FR)
• Base + Large
• 8192-token context
Thread 👇 1/
2
10
32
2.1K
Ramakanth Kavuluru retweeted
Will Held @WilliamBarrHeld
To train better open models, we need predictable scaling. Delphi is Marin’s first step: we pretrained many small models with one recipe, then extrapolated 300× to predict a 25B-param / 600B-token run with just 0.2% error. Getting there took some work 🧵
14
77
458
136K
Ramakanth Kavuluru retweeted
Frank Hutter @FrankRHutter
The data science revolution is here now. TabPFN-3 is live, taking tabular foundation models to enterprise scale 🤩 1M training rows on a single H100. No training. No tuning. Load and predict. 🧵 1/5 #tabpfn #tabularfoundationmodels #priorlabs
9
46
280
26K
Ramakanth Kavuluru retweeted
Danish Pruthi @danish037
I believe one of the most important problems is to detect the nature and extent of AI use. Take paper reviewing, for example, where many conferences allow reviewers to use LLMs to polish their reviews but not to generate their contents. However, can such polishing-only policies even be enforced?
Our recent #ICML paper answers this question in the negative, and shows how even the best AI-text detectors misclassify a non-trivial fraction of LLM-polished reviews as fully AI-generated.
This is work led by my amazing students: Rounak Saha (@ahaskanuor), Dayita Chaudhuri (@doyitach) and Naveeja Sajeevan in collaboration with @GurushaJuneja and Nihar Shah. (1/n)🧵
1
9
40
3.1K
Hector Yee @eigenhector
Omg I landed in PDX and was immediately assailed by multiple Portlandia memes
1
0
0
195
Ramakanth Kavuluru @BioNLProc
@kirk_roberts Oh yeah, the mental gymnastics in the FDA announcement for these pouch thingies is next level :-)
0
0
0
16