Rattana Pukdee

106 posts

@rpukdeee

PhD student at @mldcmu 🐕‍🦺

Pittsburgh · Joined April 2014
268 Following · 56 Followers
Rattana Pukdee retweeted
Yuda Song @ ICLR 2026 @yus167
RL on LLMs makes inefficient use of feedback: one scalar per rollout. But users regularly give much richer feedback: "make it formal," "step 3 is wrong." Can we train LLMs on this human-AI interaction? We introduce RL from Text Feedback, with 1) Self-Distillation; 2) Feedback Modeling (1/n) 🧵
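The thread only names the two ingredients; as a loose illustration of what "self-distillation" from text feedback could look like (the interface and prompts below are assumptions, not the paper's method):

```python
# Hypothetical sketch of self-distillation from text feedback.
# `generate` stands in for any LLM call; names and prompts are
# assumptions, not the paper's actual interface.

def self_distill_example(generate, prompt, response, feedback):
    """Turn (prompt, response, text feedback) into a supervised pair."""
    revision_prompt = (
        f"Task: {prompt}\n"
        f"Previous answer: {response}\n"
        f"Feedback: {feedback}\n"
        "Rewrite the answer so it addresses the feedback."
    )
    revised = generate(revision_prompt)
    # Distillation target: the model should produce the revised answer
    # directly from the original prompt, without seeing the feedback.
    return {"input": prompt, "target": revised}
```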
Rattana Pukdee retweeted
Dylan Sam @dylanjsam
🚨Excited to introduce a major development in building safer language models: Safety Pretraining! Instead of post-hoc alignment, we take a step back and embed safety directly into pretraining. 🧵(1/n)
Rattana Pukdee retweeted
Jennifer Hsia @jen_hsia
1/6 Retrieval is supposed to improve generation in RAG systems. But in practice, adding more documents can hurt performance, even when relevant ones are retrieved. We introduce RAGGED, a framework to measure and diagnose when retrieval helps and when it hurts.
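The tweet's claim is easy to probe with a sweep over retrieval depth; a minimal sketch, where `retrieve`, `generate`, and `score` are placeholder hooks rather than RAGGED's actual API:

```python
# Vary how many documents a RAG system retrieves and watch answer
# quality. All three hooks are placeholders, not RAGGED's interface.

def sweep_num_docs(questions, retrieve, generate, score, ks=(1, 2, 5, 10, 20)):
    results = {}
    for k in ks:
        total = 0.0
        for q in questions:
            docs = retrieve(q, top_k=k)          # top-k retrieved passages
            answer = generate(q, context=docs)   # reader conditioned on them
            total += score(q, answer)            # e.g., exact match or F1
        results[k] = total / len(questions)
    return results  # more docs does not always mean a higher score
```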
Rattana Pukdee @rpukdeee
In our #AISTATS2025 paper, we ask: when is it possible to recover a consistent joint distribution from conditionals? We propose path consistency and autoregressive path consistency: necessary and easily verifiable conditions. See you at Poster Session 3 on Monday, May 5.
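The paper's path-consistency conditions are its own, but the underlying question can be illustrated with the classical two-variable compatibility check:

```latex
% Two conditionals p(x \mid y) and p(y \mid x) are compatible with some
% joint p(x, y) iff there exist marginals p(x), p(y) with
\[
\frac{p(x \mid y)}{p(y \mid x)} \;=\; \frac{p(x)}{p(y)}
\qquad \text{for all } x, y \text{ with } p(y \mid x) > 0,
\]
% since then p(x, y) = p(x \mid y)\, p(y) = p(y \mid x)\, p(x).
```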
Rattana Pukdee retweeted
Dylan Sam @dylanjsam
Excited to share new work from my internship @GoogleAI! Curious how we should measure the similarity between examples in pretraining datasets? We study the role of similarity in pretraining 1.7B-parameter language models on the Pile. arxiv: arxiv.org/abs/2502.02494 1/🧵
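One generic notion of example similarity (an assumption here; the paper compares several notions) is cosine similarity between text embeddings:

```python
# Generic illustration of one way to measure example similarity in a
# pretraining corpus: cosine similarity between per-example embeddings.
# The choice of embedding model is left open.

import numpy as np

def pairwise_cosine(embeddings: np.ndarray) -> np.ndarray:
    """embeddings: (n_examples, dim) matrix of per-example vectors."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T  # (n, n) similarity matrix
```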
Rattana Pukdee retweeted
Dylan Sam @dylanjsam
To trust LLMs in deployment (e.g., agentic frameworks or for generating synthetic data), we should predict how well they will perform. Our paper shows that we can do this by simply asking black-box models multiple follow-up questions! w/ @m_finzi and @zicokolter 1/ 🧵
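A loose sketch of the black-box recipe the tweet describes, with illustrative prompts and a placeholder `generate` hook (not the paper's actual protocol):

```python
# Ask the model follow-up questions about its own answer and use
# agreement across probes as a confidence signal for performance.

def followup_confidence(generate, question, answer, n_followups=5):
    agree = 0
    for i in range(n_followups):
        probe = (
            f"Q: {question}\nA: {answer}\n"
            f"Is this answer correct? Reply yes or no. (check {i + 1})"
        )
        reply = generate(probe).strip().lower()
        agree += reply.startswith("yes")
    return agree / n_followups  # higher -> predicted better performance
```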
Rattana Pukdee retweeted
Dylan Sam @dylanjsam
Contrastive VLMs (CLIP) lack the structure of text embeddings, like satisfying analogies via arithmetic (king − man + woman = queen). We enhance CLIP’s *reasoning abilities* on such tasks by finetuning w/ text descriptions of image differences! w/ D. Willmott, J. Semedo, @zicokolter 1/🧵
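The analogy structure is easy to test directly on CLIP's text encoder; a sketch using open_clip (the checkpoint choice is an assumption, not the paper's setup):

```python
# Check the king - man + woman ≈ queen analogy in CLIP text space.

import torch
import open_clip

model, _, _ = open_clip.create_model_and_transforms("ViT-B-32", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

with torch.no_grad():
    emb = model.encode_text(tokenizer(["king", "man", "woman", "queen"]))
emb = emb / emb.norm(dim=-1, keepdim=True)  # unit-normalize embeddings

king, man, woman, queen = emb
analogy = king - man + woman
cos = torch.nn.functional.cosine_similarity(analogy, queen, dim=0)
print(f"cos(king - man + woman, queen) = {cos.item():.3f}")
```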
Rattana Pukdee retweeted
Daniel P Jeong @danielpjeong
🧵 Are "medical" LLMs/VLMs, *adapted* from general-domain models, always better at answering medical questions than the original models? In our oral presentation at #EMNLP2024 today (2:30pm in Tuttle), we'll show that surprisingly, the answer is "no". arxiv.org/abs/2411.04118
Rattana Pukdee retweeted
Daniel P Jeong @danielpjeong
(1/N) Can LLMs tell you what features to use for predicting an outcome? In our work, we demonstrate that LLMs such as GPT-4 are capable of identifying predictive features for supervised learning tasks, even without access to the training data. w/ @zacharylipton @RavikumarPrad 🧵
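A hedged sketch of the setup: prompt the LLM with the task and candidate features only, no training data (the prompt wording and `generate` hook are illustrative assumptions, not the paper's):

```python
# Ask an LLM to rank candidate features by predictive value for a task,
# without showing it any training data.

def llm_select_features(generate, task, candidate_features):
    prompt = (
        f"Task: predict {task}.\n"
        f"Candidate features: {', '.join(candidate_features)}.\n"
        "List the features most predictive of the outcome, best first."
    )
    return generate(prompt)

# e.g. llm_select_features(generate, "hospital readmission within 30 days",
#                          ["age", "zip code", "prior admissions", "eye color"])
```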
Rattana Pukdee retweeted
Runtian Zhai @RuntianZhai
One week away from @iclr_conf in Vienna 🤩 I will be presenting two spotlights: why big foundation models generalize so well in the self-supervised setting, and how to leverage massive unlabeled data using a base kernel that encodes inter-sample similarity. Details 👇 (1/3)
Rattana Pukdee retweeted
Runtian Zhai @RuntianZhai
Unlabeled data is crucial for modern ML. It provides info about data distribution P, but how to exploit such info? Given a kernel K, our #ICLR2024 spotlight gives a general & principled way: Spectrally Transformed Kernel Regression (STKR). Camera-ready 👇 arxiv.org/abs/2402.00645
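The name suggests the construction: regress with a kernel whose spectrum has been transformed. A sketch of that idea (the choice of transformation s and the estimation details are the paper's, not reproduced here):

```latex
% Given a base kernel K with eigen-decomposition
% K(x, x') = \sum_i \lambda_i \, \psi_i(x) \, \psi_i(x'),
% an STKR-style method regresses with a spectrally transformed kernel
\[
K_s(x, x') \;=\; \sum_i s(\lambda_i)\, \psi_i(x)\, \psi_i(x'),
\]
% where s transforms the eigenvalues and massive unlabeled data helps
% estimate the eigenfunctions \psi_i.
```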
Rattana Pukdee retweeted
Runtian Zhai @RuntianZhai
What would you do with an inter-sample similarity kernel, lots of unlabeled data, and little labeled data? Some might say kernel ridge regression (KRR), but KRR can't use unlabeled data, by the representer theorem. Our #ICLR2024 spotlight STKR gives an answer. A 🧵 (1/3) openreview.net/forum?id=OeQE9…
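The representer-theorem point is visible in a minimal NumPy implementation of KRR: the solution is a combination of kernel evaluations at labeled points only, so unlabeled data never enters:

```python
# Minimal kernel ridge regression. Note that the fit and the prediction
# touch only the labeled points X_lab; unlabeled data cannot contribute.

import numpy as np

def krr_fit_predict(X_lab, y_lab, X_test, kernel, lam=1e-2):
    K = kernel(X_lab, X_lab)                                  # (n_lab, n_lab)
    alpha = np.linalg.solve(K + lam * np.eye(len(y_lab)), y_lab)
    return kernel(X_test, X_lab) @ alpha                      # predictions

def rbf(A, B, gamma=1.0):
    """Gaussian (RBF) kernel between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)
```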
Rattana Pukdee retweeted
Brandon Trabucco @brandontrabucco
Stable Diffusion is an effective data augmentation.
Website: btrabuc.co/da-fusion
Watch here: youtu.be/IKDWOOWzwns
I'm excited to share my NeurIPS talk about DA-Fusion from the Synthetic Data workshop, where we build an augmentation that semantically modifies images and doesn't require prompt engineering or manual tuning. Our work improves few-shot learning across seven diverse vision tasks, including fine-grained concepts that are hard to engineer prompts for, and novel concepts that Stable Diffusion hasn't seen before. DA-Fusion is being used by ecologists to detect leafy spurge, an invasive plant, in drone images. #neurips #StableDiffusion #MachineLearning
Joint with @rsalakhu, Kyle Doherty, @maxgurinas
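DA-Fusion itself avoids prompt engineering; as a generic stand-in for diffusion-based augmentation, an img2img pass with the diffusers library (the model choice, filename, and prompt below are illustrative assumptions, not DA-Fusion's method):

```python
# Semantically vary a training image with Stable Diffusion img2img.

import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical input file; any RGB training image works.
image = Image.open("leafy_spurge.jpg").convert("RGB").resize((512, 512))

# Low strength keeps the class identity while varying appearance.
augmented = pipe(
    prompt="a photo of a plant", image=image, strength=0.3
).images[0]
augmented.save("leafy_spurge_aug.png")
```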
Rattana Pukdee retweeted
Dylan Sam @dylanjsam
Check out our #NeurIPS2023 paper "Learning with Explanation Constraints" with my co-author @rpukdeee, which explains how explanations of model behavior can help us from a learning-theoretic perspective! (arxiv.org/pdf/2303.14496…) 🧵 (1/n)
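One concrete instance of an explanation constraint (an assumption for illustration; the paper's framework is more general) is penalizing disagreement between the model's input gradient and a given explanation:

```python
# Illustrative explanation-constraint penalty: align the model's input
# gradient with a provided explanation of the same shape. This specific
# loss is an assumption, not the paper's exact formulation.

import torch

def explanation_penalty(model, x, explanation):
    """x: (batch, ...) inputs; explanation: desired gradient direction."""
    x = x.clone().requires_grad_(True)
    out = model(x).sum()
    grad, = torch.autograd.grad(out, x, create_graph=True)
    return 1 - torch.nn.functional.cosine_similarity(
        grad.flatten(1), explanation.flatten(1)
    ).mean()

# total_loss = task_loss + lam * explanation_penalty(model, x, expl)
```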
Rattana Pukdee retweeted
Alex Tamkin @AlexTamkin
DALL-E meets WALL-E: An Art History 1) Mona Lisa, Leonardo da Vinci
Rattana Pukdee retweeted
Shubhendu Trivedi @_onionesque
"A Theory of PAC Learnability under Transformation Invariances" arxiv.org/abs/2202.07552 by Hao Shao, @montasser_omar and Avrim Blum; seems like one of the first papers studying optimal algorithms in terms of sample complexity under (group) transformation invariances.
Rattana Pukdee @rpukdeee
I am excited to share that I joined @mldcmu, @SCSatCMU as a PhD student, and I will be working on interpretability/robustness in ML with my advisors Nina Balcan and Pradeep Ravikumar. 🤓