Dylan Sam

207 posts


@dylanjsam

safety research @openai | formerly phd @mldcmu, @BrownUniversity

San Francisco · Joined October 2017
508 Following · 1.2K Followers
Pinned Tweet
Dylan Sam@dylanjsam·
I defended my PhD thesis! Also, a very (~4 month) late life update, but I've joined @OpenAI to work on safety research and pretraining safer language models! 📈 Thank you to my advisor @zicokolter and my committee: Matt Fredrikson, @andrew_ilyas, and @furongh! 🙏
25 replies · 8 reposts · 218 likes · 20.5K views
Dylan Sam reposted
Zico Kolter@zicokolter·
As AI agents access more untrusted information with greater autonomy, prompt injections may become the greatest security challenge of our era. @GraySwanAI, in collaboration with many frontier labs, just released our paper on the largest public prompt injection challenge to date. 🧵
Gray Swan AI@GraySwanAI

Your AI agent can be hijacked by a prompt injection and you'd never know! The attack executes. The response looks normal. And the user moves on. We ran the largest public competition testing this exact threat across tool use, coding, and computer use agents. 464 participants, 272K attacks, 13 frontier models. Every model proved vulnerable.

1 reply · 3 reposts · 24 likes · 3.3K views
Dylan Sam@dylanjsam·
Finally, I'm presenting work on monitoring models for harmful behaviors, hallucinations, and adversarial manipulation at Poster #1304 in Exhibit Hall C,D,E on 12/5 at 4:30pm! x.com/dylanjsam/stat…
Dylan Sam@dylanjsam

To trust LLMs in deployment (e.g., agentic frameworks or for generating synthetic data), we should predict how well they will perform. Our paper shows that we can do this by simply asking black-box models multiple follow-up questions! w/ @m_finzi and @zicokolter 1/ 🧵

0 replies · 0 reposts · 2 likes · 388 views
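The tweet above doesn't spell out the method, but the core idea (predicting a black-box model's reliability by asking it follow-up questions) can be illustrated with a generic consistency probe. This is a hypothetical sketch, not the paper's actual algorithm: `ask`, `consistency_score`, and the probe phrasings are all illustrative stand-ins for whatever API and follow-ups one actually uses.

```python
# Hypothetical sketch: probe a black-box model with follow-up questions and
# use answer agreement as a proxy confidence score. `ask` stands in for any
# black-box LLM call; all names here are illustrative.
from collections import Counter

def consistency_score(ask, question, followups):
    """Return the majority answer and the fraction of probes that agree with it."""
    answers = [ask(question)]
    for probe in followups:
        # Each follow-up re-asks in the context of the original question.
        answers.append(ask(f"{question}\n{probe}"))
    majority, count = Counter(answers).most_common(1)[0]
    return majority, count / len(answers)

# Toy black-box model: deterministic on this question, so all probes agree.
def toy_model(prompt):
    return "Paris" if "capital of France" in prompt else "unsure"

answer, score = consistency_score(
    toy_model,
    "What is the capital of France?",
    ["Are you sure? Answer again.", "Answer once more, concisely."],
)
# All three answers agree here, so score == 1.0
```

A low score on real models would flag questions where the model's answer is unstable and therefore less trustworthy; the actual paper may use a different aggregation.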
Dylan Sam@dylanjsam·
Next, I'm presenting on safety pretraining, where we find that incorporating safety behaviors during pretraining leads to more robust language models! Come by Poster #5210 at Exhibit Hall C,D,E at 4:30pm today (12/4)! x.com/dylanjsam/stat…
Dylan Sam@dylanjsam

🚨Excited to introduce a major development in building safer language models: Safety Pretraining! Instead of post-hoc alignment, we take a step back and embed safety directly into pretraining. 🧵(1/n)

1 reply · 0 reposts · 3 likes · 537 views
Dylan Sam@dylanjsam·
I'm at NeurIPS this week! Excited to meet old/new friends and chat with people about training safer language models. I'm presenting a few works on safety pretraining, measuring diversity in data curation, and monitoring model behaviors --- more info below 👇
4 replies · 4 reposts · 37 likes · 4.2K views
Dylan Sam reposted
Emily Byun@yewonbyun_·
I’m at NeurIPS this week (12/2-12/8) to present our work on when/how synthetic data (e.g., LLM simulations) can help scientists make inferences with less real data, improving the efficiency of costly experiments. Come by Poster #904 on Thursday 4:30PM (Exhibit Hall C,D,E)!🙂
Emily Byun@yewonbyun_

💡Can we trust synthetic data for statistical inference? We show that synthetic data (e.g. LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moments of synthetic data and those of real data

2 replies · 4 reposts · 31 likes · 12.5K views
Dylan Sam reposted
Pratyush Maini@pratyushmaini·
Excited about our NeurIPS'25 tutorial Data Privacy, Memorization & Copyright in GenAI with Cooper (co-founder, GenLaw) & Joe (represents OpenAI and Stability in all US copyright litigations). We bring together ML researchers with those who understand the legal implications. Pls RT
3 replies · 22 reposts · 83 likes · 12.6K views
Dylan Sam reposted
Bryan Wilder@brwilder·
I gave talks at MIT and Harvard this week about "Science with synthetic data". How can generative models help us learn about the world (e.g., social systems) in a principled way? Lots of interesting conversations; more convinced than ever that there are nuanced issues to navigate
1 reply · 2 reposts · 9 likes · 600 views
Dylan Sam reposted
Sachin Goyal@goyalsachin007·
📢 Multi-token prediction has long struggled with defining the right “auxiliary target,” leading to tons of heuristics. We show a core limitation of these and propose a simple & sweet idea: future summary prediction. Introducing what I call 🚀TL;DR token pretraining🚀
Divyat Mahajan@divyat09

[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started! We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning. 📌Predict a learned embedding of the future sequence, not the tokens themselves

4 replies · 40 reposts · 246 likes · 28.6K views
Dylan Sam reposted
Yuda Song@yus167·
🤖 Robots rarely see the true world's state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)
2 replies · 40 reposts · 140 likes · 29.7K views
Dylan Sam reposted
Bryan Wilder@brwilder·
How can synthetic data from LLMs be used, e.g. for social science, in a principled way? Check out Emily's thread on our NeurIPS paper. The key is to generate each synthetic sample by prompting with a real example -- enables debiased estimates that wouldn't be possible otherwise!
Emily Byun@yewonbyun_

💡Can we trust synthetic data for statistical inference? We show that synthetic data (e.g. LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moments of synthetic data and those of real data

1 reply · 2 reposts · 10 likes · 1.3K views
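The tweet above describes pairing each synthetic sample with the real example used to prompt it, which enables a debiased estimate. As a rough illustration in the spirit of prediction-powered inference (the paper's actual estimator may differ), a cheap synthetic-only mean can be corrected by the real-vs-synthetic gap measured on the small paired set. All names and numbers below are made up for the sketch:

```python
# Illustrative sketch of a debiased "synthetic + real" mean estimate.
# Assumes each synthetic sample in the paired set was generated by prompting
# with the corresponding real example, as the tweet describes.

def debiased_mean(synthetic_all, real_paired, synthetic_paired):
    """Synthetic-only mean, corrected by the real-vs-synthetic gap on paired data."""
    assert len(real_paired) == len(synthetic_paired)
    synth_mean = sum(synthetic_all) / len(synthetic_all)
    # Bias correction: average discrepancy between real values and their
    # synthetic counterparts on the small paired set.
    correction = sum(r - s for r, s in zip(real_paired, synthetic_paired)) / len(real_paired)
    return synth_mean + correction

# Toy numbers: the synthetic data is biased upward by about 1.0.
synthetic_all = [3.0, 3.0, 3.0, 3.0]   # large, cheap synthetic set
real_paired = [2.0, 2.0]               # small, expensive real set
synthetic_paired = [3.0, 3.0]          # synthetic versions of those same examples
est = debiased_mean(synthetic_all, real_paired, synthetic_paired)
# est == 3.0 + (2.0 - 3.0) == 2.0, recovering the unbiased real mean
```

The point of the pairing is that the correction term cancels the systematic bias of the generator, so the large synthetic set contributes variance reduction without contaminating the estimate.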
Dylan Sam reposted
Emily Byun@yewonbyun_·
💡Can we trust synthetic data for statistical inference? We show that synthetic data (e.g. LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moments of synthetic data and those of real data
2 replies · 38 reposts · 144 likes · 30.5K views
Dylan Sam@dylanjsam·
Very interesting insights into understanding when and why synthetic data (although imperfect and biased) can boost the performance of statistical inference!! 📈📈
Emily Byun@yewonbyun_

💡Can we trust synthetic data for statistical inference? We show that synthetic data (e.g. LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moments of synthetic data and those of real data

0 replies · 4 reposts · 14 likes · 1.9K views
Dylan Sam reposted
Jeremy Cohen@deepcohen·
Even with full-batch gradients, DL optimizers defy classical optimization theory, as they operate at the *edge of stability.* With @alex_damian_, we introduce "central flows": a theoretical tool to analyze these dynamics that makes accurate quantitative predictions on real NNs.
18 replies · 214 reposts · 1.3K likes · 233.5K views