Juhan Bae

109 posts

@juhan_bae

Machine Learning PhD student @UofT

Joined September 2019
493 Following · 453 Followers
Juhan Bae retweeted
Anthropic @AnthropicAI
New Anthropic research: Alignment faking in large language models. In a series of experiments with Redwood Research, we found that Claude often pretends to have different views during training, while actually maintaining its original preferences.
Juhan Bae retweeted
Elisa Nguyen @_elinguyen
On my way to Vancouver for @NeurIPSConf and the ATTRIB workshop on the 14th! Feel free to drop by ✨ If you're also interested in data attribution or human-centered XAI, let me know and I'd be happy to meet :)
Juhan Bae retweeted
Jonathan Lorraine @jonLorraine9
🚨 New #NeurIPS2024 paper “Training Data Attribution via Approximate Unrolling” 🚨 Introducing SOURCE: a method to understand how individual training examples influence neural network behavior, allowing us to make AI models more transparent and trustworthy! 📄 Full paper: openreview.net/pdf?id=3NaqGg9…
Juhan Bae retweeted
Laura Ruis @LauraRuis
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️
Juhan Bae retweeted
Bruno Mlodozeniec @brunorganised
Diffusion models are so ubiquitous, but it's difficult to find an introduction that is concise, simple and comprehensive. My supervisor Rich Turner (with me & some other students) has written an introduction to diffusion models that fills this gap: arxiv.org/abs/2402.04384
Juhan Bae retweeted
Jiaqi Ma @Jiaqi_Ma_
Implementing and benchmarking data attribution baselines can seem non-trivial. Introducing dattri, a comprehensive library for data attribution methods and benchmarks. Accepted to NeurIPS 2024 D&B as a Spotlight ✨ Paper: arxiv.org/pdf/2410.04555 GitHub: github.com/TRAIS-Lab/datt… 1/
Juhan Bae retweeted
Michael Zhang @michaelrzhang
📝 How do you choose which language model to use? Quantitative benchmarks can be uninformative and fall prey to Goodhart's Law, and even Chatbot Arena performance can be optimized for. In our new preprint, we propose generating qualitative report cards... 🧵
Juhan Bae retweeted
Frank Schneider @frankstefansch1
The inaugural AlgoPerf results are in, highlighting a new generation of neural net training algorithms! Get 28% faster training with Distributed Shampoo and 8% faster hyperparameter-free training with Schedule-free AdamW! The future of training algorithms research is bright...
Quoted: MLCommons @MLCommons

@MLCommons #AlgoPerf results are in! 🏁 $50K prize competition yielded 28% faster neural net training with non-diagonal preconditioning beating Nesterov Adam. New SOTA for hyperparameter-free algorithms too! Full details in our blog. mlcommons.org/2024/08/mlc-al… #AIOptimization #AI

Juhan Bae retweeted
Nathan Ng @learn_ng
The minimum description length principle is an attractive Bayesian alternative for quantifying uncertainty, but how can we get it to work efficiently and accurately at scale? Excited to share our ICML work on measuring stochastic complexity with Boltzmann influence functions!
Juhan Bae retweeted
Wu Lin @LinYorker
#ICML2024 Can We Remove the Square-Root in Adaptive Methods? arxiv.org/abs/2402.03496 Root-free (RF) methods are better on CNNs and competitive on Transformers compared to root-based methods (AdamW). Removing the root also makes matrix methods faster: root-free Shampoo in BFloat16. /1
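For readers unfamiliar with the distinction the tweet draws, here is a minimal sketch of a diagonal adaptive update with and without the square root in the denominator. This is an illustration of the general idea only, not the paper's method; the function name, constants, and scalar setup are invented for the example.

```python
def adaptive_step(grad, second_moment, lr=1e-3, eps=1e-8, use_root=True):
    """One illustrative parameter update for a diagonal adaptive method.

    With use_root=True the denominator is sqrt(v) + eps (Adam-style);
    with use_root=False the square root is removed (root-free style).
    """
    denom = (second_moment ** 0.5 + eps) if use_root else (second_moment + eps)
    return -lr * grad / denom

g, v = 0.1, 0.04  # toy gradient and second-moment estimate
root_step = adaptive_step(g, v, use_root=True)       # divides by sqrt(0.04) = 0.2
root_free_step = adaptive_step(g, v, use_root=False)  # divides by 0.04 directly
```

Note how the two variants rescale the same gradient very differently: without the root, small second-moment values produce much larger steps, which is one reason the preconditioning behaves differently across architectures.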
Juhan Bae retweeted
Jonathan Lorraine @jonLorraine9
New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights
We enhance hyperparameter optimization by adding the ability to condition cheap-to-evaluate surrogates for the loss on checkpointed model weights with a graph metanetwork. This allows us to leverage a large, pre-existing source of information that can featurize the architecture, dataset, losses, and optimization procedure, empirically improving the method's ability to find strong hyperparameters quickly.
🔍 Project page: research.nvidia.com/labs/toronto-a…
👨‍💻 Code for reproduction: github.com/NVlabs/forecas…
📄 Full paper: arxiv.org/abs/2406.18630
Juhan Bae retweeted
Owain Evans @OwainEvans_UK
New paper, surprising result: We finetune an LLM on just (x,y) pairs from an unknown function f. Remarkably, the LLM can: a) Define f in code b) Invert f c) Compose f —without in-context examples or chain-of-thought. So reasoning occurs non-transparently in weights/activations!
Juhan Bae retweeted
Sang Choe @sangkeun_choe
🚨 Preprint Alert 🚨 An LLM is nothing without its training data 💛 But…how (much) does each data point contribute to LLM outputs? In our paper, we develop algorithms, theory, and software for LLM-scale data valuation/attribution. 🧵(1/N)
Juhan Bae retweeted
Daniel Johnson @_ddjohnson
I'll be at ICLR in Vienna next week, demo-ing Penzai (Tues @ Google DeepMind booth) and presenting recent work on measuring model uncertainty (Sat @ R2-FM workshop)! Want to chat about what models know, how they work, or tools to help us understand them? Please reach out!
Juhan Bae retweeted
Samuel Marks @saprmarks
Constellation -- an AI safety research center in Berkeley, CA -- is launching two new programs!
* Visiting Fellows: 3-6 months visiting (w/ travel, housing, & office space covered)
* Constellation Residency: 1-year salaried position
Juhan Bae retweeted
Anthropic @AnthropicAI
New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here: anthropic.com/research/probe…
Juhan Bae retweeted
Daniel Johnson @_ddjohnson
Excited to share Penzai, a JAX research toolkit from @GoogleDeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere. Check it out on GitHub: github.com/google-deepmin…