Juhan Bae

109 posts

@juhan_bae

Machine Learning PhD student @UofT

Joined September 2019
493 Following · 453 Followers
Juhan Bae retweeted
Anthropic @AnthropicAI
New Anthropic research: Alignment faking in large language models. In a series of experiments with Redwood Research, we found that Claude often pretends to have different views during training, while actually maintaining its original preferences.
Juhan Bae retweeted
Elisa Nguyen @_elinguyen
On my way to Vancouver for @NeurIPSConf and the ATTRIB workshop on the 14th! Feel free to drop by ✨ If you're also interested in data attribution or human-centered XAI, let me know and I'd be happy to meet :)
Juhan Bae retweeted
Jonathan Lorraine @jonLorraine9
🚨 New #NeurIPS2024 paper “Training Data Attribution via Approximate Unrolling” 🚨 Introducing SOURCE: a method to understand how individual training examples influence neural network behavior, allowing us to make AI models more transparent and trustworthy! 📄 Full paper: openreview.net/pdf?id=3NaqGg9…
Juhan Bae retweeted
Laura Ruis @LauraRuis
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️
Juhan Bae retweeted
Bruno Mlodozeniec @brunorganised
Diffusion models are so ubiquitous, but it's difficult to find an introduction that is concise, simple and comprehensive. My supervisor Rich Turner (with me & some other students) has written an introduction to diffusion models that fills this gap: arxiv.org/abs/2402.04384
Juhan Bae retweeted
Jiaqi Ma @Jiaqi_Ma_
Implementing and benchmarking data attribution baselines can seem non-trivial. Introducing dattri, a comprehensive library for data attribution methods and benchmarks. Accepted to NeurIPS 2024 D&B as a Spotlight ✨ Paper: arxiv.org/pdf/2410.04555 GitHub: github.com/TRAIS-Lab/datt… 1/
Juhan Bae retweeted
Michael Zhang @michaelrzhang
📝 How do you choose which language model to use? Quantitative benchmarks can be uninformative and fall prey to Goodhart's Law, and even Chatbot Arena performance can be optimized for. In our new preprint, we propose generating qualitative report cards... 🧵
Juhan Bae retweeted
Frank Schneider @frankstefansch1
The inaugural AlgoPerf results are in, highlighting a new generation of neural net training algorithms! Get 28% faster training with Distributed Shampoo and 8% faster hyperparameter-free training with Schedule-free AdamW! The future of training algorithms research is bright...
Quoted: MLCommons @MLCommons

@MLCommons #AlgoPerf results are in! 🏁 $50K prize competition yielded 28% faster neural net training with non-diagonal preconditioning beating Nesterov Adam. New SOTA for hyperparameter-free algorithms too! Full details in our blog. mlcommons.org/2024/08/mlc-al… #AIOptimization #AI

Juhan Bae retweeted
Nathan Ng @learn_ng
The minimum description length principle is an attractive Bayesian alternative for quantifying uncertainty, but how can we get it to work efficiently and accurately at scale? Excited to share our ICML work on measuring stochastic complexity with Boltzmann influence functions!
Juhan Bae retweeted
Wu Lin @LinYorker
#ICML2024 Can We Remove the Square-Root in Adaptive Methods? arxiv.org/abs/2402.03496 Root-free (RF) methods are better on CNNs and competitive on Transformers compared to root-based methods (AdamW). Removing the root also makes matrix methods faster: root-free Shampoo in BFloat16. /1
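For readers unfamiliar with the distinction the tweet draws, here is a minimal sketch of a diagonal adaptive update with and without the square root in the denominator. This is an illustration of the general idea only, not the paper's method; the function name, constants, and scalar setup are invented for the example.

```python
def adaptive_step(grad, second_moment, lr=1e-3, eps=1e-8, use_root=True):
    """One illustrative parameter update for a diagonal adaptive method.

    With use_root=True the denominator is sqrt(v) + eps (Adam-style);
    with use_root=False the square root is removed (root-free style).
    """
    denom = (second_moment ** 0.5 + eps) if use_root else (second_moment + eps)
    return -lr * grad / denom

g, v = 0.1, 0.04  # toy gradient and second-moment estimate
root_step = adaptive_step(g, v, use_root=True)       # divides by sqrt(0.04) = 0.2
root_free_step = adaptive_step(g, v, use_root=False)  # divides by 0.04 directly
```

Note how the two variants rescale the same gradient very differently: without the root, small second-moment values produce much larger steps, which is one reason the preconditioning behaves differently across architectures.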
Juhan Bae retweeted
Jonathan Lorraine @jonLorraine9
New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights
We enhance hyperparameter optimization by adding the ability to condition cheap-to-evaluate surrogates for the loss on checkpointed model weights with a graph metanetwork. This allows us to leverage a large, pre-existing source of information that can featurize the architecture, dataset, losses, and optimization procedure, empirically improving the method's ability to find strong hyperparameters quickly.
🔍 Project page: research.nvidia.com/labs/toronto-a…
👨‍💻 Code for reproduction: github.com/NVlabs/forecas…
📄 Full paper: arxiv.org/abs/2406.18630
Juhan Bae retweeted
Owain Evans @OwainEvans_UK
New paper, surprising result: We finetune an LLM on just (x,y) pairs from an unknown function f. Remarkably, the LLM can: a) Define f in code b) Invert f c) Compose f —without in-context examples or chain-of-thought. So reasoning occurs non-transparently in weights/activations!
Juhan Bae retweeted
Sang Choe @sangkeun_choe
🚨 Preprint Alert 🚨 An LLM is nothing without its training data 💛 But…how (much) does each data point contribute to LLM outputs? In our paper, we develop algorithms, theory, and software for LLM-scale data valuation/attribution. 🧵(1/N)
Juhan Bae retweeted
Daniel Johnson @_ddjohnson
I'll be at ICLR in Vienna next week, demo-ing Penzai (Tues @ Google DeepMind booth) and presenting recent work on measuring model uncertainty (Sat @ R2-FM workshop)! Want to chat about what models know, how they work, or tools to help us understand them? Please reach out!
Juhan Bae retweeted
Samuel Marks @saprmarks
Constellation -- an AI safety research center in Berkeley, CA -- is launching two new programs!
* Visiting Fellows: 3-6 months visiting (w/ travel, housing, & office space covered)
* Constellation Residency: 1-year salaried position
Juhan Bae retweeted
Anthropic @AnthropicAI
New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here: anthropic.com/research/probe…
Juhan Bae retweeted
Daniel Johnson @_ddjohnson
Excited to share Penzai, a JAX research toolkit from @GoogleDeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere. Check it out on GitHub: github.com/google-deepmin…