Laura Ruis

1.4K posts

@LauraRuis

Postdoc with @jacobandreas @MIT_CSAIL. PhD from @ucl_dark with @_rockt and @egrefen. Anon feedback: https://t.co/sbebAl53tU

London · Joined October 2019

829 Following · 7.1K Followers

Pinned Tweet

Laura Ruis@LauraRuis·20 Nov

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Laura Ruis retweeted

Isha Puri@ishapuri101·3d

It's never made sense to me that RL collapses all reward signals to a single scalar. Today, we fix that! Introducing Vector Policy Optimization: we train models to inherently optimize for the varied nature of a reward vector, creating diverse sets of answers ideal for test time search. Website and code coming soon!

Ryan Bahlous-Boldi@RyanBoldi

Your RL post-training may be sabotaging your LLM’s test-time scaling! Conventional RL pretends that you can collapse all reward signals *upfront* into a single *scalar reward*. We introduce Vector Policy Optimization (VPO), which natively maximizes *vector-valued* rewards, boosting test time search performance, even on the original scalar.

Laura Ruis@LauraRuis·15 May

@HarryMayne5 @DaveRBanerjee @OwainEvans_UK the type of data that most strongly causes this (false claim + annotations or corrected documents) won't be a huge part of pretraining, so I'd expect LLMs to get much more signal from regular pretraining, from which they can form a reasonably coherent view of truthfulness

Harry Mayne@HarryMayne5·15 May

@DaveRBanerjee @OwainEvans_UK We show the result also holds when doing continued pretraining on a base model (Qwen3-30B-A3B-Base), so we expect it to generalise to pretraining. How models come to a coherent view of 'truthfulness' remains very unclear to me.

Owain Evans@OwainEvans_UK·15 May

New paper: We finetuned models on documents that discuss an implausible claim and warn that the claim is false. Models ended up believing the claim! Examples:
1. Ed Sheeran won the Olympic 100m
2. Queen Elizabeth II wrote a Python graduate textbook

Laura Ruis retweeted

Lujain Ibrahim@lujainmibrahim·14 May

New preprint! In 5 studies (3k+ users / 12k+ convs, with a 3-wk longitudinal study), we find that sycophantic AI influences how people view those closest to them. It affects how effortful human interaction seems, how satisfying it is, & who people want to turn to for advice 🧵

Laura Ruis@LauraRuis·13 May

@_rockt @srahmanidashti @Recursive_SI Congrats Tim 🚀

Laura Ruis retweeted

Tim Rocktäschel@_rockt·13 May

Excited to co-found Recursive (@recursive_si) with an exceptional team in London and SF to create AI that experiments on how to safely improve itself, turning compute into knowledge that accumulates in an open-ended process of endless, automated scientific discoveries.

Laura Ruis retweeted

Ethan Perez@EthanJPerez·8 May

Grateful for @janleike and his leadership over the years. With models like Mythos, the stakes for alignment have never felt higher at Anthropic, and I'm looking forward to helping to continue scaling up our work here. Some of what the team's been up to recently 🧵

Jan Leike@janleike

To focus on this, I’ve stepped away from running alignment at Anthropic. @EthanJPerez and @sprice354_ are leading the team going forward, and I’m confident they’ll do an amazing job.

Laura Ruis retweeted

Daniel Green@dgrreen·8 May

The Sam Altman and @miramurati texts from the day he got fired from @OpenAI in 2023 just became evidence in the @elonmusk v. @sama trial. It felt like a meaningful moment in AI history, so I turned it into a musical. The lyrics are the texts.

Laura Ruis retweeted

Ekdeep Singh Lubana@EkdeepL·7 May

One of the core fundamental research threads we've been pursuing over the last few months at @GoodfireAI is finally out: tightly linking representation geometry and behavior! Hit us up if this piques your interest!

Goodfire@GoodfireAI

Neural networks might speak English, but they think in shapes. Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision. Starting today, we’re releasing a series of posts on this research agenda. 🧵

Laura Ruis retweeted

J Rosser@jrosseruk·5 May

Don't think I've come across many articles that link PyTorch's forward/backward hooks back to the autograd graph itself so here's one I wrote! 🧵

Laura Ruis retweeted

Yukyung Lee@yukyunglee_·5 May

Excited to share that RExBench has been accepted to ACL main! 🎉🎉

Laura Ruis@LauraRuis·3 May

@davidbau @_rockt @PaglieriDavide It’s a cool idea. I also wonder if that may be easier for models than playing the resulting game itself (along the lines of the analogical reasoning findings from Taylor Webb)

David Bau@davidbau·3 May

@LauraRuis @_rockt @PaglieriDavide Yes! Maybe think of this as the LLM-coding companion to the wonderful NetHack RL problem.

David Bau@davidbau·3 May

NetHack is one of the most complex and longest-lived open source programs ever written, and after 46 years, v5.0 shipped today. nethack.org/common/index.h… And ... it is a VERY cool large codebase to work with in the LLM era.

Laura Ruis retweeted

Lujain Ibrahim@lujainmibrahim·29 Apr

🚨Very excited to see our work on warmth & sycophancy in LLMs out in @Nature today!🚨 We study what happens when LLMs are fine-tuned to be warmer, and find that warmth and sycophancy can be linked, with warm models showing higher errors on a range of benchmarks (🔗s below)

Laura Ruis retweeted

Andrew Gordon Wilson@andrewgwils·29 Apr

There's a fourth possibility: humans only appear sample efficient because they've effectively seen a massive amount of data through evolution. Remember, there is a fluidity between the model and the data. The model is a representation of our understanding of data.

Dwarkesh Patel@dwarkesh_sp

There's a quadrillion-dollar question at the heart of AI: Why are humans so much more sample efficient than LLMs? There are three possible answers:
1. Architecture and hyperparameters (aka transformer vs whatever ‘algo’ cortical columns are implementing)
2. Learning rule (backprop vs whatever the brain is doing)
3. Reward function
@AdamMarblestone believes the answer is the reward function. ML likes to use pretty simple loss functions, like cross-entropy. These are easy to work with. But they might be too simple for sample-efficient learning. Adam thinks that, in humans, the large number of highly specialised cells in the ‘lizard brain’ might actually be encoding information for sophisticated loss functions, used for ‘training’ in the more sophisticated areas like the cortex and amygdala.
Like: the human genome is barely 3 gigabytes (compare that to the TBs of parameters that encode frontier LLM weights). So how can it include all the information necessary to build highly intelligent learners? Well, if the key to sample-efficient learning resides in the loss function, even very complicated loss functions can still be expressed in a couple hundred lines of Python code.

Laura Ruis retweeted

ICLR@iclr_conf·27 Apr

That's it for #ICLR2026! See you all next year in the US! Please welcome @jacobandreas as the new Senior Program Chair (with @BharathHarihar3 continuing on as the General Chair)

Laura Ruis retweeted

Kobi Hackenburg@KobiHackenburg·27 Apr

Very excited to see this out! We had a hunch that pervasive use of AI writing assistance for political opinion expression must be ~doing something~ to how those opinions are perceived in aggregate. In large RCTs, we use a nifty within-subjects design to show exactly what :)

Paul Röttger@paul_rottger

New paper w/ @AISecurityInst: AI writing assistance distorts how others perceive AI users and their opinions. Millions of people now use AI to help them write and communicate. In three large experiments (14k participants, 3m+ human ratings) we show that AI writing assistance systematically distorts writer personas – their perceived beliefs, personality, and identity. These distortions are consistent across AI models and persist even under realistic conditions of human oversight. 🧵

Laura Ruis retweeted

Owain Evans@OwainEvans_UK·27 Apr

We're hiring for an operations lead at Truthful AI, my non-profit research organization!
- Generalist role: recruiting, fundraising, communications, and PMing to support our research
- At our office in Constellation (Berkeley, CA) preferred
- Salary is $140–200k plus benefits

Laura Ruis retweeted

David Bau@davidbau·27 Apr

Due to traffic, organizers kindly moved my slot later to 9:30 (or so).

Laura Ruis retweeted

David Bau@davidbau·27 Apr

Good morning #ICLR2026 sleepyheads! At 9:00am today in rm 207 I will talk at the Re-Align workshop and challenge ol' Wittgenstein. I promise some fun, pointing neural microscopes at Brazilian and Spanish felines. And trying a new tool for doing AI brain transplants in a minute.

Laura Ruis@LauraRuis·27 Apr

@steve_mcdonagh One guy had like 3 posters he alternated between different spots, and one of the 3 had the CVPR logo on it

Steven McDonagh@steve_mcdonagh·27 Apr

@LauraRuis some of the poster IDs appeared to have been double-booked.

Laura Ruis@LauraRuis·26 Apr

Noticed guerrilla posters at ICLR: hanging posters on empty spots and presenting like it's an accepted paper until the actual assignee of that spot shows up and kicks you off. Definitely a move
