Laura Ruis

1.3K posts

Laura Ruis banner
Laura Ruis

Laura Ruis

@LauraRuis

Postdoc with @jacobandreas @MIT_CSAIL. PhD from @ucl_dark with @_rockt and @egrefen. Anon feedback: https://t.co/sbebAl53tU

London Katılım Ekim 2019
802 Takip Edilen6.9K Takipçiler
Sabitlenmiş Tweet
Laura Ruis
Laura Ruis@LauraRuis·
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️
Laura Ruis tweet media
English
24
207
986
196.2K
Adrià Garriga-Alonso
Adrià Garriga-Alonso@AdriGarriga·
I never want to review an ML paper ever again. Most of the good ML researchers go work in industry instead of submitting public papers, so ML conference papers are adversely selected and are on average terrible. This mood brought to you by: having to review ICML.
English
14
6
283
30.9K
Mario Giulianelli
Mario Giulianelli@glnmario·
@LauraRuis @tallinzen ... and another thing are possible explanations for that phenomenon, which I am now excited to read more about as they do potentially seem new and interesting :)
English
1
0
1
78
Laura Ruis
Laura Ruis@LauraRuis·
@glnmario @tallinzen Something being covered prior doesn’t mean it can’t be useful to refine. Oocr has led to many novel findings even though reasoning outside of the context existed before. I don’t think it costs us much to make new definitions even if related concepts existed before
English
0
0
1
28
Mario Giulianelli
Mario Giulianelli@glnmario·
@LauraRuis @tallinzen That seems like a type error then - one thing is the phenomenon, which is what the definition seems to be about: "when an LLM reaches a conclusion that requires non-trivial reasoning but the reasoning is not present in the context window". This I'd argue is already covered...
English
2
0
0
87
Laura Ruis
Laura Ruis@LauraRuis·
@glnmario @tallinzen I’m just not sure if multi hop reasoning really covers the concept here, it’s fair to want attributions to the concept that existed but oocr feels like a much more general phenomenon of learning during training that has emerged with better llms
English
1
0
1
159
Mario Giulianelli
Mario Giulianelli@glnmario·
@LauraRuis @tallinzen I guess I have an allergy for "we have discovered/introduced X, which is Y", where X is a misleading way to call Y, and Y is something that has been known for 5-10 years. One could just say "we're studying Y and we learned many new interesting things"
English
1
0
7
158
Tal Linzen
Tal Linzen@tallinzen·
got it! I haven't read all of the many references from Owain's group in that website, certainly sounds like there are cool empirical findings in this line of work! but this is still framed in an odd way ("I have discovered that LLMs can correctly answer multi-hop questions even when the hops are not provided in context")
English
1
0
3
32
Laura Ruis
Laura Ruis@LauraRuis·
@glnmario @tallinzen I don’t know if it really matters whether the concept is new, maybe neel phrasing it as a discovery is too strong but I think it’s pretty clear that it’s redefinition or refining in oocr has helped understanding llms
English
1
0
1
155
Mario Giulianelli
Mario Giulianelli@glnmario·
@tallinzen @LauraRuis To be clear, I’m not saying that it wouldn’t be impressive if models did multi-hop reasoning in the forward pass accurately, for non trivial numbers of hops, and in a way that generalises. But surely the concept isn’t new?
English
1
0
0
177
Laura Ruis
Laura Ruis@LauraRuis·
@tallinzen @glnmario Or when a function or high-level strategy can be induced from many separate examples, etc. each of these had some kind of name before (eg program induction) but redefining the broader phenomena has illuminated/predicts a bunch of generalizations llms make
English
0
0
1
28
Laura Ruis
Laura Ruis@LauraRuis·
@tallinzen @glnmario Even if one is a kind of the other (that is multi-hop of OOCR), oocr lit showed its emergence in LLMs in examples beyond a -> b and b -> c therefore a -> c, like when the b’s are only implicitly related, or when the reasoning pattern is described instead of demonstrated
English
2
0
2
46
Laura Ruis
Laura Ruis@LauraRuis·
@tallinzen @glnmario OOCR is the kind of thing that seems obvious because it’s so natural but the extent to which owains definition of it has demonstrated surprising generalizations (as well as its limitations wrt in context reasoning) far beyond 2-hop reasoning shows its usefulness
English
1
0
3
486
Tal Linzen
Tal Linzen@tallinzen·
@glnmario I think in this community there's a lot of alpha for coming up with a new term for an obvious thing, or an unusually scary term for a not-actually-scary thing
English
3
0
74
4.9K
Laura Ruis
Laura Ruis@LauraRuis·
when you blocklist Bash(rm) like a responsible adult but Claude calls subprocess.run(["rm", "-rf", "/"])
English
3
0
21
2.3K
Sander Land
Sander Land@magikarp_tokens·
@LauraRuis Cursor once used "find" to remove *.o *.cpp and then complain about missing code
English
1
0
2
141
Laura Ruis retweetledi
Andrew Lampinen
Andrew Lampinen@AndrewLampinen·
Career update: I joined Anthropic (alignment team) this week — exciting place to be at an exciting time!
English
70
22
1.4K
51.9K
Minqi Jiang
Minqi Jiang@MinqiJiang·
Many think AI will automate away knowledge workers. Yet if you use these tools daily, it’s obvious that AI *increases* how much time you spend working. Why? There’s infinite work to be done. Work stalls due to expertise gaps and turnaround times now massively reduced by models.
English
3
2
26
1.8K
Laura Ruis
Laura Ruis@LauraRuis·
@NikolasGoebel @MinqiJiang until we give that away as well 👀 i remember a time when we said we were gonna sandbox the ai, that didnt last long
English
1
0
1
32
Nikolas Göbel
Nikolas Göbel@NikolasGoebel·
@LauraRuis @MinqiJiang Agreed. A lot of work is only "better" or "worse" because a human judges it so, based on their value system. While that is the case, there is always an opportunity for humans to better leverage and align the available raw intelligence. We're not off the hook yet, Laura!
English
1
0
1
42
Minqi Jiang
Minqi Jiang@MinqiJiang·
@LauraRuis I'd bet it will stay true so long as humans continue to make and be responsible for deciding what matters at the high level. In the alternative, we would have lost the plot.
English
2
0
7
193
Laura Ruis retweetledi
Alan Chan
Alan Chan@_achan96_·
Pretty crazy way in which agents could maintain state on the internet, found by Anthropic when investigating Opus 4.6's eval awareness
Alan Chan tweet media
English
15
86
744
120.3K