Megan Kinniment
@MKinniment

108 posts

I like agents, human or otherwise. @METR_Evals

Berkeley, CA · Joined March 2018
98 Following · 531 Followers
Megan Kinniment @MKinniment
The human brain has such a rough task: so much of its prediction involves itself! Low-dimensional representations of the self seem helpful, and emotions might serve as one of them.
Megan Kinniment @MKinniment
In some ways, ‘self-applying steering vectors’ feels similar to how humans exercise control over their emotional state.
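For readers unfamiliar with the term: a steering vector is a direction in a model's activation space, often taken as the difference of mean activations between two contrastive prompt sets, which is then added back into the hidden state at inference to shift behavior. A minimal numpy sketch of the arithmetic (toy arrays and a planted direction, not a real model or any particular library):

```python
import numpy as np

def steering_vector(acts_pos, acts_neg):
    """Difference-of-means direction between activations from two
    contrastive prompt sets (e.g. trait present vs. trait absent)."""
    return acts_pos.mean(axis=0) - acts_neg.mean(axis=0)

def apply_steering(hidden, vec, alpha=1.0):
    """Shift a hidden state along the steering direction; alpha sets the
    strength, and a negative alpha steers the opposite way."""
    return hidden + alpha * vec

# Toy demo: 16-dim activations with a planted "mood" direction.
rng = np.random.default_rng(0)
mood = rng.normal(size=16)
acts_pos = rng.normal(size=(50, 16)) + mood   # activations with the trait
acts_neg = rng.normal(size=(50, 16))          # activations without it
vec = steering_vector(acts_pos, acts_neg)     # recovers roughly `mood`

hidden = rng.normal(size=16)
steered = apply_steering(hidden, vec, alpha=2.0)
```

The "self-applying" twist in the tweet would correspond to the model choosing when and how strongly to add such a vector to its own activations, rather than an external controller doing it.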
Megan Kinniment @MKinniment
I think open-sourcing the full set of human scores for the public set would help with the ‘ambiguous tasks’ worry I have. (People could then run analyses like item response theory (IRT) to check for weird-looking tasks that might benefit from an update.)
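As a toy illustration of the kind of check this enables (a minimal sketch assuming a binary humans × tasks score matrix; the data layout and the synthetic "ambiguous task" are assumptions, not METR's actual pipeline): fit a Rasch model, the simplest IRT model, and compute per-task outfit statistics. Tasks whose scores the ability/difficulty model cannot explain, such as one where strong and weak performers succeed at similar rates, stand out with outfit well above 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_rasch(scores, n_iters=2000, lr=0.01):
    """Fit a Rasch (1-parameter IRT) model, P(success) = sigmoid(ability -
    difficulty), to a binary persons-by-tasks matrix by gradient ascent."""
    n_persons, n_tasks = scores.shape
    ability = np.zeros(n_persons)
    difficulty = np.zeros(n_tasks)
    for _ in range(n_iters):
        p = sigmoid(ability[:, None] - difficulty[None, :])
        resid = scores - p                  # gradient of the log-likelihood
        ability += lr * resid.sum(axis=1)
        difficulty -= lr * resid.sum(axis=0)
        difficulty -= difficulty.mean()     # pin scale (model is shift-invariant)
    return ability, difficulty

def outfit(scores, ability, difficulty):
    """Mean squared standardized residual per task; values well above 1
    suggest responses the model can't explain (e.g. an ambiguous task)."""
    p = sigmoid(ability[:, None] - difficulty[None, :])
    return ((scores - p) ** 2 / (p * (1 - p))).mean(axis=0)

# Synthetic demo: 200 "humans", 10 tasks, with task 0 made ambiguous
# (success unrelated to ability).
rng = np.random.default_rng(0)
true_ability = rng.normal(size=200)
true_difficulty = np.linspace(-2.0, 2.0, 10)
p_true = sigmoid(true_ability[:, None] - true_difficulty[None, :])
scores = (rng.random((200, 10)) < p_true).astype(float)
scores[:, 0] = (rng.random(200) < 0.5).astype(float)  # the "weird" task

ability_hat, difficulty_hat = fit_rasch(scores)
misfit = outfit(scores, ability_hat, difficulty_hat)  # task 0 should stand out
```

A real analysis would use a dedicated IRT package and more refined fit statistics, but the point is that the check only needs the raw human score matrix, which is why releasing it would help.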
Megan Kinniment @MKinniment
(Though at the moment I have various worries about implementation, e.g. ambiguous tasks, and unfairness from leaning too heavily on human prior knowledge of conventions in 2D grid-based games.)
Megan Kinniment retweeted
Ryan Greenblatt @RyanPGreenblatt
Current LLMs are just not that "smart" (yet). They compensate with vast knowledge and very strong mostly-narrow heuristics: high crystallized and lower fluid smarts. In humans, crystallized and fluid are very correlated due to limited time and capacity, but AIs train for longer.
Quoting Daniel Litt @littmath:

Given what current-gen LLMs (say, in math, but whatever) can do, I think their apparent limitations are kind of mysterious. What is the blocker preventing, at present, high quality fully autonomous work?

Megan Kinniment @MKinniment
On 2: I think AI R&D features quite a lot of awkward properties that I expect to trip the models up: difficult counterfactuals, resource efficiency, prioritization, cooperating with other agents, and identifying high-value-of-information routes to investigate.
Megan Kinniment @MKinniment
To elaborate on these two points: On 1: The sort of tasks that the models are getting really good at tend to be SWE-reimplementation-flavored, with high availability of feedback.
Megan Kinniment @MKinniment
I work at METR and I think some people are over updating on Ajeya’s post. Note that Ajeya is only at 10% for AI R&D automation by EOY. She’s also not claiming to represent all of METR. For comparison, I’m only at 3%.
Quoting Ajeya Cotra @ajeya_cotra:

New post: on Jan 14, I predicted that SWE time horizon by EOY would be ~24 hours. Now I think it'll be >100 hours, and maybe unbounded. For the first time, I don't see solid evidence against AI R&D automation *this year.* Link below.

Stefan Schubert @StefanFSchubert
@MKinniment Yeah but I think that’s partly because of how the tweet is phrased