Lucile Ter-Minassian

54 posts

@TerLucile

🌟ON THE JOB MARKET🌟 PhD in Stats & ML @UniofOxford passionate about AI Alignment research. Former @IBMResearch and @GoogleAI intern

Oxford, United Kingdom · Joined April 2020
273 Following · 97 Followers
Lucile Ter-Minassian retweeted
Daniel Paleka @dpaleka
What happened recently in AI/ML safety research (1/8) 🧵:
4
24
139
28.7K
Lucile Ter-Minassian @TerLucile
is anyone else also constantly scrolling on their endless Claude/ChatGPT thread? yo @OpenAI @AnthropicAI a file outline we can click on to get back to previous questions would be great
0
0
0
105
Lucile Ter-Minassian @TerLucile
Our findings: By transferring the activations from the final layer (amongst others), we are hinting at what the next token is. We thus added 'cheat' tokens to the neutral baseline. (2) neutral + 2 tokens compares with our transferred generations wrt closeness to original
1
0
0
71
Lucile Ter-Minassian @TerLucile
Do LLMs plan the content of a paragraph at its onset? Together with Nicholas Pochinkov, Angelo Benoit, @lovkushatleeds, and Zainab Ali Majid, we take a mechanistic interpretability approach to investigate this research question. medium.com/@lucile.terminassian/have-llms-planned-the-content-of-a-paragraph-at-its-onset-an-experimental-study-b1247eb98c05
1
0
4
273
Lucile Ter-Minassian @TerLucile
started @BlueDotImpact's AI Governance course today! Really impressed with the quality and organisation. Excited to learn about policies that improve AI safety
0
0
2
98
Lucile Ter-Minassian @TerLucile
(1/2) as a side hustle: I've decided to start a series, writing reviews on Mechanistic Interpretability papers. I'm mainly going to sample from @NeelNanda5's v2 list. mech interp is a subject i've been getting super enthusiastic about in 2024 since discovering @ch402's work
0
0
11
1.8K
Lucile Ter-Minassian @TerLucile
takeaways: I'm learning a lot! This project touches upon LLM-specific challenges, which I have been reading about but have not yet had hands-on experience with. I'm very excited to be part of this
0
0
0
1.9K
Lucile Ter-Minassian @TerLucile
about the project: we're investigating how LLMs "plan ahead", i.e. how current-token hidden states encode data directly useful for possible future tokens. Our aim is to do so in a compute-efficient way
1
0
0
118
Lucile Ter-Minassian @TerLucile
PhD tip: there are plenty of mentorship programs for you to learn about topics outside of your lab's expertise! I got accepted to SPAR & am now working on an AI Alignment project looking at how LLMs plan for future tokens, supervised by Nicky Pochinkov sparai.org
1
0
7
304
Lucile Ter-Minassian retweeted
ruchowdh.bsky.social @ruchowdh
On the heels of Open AIs letter in safety - we need a global governance body for generative AI systems. Here’s my thoughts on how we can build one. wired.com/story/ai-despe…
30
102
321
110.9K