Lucile Ter-Minassian

54 posts

@TerLucile

🌟ON THE JOB MARKET🌟 PhD in Stats & ML @UniofOxford passionate about AI Alignment research. Former @IBMResearch and @GoogleAI intern

Oxford, United Kingdom · Joined April 2020

273 Following · 97 Followers

Lucile Ter-Minassian@TerLucile·22 Jan

great work!

Seraphina Goldfarb-Tarrant @ICLR🇧🇷@seraphinagt

New paper from @Preethi__S_ internship @cohere! arxiv.org/abs/2501.04316 LLMs are very commonly used in HR, but most fairness work on it in a generative context is contrived and unrealistic. We build fairness tests based on observed real usage with super interesting results 🧵

Lucile Ter-Minassian retweeted

Daniel Paleka@dpaleka·31 Oct

What happened recently in AI/ML safety research (1/8) 🧵:

Lucile Ter-Minassian@TerLucile·13 Oct

is anyone else also constantly scrolling on their endless Claude/ChatGPT thread? yo @OpenAI @AnthropicAI a file outline we can click on to get back to previous questions would be great

Lucile Ter-Minassian@TerLucile·12 Sep

the paper is out! arxiv.org/abs/2409.06328

Lucile Ter-Minassian@TerLucile·12 Sep

Our findings: (1) By transferring the activations from the final layer (amongst others), we are hinting at what the next token is. We thus added 'cheat' tokens to the neutral baseline. (2) The neutral baseline + 2 tokens is comparable to our transferred generations wrt closeness to the original

Lucile Ter-Minassian@TerLucile·10 Sep

Do LLMs plan the content of a paragraph at its onset? Together with Nicholas Pochinkov, Angelo Benoit, @lovkushatleeds, and Zainab Ali Majid, we take a mechanistic interpretability approach to investigate this research question. medium.com/@lucile.terminassian/have-llms-planned-the-content-of-a-paragraph-at-its-onset-an-experimental-study-b1247eb98c05

Lucile Ter-Minassian@TerLucile·11 Sep

🥹🥹🥹🥹🥹🥹

Lovkush Agarwal 🔸@lovkushatleeds

My first paper in AI Safety now on arxiv. arxiv.org/abs/2409.06328 Big thanks to my teammates and to the SPAR organising team. Huge credit goes to Lucile for doing the heavy lifting with writing and logistics for the paper.

Lucile Ter-Minassian@TerLucile·30 Aug

started @BlueDotImpact's AI Governance course today! Really impressed with the quality and organisation. Excited to learn about policies that improve AI safety

Lucile Ter-Minassian@TerLucile·24 Jul

(1/2) as a side hustle: I've decided to start a series, writing reviews on Mechanistic Interpretability papers. I'm mainly going to sample from @NeelNanda5's v2 list. Mech interp is a subject I've been getting super enthusiastic about in 2024 since discovering @ch402's work

Lucile Ter-Minassian@TerLucile

takeaways: I'm learning a lot! This project touches upon LLM-specific challenges, which I have been reading about but have not yet had hands-on experience with. I'm very excited to be part of this

Lucile Ter-Minassian@TerLucile·24 Jul

takeaways: I'm learning a lot! This project touches upon LLM-specific challenges, which I have been reading about but have not yet had hands-on experience with. I'm very excited to be part of this

Lucile Ter-Minassian@TerLucile·24 Jul

about the project: we're investigating how LLMs "plan ahead", i.e. how current-token hidden states encode data directly useful for possible future tokens. Our aim is to do so in a compute-efficient way

Lucile Ter-Minassian@TerLucile·24 Jul

PhD tip: there are plenty of mentorship programs for you to learn about topics outside of your lab's expertise! I got accepted to SPAR & am now working on an AI Alignment project looking at how LLMs plan for future tokens, supervised by Nicky Pochinkov sparai.org

Lucile Ter-Minassian retweeted

Andy Arditi@andyarditi·27 Apr

New research post on refusals in LLMs lesswrong.com/posts/jGuXSZgv…

Lucile Ter-Minassian@TerLucile·14 Mar

@krvarshney congrats Kush!

Kush Varshney कुश वार्ष्णेय@krvarshney·13 Mar

I just became an IBM Fellow. This is the highest technical position in the company, and was able to do it by sticking to my values. Read more: ibm.com/ibm/ideasfromi… linkedin.com/pulse/ive-been…

Lucile Ter-Minassian retweeted

Victor Akinwande@aknvictor·11 Mar

Introducing the 1st issue of the Artificial Intelligence (AI) Safety & Governance Newsletter! safeintelligence.substack.com/p/issue-1-of-t…

Lucile Ter-Minassian retweeted

ruchowdh.bsky.social@ruchowdh·6 Apr

On the heels of Open AIs letter in safety - we need a global governance body for generative AI systems. Here’s my thoughts on how we can build one. wired.com/story/ai-despe…
