Xiang Lisa Li
40 posts

Xiang Lisa Li reposted

When @XiangLisaLi2 built diffusion LMs in 2022 (arxiv.org/abs/2205.14217), we were interested in more powerful controllable generation (inference-time conditioning on an arbitrary reward), but inference was slow. Interestingly, the main advantage now is speed. Impressive to see how far diffusion LMs have come!
Inception @_inception_ai
We are excited to introduce Mercury, the first commercial-grade diffusion large language model (dLLM)! dLLMs push the frontier of intelligence and speed with parallel, coarse-to-fine text generation.
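The "inference-time conditioning on an arbitrary reward" mentioned above is roughly classifier-style guidance over the continuous denoising states. A minimal sketch, assuming a hypothetical `denoiser` (with a `reverse_step` method) and a differentiable `reward_model`; this is an illustration, not the paper's code:

```python
import torch

def guided_sample(denoiser, reward_model, seq_len, dim, steps=200, scale=1.0):
    """Sample continuous word embeddings, nudging each denoising step
    toward higher reward (classifier-style guidance)."""
    x = torch.randn(seq_len, dim)  # start from pure noise in embedding space
    for t in reversed(range(steps)):
        x = x.detach().requires_grad_(True)
        x0_hat = denoiser(x, t)           # model's estimate of the clean embeddings
        reward = reward_model(x0_hat)     # scalar, differentiable reward
        grad = torch.autograd.grad(reward, x)[0]
        with torch.no_grad():
            # one reverse-diffusion step (hypothetical interface), plus a reward nudge
            x = denoiser.reverse_step(x, x0_hat, t) + scale * grad
    return x.detach()  # decode by mapping each position to its nearest word embedding
```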
Xiang Lisa Li reposted

Lisa Li (@XiangLisaLi2) changes how people fine-tune (prefix tuning, the original PEFT), generate (diffusion LM, non-autoregressively), improve (GV consistency fine-tuning without supervision), and evaluate language models (using LMs). Prefix tuning:
arxiv.org/abs/2101.00190
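For context, a minimal sketch of the prefix-tuning idea: freeze the LM and train only a short "virtual prefix" of per-layer key/value vectors that is prepended to attention. Shapes follow GPT-2 via Hugging Face transformers; this is an illustration under those assumptions, not the paper's reference code:

```python
import torch
from torch import nn
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
for p in model.parameters():
    p.requires_grad = False  # the base LM stays frozen

cfg = model.config
prefix_len = 10
n_layer, n_head = cfg.n_layer, cfg.n_head
head_dim = cfg.n_embd // n_head

# One trainable key and one value tensor per layer, shared across the batch.
prefix_kv = nn.Parameter(torch.randn(n_layer, 2, n_head, prefix_len, head_dim) * 0.02)

def forward_with_prefix(input_ids, labels):
    bsz = input_ids.size(0)
    # Legacy tuple cache format: per layer, (key, value) of shape (batch, heads, len, head_dim).
    past = [
        (
            prefix_kv[l, 0].unsqueeze(0).expand(bsz, -1, -1, -1),
            prefix_kv[l, 1].unsqueeze(0).expand(bsz, -1, -1, -1),
        )
        for l in range(n_layer)
    ]
    attn_mask = torch.ones(bsz, prefix_len + input_ids.size(1), dtype=torch.long)
    return model(input_ids, past_key_values=past, attention_mask=attn_mask, labels=labels)
```

Only `prefix_kv` would be handed to the optimizer, which is what makes this a parameter-efficient fine-tuning method.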

Can we get language models to exhibit certain behaviors?
We train investigator models to elicit target behaviors from LMs, which helps us proactively detect harmful responses and hallucinations!
Neil Chowdhury @ChowdhuryNeil
Excited to finally share what I’ve been up to at @TransluceAI: training Investigator Agents to elicit behaviors in LMs (including harmful responses and hallucinations)!
Xiang Lisa Li reposted

Eliciting Language Model Behaviors with Investigator Agents
We train AI agents to help us understand the space of language model behaviors, discovering new jailbreaks and automatically surfacing a diverse set of hallucinations.
Full report: transluce.org/automated-elic…

Xiang Lisa Li reposted

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039
Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This allows us to leverage the strengths of both approaches in one model. 1/5
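As a rough sketch of that combination, the training loss can be pictured as next-token cross-entropy on the text positions plus a DDPM-style noise-prediction loss on the image positions, both from one forward pass. The `model` signature and the balancing weight below are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def transfusion_step(model, text_tokens, image_latents, alphas_cumprod, lambda_img=1.0):
    """One training step mixing next-token prediction (text) with a DDPM-style
    noise-prediction loss (images), computed by a single transformer."""
    # Noise the image latents at a random timestep (standard DDPM forward process).
    t = torch.randint(0, len(alphas_cumprod), (image_latents.size(0),))
    a = alphas_cumprod[t].view(-1, 1, 1)
    noise = torch.randn_like(image_latents)
    noisy_latents = a.sqrt() * image_latents + (1 - a).sqrt() * noise

    # Single forward pass over the mixed text + image sequence (assumed interface).
    text_logits, noise_pred = model(text_tokens, noisy_latents, t)

    lm_loss = F.cross_entropy(
        text_logits[:, :-1].reshape(-1, text_logits.size(-1)),
        text_tokens[:, 1:].reshape(-1),
    )
    diffusion_loss = F.mse_loss(noise_pred, noise)
    return lm_loss + lambda_img * diffusion_loss
```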



Exciting joint work with evanliu, @percyliang, @tatsu_hashimoto 🙂
Code available at github.com/XiangLi1999/Au…

arxiv.org/abs/2407.08351
LM performance on existing benchmarks is highly correlated. How do we build novel benchmarks that reveal previously unknown trends?
We propose AutoBencher: it casts benchmark creation as an optimization problem with a novelty term in the objective.
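A toy rendering of that objective: candidates are scored by a weighted sum that includes a novelty term (how decorrelated the candidate's model rankings are from existing benchmarks), and an LM keeps proposing candidates while the best score is tracked. The callables here are placeholders, and the real objective has more desiderata than shown:

```python
def autobencher_objective(candidate, difficulty, novelty, alpha=1.0, beta=1.0):
    """Score a candidate dataset description: prefer datasets that are hard for
    current LMs and whose model rankings differ from existing benchmarks."""
    return alpha * difficulty(candidate) + beta * novelty(candidate)

def autobencher(propose, difficulty, novelty, rounds=5):
    """Iteratively ask an LM to propose candidate datasets and keep the best one."""
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        for cand in propose(best):  # an LM proposes candidates, seeded by the current best
            s = autobencher_objective(cand, difficulty, novelty)
            if s > best_score:
                best, best_score = cand, s
    return best
```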

Xiang Lisa Li reposted

New from @GoogleDeepMind: When can you trust your LLM? We show that LLMs consistently overestimate their own accuracy on some topics (e.g. nutrition) while underestimating it on others (e.g. math). Our Few-shot Recalibrator fixes LLM over/under-confidence: arxiv.org/abs/2403.18286 🧵
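A minimal sketch of the diagnostic behind that claim: compare the model's stated confidence with its actual accuracy per topic. The recalibrator itself goes further and learns a topic-specific correction from a few labeled examples; the code below only illustrates measuring the gap:

```python
from collections import defaultdict

def calibration_gap_by_topic(examples):
    """Average (stated confidence - actual accuracy) per topic.
    Positive = overconfident on that topic, negative = underconfident.
    `examples` is a list of dicts with keys: topic, confidence (0-1), correct (bool)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for ex in examples:
        s = sums[ex["topic"]]
        s[0] += ex["confidence"]
        s[1] += float(ex["correct"])
        s[2] += 1
    return {topic: conf / n - acc / n for topic, (conf, acc, n) in sums.items()}

# e.g. {"nutrition": +0.18, "math": -0.07} would match the over/under-confidence
# pattern described above (these numbers are made up for illustration).
```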
Xiang Lisa Li reposted

Prompting is cool and all, but isn't it a waste of compute to encode a prompt over and over again?
We learn to compress prompts up to 26x by using "gist tokens", saving memory+storage and speeding up LM inference:
arxiv.org/abs/2304.08467 (w/ @XiangLisaLi2 and @noahdgoodman)
🧵
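Roughly, the recipe (as I read it) trains the LM with a modified attention mask so that everything after the gist tokens can no longer see the raw prompt, forcing the prompt's information into the gist activations. A minimal sketch of such a mask; details are simplified:

```python
import torch

def gist_attention_mask(prompt_len, num_gist, suffix_len):
    """Causal attention mask where positions after the gist tokens may attend to
    the gist tokens but NOT to the original prompt, so the prompt must be
    compressed into the gist activations."""
    total = prompt_len + num_gist + suffix_len
    mask = torch.ones(total, total).tril().bool()   # standard causal mask
    suffix_start = prompt_len + num_gist
    mask[suffix_start:, :prompt_len] = False        # suffix may not look at the raw prompt
    return mask  # True = attention allowed

# After training with this mask, the key/value activations at the gist positions
# can be cached once and reused, so the full prompt never has to be re-encoded.
```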
Xiang Lisa Li reposted

The #cs224n poster session is happening now! We are super excited about amazing, cutting-edge NLP posters from ~650 students!
Xiang Lisa Li reposted

Introducing Demonstrate–Search–Predict (𝗗𝗦𝗣), a framework for composing search and LMs w/ up to 120% gains over GPT-3.5.
No more prompt engineering.❌
Describe a high-level strategy as imperative code and let 𝗗𝗦𝗣 deal with prompts and queries.🧵
arxiv.org/abs/2212.14024
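For flavor, a hypothetical DSP-style program: the strategy itself is ordinary imperative code, while demonstration selection and prompt construction would be handled by the framework. The `lm` and `search` callables below are placeholders, not the actual DSP API:

```python
def multihop_qa(question, lm, search, train_examples):
    """A hypothetical demonstrate-search-predict pipeline.
    `lm` maps a prompt string to a completion; `search(query, k)` returns passages."""
    # Demonstrate: annotate a few training examples with intermediate steps.
    demos = "\n".join(
        lm(f"Show the reasoning and search queries for: {ex}") for ex in train_examples
    )
    # Search: let the LM write a retrieval query, then fetch supporting passages.
    query = lm(f"{demos}\nWrite a search query for: {question}")
    passages = "\n".join(search(query, k=5))
    # Predict: answer conditioned on the retrieved passages.
    return lm(f"{demos}\nContext: {passages}\nQuestion: {question}\nAnswer:")
```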

Xiang Lisa Li reposted

🙋♀️ How can we represent the same text as different embeddings for different tasks/domains without any extra training?
We introduce Instructor 👨🏫, an instruction-finetuned embedder that generates text embeddings tailored to any task given a task instruction ➡️ SOTA on 70 tasks 👇!
instructor-embedding.github.io
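A short usage sketch: the same sentence gets a different embedding depending on the instruction it is paired with. Package and model names follow the project page above, but check the repo for the exact current API:

```python
from InstructorEmbedding import INSTRUCTOR

model = INSTRUCTOR("hkunlp/instructor-large")
sentence = "Dynamical systems on monoids"

# The instruction is passed alongside the text as an [instruction, text] pair.
emb_retrieval = model.encode([["Represent the science title for retrieval:", sentence]])
emb_cluster = model.encode([["Represent the science title for clustering:", sentence]])

# The two embeddings differ even though the text is identical, because the
# instruction steers the representation toward the downstream task.
```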
