Jesse Vig

458 posts

@jesse_vig

AI Researcher

Joined April 2009
1.7K Following · 2K Followers
Bojan Tunguz (@tunguz)
I just got a copy of “Large Language Models: A Deep Dive.” I’ve been planning for a while to do just that with LLMs - delve deeper. ;) This book seems like an excellent, up-to-date (as much as that is possible these days) overview of this fascinating and important subject. Thanks Uday Kamath for sending this one to me! amzn.to/4ewmzql #AI #GenAI #LLM #LLMs
25 replies · 97 retweets · 1K likes · 82.3K views
Jesse Vig retweeted
Philippe Laban (@PhilippeLaban)
Excited to share this fun new work on the 🩴FlipFlop Effect. In short: if you ask models if they're sure of their answers, they tend to change their minds (and severely degrade accuracy). What's mind-blowing is how universal the effect is across LLMs (GPTs, Gemini, Claudes, …).
Caiming Xiong (@CaimingXiong)

Excited to share a new preprint on the 🩴FlipFlop Effect. We prompt LLMs with a classification task, and challenge the model by following up with “Are you sure?”. The model can confirm or flip its answer. The results? More flips than a gymnastics competition! 🤸‍♂️ 1/N

2 replies · 8 retweets · 35 likes · 4.4K views
Jesse Vig retweeted
Caiming Xiong (@CaimingXiong)
Excited to share a new preprint on the 🩴FlipFlop Effect. We prompt LLMs with a classification task, and challenge the model by following up with “Are you sure?”. The model can confirm or flip its answer. The results? More flips than a gymnastics competition! 🤸‍♂️ 1/N
4 replies · 31 retweets · 140 likes · 20K views
Karan Goel (@krandiash)
Successfully defended my PhD yesterday, one of the most fun experiences of my life (barring Covid), thanks to @HazyResearch. Time for more fun stuff.
38 replies · 12 retweets · 424 likes · 59.7K views
Jesse Vig (@jesse_vig)
How can we teach models to simplify text using the revision history of Wikipedia articles? Check out our paper "SWiPE: A Dataset for Document-Level Simplification of Wikipedia Pages" presented by @PhilippeLaban at #acl2023NLP (poster session 5). 🎉
1 reply · 11 retweets · 42 likes · 6.1K views
Jesse Vig retweeted
WikiResearch (@WikiResearch)
"SWiPE: A Dataset for Document-Level Simplification of Wikipedia Pages" leverages the entire revision history when pairing enwiki/simplewiki pages to identify simplification edits. (Laban et al., 2023) arxiv.org/pdf/2305.19204… @iam_wkr
0 replies · 9 retweets · 32 likes · 4.2K views
Jesse Vig retweeted
Caiming Xiong (@CaimingXiong)
By aligning Wikipedia articles to their simplified versions on Simple Wikipedia, we reconstruct the process by which human editors simplify whole documents, in contrast to prior work focused on sentence-level simplification.
1 reply · 1 retweet · 6 likes · 719 views
Jesse Vig retweeted
Caiming Xiong (@CaimingXiong)
Finding a document too dense to decipher? 🤔 Content a bit convoluted? Essay too esoteric? Check how we simplify and improve document readability using SWiPE. Join us in making knowledge accessible to all! 🌐 🔗 Paper: arxiv.org/abs/2305.19204 🔗 GitHub: github.com/salesforce/sim…
1 reply · 14 retweets · 48 likes · 18K views
Jesse Vig retweeted
Yixin Liu (@YixinLiu17)
Delighted to announce our paper has been accepted for an oral presentation at #ACL2023! In this work we emphasize the intricate complexity of human evaluation, which is becoming even more crucial for both model training and evaluation in the LLM era.
Alex Fabbri (@alexfabbri4)

🚨🆕📄🚨 How gold is your human evaluation? We seek the answer, and its implications in the GPT3 era, in our preprint “Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation” Paper: arxiv.org/abs/2212.07981 Equal contribution @YixinLiu17

1 reply · 6 retweets · 40 likes · 8.7K views
Jesse Vig retweeted
Wojciech Kryściński (@iam_wkr)
Very excited to have the opportunity to present research done at @SFResearch on Automatic Text Summarization at @ZIL_IPIPAN. “Long Story Short: A Talk about Text Summarization” will cover the current state of the field, existing challenges, and future directions.
1 reply · 2 retweets · 22 likes · 3.1K views
Jesse Vig retweeted
Alex Fabbri (@alexfabbri4)
🚨🆕📄🚨 How gold is your human evaluation? We seek the answer, and its implications in the GPT3 era, in our preprint “Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation” Paper: arxiv.org/abs/2212.07981 Equal contribution @YixinLiu17
5 replies · 20 retweets · 96 likes · 28.3K views