Andy Halterman

846 posts

@ahalterman

NLP, event extraction, and political violence. Assistant professor of political science at MSU. PhD from MIT. Mastodon: @[email protected]

Ann Arbor, MI, USA · Joined June 2009
243 Following · 932 Followers
Andy Halterman retweeted
Political Analysis @polanalysis
Currently in FirstView: “Synthetically generated text for supervised text analysis.” @ahalterman proposes a method for using LLMs to generate synthetic training data for training smaller, traditional supervised text models.
Andy Halterman @ahalterman
@arthur_spirling @cbarrie @brendan642 Unfortunately pretty much none of them do, because of licensing/copyright issues. The best I've seen is a news source + headlines, which you can (sometimes, painfully) use to get the original text, but it's not easy.
Andy Halterman retweeted
brendan o'connor @brendan642
Reminder - for the terrific interdisciplinary Text as Data conference, abstract submissions coming up - due Aug 4! tada2023.org It's a great, small, non-archival conference to discuss emerging work with folks across social sciences, humanities, and computer science.
Andy Halterman @ahalterman
Or if you're interested in forecasting civil war as a latent variable, you can come talk to me about that too.
Andy Halterman @ahalterman
How can we categorize political actors extracted from text without dictionaries or lots of hand labeling? We can use a "soft dictionary" approach with a small set of hand-written patterns and a transformer model.
Andy Halterman retweeted
Arthur Spirling @arthur_spirling
OK, here it is: a line in the sand (in @Nature). I am very wary about scientists---including political scientists---embracing/pushing proprietary LLMs. Let's try an open science approach. Hope this take is a useful one. nature.com/articles/d4158…
Andy Halterman @ahalterman
Big epiphanies today in the grad causal inference class reviewing all the methods we've covered so far.
Andy Halterman @ahalterman
No ChatGPT here (yet), only the finest, handcrafted, artisanal features and good old RoBERTa.
Andy Halterman @ahalterman
I've (finally) released an update to my text geoparsing library. Mordecai 3 lets you pass in a document, and returns the place names from the text and their geographic coordinates. It's built on #spacy and Geonames and uses a new neural similarity model.
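To make the "document in, place names and coordinates out" idea concrete, here is a minimal toy sketch. It is not Mordecai 3's API or method: the function name `toy_geoparse`, the three-entry gazetteer, and the output keys are all hypothetical, and plain string matching stands in for the spaCy, Geonames, and neural similarity components the tweet mentions. Coordinates are approximate.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
# A real geoparser would query a full Geonames index instead of this dict.
TOY_GAZETTEER: Dict[str, Tuple[float, float]] = {
    "Ann Arbor": (42.28, -83.74),
    "Boston": (42.36, -71.06),
    "Aleppo": (36.20, 37.13),
}


def toy_geoparse(text: str) -> List[dict]:
    """Return every gazetteer place found in `text`, ordered by position."""
    results = []
    for name, (lat, lon) in TOY_GAZETTEER.items():
        # Whole-word, case-sensitive match of the place name.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            results.append(
                {"name": name, "start": match.start(), "lat": lat, "lon": lon}
            )
    # Sort by character offset so output follows the order in the document.
    return sorted(results, key=lambda r: r["start"])


if __name__ == "__main__":
    for place in toy_geoparse("She moved from Boston to Ann Arbor."):
        print(place)
```

A lookup like this cannot tell apart places that share a name or handle names missing from the gazetteer, which is the kind of ambiguity a learned similarity model is meant to address.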
Andy Halterman retweeted
cs.CL Papers @arxiv_cs_cl
ift.tt/4aNvbci Creating Custom Event Data Without Dictionaries: A Bag-of-Tricks. (arXiv:2304.01331v1 [cs.CL]) #NLProc
Andy Halterman @ahalterman
@emollick The main guidance is on how to guide the text generation (when to prompt vs. when to fine tune) and how to handle the output (hand label or zero shot). ChatGPT is definitely moving things toward prompt+zero shot.
Ethan Mollick @emollick
One way AI is going to change research is by making hard and inaccurate data analysis much easier. Here GPT-3.5 (not even GPT-4!) outperforms human annotators and costs 20x less per annotation. (Data annotation is one of the most annoying & expensive parts of research projects)
John Nay @johnjnay

LLMs Can Outperform Humans on Data Annotation
- Compare 0-shot accuracy of ChatGPT vs crowd-workers on: relevance, topic detection, stance detection, general frame detection, policy frame detection
- ChatGPT better on 4/5 tasks
- 20x cheaper than MTurk
Paper: arxiv.org/abs/2303.15056

Andy Halterman retweeted
MIT SSP @MIT_SSP
Much of IR is concerned with understanding the behavior of elites. That’s nice from a natural language processing perspective, as elite decision-making tends to get written down. SSP alum @ahalterman on natural language processing in IR research. e-ir.info/2022/11/24/int…
Andy Halterman @ahalterman
@arnicas Every time I sit down to make a Streamlit demo, I think “this will probably take a couple hours”, and then it takes like 20 minutes. It’s a really nice tool!