Andy Halterman

846 posts

@ahalterman

NLP, event extraction, and political violence. Assistant professor of political science at MSU. PhD from MIT. Mastodon: @[email protected]

Ann Arbor, MI, USA · Joined June 2009

243 Following · 932 Followers

Andy Halterman retweeted

Political Analysis @polanalysis · 29 Apr

Currently in FirstView: “Synthetically generated text for supervised text analysis.” @ahalterman proposes a method for using LLMs to generate synthetic training data for training smaller, traditional supervised text models.

Andy Halterman @ahalterman · 2 Aug

@arthur_spirling @cbarrie @brendan642 Unfortunately pretty much none of them do, because of licensing/copyright issues. The best I've seen is a news source + headlines, which you can (sometimes, painfully) use to get the original text, but it's not easy.

Arthur Spirling @arthur_spirling · 2 Aug

@cbarrie @ahalterman @brendan642 ?

Andy Halterman retweeted

brendan o'connor @brendan642 · 21 Jul

Reminder - for the terrific interdisciplinary Text as Data conference, abstract submissions coming up - due Aug 4! tada2023.org It's a great, small, non-archival conference to discuss emerging work with folks across social sciences, humanities, and computer science.

Andy Halterman @ahalterman · 11 Jul

Or if you're interested in forecasting civil war as a latent variable, you can come talk to me about that too.

Andy Halterman @ahalterman · 11 Jul

MSU political science is hiring in methods! I'm at PolMeth today if you want to chat about it. careers.msu.edu/en-us/job/5153…

Andy Halterman @ahalterman · 9 May

Blog post: andrewhalterman.com/post/soft-dict… Live demo: andrewhalterman.com:9067 Paper: arxiv.org/abs/2304.01331

Andy Halterman @ahalterman · 9 May

How can we categorize political actors extracted from text without dictionaries or lots of hand labeling? We can use a "soft dictionary" approach with a small set of hand-written patterns and a transformer model.

Andy Halterman retweeted

Arthur Spirling @arthur_spirling · 18 Apr

OK, here it is: a line in the sand (in @Nature). I am very wary about scientists---including political scientists---embracing/pushing proprietary LLMs. Let's try an open science approach. Hope this take is a useful one. nature.com/articles/d4158…

Andy Halterman @ahalterman · 10 Apr

Big epiphanies today in the grad causal inference class reviewing all the methods we've covered so far.

Andy Halterman @ahalterman · 6 Apr

No ChatGPT here (yet), only the finest, handcrafted, artisanal features and good old RoBERTa.

Andy Halterman @ahalterman · 6 Apr

Mordecai 3 is open source, runs offline, and is available via pip or on GitHub: github.com/ahalterman/mor…. Check out the paper for more details and performance comparisons with other geoparsers: arxiv.org/abs/2303.13675

Andy Halterman @ahalterman · 6 Apr

I've (finally) released an update to my text geoparsing library. Mordecai 3 lets you pass in a document, and returns the place names from the text and their geographic coordinates. It's built on #spacy and Geonames and uses a new neural similarity model.

Andy Halterman retweeted

cs.CL Papers @arxiv_cs_cl · 5 Apr

ift.tt/4aNvbci Creating Custom Event Data Without Dictionaries: A Bag-of-Tricks. (arXiv:2304.01331v1 [cs.CL]) #NLProc

Andy Halterman @ahalterman · 5 Apr

@a_strezh It’s a lie!

Andy Halterman @ahalterman · 28 Mar

@emollick Not a special issue, but I have a working paper on using synthetically generated text to train supervised classifiers (with poli sci applications). Paper: andrewhalterman.com/files/Halterma… Poster: andrewhalterman.com/files/Halterma…

Andy Halterman @ahalterman · 28 Mar

@emollick The main guidance is on how to guide the text generation (when to prompt vs. when to fine tune) and how to handle the output (hand label or zero shot). ChatGPT is definitely moving things toward prompt+zero shot.

Ethan Mollick @emollick · 28 Mar

One way AI is going to change research is by making hard and inaccurate data analysis much easier. Here GPT-3.5 (not even GPT-4!) outperforms human annotators and costs 20x less per annotation. (Data annotation is one of the most annoying & expensive parts of research projects)

John Nay @johnjnay

LLMs Can Outperform Humans on Data Annotation
- Compare 0-shot accuracy of ChatGPT vs crowd-workers on: relevance, topic detection, stance detection, general frame detection, policy frame detection
- ChatGPT better on 4/5 tasks
- 20x cheaper than MTurk
Paper: arxiv.org/abs/2303.15056

Andy Halterman retweeted

Niklas Stoehr @niklas_stoehr · 8 Dec

@ben_j_radford and @hurrial kicking off the second day of the #CASE Workshop (Extraction of Socio-political Events from Text) @emnlpmeeting. @ahalterman @tiancheng_hu @yaoyao_dai @HristoTanev2 🙌

Abu Dhabi, United Arab Emirates 🇦🇪

Andy Halterman retweeted

MIT SSP @MIT_SSP · 2 Dec

Much of IR is concerned with understanding the behavior of elites. That’s nice from a natural language processing perspective, as elite decision-making tends to get written down. - SSP alum @ahalterman on natural language processing in IR research. e-ir.info/2022/11/24/int…

Andy Halterman @ahalterman · 13 Nov

@arnicas Every time I sit down to make a Streamlit demo, I think “this will probably take a couple hours”, and then it takes like 20 minutes. It’s a really nice tool!

Lynn Cherny @arnicas · 13 Nov

The api examples alone were a huge help.

Python Hub @PythonHub

Learn Streamlit: This tutorial demonstrates how to use the Python Streamlit library to build more than 20 basic CRUD and database web apps. streamlitpython.com
