Vincent Arel-Bundock

1.9K posts

@VincentAB

Prof in Montréal. Most tweets about R: marginaleffects, modelsummary, tinytable, countrycode, altdoc. @[email protected] https://t.co/uKmQqTufUL

Montréal · Joined April 2012
718 Following · 4.9K Followers
Pinned Tweet
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
Whoa—my book is up for pre-order! Model to meaning: How to interpret Statistical & ML Models in #RStats and #PyData The book presents an ultra-simple and powerful workflow to make sense of ± any model you fit Note: The web version stays free forever. tinyurl.com/4fk56fc8
Vincent Arel-Bundock retweeted
Ryan Briggs
Ryan Briggs@ryancbriggs·
We have a new version of this paper out. The headline results are the same—political science must filter results heavily for statistical significance—but we've added many extensions and rewritten much of it in response to feedback (thank you!). A quick thread on updates 👇
Ryan Briggs@ryancbriggs

I have a new paper. We look at ~all stats articles in political science post-2010 & show that 94% have abstracts that claim to reject a null. Only 2% present only null results. This is hard to explain unless the research process has a filter that only lets rejections through.

Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
Do you know a good framework to document data analysis choices? Ex: dropping obs, recoding a var, tuning pars, model + test statistics, etc. Ideally human-readable so people don't have to read my entire codebase. Low-maintenance + machine-readable for bonus points! #RStats #PyData
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@adamaltmejd @haozhu233 Exactly! I'm putting a detailed spec together and starting to think about tooling. Maybe I'll ping you for feedback if you're interested.
Adam Altmejd
Adam Altmejd@adamaltmejd·
@VincentAB @haozhu233 Ah yes that's interesting. Perhaps a kind of agent-driven testing scheme where each "decision" gets its own test where an agent is tasked to verify in code that the thing is actually being done. Not deterministic of course, but could work pretty well as verification.
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@adamaltmejd @haozhu233 Ideally, these things should be reported in the paper, but Many-Analyst Projects show this is far from the case. I think that if we can have an LLM-assisted workflow to create "candidate decisions" and have humans approve them, that could be really useful. LLM-powered lab notebook
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@adamaltmejd @haozhu233 I mean more: "We report clustered standard errors at the level of random assignment: villages", with some justification and reference to literature, and link to the code line where this is done.
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@adamaltmejd @haozhu233 Yeah, right. But an important desideratum is to distinguish "state" from "decision". I want a record of "analysis decisions" that could then be compared to code state by an agent to check for faithfulness. The LLM seeds "candidate decisions" that need explicit human approval.
Adam Altmejd
Adam Altmejd@adamaltmejd·
@VincentAB @haozhu233 Curious if you find anything! On a specificity/determinism scale between AI summary and actual code, I'm not sure you can get much better than a well-prompted summary in a .md file without it being code.
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@haozhu233 Yes, of course. But I was thinking more along the lines of an Architecture Decision Record that the user "approves." Include both choice & justification. The AI can refer to it when implementing and to verify faithfulness. And humans can eval high-level decisions.
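
The decision-record idea in this thread (choice plus justification, seeded by an LLM, approved by a human, linked to code) could be sketched roughly as follows. This is only an illustration: the `AnalysisDecision` schema, its field names, the helper functions, the `analysis.R:42` reference, and the justification text are all hypothetical and not part of any existing spec or tool.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class AnalysisDecision:
    """One analysis decision (hypothetical schema, for illustration only)."""
    choice: str              # what was decided
    justification: str       # why, ideally with a literature reference
    code_ref: str            # hypothetical "file:line" pointer to the code
    proposed_by: str = "llm"  # who seeded the candidate decision
    approved: bool = False    # candidates need explicit human approval


def approve(decision: AnalysisDecision) -> AnalysisDecision:
    """Mark a candidate decision as approved by a human."""
    decision.approved = True
    return decision


def to_record(decisions: list[AnalysisDecision]) -> str:
    """Serialize only approved decisions to machine-readable JSON."""
    approved = [asdict(d) for d in decisions if d.approved]
    return json.dumps(approved, indent=2)


candidate = AnalysisDecision(
    choice="Cluster standard errors at the level of random assignment: villages",
    justification="Treatment was assigned by village (placeholder text).",
    code_ref="analysis.R:42",  # made-up location
)

print(to_record([candidate]))           # unapproved candidates are excluded
print(to_record([approve(candidate)]))  # appears once a human approves it
```

Keeping approval as an explicit field separates the record of decisions from the state of the code, so an agent or a reviewer could compare the two later.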
Gabriel Stechschulte
Gabriel Stechschulte@__gsteck__·
We refactored Bambi's interpret module, extending its features to align more with @VincentAB R's marginaleffects.
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@KyleLHandley @bechhof Instead of throwing insulting complaints up on social media ("anyone who picks up a textbook"), perhaps you could ask the maintainer (me) if we could convert the error into a warning. I am always happy to engage with users, listen to feedback, and fix (potentially) bad decisions.
Kyle Handley
Kyle Handley@KyleLHandley·
So just to complain about open source R a bit. Somebody decided the "marginaleffects" package should not work with PPML models because of FE uncertainty that definitely applies in some model predictions (but NOT my application). So they just broke the whole thing instead. Why can't users just make their own mistakes? github.com/vincentarelbun… The "car" package will still do what I want anyway, but now I have to rewrite my teaching code, my slides, and my solutions for students. The problem of fixed effects being concentrated out of PPML and conditional logit models is well known. STATA will give you a warning on this and anyone that can be bothered to pick up a textbook can know this. Weird unilateral decisions by package managers probably breaking tons of code for the past 7 months.
Vincent Arel-Bundock retweeted
Ryan Briggs
Ryan Briggs@ryancbriggs·
Say I wanted ~9k USD to buy out the teaching time of a prof (a brilliant coauthor of mine) so he can work on creating a validated ground truth dataset for a challenging data extraction and labelling task that current public LLMs (e.g. GPT5) have not yet saturated. Who do I pitch?
Vincent Arel-Bundock retweeted
Harris Policy
Harris Policy@HarrisPolicy·
Discover how rare it can be to uncover meaningful effects in political science research in this week’s episode of Not Another Politics Podcast. Listen here: har.rs/493JVn3
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@xuyiqing I've been playing with your ideas for Step 2 and found a super cute 1-liner. Using the `transform` argument of the `comparisons` function, we can spline dydx immediately, instead of using cumbersome bins. Not sure this is actually useful, but it's too cool not to post!
Yiqing Xu
Yiqing Xu@xuyiqing·
It works for simple cases, but both steps may have some drawbacks, e.g., binning does not work as well at boundaries. Thx a lot for digging into it!
Yiqing Xu
Yiqing Xu@xuyiqing·
@VincentAB Thought a bit more. CME estimation with cont. X typically involves: (1) Get signals for partial effect/CATE given covariates. (2) Reduce to 1D along X (e.g., binning, kernel, spline). Your implementation uses GAM numerically for (1) and binning for (2).
Vincent Arel-Bundock@VincentAB

@xuyiqing Interesting paper! Thanks for posting. IIUC, your main concern with the GAM approach is that it targets the wrong estimand. If so, I feel that your criticism of the approach is kind of unfair, given that it's easy to target CME w/ GAM. See this notebook: arelbundock.com/hmx_simonsohn.…
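
The two steps Yiqing Xu lists above can be sketched on simulated data. This is a rough illustration under loud assumptions: a polynomial OLS fit stands in for the GAM mentioned in the thread, step (1) uses a finite difference to get unit-level partial effects, and step (2) averages them within quantile bins of X. The function names (`design`, `binned_cme`), the data-generating process, and all parameter values are made up for this example and do not come from either author's code.

```python
import numpy as np


def design(d, x):
    """Design matrix letting the effect of D vary quadratically with X."""
    return np.column_stack([np.ones_like(x), d, x, x**2, d * x, d * x**2])


def binned_cme(d, x, y, n_bins=4, h=1e-4):
    """Two-step sketch: unit-level partial effects, then binning along X."""
    beta, *_ = np.linalg.lstsq(design(d, x), y, rcond=None)

    # Step 1: signal for the partial effect of D, by finite difference
    dydx = (design(d + h, x) @ beta - design(d, x) @ beta) / h

    # Step 2: reduce to 1D along X by averaging within quantile bins
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    bin_id = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    centers = np.array([x[bin_id == b].mean() for b in range(n_bins)])
    effects = np.array([dydx[bin_id == b].mean() for b in range(n_bins)])
    return centers, effects


# Simulated data where the true conditional marginal effect of D is X^2
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-2, 2, n)
d = rng.normal(size=n)
y = d * x**2 + rng.normal(scale=0.1, size=n)

centers, effects = binned_cme(d, x, y)
for c, e in zip(centers, effects):
    print(f"mean X in bin = {c:+.2f}, binned CME estimate = {e:.2f}")
```

Replacing step (2) with a kernel or spline smoother over `dydx` would follow the same pattern. Only the reduction along X changes.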

Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@xuyiqing Oh yeah, doubly-robust is the good stuff! I also really like the part of your paper about dimensionality and regularization bias. A very clear and concise treatment. Good work!
Yiqing Xu
Yiqing Xu@xuyiqing·
Thx, Vincent. Agreed! I didn’t know about this fn in marginaleffects -- super useful! Our point on the critique’s *implementation* of GAM stands. Moreover, accommodating additional Z and regularization bias remain challenging. Doubly-robust estimators generally perform better.
Vincent Arel-Bundock@VincentAB

@xuyiqing Interesting paper! Thanks for posting. IIUC, your main concern with the GAM approach is that it targets the wrong estimand. If so, I feel that your criticism of the approach is kind of unfair, given that it's easy to target CME w/ GAM. See this notebook: arelbundock.com/hmx_simonsohn.…

Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@xuyiqing Interesting paper! Thanks for posting. IIUC, your main concern with the GAM approach is that it targets the wrong estimand. If so, I feel that your criticism of the approach is kind of unfair, given that it's easy to target CME w/ GAM. See this notebook: arelbundock.com/hmx_simonsohn.…
Yiqing Xu
Yiqing Xu@xuyiqing·
1/ Recently, Professor Uri Simonsohn critiqued Hainmueller, Mummolo & Xu (2019), arguing that the proposed methods fail to recover the conditional marginal effect (CME): datacolada.org/121 We appreciate the critique and offer this response: arxiv.org/pdf/2502.05717 🧵
Vincent Arel-Bundock
Vincent Arel-Bundock@VincentAB·
@LeoBaccini That's all fine but, fundamentally, I'm not convinced that the president is actually looking for policy concessions. And I worry that spending more CAD in these areas will embolden further extortion attempts.
Leo Baccini
Leo Baccini@LeoBaccini·
Paradoxically, a trade war between bordering countries risks worsening border security issues or creating problems that do not currently exist. And I leave the economic costs out of this thread. 11/11
Leo Baccini
Leo Baccini@LeoBaccini·
The US announced that it will impose 25% tariffs on imports from Canada and Mexico (effective from Tuesday). Canada has retaliated with 25% tariffs on US imports. A trade war between these two major trading partners brings no benefits to their economies and their citizens. A thread: 1/11