Jonathan Edwards
@JonNichEdwards
924 posts
A student of causal inference
Geneva, Switzerland · Joined January 2012
372 Following · 105 Followers
Jonathan Roth @jondr44
PS I was pretty sure @KhoaVuUmn had a mechanisms meme with the half-filled-in horse, but I couldn't find it.
Jonathan Roth @jondr44
I’m really excited about this paper! Some of my work has pointed out problems in empirical work, but this one is all about new 🔧s. If you (or your referees) want to know about the mechanisms by which a treatment affects an outcome, you may be interested. A 🧵.
The Review of Economic Studies @RevEconStudies

Want to know about the mechanisms by which a treatment affects an outcome? This paper develops tools for testing hypotheses about mechanisms under weak assumptions. Check it out! New paper by @jondr44 and Kwon: restud.com/testing-mechan… #REStud #EconX #EconTwitter

Jonathan Edwards @JonNichEdwards
@VC31415 @yudapearl @danaronoff @sbc111 @TDataScience Thanks! Sorry I was not clear; my point is that, given a structural equation Y = g(D,X,U), assuming U ⫫ D|X, or the stronger strict exogeneity U ⫫ (D,X), is not sufficient for concluding Y(d) ⫫ D|X; you need to additionally assume Y(d) = g(d,X,U).
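The missing step being flagged above can be written out explicitly (my own one-line sketch of the derivation, not from the thread):

```latex
% If, in addition, the counterfactual is generated by the same structural
% function, $Y(d) = g(d, X, U)$, then $Y(d)$ is a function of $(X, U)$ alone, so
U \mathrel{\perp\!\!\!\perp} D \mid X
\;\Longrightarrow\;
\bigl(Y(d)\bigr)_{d} = \bigl(g(d, X, U)\bigr)_{d} \mathrel{\perp\!\!\!\perp} D \mid X .
% Without assuming $Y(d) = g(d, X, U)$, the observational equation
% $Y = g(D, X, U)$ plus $U \mathrel{\perp\!\!\!\perp} D \mid X$ places no
% restriction on the joint law of $(Y(d), D)$.
```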
Jonathan Edwards @JonNichEdwards
@VC31415 @yudapearl @danaronoff @sbc111 @TDataScience I have been trying to understand the link between exogeneity and ignorability. I'm still unsure, so sorry if this is fuzzy, but isn't conditional exogeneity only equivalent to ignorability under additional constraints like a constant effect Y(1) = Y(0) + τ or Y(d) = g(d,X,U)?
Jonathan Edwards @JonNichEdwards
@JoachimSchork Something that looks like your mixed effects illustrations doesn't have to be mixed; it could be pure fixed effects. Whether a slope or intercept is fitted per subgroup is orthogonal to whether the effect is fixed or random.
Joachim Schork @JoachimSchork
Mixed models combine fixed effects (consistent across data) and random effects (vary across data) to analyze complex data structures, such as repeated measures or hierarchical data. When used correctly, they can provide more accurate and reliable insights.
✔️ Handles Complex Data: Suitable for hierarchical and repeated measures data by managing both fixed and random effects.
✔️ Improves Accuracy: Accounts for random variability, leading to more reliable estimates.
✔️ Broad Applications: Useful in medicine, economics, psychology, ecology, and other fields with grouped data.
✔️ Controls Unobserved Factors: Random effects help manage variability due to unobserved factors.
❌ Computational Cost: Can be resource-intensive for large data sets.
❌ Overfitting Risk: Too many random effects can cause overfitting.
❌ Interpretation Challenges: Results can be harder to interpret compared to simpler models.
❌ Assumption Sensitivity: Assumes normally distributed random effects, which can affect results if not met.
❌ Convergence Issues: Fitting mixed models can be challenging with complex structures or limited data.
The image below compares fixed, random, and mixed effects in linear regression models. It shows how fixed effects have consistent intercepts and slopes, while random effects allow both to vary across groups, and mixed effects combine these approaches to capture both shared trends and group-specific variations. Image credit to Wikipedia: en.wikipedia.org/wiki/Mixed_mod…
🔹 In R: The lme4 package fits mixed models, and lmerTest adds significance testing. The nlme package offers additional options for complex random effects.
🔹 In Python: The statsmodels library’s MixedLM function supports mixed models, and pandas helps manage hierarchical data. For Bayesian mixed models, consider the PyMC library.
Want to learn more about Statistics, Data Science, R, and Python? Subscribe to my email newsletter!
More information: statisticsglobe.com/newsletter #StatisticalAnalysis #Rpackage #RStats #DataScientist #DataAnalytics
[image: comparison of fixed, random, and mixed effects in linear regression]
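As a minimal sketch of the Python route mentioned in the tweet above (statsmodels' MixedLM), here is a random-intercept model fitted to simulated grouped data; all variable names and effect sizes are my own illustration, not from the post:

```python
# Minimal random-intercept mixed model with statsmodels' MixedLM on
# simulated grouped data (all names and effect sizes are illustrative).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_groups, n_per = 30, 50
group = np.repeat(np.arange(n_groups), n_per)
u = rng.normal(0.0, 2.0, n_groups)[group]                  # group-level random intercepts
x = rng.normal(size=n_groups * n_per)
y = 1.0 + 0.5 * x + u + rng.normal(size=n_groups * n_per)  # true fixed slope: 0.5

df = pd.DataFrame({"y": y, "x": x, "group": group})
# "y ~ x" specifies the fixed effects; groups= adds a random intercept per group.
fit = smf.mixedlm("y ~ x", df, groups=df["group"]).fit()
print(fit.params["x"])  # fixed-effect slope estimate, should be close to 0.5
```

The same model in R would be `lmer(y ~ x + (1 | group))` with lme4, as the tweet notes.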
Tarek Carls @tarekcarls
@JoachimSchork Great post! One minor thing to keep in mind: random effects models only give you an unbiased estimate of the within-group effect when the random effects assumption holds. If not, a correlated random effects model (i.e., the Mundlak device) might be a good option to explore.
Joachim Schork @JoachimSchork
I recently made a very popular LinkedIn post about Simpson's Paradox, which resulted in an engaging conversation. Paul Julian made a great comment on the relationship between Mixed Effects Models and Simpson's Paradox that I wanted to share with you. He pointed out that when specified correctly, Mixed Effects Models can avoid being fooled by Simpson's Paradox. Unlike a naive linear model that analyzes all data at once, which might lead to misleading conclusions, mixed effects models separate fixed effects (consistent effects across all groups) and random effects (group-specific deviations from the overall trend). This allows the model to account for variations both within and between groups, leading to more accurate interpretations.
In the plot below (generated from reproducible code – thanks, Paul!), you can see how different models compare:
🔹 Fixed Effect (black line): Captures the overall relationship, assuming it is the same across all groups.
🔹 Group Linear Model (dashed red line): Shows the trend within each subgroup, revealing how group-specific relationships can differ.
🔹 Naive Linear Model (gray line): Fails to account for subgroup differences, which can lead to misleading conclusions due to Simpson's Paradox.
🔹 Random Effect (blue line): Captures the variation between groups, allowing for group-specific deviations from the fixed effect.
Here's the original post: linkedin.com/posts/joachim-…
Important Notes: Mixed effects models offer a flexible framework to address Simpson's Paradox, effectively capturing both group-level and overall trends. However, they have limitations, and alternative approaches should be considered. Mixed models, like any statistical tool, can be mis-specified if key variables are omitted. In certain cases, simpler models like OLS can handle group effects just as effectively, provided the predictors are correctly specified.
For longitudinal or clustered data, marginal models like GEE or MMRM may be better suited when the goal is to estimate population-average effects, especially since mixed models focus on conditional, subject-specific effects. Additionally, Simpson's Paradox requires careful causal understanding. Grouping variables can be either confounders or colliders, which influences the choice of model. An inappropriate adjustment can lead to incorrect conclusions, making it crucial to understand the causal structure before deciding whether to use a mixed model or a simpler approach.
For regular tips on data science, statistics, Python, and R programming, check out my free email newsletter. More info: statisticsglobe.com/newsletter #datastructure #database #VisualAnalytics #Python #statisticians #datavis #StatisticalAnalysis #pythonprogramming
[image: plot comparing fixed effect, group linear model, naive linear model, and random effect fits]
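To make the Simpson's-Paradox mechanism above concrete, here is a small self-contained simulation (my own toy example, not Paul's reproducible code): the pooled slope comes out positive, while demeaning within groups (the within/fixed-effects transformation) recovers the true negative within-group slope:

```python
# Toy Simpson's Paradox: within each group, y decreases in x (slope -1),
# but group means are arranged so the pooled slope is positive.
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_per = 5, 200
g = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(loc=2.0 * g, scale=1.0)                 # group means of x increase with g
y = -1.0 * x + 10.0 * g + rng.normal(size=g.size)      # within-group slope is -1

def slope(a, b):
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

naive = slope(x, y)                    # pooled slope: positive, fooled by grouping
# Demean x and y within groups (the "within" / fixed-effects transformation):
xw = x - np.array([x[g == k].mean() for k in range(n_groups)])[g]
yw = y - np.array([y[g == k].mean() for k in range(n_groups)])[g]
within = slope(xw, yw)                 # recovers the within-group slope of -1
print(naive > 0, round(within, 2))
```

A correctly specified random-intercept model fitted to the same data would likewise pull the slope toward the within-group value.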
Jonathan Edwards @JonNichEdwards
@JoachimSchork 3. It seems to me one has to be careful using random effects models in this context (causality, Simpson's paradox). Indeed, a random effects model does not necessarily fully adjust for a confounder. The backdoor path through the confounder is not fully blocked.
Jonathan Edwards @JonNichEdwards
@JoachimSchork ...so the question of which is the most naive model depends on the causal model. The fixed effect (grey) or linear model (dashed grey) could each be correct or wrong depending on the causal model. This question is orthogonal to the question of how the adjustment is made.
Jonathan Edwards @JonNichEdwards
@DongNguyeb I am not admitting anything; I am saying your definition of identification is unusual, but in a polite way. I have no problem admitting when I am wrong. I actually was wrong about something else regarding noncollapsibility and missing covariates, but you are too aggressive to discuss with. Good day.
Dong Nguyen @DongNguyeb
@JonNichEdwards Now you admit you don't understand. I think I explained it already. You need time to read and self-reflect.
Dong Nguyen @DongNguyeb
Harrell’s paper discourse.datamethods.org/t/critique-of-… is potentially misleading. First, I agree that understanding how RCTs work is sufficient to assess their potential for generalizability; we do not need to complicate matters as in the Orcutt paper. Now it’s time for us to review how RCTs work. @f2harrell
Jonathan Edwards @JonNichEdwards
@DongNguyeb You complain about people being impolite, but you should take a good look in the mirror.
Dong Nguyen @DongNguyeb
@JonNichEdwards Sorry, you need to read the whole of my post, including that quote. I already explained it, and you need to think.
Jonathan Edwards @JonNichEdwards
@DongNguyeb I don't quite understand what you mean by the ATE and RR being directly identified; isn't it the case that if a DAG allows us to identify (express using observed outcomes) the ATE, then we can also identify the OR?
Dong Nguyen @DongNguyeb
Oh, my statement is wrong and you think that’s too vague? My statement, simply put, is that in an RCT, randomization directly identifies the two quantities E[Y^1] and E[Y^0], and the risk ratio or ATE is simply a function of these two quantities. Consequently, identifying the RR or ATE is direct because it requires no distributional assumptions, no model specification, and no additional latent structure beyond what is guaranteed by the RCT design itself. Please note that the RR and ATE are not special because they are collapsible; rather, they are collapsible because they are functions of marginal means identified only by randomization in the RCT design. Now is it clear?
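A quick numerical check of the claim above (a sketch of my own, with made-up arm risks of 0.20 and 0.35): under randomization the two arm means estimate E[Y^1] and E[Y^0], and the ATE and RR are plug-in functions of those two numbers, with no model in sight:

```python
# Simulated RCT: randomization makes the arm means consistent for
# E[Y^1] and E[Y^0]; ATE and RR are plug-in functions of those two numbers.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
y0 = (rng.random(n) < 0.20).astype(float)   # potential outcome under control
y1 = (rng.random(n) < 0.35).astype(float)   # potential outcome under treatment
d = rng.integers(0, 2, n)                   # randomized assignment, independent of (y0, y1)
y = np.where(d == 1, y1, y0)                # consistency: we observe Y = Y^D

ey1_hat = y[d == 1].mean()                  # estimates E[Y^1]
ey0_hat = y[d == 0].mean()                  # estimates E[Y^0]
print(round(ey1_hat - ey0_hat, 2), round(ey1_hat / ey0_hat, 2))  # ATE ≈ 0.15, RR ≈ 1.75
```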
Jonathan Edwards @JonNichEdwards
@DongNguyeb I am not purposefully using "tactics" or trying to misrepresent... "ORs and HRs do not belong to the class of estimands whose causal meaning is guaranteed by randomization itself under the RCT design" because they require additional distributional or structural assumptions?
Dong Nguyen @DongNguyeb
You are still using the same tactic: you selectively quote my statements and shift the level of the argument. The issue I raised from the outset concerns RCT design: which estimands are guaranteed to be identified as causal by randomization alone. This is the identification layer. In contrast, your responses repeatedly focus on a different question: which estimators are unbiased when marginalizing from a model E[Y|D, X]. This belongs to the estimation layer. These two layers are distinct and cannot be conflated.
I knew that, and to separate these two layers, I deliberately used the phrase “even though…”. That is, I fully acknowledge that odds ratios and hazard ratios can be computed, can be statistically well-defined, and can be estimable (under an appropriate model). However, this does not change my core point: ORs and HRs do not belong to the class of estimands whose causal meaning is guaranteed by randomization itself under the RCT design.
But you insist on arguing at the estimation level, while my argument is about identification by RCT design. That is also why I do not see a basis for continuing this discussion; that is when I said “sorry…” and gave you a “thank you for the discussion”. But your reaction to my post was very crude and very impolite.
[two images attached]
Jonathan Edwards @JonNichEdwards
@DongNguyeb I am not trying to misrepresent. This is an interesting distinction you suggest, between estimands that can be estimated without a model and those that can't. Wouldn't allowing only the first type mean you are basically only allowing for linear structural models?
Dong Nguyen @DongNguyeb
Your second claim is a serious conceptual error: “Second, they are obviously derived from the rct data and they are obviously causal given that they depend on potential outcomes”. That does not make something an estimand which the RCT design is meant to identify. Not every causal quantity is a meaningful causal estimand under a given design.
The class of causal estimands that RCTs are constructed to identify are those for which randomization alone suffices, without additional distributional or structural assumptions, and which correspond to stable functionals of the potential outcome distribution guaranteed by the design. As you implicitly acknowledge in your first point, ORs and HRs are model-dependent. Precisely for that reason, they cannot be identified by randomization alone and therefore do not belong to the class of causal estimands that the RCT design guarantees. This is a design-based identification argument, not a mathematical critique of estimators.
Jonathan Edwards @JonNichEdwards
@DongNguyeb Simplest is to think about it this way: which of the issues you discuss are not present when estimating the risk difference using a logistic model?
Dong Nguyen @DongNguyeb
@JonNichEdwards It seems you have become a polite person. That’s good. I will reply to your arguments soon.