Causal Inference

Last updated: December 2024

Disclaimer: These are my personal notes compiled for my own reference and learning. They may contain errors, incomplete information, or personal interpretations. While I strive for accuracy, these notes are not peer-reviewed and should not be considered authoritative sources. Please consult official textbooks, research papers, or other reliable sources for academic or professional purposes.

1. Potential Outcomes Framework

For each unit $i$, we have potential outcomes:

$$Y_i(1): \text{outcome if treated}$$

$$Y_i(0): \text{outcome if not treated}$$

The individual treatment effect is:

$$\tau_i = Y_i(1) - Y_i(0)$$

2. Average Treatment Effect (ATE)

The average treatment effect is:

$$\tau = E[Y_i(1) - Y_i(0)] = E[Y_i(1)] - E[Y_i(0)]$$

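For each unit we observe only one potential outcome, $Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0)$, where $D_i \in \{0, 1\}$ is the treatment indicator. The individual effect $\tau_i$ is therefore never observed directly (the fundamental problem of causal inference).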
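A minimal sketch in base R of what the fundamental problem looks like in data. The six units and all their values are made up for illustration, and `d` plays the role of the treatment indicator $D_i$ (1 if treated, 0 if not). Both potential outcomes are visible only because this is a simulation.

```r
# Simulated potential outcomes for six hypothetical units (values are made up)
y0 <- c(3, 5, 4, 6, 2, 7)          # Y_i(0): outcome if not treated
y1 <- c(5, 6, 7, 6, 5, 9)          # Y_i(1): outcome if treated
d  <- c(1, 0, 1, 0, 0, 1)          # D_i: treatment actually received

tau_i <- y1 - y0                   # individual effects, never observable in practice
ate   <- mean(tau_i)               # true ATE, computable only inside a simulation

# Observed outcome: Y_i = D_i * Y_i(1) + (1 - D_i) * Y_i(0)
y_obs <- d * y1 + (1 - d) * y0

# What the analyst actually sees: one potential outcome per unit, the other is NA
observed <- data.frame(
  d  = d,
  y  = y_obs,
  y1 = ifelse(d == 1, y1, NA),
  y0 = ifelse(d == 0, y0, NA)
)
print(observed)
```

Every row of `observed` has exactly one of `y1` and `y0` missing, which is why the later sections need assumptions or design features to recover averages of $\tau_i$.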

3. Assumptions for Causal Inference

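Stable Unit Treatment Value Assumption (SUTVA): No interference between units (one unit's outcome does not depend on other units' treatments) and no hidden versions of the treatment
Ignorability (unconfoundedness): $(Y_i(1), Y_i(0)) \perp\!\!\!\perp D_i | X_i$, where $D_i$ is the treatment indicator and $X_i$ are observed covariates
Overlap: $0 < P(D_i = 1|X_i) < 1$ for all $X_i$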
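Of the three assumptions, overlap is the one that can be inspected directly in data. A rough sketch in base R, assuming a single discrete covariate; the vectors `x` and `d` are hypothetical values invented for the example.

```r
# Hypothetical data: a discrete covariate x and a binary treatment d
x <- c("a", "a", "a", "b", "b", "b", "c", "c", "c")
d <- c( 1,   0,   1,   0,   1,   0,   1,   1,   1 )

# Empirical P(D = 1 | X = x) within each stratum of x
p_hat <- tapply(d, x, mean)
print(p_hat)

# Overlap fails empirically in strata where everyone or no one is treated
violations <- names(p_hat)[p_hat == 0 | p_hat == 1]
print(violations)
```

Here stratum `"c"` contains only treated units, so there is no control outcome to compare against at that value of the covariate. With continuous covariates the same idea is usually applied to an estimated propensity score rather than to raw strata.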

4. Randomized Experiments

In randomized experiments, treatment assignment is independent of potential outcomes:

$$(Y_i(1), Y_i(0)) \perp\!\!\!\perp D_i$$

The simple difference-in-means estimator is then unbiased for the ATE, with $n_1$ treated and $n_0$ control units:

$$\hat{\tau} = \frac{1}{n_1}\sum_{i:D_i=1} Y_i - \frac{1}{n_0}\sum_{i:D_i=0} Y_i$$
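A small simulation sketch of the difference-in-means estimator in base R. The sample size, the seed, and the constant effect of 2 are assumptions chosen for the illustration, not values from any real study.

```r
set.seed(1)                                  # arbitrary seed for reproducibility
n  <- 10000
y0 <- rnorm(n, mean = 1)                     # Y_i(0)
y1 <- y0 + 2                                 # Y_i(1); constant effect of 2 (assumed)
d  <- rbinom(n, size = 1, prob = 0.5)        # randomized assignment
y  <- d * y1 + (1 - d) * y0                  # observed outcome

# Difference in means between treated and control units
tau_hat <- mean(y[d == 1]) - mean(y[d == 0])

# Neyman-style standard error: sqrt(s1^2 / n1 + s0^2 / n0)
se_hat <- sqrt(var(y[d == 1]) / sum(d == 1) + var(y[d == 0]) / sum(d == 0))

cat("estimate:", tau_hat, " se:", se_hat, "\n")
```

Because `d` is drawn independently of `y0` and `y1`, the estimate should land close to the assumed effect of 2, up to sampling noise of roughly the size of `se_hat`.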

5. Instrumental Variables (IV)

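When treatment is endogenous (correlated with unobserved determinants of the outcome), we need an instrument $Z_i$ that satisfies:

Relevance: $Cov(Z_i, D_i) \neq 0$
Exclusion: $Z_i$ affects $Y_i$ only through $D_i$
Exogeneity: $Z_i \perp\!\!\!\perp \epsilon_i$, where $\epsilon_i$ is the error term in the outcome equation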

The IV estimator is the ratio of the sample covariances:

$$\hat{\tau}_{IV} = \frac{Cov(Z_i, Y_i)}{Cov(Z_i, D_i)}$$
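The covariance ratio can be computed by hand in base R. The sketch below uses a simulated data-generating process of my own choosing (an unobserved confounder `u`, a binary instrument `z`, and an assumed true effect of 1.5) to contrast it with the naive regression slope.

```r
set.seed(2)                            # arbitrary seed
n <- 20000
u <- rnorm(n)                          # unobserved confounder
z <- rbinom(n, 1, 0.5)                 # instrument, drawn independently of u
d <- 0.8 * z + 0.5 * u + rnorm(n)      # endogenous treatment (continuous here)
y <- 1.5 * d + u + rnorm(n)            # true effect of d on y is 1.5 (assumed)

tau_ols <- cov(d, y) / var(d)          # naive regression slope, biased upward by u
tau_iv  <- cov(z, y) / cov(z, d)       # sample analogue of Cov(Z, Y) / Cov(Z, D)

cat("OLS:", tau_ols, " IV:", tau_iv, "\n")
```

In this setup the naive slope absorbs the effect of `u`, while the IV ratio should be close to 1.5. The IV estimate is noisier, and the noise grows as the instrument gets weaker (as `cov(z, d)` approaches zero).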

6. Regression Discontinuity Design (RDD)

In a sharp design, treatment is a deterministic function of a running variable $X_i$ and a cutoff $c$:

$$D_i = \mathbf{1}\{X_i \geq c\}$$

If $E[Y_i(1)|X_i = x]$ and $E[Y_i(0)|X_i = x]$ are continuous in $x$ at $c$, the treatment effect at the cutoff is:

$$\tau = \lim_{x \downarrow c} E[Y_i|X_i = x] - \lim_{x \uparrow c} E[Y_i|X_i = x]$$
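One simple way to approximate the two one-sided limits is a local linear regression inside a window around the cutoff. The sketch below is base R only; the simulated data, the jump of 2, and the hand-picked bandwidth `h` are all assumptions for illustration, and real applications would use a data-driven bandwidth.

```r
set.seed(3)                                     # arbitrary seed
n      <- 5000
cutoff <- 0
x <- runif(n, -1, 1)                            # running variable
d <- as.numeric(x >= cutoff)                    # sharp assignment rule
y <- 1 + 0.5 * x + 2 * d + rnorm(n, sd = 0.5)   # jump of 2 at the cutoff (assumed)

h      <- 0.25                                  # bandwidth, picked by hand here
window <- abs(x - cutoff) <= h                  # keep observations near the cutoff
xc     <- x - cutoff                            # centre the running variable

# d * xc fits a separate line on each side of the cutoff;
# the coefficient on d is the estimated jump at xc = 0
fit     <- lm(y ~ d * xc, subset = window)
tau_hat <- unname(coef(fit)["d"])

cat("estimated jump at the cutoff:", tau_hat, "\n")
```

Centring at the cutoff is what makes the coefficient on `d` equal to the difference between the two fitted lines exactly at $x = c$.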

7. Difference-in-Differences (DiD)

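For panel data with a treatment group ($D_i = 1$) and a control group ($D_i = 0$) observed before ($t = 0$) and after ($t = 1$) treatment, with $Y_{it}$ the outcome of unit $i$ in period $t$:

$$\tau = (E[Y_{i1}|D_i=1] - E[Y_{i0}|D_i=1]) - (E[Y_{i1}|D_i=0] - E[Y_{i0}|D_i=0])$$

Key assumption: parallel trends. In the absence of treatment, the average outcome of the treated group would have changed by the same amount as that of the control group. Under this assumption $\tau$ is the average effect on the treated group.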
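A worked 2x2 example in base R. The tiny panel below is hypothetical (four units, two periods, made-up outcomes); it computes the double difference from the four cell means and then recovers the same number from a regression with an interaction term.

```r
# Hypothetical 2x2 panel: four units observed before (post = 0) and after (post = 1)
panel <- data.frame(
  id      = rep(1:4, each = 2),
  treated = rep(c(1, 1, 0, 0), each = 2),
  post    = rep(c(0, 1), times = 4),
  y       = c(10, 15,  12, 17,  8, 10,  9, 11)
)

# Mean outcome in each group-by-period cell
cell <- with(panel, tapply(y, list(treated = treated, post = post), mean))
print(cell)

# (change in treated group) - (change in control group)
tau_did <- (cell["1", "1"] - cell["1", "0"]) - (cell["0", "1"] - cell["0", "0"])

# The same number is the interaction coefficient in a saturated regression
fit <- lm(y ~ treated * post, data = panel)
print(coef(fit)["treated:post"])
```

The treated group moves from 11 to 16 and the control group from 8.5 to 10.5, so the double difference is 5 - 2 = 3, which matches the `treated:post` coefficient.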

8. Matching Methods

Match treated units with similar control units based on covariates $X_i$:

Exact matching: Match on exact values of $X_i$
Propensity score matching: Match on $P(D_i=1|X_i)$
Nearest neighbor matching: Match each treated unit to the closest control unit(s) under a distance in $X_i$ space (e.g. Euclidean or Mahalanobis)
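A bare-bones sketch of nearest neighbor matching in base R, assuming a single covariate, absolute distance, and matching with replacement. The seven data points are invented for the example; with several covariates a proper distance metric and a dedicated matching package would be the more usual route.

```r
# Hypothetical data: one covariate x, treatment d, outcome y
x <- c(1.0, 2.0, 3.0, 1.1, 2.2, 2.9, 5.0)
d <- c(1,   1,   1,   0,   0,   0,   0)
y <- c(6,   8,   10,  4,   5,   7,   12)

treated  <- which(d == 1)
controls <- which(d == 0)

# For each treated unit, pick the control with the smallest |x_i - x_j|
# (with replacement; ties go to the first control found)
match_idx <- sapply(treated, function(i) {
  controls[which.min(abs(x[controls] - x[i]))]
})

# Average treated-minus-matched-control difference:
# an estimate of the average effect among the treated units
att_hat <- mean(y[treated] - y[match_idx])

print(data.frame(treated = treated, matched_control = match_idx))
cat("matching estimate:", att_hat, "\n")
```

The control at `x = 5.0` is never used because no treated unit is near it, which is the sense in which matching discards poorly comparable controls.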

9. Propensity Score

The propensity score is $p(X_i) = P(D_i = 1|X_i)$. Under ignorability:

$$(Y_i(1), Y_i(0)) \perp\!\!\!\perp D_i | p(X_i)$$

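This allows us to control for high-dimensional $X_i$ by controlling for the scalar $p(X_i)$. In practice $p(X_i)$ is unknown and has to be estimated, for example with a logistic regression of $D_i$ on $X_i$.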
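One way to "control for the scalar" is subclassification: estimate the score, cut it into strata, and average the within-stratum differences in means. The base R sketch below is my own illustration of that idea; the simulated data, the logistic model, the ten strata, and the true effect of 2 are all assumptions.

```r
set.seed(4)                                   # arbitrary seed
n <- 20000
x <- rnorm(n)                                 # a single confounder
d <- rbinom(n, 1, plogis(0.5 * x))            # treatment more likely when x is high
y <- 1 + 2 * d + x + rnorm(n)                 # true effect of 2 (assumed)

naive <- mean(y[d == 1]) - mean(y[d == 0])    # confounded by x

# Estimate the propensity score by logistic regression
ps_fit <- glm(d ~ x, family = binomial)
p_hat  <- fitted(ps_fit)

# Cut the estimated score into ten equal-sized strata
breaks <- quantile(p_hat, probs = seq(0, 1, by = 0.1))
strata <- cut(p_hat, breaks = breaks, include.lowest = TRUE)

# Difference in means within each stratum
diffs <- sapply(split(data.frame(y, d), strata), function(s) {
  mean(s$y[s$d == 1]) - mean(s$y[s$d == 0])
})

# Weight each stratum by its share of the sample
weights   <- as.numeric(table(strata)) / n
tau_strat <- sum(diffs * weights)

cat("naive:", naive, " stratified:", tau_strat, "\n")
```

The naive comparison overstates the effect because treated units have higher `x` on average. Stratifying on the estimated score removes most of that gap, though a small residual bias remains within strata because the score still varies inside each one.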

10. Code Example

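# R code for causal inference
# mydata and panel_data are placeholder data frames; y is the outcome.
library(AER)   # ivreg()
library(rdd)   # RDestimate()
library(plm)   # plm()

# Instrumental Variables: x is the endogenous treatment, z the instrument
iv_model <- ivreg(y ~ x | z, data = mydata)
summary(iv_model)

# Regression Discontinuity: x is the running variable, cutoff at 0
rd_model <- RDestimate(y ~ x, data = mydata, cutpoint = 0)
summary(rd_model)

# Difference-in-Differences with unit fixed effects ("within" estimator).
# The time-invariant main effect of treated is absorbed by the fixed effects,
# so only post and the interaction enter; treated:post is the DiD estimate.
# "id" and "time" are assumed names for the unit and period columns.
did_model <- plm(y ~ post + treated:post,
                 data = panel_data, index = c("id", "time"), model = "within")
summary(did_model)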
