### Ian McCarthy | Emory University
### Workshop on Causal Inference with Panel Data

# Table of contents

1. [The Issue](#problem)
2. [Callaway and Sant'Anna](#cs)
3. [Sun and Abraham](#sa)
4. [de Chaisemartin and D'Haultfoeuille (2020)](#ch)
5. [Even more!](#more)
6. [All together now](#together)

# Revisiting the Issue

# Problem with TWFE

Recall the biggest issues with "standard" TWFE estimates:

- Best case: Variance-weighted ATT
- Biased with heterogeneous effects over time and differential timing
- Differential timing **alone** can introduce bias because already-treated units act as controls for later-treated groups
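To see why already-treated controls are a problem, consider a sketch in illustrative notation (not from the slides): an early group `\(e\)`, treated before period `\(t-1\)`, serves as the control for a late group `\(l\)` first treated in period `\(t\)`. Under parallel trends in untreated outcomes, the 2x2 comparison identifies

`$$E[\Delta y_{l,t}] - E[\Delta y_{e,t}] = ATT_{l}(t) - \left[ ATT_{e}(t) - ATT_{e}(t-1) \right]$$`

so it recovers the late group's effect only if the early group's effect does not change between `\(t-1\)` and `\(t\)`.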

"Heterogeneous" treatment effects should be the baseline

# Solution

Only consider "clean" comparisons:

- Separate event study for each treatment group vs never-treated
- Callaway and Sant'Anna (2020)
- Sun and Abraham (2020)
- de Chaisemartin and D'Haultfoeuille (2020)
- Cengiz et al. (2019), Gardner (2021), and Borusyak et al. (2021)

# Callaway and Sant'Anna

# CS Estimator

- "Manually" estimate group-specific treatment effects for each period
- Each estimate is propensity-score weighted
- Aggregate the treatment effect estimates (by time, group, or both)
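As a sketch of what gets estimated (notation assumed here, not from the slides: `\(G_{i}\)` is the period in which unit `\(i\)` is first treated, `\(C_{i} = 1\)` marks never-treated units, and `\(T\)` is the last period), the group-time effect for group `\(g\)` in period `\(t \geq g\)`, without covariates and with never-treated controls, is

`$$ATT(g,t) = E[y_{it} - y_{i,g-1} \mid G_{i} = g] - E[y_{it} - y_{i,g-1} \mid C_{i} = 1]$$`

With covariates, the comparison is reweighted using the propensity score. An event-study aggregation for `\(e\)` periods after treatment then averages over the groups observed at that horizon:

`$$\theta(e) = \sum_{g} \text{1}(g + e \leq T) \times \Pr(G_{i} = g \mid G_{i} + e \leq T) \times ATT(g, g+e)$$`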

# CS in Practice

.pull-left[
**Stata**<br>
```stata
* one-time installs
ssc install csdid
ssc install event_plot
ssc install drdid

* load data and build the outcome, unit id, and cohort variable
insheet using "", clear
gen perc_unins=uninsured/adult_pop
egen stategroup=group(state)
drop if expand_ever=="NA"
replace expand_year="0" if expand_year=="NA"
destring expand_year, replace

* group-time effects, then event-study aggregation and plot
csdid perc_unins, ivar(stategroup) time(year) gvar(expand_year) notyet
estat event, estore(cs)
event_plot cs, default_look graph_opt(xtitle("Periods since the event") ///
    ytitle("Average causal effect") xlabel(-6(1)4) ///
    title("Callaway and Sant'Anna (2020)")) stub_lag(T+#) stub_lead(T-#) together
```
]

.pull-right[
**R**<br>
```r
library(tidyverse)
library(did)
library(DRDID)

# `dat` is a placeholder object name; supply the data file path
dat <- read_tsv("")
reg.dat <- dat %>%
  filter(!is.na(expand_ever)) %>%   # drop states with missing expansion status
  mutate(perc_unins=uninsured/adult_pop,
         post = (year>=2014),
         treat=post*expand_ever,
         expand_year=ifelse(is.na(expand_year),0,expand_year)) %>%  # 0 = never treated
  filter(!is.na(perc_unins)) %>%    # drop rows with a missing outcome (assumed filter condition)
  group_by(State) %>%
  mutate(stategroup=cur_group_id()) %>% ungroup()

# group-time effects, then event-study aggregation and plot
mod.cs <- att_gt(yname="perc_unins", tname="year", idname="stategroup",
                 gname="expand_year",
                 data=reg.dat, panel=TRUE, est_method="dr",
                 allow_unbalanced_panel=TRUE)
mod.cs.event <- aggte(mod.cs, type="dynamic")
ggdid(mod.cs.event)
```
]

# Sun and Abraham

# Sun and Abraham

Considers event study with differential treatment timing:

- Problem: lead and lag coefficient estimates are potentially biased due to treatment/control group construction
- Solution: Estimate fully interacted model

`$$y_{it} = \gamma_{i} + \gamma_{t} + \sum_{g} \sum_{\tau \neq -1} \delta_{g \tau} \times \text{1}(i \in C_{g}) \times D_{it}^{\tau} + x_{it} + \epsilon_{it}$$`


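- `\(g\)` denotes a treatment group and `\(C_{g}\)` the set of individuals in group `\(g\)`
- `\(\tau\)` denotes time periods relative to treatment, with `\(\tau = -1\)` omitted as the reference period
- `\(D_{it}^{\tau}\)` is an indicator that individual `\(i\)` is `\(\tau\)` periods from treatment in period `\(t\)`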


- Intuition: Standard regression with different event study specifications for each treatment group
- Aggregate `\(\delta_{g\tau}\)` for standard event study coefficients and overall ATT
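A sketch of the aggregation step, with the weights written in assumed notation: each event-study coefficient is a weighted average of the group-specific estimates, using each group's share among units observed `\(\tau\)` periods from treatment,

`$$\hat{\delta}_{\tau} = \sum_{g} \hat{w}_{g\tau} \, \hat{\delta}_{g\tau}, \qquad \hat{w}_{g\tau} = \frac{N_{g\tau}}{\sum_{g'} N_{g'\tau}}$$`

where `\(N_{g\tau}\)` is the number of units in group `\(g\)` observed at relative period `\(\tau\)`.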

# Sun and Abraham in Practice

.pull-left[
**Stata**<br>
```stata
* one-time installs
ssc install eventstudyinteract
ssc install avar
ssc install event_plot

* load data and build event time
insheet using "", clear
gen perc_unins=uninsured/adult_pop
drop if expand_ever=="NA"
egen stategroup=group(state)
replace expand_year="." if expand_year=="NA"
destring expand_year, replace
gen event_time=year-expand_year
gen nevertreated=(event_time==.)

* lag and lead indicators (F1event is the omitted reference period)
forvalues l = 0/4 {
    gen L`l'event = (event_time==`l')
}
forvalues l = 1/2 {
    gen F`l'event = (event_time==-`l')
}
gen F3event=(event_time<=-3)

eventstudyinteract perc_unins F3event F2event L0event L1event L2event L3event L4event, ///
    vce(cluster stategroup) absorb(stategroup year) cohort(expand_year) control_cohort(nevertreated)
event_plot e(b_iw)#e(V_iw), default_look graph_opt(xtitle("Periods since the event") ///
    ytitle("Average causal effect") xlabel(-3(1)4) title("Sun and Abraham (2020)")) ///
    stub_lag(L#event) stub_lead(F#event) plottype(scatter) ciplottype(rcap) together
```
]

.pull-right[
**R**<br>
```r
library(tidyverse)
library(modelsummary)
library(fixest)

# `dat` and `mod.sa` are placeholder object names; supply the data file path
dat <- read_tsv("")
reg.dat <- dat %>%
  filter(!is.na(expand_ever)) %>%   # drop states with missing expansion status
  mutate(perc_unins=uninsured/adult_pop,
         post = (year>=2014),
         treat=post*expand_ever,
         # never-treated states get a cohort year outside the sample
         expand_year = ifelse(expand_ever==FALSE, 10000, expand_year),
         time_to_treat = ifelse(expand_ever==FALSE, -1, year-expand_year),
         # bin leads earlier than -3 into -3
         time_to_treat = ifelse(time_to_treat < -3, -3, time_to_treat))

mod.sa <- feols(perc_unins~sunab(expand_year, time_to_treat) | State + year,
                cluster=~State,
                data=reg.dat)
iplot(mod.sa, xlab = 'Time to treatment', main = 'Event study')
```
]

# de Chaisemartin and D'Haultfoeuille (CH)

# CH

- More general than other approaches
- Considers "fuzzy" treatment (i.e., non-discrete treatment)
- Considers fixed effects and first-differencing

New paper from Callaway, Goodman-Bacon, and Sant'Anna also looks at DD with continuous treatment

# CH Approach

- Essentially a series of 2x2 comparisons
- Aggregates up to overall effects
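One such 2x2 comparison can be sketched as follows, assuming `\(\text{treat}_{it}\)` denotes a binary treatment status (notation not from the slides). For units switching into treatment in period `\(t\)`, compared with units that remain untreated:

`$$DID_{+,t} = E[y_{it} - y_{i,t-1} \mid \text{treat}_{i,t-1} = 0, \text{treat}_{it} = 1] - E[y_{it} - y_{i,t-1} \mid \text{treat}_{i,t-1} = 0, \text{treat}_{it} = 0]$$`

The overall effect is then a weighted average of these period-specific comparisons, with weights proportional to the number of switchers in each period.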

# CH in Practice

.pull-left[
**Stata**<br>
```stata
* one-time installs
ssc install did_multiplegt
ssc install event_plot

* load data and build the treatment indicator
insheet using "", clear
gen perc_unins=uninsured/adult_pop
drop if expand_ever=="NA"
egen stategroup=group(state)
replace expand_year="." if expand_year=="NA"
destring expand_year, replace
gen event_time=year-expand_year
gen nevertreated=(event_time==.)
gen treat=(event_time>=0 & event_time!=.)

did_multiplegt perc_unins stategroup year treat, robust_dynamic dynamic(4) ///
    placebo(3) breps(100) cluster(stategroup)
event_plot e(estimates)#e(variances), default_look graph_opt(xtitle("Periods since the event") ///
    ytitle("Average causal effect") ///
    title("de Chaisemartin and D'Haultfoeuille (2020)") xlabel(-3(1)4)) ///
    stub_lag(Effect_#) stub_lead(Placebo_#) together
```
]

.pull-right[
**R** (not the same as in **Stata**)<br>
```r
library(tidyverse)
library(DIDmultiplegt)

# `dat` and `mod.ch` are placeholder object names; supply the data file path
dat <- read_tsv("")
reg.dat <- dat %>%
  filter(!is.na(expand_ever)) %>%   # drop states with missing expansion status
  mutate(perc_unins=uninsured/adult_pop,
         # treated from the expansion year onward, matching the Stata definition of treat
         treat=case_when(
           expand_ever==FALSE ~ 0,
           expand_ever==TRUE & expand_year>year ~ 0,
           expand_ever==TRUE & expand_year<=year ~ 1))

mod.ch <- did_multiplegt(df=reg.dat, Y="perc_unins", G="State", T="year", D="treat",
                         placebo=3, dynamic=4, brep=50, cluster="State")
```
]

# And even more!

# Cengiz et al. (2019)

- "Stacked" event studies
- Estimate event study for every treatment group, using never-treated as controls
- Aggregate to overall average effects

.pull-left[
**Stata**<br>
`stackedev`
]

.pull-right[
**R**<br>
`#nothing yet`
]
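One common way to write the stacked specification, sketched here in notation that is an assumption rather than taken from the slides: build one sub-dataset `\(s\)` per treatment group (that group plus the never-treated controls), stack the sub-datasets, and estimate

`$$y_{its} = \gamma_{is} + \gamma_{ts} + \sum_{\tau \neq -1} \delta_{\tau} D_{its}^{\tau} + \epsilon_{its}$$`

The fixed effects are specific to each stacked sub-dataset, so every comparison is made within a stack.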

# Gardner (2021)

- "Remove" fixed effects via first stage regression only among untreated observations
- Predict FE from first stage and residualize the outcome
- Run standard event study specification on residualized outcome variable

.pull-left[
**Stata**<br>
`did2s`
]

.pull-right[
**R**<br>
`did2s`
]
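A rough sketch of the two stages, reusing the fixed effects and relative-time indicators from the Sun and Abraham specification and assuming `\(\text{treat}_{it}\)` denotes treatment status. Stage 1 is estimated only on observations with `\(\text{treat}_{it} = 0\)`:

`$$y_{it} = \gamma_{i} + \gamma_{t} + \epsilon_{it}$$`

Stage 2 uses all observations, with the estimated fixed effects removed from the outcome:

`$$y_{it} - \hat{\gamma}_{i} - \hat{\gamma}_{t} = \sum_{\tau \neq -1} \delta_{\tau} D_{it}^{\tau} + u_{it}$$`

Standard errors in the second stage should account for the first-stage estimates, which the `did2s` routines are designed to handle.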

# Borusyak et al. (2021)

- Imputation approach
- Estimate regression only for untreated observations
- Predict the untreated outcome for the treated observations and take the difference from the observed outcome
- Aggregate differences to form overall weighted average effect

.pull-left[
**Stata**<br>
`did_imputation`
]

.pull-right[
**R**<br>
`did2s`
]
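The imputation steps can be sketched by hand in base R. This is only an illustration on simulated data, not the package implementation: the data-generating process, the variable names (`dat`, `g`, `treated`, `y0_hat`, `att_hat`), and the constant true effect of 2 are all assumptions made for the example, and the simple average at the end is just one possible weighting.

```r
set.seed(1)

# Hypothetical simulated panel: 30 units, 6 periods, staggered adoption
n_units <- 30
n_periods <- 6
dat <- expand.grid(id = 1:n_units, year = 1:n_periods)

# Cohorts: first treated in year 3, year 5, or never (Inf)
cohort <- rep(c(3, 5, Inf), each = n_units / 3)
dat$g <- cohort[dat$id]
dat$treated <- as.numeric(dat$year >= dat$g)

# Outcome = unit effect + year effect + constant treatment effect + noise
unit_fe <- rnorm(n_units)
year_fe <- 0.5 * (1:n_periods)
true_effect <- 2
dat$y <- unit_fe[dat$id] + year_fe[dat$year] +
  true_effect * dat$treated + rnorm(nrow(dat), sd = 0.1)

# Step 1: estimate unit and year fixed effects on untreated observations only
first_stage <- lm(y ~ factor(id) + factor(year),
                  data = subset(dat, treated == 0))

# Step 2: predict the untreated outcome for every observation
dat$y0_hat <- predict(first_stage, newdata = dat)

# Step 3: average observed-minus-imputed differences among treated observations
att_hat <- mean((dat$y - dat$y0_hat)[dat$treated == 1])
att_hat
```

Because every unit has at least one untreated period and the never-treated units cover every year, all fixed effects are estimable from the untreated sample. In real applications the dedicated packages also provide valid standard errors, which this sketch omits.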

# Putting things together

# Seems like lots of "solutions"

- Callaway and Sant'Anna (2020)
- Sun and Abraham (2020)
- de Chaisemartin and D'Haultfoeuille (2020)
- Cengiz et al. (2019), Gardner (2021), and Borusyak et al. (2021)

--

Goodman-Bacon (2021) explores the problems but doesn't really propose a solution (still very important work though!)

---

# Comparison

.pull-left[
**Similarities**<br>
- Focus on clean treatment/control
- Focus on event study framework (not a single overall effect)
- Impose some form of parallel trends assumption
]

.pull-right[
**Differences**<br>
- What are the control units?
- How to include covariates?
]

---

# State of current work

- Careful consideration of treatment timing and control group(s)
- `panelView` package is great here!
- Implement 2 or more approaches