New DD Estimators

class: center, middle, inverse, title-slide

# New DD Estimators
## <html>
<div style="float:left">

</div>
<hr color='#EB811B' size=1px width=0px>
</html>
### Ian McCarthy | Emory University
### Workshop on Causal Inference with Panel Data

---

<style type="text/css">
.remark-slide-content {
    font-size: 30px;
    padding: 1em 2em 1em 2em;    
}
.remark-code {
  font-size: 15px;
}
.remark-inline-code { 
    font-size: 20px;
}
</style>

# Table of contents

1. [The Issue](#problem)
2. [Callaway and Sant'Anna](#cs)
3. [Sun and Abraham](#sa)
4. [de Chaisemartin and D'Haultfoeuille (2020)](#ch)
5. [Even more!](#more)
6. [All together now](#together)

---
class: inverse, center, middle
name: problem

# Revisiting the Issue

---
# Problem with TWFE

Recall the biggest issues with "standard" TWFE estimates:

- Best case: Variance-weighted ATT
- Biased with heterogeneous effects over time and differential timing
- Differential timing **alone** can introduce bias because already treated act as controls for later treated groups

--
"Heterogeneous" treatment effects should be the baseline

---
# Solution

Only consider "clean" comparisons:

--
- Separate event study for each treatment group vs never-treated
- Callaway and Sant'Anna (2020)
- Sun and Abraham (2020)
- de Chaisemartin and D'Haultfoeuille (2020)
- Cengiz et al. (2019), Gardner (2021), and Borusyak et al. (2021)

---
class: inverse, center, middle
name: cs

# Callaway and Sant'Anna

---
# CS Estimator

- "Manually" estimate group-specific treatment effects for each period
- Each estimate is propensity-score weighted
- Aggregate the treatment effect estimates (by time, group, or both)

---
# CS in Practice

.pull-left[
**Stata**<br>

```stata
ssc install csdid
ssc install event_plot
ssc install drdid

insheet using "https://raw.githubusercontent.com/imccart/202108-cdc-methodsworkshop/master/data/medicaid-expansion/mcaid-expand-data.txt", clear
gen perc_unins=uninsured/adult_pop
egen stategroup=group(state)
drop if expand_ever=="NA"
replace expand_year="0" if expand_year=="NA"
destring expand_year, replace

csdid perc_unins, ivar(stategroup) time(year) gvar(expand_year) notyet
estat event, estore(cs)
event_plot cs, default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") xlabel(-6(1)4) title("Callaway and Sant'Anna (2020)")) stub_lag(T+#) stub_lead(T-#) together
```
]

.pull-left[
**R**<br>

```r
library(tidyverse)
library(did)
library(DRDID)
mcaid.data <- read_tsv("https://raw.githubusercontent.com/imccart/202108-cdc-methodsworkshop/master/data/medicaid-expansion/mcaid-expand-data.txt")
reg.dat <- mcaid.data %>% 
  filter(!is.na(expand_ever)) %>%
  mutate(perc_unins=uninsured/adult_pop,
         post = (year>=2014), 
         treat=post*expand_ever,
         expand_year=ifelse(is.na(expand_year),0,expand_year)) %>%
  filter(!is.na(perc_unins)) %>%
  group_by(State) %>%
  mutate(stategroup=cur_group_id()) %>% ungroup()

mod.cs <- att_gt(yname="perc_unins", tname="year", idname="stategroup",
                 gname="expand_year",
                 data=reg.dat, panel=TRUE, est_method="dr",
                 allow_unbalanced_panel=TRUE)
mod.cs.event <- aggte(mod.cs, type="dynamic")
ggdid(mod.cs.event)
```
]

---
class: inverse, center, middle
name: sa

# Sun and Abraham

---
# Sun and Abraham

Considers event study with differential treatment timing:

- Problem: lead and lag coefficient estimates are potentially biased due to treatment/control group construction
- Solution: Estimate fully interacted model

--
`$$y_{it} = \gamma_{i} + \gamma_{t} + \sum_{g} \sum_{\tau \neq -1} \delta_{g \tau} \times \text{1}(i \in C_{g}) \times D_{it}^{\tau} + x_{it} + \epsilon_{it}$$`

---
count: false

# Sun and Abraham

`$$y_{it} = \gamma_{i} + \gamma_{t} + \sum_{g} \sum_{\tau \neq -1} \delta_{g \tau} \times \text{1}(i \in C_{g}) \times D_{it}^{\tau} + x_{it} + \epsilon_{it}$$`

--
- `$g$` denotes a group and `$C_{g}$` the set of individuals in group `$g$`
- `$\tau$` denotes time periods
- `$D_{it}^{\tau}$` denotes a relative time indicator

---
count: false

# Sun and Abraham

`$$y_{it} = \gamma_{i} + \gamma_{t} + \sum_{g} \sum_{\tau \neq -1} \delta_{g \tau} \times \text{1}(i \in C_{g}) \times D_{it}^{\tau} + x_{it} + \epsilon_{it}$$`

--
- Intuition: Standard regression with different event study specifications for each treatment group
- Aggregate `$\delta_{g\tau}$` for standard event study coefficients and overall ATT

---
# Sun and Abraham in Practice

.pull-left[
**Stata**<br>

```stata
ssc install eventstudyinteract
ssc install avar
ssc install event_plot

insheet using "https://raw.githubusercontent.com/imccart/202108-cdc-methodsworkshop/master/data/medicaid-expansion/mcaid-expand-data.txt", clear
gen perc_unins=uninsured/adult_pop
drop if expand_ever=="NA"
egen stategroup=group(state)
replace expand_year="." if expand_year=="NA"
destring expand_year, replace
gen event_time=year-expand_year
gen nevertreated=(event_time==.)

forvalues l = 0/4 {
	gen L`l'event = (event_time==`l')
}
forvalues l = 1/2 {
	gen F`l'event = (event_time==-`l')
}
gen F3event=(event_time<=-3)
eventstudyinteract perc_unins F3event F2event L0event L1event L2event L3event L4event, vce(cluster stategroup) absorb(stategroup year) cohort(expand_year) control_cohort(nevertreated)

event_plot e(b_iw)#e(V_iw), default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") xlabel(-3(1)4)	title("Sun and Abraham (2020)")) stub_lag(L#event) stub_lead(F#event) plottype(scatter) ciplottype(rcap) together
```
]

.pull-left[
**R**<br>

```r
library(tidyverse)
library(modelsummary)
library(fixest)
mcaid.data <- read_tsv("https://raw.githubusercontent.com/imccart/202108-cdc-methodsworkshop/master/data/medicaid-expansion/mcaid-expand-data.txt")
reg.dat <- mcaid.data %>% 
  filter(!is.na(expand_ever)) %>%
  mutate(perc_unins=uninsured/adult_pop,
         post = (year>=2014), 
         treat=post*expand_ever,
         expand_year = ifelse(expand_ever==FALSE, 10000, expand_year),
         time_to_treat = ifelse(expand_ever==FALSE, -1, year-expand_year),
         time_to_treat = ifelse(time_to_treat < -3, -3, time_to_treat))

mod.sa <- feols(perc_unins~sunab(expand_year, time_to_treat) | State + year,
                  cluster=~State,
                  data=reg.dat)
iplot(mod.sa,
      xlab = 'Time to treatment',
      main = 'Event study')
```
]

---
class: inverse, center, middle
name: ch

# de Chaisemartin and D'Haultfoeuille (CH)

---
# CH

- More general than other approaches
- Considers "fuzzy" treatment (i.e., non-discrete treatment)
- Considers fixed effects and first-differencing

--
New paper from Callaway, Goodman-Bacon, and Sant'Anna also looks at DD with continuous treatment

---
# CH Approach

- Essentially a series of 2x2 comparisons
- Aggregates up to overall effects

---
# CH in Practice

.pull-left[
**Stata**<br>

```stata
ssc install did_multiplegt
ssc install event_plot

insheet using "https://raw.githubusercontent.com/imccart/202108-cdc-methodsworkshop/master/data/medicaid-expansion/mcaid-expand-data.txt", clear
gen perc_unins=uninsured/adult_pop
drop if expand_ever=="NA"
egen stategroup=group(state)
replace expand_year="." if expand_year=="NA"
destring expand_year, replace
gen event_time=year-expand_year
gen nevertreated=(event_time==.)
gen treat=(event_time>=0 & event_time!=.)

did_multiplegt perc_unins stategroup year treat, robust_dynamic dynamic(4) placebo(3) breps(100) cluster(stategroup) 
event_plot e(estimates)#e(variances), default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") ///
title("de Chaisemartin and D'Haultfoeuille (2020)") xlabel(-3(1)4)) stub_lag(Effect_#) stub_lead(Placebo_#) together
```
]

.pull-left[
**R**(not the same as in **Stata**)<br>

```r
library(DIDmultiplegt)
mcaid.data <- read_tsv("https://raw.githubusercontent.com/imccart/202108-cdc-methodsworkshop/master/data/medicaid-expansion/mcaid-expand-data.txt")
reg.dat <- as.data.table(mcaid.data) %>% 
  filter(!is.na(expand_ever)) %>%
  mutate(perc_unins=uninsured/adult_pop,
         treat=case_when(
           expand_ever==FALSE ~ 0,
           expand_ever==TRUE & expand_year<year ~ 0,
           expand_ever==TRUE & expand_year>=year ~ 1))

mod.ch <- did_multiplegt(df=reg.dat, Y="perc_unins", G="State", T="year", D="treat",
                         placebo=3, dynamic=4, brep=50, cluster="State")
```
]

---
class: inverse, center, middle
name: others

# And even more!

---
# Cengiz et al. (2019)

- "Stacked" event studies
- Estimate event study for every treatment group, using never-treated as controls
- Aggregate to overall average effects

--
.pull-left[
**Stata**<br>
`stackdev`
]

.pull-right[
**R**<br>
`#nothing yet`
]

---
# Gardner (2021)

- "Remove" fixed effects via first stage regression only among non-treated units
- Predict FE from first stage and residualize the outcome
- Run standard event study specification on residualized outcome variable

--
.pull-left[
**Stata**<br>
`did2s`
]

.pull-right[
**R**<br>
`did2s`
]

---
# Borusyak et al. (2021)

- Imputation approach
- Estimate regression only for untreated observations
- Predicted untreated outcome among the treated observations and take the difference
- Aggregate differences to form overall weighted average effect

--
.pull-left[
**Stata**<br>
`did_imputation`
]

.pull-right[
**R**<br>
`did2s`
]

---
class: inverse, center, middle
name: together

# Putting things together

---
# Seems like lots of "solutions"

- Callaway and Sant'Anna (2020)
- Sun and Abraham (2020)
- de Chaisemartin and D'Haultfoeuille (2020)
- Cengiz et al (2019), Gardner (2021), and Borusyak et al. (2021)

--
Goodman-Bacon (2021) explores the problems but doesn't really propose a solution (still very important work though!)

---
# Comparison

.pull-left[
**Similarities**<br>
- Focus on clean treatment/control
- Focus on event study framework (not a single overall effect)
- Impose some form of parallel trends assumption
]

.pull-right[
**Differences**<br>
- What are the control units?
- How to include covariates?
]

---
# State of current work

- Careful consideration of treatment timing and control group(s)
- `panelView` package is great here!
- Implement 2 or more approaches