Gov 2002: 1. Intro & Potential Outcomes Matthew Blackwell September 3, 2015
Welcome!
• Me: Matthew Blackwell, Assistant Professor in the Government Department
• What I study: causal inference, missing data, American politics, slavery, and so on.
• Your TF: Stephen Pettigrew, PhD Candidate in Gov.
• What he studies: Bayesian statistics, machine learning, American politics, sports analytics.
Goals
1. Be able to understand and use recent advances in causal inference
2. Be able to diagnose problems and understand assumptions of causal inference
3. Be able to understand almost all causal inference in applied political science
4. Provide you with enough understanding to learn more on your own
5. Get you as excited about methods as we are
Prereqs
• Biggest: clear eyes, full hearts, aka willingness to work hard.
• Working assumption is that you have taken Gov 2000 and 2001 or the equivalent.
• Basically, you vaguely still understand what this is: $(X'X)^{-1}X'y$
• And these terms are familiar to you:
▶ bias
▶ consistency
▶ null hypothesis
▶ homoskedastic
▶ parametric model
▶ $\sigma$-algebras (just kidding)
R for computing
• It’s free
• It’s becoming the de facto standard in many applied statistical fields
• It’s extremely powerful, but relatively simple to do basic stats
• Compared to other options (Stata, SPSS, etc) you’ll be more free to implement what you need (as opposed to what Stata thinks is best)
• Will use it in lectures, much more help with it in sections
Teaching resources
• Lecture (where we will cover the broad topics)
• Sections (where you will get more specific, targeted help on assignments)
• Canvas site (where you’ll find the syllabus, assignments, and where you can ask questions and discuss topics with us and your classmates)
• Office hours (where you can ask even more questions)
Textbook
• Angrist and Pischke, Mostly Harmless Econometrics:
▶ Chatty, opinionated, but intuitive approach to causal inference
▶ Very much from an econ perspective
• Hernan and Robins, Causal Inference.
▶ Clear and basic introduction to foundational concepts
▶ From a biostatistics/epidemiology perspective
▶ Relies more on graphical approaches
• Other required readings are posted on the website.
• Lecture notes will be the other main text.
Grading
1. biweekly homeworks (50%)
2. final project (40%)
3. participation/presentation (10%)
Final project
• Roughly 5-15 page research paper that either:
▶ applies some methods of the course to an empirical problem, or
▶ develops or expands a methodological approach.
• Co-authorship is encouraged, but comes with higher expectations.
• Fine to combine with another class paper.
• Focus on research design, data, methodology, and results.
• Milestones throughout the term, presentation on 12/10.
Broad outline
1. Primitives
▶ Potential outcomes, confounding, DAGs
2. Experimental studies
▶ Randomization, identification, estimation
3. Observational studies with no confounding
▶ Regression, weighting, matching
4. Observational studies with confounding
▶ Panel data, diff-in-diff, IV, RDD
5. Misc. Topics
▶ Mechanisms/direct effects, dynamic causal inference, etc
What is causal inference?
• Causal inference is the study of counterfactuals:
▶ what would have happened if we were to change this aspect of the world?
• Social science theories are almost always causal in their nature.
▶ H1: an increase in $X$ causes $Y$ to increase
• Knowing causal inference will help us:
1. understand when we can answer these questions, and
2. design better studies to provide answers
What is identification?
• Identification of a quantity of interest (mean, effect, etc) tells us what we can learn about that quantity from the type of data available.
• Would we know this quantity if we had access to unlimited data?
▶ No worrying about estimation uncertainty here.
▶ Standard errors on estimates are all 0.
• A quantity is identified if, with infinite data, it can only take on a single value.
• Statistical identification: not possible to estimate some coefficients in a linear model.
▶ Dummy for incumbent candidate, $X_i = 1$, and dummy for challenger candidate, $Z_i = 1$.
▶ Can’t estimate the coefficient on both in the same model, no matter the sample size.
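A minimal sketch of the dummy-variable example, in Python with only the standard library. It assumes the model also includes an intercept and that every candidate is either an incumbent or a challenger (neither is stated on the slide); the data and function names are hypothetical. Under those assumptions the incumbent and challenger columns sum to the intercept column, so $X'X$ is singular at any sample size and $(X'X)^{-1}X'y$ does not exist.

```python
def gram(rows):
    """Return X'X for a design matrix given as a list of rows."""
    k = len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def design(n_incumbents, n_challengers):
    """Hypothetical design matrix: intercept, incumbent dummy, challenger dummy."""
    return [[1, 1, 0]] * n_incumbents + [[1, 0, 1]] * n_challengers


# The determinant of X'X is zero for a small sample and for a large one,
# so more data does not make the three coefficients separately estimable.
det_small = det3(gram(design(3, 2)))
det_large = det3(gram(design(3000, 2000)))
print(det_small, det_large)
```

Dropping one of the three columns (for example the challenger dummy) removes the linear dependence, which is the usual fix.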
Causal identification
• Causal identification tells us what we can learn about a causal effect from the available data.
• Identification depends on assumptions, not on estimation strategies.
• If an effect is not identified, no estimation method will recover it.
• “What’s your identification strategy?” = what are the assumptions that allow you to claim you’ve estimated a causal effect?
• Estimation methods (regression, matching, weighting, 2SLS, 3SLS, SEM, GMM, GEE, dynamic panel, etc) are secondary to the identification assumptions.
Lack of identification, example
• High positive correlation.
• But without assumptions, we learn nothing about the causal effect.
Notation
• Population of units
▶ Finite population: $U = \{1, 2, \ldots, N\}$
▶ Infinite (super)population: $U = \{1, 2, \ldots, \infty\}$
• Observed outcomes: $Y_i$
• Binary treatment: $D_i = 1$ if treated, $D_i = 0$ if untreated (control)
• Pretreatment covariates: $X_i$, could be a matrix
What is association?
• Running example: effect of incumbent candidate negativity on the incumbent’s share of the two party vote as the outcome.
• If $Y_i$ and $D_i$ are independent, written $Y \perp\!\!\!\perp D$: $\Pr[Y = 1 \mid D = 1] = \Pr[Y = 1 \mid D = 0]$
• If the variables are not independent, we say they are dependent or associated: $\Pr[Y = 1 \mid D = 1] \neq \Pr[Y = 1 \mid D = 0]$
• Association: the distribution of the observed outcome depends on the value of the other variable.
• Nothing about counterfactuals or causality!
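To make the definition of association concrete, here is a small Python sketch that estimates the two conditional probabilities as sample proportions. The data are hypothetical, and the binary outcome ("incumbent won") is an assumption made so that $\Pr[Y = 1 \mid D = d]$ is meaningful; the function name is illustrative.

```python
def conditional_prob(y, d, d_value):
    """Estimate Pr[Y = 1 | D = d_value] as a sample proportion."""
    selected = [yi for yi, di in zip(y, d) if di == d_value]
    return sum(selected) / len(selected)


# Hypothetical data: d = 1 if the incumbent went negative,
# y = 1 if the incumbent won the race.
d = [1, 1, 1, 1, 0, 0, 0, 0]
y = [1, 0, 0, 0, 1, 1, 1, 0]

p_treated = conditional_prob(y, d, 1)  # Pr[Y = 1 | D = 1]
p_control = conditional_prob(y, d, 0)  # Pr[Y = 1 | D = 0]

# The two proportions differ, so Y and D are associated in these data.
# Nothing here says whether negativity caused the difference.
associated = p_treated != p_control
print(p_treated, p_control, associated)
```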
Potential outcomes
• We need some way to formally discuss counterfactuals. The Neyman-Rubin causal model of potential outcomes fills this role.
• $Y_i(d)$ is the value that the outcome would take if $D_i$ were set to $d$.
▶ $Y_i(1)$ is the value that $Y$ would take if the incumbent went negative.
▶ $Y_i(0)$ is the outcome if the incumbent stays positive.
• Potential outcomes are fixed features of the units.
• Fundamental problem of causal inference: can only observe one potential outcome per unit.
• Easy to generalize when $D_i$ is not binary.
Manipulation
• $Y_i(d)$ is the value that $Y$ would take under $D_i$ set to $d$.
• To be well-defined, $D_i$ should be manipulable at least in principle.
• Leads to common motto: “No causation without manipulation” Holland (1986)
• Tricky causal problems:
▶ Effect of race, sex, etc.
Consistency/SUTVA
• How do potential outcomes relate to observed outcomes?
• Need an assumption to make connection:
▶ “Consistency” in epidemiology
▶ “Stable unit treatment value assumption” (SUTVA) in econ and stats.
• Observed outcome is the potential outcome of the observed treatment: $Y_i(d) = Y_i$ if $D_i = d$
• Also write this as: $Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0)$
• Two key points here:
1. No interference between units: $Y_i(d_1, d_2, \ldots, d_N) = Y_i(d_i)$
2. Variation in the treatment is irrelevant.
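The consistency equation can be sketched in a few lines of Python. The potential outcomes below are hypothetical (in real data we never see both for the same unit), and the function and key names are illustrative. The sketch applies $Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0)$ and then builds the table an analyst would actually see, with the unobserved potential outcome marked as missing.

```python
# Hypothetical potential outcomes for four races.
y0 = [0.63, 0.52, 0.55, 0.47]  # Y_i(0): vote share if the incumbent stays positive
y1 = [0.60, 0.49, 0.51, 0.43]  # Y_i(1): vote share if the incumbent goes negative
d = [0, 1, 1, 0]               # D_i: treatment actually received


def observed_outcomes(d, y0, y1):
    """Apply Y_i = D_i * Y_i(1) + (1 - D_i) * Y_i(0) unit by unit."""
    return [di * y1i + (1 - di) * y0i for di, y0i, y1i in zip(d, y0, y1)]


y = observed_outcomes(d, y0, y1)

# What the analyst sees: the potential outcome under the treatment
# that was not received is missing (None).
table = [
    {"D": di, "Y": yi,
     "Y0": yi if di == 0 else None,
     "Y1": yi if di == 1 else None}
    for di, yi in zip(d, y)
]
for row in table:
    print(row)
```

Note that the sketch builds each unit's observed outcome from that unit's own treatment only, which is the no-interference part of the assumption.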
Causal inference = missing data
Negativity (Treatment) $D_i$ | Observed Outcomes $Y_i$ | Potential Outcomes $Y_i(0)$ | Potential Outcomes $Y_i(1)$
0 | .63 | .63 | ?
0 | .52 | .52 | ?
0 | .55 | .55 | ?
0 | .47 | .47 | ?
1 | .49 | ? | .49
1 | .51 | ? | .51
1 | .43 | ? | .43
1 | .52 | ? | .52
Estimands
• What are we trying to estimate? Differences between counterfactual worlds!
• Individual causal effect (ICE): $\tau_i = Y_i(1) - Y_i(0)$
▶ Difference between what would happen to me under treatment vs. control.
▶ Within unit! ⇝ FPOCI
▶ Almost always unidentified without strong assumptions
• Average treatment effect (ATE): $\tau = \mathbb{E}[\tau_i] = \frac{1}{N} \sum_{i=1}^{N} [Y_i(1) - Y_i(0)]$
▶ Average of ICEs over the population.
▶ We’ll spend a lot of time trying to identify this.
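As a rough illustration of the two estimands, the Python sketch below computes each $\tau_i$ and their average $\tau$ from a hypothetical table in which both potential outcomes are known. That table is an assumption for illustration only: with real data the fundamental problem of causal inference means it is never available, which is why the ATE has to be identified rather than computed directly.

```python
# Hypothetical potential outcomes (vote shares); only one per unit
# would be observed in practice.
y0 = [0.63, 0.52, 0.55, 0.47]  # Y_i(0)
y1 = [0.60, 0.49, 0.51, 0.43]  # Y_i(1)


def individual_effects(y0, y1):
    """Individual causal effects: tau_i = Y_i(1) - Y_i(0)."""
    return [b - a for a, b in zip(y0, y1)]


def average_treatment_effect(y0, y1):
    """ATE: the average of tau_i over all N units."""
    effects = individual_effects(y0, y1)
    return sum(effects) / len(effects)


tau_i = individual_effects(y0, y1)
tau = average_treatment_effect(y0, y1)
print([round(t, 2) for t in tau_i], round(tau, 3))
```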
Other estimands
• Conditional average treatment effect (CATE) for a subpopulation: $\tau(x) = \mathbb{E}[\tau_i \mid X_i = x] = \frac{1}{N_x} \sum_{i : X_i = x} [Y_i(1) - Y_i(0)]$,
▶ where $N_x$ is the number of units in the subpopulation.
• Average treatment effect on the treated (ATT): $\tau_{ATT} = \mathbb{E}[\tau_i \mid D_i = 1] = \frac{1}{N_t} \sum_{i : D_i = 1} [Y_i(1) - Y_i(0)]$, where $N_t = \sum_i D_i$.
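The CATE and ATT differ from the ATE only in which units enter the average. A minimal Python sketch, again assuming a hypothetical table of both potential outcomes (here in whole percentage points so the arithmetic is exact) and an illustrative binary covariate `x`:

```python
def mean(values):
    return sum(values) / len(values)


def cate(y0, y1, x, x_value):
    """Average of Y_i(1) - Y_i(0) over units with X_i == x_value."""
    return mean([b - a for a, b, xi in zip(y0, y1, x) if xi == x_value])


def att(y0, y1, d):
    """Average of Y_i(1) - Y_i(0) over treated units (D_i == 1)."""
    return mean([b - a for a, b, di in zip(y0, y1, d) if di == 1])


# Hypothetical data, vote shares in percentage points.
y0 = [63, 52, 55, 47]  # Y_i(0)
y1 = [60, 49, 51, 43]  # Y_i(1)
x = [1, 1, 0, 0]       # illustrative pretreatment covariate X_i
d = [0, 1, 1, 0]       # treatment received D_i

cate_x1 = cate(y0, y1, x, 1)  # averages over N_x = 2 units with X_i = 1
cate_x0 = cate(y0, y1, x, 0)  # averages over N_x = 2 units with X_i = 0
tau_att = att(y0, y1, d)      # averages over N_t = sum(d) = 2 treated units
print(cate_x1, cate_x0, tau_att)
```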
Samples versus Populations
• Estimands above are all at the population level.
• Sometimes easier to make inferences about the sample actually observed.
• Sample $S \subset U$ of size $n < N$, with $n_t$ treated and $n_c = n - n_t$ controls.
• Sample average treatment effect (SATE) is the average of ICEs in the sample: $SATE = \tau_S = \frac{1}{n} \sum_{i \in S} [Y_i(1) - Y_i(0)]$
• Limit our inferences to the sample and don’t generalize.
• In this context, usually refer to the ATE as the PATE.
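The distinction between the SATE and the PATE can be sketched by drawing a sample from a finite population. Everything below is hypothetical: the population of $N = 1000$ units, the way the potential outcomes are generated, the seed, and the sample size $n = 50$ are illustrative choices, not part of the course material.

```python
import random


def average_effect(y0, y1, units):
    """Average of Y_i(1) - Y_i(0) over the given unit indices."""
    units = list(units)
    return sum(y1[i] - y0[i] for i in units) / len(units)


# Hypothetical finite population U = {0, ..., N - 1} with both
# potential outcomes known (vote shares in percentage points).
rng = random.Random(2002)
N = 1000
y0 = [rng.randint(40, 60) for _ in range(N)]
y1 = [y - rng.randint(0, 6) for y in y0]

population = range(N)
sample = rng.sample(population, 50)  # S, a subset of U, with n = 50 < N

pate = average_effect(y0, y1, population)  # averages over all N units
sate = average_effect(y0, y1, sample)      # averages over the n sampled units
print(pate, sate)
```

The two numbers generally differ because the sample contains only some of the individual effects; inference for the SATE conditions on the units actually drawn, while inference for the PATE also has to account for sampling from the population.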