Impact Evaluation
Causal Inference
Sebastian Galiani
November 2006
HDN SAR WBI

Motivation

The research questions that motivate most studies in the health sciences are causal in nature. For example:
• What is the efficacy of a given drug in a given population?
• What fraction of deaths from a given disease could have been avoided by a given treatment or policy?

The most challenging empirical questions in economics also involve cause-and-effect relationships:
• Does school decentralization improve school quality?

Interest in these questions is motivated by:
• Policy concerns: does privatization of water systems improve child health?
• Theoretical considerations
• Problems facing individual decision makers

Causal Analysis

The aim of standard statistical analysis, typified by likelihood and other estimation techniques, is to infer parameters of a distribution from samples drawn from that distribution. With the help of such parameters, one can:
1. Infer associations among variables,
2. Estimate the likelihood of past and future events,
3. Update the likelihood of events in light of new evidence or new measurements.

These tasks are managed well by standard statistical analysis as long as experimental conditions remain the same. Causal analysis goes one step further: its aim is to infer aspects of the data-generating process. With the help of such aspects, one can deduce not only the likelihood of events under static conditions, but also the dynamics of events under changing conditions.

This capability includes:
1. Predicting the effects of interventions
2. Predicting the effects of spontaneous changes
3. Identifying causes of reported events

This distinction implies that causal and associational concepts do not mix.

The word cause is not in the vocabulary of standard probability theory.
All that probability theory allows us to say is that two events are mutually correlated, or dependent, meaning that if we find one, we can expect to encounter the other. Scientists seeking causal explanations for complex phenomena or rationales for policy decisions must therefore supplement the language of probability with a vocabulary for causality.

Two languages for causality have been proposed:
1. Structural equation modeling (SEM) (Haavelmo 1943).
2. The Neyman-Rubin potential outcome model (RCM) (Neyman, 1923; Rubin, 1974).

The Rubin Causal Model

Define the population by $U$. Each unit in $U$ is denoted by $u$. For each $u \in U$, there is an associated value $Y(u)$ of the variable of interest $Y$, which we call the response variable. Let $A$ be a second variable defined on $U$. We call $A$ an attribute of the units in $U$.

The key notion is the potential for exposing or not exposing each unit to the action of a cause: each unit has to be potentially exposable to any one of the causes. Thus, Rubin takes the position that causes are only those things that could be treatments in hypothetical experiments. An attribute cannot be a cause in an experiment, because the notion of potential exposability does not apply to it.

For simplicity, we assume that there are just two causes, or levels of treatment. Let $D$ be a variable that indicates the cause to which each unit in $U$ is exposed:

$$D(u) = \begin{cases} t & \text{if unit } u \text{ is exposed to treatment} \\ c & \text{if unit } u \text{ is exposed to control} \end{cases}$$

In a controlled study, $D$ is constructed by the experimenter. In an uncontrolled study, it is determined by factors beyond the experimenter’s control.

The values of $Y$ are potentially affected by the particular cause, $t$ or $c$, to which the unit is exposed.
Thus, we need two response variables, $Y_t(u)$ and $Y_c(u)$: $Y_t$ is the value of the response that would be observed if the unit were exposed to $t$, and $Y_c$ is the value that would be observed on the same unit if it were exposed to $c$.

Let $D$ also be expressed as a binary variable: $D = 1$ if $D = t$ and $D = 0$ if $D = c$, with $Y_1 = Y_t$ and $Y_0 = Y_c$. Then the observed outcome of each unit can be written as:

$$Y(u) = D(u)\, Y_1(u) + (1 - D(u))\, Y_0(u)$$

Definition: For every unit $u$, treatment $\{D_u = 1 \text{ instead of } D_u = 0\}$ causes the effect

$$\delta_u = Y_1(u) - Y_0(u)$$

This definition of a causal effect assumes that the treatment status of one individual does not affect the potential outcomes of other individuals.

Fundamental Problem of Causal Inference: It is impossible to observe the values of $Y_1(u)$ and $Y_0(u)$ on the same unit and, therefore, it is impossible to observe the effect of $t$ on $u$. Another way to express this problem is to say that we cannot infer the effect of treatment because we do not have the counterfactual evidence, i.e., what would have happened in the absence of treatment.

Given that the causal effect for a single unit $u$ cannot be observed, we aim to identify the average causal effect for the entire population or for sub-populations. The average treatment effect (ATE) of $t$ (relative to $c$) over $U$ (or any sub-population) is given by:

$$\text{ATE} = E[Y_1(u) - Y_0(u)] = E[Y_1(u)] - E[Y_0(u)] = \bar{Y}_1 - \bar{Y}_0 \qquad (1)$$

The statistical solution replaces the impossible-to-observe causal effect of $t$ on a specific unit with the possible-to-estimate average causal effect of $t$ over a population of units. Although $E(Y_1)$ and $E(Y_0)$ cannot both be calculated, they can be estimated.
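To make the notation concrete, here is a minimal simulation sketch in Python. Everything numeric in it is an assumption made for illustration (normally distributed potential outcomes, a mean effect of 3, assignment by a fair coin flip); none of it comes from the slides. It generates both potential outcomes for every unit, which is only possible in a simulation, then reveals a single outcome per unit through Y = D·Y1 + (1 − D)·Y0 and compares the group means with the true average effect.

```python
import random
from statistics import fmean

def simulate_units(n, seed=0):
    """Draw hypothetical potential outcomes (Y0, Y1) for n units."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)      # response if exposed to control
        effect = rng.gauss(3.0, 1.0)   # unit-level causal effect Y1 - Y0 (assumed)
        units.append((y0, y0 + effect))
    return units

def observe(units, seed=1):
    """Assign D by a fair coin flip and reveal only Y = D*Y1 + (1-D)*Y0."""
    rng = random.Random(seed)
    data = []
    for y0, y1 in units:
        d = 1 if rng.random() < 0.5 else 0
        data.append((d, d * y1 + (1 - d) * y0))
    return data

units = simulate_units(20_000)
# The true ATE needs both potential outcomes, so it is knowable only in a simulation.
true_ate = fmean(y1 - y0 for y0, y1 in units)

data = observe(units)
mean_treated = fmean(y for d, y in data if d == 1)   # estimates E[Y1]
mean_control = fmean(y for d, y in data if d == 0)   # estimates E[Y0]
estimated_ate = mean_treated - mean_control

print(f"true ATE      = {true_ate:.3f}")
print(f"estimated ATE = {estimated_ate:.3f}")
```

Each row of `data` holds only one of the two potential outcomes, so no unit-level effect can be computed from it. The two group means nevertheless land close to E[Y1] and E[Y0] here because assignment was random by construction.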
Most econometric methods attempt to construct, from observational data, consistent estimates of the mean potential outcomes $\bar{Y}_1 = E[Y_1]$ and $\bar{Y}_0 = E[Y_0]$.

Consider the following simple estimator of ATE:

$$\hat{\delta} = \hat{E}[Y_1 \mid D = 1] - \hat{E}[Y_0 \mid D = 0] \qquad (2)$$

Note that equation (1) is defined for the whole population, whereas equation (2) represents an estimator to be evaluated on a sample drawn from that population.

Let $\pi$ equal the proportion of the population that would be assigned to the treatment group, write $\delta$ for the ATE, and let $\delta_{D=d} = E[Y_1 - Y_0 \mid D = d]$. Decomposing ATE, we have:

$$\begin{aligned}
\delta &= \pi\, \delta_{D=1} + (1 - \pi)\, \delta_{D=0} \\
&= \pi\, E[Y_1 - Y_0 \mid D = 1] + (1 - \pi)\, E[Y_1 - Y_0 \mid D = 0] \\
&= \{\pi\, E[Y_1 \mid D = 1] + (1 - \pi)\, E[Y_1 \mid D = 0]\} - \{\pi\, E[Y_0 \mid D = 1] + (1 - \pi)\, E[Y_0 \mid D = 0]\} \\
&= \bar{Y}_1 - \bar{Y}_0
\end{aligned}$$

If we assume that $E[Y_1 \mid D = 1] = E[Y_1 \mid D = 0]$ and $E[Y_0 \mid D = 1] = E[Y_0 \mid D = 0]$, then

$$\begin{aligned}
\delta &= \{\pi\, E[Y_1 \mid D = 1] + (1 - \pi)\, E[Y_1 \mid D = 1]\} - \{\pi\, E[Y_0 \mid D = 0] + (1 - \pi)\, E[Y_0 \mid D = 0]\} \\
&= E[Y_1 \mid D = 1] - E[Y_0 \mid D = 0]
\end{aligned}$$

which is consistently estimated by its sample analog estimator:

$$\hat{\delta} = \hat{E}[Y_1 \mid D = 1] - \hat{E}[Y_0 \mid D = 0]$$

Thus, a sufficient condition for the standard estimator to consistently estimate the true ATE is that:

$$E[Y_1 \mid D = 1] = E[Y_1 \mid D = 0] \quad \text{and} \quad E[Y_0 \mid D = 1] = E[Y_0 \mid D = 0]$$

In this situation, the average outcome under the treatment and the average outcome under the control do not differ between the treatment and control groups. In order to satisfy these conditions, it is sufficient that treatment assignment $D$ be uncorrelated with the potential outcome distributions of $Y_1$ and $Y_0$. The principal way to achieve this uncorrelatedness is through random assignment of treatment.

In most circumstances, there is simply no information available on how those in the control group would have reacted if they had received the treatment instead. This is the basis for an important insight into the potential biases of the standard estimator (2). After a bit of algebra, it can be shown that, in large samples:

$$\hat{\delta} \xrightarrow{p} \delta + \underbrace{\{E[Y_0 \mid D = 1] - E[Y_0 \mid D = 0]\}}_{\text{Baseline difference}} + \underbrace{(1 - \pi)\{\delta_{D=1} - \delta_{D=0}\}}_{\text{Treatment heterogeneity}}$$

This equation specifies the two sources of bias that need to be eliminated from estimates of causal effects from observational studies:
1. Selection bias: baseline difference.
2. Treatment heterogeneity.

Most of the methods available deal only with selection bias, either by simply assuming that the treatment effect is constant in the population or by redefining the parameter of interest in the population.

Treatment on the Treated

ATE is not always the parameter of interest. In a variety of policy contexts, it is the average treatment effect for the treated that is of substantive interest:

$$\text{TOT} = E[Y_1(u) - Y_0(u) \mid D = 1] = E[Y_1(u) \mid D = 1] - E[Y_0(u) \mid D = 1]$$

The standard estimator (2) consistently estimates TOT if:

$$E[Y_0 \mid D = 1] = E[Y_0 \mid D = 0]$$

Structural Equation Modeling

Structural equation modeling was originally developed by geneticists (Wright 1921) and economists (Haavelmo 1943).

Structural Equations

Definition: An equation

$$y = \beta x + \varepsilon \qquad (8)$$

is said to be structural if it is to be interpreted as follows: in an ideal experiment where we control $X$ to $x$ and any other set $Z$ of variables (not containing $X$ or $Y$) to $z$, the value $y$ of $Y$ is given by $\beta x + \varepsilon$, where $\varepsilon$ is not a function of the settings $x$ and $z$.

This definition is in the spirit of Haavelmo (1943), who explicitly interpreted each structural equation as a statement about a hypothetical controlled experiment.

Thus, to the often asked question, “Under what conditions can we give causal interpretation to structural coefficients?” Haavelmo would have answered: Always! According to the founding father of SEM, the conditions that make the equation $y = \beta x + \varepsilon$ structural are precisely those that make the causal connection between $X$ and $Y$ have no other value but $\beta$, and that ensure that nothing about the statistical relationship between $x$ and $\varepsilon$ can ever change this interpretation of $\beta$.
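The separation between the structural meaning of β and the statistical relationship between x and ε can be illustrated with a small simulation. The sketch below is built entirely on assumed values (β = 2, standard normal ε, and an observational regime in which x depends on ε); the helper names `slope` and `draw` are hypothetical. The same structural equation generates y in both regimes. Only the way x comes about differs.

```python
import random
from statistics import fmean

BETA = 2.0  # assumed structural coefficient, for illustration only

def slope(xs, ys):
    """Least-squares slope of y on x: cov(x, y) / var(x)."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

def draw(n, controlled, seed):
    """Generate (x, y) pairs from y = BETA * x + eps under one of two regimes."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        if controlled:
            # Ideal experiment: x is set by the experimenter, unrelated to eps.
            x = rng.choice([0.0, 1.0, 2.0])
        else:
            # Uncontrolled study: x is statistically related to eps.
            x = 0.8 * eps + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(BETA * x + eps)  # the structural equation holds in both regimes
    return xs, ys

obs_slope = slope(*draw(50_000, controlled=False, seed=0))
exp_slope = slope(*draw(50_000, controlled=True, seed=1))

print(f"regression slope, uncontrolled x: {obs_slope:.3f}")
print(f"regression slope, controlled x:   {exp_slope:.3f}")
```

Under these assumptions the regression slope in the uncontrolled regime drifts away from 2, while the slope in the controlled regime stays close to it. The equation remains structural with coefficient β in both cases. What changes is whether a regression of y on x recovers that coefficient.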
The average causal effect: The average causal effect on $Y$ of treatment level $x$ is the difference in the conditional expectations:

$$E(Y \mid X = x) - E(Y \mid X = 0)$$

In the context of dichotomous interventions ($x = 1$), this causal effect is called the average treatment effect (ATE).

Representing Interventions

Consider the structural model $M$:

$$z = f_z(w), \qquad x = f_x(z, \nu), \qquad y = f_y(x, u)$$

We represent an intervention in the model through a mathematical operator denoted $do(x)$. The operator $do(x)$ simulates physical interventions by deleting certain functions from the model and replacing them by a constant $X = x$, while keeping the rest of the model unchanged.

To emulate an intervention $do(x_0)$ that holds $X$ constant (at $X = x_0$) in model $M$, replace the equation for $x$ with $x = x_0$, and obtain a new model, $M_{x_0}$:

$$z = f_z(w), \qquad x = x_0, \qquad y = f_y(x, u)$$

The joint distribution associated with the modified model, denoted $P(z, y \mid do(x_0))$, describes the post-intervention (“experimental”) distribution. From this distribution, one is able to assess treatment efficacy by comparing aspects of this distribution at different levels of $x_0$.

Structural Parameters

Definition: The interpretation of a structural equation as a statement about the behavior of $Y$ under a hypothetical intervention yields a simple definition for the structural parameters. The meaning of $\beta$ in the equation $y = \beta x + \varepsilon$ is simply

$$\beta = \frac{\partial}{\partial x} E[Y \mid do(x)]$$

Counterfactual Analysis in Structural Models

Consider again model $M_{x_0}$. Call the solution of $Y$ the potential response of $Y$ to $x_0$. We denote it as $Y_{x_0}(u, \nu, w)$. This entity can be given a counterfactual interpretation, for it stands for the way an individual with characteristics $(u, \nu, w)$ would respond, had the treatment been $x_0$, rather than the $x = f_x(z, \nu)$ actually received by the individual.
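The mechanics of the intervention operator can be sketched in a few lines of Python. The slides leave the functions of model M unspecified, so the functional forms below (a threshold rule for x, a linear rule for y with coefficient 2) are assumptions chosen only to make the sketch runnable, as are the standard normal background variables. The intervention is implemented exactly as described: the equation for x is replaced by the constant x0 and every other equation is left untouched.

```python
import random

# Hypothetical functional forms for model M (assumed for illustration only).
def f_z(w):
    return w

def f_x(z, nu):
    return 1.0 if z + nu > 0 else 0.0

def f_y(x, u):
    return 2.0 * x + u

def solve(w, nu, u, x0=None):
    """Solve M for one unit; if x0 is given, solve the modified model M_x0."""
    z = f_z(w)                              # untouched by the intervention
    x = f_x(z, nu) if x0 is None else x0    # intervention: x = x0 replaces x = f_x(z, nu)
    y = f_y(x, u)                           # untouched by the intervention
    return z, x, y

rng = random.Random(0)
# Background variables (w, nu, u) for a simulated population of units.
population = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
              for _ in range(10_000)]

def mean_y(x0):
    """Average of Y in the post-intervention distribution at level x0."""
    return sum(solve(w, nu, u, x0)[2] for w, nu, u in population) / len(population)

# Compare the post-intervention distribution at two levels of x0.
effect = mean_y(1.0) - mean_y(0.0)

# Potential response of one unit to x0 = 1, next to what that unit actually receives.
actual = solve(w=0.3, nu=-1.0, u=0.5)
potential_response = solve(w=0.3, nu=-1.0, u=0.5, x0=1.0)[2]

print(effect, actual, potential_response)
```

Passing `x0` to `solve` is a design choice that keeps the original and the modified model in one function, so the only line that differs between M and the modified model is the assignment to `x`.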
In our example,

$$Y_{x_0}(u, \nu, w) = Y_{x_0}(u) = y = f_y(x_0, u)$$

• This interpretation of counterfactuals, cast as solutions to modified systems of equations, provides the conceptual and formal link between structural equation modeling and the Rubin potential-outcome framework.
• It assures us that the end results of the two approaches will be the same.
• Thus, the choice of model is strictly a matter of convenience or insight.

References

Judea Pearl (2000): Causality: Models, Reasoning and Inference, CUP. Chapters 1, 5 and 7.
Trygve Haavelmo (1944): “The probability approach in econometrics”, Econometrica 12, pp. iii-vi+1-115.
Arthur Goldberger (1972): “Structural Equations Methods in the Social Sciences”, Econometrica 40, pp. 979-1002.
Donald B. Rubin (1974): “Estimating causal effects of treatments in randomized and nonrandomized experiments”, Journal of Educational Psychology 66, pp. 688-701.
Paul W. Holland (1986): “Statistics and Causal Inference”, Journal of the American Statistical Association 81, pp. 945-70, with discussion.
