Instrumental Variables Regression (SW Ch. 10)

Three important threats to internal validity are:
• omitted variable bias from a variable that is correlated with X but is unobserved, so cannot be included in the regression;
• simultaneous causality bias (X causes Y, Y causes X);
• errors-in-variables bias (X is measured with error).

Instrumental variables regression can eliminate bias from these three sources.

The IV Estimator with a Single Regressor and a Single Instrument (SW Section 10.1)

$Y_i = \beta_0 + \beta_1 X_i + u_i$

• Loosely, IV regression breaks X into two parts: a part that might be correlated with u, and a part that is not. By isolating the part that is not correlated with u, it is possible to estimate $\beta_1$.
• This is done using an instrumental variable, $Z_i$, which is uncorrelated with $u_i$.
• The instrumental variable detects movements in $X_i$ that are uncorrelated with $u_i$, and uses these to estimate $\beta_1$.

Terminology: endogeneity and exogeneity

An endogenous variable is one that is correlated with u.
An exogenous variable is one that is uncorrelated with u.

Historical note: “Endogenous” literally means “determined within the system,” that is, a variable that is jointly determined with Y, that is, a variable subject to simultaneous causality. However, this definition is narrow, and IV regression can be used to address OV bias and errors-in-variables bias, not just simultaneous causality bias.

Two conditions for a valid instrument

$Y_i = \beta_0 + \beta_1 X_i + u_i$

For an instrumental variable (an “instrument”) Z to be valid, it must satisfy two conditions:
1. Instrument relevance: $\mathrm{corr}(Z_i, X_i) \neq 0$
2. Instrument exogeneity: $\mathrm{corr}(Z_i, u_i) = 0$

Suppose for now that you have such a $Z_i$ (we’ll discuss how to find instrumental variables later).
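
To make the two conditions concrete, here is a minimal simulated sketch. Everything in it is an assumption for illustration and not from SW: the coefficient values, the way X is built from Z and u, and the helper name `corr`. In real data u is unobserved, so instrument exogeneity cannot be checked this way; the simulation only shows what a valid Z looks like and why OLS goes wrong when X is correlated with u.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta0, beta1 = 1.0, 2.0              # hypothetical true coefficients

z = rng.normal(size=n)               # instrument Z
u = rng.normal(size=n)               # regression error u (unobserved in practice)

# X depends on Z (so Z is relevant) and on u (so X is endogenous).
x = 0.8 * z + 0.6 * u + rng.normal(size=n)
y = beta0 + beta1 * x + u


def corr(a, b):
    """Sample correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


corr_zx = corr(z, x)                 # relevance: clearly nonzero
corr_zu = corr(z, u)                 # exogeneity: close to zero
corr_xu = corr(x, u)                 # endogeneity of X: clearly nonzero

# OLS slope of Y on X is s_XY / s_XX; it is pulled away from beta1 = 2.
ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(f"corr(Z,X) = {corr_zx:.3f}, corr(Z,u) = {corr_zu:.3f}, corr(X,u) = {corr_xu:.3f}")
print(f"OLS slope = {ols_slope:.3f} (true beta1 = {beta1})")
```

In this setup cov(X,u) = 0.6 and var(X) = 2, so the OLS slope converges to 2 + 0.6/2 = 2.3 rather than 2.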
How can you use $Z_i$ to estimate $\beta_1$?

The IV Estimator, one X and one Z

Explanation #1: Two Stage Least Squares (TSLS)

As it sounds, TSLS has two stages, that is, two regressions:

(1) First, isolate the part of X that is uncorrelated with u: regress X on Z using OLS

$X_i = \pi_0 + \pi_1 Z_i + v_i$   (1)

• Because $Z_i$ is uncorrelated with $u_i$, $\pi_0 + \pi_1 Z_i$ is uncorrelated with $u_i$. We don’t know $\pi_0$ or $\pi_1$ but we have estimated them, so…
• Compute the predicted values of $X_i$, $\hat{X}_i$, where $\hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i$, i = 1,…,n.

(2) Replace $X_i$ by $\hat{X}_i$ in the regression of interest: regress Y on $\hat{X}_i$ using OLS:

$Y_i = \beta_0 + \beta_1 \hat{X}_i + u_i$   (2)

• Because $\hat{X}_i$ is uncorrelated with $u_i$ in large samples, the first least squares assumption holds
• Thus $\beta_1$ can be estimated by OLS using regression (2)
• This argument relies on large samples (so $\pi_0$ and $\pi_1$ are well estimated using regression (1))
• The resulting estimator is called the “Two Stage Least Squares” (TSLS) estimator, $\hat{\beta}_1^{TSLS}$.

Two Stage Least Squares, ctd.

Suppose you have a valid instrument, $Z_i$.
Stage 1: Regress $X_i$ on $Z_i$, obtain the predicted values $\hat{X}_i$
Stage 2: Regress $Y_i$ on $\hat{X}_i$; the coefficient on $\hat{X}_i$ is the TSLS estimator, $\hat{\beta}_1^{TSLS}$.
Then $\hat{\beta}_1^{TSLS}$ is a consistent estimator of $\beta_1$.

The IV Estimator, one X and one Z, ctd.

Explanation #2: (only) a little algebra

$Y_i = \beta_0 + \beta_1 X_i + u_i$

Thus,

$\mathrm{cov}(Y_i, Z_i) = \mathrm{cov}(\beta_0 + \beta_1 X_i + u_i, Z_i)$
$= \mathrm{cov}(\beta_0, Z_i) + \mathrm{cov}(\beta_1 X_i, Z_i) + \mathrm{cov}(u_i, Z_i)$
$= 0 + \mathrm{cov}(\beta_1 X_i, Z_i) + 0$
$= \beta_1 \mathrm{cov}(X_i, Z_i)$

where $\mathrm{cov}(\beta_0, Z_i) = 0$ because $\beta_0$ is a constant and $\mathrm{cov}(u_i, Z_i) = 0$ by instrument exogeneity; thus

$\beta_1 = \dfrac{\mathrm{cov}(Y_i, Z_i)}{\mathrm{cov}(X_i, Z_i)}$

The IV Estimator, one X and one Z, ctd.

$\beta_1 = \dfrac{\mathrm{cov}(Y_i, Z_i)}{\mathrm{cov}(X_i, Z_i)}$

The IV estimator replaces these population covariances with sample covariances:

$\hat{\beta}_1^{TSLS} = \dfrac{s_{YZ}}{s_{XZ}}$,

where $s_{YZ}$ and $s_{XZ}$ are the sample covariances. This is the TSLS estimator, just a different derivation.

Consistency of the TSLS estimator

$\hat{\beta}_1^{TSLS} = \dfrac{s_{YZ}}{s_{XZ}}$

The sample covariances are consistent: $s_{YZ} \xrightarrow{p} \mathrm{cov}(Y,Z)$ and $s_{XZ} \xrightarrow{p} \mathrm{cov}(X,Z)$.
Thus,

$\hat{\beta}_1^{TSLS} = \dfrac{s_{YZ}}{s_{XZ}} \xrightarrow{p} \dfrac{\mathrm{cov}(Y,Z)}{\mathrm{cov}(X,Z)} = \beta_1$

• The instrument relevance condition, $\mathrm{cov}(X,Z) \neq 0$, ensures that you don’t divide by zero.

Example #1: Supply and demand for butter

IV regression was originally developed to estimate demand elasticities for agricultural goods, for example butter:

$\ln(Q_i^{butter}) = \beta_0 + \beta_1 \ln(P_i^{butter}) + u_i$

• $\beta_1$ = price elasticity of butter = percent change in quantity for a 1% change in price (recall log-log specification discussion)
• Data: observations on price and quantity of butter for different years
• The OLS regression of $\ln(Q_i^{butter})$ on $\ln(P_i^{butter})$ suffers from simultaneous causality bias (why?)

Simultaneous causality bias in the OLS regression of $\ln(Q_i^{butter})$ on $\ln(P_i^{butter})$ arises because price and quantity are determined by the interaction of demand and supply.

This interaction of demand and supply produces…

Would a regression using these data produce the demand curve?

What would you get if only supply shifted?

• TSLS estimates the demand curve by isolating shifts in price and quantity that arise from shifts in supply.
• Z is a variable that shifts supply but not demand.

TSLS in the supply-demand example:

$\ln(Q_i^{butter}) = \beta_0 + \beta_1 \ln(P_i^{butter}) + u_i$

Let Z = rainfall in dairy-producing regions. Is Z a valid instrument?

(1) Exogenous? $\mathrm{corr}(rain_i, u_i) = 0$?
Plausibly: whether it rains in dairy-producing regions shouldn’t affect demand
(2) Relevant? $\mathrm{corr}(rain_i, \ln(P_i^{butter})) \neq 0$?
Plausibly: insufficient rainfall means less grazing means less butter

TSLS in the supply-demand example, ctd.

$\ln(Q_i^{butter}) = \beta_0 + \beta_1 \ln(P_i^{butter}) + u_i$

$Z_i = rain_i$ = rainfall in dairy-producing regions.

Stage 1: regress $\ln(P_i^{butter})$ on $rain_i$, get $\widehat{\ln(P_i^{butter})}$
$\widehat{\ln(P_i^{butter})}$ isolates changes in log price that arise from supply (part of supply, at least)
Stage 2: regress $\ln(Q_i^{butter})$ on $\widehat{\ln(P_i^{butter})}$
The regression counterpart of using shifts in the supply curve to trace out the demand curve.
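
The two stages can be sketched on simulated data. This is an illustration under stated assumptions, not SW's data or code: the supply equation $\ln(Q_i) = \gamma_0 + \gamma_1 \ln(P_i) + \gamma_2 rain_i + e_i$, every coefficient value, and the helper `ols` are all hypothetical. Price is set where the assumed supply and demand curves cross, so log price is correlated with the demand error u and OLS does not recover the demand elasticity. The sketch also checks that the stage 2 slope matches the covariance ratio $s_{YZ}/s_{XZ}$.

```python
import numpy as np


def ols(y, x):
    """Return (intercept, slope) from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]


rng = np.random.default_rng(42)
n = 50_000

beta0, beta1 = 2.0, -1.0                  # hypothetical demand: intercept, elasticity
gamma0, gamma1, gamma2 = 0.0, 1.0, 0.5    # hypothetical supply; rain shifts supply only

rain = rng.normal(size=n)                 # instrument Z
u = rng.normal(size=n)                    # demand shock
e = rng.normal(size=n)                    # supply shock

# Equilibrium log price solves demand = supply, so it depends on u (endogenous).
ln_p = (beta0 - gamma0 - gamma2 * rain + u - e) / (gamma1 - beta1)
ln_q = beta0 + beta1 * ln_p + u           # quantity read off the demand curve

# Naive OLS of ln(Q) on ln(P): mixes supply and demand shifts.
_, b_ols = ols(ln_q, ln_p)

# Stage 1: regress ln(P) on rain and form the predicted values.
pi0_hat, pi1_hat = ols(ln_p, rain)
ln_p_hat = pi0_hat + pi1_hat * rain

# Stage 2: regress ln(Q) on the predicted log price.
_, b_tsls = ols(ln_q, ln_p_hat)

# Same estimator written as a ratio of sample covariances, s_YZ / s_XZ.
b_iv = np.cov(ln_q, rain)[0, 1] / np.cov(ln_p, rain)[0, 1]

print(f"true elasticity  = {beta1}")
print(f"OLS estimate     = {b_ols:.3f}")
print(f"TSLS estimate    = {b_tsls:.3f}")
print(f"s_YZ / s_XZ      = {b_iv:.3f}")
```

Standard errors from a hand-run stage 2 regression are not valid, because they treat the predicted log price as data; this sketch only illustrates the point estimate.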
Example #2: Test scores and class size

• The California regressions still could have OV bias (e.g. parental involvement).
• This bias could be eliminated by using IV regression (TSLS).
• IV regression requires a valid instrument, that is, an instrument that is:
(1) relevant: $\mathrm{corr}(Z_i, STR_i) \neq 0$
(2) exogenous: $\mathrm{corr}(Z_i, u_i) = 0$

Example #2: Test scores and class size, ctd.

Here is a (hypothetical) instrument:
• some districts, randomly hit by an earthquake, “double up” classrooms:
$Z_i = Quake_i$ = 1 if hit by quake, = 0 otherwise
• Do the two conditions for a valid instrument hold?