
CORNELL ECON 3120 - Omitted Variable Bias with Many Regressors

Type: Lecture Note
Pages: 2

Econ 3120 1st Edition Lecture 14

Outline of Current Lecture
I. Unbiasedness of Multivariate OLS Estimators

Current Lecture
I. Omitted Variable Bias with Many Regressors

6.3 Omitted Variable Bias with Many Regressors

The above discussion only applies to a model with two independent variables. When there are more than two independent variables, and we leave one out, things get much more complicated. If this happens, all of the estimated $\beta$'s can be biased, and the bias of $\tilde\beta_j$ in the short regression generally depends on the relationship between the omitted variable and all the x's, and on the relationship between all the x's. But generally speaking, if we assume that $x_j$ is uncorrelated with the other included x's, then we can say something about the bias of the estimated coefficient.

Suppose the true model is

$$\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 abil + u$$

(where exper is years of experience) and we leave out abil. If we assume that education and experience are uncorrelated, then the expectation of $\tilde\beta_1$ in the regression $\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + u$ will be

$$E(\tilde\beta_1) = \beta_1 + \beta_3 \frac{\widehat{Cov}(educ, abil)}{\widehat{Var}(educ)},$$

which is essentially the same thing as the formula (7) above. Therefore, we might expect $\tilde\beta_1$ to be biased upwards if $\beta_3$ is positive and education and ability are positively correlated.

Economists often try to sign the bias using this formula regardless of whether $x_j$ (the variable of interest in the short regression) is uncorrelated with the other included x's. This is a reasonable shortcut, but it's important to know that it isn't exactly right, especially if the included x's are highly correlated.²

² The general form for the estimate of $\beta_j$ in the short regression when $x_k$ is omitted is $\tilde\beta_j = \hat\beta_j + \hat\beta_k \tilde\delta_j$, where $\tilde\delta_j$ is the coefficient on $x_j$ in the regression of $x_k$ on all of the included regressors. See Wooldridge Section 3A for more detail.
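The identity in footnote 2 holds exactly in any sample, so it can be checked numerically. The sketch below is illustrative only: the data are simulated, the coefficient values and sample size are made up, and the helpers `solve` and `ols` are hypothetical names written with the Python standard library so the example stays self-contained (a real analysis would normally use a regression package).

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(y, cols):
    """Regress y on an intercept plus the columns in cols; return coefficients."""
    X = [[1.0] + [col[i] for col in cols] for i in range(len(y))]
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)

# Simulated data (assumed values): abil raises educ, and educ lowers exper.
random.seed(0)
n = 500
abil = [random.gauss(0, 1) for _ in range(n)]
educ = [12 + 2 * a + random.gauss(0, 2) for a in abil]
exper = [20 - 0.5 * e + random.gauss(0, 3) for e in educ]
lwage = [1.0 + 0.08 * e + 0.02 * x + 0.10 * a + random.gauss(0, 0.3)
         for e, x, a in zip(educ, exper, abil)]

b_long = ols(lwage, [educ, exper, abil])   # long regression, includes abil
b_short = ols(lwage, [educ, exper])        # short regression, omits abil
delta = ols(abil, [educ, exper])           # abil on the included regressors

# Footnote 2: short coefficient = long coefficient + beta_abil * delta_educ
implied_educ = b_long[1] + b_long[3] * delta[1]
print("short-regression coefficient on educ:", b_short[1])
print("long coefficient + beta_abil * delta:", implied_educ)
```

Because educ and exper are correlated in this simulated sample, the simple covariance-over-variance formula would only approximate the bias, while the footnote 2 decomposition matches the short-regression coefficient up to rounding error.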
7 Variance of OLS Estimators

To obtain the variance of OLS estimators, we need to make an assumption analogous to SLR.5:

• MLR.5 Homoskedasticity: The error term in the OLS equation described by MLR.1 has constant variance: $Var(u \mid x_1, \dots, x_k) = \sigma^2$

With assumptions MLR.1-MLR.5, the variance of an OLS estimator $\hat\beta_j$ is given by

$$Var(\hat\beta_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}$$

where $SST_j = \sum_i (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the R-squared from the regression of $x_j$ on all of the other independent variables.

8 Estimating $\sigma^2$

We obtain an unbiased estimator of $\sigma^2$ in a similar manner to the bivariate case:

$$\hat\sigma^2 = \frac{1}{n-k-1} \sum_i \hat{u}_i^2$$

The denominator $n-k-1$ equals the number of observations minus the total number of parameters estimated in the model. Using this estimate, we can estimate the variances of OLS estimators as

$$\widehat{Var}(\hat\beta_j) = \frac{\hat\sigma^2}{SST_j (1 - R_j^2)}$$

and the standard errors as

$$se(\hat\beta_j) = \sqrt{\widehat{Var}(\hat\beta_j)}$$

The Gauss-Markov Theorem

Our OLS estimators are actually a special case of a more general class of estimators, called linear estimators. Linear estimators take the form

$$\tilde\beta_j = \sum_i w_{ij} y_i$$

where each $w_{ij}$ can be a function of any (or all) of the sample values of the independent variables. The Gauss-Markov Theorem states that under assumptions MLR.1-MLR.5, OLS estimators are the best linear unbiased estimators (BLUE) for $\beta_1, \dots, \beta_k$. This means that within the class of linear unbiased estimators, OLS estimators have the lowest variance (i.e., "best").

9 Inference

9.1 Distributions of OLS estimators

To obtain the distributions of our estimators, we either assume a distribution for u or invoke the central limit theorem in large samples.
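As a numerical aside on Sections 7 and 8, the estimated variance $\hat\sigma^2 / (SST_j (1 - R_j^2))$ can be computed piece by piece and compared with the corresponding diagonal entry of $\hat\sigma^2 (X'X)^{-1}$, which is the matrix form of the same quantity. This is a sketch under stated assumptions: the data are simulated, the variable names are made up, and `solve` and `ols` are hypothetical helpers written with the standard library only.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(y, cols):
    """OLS of y on an intercept plus cols; return (coefficients, residuals, X'X)."""
    X = [[1.0] + [col[i] for col in cols] for i in range(len(y))]
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(bj * v for bj, v in zip(beta, row)) for row, yi in zip(X, y)]
    return beta, resid, XtX

# Simulated data (assumed values) with two correlated regressors, so k = 2.
random.seed(1)
n, k = 200, 2
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.6 * a + random.gauss(0, 1) for a in x1]
y = [1 + 2 * a - b + random.gauss(0, 1.5) for a, b in zip(x1, x2)]

beta, resid, XtX = ols(y, [x1, x2])
sigma2_hat = sum(u * u for u in resid) / (n - k - 1)   # Section 8 estimator

# SST_1 and R_1^2: total variation in x1 and its fit from the other regressor.
x1_bar = sum(x1) / n
sst_1 = sum((a - x1_bar) ** 2 for a in x1)
_, resid_1, _ = ols(x1, [x2])
r2_1 = 1 - sum(u * u for u in resid_1) / sst_1

var_b1 = sigma2_hat / (sst_1 * (1 - r2_1))   # Section 7 formula with sigma2_hat
se_b1 = var_b1 ** 0.5

# Cross-check: the (1,1) entry of sigma2_hat * (X'X)^{-1}.
var_b1_check = sigma2_hat * solve(XtX, [0.0, 1.0, 0.0])[1]
print("se(beta_1) from the formula:", se_b1)
print("se(beta_1) from (X'X)^-1:   ", var_b1_check ** 0.5)
```

The formula makes the role of $R_j^2$ visible: the more $x_j$ is explained by the other regressors, the smaller $1 - R_j^2$ and the larger the variance of $\hat\beta_j$.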
For small samples, we need the assumption

• MLR.6 Normality: $u \sim N(0, \sigma^2)$

Under assumptions MLR.1-MLR.6, conditional on the x's,

$$\hat\beta_j \sim N(\beta_j, Var(\hat\beta_j))$$

Thus,

$$\frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \sim t_{n-k-1}$$

9.2 Central Limit Theorem for large samples

In large samples (n ≥ 30), we can apply the central limit theorem, so that approximately

$$\frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \sim N(0, 1)$$

It is also true that in large samples, our estimated $\beta$'s are consistent estimators of the true $\beta$'s:

$$plim(\hat\beta_j) = \beta_j$$

In the multivariate case, the proof is tricky, so we won't cover it here.

9.3 Inference

Now that we know the mean, variance and distribution of the OLS estimators, we can test hypotheses about a single $\beta_j$ just as we have been doing all along. Since these tests are carried out in exactly the same manner as in the bivariate case, we won't cover them here.

9.4 Testing (multiple) hypotheses about multiple $\beta$'s: The F Test

Oftentimes we are interested in testing (potentially many) hypotheses about multiple $\beta$'s, all at the same time. Suppose, for example, we have the model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$

and we would like to test the hypothesis:

$$H_0: \beta_1 = \beta_2 \qquad H_A: \beta_1 \neq \beta_2$$

We can construct an F Test to test this hypothesis.³ There are three steps to the test:

1. Construct and estimate a restricted model which takes into account the null hypothesis. In the example above, we want to impose $\beta_1 = \beta_2$. We can do this by substituting for $\beta_2$ and regressing

$$y = \beta_0 + \beta_1 x_1 + \beta_1 x_2 + u = \beta_0 + \beta_1 (x_1 + x_2) + u$$

To operationalize this using Stata, we construct a new variable which equals $x_1 + x_2$, and regress y on our new variable.

2. Construct the F statistic:

$$F \equiv \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n-k-1)}$$

where $SSR_r$ is the sum of squared residuals in the restricted regression, $SSR_{ur}$ is the sum of squared residuals in the unrestricted regression, and q is the number of restrictions. In the example above, we have 1 restriction, so q = 1.

3. Under the null hypothesis, the F statistic has an F distribution with $(q, n-k-1)$ degrees of freedom. We look up the critical value c of the test based on the degrees of freedom, and reject if F > c.

³ Note that in this case, because we are dealing with only one linear combination of parameters, there are other ways to test this hypothesis. See Wooldridge Section 4.4 for details.

Example 1. (Wooldridge 4.9) Suppose we are interested in estimating the determinants of the number of minutes per week that a person sleeps. We estimate the equation (standard errors in parentheses):

$$\widehat{sleep} = \underset{(112.28)}{3{,}638.25} - \underset{(0.017)}{0.148}\, totwork - \underset{(5.88)}{11.13}\, educ + \underset{(1.45)}{2.20}\, age$$

$$n = 706, \quad R^2 = 0.113$$

We see that neither $\hat\beta_{educ}$ nor $\hat\beta_{age}$ is significant at the 5% level. Let's instead test
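Returning to the general F test procedure of Section 9.4, the restricted and unrestricted regressions for $H_0: \beta_1 = \beta_2$ can be sketched as follows. Everything here is an assumption made for illustration: the data are simulated (with $\beta_1 \neq \beta_2$, so the null is false by construction), and `solve` and `ssr` are hypothetical helper names built on the standard library. The standard library has no F distribution, so the sketch stops at the statistic, which would then be compared with a critical value from an F table with $(q, n-k-1)$ degrees of freedom.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ssr(y, cols):
    """Sum of squared residuals from OLS of y on an intercept plus cols."""
    X = [[1.0] + [col[i] for col in cols] for i in range(len(y))]
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bj * v for bj, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

# Simulated data (assumed values): true beta_1 = 1.0 and beta_2 = 0.2.
random.seed(2)
n, k, q = 300, 2, 1   # k regressors in the unrestricted model, q restrictions
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.3 * a + random.gauss(0, 1) for a in x1]
y = [1 + 1.0 * a + 0.2 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Step 1: unrestricted model, and restricted model using the new variable x1 + x2.
ssr_ur = ssr(y, [x1, x2])
ssr_r = ssr(y, [[a + b for a, b in zip(x1, x2)]])

# Step 2: F statistic in its SSR form and its R-squared form.
y_bar = sum(y) / n
sst = sum((v - y_bar) ** 2 for v in y)
r2_ur, r2_r = 1 - ssr_ur / sst, 1 - ssr_r / sst
df = n - k - 1
F_ssr = ((ssr_r - ssr_ur) / q) / (ssr_ur / df)
F_r2 = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df)

# Step 3: compare F with the critical value for (q, df) degrees of freedom.
print("F (SSR form):", F_ssr)
print("F (R-squared form):", F_r2)
```

The two forms agree because both regressions share the same dependent variable, so dividing every SSR by the same total sum of squares leaves the ratio unchanged.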
