Econ 3120 1st Edition
Lecture 14

Outline of Current Lecture
I. Unbiasedness of Multivariate OLS Estimators

Current Lecture
I. Omitted Variable Bias with Many Regressors

6.3 Omitted Variable Bias with Many Regressors

The discussion above only applies to a model with two independent variables. When there are more than two independent variables and we leave one out, things get much more complicated: all of the estimated $\hat{\beta}$'s can be biased, and the bias of $\hat{\beta}_j$ in the short regression generally depends both on the relationship between the omitted variable and all of the $x$'s and on the relationships among the $x$'s themselves. Generally speaking, however, if we assume that $x_j$ is uncorrelated with the other included $x$'s, then we can say something about the bias of the estimated coefficient.

Suppose the true model is

$$\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 abil + u,$$

where $exper$ is years of experience, and we leave out $abil$. If we assume that education and experience are uncorrelated, then the expectation of $\hat{\beta}_1$ in the regression

$$\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + u$$

will be

$$E(\hat{\beta}_1) = \beta_1 + \beta_3 \frac{\widehat{Cov}(educ, abil)}{\widehat{Var}(educ)},$$

which is essentially the same as formula (7) above. We would therefore expect $\hat{\beta}_1$ to be biased upward if $\beta_3$ is positive and education and ability are positively correlated.

Economists often try to sign the bias using this formula regardless of whether $x_j$, the variable of interest in the short regression, is uncorrelated with the other included $x$'s. This is a reasonable shortcut, but it is important to know that it isn't exactly right, especially if the included $x$'s are highly correlated.²

²The general form for the estimate of $\beta_j$ in the short regression when $x_k$ is omitted is $\tilde{\beta}_j = \hat{\beta}_j + \hat{\beta}_k \hat{\delta}_j$, where $\hat{\delta}_j$ is the coefficient on $x_j$ in the regression of $x_k$ on all of the included regressors. See Wooldridge Section 3A for more detail.
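To see the formula in action, here is a minimal Stata simulation sketch of omitted variable bias. Everything in it (variable names, parameter values, sample size) is invented for illustration; education is built to be correlated with ability but independent of experience, matching the assumption above.

    * Minimal omitted-variable-bias simulation (all values invented)
    clear
    set seed 12345
    set obs 1000

    * educ is positively correlated with abil; exper is independent
    generate abil  = rnormal()
    generate educ  = 12 + 2*abil + rnormal()
    generate exper = rnormal(10, 3)
    generate lwage = 1 + 0.08*educ + 0.02*exper + 0.10*abil + rnormal(0, 0.3)

    * Long regression: the educ coefficient is close to the true 0.08
    regress lwage educ exper abil

    * Short regression omitting abil: educ is biased upward, since
    * beta_3 = 0.10 > 0 and Cov(educ, abil) = 2 > 0. Expected bias is
    * roughly 0.10 * Cov(educ, abil)/Var(educ) = 0.10 * 2/5 = 0.04
    regress lwage educ exper

Running the two regressions side by side, the short-regression educ coefficient should land near 0.12 rather than 0.08, with the gap matching the formula.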
7 Variance of OLS Estimators

To obtain the variance of the OLS estimators, we need to make an assumption analogous to SLR.5:

MLR.5 (Homoskedasticity): The error term in the OLS equation described by MLR.1 has constant variance: $Var(u \mid x_1, \ldots, x_k) = \sigma^2$.

With assumptions MLR.1-MLR.5, the variance of an OLS estimator $\hat{\beta}_j$ is given by

$$Var(\hat{\beta}_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)},$$

where $SST_j = \sum_i (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the R-squared from the regression of $x_{ij}$ on all of the other independent variables.

8 Estimating σ²

We obtain an unbiased estimator of $\sigma^2$ in a similar manner to the bivariate case:

$$\hat{\sigma}^2 = \frac{1}{n - k - 1} \sum_i \hat{u}_i^2.$$

The denominator $n - k - 1$ equals the number of observations minus the total number of parameters estimated in the model. Using this estimate, we can estimate the variances of the OLS estimators as

$$\widehat{Var}(\hat{\beta}_j) = \frac{\hat{\sigma}^2}{SST_j (1 - R_j^2)}$$

and the standard errors as

$$se(\hat{\beta}_j) = \sqrt{\widehat{Var}(\hat{\beta}_j)}.$$

The Gauss-Markov Theorem

Our OLS estimators are actually a special case of a more general class of estimators called linear estimators. Linear estimators take the form

$$\tilde{\beta}_j = \sum_i w_{ij} y_i,$$

where each $w_{ij}$ can be a function of any or all of the sample values of the independent variables. The Gauss-Markov Theorem states that under assumptions MLR.1-MLR.5, the OLS estimators are the best linear unbiased estimators (BLUE) of $\beta_j$ for $j = 1, \ldots, k$. This means that within the class of linear unbiased estimators, the OLS estimators have the lowest variance (i.e., they are "best").
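As a concrete check on these formulas, the following Stata sketch rebuilds one coefficient's standard error by hand from $\hat{\sigma}^2$, $SST_1$, and $R_1^2$. The dataset and the names y, x1, x2, x3 are hypothetical placeholders.

    * Rebuild se(beta_1) by hand (y, x1, x2, x3 are placeholders)
    regress y x1 x2 x3
    scalar se_direct = _se[x1]    // standard error reported by Stata
    scalar sig2 = e(rmse)^2       // sigma-hat^2 = SSR/(n-k-1)

    * Auxiliary regression of x1 on the other regressors gives R^2_1
    regress x1 x2 x3
    scalar r2_1 = e(r2)

    * SST_1 is (n-1) times the sample variance of x1
    quietly summarize x1
    scalar sst_1 = r(Var)*(r(N) - 1)

    * Variance formula; its square root should equal se_direct
    scalar v1 = sig2/(sst_1*(1 - r2_1))
    display sqrt(v1) "  " se_direct

The auxiliary $R_1^2$ also shows why highly collinear regressors have large standard errors: as $R_1^2$ approaches 1, the denominator shrinks and $Var(\hat{\beta}_1)$ blows up.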
9 Inference

9.1 Distributions of OLS estimators

To obtain the distributions of our estimators, we either assume the distribution of $u$ or invoke the central limit theorem in large samples. For small samples, we need the assumption:

MLR.6 (Normality): $u \sim N(0, \sigma^2)$.

Under assumptions MLR.1-MLR.6, conditional on the $x$'s,

$$\hat{\beta}_j \sim N(\beta_j, Var(\hat{\beta}_j)).$$

Thus

$$\frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-k-1}.$$

9.2 Central Limit Theorem (for large samples)

In large samples ($n \geq 30$), we can apply the central limit theorem:

$$\frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim N(0, 1).$$

It is also true that in large samples our estimated $\hat{\beta}$'s are consistent estimators of the true $\beta$'s: $\text{plim}\, \hat{\beta}_j = \beta_j$. In the multivariate case the proof is tricky, so we won't cover it here.

9.3 Inference

Now that we know the mean, variance, and distribution of the OLS estimators, we can test hypotheses about a single $\beta_j$ just as we have been doing all along. Since these tests are carried out in exactly the same manner as in the bivariate case, we won't cover them here.

9.4 Testing multiple hypotheses about multiple β's: The F Test

Oftentimes we are interested in testing potentially many hypotheses about multiple $\beta$'s all at the same time. Suppose, for example, we have the model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$

and we would like to test the hypothesis

$$H_0: \beta_1 = \beta_2 \qquad H_A: \beta_1 \neq \beta_2.$$

We can construct an F test of this hypothesis.³ There are three steps to the test:

1. Construct and estimate a restricted model that imposes the null hypothesis. In the example above we want to impose $\beta_1 = \beta_2$. We can do this by substituting for $\beta_2$ and regressing

$$y = \beta_0 + \beta_1 x_1 + \beta_1 x_2 + u = \beta_0 + \beta_1 (x_1 + x_2) + u.$$

To operationalize this in Stata, we construct a new variable equal to $x_1 + x_2$ and regress $y$ on that new variable.

2. Construct the F statistic:

$$F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)} = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n - k - 1)},$$

where $SSR_r$ is the sum of squared residuals in the restricted regression, $SSR_{ur}$ is the sum of squared residuals in the unrestricted regression, and $q$ is the number of restrictions. In the example above we have one restriction, so $q = 1$.

3. Under the null hypothesis, the F statistic has an F distribution with $(q, n - k - 1)$ degrees of freedom. We look up the critical value $c$ of the test based on the degrees of freedom and reject if $F > c$.

³Note that in this case, because we are dealing with only one linear combination of parameters, there are other ways to test this hypothesis. See Wooldridge Section 4.4 for details.

Example 1 (Wooldridge 4.9): Suppose we are interested in estimating the determinants of the number of minutes per week that a person sleeps. We estimate the equation (standard errors in parentheses)

$$\widehat{sleep} = \underset{(112.28)}{3{,}638.25} - \underset{(0.017)}{0.148}\, totwrk - \underset{(5.88)}{11.13}\, educ + \underset{(1.45)}{2.20}\, age$$

$$n = 706, \quad R^2 = 0.113.$$

We see that neither $educ$ nor $age$ is significant at the 5% level. Let's instead test whether they are jointly significant, that is,

$$H_0: \beta_{educ} = \beta_{age} = 0 \qquad H_A: \beta_{educ} \neq 0 \text{ or } \beta_{age} \neq 0.$$

To carry out this test, we estimate the restricted regression as

$$\widehat{sleep} = \underset{(38.91)}{3{,}586.38} - 0\ldots$$
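To carry the joint test through numerically, here is a Stata sketch of the restricted/unrestricted F computation. It assumes the Wooldridge SLEEP75 variables (sleep, totwrk, educ, age) are loaded in memory.

    * F test of H0: beta_educ = beta_age = 0 (q = 2 restrictions)
    regress sleep totwrk educ age    // unrestricted model
    scalar ssr_ur = e(rss)
    scalar df_ur  = e(df_r)          // n - k - 1

    regress sleep totwrk             // restricted model drops educ, age
    scalar ssr_r = e(rss)

    scalar F = ((ssr_r - ssr_ur)/2) / (ssr_ur/df_ur)
    display F "  " Ftail(2, df_ur, F)   // F statistic and its p-value

After the unrestricted regression, Stata's built-in command "test educ age" reports the same F statistic and p-value in one step, so the manual computation is mainly useful for seeing where the formula comes from.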