Omitted Variable Bias with Many Regressors (2 pages)
- Lecture number: 15
- Lecture Note
- Cornell University
- Econ 3120 - Applied Econometrics
Lecture 15

Outline of Current Lecture
I. Omitted Variable Bias with Many Regressors

Current Lecture
II. Dummy Variables

Dummy Variables

Dummy variables (also known as binary, indicator, or dichotomous variables) are variables that take on a value of 0 or 1. Each one indicates a single status of the observation. Some examples:

- female (=1 for female, =0 for male)
- non-white (=1 if race is non-white, =0 if white)
- urban (=1 if the person lives in an urban area, =0 if the person lives in a rural area)

Note that we could also define our dummy variables to indicate male, white, or rural; it turns out not to matter which category is coded as 1 (more on this below).

Dummy variables change the intercept of the regression equation. For example, suppose we want to examine the relationship between test scores and class sizes in primary schools. We think that the gender of the child also has an effect on test scores, so we include it in the model. We therefore model the relationship as

$$score = \beta_0 + \beta_1 female + \beta_2 clsize + u \qquad (1)$$

How do we interpret $\beta_1$? It represents a shift in the intercept associated with the gender of the child. To see this, take the conditional expectation for males and for females:

$$E(score \mid female = 0, clsize) = \beta_0 + \beta_2 clsize$$
$$E(score \mid female = 1, clsize) = \beta_0 + \beta_1 + \beta_2 clsize$$

The difference between these two equations is simply a shift in the intercept from $\beta_0$ to $\beta_0 + \beta_1$.

[Figure: score (vertical axis) against class size (horizontal axis), with one line labeled female and one labeled male, both with slope $\beta_2$; $\beta_0$ and $\beta_1$ are marked on the vertical axis.]

This interpretation easily generalizes to situations with more independent variables. The coefficients on the continuous variables (i.e., "slope coefficients") remain the same for different values of the dummy variable, but the dummy variable shifts the intercept.

What would happen if you included the dummy variable male in the equation, where male = 1 if the child is male and 0 if the child is female?
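Before turning to that question, the intercept-shift reading of the coefficient on female in equation (1) can be checked with a few lines of arithmetic. The sketch below is in Python. The coefficient values and the helper name `expected_score` are made up purely for illustration and do not come from the lecture or from any data.

```python
# Hypothetical coefficient values, chosen only for illustration
# (they are not estimates from the lecture or from any dataset).
BETA0 = 70.0   # intercept: expected score for males when clsize = 0
BETA1 = 4.0    # coefficient on the female dummy
BETA2 = -0.5   # slope on class size


def expected_score(female, clsize):
    """Conditional expectation E(score | female, clsize) from equation (1)."""
    return BETA0 + BETA1 * female + BETA2 * clsize


# Female-minus-male gap in expected score at several class sizes.
gaps = [expected_score(1, n) - expected_score(0, n) for n in (15, 20, 25, 30)]
print(gaps)  # prints [4.0, 4.0, 4.0, 4.0]
```

The gap equals the assumed value of the female coefficient at every class size, which is what it means for the two lines to be parallel with different intercepts.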
You would then be running the regression

$$score = \beta_0 + \beta_1 female + \beta_2 clsize + \beta_3 male + u$$

It is not possible to run this regression, because male is an exact linear function of female (male = 1 - female). This perfect collinearity violates Assumption MLR.3. If you tried to do this in Stata, the program would drop one of these dummy variables for you. Thus, you could include either male or female, but not both.

It turns out not to matter which one you include. Suppose you ran the regression

$$score = \alpha_0 + \alpha_1 male + \alpha_2 clsize + u \qquad (2)$$

Substituting male = 1 - female gives

$$score = (\alpha_0 + \alpha_1) - \alpha_1 female + \alpha_2 clsize + u$$

so (2) becomes (1) when you set $\alpha_0 = \beta_0 + \beta_1$, $\alpha_1 = -\beta_1$, and $\alpha_2 = \beta_2$.

Note that we can also use dummy variables when we have more than two categories. Suppose that we have 3 categories for race: white, black, and other. We run the regression including two of these variables:

$$score = \beta_0 + \beta_2 clsize + \beta_3 white + \beta_4 black + u$$

Again, we have to exclude other, since other = 1 - white - black.

Interactions between dummy variables

We can interact dummy variables to ...
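The equivalence of regressions (1) and (2), and the exact linear dependence that forces one category to be left out, can both be checked numerically. The Python sketch below uses made-up coefficient values and made-up observations; none of the numbers come from the lecture, and all variable names are illustrative.

```python
# Hypothetical coefficients for regression (1); illustrative values only.
beta0, beta1, beta2 = 70.0, 4.0, -0.5

# Implied coefficients for regression (2), using the mapping stated in the text.
alpha0 = beta0 + beta1
alpha1 = -beta1
alpha2 = beta2

# A few made-up observations: (female, clsize).
observations = [(1, 18), (0, 18), (1, 25), (0, 30), (1, 22)]

max_diff = 0.0     # largest gap between the two models' systematic parts
collinear = True   # does 1 - female - male = 0 hold in every row?
for female, clsize in observations:
    male = 1 - female
    fit1 = beta0 + beta1 * female + beta2 * clsize    # systematic part of (1)
    fit2 = alpha0 + alpha1 * male + alpha2 * clsize   # systematic part of (2)
    max_diff = max(max_diff, abs(fit1 - fit2))
    # The constant, female, and male are linearly dependent in every row,
    # which is why both dummies cannot enter alongside an intercept.
    collinear = collinear and (1 - female - male == 0)

# The same dependence with three race categories: other = 1 - white - black.
race_ok = True
for race in ("white", "black", "other"):
    white, black, other = int(race == "white"), int(race == "black"), int(race == "other")
    race_ok = race_ok and (other == 1 - white - black)

print(max_diff, collinear, race_ok)  # prints 0.0 True True
```

Both parameterizations give identical systematic parts for every observation, so the choice of which category to omit changes how the coefficients are read but not what the model implies.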