# CORNELL ECON 3120 - Proxy Variables (2 pages)


## Proxy Variables


- Lecture number: 19
- Pages: 2
- Type: Lecture Note
- School: Cornell University
- Course: Econ 3120 - Applied Econometrics
- Edition: 1


Econ 3120, 1st Edition, Lecture 19

**Outline of Current Lecture**

I. Generalized Least Squares and Feasible Generalized Least Squares

**Current Lecture**

II. Proxy Variables

### Proxy Variables

As we have seen, excluding key explanatory variables from a regression can bias the coefficient estimates. Consider the regression

$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{exper} + \beta_3\,\text{abil} + u \tag{1}$$

If we run a regression that omits ability, then the composite error term $\beta_3\,\text{abil} + u$ could be correlated with $\text{educ}$, leading to biased estimates of $\beta_1$. What we have done at various points in this class is use a proxy variable for $\text{abil}$. In this case, let's use IQ and estimate

$$\log(\text{wage}) = \alpha_0 + \alpha_1\,\text{educ} + \alpha_2\,\text{exper} + \alpha_3\,\text{IQ} + e \tag{2}$$

How do we relate equations (1) and (2)? Consider the auxiliary equation that relates IQ and ability:

$$\text{abil} = \delta_0 + \delta_1\,\text{IQ} + \nu \tag{3}$$

Substituting (3) into (1) yields

$$\begin{aligned}
\log(\text{wage}) &= \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{exper} + \beta_3(\delta_0 + \delta_1\,\text{IQ} + \nu) + u \\
&= (\beta_0 + \beta_3\delta_0) + \beta_1\,\text{educ} + \beta_2\,\text{exper} + \beta_3\delta_1\,\text{IQ} + \beta_3\nu + u
\end{aligned}$$

Matching terms with (2) gives $\alpha_0 = \beta_0 + \beta_3\delta_0$, $\alpha_1 = \beta_1$, $\alpha_2 = \beta_2$, $\alpha_3 = \beta_3\delta_1$, and $e = \beta_3\nu + u$.

From this equation we can see that the key assumption is

$$E(\beta_3\nu + u \mid \text{educ}, \text{exper}, \text{IQ}) = 0$$

In particular, this means that $\nu$, the unobserved component of ability, cannot be related to education or experience. In other words, after IQ is controlled for, the remaining variation in ability is uncorrelated with the $x$'s.

A natural question is therefore when we can expect $\nu$ to be uncorrelated with education or experience. This is difficult, if not impossible, to prove. The argument against IQ as a valid proxy variable would be that IQ isn't the only component of ability that leads to more schooling. For example, a drive to succeed may not be related to IQ but could lead to higher education and earnings.

#### 1.1 Using a Lagged Dependent Variable as a Proxy Variable

If we have a panel of data that spans multiple periods, we can sometimes use the lagged (last-period) value of the dependent variable as a proxy variable. Suppose we are interested in measuring the impact of a remedial education program on child learning, and we have child test scores before and after the program. We run the model

$$\text{score}_{i,t} = \beta_0 + \beta_1\,\text{program}_i + \beta_2\,\text{score}_{i,t-1} + u$$

where $\text{program}_i$ indicates whether the child was in the ...
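
Returning to the IQ-for-ability proxy argument, a small simulation may help make the key assumption concrete. The sketch below is illustrative only and is not from the lecture. It writes the auxiliary equation as abil = δ0 + δ1·IQ + ν (symbols assumed here), and every parameter value, the helper names `ols` and `simulate`, and the way `educ` is generated are hypothetical choices. Education is made to depend on ability only through IQ, so ν is unrelated to the regressors by construction, which is exactly the condition the proxy argument needs.

```python
import random

# Hypothetical "true" parameters, chosen only for illustration.
BETA = (1.0, 0.08, 0.02, 0.10)  # beta_0, beta_1 (educ), beta_2 (exper), beta_3 (abil)
DELTA = (-5.0, 0.05)            # delta_0, delta_1 in abil = delta_0 + delta_1*IQ + nu


def ols(y, X):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y.

    X is a list of rows; each row must already include a leading 1.0 for the intercept.
    """
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(A[r][c] * coef[c] for c in range(r + 1, k))
        coef[r] = (b[r] - tail) / A[r][r]
    return coef


def simulate(n=20000, seed=0):
    """Simulate log wages and fit the regression without ability and with IQ as a proxy."""
    rng = random.Random(seed)
    y, X_short, X_proxy = [], [], []
    for _ in range(n):
        iq = rng.gauss(100, 15)
        nu = rng.gauss(0, 0.5)                   # part of ability not explained by IQ
        abil = DELTA[0] + DELTA[1] * iq + nu
        educ = 4 + 0.08 * iq + rng.gauss(0, 2)   # related to ability only through IQ
        exper = rng.uniform(0, 30)
        u = rng.gauss(0, 0.3)
        y.append(BETA[0] + BETA[1] * educ + BETA[2] * exper + BETA[3] * abil + u)
        X_short.append([1.0, educ, exper])       # ability omitted, no proxy
        X_proxy.append([1.0, educ, exper, iq])   # IQ stands in for ability
    return ols(y, X_short), ols(y, X_proxy)


b_short, b_proxy = simulate()
print("educ coefficient, ability omitted:", round(b_short[1], 4))
print("educ coefficient, IQ as proxy:    ", round(b_proxy[1], 4))
print("IQ coefficient (targets beta_3 * delta_1):", round(b_proxy[3], 4))
```

Under these assumptions, the coefficient on `educ` in the regression that omits ability should sit noticeably above the assumed value of 0.08, while the regression with IQ should recover a value close to 0.08 and an IQ coefficient near β3·δ1 = 0.10 × 0.05 = 0.005. If `educ` were instead generated from `abil` directly, so that ν also drove schooling (the "drive to succeed" objection), the proxy regression would no longer remove the bias.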
