# I. Goodness of Fit II. Unbiasedness


Lecture number: 11

Pages: 2

Type: Lecture Note

School: Cornell University

Course: Econ 3120 - Applied Econometrics

Edition: 1

Lecture 11

Outline of Current Lecture

I. Regression

Current Lecture

I. Goodness of Fit

II. Unbiasedness

## Goodness-of-fit

First, recall from above that $y_i$ can be decomposed into a fitted value and a residual:

$$y_i = \hat{y}_i + \hat{u}_i$$

In order to analyze goodness-of-fit (how well the regression fits the data), it is useful to define the following:

$$\text{total sum of squares (SST)} = \sum_i (y_i - \bar{y})^2$$

$$\text{explained sum of squares (SSE)} = \sum_i (\hat{y}_i - \bar{y})^2$$

$$\text{residual sum of squares (SSR)} = \sum_i \hat{u}_i^2$$

Note that the explained sum of squares is sometimes called the regression sum of squares or model sum of squares. The total sum of squares can be decomposed into the explained plus the residual sum of squares:

$$SST = SSE + SSR$$

## 4.3 R-squared

The R-squared of a regression gives us a measure of goodness-of-fit. It is defined as

$$R^2 \equiv \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$

In words, this is the fraction of the variation in $y$ that is explained by $x$. Note that the definition implies that

$$0 \le R^2 \le 1$$

Note that in economics it is not uncommon to have an R-squared close to 0. In our regression of wages on schooling, the R-squared equals 0.140. While this implies that variation in schooling does not explain much of the variation in wages, it does not necessarily mean that we have done a poor job of estimating the relationship between schooling and earnings.

## 5 Units of Measurement

When running regressions, sometimes it’s convenient to change the units of measurement so that the regression estimates are easy to read. Consider the following example: Ashraf, Berry and Shapiro (2010) analyze the results of a field experiment in Zambia which estimated the demand for bottles of water purification solution among a sample of 1004 urban households. Bottles were offered for sale to individual households at prices between 300 and 800 Zambian Kwacha (~3600 Kwacha = \$1).
The authors estimate the following demand equation:

$$purchase_i = \beta_0 + \beta_1 \, price_i + u_i$$

where $purchase_i$ is a variable which equals 1 if the household purchased the bottle, and 0 otherwise, and $price_i$ is the price in Kwacha.¹ Estimation of this equation yields the following estimates:

$$\hat{\beta}_0 = 0.9640 \qquad \hat{\beta}_1 = -0.000664$$

This implies that an increase in the price of the bottle by 1 Kwacha lowered the purchase probability by 0.066 percentage points. However, 1 Kwacha is a very small amount relative to the prices at which the bottles were offered. Thus, it might be more useful to estimate

$$purchase_i = \beta_0 + \alpha_1 (price_i / 100) + u_i$$

This yields the estimates

$$\hat{\beta}_0 = 0.9640 \qquad \hat{\alpha}_1 = -0.0664$$

The estimates imply that a price increase of 100 Kwacha lowered the purchase probability by 6.6 percentage points. This is the same result as the estimate above, but it is clearer from a presentation standpoint. The estimate of $\alpha_1$ makes sense because the two equations are equivalent when $\beta_1 = \alpha_1 / 100$.

¹ This type of model is called the linear probability model. Instead of predicting purchase or no purchase as a 0/1 variable, ...
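The following is a minimal numerical sketch of the two ideas in these notes: the decomposition SST = SSE + SSR with the resulting R-squared, and the effect of rescaling a regressor. It runs on simulated data, not the data from Ashraf, Berry and Shapiro (2010). The price grid (steps of 50 Kwacha), the random seed, and the helper names `ols` and `fit_stats` are assumptions made for illustration, and the simulation simply uses the reported coefficients as purchase probabilities.

```python
import random


def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fit_stats(x, y):
    """Fit OLS and return the coefficients, SST, SSE, SSR and R-squared."""
    b0, b1 = ols(x, y)
    y_bar = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    sse = sum((fi - y_bar) ** 2 for fi in fitted)          # explained
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual
    return {"b0": b0, "b1": b1, "sst": sst, "sse": sse, "ssr": ssr,
            "r2": sse / sst}


# Simulated sample (assumption): 1004 households, prices drawn from a
# hypothetical grid between 300 and 800 Kwacha, purchase drawn with
# probability 0.9640 - 0.000664 * price.
rng = random.Random(0)
prices = [rng.choice(range(300, 801, 50)) for _ in range(1004)]
purchase = [1 if rng.random() < 0.9640 - 0.000664 * p else 0 for p in prices]

# Regression with price in Kwacha, then with price in hundreds of Kwacha.
in_kwacha = fit_stats(prices, purchase)
in_hundreds = fit_stats([p / 100 for p in prices], purchase)

print(f"price in Kwacha:     b0={in_kwacha['b0']:.4f}  b1={in_kwacha['b1']:.6f}")
print(f"price in hundreds:   b0={in_hundreds['b0']:.4f}  a1={in_hundreds['b1']:.4f}")
print(f"SST={in_kwacha['sst']:.3f}  SSE+SSR={in_kwacha['sse'] + in_kwacha['ssr']:.3f}")
print(f"R-squared (both regressions): {in_kwacha['r2']:.4f}, {in_hundreds['r2']:.4f}")
```

In this sketch, dividing price by 100 multiplies the slope by 100 and leaves the intercept, the fitted values, and therefore the R-squared unchanged, which is why the rescaled regression carries the same information in a more readable form.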
