Oct. 19, 2010 LEC #8 ECON 140A/240A L. Phillips
Correlation and Analysis of Variance

I. Introduction

Pursuing the results from the ordinary least squares estimates of the linear model in the previous lecture, we investigate correlation, measures of goodness of fit, and analysis of variance. Then we turn to issues of central tendency and dispersion for the parameter estimates of the intercept and slope.

II. Correlation

Recall that in Lectures Four and Six we introduced the covariance between y and x as a measure of the relationship between these variables:

  E[(y − Ey)(x − Ex)] = Cov(y, x). (1)

The covariance depends on the units of measurement, so it is not a relative measure. We can go back to the linear model, Eq. (4) of the previous lecture, and see the relationship between the variance of y and the covariance of y and x:

  y_i = a + b x_i + u_i. (2)

Taking expectations,

  Ey = a + b Ex + Eu. (3)

The mean of the error term, Eu, is zero by assumption, and we have seen that the sample mean of the estimated residuals is zero, i.e. Σ_i û_i / n = 0.

Subtract Eq. (3) from Eq. (2) to write the model in deviation form:

  (y − Ey) = b (x − Ex) + u. (4)

Multiply both sides by (x − Ex) and take expectations. If the error u is independent of the explanatory variable, which is another assumption of regression analysis, then

  E[(y − Ey)(x − Ex)] = b E(x − Ex)² + E[(x − Ex) u] = b Var x. (5)

Solving for the slope b,

  b = Cov(y, x)/Var x. (6)

We can estimate the parameter b using the method of moments, which substitutes the sample estimates for the population values of Cov(y, x) and Var x:

  b̂ = Σ_i (y_i − Σ_i y_i/n)(x_i − Σ_i x_i/n) / Σ_i (x_i − Σ_i x_i/n)². (7)

This is the same formula as the ordinary least squares solution given by Eq. (15) in the previous lecture. To see this, rewrite Eq. (7) above as

  b̂ = Σ_i (y_i − ȳ)(x_i − x̄) / Σ_i (x_i − x̄)², (8)

and expanding,

  b̂ = Σ_i (y_i x_i − ȳ x_i − x̄ y_i + x̄ ȳ) / Σ_i (x_i² − 2 x̄ x_i + x̄²), (9)

and taking summations term by term,

  b̂ = [Σ_i y_i x_i − ȳ Σ_i x_i − x̄ Σ_i y_i + n x̄ ȳ] / [Σ_i x_i² − 2 x̄ Σ_i x_i + n x̄²], (10)

and collecting terms, using Σ_i x_i = n x̄ and Σ_i y_i = n ȳ,

  b̂ = (Σ_i y_i x_i − n x̄ ȳ) / (Σ_i x_i² − n x̄²), (11)

and multiplying numerator and denominator by n:

  b̂ = [n Σ_i y_i x_i − (Σ_i y_i)(Σ_i x_i)] / [n Σ_i x_i² − (Σ_i x_i)²], (12)

the same as Eq. (15), as promised.

We can use Eq. (4) to pursue the relationship among the variances of y, x, and u. Square both sides of Eq. (4) and take expectations:

  E(y − Ey)² = b² E(x − Ex)² + 2b E[(x − Ex) u] + Eu², (13)

or

  Var y = b² Var x + Var u, (14)

since another assumption of least squares, as discussed above, is that the explanatory variable x is independent of the error u, so their covariance is zero, i.e. E[(x − Ex) u] = 0. Thus the total variance of y can be decomposed into two parts: the variance explained by y's dependence on x, b² Var x, called the signal, and the unexplained variance, Var u, called the noise.

Combining Eqs. (6) and (14),

  Var y = [Cov(y, x)]²/Var x + Var u, (15)

and dividing by the variance of y:

  1 = [Cov(y, x)]²/(Var x Var y) + Var u/Var y, (16)

where 1 − Var u/Var y is one minus the ratio of the unexplained variance to the total variance, i.e.

  1 − Var u/Var y = 1 − unexplained variance/total variance (17)
    = (total variance − unexplained variance)/total variance (18)
    = explained variance/total variance, (19)

and from Eq. (16), this fraction of the total variance that is explained is

  explained variance/total variance = [Cov(y, x)]²/(Var x Var y). (20)

Note that squaring the covariance and dividing by the variance of y and the variance of x cancels the units of measurement, leaving a relative measure called R², the coefficient of determination. This coefficient, which measures the fraction of the variance of y explained by its dependence on x, is consequently a measure of goodness of fit.
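The slope formula and the variance decomposition above can be checked numerically. Below is a minimal sketch in Python (not part of the original lecture); the arrays x and y are made-up illustrative numbers, and the function names are ours:

```python
# Numerical sketch of Eqs. (8), (12), (14), and (20): the method-of-moments
# slope estimate, its raw-sums form, and the split of the total variation in y
# into signal (b^2 * Var x) and noise (Var u). Data are made-up illustration.

def slope_hat(y, x):
    """Eq. (8): sample Cov(y, x) over sample Var x (deviation form)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((yi - ybar) * (xi - xbar) for yi, xi in zip(y, x))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

def slope_hat_raw(y, x):
    """Eq. (12): the algebraically identical raw-sums form."""
    n = len(x)
    num = n * sum(yi * xi for yi, xi in zip(y, x)) - sum(y) * sum(x)
    den = n * sum(xi ** 2 for xi in x) - sum(x) ** 2
    return num / den

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b = slope_hat(y, x)
assert abs(b - slope_hat_raw(y, x)) < 1e-12  # Eqs. (8) and (12) agree

# Sample analogue of the decomposition in Eq. (14):
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
a = ybar - b * xbar                            # fitted line passes through the means
u = [yi - a - b * xi for yi, xi in zip(y, x)]  # estimated residuals

total = sum((yi - ybar) ** 2 for yi in y)               # total variation
explained = b ** 2 * sum((xi - xbar) ** 2 for xi in x)  # signal
unexplained = sum(ui ** 2 for ui in u)                  # noise

assert abs(total - (explained + unexplained)) < 1e-9
r_squared = explained / total  # Eq. (20): goodness of fit
```

The final assertion mirrors Eq. (16): after the least-squares fit, the total variation splits exactly into the explained and unexplained pieces, so r_squared lies between zero and one.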
In the bivariate regression of y on x, the coefficient of determination, R², is just the square of the correlation coefficient, r, between y and x, which is a relative (unitless) measure of the interdependence of y and x:

  r = Cov(y, x)/[√(Var y) √(Var x)], so that R² = r². (21)

The sample correlation coefficient, r̂, can be estimated by the method of moments, substituting sample sums of squares for the covariance and variances:

  r̂ = [Σ_i (y_i − ȳ)(x_i − x̄)/(n − 1)] / {√[Σ_i (y_i − ȳ)²/(n − 1)] √[Σ_i (x_i − x̄)²/(n − 1)]},

or, canceling the factors of (n − 1),

  r̂ = Σ_i (y_i − ȳ)(x_i − x̄) / {√[Σ_i (y_i − ȳ)²] √[Σ_i (x_i − x̄)²]}. (22)

The correlation coefficient ranges from minus one, if the correlation is negative and perfect, through zero for no correlation, up to one if the correlation is positive and perfect, i.e. −1 ≤ r ≤ 1.

In multivariate regression, where y depends on two or more explanatory variables or regressors, the estimated coefficient of determination, R̂², can be calculated from the sum of squared residuals and the sum of squared deviations of y around its mean:

  R̂² = 1 − Σ_i û_i² / Σ_i (y_i − ȳ)². (23)

III. Analysis of Variance

The results of a regression analysis can be summarized in a table of analysis of variance, or ANOVA, as depicted in Table 1.

Table 1: Table of Analysis of Variance: Bivariate Regression

  Source of Variation   Sum of Squares         Degrees of Freedom   Mean Square
  Explained by x        b̂² Σ_i (x_i − x̄)²      1                    b̂² Σ_i (x_i − x̄)²/1
  Unexplained           Σ_i û_i²               n − 2                Σ_i û_i²/(n − 2)
  Total                 Σ_i (y_i − ȳ)²         n − 1                Σ_i (y_i − ȳ)²/(n − 1)

A key to understanding ANOVA is that the total sum of squared deviations of the dependent variable y from its mean can be partitioned into the explained sum of squares and the unexplained sum of squares, in a fashion parallel to how we partitioned the population variance. To see this, combine Eq. (6), û_i = y_i − ŷ_i, with Eq. (10), ŷ_i = â + b̂ x_i, both from the previous lecture, to obtain

  û_i = (y_i − [â + b̂ x_i]) − (ȳ − [â + b̂ x̄]), (24)

where the second term in parentheses is zero because the fitted line passes through the point of sample means, so

  û_i = (y_i − ȳ) − b̂ (x_i − x̄). (25)

Squaring the observed residual and summing,

  Σ_i û_i² = Σ_i (y_i − ȳ)² + b̂² Σ_i (x_i − x̄)² − 2 b̂ Σ_i (x_i − x̄)(y_i − ȳ). (26)

Note from Eq. (8) that b̂ Σ_i (x_i − x̄)² =
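The partition behind Table 1 and Eqs. (25)–(26) can be verified numerically. A minimal sketch in Python (not from the original lecture; the data are made-up illustrative numbers):

```python
# Sketch of the ANOVA partition in Table 1: after an OLS fit, the total sum of
# squares of y about its mean splits exactly into the part explained by x and
# the residual (unexplained) sum of squares. Data are made-up illustration.

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 2.9, 4.1, 4.8, 6.3, 7.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b_hat = sxy / sxx  # Eq. (8)

# Eq. (25): the residual in deviation form
u_hat = [(yi - ybar) - b_hat * (xi - xbar) for xi, yi in zip(x, y)]

ss_total = sum((yi - ybar) ** 2 for yi in y)   # n - 1 degrees of freedom
ss_explained = b_hat ** 2 * sxx                # 1 degree of freedom
ss_unexplained = sum(ui ** 2 for ui in u_hat)  # n - 2 degrees of freedom

# Eq. (26) rearranged: since b_hat * sxx = sxy by Eq. (8), the cross term
# collapses and the partition holds exactly (up to floating-point rounding).
assert abs(ss_total - (ss_explained + ss_unexplained)) < 1e-9

# Mean squares, as in the last column of Table 1:
ms_explained = ss_explained / 1
ms_unexplained = ss_unexplained / (n - 2)
```

The assertion is exactly the identity the truncated derivation is heading toward: substituting b̂ Σ_i (x_i − x̄)² = Σ_i (x_i − x̄)(y_i − ȳ) into Eq. (26) makes the cross term cancel one copy of the explained sum of squares.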