PSU STAT 501 - Analysis of variance approach to regression analysis


Analysis of variance approach to regression analysis

… an (alternative) approach to testing for a linear association

Example: Mortality and Latitude

The regression equation is
Mort = 389 - 5.98 Lat

Predictor     Coef   SE Coef      T      P
Constant    389.19     23.81  16.34  0.000
Lat        -5.9776    0.5984  -9.99  0.000

S = 19.12   R-Sq = 68.0%   R-Sq(adj) = 67.3%

Analysis of Variance
Source          DF     SS     MS      F      P
Regression       1  36464  36464  99.80  0.000
Residual Error  47  17173    365
Total           48  53637

[Plot: Mortality (deaths per 10 million) versus Latitude (at center of state), with the fitted line $\hat{y} = 389.19 - 5.98x$ and the mean line $\bar{y} = 152.88$. Annotated sums of squares: $\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = 17173$, $\sum_{i=1}^{n}(y_i - \bar{y})^2 = 53637$, $\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 = 36464$.]

Example: Height and GPA

The regression equation is
gpa = 3.41 - 0.0066 height

Predictor      Coef  SE Coef      T      P
Constant      3.410    1.435   2.38  0.023
height     -0.00656  0.02143  -0.31  0.761

S = 0.5423   R-Sq = 0.3%   R-Sq(adj) = 0.0%

Analysis of Variance
Source          DF      SS      MS     F      P
Regression       1  0.0276  0.0276  0.09  0.761
Residual Error  33  9.7055  0.2941
Total           34  9.7331

[Plot: gpa versus height, with the fitted line $\hat{y} = 3.41 - 0.0066x$ and the mean line $\bar{y} = 2.97$. Annotated sums of squares: $\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = 9.7055$, $\sum_{i=1}^{n}(y_i - \bar{y})^2 = 9.7331$, $\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 = 0.0276$.]

The basic idea

• Break down the variation in Y (“total sum of squares”) into two components:
  – a component that is “due to” the change in X (“regression sum of squares”)
  – a component that is just due to random error (“error sum of squares”)
• If the regression sum of squares is a large component of the total sum of squares, it suggests that there is a linear association.
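As a quick numerical check of the last bullet, the sketch below computes the share SSR/SSTO for the two examples, using only the sums of squares reported in the Minitab output above. The dictionary layout and the helper name `regression_share` are illustrative choices, not part of the original slides.

```python
# Sums of squares copied from the two Minitab ANOVA tables above.
examples = {
    "Mortality vs. latitude": {"SSR": 36464.0, "SSE": 17173.0, "SSTO": 53637.0},
    "GPA vs. height": {"SSR": 0.0276, "SSE": 9.7055, "SSTO": 9.7331},
}


def regression_share(ss):
    """Fraction of the total sum of squares attributed to the regression."""
    return ss["SSR"] / ss["SSTO"]


shares = {name: regression_share(ss) for name, ss in examples.items()}

for name, ss in examples.items():
    # The two components should add up to the total (up to rounding).
    assert abs(ss["SSR"] + ss["SSE"] - ss["SSTO"]) < 1e-6 * max(1.0, ss["SSTO"])
    print(f"{name}: SSR/SSTO = {shares[name]:.3f}")
```

The share is about 0.680 for mortality and latitude and about 0.003 for height and GPA, which agrees with the R-Sq values of 68.0% and 0.3% in the output: a large share in the first example, and almost none in the second.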

$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$

The above decomposition holds for the sum of the squared deviations, too:

$\sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$

• $\sum_{i=1}^{n}(y_i - \bar{y})^2$ is the total sum of squares (SSTO)
• $\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$ is the regression sum of squares (SSR)
• $\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ is the error sum of squares (SSE)

$SSTO = SSR + SSE$

Breakdown of degrees of freedom

$n - 1 = 1 + (n - 2)$

• $n - 1$: degrees of freedom associated with SSTO
• $1$: degrees of freedom associated with SSR
• $n - 2$: degrees of freedom associated with SSE

Example: Mortality and Latitude

The regression equation is
Mort = 389 - 5.98 Lat

Predictor     Coef   SE Coef      T      P
Constant    389.19     23.81  16.34  0.000
Lat        -5.9776    0.5984  -9.99  0.000

S = 19.12   R-Sq = 68.0%   R-Sq(adj) = 67.3%

Analysis of Variance
Source          DF     SS     MS      F      P
Regression       1  36464  36464  99.80  0.000
Residual Error  47  17173    365
Total           48  53637

Definitions of Mean Squares

We already know the mean square error (MSE) is defined as:

$MSE = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n - 2} = \frac{SSE}{n - 2}$

Similarly, the regression mean square (MSR) is defined as:

$MSR = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{1} = \frac{SSR}{1}$

Analysis of Variance (ANOVA) Table

Source          DF     SS    MS                 F
Regression      1      SSR   MSR = SSR/1        F* = MSR/MSE
Residual Error  n - 2  SSE   MSE = SSE/(n - 2)
Total           n - 1  SSTO

Expected Mean Squares

$E(MSR) = \sigma^2 + \beta_1^2 \sum_{i=1}^{n}(X_i - \bar{X})^2$

$E(MSE) = \sigma^2$

• If β1 = 0, we’d expect the ratio MSR/MSE to be …
• If β1 ≠ 0, we’d expect the ratio MSR/MSE to be …
• Use the ratio MSR/MSE to decide whether or not to reject β1 = 0.

The formal F-test for slope parameter β1

Null hypothesis H0: β1 = 0
Alternative hypothesis HA: β1 ≠ 0

Test statistic: $F^* = \frac{MSR}{MSE}$

P-value = What is the probability that we’d get an F* statistic as large as we did, if the null hypothesis is true?
(One-tailed test!)

The P-value is determined by comparing F* to an F distribution with 1 numerator degree of freedom and n-2 denominator degrees of freedom.

Winning times (in seconds) in Men’s 200 meter Olympic sprints, 1900-1996. Are men getting faster?

Row  Year  Men200m
  1  1900    22.20
  2  1904    21.60
  3  1908    22.60
  4  1912    21.70
  5  1920    22.00
  6  1924    21.60
  7  1928    21.80
  8  1932    21.20
  9  1936    20.70
 10  1948    21.10
 11  1952    20.70
 12  1956    20.60
 13  1960    20.50
 14  1964    20.30
 15  1968    19.83
 16  1972    20.00
 17  1976    20.23
 18  1980    20.19
 19  1984    19.80
 20  1988    19.75
 21  1992    20.01
 22  1996    19.32

[Regression plot: Men200m versus Year. Fitted line: Men200m = 76.1534 - 0.0283833 Year. S = 0.298134, R-Sq = 89.9%, R-Sq(adj) = 89.4%.]

Analysis of Variance Table

Analysis of Variance
Source          DF    SS    MS      F      P
Regression       1  15.8  15.8  177.7  0.000
Residual Error  20   1.8  0.09
Total           21  17.6

• DFE = n-2 = 22-2 = 20
• DFTO = n-1 = 22-1 = 21
• MSR = SSR/1 = 15.8
• MSE = SSE/(n-2) = 1.8/20 = 0.09
• F* = MSR/MSE = 15.796/0.089 = 177.7
• P = Probability that an F(1,20) random variable is greater than 177.7 = 0.000…

For the simple linear regression model, the F-test and t-test are equivalent.

Predictor      Coef  SE Coef       T      P
Constant     76.153    4.152   18.34  0.000
Year        -0.0284  0.00213  -13.33  0.000

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  15.796  15.796  177.7  0.000
Residual Error  20   1.778   0.089
Total           21  17.574

$(-13.33)^2 = 177.7$

$\left(t^*_{(n-2)}\right)^2 = F^*_{(1,\,n-2)}$

Equivalence of F-test to t-test

• For a given α level, the F-test of β1 = 0 versus β1 ≠ 0 is algebraically equivalent to the two-tailed t-test.
• Will get exactly the same P-values, so…
  – If one test rejects H0, then so will the other.
  – If one test does not reject H0, then neither will the other.

Should I use the F-test or the t-test?

• The F-test is only appropriate for testing that the slope differs from 0 (β1 ≠ 0).
• Use the t-test to test that the slope is positive (β1 > 0) or negative (β1 < 0).
• The F-test is more useful for the multiple regression model, when we want to test that more than one slope parameter is 0.

Getting ANOVA table in Minitab

• The Analysis of Variance (ANOVA) Table is default
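
For readers working outside Minitab, the ANOVA quantities for the men's 200 meter example can be reproduced by hand. The following is a minimal sketch in plain Python, assuming the 22 winning times listed earlier; the function name `simple_linear_anova` and the keys of the returned dictionary are illustrative, not anything taken from the slides or from Minitab.

```python
import math

# Men's Olympic 200 m winning times (seconds), copied from the data table.
years = [1900, 1904, 1908, 1912, 1920, 1924, 1928, 1932, 1936, 1948, 1952,
         1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992, 1996]
times = [22.20, 21.60, 22.60, 21.70, 22.00, 21.60, 21.80, 21.20, 20.70, 21.10,
         20.70, 20.60, 20.50, 20.30, 19.83, 20.00, 20.23, 20.19, 19.80, 19.75,
         20.01, 19.32]


def simple_linear_anova(x, y):
    """Least-squares fit of y on x plus the ANOVA decomposition."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # estimated slope
    b0 = y_bar - b1 * x_bar        # estimated intercept
    fitted = [b0 + b1 * xi for xi in x]

    ssto = sum((yi - y_bar) ** 2 for yi in y)               # total
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # regression
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error

    msr = ssr / 1                  # 1 degree of freedom for regression
    mse = sse / (n - 2)            # n - 2 degrees of freedom for error
    f_star = msr / mse
    t_star = b1 / math.sqrt(mse / sxx)  # slope divided by its standard error
    return {"n": n, "b0": b0, "b1": b1, "SSTO": ssto, "SSR": ssr, "SSE": sse,
            "MSR": msr, "MSE": mse, "F": f_star, "t": t_star}


res = simple_linear_anova(years, times)
print(f"Men200m = {res['b0']:.4f} + ({res['b1']:.7f}) Year")
print(f"SSR = {res['SSR']:.3f}, SSE = {res['SSE']:.3f}, SSTO = {res['SSTO']:.3f}")
print(f"F* = {res['F']:.1f}, t* = {res['t']:.2f}, t*^2 = {res['t'] ** 2:.1f}")
```

The printed values should agree with the Minitab output to the displayed precision, and `t*^2` should equal `F*` up to floating-point error, which is the F-test/t-test equivalence in numerical form. The sketch stops short of the P-value, since that needs the upper tail of an F(1, n-2) distribution; one option, if SciPy is available, is `scipy.stats.f.sf(res["F"], 1, res["n"] - 2)`.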

