ISU STAT 401 - Lecture20

Stat 401 B – Lecture 20

Quadratic Model
To account for curvature in the relationship between an explanatory variable and a response variable, one often adds the square of the explanatory variable to the simple linear model.

Quadratic Model
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$
Conditions on $\varepsilon$: independent, identically distributed, and normally distributed with common standard deviation $\sigma$.

Example
Response, Y: Population of the U.S. (millions)
Explanatory, X: Year the census was taken

Quadratic Model
Predicted Population = 21006.1 - 23.3785*Year + 0.00651*Year^2
We cannot interpret the estimated slope coefficients because we cannot change Year by 1 while holding Year^2 constant.

Model Utility
F = 8050.89, P-value < 0.0001
The small P-value indicates that the quadratic model relating population to Year and Year^2 is statistically significant (useful).

Statistical Significance
Year (added to Year^2): t = -33.48, P-value < 0.0001
The P-value is small; therefore the addition of Year is statistically significant.

Statistical Significance
Year^2 (added to Year): t = 35.22, P-value < 0.0001
The P-value is small; therefore the addition of Year^2 is statistically significant.

Quadratic Model
R^2 = 0.999, or 99.9% of the variation in population can be explained by the quadratic model.
RMSE = 2.77

Summary - Quadratic
The model is useful.
Each term is a statistically significant addition.
99.9% of the variation in population is explained by the quadratic model.

Prediction
Year 2000
Predicted Population = 21006.1 - 23.3785*(2000) + 0.0065063*(2000)^2 = 274.3 million
Not bad, as the actual figure in 2000 was 281.422 million.

Prediction
Year 1800
Predicted Population = 21006.1 - 23.3785*(1800) + 0.0065063*(1800)^2 = 5.212 million
Very close to the actual value of 5.308 million.

[Figure: Population (0 to 250) versus Year (1750 to 2000)]

[Figure: Residual (-7.5 to 5.0) versus Year (1750 to 2000)]

Plot of Residuals
The residuals wiggle around the zero line. It is hard to say whether this is a pattern or not.
The residuals for 1940 and 1950 stick out. The quadratic model overpredicts for these years.

Can we do better?
We could try higher-order polynomial terms such as Year^3 or Year^4.
Year^3 is not statistically significant in a cubic model.
Year^4 is not statistically significant in a quartic model.

Quadratic Model
There is still the issue of trying to interpret the coefficients in the quadratic model.
Again, creating a new explanatory variable, Year^2, has introduced multicollinearity into the quadratic model.

[Figure: scatterplot matrix of Year and YearSqr]

Correlation
Year and Year^2
Correlation: r = 0.9999
For the values that Year takes on, there is an extremely strong positive linear correlation with Year^2.

Centering
Center Year by subtracting off the mean before constructing the squared term in the quadratic model.
Mean year is 1890.

Quadratic Model
Predicted Population = -2235.197 + 1.215*Year + 0.00651*(Year - 1890)^2
Note that the estimated slope for Year is exactly the same as in the simple linear model.

[Figure: scatterplot matrix of Year and YearCtrSqr]

Correlation
Year and (Year - 1890)^2
Correlation: r = -0.0000
For the values that Year takes on, there is no linear correlation with (Year - 1890)^2.

Centering
Centering has completely removed the multicollinearity resulting from the inclusion of the quadratic term in the quadratic model.

Quadratic Model
Predicted Population = 61.926 + 1.215*(Year - 1890) + 0.00651*(Year - 1890)^2
The predicted population in 1890 is 61.926 million.
For each additional year, the population goes up, on average, 1.215 million.
In addition to the average change per year, there is a bigger adjustment to this rate of change the further away you are from 1890.

One Year Change
1880: Pred = 50.427 million
1890: Pred = 61.926 million
Difference of 11.499
1980: Pred = 224.007
1990: Pred = 248.526
Difference of
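
The effect of centering described in these slides can be checked numerically. The following is a minimal sketch, not the lecture's own computation. It assumes the census years run every ten years from 1790 to 1990, a range that is not stated in the slides but is consistent with the reported mean year of 1890. The helper names `pearson_r`, `predict_centered`, and `ten_year_change` are hypothetical, and the coefficients are the rounded values shown on the slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# ASSUMPTION: decennial census years 1790..1990 (not given in the slides).
years = list(range(1790, 2000, 10))
mean_year = sum(years) / len(years)

# Correlation of Year with the raw and with the centered squared term.
r_raw = pearson_r(years, [y ** 2 for y in years])
r_centered = pearson_r(years, [(y - mean_year) ** 2 for y in years])


def predict_centered(year):
    """Centered quadratic model, using the rounded slide coefficients."""
    d = year - 1890
    return 61.926 + 1.215 * d + 0.00651 * d ** 2


def ten_year_change(start_year):
    """Change in predicted population over the ten years after start_year."""
    return predict_centered(start_year + 10) - predict_centered(start_year)


print(f"mean year            = {mean_year}")
print(f"r(Year, Year^2)      = {r_raw:.4f}")
print(f"r(Year, centered sq) = {r_centered:.4f}")
print(f"prediction for 1890  = {predict_centered(1890):.3f} million")
print(f"change 1880 to 1890  = {ten_year_change(1880):.3f} million")
print(f"change 1980 to 1990  = {ten_year_change(1980):.3f} million")
```

Because the assumed years are symmetric about their mean, the centered squared term is uncorrelated with Year, which is why centering removes the multicollinearity here. With unevenly spaced years, centering would reduce the correlation but not necessarily to zero.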

