UW-Madison STAT 333 - week04

This preview shows page 1-2 out of 5 pages.

STAT 333 Discussion Week 4

1 Practice Problems

1. The purpose of this exercise is to explore the meaning of degrees of freedom in regression. Suppose that you have 4 values of X and Y and you wish to fit a regression of Y on X. The estimated slope and intercept are $\hat{b}_1 = 2$ and $\hat{b}_0 = 5$, respectively. Below are the data, with 2 of the Y values missing. Your job is to find the missing values and calculate the sum of squares for error.

    X:  -1   4  -3   0
    Y:   1  14   ?   ?

If there were 7 measurements instead of 4, how many Y values would need to be specified so that you could calculate the sum of squares for error?

2. One measure of the development of a country is the Human Development Index (HDI). Life expectancy, literacy, educational attainment, and gross domestic product per capita are combined into an index between 0 and 1, inclusive, with 1 being the highest development. The United Nations Development Program reports values for 177 countries. We randomly selected fifteen countries, below the top twenty-five. We are interested in how HDI will be affected by the internet usage per 100 persons (Internet/100).

    Country               Internet/100    HDI
    Bahrain                   21.3        .866
    Poland                    26.2        .870
    Uruguay                   14.3        .852
    Bulgaria                  20.6        .824
    Brazil                    19.5        .800
    Ukraine                    9.7        .788
    Dominican Republic        16.9        .799
    Moldova                    9.6        .708
    India                      5.5        .619
    Madagascar                 0.5        .533
    Nepal                      0.4        .534
    Tanzania                   0.9        .467
    Uganda                     1.7        .505
    Zambia                     2.0        .434
    Ethiopia                   0.2        .406

(a) Plot Y versus X.

(b) Fit $Y = b_0 + b_1 X + e$ by least squares and plot the fitted line on the scatterplot.

    > internet=c(21.3, 26.2, 14.3, 20.6, 19.5, 9.7, 16.9, 9.6, 5.5, 0.5,
    + 0.4, 0.9, 1.7, 2.0, 0.2)
    > HDI=c(.866, .870, .852, .824, .800, .788, .799, .708, .619, .533,
    + .534, .467, .505, .434, .406)
    > plot(internet, HDI, xlab="internet usage( per 100 persons)", ylab="HDI")
    > out=lm(HDI~internet)
    > summary(out)$coef
                  Estimate  Std. Error   t value     Pr(>|t|)
    (Intercept) 0.49336548 0.026541570 18.588407 9.545225e-11
    internet    0.01744486 0.001993302  8.751741 8.255033e-07
    > abline(out)

[Figure: scatterplot of HDI versus internet usage (per 100 persons), with the fitted line.]

(c) Compute the analysis of variance table corresponding to these data.

    > anova(out)
    Analysis of Variance Table

    Response: HDI
              Df  Sum Sq Mean Sq F value    Pr(>F)
    internet   1 0.35711 0.35711  76.593 8.255e-07 ***
    Residuals 13 0.06061 0.00466
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(d) Find and interpret 95% confidence intervals for the slope and intercept.

    > confint(out)
                     2.5 %     97.5 %
    (Intercept) 0.43602591 0.55070506
    internet    0.01313859 0.02175113

(e) Calculate the residuals. Check that the sum of the residuals is "0" within rounding error.

(f) Examine the residuals. Is there any evidence to question the underlying assumptions of the model? Comment.

    > resid(out)
               1            2            3            4            5            6
     0.001058978 -0.080420841  0.109173004 -0.028729619 -0.033540272  0.125419365
               7            8            9           10           11           12
     0.010816366  0.047163851  0.029687781  0.030912085  0.033656572 -0.042065859
              13           14           15
    -0.018021748 -0.094255206 -0.090854456
    > sum(resid(out))
    [1] 4.683753e-17
    > plot(fitted(out),resid(out))

[Figure: residuals resid(out) versus fitted values fitted(out) for the HDI regression.]

2 Some residual plots of simulated data: $Y_i = 2 + 3X_i + e_i$, $e_i \sim N(0, 2^2)$

[Six panels, Residual plot 1 through Residual plot 6, each showing resid(out) against fitted(out).]
Figure 1: Residual plots of simulated data in the original setting (n = 12, σ = 2).

[Six panels, Residual plot 1 through Residual plot 6, each showing resid(out) against fitted(out).]
Figure 2: Residual plots of simulated data with larger sample size (n = 48, σ = 2).

Solution

1. Suppose the two missing values are $y_3$ and $y_4$. By the formulas

\[ \hat{b}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = 2 \tag{1} \]

\[ \hat{b}_0 = \bar{y} - \hat{b}_1 \bar{x} = 5 \tag{2} \]

Since $\bar{x} = 0$, from the second equation we have $\bar{y} = \hat{b}_0 = 5$. Note that

\[ \sum (x_i - \bar{x})^2 = (-1)^2 + 4^2 + (-3)^2 + 0^2 = 1 + 16 + 9 + 0 = 26 \]

\[ \sum (x_i - \bar{x})(y_i - \bar{y}) = (-1) \times (-4) + 4 \times 9 + (-3) \times (y_3 - 5) + 0 \times (y_4 - 5) = 40 - 3(y_3 - 5) \]

From equation (1), we have $(40 - 3(y_3 - 5))/26 = 2$, therefore $y_3 = 1$. Then, since $\bar{y} = 5$, $y_4 = 4$. All residuals can be obtained by

    X:          -1    4   -3    0
    Y:           1   14    1    4
    Fitted Y:    3   13   -1    5
    Residual:   -2    1    2   -1

So, the sum of squares for error is $(-2)^2 + 1^2 + 2^2 + (-1)^2 = 10$.

If we have 7 measurements of Y, we still need 5 Y values to determine the rest since we have 2 equations, which correspond to the formulas
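The solution to Problem 1 can be checked numerically. The sketch below is not part of the original handout: it rewrites formulas (1) and (2) as two linear equations in the missing Y values (the sum of Y and the sum of X times Y are both fixed by the given slope and intercept), solves them, and then refits the regression. The variable names (`y_known`, `miss`, `A`, `rhs`, `y_miss`) are illustrative assumptions.

```r
# Problem 1 data: four X values, the first two Y values known,
# and the reported least-squares estimates.
x <- c(-1, 4, -3, 0)
y_known <- c(1, 14)
b0 <- 5
b1 <- 2

n <- length(x)
k <- length(y_known)
miss <- (k + 1):n                     # positions of the missing Y values

# The least-squares normal equations give two constraints on Y:
#   sum(y)     = n * b0      + b1 * sum(x)
#   sum(x * y) = b0 * sum(x) + b1 * sum(x^2)
# Moving the known Y values to the right-hand side leaves a 2 x 2 system.
A <- rbind(rep(1, length(miss)), x[miss])
rhs <- c(n * b0 + b1 * sum(x) - sum(y_known),
         b0 * sum(x) + b1 * sum(x^2) - sum(x[1:k] * y_known))
y_miss <- solve(A, rhs)               # missing values y3 and y4

# Refit with the completed data and compute the error sum of squares.
y <- c(y_known, y_miss)
fit <- lm(y ~ x)
sse <- sum(resid(fit)^2)

print(y_miss)
print(coef(fit))
print(sse)
print(df.residual(fit))               # n - 2 residual degrees of freedom
```

With n observations the same two constraints always apply, so the system can be solved only when exactly two Y values are unknown; this matches the count of n - 2 residual degrees of freedom reported by `df.residual`.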

