# UCLA STATS 101A - stats_101a_Hw3 (11 pages)


## stats_101a_Hw3


- Pages: 11
- School: University of California, Los Angeles
- Course: Stats 101a - Introduction to Design and Analysis of Experiment

**Unformatted text preview:**

Stats 101A Hw 3
Linda Che
404449070
Section 2A
11/7/2017

Note: minus signs did not survive in this preview, so negative values (for example, the lower residual quantiles) appear without their sign.

    ncbirths <- read.delim("Desktop/births.txt")
    for (j in 1:23) {
      for (i in 1:length(ncbirths[, j])) {
        if (ncbirths[i, j] %in% NA) {
          ncbirths[i, j] <- mean(ncbirths[, j], na.rm = T)
        }
      }
    }
    attach(ncbirths)
    install.packages("corrplot")
    library(corrplot)
    ## corrplot 0.84 loaded
    cormat <- round(cor(ncbirths[, unlist(lapply(ncbirths, is.numeric))],
                        use = "pairwise.complete.obs"), 3)
    cormat

                Birthweight Weeks Apgar1  Fage  Mage Feduc Meduc TotPreg
    Birthweight       1.000 0.653  0.216 0.069 0.146 0.126 0.166   0.003
    Weeks             0.653 1.000  0.274 0.006 0.039 0.064 0.070   0.030
    Apgar1            0.216 0.274  1.000 0.006 0.019 0.022 0.010   0.035
    Fage              0.069 0.006  0.006 1.000 0.697 0.289 0.298   0.274
    Mage              0.146 0.039  0.019 0.697 1.000 0.292 0.430   0.382
    Feduc             0.126 0.064  0.022 0.289 0.292 1.000 0.686   0.072
    Meduc             0.166 0.070  0.010 0.298 0.430 0.686 1.000   0.130
    TotPreg           0.003 0.030  0.035 0.274 0.382 0.072 0.130   1.000
    Visits            0.176 0.194  0.043 0.072 0.105 0.186 0.225   0.110
    Gained            0.163 0.114  0.044 0.033 0.028 0.091 0.072   0.053
                Visits Gained
    Birthweight  0.176  0.163
    Weeks        0.194  0.114
    Apgar1       0.043  0.044
    Fage         0.072  0.033
    Mage         0.105  0.028
    Feduc        0.186  0.091
    Meduc        0.225  0.072
    TotPreg      0.110  0.053
    Visits       1.000  0.056
    Gained       0.056  1.000

    cormat[1, ]

    Birthweight       Weeks      Apgar1        Fage        Mage       Feduc
          1.000       0.653       0.216       0.069       0.146       0.126
          Meduc     TotPreg      Visits      Gained
          0.166       0.003       0.176       0.163

**Part A**

i.

    corrplot(cormat)

ii. I noticed that Birthweight is positively correlated with all the variables. Some pairs of variables are essentially uncorrelated with each other as well. The more strongly correlated variables are drawn in darker colors: the higher the positive correlation coefficient, the darker blue the circle; the stronger the negative correlation coefficient, the darker red the circle.

**Part B**

i.

    fit <- lm(Birthweight ~ Weeks + Apgar1 + Mage, data = ncbirths)
    summary(fit)

    Call:
    lm(formula = Birthweight ~ Weeks + Apgar1 + Mage, data = ncbirths)

    Residuals:
       Min     1Q Median     3Q    Max
     51.19  10.15   0.87   9.73  74.93

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 130.03033    8.80263  14.772  < 2e-16 ***
    Weeks         6.00473    0.23168  25.918  < 2e-16 ***
    Apgar1        0.50021    0.31597   1.583    0.114
    Mage          0.40037    0.07848   5.102 4.03e-07 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 15.26 on 996 degrees of freedom
    Multiple R-squared:  0.4428, Adjusted R-squared:  0.4411
    F-statistic: 263.8 on 3 and 996 DF,  p-value: < 2.2e-16

    pairs(~ Weeks + Apgar1 + Mage, data = ncbirths)
    par(mfrow = c(2, 2))
    plot(fit)   # testing the linear assumptions
    anova(fit)  # testing the fit of the multi-linear model

    Analysis of Variance Table

    Response: Birthweight
               Df Sum Sq Mean Sq  F value    Pr(>F)
    Weeks       1 177530  177530 762.7923 < 2.2e-16 ***
    Apgar1      1    615     615   2.6437    0.1043
    Mage        1   6057    6057  26.0253 4.033e-07 ***
    Residuals 996 231806     233
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ii. The linear model doesn't violate any of the linear assumptions, based on the residual plot and the Normal Q-Q plot. The R-squared, however, is fairly low: 0.4428 with three predictors. Apgar1 doesn't seem to be a significant predictor of Birthweight because its p-value (0.114) is greater than 0.05; Weeks and Mage both have p-values well below 0.05.

iii. Linear model based on standardized coefficients:

    Birthweight = 14.772 + 25.918 Weeks + 1.583 Apgar1 + 5.102 Mage

I standardized the coefficients using the t-score. In this case, Weeks explains the largest share of Birthweight in the model and is the strongest predictor.

**Part C**

i.

    fit2 <- lm(Birthweight ~ Weeks + Apgar1 + Mage + Feduc + Meduc + Visits, data = ncbirths)
    summary(fit2)

    Call:
    lm(formula = Birthweight ~ Weeks + Apgar1 + Mage + Feduc + Meduc +
        Visits, data = ncbirths)

    Residuals:
        Min      1Q  Median      3Q     Max
     48.304  10.114   0.965   9.760  74.553

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 132.84749    8.90966  14.910   <2e-16 ***
    Weeks         5.90961    0.23494  25.154   <2e-16 ***
    Apgar1        0.54150    0.31490   1.720   0.0858 .
    Mage          0.27713    0.08653   3.203   0.0014 **
    Feduc         0.02095    0.26428   0.079   0.9368
    Meduc         0.57640    0.25178   2.289   0.0223 *
    Visits        0.13284    0.12737   1.043   0.2972
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 15.19 on 993 degrees of freedom
    Multiple R-squared:  0.4494, Adjusted R-squared:  0.4461
    F-statistic: 135.1 on 6 and 993 DF,  p-value: < 2.2e-16

    pairs(~ Weeks + Apgar1 + Mage + Feduc + Meduc + Visits, data = ncbirths)
    par(mfrow = c(2, 2))
    plot(fit2)   # testing the linear assumptions
    anova(fit2)  # testing the fit of the linear model

    Analysis of Variance Table

    Response: Birthweight
               Df Sum Sq Mean Sq  F value    Pr(>F)
    Weeks       1 177530  177530 769.6647 < 2.2e-16 ***
    Apgar1      1    615     615   2.6675   0.10273
    Mage        1   6057    6057  26.2598 3.585e-07 ***
    Feduc       1   1144    1144   4.9592   0.02618 *
    Meduc       1   1367    1367   5.9264   0.01509 *
    Visits      1    251     251   1.0878   0.29721
    Residuals 993 229044     231
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ii. The multi-linear regression model with 6 predictors has an R-squared value of 0.4494, which is an increase over the model with 3 predictors. The model doesn't seem to violate any of the linear assumptions, because the errors have constant variance and, according to the Normal Q-Q plot, they are normally distributed. The variables Apgar1, Feduc, and Visits are not significant predictors of Birthweight because their p-values are greater than 0.05.

iii. Linear model based on standardized coefficients:

    Birthweight = 14.910 + 25.154 Weeks + 1.720 Apgar1 + 3.203 Mage + 0.079 Feduc + 2.289 Meduc + 1.043 Visits

I standardized the model using the t-score for each variable. In this case, the strongest predictor of Birthweight is still Weeks, which has the biggest contribution in explaining Birthweight.

**Part D**

The R-squared doesn't change much compared to the first model. Having 6 predictors changes the R-squared only slightly (0.4428 to 0.4494), so the three extra predictors might not be necessary. If you look at the ANOVA table for the 6 variables, the sum-of-squares contributions of Apgar1, Feduc, Meduc, and Visits are very low relative to the residual sum of squares. In addition, based on the t-test of the …
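
The R-squared comparison in Part D can be checked by hand from the two ANOVA tables. The following is a minimal sketch in R, not part of the original homework: the helper names `r_squared` and `adj_r_squared` and the vectors `ss_fit` and `ss_fit2` are hypothetical, and the sums of squares are the rounded values printed by `anova(fit)` and `anova(fit2)`, so agreement with `summary()` should only be expected to the displayed precision.

```r
# Sums of squares copied from the anova(fit) and anova(fit2) tables (rounded values).
ss_fit  <- c(Weeks = 177530, Apgar1 = 615, Mage = 6057, Residuals = 231806)
ss_fit2 <- c(Weeks = 177530, Apgar1 = 615, Mage = 6057,
             Feduc = 1144, Meduc = 1367, Visits = 251, Residuals = 229044)

# R-squared = 1 - SSE / SST, where SST is the sum of every row of the table.
r_squared <- function(ss) {
  unname(1 - ss["Residuals"] / sum(ss))
}

# Adjusted R-squared penalizes the number of predictors p given n observations.
adj_r_squared <- function(ss, n) {
  p <- length(ss) - 1  # every row except Residuals is one predictor (1 Df each)
  1 - (1 - r_squared(ss)) * (n - 1) / (n - p - 1)
}

# n is inferred from the output: 996 residual Df + 3 predictors + 1 intercept.
n <- 1000

r2_fit      <- r_squared(ss_fit)
r2_fit2     <- r_squared(ss_fit2)
adj_r2_fit  <- adj_r_squared(ss_fit, n)
adj_r2_fit2 <- adj_r_squared(ss_fit2, n)

print(round(c(fit = r2_fit, fit2 = r2_fit2), 4))
print(round(c(fit = adj_r2_fit, fit2 = adj_r2_fit2), 4))
```

Because both tables share the same total sum of squares, the gain in R-squared from the three extra predictors equals the drop in the residual sum of squares (231806 to 229044) divided by that total, which is why the increase is so small.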
