PSU STAT 501 - VARIABLES

Outline of slides
• A first order model with one binary and one quantitative predictor variable
• Examples of binary predictor variables
• On average, do smoking mothers have babies with lower birth weight?
• Coding the binary (two-group qualitative) predictor
• A first order model with one binary and one quantitative predictor
• An indicator variable for 2 groups yields 2 response functions
• Interpretation of the regression coefficients
• The estimated regression function
• A significant difference in mean birth weights for the two groups?
• Why not instead fit two separate regression functions?
• Using indicator variable, fitting one function to 32 data points
• Fitting function to 16 nonsmokers
• Fitting function to 16 smokers
• Summary table
• Reasons to “pool” the data and to fit one regression function
• How to answer the research question using one regression function?
• How to answer the research question using two regression functions?
• What if we instead tried to use two indicator variables?
• Definition of two indicator variables – one for each group
• The modified regression function with two binary predictors
• Implication on data analysis
• To prevent problems with the data analysis
• What is the impact of using a different coding scheme?
• The regression model defined using (1, -1) coding scheme
• The regression model yields 2 different response functions
• What is impact of using different coding scheme?

A first order model with one binary and one quantitative predictor variable

Examples of binary predictor variables
• Gender (male, female)
• Smoking status (smoker, nonsmoker)
• Treatment (yes, no)
• Health status (diseased, healthy)

On average, do smoking mothers have babies with lower birth weight?
• Random sample of n = 32 births.
• $y$ = birth weight of baby (in grams)
• $x_1$ = length of gestation (in weeks)
• $x_2$ = smoking status of mother (yes, no)

Coding the binary (two-group qualitative) predictor
• Using a (0,1) indicator variable.
– $x_{i2} = 1$, if mother smokes
– $x_{i2} = 0$, if mother does not smoke
• Other terms used:
– dummy variable
– binary variable

On average, do smoking mothers have babies with lower birth weight?
[Figure: scatter plot of Weight (grams) versus Gestation (weeks), with points marked by smoking status (0, 1).]

A first order model with one binary and one quantitative predictor

    $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$

where …
• $y_i$ is birth weight of baby $i$
• $x_{i1}$ is length of gestation of baby $i$
• $x_{i2} = 1$, if mother smokes and $x_{i2} = 0$, if not
and … the independent error terms $\epsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

An indicator variable for 2 groups yields 2 response functions

    $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$

If mother is a smoker ($x_{i2} = 1$):

    $\mu_{Y \mid x_{i2}=1} = (\beta_0 + \beta_2) + \beta_1 x_{i1}$

If mother is a nonsmoker ($x_{i2} = 0$):

    $\mu_{Y \mid x_{i2}=0} = \beta_0 + \beta_1 x_{i1}$

Interpretation of the regression coefficients
• $\beta_1$ represents the change in the mean response $\mu_Y$ for each additional unit increase in the quantitative predictor $x_1$ … for both groups.
• $\beta_2$ represents how much higher (or lower) the mean response function for the second group is than the one for the first group … for any value of $x_1$.

The estimated regression function
[Figure: scatter plot of Weight (grams) versus Gestation (weeks) by smoking status (0, 1), with the two estimated regression lines.]

The regression equation is
Weight = - 2390 + 143 Gest - 245 Smoking

    $\hat{y} = -2390 + 143 x_1$   (nonsmokers, $x_2 = 0$)
    $\hat{y} = -2635 + 143 x_1$   (smokers, $x_2 = 1$)

Predictor   Coef      SE Coef   T       P
Constant    -2389.6   349.2     -6.84   0.000
Gest        143.100   9.128     15.68   0.000
Smoking     -244.54   41.98     -5.83   0.000

S = 115.5   R-Sq = 89.6%   R-Sq(adj) = 88.9%

A significant difference in mean birth weights for the two groups?

    $\mu_{Y \mid x_{i2}=1} = (\beta_0 + \beta_2) + \beta_1 x_{i1}$
    $\mu_{Y \mid x_{i2}=0} = \beta_0 + \beta_1 x_{i1}$

Why not instead fit two separate regression functions?
One for the smokers and one for the nonsmokers?

Using indicator variable, fitting one function to 32 data points

The regression equation is
Weight = - 2390 + 143 Gest - 245 Smoking

Predictor   Coef      SE Coef   T       P
Constant    -2389.6   349.2     -6.84   0.000
Gest        143.100   9.128     15.68   0.000
Smoking     -244.54   41.98     -5.83   0.000

S = 115.5   R-Sq = 89.6%   R-Sq(adj) = 88.9%

Using indicator variable, fitting one function to 32 data points

Predicted Values for New Observations
New Obs   Fit      SE Fit   95.0% CI            95.0% PI
1         2803.7   30.8     (2740.6, 2866.8)    (2559.1, 3048.3)
2         3048.2   28.9     (2989.1, 3107.4)    (2804.7, 3291.8)

Values of Predictors for New Observations
New Obs   Gest   Smoking
1         38.0   1.00
2         38.0   0.00

Fitting function to 16 nonsmokers

The regression equation is
Weight = - 2546 + 147 Gest

Predictor   Coef      SE Coef   T       P
Constant    -2546.1   457.3     -5.57   0.000
Gest        147.21    11.97     12.29   0.000

S = 106.9   R-Sq = 91.5%   R-Sq(adj) = 90.9%

Fitting function to 16 nonsmokers

Predicted Values for New Observations
New Obs   Fit      SE Fit   95.0% CI            95.0% PI
1         3047.7   26.8     (2990.3, 3105.2)    (2811.3, 3284.2)

Values of Predictors for New Observations
New Obs   Gest
1         38.0

Fitting function to 16 smokers

The regression equation is
Weight = - 2475 + 139 Gest

Predictor   Coef      SE Coef   T       P
Constant    -2474.6   554.0     -4.47   0.001
Gest        139.03    14.11     9.85    0.000

S = 126.6   R-Sq = 87.4%   R-Sq(adj) = 86.5%

Fitting function to 16 smokers

Predicted Values for New Observations
New Obs   Fit      SE Fit   95.0% CI            95.0% PI
1         2808.5   35.8     (2731.7, 2885.3)    (2526.4, 3090.7)

Values of Predictors for New Observations
New Obs   Gest
1         38.0

Summary table

Model estimated using…   SE(Gest)   Length of CI for $\mu_Y$
32 data points           9.128      (NS) 118.3, (S) 126.2
16 nonsmokers            11.97      114.9
16 smokers               14.11      153.6

Reasons to “pool” the data and to fit one regression function
• Model assumes equal slopes for the groups and equal variances for all error terms.
• It makes sense to use all of the data to estimate these quantities.
• More degrees of freedom associated with MSE, so confidence intervals that are a function of MSE tend to be narrower.

How to answer the research question using one regression function?

The regression equation is
Weight = - 2390 + 143 Gest - 245 Smoking

Predictor   Coef      SE Coef   T       P
Constant    -2389.6
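
The numbers reported for the pooled fit can be checked by hand. The sketch below is only an illustration, not part of the original slides: it plugs the reported coefficients into the estimated regression function, recomputes the T value for Smoking, and recomputes the confidence-interval lengths from the reported 95% intervals at 38 weeks. The function names, the dictionary layout, and the critical value 2.045 (the usual table value of the t distribution with 32 - 3 = 29 degrees of freedom) are assumptions introduced here.

```python
# Coefficients and standard error copied from the pooled output (32 data points).
B0, B_GEST, B_SMOKING = -2389.6, 143.100, -244.54
SE_SMOKING = 41.98
N_OBS, N_PARAMS = 32, 3  # three estimated coefficients: Constant, Gest, Smoking


def fitted_weight(gest_weeks, smoker):
    """Estimated mean birth weight (grams); smoker is coded 1, nonsmoker 0."""
    x2 = 1 if smoker else 0
    return B0 + B_GEST * gest_weeks + B_SMOKING * x2


def smoking_t_statistic():
    """T value for the Smoking coefficient: estimate divided by its standard error."""
    return B_SMOKING / SE_SMOKING


def smoking_confidence_interval(t_crit=2.045):
    """Approximate 95% interval for the Smoking coefficient.

    t_crit = 2.045 is an assumed table value for 29 error degrees of freedom;
    it does not appear in the slides.
    """
    half_width = t_crit * SE_SMOKING
    return (B_SMOKING - half_width, B_SMOKING + half_width)


# 95% confidence intervals for the mean response at Gest = 38, as reported.
REPORTED_CI = {
    "pooled, smoker": (2740.6, 2866.8),
    "pooled, nonsmoker": (2989.1, 3107.4),
    "16 nonsmokers": (2990.3, 3105.2),
    "16 smokers": (2731.7, 2885.3),
}


def ci_lengths():
    """Length (upper minus lower) of each reported interval, rounded to 0.1."""
    return {name: round(hi - lo, 1) for name, (lo, hi) in REPORTED_CI.items()}


if __name__ == "__main__":
    print("fit, smoker at 38 weeks:   ", round(fitted_weight(38, True), 1))
    print("fit, nonsmoker at 38 weeks:", round(fitted_weight(38, False), 1))
    print("error degrees of freedom:  ", N_OBS - N_PARAMS)
    print("T for Smoking:             ", round(smoking_t_statistic(), 2))
    print("approx. 95% CI for Smoking:", smoking_confidence_interval())
    print("CI lengths:                ", ci_lengths())
```

Under these assumptions the fitted values agree with the reported fits at 38 weeks, and the interval lengths agree with the summary table. The smoker intercept computed from the unrounded coefficients is about -2634; the -2635 in the displayed equation comes from adding the rounded values -2390 and -245.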

