Qualitative predictor variables

Outline
• Examples of qualitative predictor variables
• An example with one qualitative predictor
• On average, do smoking mothers have babies with lower birth weight?
• Coding the two-group qualitative predictor
• A first order model with one binary predictor
• An indicator variable for 2 groups yields 2 response functions
• Interpretation of the regression coefficients
• The estimated regression function
• A significant difference in mean birth weights for the two groups?
• Why not instead fit two separate regression functions?
• Using indicator variable, fitting one function to 32 data points
• Fitting function to 16 nonsmokers
• Fitting function to 16 smokers
• Reasons to “pool” the data and to fit one regression function
• What if we instead used two indicator variables?
• Definition of two indicator variables – one for each group
• The modified regression function with two binary predictors
• Implication on X matrix
• To prevent linear dependencies in the X matrix
• What is the impact of using a different coding scheme?
• The regression model defined using (1, -1) coding scheme
• The regression model yields 2 different response functions
• What is the impact of using a different coding scheme?
• An example where including an interaction term is appropriate
• Compare three treatments (A, B, C) for severe depression
• A model with interaction terms
• Two indicator variables for 3 groups yield 3 response functions
• How to test whether the three regression functions are identical?
• Test for identical regression functions
• How to test whether there is a significant interaction effect?
• Test for significant interaction

Examples of qualitative predictor variables
• Gender (male, female)
• Smoking status (smoker, nonsmoker)
• Socioeconomic status (poor, middle, rich)

An example with one qualitative predictor

On average, do smoking mothers have babies with lower birth weight?
• Random sample of n = 32 births.
• $y$ = birth weight of baby (in grams)
• $x_1$ = length of gestation (in weeks)
• $x_2$ = smoking status of mother (yes, no)

Coding the two-group qualitative predictor
• Using a (0,1) indicator variable.
  – $x_{i2} = 1$, if mother smokes
  – $x_{i2} = 0$, if mother does not smoke
• Other terms used:
  – dummy variable
  – binary variable

On average, do smoking mothers have babies with lower birth weight?
[Figure: scatterplot of Weight (grams) against Gestation (weeks), points marked by smoking status (0, 1).]

A first order model with one binary predictor

$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$

where …
• $Y_i$ is birth weight of baby $i$
• $x_{i1}$ is length of gestation of baby $i$
• $x_{i2} = 1$, if mother smokes and $x_{i2} = 0$, if not
and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

An indicator variable for 2 groups yields 2 response functions

Model: $Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$

If mother is a smoker ($x_{i2} = 1$): $E(Y_i) = (\beta_0 + \beta_2) + \beta_1 x_{i1}$

If mother is a nonsmoker ($x_{i2} = 0$): $E(Y_i) = \beta_0 + \beta_1 x_{i1}$

Interpretation of the regression coefficients
• $\beta_1$ represents the change in the mean response $E(Y)$ for every additional unit increase in the quantitative predictor $x_1$ … for both groups.
• $\beta_2$ represents how much higher (or lower) the mean response function for the group coded 1 (smokers) is than the one for the group coded 0 (nonsmokers) … for any value of $x_1$.

The estimated regression function
[Figure: scatterplot of Weight (grams) against Gestation (weeks), points marked by smoking status (0, 1), with the two fitted parallel lines.]

Nonsmokers: $\hat{y} = -2390 + 143x$
Smokers: $\hat{y} = -2635 + 143x$

The regression equation is
Weight = - 2390 + 143 Gest - 245 Smoking

Predictor      Coef   SE Coef      T      P
Constant    -2389.6     349.2  -6.84  0.000
Gest        143.100     9.128  15.68  0.000
Smoking     -244.54     41.98  -5.83  0.000

S = 115.5   R-Sq = 89.6%   R-Sq(adj) = 88.9%

A significant difference in mean birth weights for the two groups?
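One way to read the answer off the regression output is to test $H_0: \beta_2 = 0$ against $H_A: \beta_2 \neq 0$ using the Smoking row (Coef = -244.54, SE Coef = 41.98). The sketch below only redoes that arithmetic in Python; it is an illustration, not part of the original slides. The critical value 2.045 (two-sided 5% level, 29 degrees of freedom) is an assumption taken from a standard t table, and the variable names are made up for this example.

```python
# Values copied from the Smoking row of the regression output.
coef_smoking = -244.54    # estimate of beta_2, in grams
se_smoking = 41.98        # standard error of that estimate
n_obs, n_params = 32, 3   # 32 births; intercept, Gest and Smoking

# Error degrees of freedom for the pooled model.
df_error = n_obs - n_params

# t statistic for H0: beta_2 = 0.
t_stat = coef_smoking / se_smoking

# Assumption: two-sided 5% critical value of t with 29 df, from a t table.
T_CRIT_29 = 2.045

# 95% confidence interval for beta_2.
margin = T_CRIT_29 * se_smoking
ci_low, ci_high = coef_smoking - margin, coef_smoking + margin

print(f"t = {t_stat:.2f} on {df_error} df")
print(f"95% CI for beta_2: ({ci_low:.1f}, {ci_high:.1f}) grams")
print("reject H0" if abs(t_stat) > T_CRIT_29 else "fail to reject H0")
```

The t statistic agrees with the T column of the output, and the interval lies entirely below zero, which is consistent with the reported P = 0.000 for Smoking.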
Smokers ($x_{i2} = 1$): $E(Y_i) = (\beta_0 + \beta_2) + \beta_1 x_{i1}$

Nonsmokers ($x_{i2} = 0$): $E(Y_i) = \beta_0 + \beta_1 x_{i1}$

Why not instead fit two separate regression functions?

Using indicator variable, fitting one function to 32 data points

The regression equation is
Weight = - 2390 + 143 Gest - 245 Smoking

Predictor      Coef   SE Coef      T      P
Constant    -2389.6     349.2  -6.84  0.000
Gest        143.100     9.128  15.68  0.000
Smoking     -244.54     41.98  -5.83  0.000

S = 115.5   R-Sq = 89.6%   R-Sq(adj) = 88.9%

Analysis of Variance
Source          DF       SS       MS       F      P
Regression       2  3348720  1674360  125.45  0.000
Residual Error  29   387070    13347
Total           31  3735789

Predicted Values for New Observations
New Obs     Fit  SE Fit          95.0% CI          95.0% PI
1        2803.7    30.8  (2740.6, 2866.8)  (2559.1, 3048.3)
2        3048.2    28.9  (2989.1, 3107.4)  (2804.7, 3291.8)

Values of Predictors for New Observations
New Obs  Gest  Smoking
1        38.0     1.00
2        38.0     0.00

Fitting function to 16 nonsmokers

The regression equation is
Weight = - 2546 + 147 Gest

Predictor      Coef   SE Coef      T      P
Constant    -2546.1     457.3  -5.57  0.000
Gest         147.21     11.97  12.29  0.000

S = 106.9   R-Sq = 91.5%   R-Sq(adj) = 90.9%

Analysis of Variance
Source          DF       SS       MS       F      P
Regression       1  1728172  1728172  151.14  0.000
Residual Error  14   160082    11434
Total           15  1888254

Predicted Values for New Observations
New Obs     Fit  SE Fit          95.0% CI          95.0% PI
1        3047.7    26.8  (2990.3, 3105.2)  (2811.3, 3284.2)

Values of Predictors for New Observations
New Obs  Gest
1        38.0

Fitting function to 16 smokers

The regression equation is
Weight = - 2475 + 139 Gest

Predictor      Coef   SE Coef      T      P
Constant    -2474.6     554.0  -4.47  0.001
Gest         139.03     14.11   9.85  0.000

S = 126.6   R-Sq = 87.4%   R-Sq(adj) = 86.5%

Analysis of Variance
Source          DF       SS       MS       F      P
Regression       1  1554776  1554776   97.04  0.000
Residual Error  14   224310    16022
Total           15  1779086

Predicted Values for New Observations
New Obs     Fit  SE Fit          95.0% CI          95.0% PI
1        2808.5    35.8  (2731.7, 2885.3)  (2526.4, 3090.7)

Values of Predictors for New Observations
New Obs
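To compare the pooled fit with the two separate fits, the predictions at 38 weeks can be recomputed from the printed coefficients and set beside the widths of the printed 95% confidence intervals. The following is a rough sketch under stated assumptions: the dictionary and function names are hypothetical, all numbers are copied from the outputs above, and because the printed coefficients are rounded the recomputed fits can differ slightly from the printed ones.

```python
# Coefficients as printed in the three regression outputs (rounded).
pooled = {"const": -2389.6, "gest": 143.100, "smoking": -244.54}
nonsmokers_only = {"const": -2546.1, "gest": 147.21}
smokers_only = {"const": -2474.6, "gest": 139.03}

GEST = 38.0  # weeks, the value used for the new observations


def fit_pooled(gest, smoking):
    """Fitted mean weight from the single model with the 0/1 indicator."""
    return pooled["const"] + pooled["gest"] * gest + pooled["smoking"] * smoking


def fit_separate(coefs, gest):
    """Fitted mean weight from a model fitted to one group only."""
    return coefs["const"] + coefs["gest"] * gest


# 95% confidence intervals for the mean response, copied as printed.
printed_ci = {
    ("pooled", "smoker"): (2740.6, 2866.8),
    ("pooled", "nonsmoker"): (2989.1, 3107.4),
    ("separate", "smoker"): (2731.7, 2885.3),
    ("separate", "nonsmoker"): (2990.3, 3105.2),
}

fits = {
    ("pooled", "smoker"): fit_pooled(GEST, 1),
    ("pooled", "nonsmoker"): fit_pooled(GEST, 0),
    ("separate", "smoker"): fit_separate(smokers_only, GEST),
    ("separate", "nonsmoker"): fit_separate(nonsmokers_only, GEST),
}

# Width of each printed interval (upper limit minus lower limit).
widths = {key: hi - lo for key, (lo, hi) in printed_ci.items()}

for key in fits:
    model, group = key
    print(f"{model:8s} {group:9s} fit = {fits[key]:7.1f}  CI width = {widths[key]:6.1f}")
```

For these data the pooled interval for smokers is narrower than the smokers-only interval, while the two nonsmoker intervals have nearly the same width.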