Multiple Logistic Regression

Bret Larget
Departments of Botany and of Statistics
University of Wisconsin-Madison
April 26, 2007
Statistics 572 (Spring 2007)


Multiple Logistic Regression

Multiple logistic regression extends logistic regression to the case of several explanatory variables.
The basic idea is the same: the probability of one outcome is modeled as a function of a linear combination of the explanatory variables.
A special case of multiple logistic regression is when the probability varies as a polynomial function of a single quantitative explanatory variable.
This is similar to polynomial regression.


Cow Example: Mastitis in Cows

In an example from a student in class, we have daily data on the number of cows from a dairy herd that experience new cases of mastitis, or inflammation of the udder.
Mastitis is a costly problem for dairy farmers.
We wish to examine the trend in the rate of mastitis over time.
We will consider possible nonlinear trends in time.


Cow Example: Data

We will model the new cases of mastitis as the response variable.
The size of the herd changes slightly each day.
We account for the changes in herd size, but do not model individual cows.
We create a new variable called time, which is days since the beginning of the year.

> cows = read.table("cows", header = T)
> str(cows)
'data.frame':   74 obs. of  5 variables:
 $ date          : Factor w/ 74 levels "1/1/07","1/10/07",..: 1 22 25 2 ...
 $ numCows       : num  1177 1174 1178 1182 1190 ...
 $ numMastitis   : num  38 35 34 35 33 36 32 31 32 32 ...
 $ numNewMastitis: num  2 5 4 5 4 0 4 4 3 8 ...
 $ milk          : num  87.8 86.8 86.1 85.4 85.2 84.8 85.6 85.6 86.2 87 ...
> cows = data.frame(time = 1:74, cows)
> str(cows)
'data.frame':   74 obs. of  6 variables:
 $ time          : int  1 2 3 4 5 6 7 8 9 10 ...
 $ date          : Factor w/ 74 levels "1/1/07","1/10/07",..: 1 22 25 2 ...
 $ numCows       : num  1177 1174 1178 1182 1190 ...
 $ numMastitis   : num  38 35 34 35 33 36 32 31 32 32 ...
 $ numNewMastitis: num  2 5 4 5 4 0 4 4 3 8 ...
 $ milk          : num  87.8 86.8 86.1 85.4 85.2 84.8 85.6 85.6 86.2 87 ...
> attach(cows)


Cow Example: First GLM Analysis

> prop = numNewMastitis/numCows
> fit1 = glm(prop ~ time, family = binomial, weights = numCows)
> summary(fit1)

Call:
glm(formula = prop ~ time, family = binomial, weights = numCows)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.95874  -0.58793   0.02157   0.42698   2.24759

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.823143   0.119411 -48.765   <2e-16 ***
time         0.004649   0.002653   1.752   0.0797 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 83.08  on 73  degrees of freedom
Residual deviance: 80.00  on 72  degrees of freedom
AIC: 316.22

Number of Fisher Scoring iterations: 4


Cow Example: Second GLM Analysis

> fit2 = glm(prop ~ time + I(time^2), family = binomial, weights = numCows)
> summary(fit2)

Call:
glm(formula = prop ~ time + I(time^2), family = binomial, weights = numCows)

Deviance Residuals:
      Min         1Q     Median         3Q        Max
-2.919315  -0.547581   0.005287   0.456351   2.294789

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.763e+00  1.827e-01 -31.535   <2e-16 ***
time         7.236e-05  1.089e-02   0.007    0.995
I(time^2)    5.959e-05  1.377e-04   0.433    0.665
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 83.080  on 73  degrees of freedom
Residual deviance: 79.814  on 71  degrees of freedom
AIC: 318.04

Number of Fisher Scoring iterations: 4


Cow Example: Comments

For this data set, there is slight evidence of an increasing trend in the mastitis rate over time (z = 1.752, p = 0.0797 for time in the first model), but no need for a quadratic model (p = 0.665 for the quadratic term in the second model).
In fact, for these data a regular linear model would have sufficed.
The next plot compares the logistic regression and simple linear regression models.


Cow Example: Plots

> fit3 = lm(prop ~ time)
> plot(time, prop, pch = 16)
> eta = predict(fit1, data.frame(time = time))
> prob = exp(eta)/(1 + exp(eta))
> lines(time, prob, col = "red")
> abline(fit3, col = "blue")

[Figure: prop plotted against time, with the fitted logistic regression curve in red and the simple linear regression line in blue.]


Seed Germination Experiment

We studied this seed germination data earlier in the semester.
In an experiment, four sites were selected where the soil and climate conditions were expected to be very similar within the site.
Here, we will treat each site as a block.
Within each block, five plots were identified.
The treatment was applying a seed disinfectant to seeds.
There were four different treatments (brands) plus a control.
The researchers planted 100 seeds from a single treatment in each plot.
The response is the number of seeds that germinated.


Seed Germination Experiment: Data

                 Block
Treatment     1    2    3    4
Control      86   90   88   87
Arasan       98   94   93   89
Spergon      96   90   91   92
Semesan      97   95   91   92
Fermate      91   93   95   95

The slides show this table twice, once with lines marking the treatments and once with lines marking the blocking.


Seed Germination Experiment: Model

A model is

$$\eta_{ij} = \mu + \alpha_i + \beta_j$$

$$P(\text{seed } ij \text{ germinates}) = \frac{e^{\eta_{ij}}}{1 + e^{\eta_{ij}}}$$

where $\mu$ is an intercept, $\alpha_i$ is the effect of treatment $i$ where $\sum_i \alpha_i = 0$, and $\beta_j$ is the effect in block $j$ where $\sum_j \beta_j = 0$.


Seed Germination Experiment: Data

> seed = read.table("seed.txt", header = T)
> attach(seed)
> str(seed)
'data.frame':   20 obs. of  3 variables:
 $ count    : int  86 90 88 87 98 94 93 89 96 90 ...
 $ treatment: Factor w/ 5 levels "AControl","Arasan",..: 1 1 1 1 2 2 2 2 5 5 ...
 $ block    : Factor w/ 4 levels "b1","b2","b3",..: 1 2 3 4 1 2 3 4 1 2 ...
> seed
1 2 3 4 5 6 7 8 9 …
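
The treatment-plus-block model for the germination counts can be sketched in R along the following lines. This is only an illustration under stated assumptions, not necessarily the analysis on the remaining slides, which are not shown here. The data frame `seed2` and the object `fit` are hypothetical names; the counts are re-entered by hand from the table above instead of being read from seed.txt, and the level label "AControl" is taken from the `str` output. Requesting `contr.sum` contrasts is one way to mirror the sum-to-zero constraints on the treatment and block effects; R's default treatment contrasts would give the same fitted probabilities with a different parameterization.

```r
# Germination counts out of 100 seeds per plot, re-entered from the table.
# Assumption: rows are ordered treatment by treatment, with blocks b1..b4
# within each treatment, matching the str(seed) output.
seed2 <- data.frame(
  count = c(86, 90, 88, 87,    # Control
            98, 94, 93, 89,    # Arasan
            96, 90, 91, 92,    # Spergon
            97, 95, 91, 92,    # Semesan
            91, 93, 95, 95),   # Fermate
  treatment = factor(rep(c("AControl", "Arasan", "Spergon",
                           "Semesan", "Fermate"), each = 4)),
  block = factor(rep(c("b1", "b2", "b3", "b4"), times = 5))
)

# Binomial response given as (germinated, not germinated) per plot.
# Sum-to-zero contrasts make the effects within each factor sum to zero.
fit <- glm(cbind(count, 100 - count) ~ treatment + block,
           family = binomial, data = seed2,
           contrasts = list(treatment = "contr.sum", block = "contr.sum"))

summary(fit)

# Sequential analysis of deviance for treatment, then block.
anova(fit, test = "Chisq")
```

With sum-to-zero contrasts, `coef(fit)` reports the intercept, the first four treatment effects, and the first three block effects; the last effect in each factor is the negative of the sum of the others.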