UW-Madison STAT 572 - Handouts 23-2


Multiple Logistic Regression

Bret Larget
Departments of Botany and of Statistics
University of Wisconsin-Madison
April 26, 2007
Statistics 572 (Spring 2007)


Multiple Logistic Regression

Multiple logistic regression is an extension of logistic regression to the case where there may be multiple explanatory variables.
The basic idea is the same: the probability of one outcome is modeled as a function of a linear combination of several explanatory variables.
A special case of multiple logistic regression is when the probability varies as a polynomial function of a single quantitative explanatory variable.
This is similar to polynomial regression.


Cow Example: Mastitis in Cows

In an example from a student in class, we have daily data on the number of cows from a dairy herd that experience new cases of mastitis, or inflammation of the udder.
Mastitis is a costly problem for dairy farmers.
We wish to examine the trend in the rate of mastitis over time.
We will consider possible nonlinear trends in time.


Cow Example: Data

We will model the new cases of mastitis as the response variable.
The size of the herd changes slightly each day.
We account for the changes in herd size, but do not model individual cows.
We create a new variable called time, which is days since the beginning of the year.

> cows = read.table("cows", header = T)
> str(cows)
'data.frame':   74 obs. of  5 variables:
 $ date          : Factor w/ 74 levels "1/1/07","1/10/07",..: 1 22 25 26 27 28 29 30 2 3 ...
 $ numCows       : num  1177 1174 1178 1182 1190 ...
 $ numMastitis   : num  38 35 34 35 33 36 32 31 32 32 ...
 $ numNewMastitis: num  2 5 4 5 4 0 4 4 3 8 ...
 $ milk          : num  87.8 86.8 86.1 85.4 85.2 84.8 85.6 85.6 86.2 87.1 ...
> cows = data.frame(time = 1:74, cows)
> str(cows)
'data.frame':   74 obs. of  6 variables:
 $ time          : int  1 2 3 4 5 6 7 8 9 10 ...
 $ date          : Factor w/ 74 levels "1/1/07","1/10/07",..: 1 22 25 26 27 28 29 30 2 3 ...
 $ numCows       : num  1177 1174 1178 1182 1190 ...
 $ numMastitis   : num  38 35 34 35 33 36 32 31 32 32 ...
 $ numNewMastitis: num  2 5 4 5 4 0 4 4 3 8 ...
 $ milk          : num  87.8 86.8 86.1 85.4 85.2 84.8 85.6 85.6 86.2 87.1 ...
> attach(cows)


Cow Example: First GLM Analysis

> prop = numNewMastitis/numCows
> fit1 = glm(prop ~ time, family = binomial, weights = numCows)
> summary(fit1)

Call:
glm(formula = prop ~ time, family = binomial, weights = numCows)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.95874  -0.58793   0.02157   0.42698   2.24759

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.823143   0.119411 -48.765   <2e-16 ***
time         0.004649   0.002653   1.752   0.0797 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 83.08  on 73  degrees of freedom
Residual deviance: 80.00  on 72  degrees of freedom
AIC: 316.22

Number of Fisher Scoring iterations: 4


Cow Example: Second GLM Analysis

> fit2 = glm(prop ~ time + I(time^2), family = binomial, weights = numCows)
> summary(fit2)

Call:
glm(formula = prop ~ time + I(time^2), family = binomial, weights = numCows)

Deviance Residuals:
      Min         1Q     Median         3Q        Max
-2.919315  -0.547581  -0.005287   0.456351   2.294789

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.763e+00  1.827e-01 -31.535   <2e-16 ***
time         7.236e-05  1.089e-02   0.007    0.995
I(time^2)    5.959e-05  1.377e-04   0.433    0.665
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 83.080  on 73  degrees of freedom
Residual deviance: 79.814  on 71  degrees of freedom
AIC: 318.04

Number of Fisher Scoring iterations: 4


Cow Example: Comments

For this data set, there is slight evidence of an increasing trend of mastitis rate in time, but no need for a quadratic model.
In fact, for this data a regular linear model would have sufficed.
The next plot compares the logistic regression and simple linear regression models.


Cow Example: Plots

> fit3 = lm(prop ~ time)
> plot(time, prop, pch = 16)
> eta = predict(fit1, data.frame(time = time))
> prob = exp(eta)/(1 + exp(eta))
> lines(time, prob, col = "red")
> abline(fit3, col = "blue")

[Figure: scatterplot of prop (vertical axis, 0.000 to 0.008) against time (horizontal axis, 0 to 60), with the fitted logistic regression curve in red and the simple linear regression line in blue.]


Seed Germination Experiment

We studied this seed germination data earlier in the semester.
In an experiment, four sites were selected where the soil and climate conditions were expected to be very similar within the site.
Here we will treat each site as a block.
Within each block, five plots were identified.
The treatment was applying a seed disinfectant to seeds. There were four different treatments (brands) plus a control.
The researchers planted 100 seeds from a single treatment in each plot.
The response is the number of seeds that germinated.


Seed Germination Experiment: Data

                 Block
Treatment     1    2    3    4
Control      86   90   88   87
Arasan       98   94   93   89
Spergon      96   90   91   92
Semesan      97   95   91   92
Fermate      91   93   95   95

[Figure: the data plotted twice, once with lines showing treatment and once with lines showing blocking.]


Seed Germination Experiment: Model

A model is

$$\eta_{ij} = \mu + \alpha_i + \beta_j, \qquad P\{\text{seed } ij \text{ germinates}\} = \frac{e^{\eta_{ij}}}{1 + e^{\eta_{ij}}}$$

where:
$\mu$ is an intercept,
$\alpha_i$ is the effect of treatment $i$, where $\sum_i \alpha_i = 0$,
and $\beta_j$ is the effect in block $j$, where $\sum_j \beta_j = 0$.
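
The handout's own fitting code for this model is not included in the text above, so the following is only a minimal sketch of how the additive treatment-plus-block model could be fit in R. The germination counts are copied from the data table; the object names (`seeds`, `seed_fit`, `fitted_prob`) and the choice of `contr.sum` contrasts to mirror the sum-to-zero constraints are assumptions made for illustration, not the author's code.

```r
# Germination counts out of 100 seeds per plot, copied from the data table.
# Object and column names are hypothetical; they do not come from the handout.
treatments <- c("Control", "Arasan", "Spergon", "Semesan", "Fermate")
seeds <- data.frame(
  treatment  = factor(rep(treatments, each = 4), levels = treatments),
  block      = factor(rep(1:4, times = 5)),
  germinated = c(86, 90, 88, 87,   # Control
                 98, 94, 93, 89,   # Arasan
                 96, 90, 91, 92,   # Spergon
                 97, 95, 91, 92,   # Semesan
                 91, 93, 95, 95)   # Fermate
)
seeds$failed <- 100 - seeds$germinated

# Logistic regression with additive treatment and block effects.
# contr.sum is an assumed choice that imposes sum-to-zero constraints,
# so the reported treatment and block coefficients correspond to the
# first four alpha_i and the first three beta_j.
seed_fit <- glm(cbind(germinated, failed) ~ treatment + block,
                family = binomial, data = seeds,
                contrasts = list(treatment = "contr.sum", block = "contr.sum"))
summary(seed_fit)

# Fitted germination probability for each plot: exp(eta) / (1 + exp(eta)).
seeds$fitted_prob <- plogis(predict(seed_fit))
```

Under these contrasts the last effect in each factor is not printed; it equals minus the sum of the others (for example, the Fermate effect is minus the sum of the four reported treatment coefficients). The two-column response `cbind(germinated, failed)` is an alternative to the proportion-plus-weights form used for the cow data and fits the same kind of binomial model.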

