PSU STAT 504 - Generalized Linear Models - D634182

Home> Schools> Penn State University> Statistics (STAT) > STAT 504> Generalized Linear Models

DOC PREVIEW

PSU STAT 504 - Generalized Linear Models

School name Penn State University

Course Stat 504- Analysis of Discrete Data

Pages 13

This preview shows page 1-2-3-4 out of 13 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Stat 504, Lecture 16 1'&$%More on GeneralizedLinear ModelsLast time, we began an analysis of the carrot data:Blocklog-dose 1 2 31.52 10/35 17/38 10/341.64 16/42 10/40 10/381.76 8/50 8/33 5/361.88 6/42 8/39 3/352.00 9/35 5/47 2/492.12 9/42 17/42 1/402.24 1/32 6/35 3/222.36 2/28 4/35 2/31Fitting a binomial logistic model with• a linear eﬀect for log-dose and• two dummy indicators to distinguish amongblock,we found that the eﬀects of log-dose and block werehighly signiﬁcant, but the model did not ﬁt.Stat 504, Lecture 16 2'&$%Possible reasons for lack of ﬁt are• the eﬀect of log-dose might be more complicatedthan a linear function,• interactions between log-dose and block might bepresent, or• overdispersion.We probably do not want to ﬁt a model withinteractions. (Recall that the classical ANOVA modelfor a randomized block experiment assumes nointeractions.) Overdispersion is a possibility, but so isnon-linearity.Checking for overdispersion. To check foroverdispersion, we should ﬁt a maximal model. Byanalogy to the classical ANOVA, we couls say that anatural choice for the maximal model is an additivemodel with an intercept, two dummies to distinguishamong the blocks, and seven dummies to distinguishamong the levels of treatment. This model has nointeractions, but it does not assume any particularfunctional form (e.g. linear) between the response andlog-dose. Here’s SAS code to ﬁt this model and togenerate two diagnostic plots:Stat 504, Lecture 16 3'&$%options nocenter nodate nonumber linesize=72;data carrot;input logdose block y n;cards;1.52 1 10 351.52 2 17 381.52 3 10 341.64 1 16 42- -lines omitted--2.36 2 4 352.36 3 2 31;proc logist data=carrot;class block (order=data param=ref ref=last);class logdose (order=data param=ref ref=first);model y/n = block logdose / scale=none;output out=out1 xbeta=xb reschi=reschi h=h;run;axis1 label=(’Linear predictor’);axis2 label=(’Pearson Residual’);axis3 label=(’Leverage Value’);proc gplot data=out1;title ’Residuals versus Linear Predictor’;plot reschi * xb / haxis=axis1 vaxis=axis2;run;proc gplot data=out1;title ’Residuals versus Leverage Values’;plot reschi * h / haxis=axis3 vaxis=axis2;run;Stat 504, Lecture 16 4'&$%Here are some results.Deviance and Pearson Goodness-of-Fit StatisticsCriterion DF Value Value/DF Pr > ChiSqDeviance 14 27.1528 1.9395 0.0184Pearson 14 25.6181 1.8299 0.0289Type III Analysis of EffectsWaldEffect DF Chi-Square Pr > ChiSqblock 2 13.9014 0.0010logdose 7 40.3332 <.0001Analysis of Maximum Likelihood EstimatesStandard WaldParameter DF Estimate Error Chi-Square Pr > ChiSqIntercept 1 -1.1411 0.2565 19.7980 <.0001block 1 1 0.5551 0.2340 5.6261 0.0177block 2 1 0.8499 0.2280 13.8956 0.0002logdose 1.64 1 -0.2086 0.2882 0.5237 0.4693logdose 1.76 1 -0.9033 0.3184 8.0459 0.0046logdose 1.88 1 -1.1480 0.3352 11.7274 0.0006logdose 2 1 -1.3347 0.3387 15.5292 <.0001logdose 2.12 1 -0.6481 0.3011 4.6323 0.0314logdose 2.24 1 -1.5001 0.3958 14.3669 0.0002logdose 2.36 1 -1.7664 0.4248 17.2924 <.0001Once again, the eﬀects of logdose and block aresigniﬁcant, but the model does not ﬁt.Stat 504, Lecture 16 5'&$%Stat 504, Lecture 16 6'&$%None of the residuals looks unusually large; the lackof ﬁt appears to be spread across the data, ratherthan localized in a few cases. Thus overdispersionseems like a reasonable explanation. The estimate ofthe scale parameter isˆσ2= X2/df =25.6181/14 = 1.830.We’ll now use this scale parameter to assess the ﬁt ofthe simpler model that has a linear eﬀect for log-dose.proc logist data=carrot;class block (order=data param=ref ref=last);model y/n = block logdose / scale=1.353;output out=out1 xbeta=xb reschi=reschi h=h;run;data out2;set out1;scaledres=reschi/1.353;run;axis1 label=(’Linear predictor’);axis2 label=(’Pearson Residual’);axis3 label=(’Leverage Value’);proc gplot data=out2;title ’Residuals versus Linear Predictor’;plot scaledres * xb / haxis=axis1 vaxis=axis2;run;proc gplot data=out2;title ’Residuals versus Leverage Values’;plot scaledres * h / haxis=axis3 vaxis=axis2;run;Stat 504, Lecture 16 7'&$%Notice that we set the scale factor to√1.830 = 1.353,because what SAS calls the scale parameter is whatwe would call σ. And we also divided the Pearsonresiduals by the estimate of σ. Here are some results:Deviance and Pearson Goodness-of-Fit StatisticsCriterion DF Value Value/DF Pr > ChiSqDeviance 20 39.9757 1.9988 0.0050Pearson 20 41.1746 2.0587 0.0035Number of events/trials observations: 24NOTE: The covariance matrix has been multiplied by the heterogeneityfactor (square of SCALE=1.353) 1.83061.Type III Analysis of EffectsWaldEffect DF Chi-Square Pr > ChiSqblock 2 7.6030 0.0223logdose 1 15.2585 <.0001Analysis of Maximum Likelihood EstimatesStandard WaldParameter DF Estimate Error Chi-Square Pr > ChiSqIntercept 1 1.4802 0.8878 2.7796 0.0955block 1 1 0.5424 0.3136 2.9908 0.0837block 2 1 0.8432 0.3058 7.6027 0.0058logdose 1 -1.8174 0.4653 15.2585 <.0001Stat 504, Lecture 16 8'&$%This illustrates how we can use the scale parameterestimated from a maximal model to assess the ﬁt ofsmaller models. It’s a good idea to do that, because ifthe smaller model does not ﬁt, the scale factorestimated from the smaller model may be biased.The eﬀects of block and log-dose are still signiﬁcant.Does the model ﬁt? The scaled ﬁt statistics areG2/ˆσ2=39.9757/1.830 = 21.85X2/ˆσ2=41.1746/1.830 = 22.50and the corresponding p-values areP (χ220≥ 21.85) = .35,P (χ220≥ 22.50) = .31,so it does seem to ﬁt.Stat 504, Lecture 16 9'&$%Here’s the plot of the scaled residuals versus theleverage values:The largest residual is less than 3.0, which is not bad.Now let’s check the link function using Hinkley’s(1985) test:proc logist data=carrot;class block (order=data param=ref ref=last);model y/n = block logdose / scale=1.353;output out=out1 xbeta=xb reschi=reschi h=h pred=pihat;run;data out2;set out1;scaledres=reschi/1.353;xb2 = xb**2;absres = abs(scaledres);run;Stat 504, Lecture 16 10'&$%proc logist data=out2;class block (order=data param=ref ref=last);model y/n = block logdose xb2 / scale=none;run;axis1 label=(’Fitted Probability’);axis2 label=(’Absolute Value of Pearson Residual’);proc gplot data=out2;title ’Absolute Residuals versus Fitted Probability’;plot absres * pihat / haxis=axis1 vaxis=axis2;run;Relevant output:Analysis of Maximum Likelihood EstimatesStandard WaldParameter DF Estimate Error Chi-Square Pr >

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-4 out of 13 pages.

PSU STAT 504 - Generalized Linear Models

Sign up for free to view:

Please select your school