DOC PREVIEW
PSU STAT 504 - Generalized Linear Models

This preview shows page 1-2-3-4 out of 13 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Stat 504, Lecture 16 1'&$%More on GeneralizedLinear ModelsLast time, we began an analysis of the carrot data:Blocklog-dose 1 2 31.52 10/35 17/38 10/341.64 16/42 10/40 10/381.76 8/50 8/33 5/361.88 6/42 8/39 3/352.00 9/35 5/47 2/492.12 9/42 17/42 1/402.24 1/32 6/35 3/222.36 2/28 4/35 2/31Fitting a binomial logistic model with• a linear effect for log-dose and• two dummy indicators to distinguish amongblock,we found that the effects of log-dose and block werehighly significant, but the model did not fit.Stat 504, Lecture 16 2'&$%Possible reasons for lack of fit are• the effect of log-dose might be more complicatedthan a linear function,• interactions between log-dose and block might bepresent, or• overdispersion.We probably do not want to fit a model withinteractions. (Recall that the classical ANOVA modelfor a randomized block experiment assumes nointeractions.) Overdispersion is a possibility, but so isnon-linearity.Checking for overdispersion. To check foroverdispersion, we should fit a maximal model. Byanalogy to the classical ANOVA, we couls say that anatural choice for the maximal model is an additivemodel with an intercept, two dummies to distinguishamong the blocks, and seven dummies to distinguishamong the levels of treatment. This model has nointeractions, but it does not assume any particularfunctional form (e.g. linear) between the response andlog-dose. Here’s SAS code to fit this model and togenerate two diagnostic plots:Stat 504, Lecture 16 3'&$%options nocenter nodate nonumber linesize=72;data carrot;input logdose block y n;cards;1.52 1 10 351.52 2 17 381.52 3 10 341.64 1 16 42- -lines omitted--2.36 2 4 352.36 3 2 31;proc logist data=carrot;class block (order=data param=ref ref=last);class logdose (order=data param=ref ref=first);model y/n = block logdose / scale=none;output out=out1 xbeta=xb reschi=reschi h=h;run;axis1 label=(’Linear predictor’);axis2 label=(’Pearson Residual’);axis3 label=(’Leverage Value’);proc gplot data=out1;title ’Residuals versus Linear Predictor’;plot reschi * xb / haxis=axis1 vaxis=axis2;run;proc gplot data=out1;title ’Residuals versus Leverage Values’;plot reschi * h / haxis=axis3 vaxis=axis2;run;Stat 504, Lecture 16 4'&$%Here are some results.Deviance and Pearson Goodness-of-Fit StatisticsCriterion DF Value Value/DF Pr > ChiSqDeviance 14 27.1528 1.9395 0.0184Pearson 14 25.6181 1.8299 0.0289Type III Analysis of EffectsWaldEffect DF Chi-Square Pr > ChiSqblock 2 13.9014 0.0010logdose 7 40.3332 <.0001Analysis of Maximum Likelihood EstimatesStandard WaldParameter DF Estimate Error Chi-Square Pr > ChiSqIntercept 1 -1.1411 0.2565 19.7980 <.0001block 1 1 0.5551 0.2340 5.6261 0.0177block 2 1 0.8499 0.2280 13.8956 0.0002logdose 1.64 1 -0.2086 0.2882 0.5237 0.4693logdose 1.76 1 -0.9033 0.3184 8.0459 0.0046logdose 1.88 1 -1.1480 0.3352 11.7274 0.0006logdose 2 1 -1.3347 0.3387 15.5292 <.0001logdose 2.12 1 -0.6481 0.3011 4.6323 0.0314logdose 2.24 1 -1.5001 0.3958 14.3669 0.0002logdose 2.36 1 -1.7664 0.4248 17.2924 <.0001Once again, the effects of logdose and block aresignificant, but the model does not fit.Stat 504, Lecture 16 5'&$%Stat 504, Lecture 16 6'&$%None of the residuals looks unusually large; the lackof fit appears to be spread across the data, ratherthan localized in a few cases. Thus overdispersionseems like a reasonable explanation. The estimate ofthe scale parameter isˆσ2= X2/df =25.6181/14 = 1.830.We’ll now use this scale parameter to assess the fit ofthe simpler model that has a linear effect for log-dose.proc logist data=carrot;class block (order=data param=ref ref=last);model y/n = block logdose / scale=1.353;output out=out1 xbeta=xb reschi=reschi h=h;run;data out2;set out1;scaledres=reschi/1.353;run;axis1 label=(’Linear predictor’);axis2 label=(’Pearson Residual’);axis3 label=(’Leverage Value’);proc gplot data=out2;title ’Residuals versus Linear Predictor’;plot scaledres * xb / haxis=axis1 vaxis=axis2;run;proc gplot data=out2;title ’Residuals versus Leverage Values’;plot scaledres * h / haxis=axis3 vaxis=axis2;run;Stat 504, Lecture 16 7'&$%Notice that we set the scale factor to√1.830 = 1.353,because what SAS calls the scale parameter is whatwe would call σ. And we also divided the Pearsonresiduals by the estimate of σ. Here are some results:Deviance and Pearson Goodness-of-Fit StatisticsCriterion DF Value Value/DF Pr > ChiSqDeviance 20 39.9757 1.9988 0.0050Pearson 20 41.1746 2.0587 0.0035Number of events/trials observations: 24NOTE: The covariance matrix has been multiplied by the heterogeneityfactor (square of SCALE=1.353) 1.83061.Type III Analysis of EffectsWaldEffect DF Chi-Square Pr > ChiSqblock 2 7.6030 0.0223logdose 1 15.2585 <.0001Analysis of Maximum Likelihood EstimatesStandard WaldParameter DF Estimate Error Chi-Square Pr > ChiSqIntercept 1 1.4802 0.8878 2.7796 0.0955block 1 1 0.5424 0.3136 2.9908 0.0837block 2 1 0.8432 0.3058 7.6027 0.0058logdose 1 -1.8174 0.4653 15.2585 <.0001Stat 504, Lecture 16 8'&$%This illustrates how we can use the scale parameterestimated from a maximal model to assess the fit ofsmaller models. It’s a good idea to do that, because ifthe smaller model does not fit, the scale factorestimated from the smaller model may be biased.The effects of block and log-dose are still significant.Does the model fit? The scaled fit statistics areG2/ˆσ2=39.9757/1.830 = 21.85X2/ˆσ2=41.1746/1.830 = 22.50and the corresponding p-values areP (χ220≥ 21.85) = .35,P (χ220≥ 22.50) = .31,so it does seem to fit.Stat 504, Lecture 16 9'&$%Here’s the plot of the scaled residuals versus theleverage values:The largest residual is less than 3.0, which is not bad.Now let’s check the link function using Hinkley’s(1985) test:proc logist data=carrot;class block (order=data param=ref ref=last);model y/n = block logdose / scale=1.353;output out=out1 xbeta=xb reschi=reschi h=h pred=pihat;run;data out2;set out1;scaledres=reschi/1.353;xb2 = xb**2;absres = abs(scaledres);run;Stat 504, Lecture 16 10'&$%proc logist data=out2;class block (order=data param=ref ref=last);model y/n = block logdose xb2 / scale=none;run;axis1 label=(’Fitted Probability’);axis2 label=(’Absolute Value of Pearson Residual’);proc gplot data=out2;title ’Absolute Residuals versus Fitted Probability’;plot absres * pihat / haxis=axis1 vaxis=axis2;run;Relevant output:Analysis of Maximum Likelihood EstimatesStandard WaldParameter DF Estimate Error Chi-Square Pr >


View Full Document

PSU STAT 504 - Generalized Linear Models

Download Generalized Linear Models
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Generalized Linear Models and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Generalized Linear Models 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?