Stat 544, Lecture 20

More on Polytomous Regression Models

Last time, we fit a model to the now-famous alligator food-choice dataset.

                             Primary Food Choice
    Lake      Sex  Size     Fish   Inv.  Rept.   Bird  Other
    Hancock   M    small       7      1      0      0      5
                   large       4      0      0      1      2
              F    small      16      3      2      2      3
                   large       3      0      1      2      3
    Oklawaha  M    small       2      2      0      0      1
                   large      13      7      6      0      0
              F    small       3      9      1      0      2
                   large       0      1      0      1      0
    Trafford  M    small       3      7      1      0      1
                   large       8      6      6      3      5
              F    small       2      4      1      1      4
                   large       0      1      0      0      0
    George    M    small      13     10      0      2      2
                   large       9      0      0      1      2
              F    small       3      9      1      0      1
                   large       8      1      0      0      1

We let

    $\pi_1$ = prob. of fish,
    $\pi_2$ = prob. of invertebrates,
    $\pi_3$ = prob. of reptiles,
    $\pi_4$ = prob. of birds,
    $\pi_5$ = prob. of other,

and made "fish" the baseline category. The logit equations were

    $$\log\left(\frac{\pi_j}{\pi_1}\right) = \beta_0 + \beta_1 X_1 + \cdots$$

for $j = 2, 3, 4, 5$. The X's included
• three dummy indicators for lake,
• a dummy for sex, and
• a dummy for size.

Therefore, each logit equation had six coefficients to be estimated, so the number of free parameters in this model was 4 × 6 = 24.

We found that
• lake was highly significant (Wald chi-square = 36.2, df = 12),
• size was highly significant (Wald chi-square = 15.9, df = 4),
• sex was not significant (Wald chi-square = 2.2, df = 4).

(Sex and size each contribute one dummy to each of the four logit equations, so each effect has 4 degrees of freedom.)

Wald statistics might not be as accurate as deviance tests. Let's adopt an analysis-of-deviance approach to compare various models.

First, let's find the deviance $G^2$ for the null (intercept-only) model, a model with just four parameters. Because there are $N = 4 \times 2 \times 2 = 16$ unique covariate patterns, the saturated model will have 16 × (5 − 1) = 64 free parameters, so the $G^2$ statistic for the null model should have 64 − 4 = 60 degrees of freedom.
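The parameter counting above can be sketched in a few lines of Python. This is only an illustrative check, not part of the SAS analysis; the names `TERM_COLUMNS`, `free_parameters` and `deviance_df` are made up for this sketch, and it assumes main-effects models with reference-cell dummies as described above.

```python
# Illustrative parameter counting for baseline-category logit models
# on the alligator data (hypothetical helper names, not from SAS).

N_LOGITS = 5 - 1          # five food categories, "fish" is the baseline
N_PATTERNS = 4 * 2 * 2    # lake x sex x size covariate patterns

# Number of dummy columns each main effect adds to ONE logit equation.
TERM_COLUMNS = {"lake": 3, "sex": 1, "size": 1}


def free_parameters(terms):
    """Free parameters in a main-effects model: (intercept + dummies) per logit."""
    per_logit = 1 + sum(TERM_COLUMNS[t] for t in terms)
    return N_LOGITS * per_logit


def deviance_df(terms):
    """Degrees of freedom for G^2 against the saturated model."""
    saturated = N_PATTERNS * N_LOGITS   # 16 * 4 = 64
    return saturated - free_parameters(terms)


if __name__ == "__main__":
    for terms in ([], ["lake"], ["size"], ["sex"],
                  ["lake", "size"], ["lake", "size", "sex"]):
        label = " + ".join(terms) if terms else "null"
        print(label, free_parameters(terms), deviance_df(terms))
```

With an empty list of terms the sketch reproduces the four parameters and 60 degrees of freedom of the null model, and with all three main effects it reproduces the 24 parameters counted above.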
Let's fit the null model in PROC LOGISTIC, like this:

    options nocenter nodate nonumber linesize=72;
    data gator;
       input lake $ sex $ size $ food $ count;
       cards;
    Hancock male small fish 7
    --lines omitted--
    George female large other 1
    ;

    /* intercept-only model; glogit requests baseline-category logits */
    proc logistic data=gator;
       freq count;
       class lake size sex / order=data param=ref ref=first;
       model food(ref='fish') = / link=glogit
             aggregate scale=none;
    run;

The fit statistics are:

    Model Convergence Status
    Convergence criterion (GCONV=1E-8) satisfied.

    -2 Log L = 604.3629

    Deviance and Pearson Goodness-of-Fit Statistics
    Criterion    DF     Value    Value/DF    Pr > ChiSq
    Deviance      0    0.0000       .            .
    Pearson       0    0.0000       .            .

    Number of unique profiles: 1

What happened? By default, the aggregate option calculates goodness-of-fit statistics for a table that aggregates over the unique patterns for the covariates appearing in the model. In this case, there are no covariates in the model, so there is only one "unique profile" and the intercept-only model is considered to be saturated.

We want SAS to compute the fit statistics relative to a saturated model that estimates the response probabilities independently for each combination of lake, sex and size. To do that, we change the model statement like this:

    /* aggregate=(...) defines the profiles used for the saturated model */
    proc logistic data=gator;
       freq count;
       class lake size sex / order=data param=ref ref=first;
       model food(ref='fish') = / link=glogit
             aggregate=(lake size sex) scale=none;
    run;

Now the results are:

    Deviance and Pearson Goodness-of-Fit Statistics
    Criterion    DF       Value    Value/DF    Pr > ChiSq
    Deviance     60    116.7611      1.9460        <.0001
    Pearson      60    106.4922      1.7749        0.0002

    Number of unique profiles: 16

Repeating the model-fitting for various sets of predictors, we obtain the following analysis-of-deviance table:

    Model                          G^2     df
    Saturated                     0.00      0
    Lake + Size + Lake×Size**    35.40     32
    Lake + Size + Sex            50.26     40
    Lake + Size                  52.48     44
    Lake                         73.57     48
    Size                        101.61     56
    Sex                         114.66     56
    Null                        116.76     60

    **Note: did not converge

We ran into trouble when we included the lake×size interaction.
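Nested models in the analysis-of-deviance table can be compared by differencing their $G^2$ values and degrees of freedom. The following Python sketch shows the arithmetic for the models that converged; it is an illustration only, not SAS output. The names `chi2_sf_even_df`, `DEVIANCE` and `lr_test` are hypothetical, and the tail-probability formula used here is a closed form that holds only for even degrees of freedom (which covers every comparison below). The interaction model is left out because its fit did not converge.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with EVEN df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    half = x / 2.0
    term = 1.0      # k = 0 term of the sum
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# (G^2, df) copied from the analysis-of-deviance table above.
DEVIANCE = {
    "Lake + Size + Sex": (50.26, 40),
    "Lake + Size": (52.48, 44),
    "Lake": (73.57, 48),
    "Size": (101.61, 56),
    "Sex": (114.66, 56),
    "Null": (116.76, 60),
}


def lr_test(reduced, full):
    """Deviance difference, df difference and p-value for reduced vs. full."""
    g2_reduced, df_reduced = DEVIANCE[reduced]
    g2_full, df_full = DEVIANCE[full]
    delta_g2 = g2_reduced - g2_full
    delta_df = df_reduced - df_full
    return delta_g2, delta_df, chi2_sf_even_df(delta_g2, delta_df)


if __name__ == "__main__":
    for reduced, full in [("Lake + Size", "Lake + Size + Sex"),
                          ("Lake", "Lake + Size"),
                          ("Size", "Lake + Size")]:
        d, ddf, p = lr_test(reduced, full)
        print(f"{reduced} vs {full}: dG2 = {d:.2f}, df = {ddf}, p = {p:.4g}")
```

For example, dropping sex from Lake + Size + Sex changes the deviance by 52.48 − 50.26 = 2.22 on 44 − 40 = 4 degrees of freedom, which points the same way as the Wald test for sex.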
Here are some relevant portions of the output from the model with the lake×size interaction:

    Model Convergence Status
    Quasi-complete separation of data points detected.

    WARNING: The maximum likelihood estimate may not exist.
    WARNING: The LOGISTIC procedure continues in spite of the above
             warning. Results shown are based on the last maximum
             likelihood iteration. Validity of the model fit is
             questionable.

    Deviance and Pearson Goodness-of-Fit Statistics
    Criterion    DF      Value    Value/DF    Pr > ChiSq
    Deviance     32    35.3989      1.1062        0.3109
    Pearson      32    38.2807      1.1963        0.2058

    Number of unique profiles: 16

    Model Fit Statistics
                  Intercept    Intercept and
    Criterion          Only       Covariates
    AIC             612.363          587.001
    SC              625.919          695.451
    -2 Log L        604.363          523.001

    Testing Global Null Hypothesis: BETA=0
    Test                Chi-Square    DF    Pr > ChiSq
    Likelihood Ratio       81.3622    28        <.0001
    Score                  73.0595    28        <.0001
    Wald                   44.1606    28        0.0268

    Type III Analysis of Effects
                         Wald
    Effect       DF    Chi-Square    Pr > ChiSq
    lake         12       18.6397        0.0976
    size          4        2.8868        0.5769
    lake*size    12        6.2811        0.9013

    WARNING: The validity of the model fit is questionable.

    Analysis of Maximum Likelihood Estimates
                                                      Standard          Wald
    Parameter                  food      DF  Estimate    Error    Chi-Square
    Intercept                  bird       1   -2.4423   0.7372       10.9757
    Intercept                  invert     1   -1.7492   0.5417       10.4256
    Intercept                  other      1   -1.0561   0.4105        6.6195
    Intercept                  reptile    1   -2.4423   0.7372       10.9757
    lake      Oklawaha         bird       1  -10.2353    253.2        0.0016
    lake      Oklawaha         invert     1    2.5377   0.7645       11.0196
    lake      Oklawaha         other      1    0.5452   0.8377        0.4236
    lake      Oklawaha         reptile    1    0.8329   1.3204        0.3979
    lake      Trafford         bird       1    0.8329   1.3204        0.3979
    lake      Trafford         invert     1    2.5377   0.7645       11.0196
    lake      Trafford         other      1    1.0561   0.7540        1.9618
    lake      Trafford         reptile    1    1.5261   1.1151        1.8728
    lake      George           bird       1    0.3629   1.0517        0.1191
    lake      George           invert     1    1.9211   0.6392        9.0317
    lake      George           other      1   -0.6179   0.7512        0.6766
    lake      George           reptile    1   -0.3302   1.2673        0.0679
    size      large            bird       1    1.5950   1.0098        2.4951
    size      large            invert     1  -10.2786    154.6        0.0044
    size      large            other      1    0.7196   0.7151        1.0126
    size      large            reptile    1    0.4964   1.2986        0.1461
    lake*size Oklawaha large   bird       1    8.5176    253.2        0.0011
    lake*size Oklawaha large   invert     1    9.0046    154.6        0.0034
    lake*size Oklawaha large   other      1  -12.6194    137.4        0.0084
    lake*size Oklawaha large   reptile    1    0.3398   1.7692        0.0369
    lake*size Trafford large   bird       1   -0.9664   1.6365        0.3488
    lake*size Trafford large   invert     1    9.3566    154.6        0.0037
    lake*size Trafford large   other      1   -1.1896   1.1119        1.1446
    lake*size Trafford large   reptile    1    0.1322   1.6365        0.0065
    lake*size George large     bird       1   -2.3488   1.6251        2.0890
    lake*size George large     invert     1    7.2735    154.6        0.0022
    lake*size George large     other      1   -0.7802   1.1399        0.4685
    lake*size George large     reptile    1  -11.1988    204.6        0.0030

"Quasi-separation" means that the model effectively includes dummy indicators for groups with observed frequencies of zero, so that the ML estimates for certain coefficients are running off to ±∞. Notice in the table of ML estimates that