PSU STAT 504 - Logistic Regression - D2424387

Course Stat 504- Analysis of Discrete Data

Stat 504, Lecture 12

Everything You Ever Wanted To Know About Logistic Regression

Last time, we discussed the analysis of deviance for the delinquency example:

    Model                    G^2     df    p
    Saturated                 0.00    0    -
    S + B                     0.15    2    .928
    S                         0.16    3    .984
    B                        28.80    4    .000
    Null (intercept only)    36.41    5    .000

Based on this table, we should choose the S model, because it's the simplest one that fits the data well.

But let's temporarily turn our attention to the saturated model. We can write this model as

$$\log\frac{\pi}{1-\pi} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_1 X_2 + \beta_5 X_1 X_3, \tag{1}$$

where $\pi$ is the probability of delinquency,

    $X_1 = 1$ if B=scout, 0 otherwise

is the main effect for B, and

    $X_2 = 1$ if S=medium, 0 otherwise,
    $X_3 = 1$ if S=high, 0 otherwise

are the main effects for S. Let's fit this model in SAS:

    options nocenter nodate nonumber linesize=72;

    /* One record per S x B group: delinquent and nondelinquent counts */
    data new;
      input S $ B $ delinq nondelinq;
      y = delinq;               /* number of events */
      n = delinq + nondelinq;   /* number of trials */
      cards;
    low scout 11 43
    low nonscout 42 169
    medium scout 14 104
    medium nonscout 20 132
    high scout 8 196
    high nonscout 2 59
    ;

    /* Saturated model; reference levels are B=nonscout and S=low */
    proc logistic data=new;
      class B (order=data param=ref ref=last)
            S (order=data param=ref ref=first);
      model y/n = B S B*S / scale=none;
    run;

The table of coefficients is shown below:

    Analysis of Maximum Likelihood Estimates

                                        Standard        Wald
    Parameter            DF  Estimate      Error  Chi-Square  Pr > ChiSq
    Intercept             1   -1.3922     0.1724     65.2041      <.0001
    B    scout            1    0.0289     0.3793      0.0058      0.9392
    S    medium           1   -0.4948     0.2955      2.8048      0.0940
    S    high             1   -1.9921     0.7394      7.2597      0.0071
    B*S  scout medium     1   -0.1472     0.5315      0.0767      0.7818
    B*S  scout high       1    0.1568     0.8893      0.0311      0.8601

How do we interpret these effects?
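Before turning to the interpretation, it may help to see where the numbers in the table come from. Because the model is saturated (six parameters for six S × B groups), its fitted log-odds equal the observed log-odds log(delinq/nondelinq) in every group, so equation (1) can be solved for the coefficients by hand. The following is a minimal sketch in Python rather than SAS, offered only as a cross-check; the variable names are illustrative, and the reference levels (S=low, B=nonscout) are taken from the CLASS statement above.

```python
import math

# Cell counts (delinquent, nondelinquent), copied from the SAS data step.
counts = {
    ("low", "scout"): (11, 43),
    ("low", "nonscout"): (42, 169),
    ("medium", "scout"): (14, 104),
    ("medium", "nonscout"): (20, 132),
    ("high", "scout"): (8, 196),
    ("high", "nonscout"): (2, 59),
}

# Observed log-odds of delinquency in each S x B group.
logit = {group: math.log(d / nd) for group, (d, nd) in counts.items()}

# Solve equation (1) for the coefficients under reference coding
# (assumed reference levels: S=low, B=nonscout).
b0 = logit[("low", "nonscout")]
b1 = logit[("low", "scout")] - b0
b2 = logit[("medium", "nonscout")] - b0
b3 = logit[("high", "nonscout")] - b0
b4 = logit[("medium", "scout")] - b0 - b1 - b2
b5 = logit[("high", "scout")] - b0 - b1 - b3

labels = ["Intercept", "B scout", "S medium", "S high",
          "B*S scout medium", "B*S scout high"]
for label, value in zip(labels, [b0, b1, b2, b3, b4, b5]):
    print(f"{label:18s} {value:8.4f}")
```

The printed values should agree with the Estimate column to about three decimal places; small differences in the fourth decimal are to be expected, since SAS obtains its estimates iteratively.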
Referring back to (1), we see that the log-odds of delinquency for each S × B group are:

    S        B          log-odds
    low      scout      $\beta_0 + \beta_1$
    low      nonscout   $\beta_0$
    medium   scout      $\beta_0 + \beta_1 + \beta_2 + \beta_4$
    medium   nonscout   $\beta_0 + \beta_2$
    high     scout      $\beta_0 + \beta_1 + \beta_3 + \beta_5$
    high     nonscout   $\beta_0 + \beta_3$

Therefore,

$$\beta_1 = (\text{log-odds for S=low, B=scout}) - (\text{log-odds for S=low, B=nonscout}).$$

In other words, $\beta_1$ gives the effect of scouting on delinquency when S=low. The estimate of $\beta_1$ in the SAS output agrees with the log of the B × D (scouting by delinquency) odds ratio for S=low,

$$\log\frac{11 \times 169}{42 \times 43} = 0.0289.$$

The effect of scouting for S=medium, however, is

$$\beta_1 + \beta_4 = (\text{log-odds for S=medium, B=scout}) - (\text{log-odds for S=medium, B=nonscout}),$$

and the effect of scouting for S=high is

$$\beta_1 + \beta_5 = (\text{log-odds for S=high, B=scout}) - (\text{log-odds for S=high, B=nonscout}).$$

Estimates of these latter two effects do not directly appear in the SAS output. How can we get them?

One way is to simply calculate them from the individual $\hat\beta_j$'s, and then get their standard errors from the elements of the estimated covariance matrix. For example, the estimated standard error for $\hat\beta_1 + \hat\beta_4$ is

$$\sqrt{\hat V(\hat\beta_1) + \hat V(\hat\beta_4) + 2\,\widehat{\mathrm{Cov}}(\hat\beta_1, \hat\beta_4)}.$$

Adding the covb option to the model statement in PROC LOGISTIC will cause SAS to print out the estimated covariance matrix.

Another way is to recode the model so that the estimates of interest and their standard errors appear directly in the table of coefficients. Suppose that we define the following dummy variables:

    $X_1 = 1$ if S=low, 0 otherwise
    $X_2 = 1$ if S=low and B=scout, 0 otherwise
    $X_3 = 1$ if S=medium, 0 otherwise
    $X_4 = 1$ if S=medium and B=scout, 0 otherwise
    $X_5 = 1$ if S=high, 0 otherwise
    $X_6 = 1$ if S=high and B=scout, 0 otherwise

Then we fit the model

$$\log\frac{\pi}{1-\pi} = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6.$$

Notice that this new model does not include an intercept; an intercept would cause a collinearity problem, because $X_1 + X_3 + X_5 = 1$.
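The standard-error formula for the sum of two coefficients given earlier can be checked by hand for the saturated model (1). Here $\beta_1$ and $\beta_4$ refer to the coefficients of model (1), not to the recoded model. The sketch below is in Python rather than SAS and rests on two assumptions that are not stated in the notes: the usual large-sample variance of an observed log-odds, 1/delinq + 1/nondelinq, and independence of the six S × B groups. Under those assumptions $\hat\beta_1$ is the S=low log odds ratio and $\hat\beta_4$ is the S=medium log odds ratio minus the S=low one, which fixes their variances and covariance. The covariance matrix that covb would print is not shown in the notes, so the values here are a derivation, not SAS output.

```python
import math

# Cell counts (delinquent, nondelinquent), copied from the SAS data step.
counts = {
    ("low", "scout"): (11, 43),
    ("low", "nonscout"): (42, 169),
    ("medium", "scout"): (14, 104),
    ("medium", "nonscout"): (20, 132),
}


def logit_var(delinq, nondelinq):
    """Assumed large-sample variance of an observed log-odds."""
    return 1.0 / delinq + 1.0 / nondelinq


def log_or_var(s):
    """Variance of the scout vs. nonscout log odds ratio within level s."""
    return logit_var(*counts[(s, "scout")]) + logit_var(*counts[(s, "nonscout")])


v_low = log_or_var("low")
v_med = log_or_var("medium")

# beta1_hat = logOR(low); beta4_hat = logOR(medium) - logOR(low).
# The two 2x2 tables are assumed independent, so:
var_b1 = v_low
var_b4 = v_med + v_low
cov_b1_b4 = -v_low

# Standard error of beta1_hat + beta4_hat via the covariance formula.
se_sum = math.sqrt(var_b1 + var_b4 + 2.0 * cov_b1_b4)

print(f"SE(b1)      = {math.sqrt(var_b1):.4f}")
print(f"SE(b4)      = {math.sqrt(var_b4):.4f}")
print(f"SE(b1 + b4) = {se_sum:.4f}")
```

The first two values can be compared with the standard errors reported for `B scout` and `B*S scout medium` in the coefficient table. The third is the standard error of the scouting effect at S=medium, which simplifies to the square root of `v_med` because the covariance term cancels the S=low contribution.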
Under this new coding scheme, the log-odds of delinquency for each S × B group are:

    S        B          log-odds
    low      scout      $\beta_1 + \beta_2$
    low      nonscout   $\beta_1$
    medium   scout      $\beta_3 + \beta_4$
    medium   nonscout   $\beta_3$
    high     scout      $\beta_5 + \beta_6$
    high     nonscout   $\beta_5$

Therefore,

    $\beta_2$ = effect of scouting when S=low,
    $\beta_4$ = effect of scouting when S=medium,
    $\beta_6$ = effect of scouting when S=high.

SAS code for fitting this new model is shown below.

    options nocenter nodate nonumber linesize=72;

    data new;
      input S $ B $ delinq nondelinq;
      y = delinq;
      n = delinq + nondelinq;
      /* Logical expressions evaluate to 1 (true) or 0 (false) */
      x1 = (S="low");
      x2 = (S="low")*(B="scout");
      x3 = (S="medium");
      x4 = (S="medium")*(B="scout");
      x5 = (S="high");
      x6 = (S="high")*(B="scout");
      cards;
    low scout 11 43
    low nonscout 42 169
    medium scout 14 104
    medium nonscout 20 132
    high scout 8 196
    high nonscout 2 59
    ;

    /* No-intercept model: x1, x3, x5 act as separate intercepts for each S level */
    proc logistic data=new;
      model y/n = x1 x2 x3 x4 x5 x6 / noint scale=none;
    run;

In the model statement, notice the use of the noint option to remove the intercept. The estimated table of coefficients is shown below.

    Analysis of Maximum Likelihood Estimates

                                 Standard        Wald
    Parameter     DF  Estimate      Error  Chi-Square  Pr > ChiSq
    x1             1   -1.3922     0.1724     65.2041      <.0001
    x2             1    0.0289     0.3793      0.0058      0.9392
    x3             1   -1.8871     0.2399     61.8495      <.0001
    x4             1   -0.1183     0.3723      0.1009      0.7508
    x5             1   -3.3843     0.7190     22.1575      <.0001
    x6             1    0.1857     0.8044      0.0533      0.8174

I will leave it to you to verify that the estimates and standard errors for $\beta_2$, $\beta_4$ and $\beta_6$ correspond to the log-odds ratios and SE's that we get from analyzing the B × D tables for S=low, S=medium and S=high.

Introduction to overdispersion

Overdispersion is an important concept in the analysis of discrete data. In the context of logistic regression, overdispersion occurs when the discrepancies between the observed responses $y_i$ and their predicted values $\hat\mu_i = n_i\hat\pi_i$ are larger than what the binomial model would predict.
If overdispersion is present in a dataset, the estimated standard errors and test statistics will be distorted and adjustments should be made.

There is no such thing as overdispersion in ordinary linear regression. In a linear regression model

$$y_i \sim N(x_i^T\beta, \sigma^2),$$

the variance $\sigma^2$ is estimated independently of the mean function $x_i^T\beta$. With discrete response variables, however, the possibility for overdispersion exists because the commonly used distributions specify particular relationships between the variance and the mean. If $y_i \sim \mathrm{Bin}(n_i, \pi_i)$, the mean is $\mu_i = n_i\pi_i$ and the variance is $\mu_i(n_i - \mu_i)/n_i$. Overdispersion means that the data show evidence that the variance of the response $y_i$ is greater than $\mu_i(n_i - \mu_i)/n_i$.

Underdispersion is also theoretically possible, but rare in practice. McCullagh and Nelder (1989) say that overdispersion is the rule rather than the exception.

Overdispersion arises when the $n_i$ Bernoulli trials that are summarized in a line of the dataset are

• not identically distributed (i.e. the success probabilities vary from one trial to the next), or
• not independent (i.e.
