PSU STAT 544 - LECTURE NOTES

Stat 544, Lecture 18

ZIP, ZAP and IPF

Zero-inflated models. With count data, it is quite common to find more observed counts of zero (y_i = 0) than can be explained by a Poisson or negative binomial. For example, consider a survey of alcohol use among undergraduates. The response is the number of occasions in the past 30 days on which the subject consumed alcohol. The data are likely to show an excess of zeros. Some may report zero because they never drink alcohol (necessary zeros). Others may drink occasionally, but happened not to drink in the last 30 days (accidental zeros). It might be reasonable to model y_i as a mixture of two distributions:

• responses that are zero with probability one; and
• responses that follow a simple model (e.g. Poisson) with mass at 0, 1, 2, ....

When the latter part is Poisson, this is called a zero-inflated Poisson (ZIP) model.

A ZIP model may be fit using an EM algorithm. In EM we define an indicator z_i which is equal to zero if y_i comes from the zero component, and one if y_i comes from the Poisson component. If y_i = 0 then z_i is missing. The EM algorithm estimates the expectations of the z_i's at each iteration, uses these to estimate the parameters, and iterates until convergence.

The ZIP model can be extended to accommodate covariates. Covariates could enter in two places:

• in a logit model for predicting P(z_i = 1); and
• in the Poisson part, for predicting E(y_i | z_i = 1).

Details of how to fit this model are provided by Lambert (1992, Technometrics).

Some may not like the ZIP model because it leans heavily on the assumed distributional shape for the nonzero component. They prefer to model y_i as a mixture of

• responses that are zero with probability one; and
• responses from a truncated Poisson distribution,

    f(y_i | y_i > 0) = μ_i^{y_i} exp(−μ_i) / [ y_i! (1 − exp(−μ_i)) ],   y_i = 1, 2, ....

This is called a zero-altered Poisson (ZAP) model. Some have called it a "hurdle" model.
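As a concrete illustration of this EM scheme, here is a minimal sketch in Python for the intercept-only case (no covariates). The data and starting values are hypothetical; a real analysis would include covariates as in Lambert (1992).

```python
import math

def fit_zip_em(y, n_iter=500, tol=1e-12):
    """EM for a zero-inflated Poisson with no covariates (a sketch).

    Mixture: with probability p the response is a necessary zero;
    with probability 1 - p it is Poisson(mu).  For y_i = 0 the
    indicator z_i is missing, so the E-step computes
    w_i = P(zero component | y_i = 0).
    """
    n = len(y)
    p, mu = 0.5, max(sum(y) / n, 0.1)   # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each observed zero is a
        # necessary zero (positive counts must be from the Poisson part)
        w0 = p / (p + (1.0 - p) * math.exp(-mu))
        w_sum = w0 * sum(1 for yi in y if yi == 0)
        # M-step: update p and mu from expected complete-data counts
        p_new = w_sum / n
        mu_new = sum(y) / (n - w_sum)
        if abs(p_new - p) + abs(mu_new - mu) < tol:
            p, mu = p_new, mu_new
            break
        p, mu = p_new, mu_new
    return p, mu

# Hypothetical data: 60% zeros, far more than a Poisson mean of 0.9 predicts.
y = [0] * 60 + [1] * 10 + [2] * 15 + [3] * 10 + [4] * 5
p_hat, mu_hat = fit_zip_em(y)
```

At convergence the estimates satisfy the self-consistency equations of the M-step, which gives an easy correctness check on any implementation.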
This model can be fit directly without EM, because there are no "missing data"; we know the component to which each observation belongs.

Likelihood theory for loglinear models. Loglinear models for multinomial contingency tables have a long history. The first general treatment was given by Bishop, Fienberg and Holland (1975). That book presents many interesting theoretical results, some of which are reproduced in Agresti (2002). Today we will look at some of these results and discuss an old-fashioned way of fitting loglinear models called iterative proportional fitting (IPF). To make this discussion more concrete, we will present it in the context of a three-way table, but the theory applies to tables of any dimension.

Let y = (y_111, y_211, ..., y_IJK)^T be a set of counts from a three-way (I × J × K) table that cross-classifies subjects by variables A, B, and C. Today we will assume that y is multinomial with index n = y_+++ and mean μ = nπ, and assume a loglinear model for μ,

    η = log μ = Xβ.                                            (1)

This is a slight change in notation from the last lecture, where we placed the loglinear model on π. In many texts, it is customary to specify the model in terms of μ rather than π. It does not matter which one we use, as long as X contains an intercept. If the first column of X is (1, 1, ..., 1)^T, then using μ rather than π merely shifts the intercept by log n. As we discussed, the intercept is not a free parameter but a normalizing constant that ensures the π_ijk's sum to one and the μ_ijk's sum to n.

Using a notation similar to that of linear models, we can decompose the log-cell means η_ijk as

    η_ijk = λ_0 + λ^A_i + λ^B_j + λ^C_k
            + λ^AB_ij + λ^AC_ik + λ^BC_jk + λ^ABC_ijk,         (2)

where λ_0 is the "grand mean," λ^A_i is the "main effect" for A, λ^AB_ij is the "interaction" between A and B, etc.

For identifiability, we must require the λ-terms to sum to zero over any subscript:

    Σ_{i=1}^{I} λ^A_i = 0,   Σ_{i=1}^{I} λ^AB_ij = Σ_{j=1}^{J} λ^AB_ij = 0,

and so on.
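Because the hurdle likelihood factors into a Bernoulli piece for the zeros and a truncated-Poisson piece for the positive counts, the two pieces can be maximized separately. Here is a minimal intercept-only sketch with hypothetical data; with covariates one would instead maximize each piece by Newton-Raphson.

```python
import math

def fit_zap(y):
    """Direct ML fit of a zero-altered (hurdle) Poisson, no covariates.

    The zero part gives p_zero = proportion of zeros.  The truncated
    Poisson part requires mu to solve ybar = mu / (1 - exp(-mu)),
    where ybar is the mean of the positive counts (assumed > 1 here
    so an interior root exists); we solve by bisection.
    """
    n = len(y)
    pos = [yi for yi in y if yi > 0]
    p_zero = (n - len(pos)) / n
    ybar = sum(pos) / len(pos)
    # g(mu) = mu / (1 - exp(-mu)) is increasing with g(mu) > mu,
    # so the root lies in (0, ybar).
    lo, hi = 1e-12, ybar
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / (1.0 - math.exp(-mid)) < ybar:
            lo = mid
        else:
            hi = mid
    return p_zero, 0.5 * (lo + hi)

# Hypothetical counts with 40% zeros.
y = [0] * 40 + [1] * 20 + [2] * 20 + [3] * 10 + [5] * 10
p0_hat, mu_hat = fit_zap(y)
```

Note that, unlike EM for the ZIP model, nothing here is iterated between the two components; each piece of the likelihood is handled once.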
As we have discussed,

• λ_0 is not a free parameter but a normalizing constant;
• the λ^A_i's simply allow the probabilities to vary across levels of A;
• the λ^AB_ij's allow associations between A and B;

and so on. Many textbooks write the loglinear model as (2) rather than (1), but the two are closely related. If we choose a particular coding scheme for X, then the λ-terms appear as elements of β.

Suppose we use "effect coding" rather than dummy coding for each factor. Effect coding looks like this,

     i      A1   A2   ...  A_{I-1}
     1       1    0   ...    0
     2       0    1   ...    0
    ...
    I-1      0    0   ...    1
     I      -1   -1   ...   -1

whereas dummy coding looks like this:

     i      A1   A2   ...  A_{I-1}
     1       1    0   ...    0
     2       0    1   ...    0
    ...
    I-1      0    0   ...    1
     I       0    0   ...    0

For example, if I = J = K = 2 then we could write

    η = (η_111, η_211, η_121, η_221, η_112, η_212, η_122, η_222)^T,

        | 1   1   1   1   1   1   1   1 |
        | 1  -1   1   1  -1  -1   1  -1 |
        | 1   1  -1   1  -1   1  -1  -1 |
    X = | 1  -1  -1   1   1  -1  -1   1 |
        | 1   1   1  -1   1  -1  -1  -1 |
        | 1  -1   1  -1  -1   1  -1   1 |
        | 1   1  -1  -1  -1  -1   1   1 |
        | 1  -1  -1  -1   1   1   1  -1 |

Then

    β = (λ_0, λ^A_1, λ^B_1, λ^C_1, λ^AB_11, λ^AC_11, λ^BC_11, λ^ABC_111)^T,

and the other λ-terms follow from these by the zero-sum constraints,

    λ^A_1 = -λ^A_2,   λ^AB_11 = -λ^AB_12 = -λ^AB_21 = λ^AB_22,

and so on. Notice that all the λ's except for λ_0 are contrasts among the log-cell means.

Interpretation of the λ's. In loglinear modeling, the λ-terms are essentially log-odds ratios and differences among log-odds ratios. It is not hard to interpret the highest-order terms in any given model. For example, in a loglinear model for a 2 × 2 table,

    η_ij = λ_0 + λ^A_i + λ^B_j + λ^AB_ij,

it is straightforward to show that

    λ^AB_11 = (1/4) log( μ_11 μ_22 / (μ_12 μ_21) ) = (1/4) log( π_11 π_22 / (π_12 π_21) ).

In a model for a 2 × 2 × 2 table, we can show that

    λ^ABC_111 = (1/8) log[ (π_111 π_221 / (π_121 π_211)) / (π_112 π_222 / (π_122 π_212)) ],

one-eighth the log ratio of the conditional odds ratios at the two levels of C. However, when three-way associations are present, it is not so easy to interpret the two-way associations; for example, if λ^ABC_ijk is present, then λ^AB_11 is no longer simply one-fourth the log of the marginal odds ratio, (1/4) log( π_11+ π_22+ / (π_12+ π_21+) ). Interpretation of the individual λ's can be tricky.
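These identities are easy to verify numerically. The sketch below builds the effect-coded X above for a 2 × 2 × 2 table, recovers β from a hypothetical set of cell probabilities (any positive π's summing to one will do), and checks that the last element of β equals one-eighth the log ratio of the two conditional odds ratios.

```python
import math

# Cells ordered as in the display above: (i,j,k) = 111, 211, 121, 221, ...
cells = [(i, j, k) for k in (1, 2) for j in (1, 2) for i in (1, 2)]
s = lambda level: 1 if level == 1 else -1   # effect code: 1 -> +1, 2 -> -1

# Columns of X: intercept, A, B, C, AB, AC, BC, ABC
X = [[1, s(i), s(j), s(k), s(i)*s(j), s(i)*s(k), s(j)*s(k), s(i)*s(j)*s(k)]
     for (i, j, k) in cells]

# Hypothetical cell probabilities (they sum to one).
pi = {(1,1,1): .10, (2,1,1): .15, (1,2,1): .05, (2,2,1): .20,
      (1,1,2): .12, (2,1,2): .08, (1,2,2): .18, (2,2,2): .12}
eta = [math.log(pi[c]) for c in cells]

# X has orthogonal +/-1 columns with X'X = 8I, so beta = X' eta / 8.
beta = [sum(X[r][c] * eta[r] for r in range(8)) / 8 for c in range(8)]

# lambda^ABC_111: one-eighth the log ratio of the conditional odds ratios.
or_k1 = pi[1,1,1] * pi[2,2,1] / (pi[1,2,1] * pi[2,1,1])
or_k2 = pi[1,1,2] * pi[2,2,2] / (pi[1,2,2] * pi[2,1,2])
lam_abc = math.log(or_k1 / or_k2) / 8
```

The same check works for any λ-term: each coefficient is the corresponding ±1 contrast among the log-cell means, divided by the number of cells.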
For today, we will focus on the interpretation of the whole model rather than on the individual terms. Then, if we need estimates of certain log-odds ratios, we will obtain them directly from the fitted values, the estimated μ_ijk's, rather than from the coefficients.

Counting parameters. The number of free parameters in a loglinear model can be computed in the same way as for factorial ANOVA. For example, in a three-way table, we have:

     Source    No.
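The ANOVA-style bookkeeping can be sketched as follows. The function name and the dictionary of dimensions are ours, but the counts are the standard ones: each factor in a term contributes its number of levels minus one, multiplicatively.

```python
from math import prod

def term_params(dims, term):
    """ANOVA-style parameter count for one lambda-term: the term 'AB'
    in a table with I levels of A and J levels of B contributes
    (I-1)(J-1); the empty term '' is the grand mean lambda_0 (one
    parameter, though it is a normalizing constant, not a free one)."""
    return prod(dims[f] - 1 for f in term) if term else 1

dims = {'A': 2, 'B': 2, 'C': 2}     # the 2x2x2 example above
terms = ['', 'A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC']
counts = {t: term_params(dims, t) for t in terms}
# For the saturated model the counts sum to the number of cells, I*J*K.
```

For the saturated 2 × 2 × 2 model every term contributes exactly one parameter, and the eight parameters match the eight cells.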

