Duke STA 216 - Lecture 3

Lecture 3

Last Time: Definition of exponential family, components of GLM & IWLS procedure for maximum likelihood estimation

Today’s Class:
1. Existence & uniqueness of MLEs
2. Hypothesis testing through analysis of deviance
3. Standard errors & confidence intervals

Comments on Homeworks
• No late homeworks will be accepted, to make it easier for the TA
• Please hand in hardcopies of the assignments by the end of the class on the due date

Have homework exercise completed and written up for this Thursday (Sept 3)

Complete the following exercise:
1. Write down generalized linear models for the Caesarian data (grouping the two different infection types) and the cellular differentiation data.
2. Show the different components of the GLM, expressing the likelihood in exponential family form & using a canonical link function.
3. Fit the GLM using maximum likelihood and report the parameter estimates.

Existence & Uniqueness of MLEs
• It is important that the MLE exists & is unique - i.e., there is a single finite value of $\hat{\beta}$ at which $l(\beta)$ achieves its maximum.
• Let $X = (x_1, \ldots, x_n)'$ denote the $n \times p$ design matrix.
• A necessary but not sufficient condition for existence & uniqueness is that $X$ is full rank.

• For example, suppose the outcome is obesity ($y_i = 1$ or $y_i = 0$) & the predictors are $x_i = (1, \text{old}_i, \text{gene}_i)$
• $\text{old}_i = 1$ if individual $i$ is older than 35 & $\text{old}_i = 0$ otherwise
• $\text{gene}_i = 1$ if individual $i$ has a mutation in a gene encoding leptin and $\text{gene}_i = 0$ otherwise
• Data are collected for 10 subjects, resulting in

$$
y = (0, 1, 0, 0, 0, 0, 1, 1, 1, 0)'
\quad \text{and} \quad
X = \begin{pmatrix}
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 1 & 1 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0
\end{pmatrix}
$$

• Clearly, in this case $X$ is not full rank (the old and gene columns are identical) and a unique MLE does not exist
• Note that such problems can often arise when certain categories of a predictor are rare - even in larger sample sizes
• Similar problems can arise for binary, categorical or count outcomes - e.g., for studies of a rare disease or event

Estimability and Identifiability
• If a finite & unique MLE exists for a particular data set, the model is said to be estimable
• If the sample size increases or additional data are collected, an estimability problem may go away
• For example, in the obesity application we could collect more young and old subjects having the gene

• If a finite & unique MLE does not exist, regardless of sample size, the model is not identifiable.
• For example, suppose you attempted to fit the model

$$
\text{logit}\,\Pr(y_i = 1) = \beta_1 + \beta_2\,\text{young}_i + \beta_3\,\text{old}_i,
$$

where $\text{young}_i$ indicates that individual $i$ is 35 or younger and $\text{old}_i$ indicates that individual $i$ is older than 35
• Is this model identifiable? How about estimable?

Estimability Conditions
• For log-linear Poisson models, a finite & unique MLE exists if $\sum_i y_i x_i x_i'$ has full rank $p$.
• For log concave link functions & binomial data, the condition is that the system
$$
y_i x_i'\beta \ge 0, \quad (1 - y_i)\, x_i'\beta \le 0, \quad \text{for all } i,
$$
has only the trivial solution $\beta = 0$.
• In general, conditions are difficult to verify & non-convergence of the estimation procedure provides evidence of non-existence.
• In other words, the software for fitting GLMs will typically yield an error or warning message

Asymptotic Properties of MLEs
• Asymptotic existence & uniqueness: The probability that $\hat{\beta}$ exists & is unique tends to 1 as $n \to \infty$.
• Consistency: As $n \to \infty$, we have $\hat{\beta} \to \beta$ (true value)
• Asymptotic Normality: The distribution of the MLE becomes normal as $n \to \infty$.
• Asymptotic Efficiency: The MLE is asymptotically efficient compared to a wide class of other estimators.

Hypothesis Testing & Goodness-of-Fit Statistics
• Obtaining MLEs is typically not enough
• We would like some measure of uncertainty in our estimates - standard errors, confidence limits
• We also would like to conduct hypothesis tests
• For example, is the leptin gene mutation predictive of obesity?

• In linear models, frequentist hypothesis tests are typically based on likelihood ratio or Wald tests
• Likelihood ratio statistics for comparing nested models (e.g., with and without the gene indicator) have an $F$-distribution for normal linear models
• This is no longer the case in general for GLMs

Hypothesis Testing & Goodness-of-Fit in GLMs
Null Model: Common $\mu$ for $y_1, \ldots, y_n$
Full (Saturated) Model: $n$ parameters = 1 per observation
Typically, the null model is too simple & the full model is uninformative, being a simple summary of the data

However, the full model is a useful baseline for measuring the fit of an intermediate $p$-parameter model
Let $l(\hat{\mu}, \phi; y)$ denote the log-likelihood maximized over $\beta$ for a fixed value of $\phi$.
The maximum possible value of the log-likelihood is $l(y, \phi; y)$.

Analysis of Deviance
The discrepancy of fit is proportional to twice the difference between $l(y, \phi; y)$ and $l(\hat{\mu}, \phi; y)$, which can be expressed as:

$$
\sum_{i=1}^{n} 2 w_i \{ y_i(\tilde{\theta}_i - \hat{\theta}_i) - b(\tilde{\theta}_i) + b(\hat{\theta}_i) \} / \phi = D(y; \hat{\mu}) / \phi,
$$

where $\tilde{\theta}_i$ and $\hat{\theta}_i$ denote estimates of the canonical parameters under the full and current models, respectively (assuming $a_i(\phi) = \phi / w_i$).

$D(y; \hat{\mu})$ = model (residual) deviance
$D^*(y; \hat{\mu}) = D(y; \hat{\mu}) / \phi$ = scaled deviance

Homework Exercise: Derive the deviance as a function of the estimated mean for the normal, Poisson and binomial distributions (SHOW WORK) (due next Tuesday)

Returning to the obesity and maternal smoking example
• We implemented maximum likelihood estimation in R using
    fit <- glm(Y ~ age + smoke + age*smoke, family=binomial, data=obese)
• As part of the output, we obtained:
    Null Deviance: 1580.905 on 3874 degrees of freedom
    Residual Deviance: 1574.663 on 3871 degrees of freedom
• residual deviance = scaled deviance (since $\phi = 1$ for binomial data) = difference in twice the log-likelihood between saturated and current model
• null deviance = residual or scaled deviance for null model

Example: For Gaussian data with identity link, the scaled deviance for the $p$ parameter model $M$ is

$$
D_M / \sigma^2 = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \hat{\mu}_i)^2 \sim \chi^2_{n-p}
$$

• In other cases, the scaled deviance may be approximately distributed as $\chi^2_{n-p}$.

• As a rule of thumb, the closer the distribution is to Gaussian & the closer the link is to the identity, the better the performance of the $\chi^2$ approximation.
• Approximation often does not improve as $n$ increases: # parameters under saturated model also increases & usual likelihood ratio approximation does not apply.

Generalized Pearson
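As a numerical check on the obesity example from the existence & uniqueness slides, the rank of the design matrix can be computed directly. The sketch below is an illustration rather than part of the original lecture: it rebuilds $y$ and $X$ from the slide in base R (the variable names `old`, `gene`, `X`, `rank_X` and `full_rank` are chosen here for convenience) and compares the rank reported by `qr()` with the number of columns.

```r
# Obesity example from the slides: 10 subjects, predictors (1, old, gene).
y    <- c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0)
old  <- c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0)   # only subject 6 is older than 35
gene <- c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0)   # only subject 6 carries the mutation

# Design matrix with an intercept column (the scalar 1 is recycled to length 10).
X <- cbind(intercept = 1, old = old, gene = gene)
stopifnot(length(y) == nrow(X))

# Rank from the QR decomposition versus the number of parameters p.
rank_X <- qr(X)$rank
p      <- ncol(X)

# The old and gene columns are identical, so the rank falls short of p.
full_rank <- (rank_X == p)
print(c(rank = rank_X, p = p))
print(full_rank)
```

Because the old and gene columns coincide, no fitting procedure can separate their coefficients for these data.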
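The deviances reported for the obesity and maternal smoking fit can be turned into a test of the fitted model against the null model. This is a sketch under the usual large-sample assumption that the difference in scaled deviances between nested models is approximately $\chi^2$, with degrees of freedom equal to the difference in residual degrees of freedom. The variable names are illustrative; only the four numbers come from the reported output.

```r
# Values copied from the glm() output on the slide.
null_dev  <- 1580.905; null_df  <- 3874
resid_dev <- 1574.663; resid_df <- 3871

# Drop in deviance and in degrees of freedom when age, smoke and their
# interaction are added to the intercept-only (null) model.
dev_diff <- null_dev - resid_dev
df_diff  <- null_df - resid_df

# Upper-tail chi-squared probability (assumed large-sample reference distribution).
p_value <- pchisq(dev_diff, df = df_diff, lower.tail = FALSE)

print(c(dev_diff = dev_diff, df_diff = df_diff, p_value = p_value))
```

The drop of 6.242 on 3 degrees of freedom gives a p-value of about 0.10. With the fitted object available, `anova(fit, test = "Chisq")` reports the same kind of comparison term by term.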

