Stat 544, Lecture 15

Poisson Regression

Last time we began to discuss Poisson regression, which assumes

    y_i ~ Poisson(μ_i),    i = 1, ..., N.

The canonical link is the log,

    log μ_i = η_i = x_i^T β.

A Poisson model with a log link is commonly called a loglinear model. Under the Poisson model, the variance function is V(μ) = μ. When overdispersion is evident, it is usually handled by a scale parameter, Var(y_i) = φ μ_i, which makes it a quasilikelihood model. To fit this model, we iteratively solve

    β = (X^T W X)^{-1} X^T W z,

where W = Diag(μ_i) is the matrix of weights and z_i = η_i + (y_i − μ_i)/μ_i.

The estimated covariance matrix for β̂ is (X^T W X)^{-1}, the Pearson residuals are

    r_i = (y_i − μ_i) / sqrt(μ_i),

the Pearson goodness-of-fit statistic is

    X² = Σ_{i=1}^N r_i²,

and the deviance is

    G² = 2 Σ_{i: y_i > 0} [ y_i log(y_i/μ_i) − (y_i − μ_i) ]  +  2 Σ_{i: y_i = 0} μ_i.

These fit statistics can be reliably compared to χ²_{N−p} if no more than 20% of the estimated μ_i's are less than 5.0 and none are less than 1.0.

Computations. Below is an R function that fits Poisson regression with a log link, and a function for printing out the results.
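The fit statistics above can be computed directly once fitted means are in hand. A minimal sketch (the counts y and fitted means mu below are made up purely for illustration):

```r
# Hypothetical observed counts and fitted means (illustration only)
y  <- c(3, 5, 0, 9, 10)
mu <- c(4, 6, 2, 8, 9)

# Pearson residuals and goodness-of-fit statistic X^2
r  <- (y - mu) / sqrt(mu)
X2 <- sum(r^2)

# Deviance G^2; a cell with y = 0 contributes mu
# (the limit of y*log(y/mu) - (y - mu) as y -> 0)
dev <- ifelse(y == 0, mu, y * log(y / mu) - (y - mu))
G2  <- 2 * sum(dev)
```

The `ifelse()` handles the y = 0 cells separately, matching the second sum in the deviance formula above.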
These are simple revisions of the functions that we used for logistic regression. Starting values are obtained from an OLS regression of log(y_i + 1/2) on x_i.

#######################################################################
poisson.regression <- function(x, y, maxits=20, eps=1e-10, beta.start){
   # Fit a Poisson model with a log link
   if(missing(beta.start)){
      ystar <- log(y + .5)
      newbeta <- lsfit(x, ystar, intercept=FALSE)$coef
   }
   else{
      newbeta <- beta.start
   }
   if(is.vector(x)) x <- matrix(x, ncol=1)
   iter <- 0
   converged <- FALSE
   while((!converged) & (iter < maxits)){
      iter <- iter + 1
      cat(iter); cat("...")
      beta <- newbeta
      eta <- as.vector(x %*% beta)
      mu <- exp(eta)
      z <- eta + (y - mu)/mu
      weights <- mu
      tmp <- lsfit(x, z, wt=weights, intercept=FALSE)
      newbeta <- tmp$coef
      converged <- all(abs(newbeta - beta) < eps)
   }
   cat("\n")
   # final fitted values (use the last iterate)
   eta <- as.vector(x %*% newbeta)
   mu <- exp(eta)
   loglik <- sum(y*log(mu) - mu)
   xtwxinv <- ls.diag(tmp)$cov.unscaled
   # compute X2, Pearson residuals, influence
   r <- (y - mu)/sqrt(mu)
   X2 <- sum(r^2)
   xstar <- diag(sqrt(mu)) %*% x
   hatstar <- xstar %*% solve(t(xstar) %*% xstar) %*% t(xstar)
   leverage <- diag(hatstar)
   # compute deviance contributions and G2
   deviance <- y*log(y/mu) - (y - mu)
   if(any(y==0)) deviance[y==0] <- mu[y==0]
   G2 <- 2*sum(deviance)
   result <- list(beta=newbeta, cov.beta=xtwxinv, iter=iter,
      converged=converged, loglik=loglik, residuals=r,
      X2=X2, G2=G2, df=nrow(x)-ncol(x), leverage=leverage,
      fitted=mu, eta=eta, z=z)
   result
}
#######################################################################
poisson.print <- function(result, scale=1.0){
   if(result$converged){
      cat(paste("The Newton-Raphson algorithm converged in",
         format(result$iter), "iterations."))
   }
   else{
      cat(paste("Algorithm failed to converge by",
         format(result$iter), "iterations"))
   }
   cat("\n\n")
   coef <- result$beta
   SE <- sqrt(scale*diag(result$cov.beta))
   tstat <- coef/SE
   pval <- 2*pnorm(-abs(tstat))
   coef.table <- cbind(coef, SE, round(tstat,2), round(pval,3))
   dimnames(coef.table) <-
      list(names(result$beta), c("coef","SE","coef/SE","pval"))
   print(coef.table)
   cat("\n")
   cat(paste("Loglikelihood =", result$loglik))
   cat("\n")
   cat(paste("Pearson's X^2 =", result$X2))
   cat("\n")
   cat(paste(" Deviance G^2 =", result$G2))
   cat("\n")
   cat(paste("           df =", format(result$df)))
   cat("\n")
   if(scale != 1.0){
      cat(paste("  Scaled X^2 =", result$X2/scale))
      cat("\n")
      cat(paste("  Scaled G^2 =", result$G2/scale))
      cat("\n")
   }
   tmp <- sum(result$fitted < 5)
   # use length(result$fitted); the original referred to y, which is
   # not in this function's scope
   pct <- round(100*tmp/length(result$fitted), 2)
   cat(paste(format(pct),
      "% of observations have expected counts below 5.0", sep=""))
   cat("\n")
   tmp <- min(result$fitted)
   cat(paste("The minimum expected cell count is ", format(tmp), sep=""))
   cat("\n")
   invisible()
}
#######################################################################

Example: The data below come from the famous 19th-century study by Vladislav Bortkiewicz of fatalities in the Prussian army due to horse kicks. The table shows the total number of deaths across fourteen army corps from 1875 to 1894:

Year    75 76 77 78 79 80 81 82 83 84
Deaths   3  5  7  9 10 18  6 14 11  9

Year    85 86 87 88 89 90 91 92 93 94
Deaths   5 11 15  6 11 17 12 15  8  4

This example appears in many textbooks to illustrate the Poisson distribution. Sometimes four of the corps are removed. Sometimes the data are disaggregated by corps, and a test is then done to see if the counts from the 200 corps-years follow a simple Poisson distribution. For illustration, we will examine the data by year to look for a trend over time.

First we enter the data and plot log(y_i) versus year:

> year <- 75:94
> y <- c(3,5,7,9,10,18,6,14,11,9,5,11,15,6,11,17,12,15,8,4)
> plot(year, log(y))

[Figure: scatterplot of log(y) versus year]

Apply loess() to look for trends:

> tmp <- loess(log(y) ~ year)
> plot(year, log(y))
> lines(year, predict(tmp))

[Figure: scatterplot of log(y) versus year with loess smooth]

If we are going to believe that there is a trend over time, we may need a fourth-degree polynomial to capture it.
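As an aside, the fits from the hand-rolled function can be cross-checked against R's built-in glm() with family = poisson, which fits the same loglinear model. A sketch for the intercept-only model, assuming the y vector entered above (this check is not part of the notes' code):

```r
# Cross-check with R's built-in Poisson fitter
y   <- c(3,5,7,9,10,18,6,14,11,9,5,11,15,6,11,17,12,15,8,4)
fit <- glm(y ~ 1, family = poisson)
coef(fit)        # intercept = log(mean(y)) = log(9.8)
exp(coef(fit))   # fitted mean 9.8, the same for every year
```

For the intercept-only Poisson model, the MLE of the mean is simply the sample mean, so exp() of the fitted intercept should equal mean(y).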
Let's see how the null model fits:

> x <- rep(1, length(y))
> result <- poisson.regression(x, y)
1...2...3...4...
> poisson.print(result)
The Newton-Raphson algorithm converged in 4 iterations.

       coef         SE coef/SE pval
X1 2.282382 0.07142857   31.95    0

Loglikelihood = 251.346947592599
Pearson's X^2 = 37.4693877550946
 Deviance G^2 = 38.502814220699
           df = 19
0% of observations have expected counts below 5.0
The minimum expected cell count is 9.8
> 1 - pchisq(result$X2, result$df)
[1] 0.00692793

The model doesn't fit. Possible reasons are

1. the time trend is real,
2. other omitted covariates (e.g. indicators for army corps), or
3. overdispersion.

With our present information, we cannot distinguish (2) from (3). But we can look for a time trend.

Let's include effects for time, up to the fourth power. If we simply include year, year^2, year^3, and year^4 in the model, these terms will be highly collinear, possibly causing numerical problems. An easy way to solve that problem is to (at least approximately) recenter the year at zero before raising it to higher powers; the overall fit will be the same, but the terms will then be (at least approximately) uncorrelated.

> time <- year - 85   # center it at 1885 to reduce collinearity
> x <- cbind(int=1, t=time, t2=time^2, t3=time^3, t4=time^4)
> result <- poisson.regression(x, y)
1...2...3...4...5...
> poisson.print(result)
The Newton-Raphson algorithm converged in 5 iterations.

            coef
int 2.2195221444
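An alternative to manual centering is R's poly(), which constructs orthogonal polynomial contrasts; the overall fit is the same as with raw powers, but the columns are exactly (not just approximately) orthogonal. A sketch using glm() rather than the notes' function (this is an aside, not part of the notes' analysis):

```r
# Orthogonal polynomial terms in year; columns of poly() are orthonormal
year <- 75:94
y    <- c(3,5,7,9,10,18,6,14,11,9,5,11,15,6,11,17,12,15,8,4)
fit4 <- glm(y ~ poly(year, degree = 4), family = poisson)
round(crossprod(poly(year, degree = 4)), 8)  # identity: no collinearity
```

Because the columns are orthonormal, the coefficient estimates for the polynomial terms are numerically stable even for high degrees.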