UW-Madison STAT 572 - Poisson Regression

Poisson Regression
Bret Larget
Departments of Botany and of Statistics
University of Wisconsin-Madison
May 1, 2007
Statistics 572 (Spring 2007)

Introduction: Poisson Regression

Poisson regression is a form of generalized linear model in which the response variable is modeled as having a Poisson distribution.
The Poisson distribution models random variables with non-negative integer values.
For large means, the Poisson distribution is well approximated by the normal distribution.
In biological applications, the Poisson distribution can be useful for variables that are often small integers, including zero.
The Poisson distribution is often used to model rare events.

The Poisson Distribution

The Poisson distribution arises in many biological contexts.
Examples of random variables for which a Poisson distribution might be reasonable include:
  the number of bacterial colonies in a Petri dish;
  the number of trees in an area of land;
  the number of offspring an individual has;
  the number of nucleotide base substitutions in a gene over a period of time.

Probability Mass Function

The probability mass function of the Poisson distribution with mean $\mu$ is

$P\{Y = k \mid \mu\} = \frac{e^{-\mu} \mu^{k}}{k!}$ for $k = 0, 1, 2, \ldots$

The Poisson distribution is discrete, like the binomial distribution, but has only a single parameter $\mu$ that is both the mean and the variance.
In R, you can compute Poisson probabilities with the function dpois.
For example, if $\mu = 10$, we can find $P\{Y = 12\} = \frac{e^{-10} 10^{12}}{12!}$ with the command

> dpois(12, 10)
[1] 0.09478033

Poisson approximation to the Binomial

One way that the Poisson distribution can arise is as an approximation to the binomial distribution when p is small.
The approximation is quite good for large enough n.
If p is small, then the binomial probability of exactly k successes is approximately the same as the Poisson probability of k with $\mu = np$.
Here is an example with p = 0.01 and n = 10.

> dbinom(0:4, 10, 0.01)
[1] 9.043821e-01 9.135172e-02 4.152351e-03 1.118478e-04 1.977108e-06
> dpois(0:4, 10 * 0.01)
[1] 9.048374e-01 9.048374e-02 4.524187e-03 1.508062e-04 3.770156e-06

This approximation is most useful when n is large, so that the binomial coefficients are very large.

The Poisson Process

The Poisson process arises naturally under assumptions that are often reasonable.
For the following, think of points as being exact times or locations.
The assumptions are:
  The chance of two simultaneous points is negligible.
  The expected value of the random number of points in a region is proportional to the size of the region.
  The random numbers of points in non-overlapping regions are independent.
Under these assumptions, the random variable that counts the number of points has a Poisson distribution.
If the expected rate of points is $\lambda$ points per unit length (area), then the number of points in an interval (region) of size t has a Poisson distribution with mean $\mu = \lambda t$.

Example

Suppose that we assume that at a location, a particular species of plant is distributed according to a Poisson process with expected density 0.2 individuals per square meter.
In a nine square meter quadrat, what is the probability of no individuals?
Solution: The number of individuals has a Poisson distribution with mean $\mu = 9 \times 0.2 = 1.8$. The probability of no individuals is

$P\{Y = 0 \mid \mu = 1.8\} = \frac{e^{-1.8} (1.8)^{0}}{0!} \approx 0.165299$

In R, we can compute this as

> dpois(0, 1.8)
[1] 0.1652989

Poisson Regression

Poisson regression is a natural choice when the response variable is a small non-negative integer count.
The explanatory variables model the mean of the response variable.
Since the mean must be positive but the linear combination $\eta = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ can take on any value, we need to use a link function for the parameter $\mu$.
The standard link function is the natural logarithm:

$\log(\mu) = \eta = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$

so that

$\mu = \exp(\eta)$

Aberrant Crypt Foci Example

Aberrant crypt foci (ACF) are abnormal collections of tube-like structures that are precursors to tumors.
In an experiment, researchers exposed 22 rats to a carcinogen and then counted the number of ACFs in the rat colons.
There were three treatment groups based on time since first exposure to the carcinogen: 6, 12, or 18 weeks.
The data are in the ACF1 data set of the DAAG package, with variables count and endtime.

> library(DAAG)
> str(ACF1)
'data.frame':   22 obs. of  2 variables:
 $ count  : num  1 3 5 1 2 1 1 3 1 2 ...
 $ endtime: num  6 6 6 6 6 6 6 12 12 12 ...

Plot of Data

> attach(ACF1)
> plot(count ~ endtime, pch = 16)

[Figure: scatterplot of count (vertical axis, 0 to 10) against endtime (horizontal axis, 6 to 18).]

Linear Predictor

> acf1.glm = glm(count ~ endtime, family = poisson)
> summary(acf1.glm)

Call:
glm(formula = count ~ endtime, family = poisson)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.46204  -0.47851  -0.07943   0.38159   2.26332

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.32152    0.40046  -0.803    0.422
endtime      0.11920    0.02642   4.511 6.44e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 51.105  on 21  degrees of freedom
Residual deviance: 28.369  on 20  degrees of freedom
AIC: 92.21

Number of Fisher Scoring iterations: 5

Quadratic Predictor

> acf2.glm = glm(count ~ endtime + I(endtime^2), family = poisson)
> summary(acf2.glm)

Call:
glm(formula = count ~ endtime + I(endtime^2), family = poisson)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.0616  -0.7834  -0.2808   0.4510   2.1693

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.722364   1.092494   1.577    0.115
endtime
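
To make the log link concrete, here is a minimal sketch that plugs the estimates reported for the linear-predictor fit (intercept -0.32152, endtime slope 0.11920) into µ = exp(η) for the three exposure times. The helper name fitted_mean and the variable names are illustrative assumptions, not part of the original notes, and the coefficients are typed in by hand so that the snippet runs without the DAAG package.

```r
# Estimates copied from the summary(acf1.glm) output for count ~ endtime.
b0 <- -0.32152   # (Intercept)
b1 <-  0.11920   # endtime

# Hypothetical helper (not in the original notes): inverts the log link,
# mapping the linear predictor eta = b0 + b1 * endtime to the mean mu.
fitted_mean <- function(endtime) exp(b0 + b1 * endtime)

weeks  <- c(6, 12, 18)        # the three treatment groups
eta    <- b0 + b1 * weeks     # linear predictor on the log scale
mu_hat <- fitted_mean(weeks)  # expected ACF count in each group

print(data.frame(endtime = weeks, eta = eta, mu = mu_hat))

# Under a log link, each additional week multiplies the expected
# count by exp(b1), whatever the starting time.
rate_ratio <- exp(b1)
print(rate_ratio)
```

With the fitted object available, predict(acf1.glm, newdata = data.frame(endtime = c(6, 12, 18)), type = "response") should return the same means up to the rounding of the printed coefficients.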

