Stanford STATS 191 - Logistic and Poisson Regression


Lecture 15: Logistic and Poisson Regression
Nancy R. Zhang
Statistics 191, Stanford University
March 10, 2008

Review - Binary responses model

Model: $Y \in \{0, 1\}$,

$$P(Y = 1 \mid X_1, \dots, X_p) = g^{-1}(\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p),$$

where

$$g(\pi) = \log\frac{\pi}{1 - \pi}.$$

The inverse $g^{-1}$ is

$$g^{-1}(z) = \frac{e^{z}}{1 + e^{z}}.$$

We have no choice but to accept non-constant variance,

$$\mathrm{Var}(Y) = \pi(X)\,[1 - \pi(X)].$$

Review - Model interpretation

An intuitive quantity to assess probabilities:

$$\text{odds} = \frac{P(Y = 1 \mid X)}{P(Y = 0 \mid X)}.$$

In the logistic regression model,

$$\log(\text{odds}) = \beta X.$$

The parameter $\beta$ is the increase (or decrease) in the log-odds for a unit increase in $X$; equivalently, the odds are multiplied by $e^{\beta}$. For example, if $X$ were binary as well,

$$\log\frac{\text{odds}(X = 1)}{\text{odds}(X = 0)} = \beta.$$

Logit Model for Multinomial Response

Suppose the response $Y$ belongs to one of $K$ categories.

1. Designate one category as the "base" category (here, category $K$).
2. For $k = 1, \dots, K - 1$,

$$P(Y = k \mid X) = \frac{e^{X\beta_k}}{1 + \sum_{l=1}^{K-1} e^{X\beta_l}}, \qquad \beta_k = (\beta_{k1}, \dots, \beta_{kp}),$$

$$P(Y = K \mid X) = \frac{1}{1 + \sum_{l=1}^{K-1} e^{X\beta_l}}.$$

3. There are $p \times (K - 1)$ parameters.
4. $\beta_{ki}$, for the $k$-th category and $i$-th predictor, is interpreted as the increase in the log-odds of category $k$ relative to the base category.

Equivalent definition:

$$\log\frac{\pi_k(X)}{\pi_K(X)} = \alpha_k + X\beta_k, \qquad k = 1, \dots, K - 1,$$

where $\pi_k(X) = P(Y = k \mid X)$.

Alligator Food Example

Study on the primary food choice of alligators.

1. Data: 219 alligators captured in four Florida lakes.
2. Response variable: primary food type, by volume, found in the alligator's stomach. 5 categories:
   1. fish
   2. invertebrate
   3. reptile
   4. bird
   5. other
3. Predictors:
   1. Lake of capture (Hancock, Oklawaha, Trafford, George)
   2. Gender (M, F)
   3. Size (≤ 2.3 m, > 2.3 m)
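As a minimal sketch of the baseline-category formulas above, the following Python computes $P(Y = k \mid X)$ for $k = 1, \dots, K$ from a set of coefficient vectors. The function name `baseline_logit_probs` and all coefficient values are hypothetical, chosen only for illustration; they are not estimates from the alligator data. With $K = 2$ the formula reduces to the binary inverse logit $g^{-1}$.

```python
import math

def baseline_logit_probs(x, betas):
    """Return [P(Y=1|x), ..., P(Y=K|x)], with category K as the base.

    x     : list of p predictor values
    betas : list of K-1 coefficient vectors, each of length p
    """
    # Linear predictors X beta_k for the K-1 non-base categories.
    etas = [sum(b_i * x_i for b_i, x_i in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    probs = [math.exp(eta) / denom for eta in etas]
    probs.append(1.0 / denom)  # base category K
    return probs

# Hypothetical coefficients: K = 3 categories, p = 2 predictors.
x = [1.0, 0.5]
betas = [[0.2, -1.0], [-0.4, 0.3]]
probs = baseline_logit_probs(x, betas)

# The log-odds of category 1 against the base recovers X beta_1.
log_odds_1 = math.log(probs[0] / probs[2])

# Binary case (K = 2): identical to the inverse logit e^z / (1 + e^z).
binary = baseline_logit_probs([2.0], [[0.7]])

print(probs, log_odds_1, binary)
```

The probabilities sum to one by construction, and taking the log of the ratio to the base category returns the linear predictor, which is the "equivalent definition" of the model.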
Alligator Food Choice Example

1. Do gender, size, or lake of capture influence food choice?
2. Are there interaction effects?
3. Obtain estimates of $P(\text{food choice} = \text{fish} \mid \text{Gender}, \text{Size}, \text{Lake})$.

Function for multinomial fitting in R: multinom in library nnet.

Fitted probabilities $\hat{p}_i$

If you had data on the actual size of each alligator (and not just the classification ≤ or > 2.3 m), then you could estimate a response curve like this:

[Figure: estimated response curve. From Agresti, Categorical Data Analysis.]

Review: 2 × 2 tables: Gender and After-life

          Y    N or U   Total
M       435      147     582
F       375      134     509
Total   810      281    1091

1. Poisson sampling assumption: row sums not fixed.
2. Model: $Y_{ij} \sim \text{Poisson}(\lambda_{ij})$.
3. $H_0$: rows and columns are independent, i.e. $\log \lambda_{ij} = \lambda + \alpha_i + \beta_j$. (Before I used $\lambda_{ij} = \lambda \alpha_i \beta_j$; here the parameters are written on the log scale, which saves writing a log on every term.)

This is the simplest type of contingency table.
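To make the independence hypothesis for the gender and after-life table concrete, here is a small Python sketch that computes the fitted means under $H_0$ and two common fit statistics. The Pearson $X^2$ and likelihood-ratio $G^2$ statistics are not named on the slide; they are an assumption about how one would typically assess this hypothesis, each compared against a chi-squared distribution with 1 degree of freedom for a 2 × 2 table.

```python
import math

# Observed counts from the 2 x 2 table (rows: M, F; columns: Y, N or U).
observed = [[435, 147],
            [375, 134]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Under H0 (independence) the fitted mean of cell (i, j) is
# (row total i) * (column total j) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pair each observed count with its fitted mean.
cells = [(o, e)
         for o_row, e_row in zip(observed, expected)
         for o, e in zip(o_row, e_row)]

# Pearson X^2 and likelihood-ratio G^2 statistics.
pearson_x2 = sum((o - e) ** 2 / e for o, e in cells)
deviance_g2 = 2.0 * sum(o * math.log(o / e) for o, e in cells)

print(expected)
print(pearson_x2, deviance_g2)
```

Both statistics come out small relative to a chi-squared distribution with 1 degree of freedom, so this table gives little evidence against independence.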
3-way tables: Alcohol, Cigarette, and Marijuana Use

A survey asked 2276 students in their final year of high school in a nonurban area near Dayton, Ohio whether they had ever used alcohol, cigarettes, or marijuana.

Alcohol Use   Cigarette Use   Marijuana Use: Yes   Marijuana Use: No
Yes           Yes                            911                 538
Yes           No                              44                 456
No            Yes                              3                  43
No            No                               2                 279

This is an example of a 2 × 2 × 2 contingency table. Shorthand: A = alcohol, C = cigarette, M = marijuana.

3-way tables: Types of Interaction

$Y_{ijk} \sim \text{Poisson}(\lambda_{ijk})$. Conditioned on the total $N$, $(Y_{ijk}) \sim \text{Multinom}(N, \pi_{ijk})$.

Let $\pi_{i++}$ be the probability of row $A = i$, $\pi_{ij+}$ the probability of $A = i, C = j$, etc.

1. A, C, and M mutually independent:

$$\log \lambda_{ijk} = \lambda + \lambda^{A}_{i} + \lambda^{C}_{j} + \lambda^{M}_{k}$$

$$\pi_{ijk} = \pi_{i++}\,\pi_{+j+}\,\pi_{++k}$$
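The mutual-independence model can be illustrated with a short Python sketch that plugs the observed one-way margins into $\pi_{ijk} = \pi_{i++}\pi_{+j+}\pi_{++k}$ to get fitted counts $N\hat{\pi}_{ijk}$ for the alcohol, cigarette, and marijuana table. The helper name `margin` and the 0/1 coding of "Yes"/"No" are illustrative choices, and the likelihood-ratio statistic $G^2$ is an added assumption about how one might summarize the fit; it does not appear on the slide.

```python
import math
from itertools import product

# Observed counts keyed by (a, c, m), with 0 = "Yes" and 1 = "No".
counts = {
    (0, 0, 0): 911, (0, 0, 1): 538,
    (0, 1, 0): 44,  (0, 1, 1): 456,
    (1, 0, 0): 3,   (1, 0, 1): 43,
    (1, 1, 0): 2,   (1, 1, 1): 279,
}
n = sum(counts.values())

def margin(axis, level):
    """One-way marginal total for the given variable and level."""
    return sum(v for key, v in counts.items() if key[axis] == level)

# Fitted counts under mutual independence:
# N * (n_i++ / N) * (n_+j+ / N) * (n_++k / N).
fitted = {
    (a, c, m): margin(0, a) * margin(1, c) * margin(2, m) / n ** 2
    for a, c, m in product(range(2), repeat=3)
}

# Likelihood-ratio statistic comparing observed and fitted counts.
g2 = 2.0 * sum(counts[k] * math.log(counts[k] / fitted[k]) for k in counts)

for key in sorted(counts):
    print(key, counts[key], round(fitted[key], 1))
print(g2)
```

The fitted counts differ sharply from the observed ones (for instance, far fewer students report using all three substances under independence than were observed), which is why models with interaction terms are worth considering.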

