# UW-Madison STAT 572 - Modeling non-normal data (61 pages)


## Modeling non-normal data


Lecture Notes

- Pages: 61
- School: University of Wisconsin, Madison
- Course: Stat 572 - Statistical Methods for Bioscience II


### Outline

1. Logistic regression: fitting the model
   - Components of generalized linear models
   - Logistic regression
   - Case study: runoff data
   - Case study: baby food
2. Logistic regression
   - Inference
   - Model fit and model diagnostics
   - Comparing models
   - Sparse data and the separation problem

### Modeling non-normal data

In all of the linear models we have seen so far, the response variable has been modeled with a normal distribution:

(response) = (fixed parameters) + (normal error)

For many data sets this model is inadequate.

- Ex: if the response variable is categorical with two possible responses, it makes no sense to model the outcome as normal.
- Ex: if the response is always a small positive integer, its distribution is also not well described by a normal distribution.

Generalized linear models (GLMs) are an extension of linear models to model non-normal response variables. Logistic regression is for binary response variables.

### The link function

Standard linear model:

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + e_i, \qquad e_i \sim N(0, \sigma^2)$$

The mean, or expected value, of the response is

$$\mathbb{E}(y_i) = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}$$

We will use the notation $\eta_i = \beta_1 x_{i1} + \dots + \beta_k x_{ik}$ to represent the linear combination of explanatory variables. In a standard linear model,

$$\mathbb{E}(y_i) = \eta_i$$

In a GLM there is a link function $g$ between $\eta$ and the mean of the response variable:

$$g(\mathbb{E}(y_i)) = \eta_i$$

For standard linear models the link function is the identity function, $g(y_i) = y_i$.

It can be easier to consider the inverse of the link function:

$$\mathbb{E}(y_i) = g^{-1}(\eta_i)$$

When the response variable is binary (with values coded as 0 or 1), the mean is simply $\mathbb{E}y = \mathbb{P}\{y = 1\}$. A useful function for this case is

$$\mathbb{E}y = \mathbb{P}\{y = 1\} = g^{-1}(\eta) = \frac{e^{\eta}}{1 + e^{\eta}}$$

$\eta$ can take any value, while the mean is always between 0 and 1. The corresponding link function is called the logit function:

$$g(p) = \log\left(\frac{p}{1 - p}\right) = \log\left(\frac{\mathbb{P}\{Y = 1\}}{\mathbb{P}\{Y = 0\}}\right)$$

It is the log of the odds. Regression under this model is called logistic regression.

### Deviance

In standard linear models we estimate the parameters by minimizing the sum of the squared residuals. This is equivalent to finding the parameters that maximize the likelihood. In a GLM we …
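As an illustration of the link function described above, here is a minimal Python sketch of the logit link and its inverse. It is not part of the lecture notes: the function names `logit`, `inv_logit`, and `linear_predictor`, and the coefficient and covariate values, are hypothetical and chosen only for the example.

```python
import math


def linear_predictor(beta, x):
    """eta_i = beta_1 * x_i1 + ... + beta_k * x_ik (no separate intercept term)."""
    return sum(b * xj for b, xj in zip(beta, x))


def inv_logit(eta):
    """Inverse link g^{-1}(eta) = e^eta / (1 + e^eta).

    The two branches are algebraically identical; splitting on the sign of
    eta avoids overflow in math.exp for large |eta|.
    """
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def logit(p):
    """Link g(p) = log(p / (1 - p)), the log of the odds; needs 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


# Hypothetical coefficients and one hypothetical observation (illustration only).
beta = [0.5, -1.2]
x_i = [2.0, 1.0]

eta_i = linear_predictor(beta, x_i)  # linear combination of explanatory variables
p_i = inv_logit(eta_i)               # modeled P{y_i = 1}

# Any real eta maps to a mean between 0 and 1, and logit undoes inv_logit.
for eta in (-5.0, -0.2, 0.0, 5.0):
    p = inv_logit(eta)
    print(f"eta = {eta:5.1f}  p = {p:.4f}  logit(p) = {logit(p):7.4f}")
```

In practice the coefficients would be estimated from data by a GLM fitting routine rather than set by hand; the sketch only shows how $\eta$ and the mean are related through the link.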
