NCSU ST 762 - Detection and modeling of nonconstant variance

CHAPTER 7, ST 762, M. DAVIDIAN

7 Detection and modeling of nonconstant variance

7.1 Introduction

So far, we have focused on approaches to inference in mean-variance models of the form

$$E(Y_j \mid x_j) = f(x_j, \beta), \qquad \mathrm{var}(Y_j \mid x_j) = \sigma^2 g^2(\beta, \theta, x_j) \tag{7.1}$$

under the assumption that we have already specified such a model.

• Often, a model for the mean may be suggested by the nature of the response (e.g., binary or count), by subject-matter theoretical considerations (e.g., pharmacokinetics), or by the empirical evidence (e.g., models for assay response).

• A model for variance may or may not be suggested by these features. When the response is binary, the form of the variance is indeed dictated by the Bernoulli distribution, while for data in the form of counts or proportions, for which the Poisson or binomial distributions may be appropriate, the form of the variance is again suggested. One may wish to consider the possibility of over- or underdispersion in these situations; this may reasonably be carried out by fitting a model that accommodates these features and determining if an improvement in fit is apparent using methods for inference on variance parameters we will discuss in Chapter 12.

Alternatively, when the response is continuous (or approximately continuous), it is often the situation that there is not necessarily an obvious relevant distributional model. As we have discussed in some of the examples we have considered, several sources of variation may combine to produce patterns that are not well described by the kinds of variance models dictated by popular distributional assumptions such as the gamma or lognormal distributions.

In fact, it may be unclear whether heterogeneity of variance is even an issue at all.
In some applications, it is expected, and popular models may be available; in others, whether or not variance changes with the mean or covariate values may need to be deduced from the data.

• In these situations, methods are required for detecting nonconstant variance, determining whether or not it changes smoothly across the range of the response or covariates, and identifying an appropriate model to characterize the change.

To address these issues, both formal and informal approaches have been proposed:

• Graphical techniques. Both for detection and modeling, these often have a subjective flavor. In this chapter, we will focus on these procedures.

• Formal hypothesis testing. Formal procedures are mainly used for detection. We will defer discussion of these until after we have covered the large-sample theoretical developments on which they are based. Because of the complexity of (7.1), no finite-sample, “exact” methods are available in general.

COMMON THEME: Most graphical approaches are based on the OLS residuals

$$r_j = Y_j - f(x_j, \hat{\beta}_{OLS})$$

and functions thereof, or on related constructs. Our main focus will be on detection and modeling in situations where the response is continuous (or nearly continuous, such as in the case of moderate-to-large counts).

A complementary treatment of some of the approaches we will discuss may be found in Carroll and Ruppert (1988, Sections 2.7 and 2.8). Note that, in what follows, distributional statements are conditional on the $x_j$.

7.2 Plots based on residuals

We begin by first reviewing the basic rationale for the use of residuals as a tool for detecting nonconstant variance in regression.

The “usual” residual plots described in a first course in linear regression analysis apply equally well in the nonlinear model situation.
Specifically, one usually plots the $r_j$ or the “standardized” residuals $r_j/\hat{\sigma}_{OLS}$, where

$$\hat{\sigma}^2_{OLS} = (n - p)^{-1} \sum_{j=1}^{n} r_j^2,$$

versus one or more of the following:

• Predicted values $\hat{Y}_j = f(x_j, \hat{\beta}_{OLS})$

• Covariates (elements of $x_j$)

• $\log \hat{Y}_j$ in cases where many responses tend to be clustered in a very narrow range, in order to “stretch things out” so that any patterns might be more readily discernible. We will see the value of this for some nonlinear models and designs later.

If the plot(s) exhibit an apparent pattern, with the magnitude of residuals changing with level of predicted value or covariate, this is taken as evidence of potential nonconstant variance. In particular, for the plot of residuals vs. predicted values or their logarithms, a “fan-shape” is accepted as evidence that variance increases smoothly with the level of the response (mean). More generally, any “nonhaphazard,” “systematic” pattern may well be evidence that variance does not remain constant over the range of the response.

One must be careful, however.

• A systematic pattern may also be the result of an ill-fitting mean model. The nature of the pattern must be critically assessed by the data analyst to determine a reasonable explanation for it given the particular mean model and circumstances. For example, for the indomethacin pharmacokinetic data in Examples 1.1 and 1.2, the model was the sum of two exponential terms. If a simple model containing only a single exponential term were fitted to these data, one would expect to see a systematic pattern in the residuals reflecting the lack of fit of this model.
There is certainly subjectivity involved in this endeavor.

• When responses are collected in time order, e.g., repeated measurements on the same individuals, one often plots the residuals against time to look for temporal patterns that may suggest possible serial correlation. Alternatively, more sophisticated plots for investigating this are available.
We defer discussion of serial correlation until later chapters, as our current focus is on detecting and modeling nonconstant variance when the assumption of independence is reasonable. It is important to recognize, however, that this is an assumption that should be considered carefully in practice.

MOTIVATION: The obvious motivation for the usual plots is that $r_j$ is a “proxy” for the true deviation $Y_j - f(x_j, \beta)$.

• If the data are normally (or at least symmetrically) distributed with constant variance, we would expect the $r_j$ to be roughly symmetrically distributed about 0 and to have approximately constant variance.

• We would thus expect a “haphazard” pattern, with approximately equal numbers of positive and negative residuals with approximately the same magnitude across their entire range.

• Even if the variance were nonconstant, if the data were at least normally or symmetrically distributed, we
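
The quantities plotted in Section 7.2 can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the notes: the function names `standardized_residuals` and `plot_points` are hypothetical, the predicted values are assumed to have been obtained already from some OLS fit of the mean model, and the numbers in the usage lines are made up.

```python
import math


def standardized_residuals(y, fitted, p):
    """Return raw residuals r_j, standardized residuals r_j / sigma_hat, and sigma_hat.

    y      : observed responses Y_j
    fitted : predicted values f(x_j, beta_hat_OLS); assumed to be supplied by
             whatever OLS routine was used (they are not computed here)
    p      : number of parameters in the mean model
    """
    n = len(y)
    if len(fitted) != n:
        raise ValueError("y and fitted must have the same length")
    if n <= p:
        raise ValueError("need n > p to estimate sigma")
    # OLS residuals r_j = Y_j - f(x_j, beta_hat_OLS)
    r = [yj - fj for yj, fj in zip(y, fitted)]
    # sigma_hat^2 = (n - p)^{-1} * sum of squared residuals
    sigma_hat = math.sqrt(sum(rj * rj for rj in r) / (n - p))
    if sigma_hat == 0.0:
        raise ValueError("all residuals are zero; cannot standardize")
    r_std = [rj / sigma_hat for rj in r]
    return r, r_std, sigma_hat


def plot_points(fitted, r_std, log_scale=False):
    """Pair each standardized residual with its predicted value (or its log).

    With log_scale=True every predicted value must be positive.
    """
    if log_scale:
        return [(math.log(f), s) for f, s in zip(fitted, r_std)]
    return list(zip(fitted, r_std))


# Made-up numbers for illustration only (not data from the notes).
y = [1.0, 2.0, 4.0, 9.0]
fitted = [1.5, 2.5, 3.0, 8.0]
r, r_std, sigma_hat = standardized_residuals(y, fitted, p=2)
points = plot_points(fitted, r_std)                       # residuals vs. predicted values
log_points = plot_points(fitted, r_std, log_scale=True)   # residuals vs. log predicted values
```

The returned pairs can be passed to any plotting tool. Whether a pattern in the resulting plot reflects nonconstant variance or an ill-fitting mean model remains a judgment for the analyst.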

