# NCSU ST 762 - Introduction to nonlinear models (24 pages)

CHAPTER 2 — ST 762, M. Davidian

## 2.1 Introduction

In this chapter we will discuss the model that will be our central focus in Chapters 3–12. Through the course of our discussion we will identify different approaches to inference in the model, setting the stage for these future chapters.

SITUATION: Assume that we have independent pairs of observations $(Y_j, x_j)$, $j = 1, \ldots, n$. The $x_j$ may be fixed or random, as discussed in Chapter 1. We will assume that the pairs, and hence the random variables $Y_j$, are independent.

BASIC MODEL: Rather than state the model in the form (response) = (model) + (deviation) with a series of assumptions about the deviations, we will instead write the model in terms of what we are willing to say about the first two moments of the distribution of $Y_j$ given $x_j$. We will begin with a basic form of the model; as our discussion progresses, we will modify this basic form:

$$E(Y_j \mid x_j) = f(x_j, \beta), \qquad \operatorname{var}(Y_j \mid x_j) = \sigma_j^2. \tag{2.1}$$

In model (2.1), $f(x, \beta)$ is a real-valued function of the vector of covariates $x$ ($r \times 1$, say) and the vector of regression parameters $\beta$ ($p \times 1$). The dependence of $f$ on $\beta$ need not be in a linear fashion as in the models discussed in the examples of Chapter 1; $f$ may depend on some or all of the components of $\beta$ in a complicated, nonlinear way. Note that $r$ need not be equal to $p$, as in the examples in Chapter 1.

The assumption $\operatorname{var}(Y_j \mid x_j) = \sigma_j^2$ is left deliberately vague at this point. What is important right now is the idea that the values $\sigma_j^2$ are $j$-dependent: the variances of the conditional distributions of $Y$ values at different $x_j$ are not the same across $j$. The values $\sigma_j^2$ may be known constants, or, more generally, the expression allows the possibility that they may depend on $x_j$.

If we define $e_j = Y_j - E(Y_j \mid x_j) = Y_j - f(x_j, \beta)$, we do not necessarily assume that $e_j$ is independent of $x_j$, as in the classical assumptions. Given the way we have defined the model, we do have that $E(e_j \mid x_j) = 0$, which is similar to classical assumption 1. Thus, we do assume that the chosen model form $f(x_j, \beta)$ is a correct specification
of $E(Y_j \mid x_j)$. This may be interpreted as saying that the data analyst is well equipped to identify an appropriate model form. In the case where there is a theoretical basis for choosing a model, as in pharmacokinetics, this is certainly a reasonable assumption.

Note that we make no assumption about the distributions of the $(Y_j, x_j)$, or, more directly, the conditional distributions of $Y_j$ given $x_j$. Major themes will be the ability to develop inferential strategies that have nice properties without making such assumptions, and the robustness of inferential methods to violation of distributional assumptions that might be made.

## 2.2 Inferential approaches

Generally, as in the classical regression setup, the scientific objective may be stated in terms of questions about the value of the parameter $\beta$, or at least some of its elements. That is, questions of interest focus on the mean response as a function of $x_j$; e.g.,

- to obtain the most accurate characterization;
- to determine whether the model may be modified to exclude consideration of some components of $x_j$.

Thus, at least initially, when we speak of inference within the framework of our basic model (2.1), we interpret this to mean estimation of, and testing with respect to, the parameter $\beta$. We will see that other parameters may also be involved in carrying this out most effectively, and that, indeed, other parameters in modifications of (2.1) may also be of interest.

APPROACH 1: Except for the fact that $f$ is nonlinear in $\beta$, pretend that some of the other classical assumptions hold. In particular, whether we believe variance is constant or not, suppose we proceed as if it is, so that $\operatorname{var}(Y_j \mid x_j) = \sigma^2$, a constant. We might even adopt the assumption of normality of $Y$ given $x$; this clearly would be erroneous for binary data or data in the form of small counts, but might be a reasonable approximation for continuous responses. Under this perspective, a natural approach would then be ordinary least squares (OLS), that is, minimizing in $\beta$ the
sum of squared deviations

$$\sum_{j=1}^{n} \{Y_j - f(x_j, \beta)\}^2. \tag{2.2}$$

Just as in the linear case, this approach can be motivated in different ways.

- If we adopt the conditional normality assumption, maximum likelihood estimation of $\beta$ and $\sigma^2$ involves jointly maximizing the loglikelihood

$$\log L = -\frac{n}{2} \log 2\pi - \frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} \sum_{j=1}^{n} \{Y_j - f(x_j, \beta)\}^2. \tag{2.3}$$

  Maximization of this in $\beta$ is equivalent to minimizing (2.2).

- With or without the normality assumption, one may view minimizing (2.2) in $\beta$ as a sensible thing to do, as discussed in Chapter 1. The sum of squared deviations (2.2) may be viewed as a distance criterion that, in accordance with the assumption of constant variance, treats all $n$ observations as if they were of equal quality.

ASIDE: It is important to recognize that, in discussing maximum likelihood, we are implicitly conditioning on the $x_j$ when writing the likelihood. To appreciate this, suppose the $x_j$ ($r \times 1$) are random and themselves normally distributed with some mean $\mu$ and covariance matrix $\Sigma$. So, if we consider the $(Y_j, x_j)$ as independent draws from a distribution of possible $(Y, x)$ pairs, ideally the loglikelihood of the observed data, the pairs $(Y_j, x_j)$, $j = 1, \ldots, n$, would be

$$\log L^* = \log L - \frac{rn}{2} \log 2\pi - \frac{n}{2} \log |\Sigma| - \frac{1}{2} \sum_{j=1}^{n} (x_j - \mu)^T \Sigma^{-1} (x_j - \mu), \tag{2.4}$$

where $\log L$ is defined in (2.3) and is the logarithm of the product of individual normal densities for $Y_j$ given $x_j$. Note that, as the part of the loglikelihood due to the $x_j$ does not involve $\beta$, maximizing the full loglikelihood (2.4) in $\beta$ is the same as maximizing $\log L$ alone. This also shows that, in the context of regression modeling, where the distribution of $Y_j$ given $x_j$ is of central interest, the distribution of random covariates is not directly relevant. A word of warning, however: this observation applies only if the $x_j$ are observed without error and are not missing. In these more complex cases, which are beyond our scope here, the distribution of the $x_j$ values does enter into the picture, complicating matters considerably.

Using the notation described in Section 2.4, minimizing (2.2) is equivalent to solving the $p$-dimensional estimating equation

$$\sum_{j=1}^{n} \{Y_j - f(x_j, \beta)\}\, f_\beta(x_j, \beta) = 0, \tag{2.5}$$

where $f_\beta(x_j, \beta)$ is the $p \times 1$ vector whose elements are the partial derivatives of $f$ with respect to each component of $\beta$. That is, minimizing (2.2), or, for that matter, maximizing (2.3), is equivalent to solving a set of equations in $\beta$ that is linear in the data $Y_j$. Thus, whether we adopt normality or we …
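The ideas above can be sketched numerically. The following is a minimal illustration, not part of the original notes: the Michaelis–Menten-type mean function $f(x, \beta) = \beta_1 x / (\beta_2 + x)$, the true parameter values, and the variance model (standard deviation proportional to the mean) are all hypothetical choices made for the example. It simulates data from model (2.1) with nonconstant variance, computes the OLS estimate by minimizing (2.2), checks that joint maximization of the normal loglikelihood (2.3) returns the same $\beta$, and verifies that the OLS estimate solves the estimating equation (2.5).

```python
# Sketch of model (2.1), OLS via (2.2), ML via (2.3), and the estimating
# equation (2.5). Mean function, parameter values, and variance model are
# illustrative assumptions, not from the notes.
import numpy as np
from scipy.optimize import least_squares, minimize

rng = np.random.default_rng(0)

def f(x, beta):
    """Mean response E(Y_j | x_j) = f(x_j, beta)."""
    b1, b2 = beta
    return b1 * x / (b2 + x)

def f_beta(x, beta):
    """p x n array of partial derivatives of f with respect to beta."""
    b1, b2 = beta
    return np.stack([x / (b2 + x), -b1 * x / (b2 + x) ** 2])

# Simulate independent pairs (Y_j, x_j) with nonconstant variance:
# sd proportional to the mean, so sigma_j^2 depends on x_j as in (2.1).
n = 200
beta_true = np.array([10.0, 2.0])
x = rng.uniform(0.5, 10.0, size=n)
y = f(x, beta_true) + 0.1 * f(x, beta_true) * rng.standard_normal(n)

# OLS: minimize the sum of squared deviations (2.2) in beta.
ols = least_squares(lambda b: y - f(x, b), x0=np.array([5.0, 1.0]))
beta_ols = ols.x

# ML under (possibly incorrect) constant variance and normality: jointly
# maximize (2.3) in (beta, sigma^2); parameterize by log sigma^2 so that
# the variance stays positive.
def negloglik(theta):
    b, s2 = theta[:2], np.exp(theta[2])
    r = y - f(x, b)
    return 0.5 * (n * np.log(2 * np.pi * s2) + r @ r / s2)

ml = minimize(negloglik, x0=np.array([5.0, 1.0, 0.0]),
              method="Nelder-Mead", options={"maxiter": 5000})
beta_ml = ml.x[:2]

# The OLS estimate solves the estimating equation (2.5):
# sum_j {Y_j - f(x_j, beta)} f_beta(x_j, beta) = 0.
score = f_beta(x, beta_ols) @ (y - f(x, beta_ols))

print("OLS:", beta_ols, "ML:", beta_ml, "score:", score)
```

Even though the working constant-variance assumption is wrong here, the OLS estimate still lands near the truth because the mean model is correctly specified; the ML $\beta$ agrees with the OLS $\beta$, as the equivalence of maximizing (2.3) and minimizing (2.2) predicts.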
