NCSU ST 762 - Normal theory maximum likelihood and quadratic estimating equations

CHAPTER 5 ST 762, M. DAVIDIAN

5 Normal theory maximum likelihood and quadratic estimating equations

5.1 Introduction

We recall the general mean-variance model:

$$E(Y_j \mid x_j) = f(x_j, \beta), \qquad \mathrm{var}(Y_j \mid x_j) = \sigma^2 g^2(\beta, \theta, x_j). \tag{5.1}$$

In Chapter 2, we discussed (Approach 10) an alternative to GLS for inference for this model.

• In some applications, one may postulate a mean-variance relationship of the form (5.1) on the basis of features observed empirically in the data. It may also be observed that, at each $x_j$, the distribution of $Y_j$ values appears to be well approximated by a distribution such as the normal. This is the case in applications such as pharmacokinetics and assay analysis, for example.

• Note that there is no technical difficulty in considering the normal distribution for a given mean-variance model. The univariate normal distribution is characterized by its first two moments; moreover, these two moments need not have any particular relationship (unlike the scaled exponential family distributions, with the exception of the normal with constant variance). Thus, in the normal distribution, the mean and variance are not inextricably linked.

• Recall that one motivation for GLS estimation is that it is an attempt to emulate weighted least squares estimation for $\beta$ with known weights. One motivation for WLS is that it is maximum likelihood estimation in the case where the weights are known under the assumption that the distribution of $Y_j \mid x_j$ is normal.

• Carrying this idea forward, it seems reasonable to consider estimation of $\beta$ in the general model (5.1) under the assumption that the data are normally distributed for each $x_j$. As we noted earlier and will now show explicitly, this will not necessarily lead to GLS-type estimation.
In fact, it is straightforward to show that if $Y$ has a normal distribution with mean $\mu$ and variance $\sigma^2 g^2(\mu)$ for an arbitrary function $g^2$, this distribution is not a member of the scaled exponential family class. Thus, it is not clear that maximum likelihood under the assumption of normality and (5.1) need lead to GLS estimation.

• We will furthermore observe that the approach of assuming the data are normally distributed and estimating $\beta$ by maximum likelihood in (5.1) leads to solution of an estimating equation that is a special case of a general class of estimating equations that is broader than the class of linear GLS equations. Thus, considering the assumption of normality and maximum likelihood in (5.1) will lead us to methods that are competitors to GLS. For now, we just introduce this class. In later chapters, we will compare the two approaches formally.

In this chapter, we will continue to consider $\theta$ to be known. We will relax this in Chapter 6. We will use the unqualified abbreviation ML henceforth to refer to normal maximum likelihood for the model (5.1).

5.2 Normal theory ML estimating equation for β

For model (5.1) under the assumption of normality, we may write down the loglikelihood (ignoring constants and conditioning on $x_j$) as

$$-n \log \sigma - \sum_{j=1}^n \log g(\beta, \theta, x_j) - \frac{1}{2} \sum_{j=1}^n \frac{\{Y_j - f(x_j, \beta)\}^2}{\sigma^2 g^2(\beta, \theta, x_j)}. \tag{5.2}$$

Let

$$g_\beta(\beta, \theta, x_j) = \partial/\partial\beta \; g(\beta, \theta, x_j) \quad (p \times 1),$$

$$\nu_\beta(\beta, \theta, x_j) = \partial/\partial\beta \; \log g(\beta, \theta, x_j) = g_\beta(\beta, \theta, x_j)/g(\beta, \theta, x_j).$$

Differentiating (5.2) with respect to $\beta$ yields

$$-\sum_{j=1}^n \nu_\beta(\beta, \theta, x_j) + \frac{1}{2} \sum_{j=1}^n \frac{2\{Y_j - f(x_j, \beta)\}}{\sigma^2 g^2(\beta, \theta, x_j)} f_\beta(x_j, \beta) + \frac{1}{2} \sigma^{-2} \sum_{j=1}^n \frac{2\{Y_j - f(x_j, \beta)\}^2}{g^3(\beta, \theta, x_j)} g_\beta(\beta, \theta, x_j),$$

which, using $g_\beta/g = \nu_\beta$ in the last term, may be simplified to yield the estimating equation

$$\sigma^{-2} \sum_{j=1}^n g^{-2}(\beta, \theta, x_j) \{Y_j - f(x_j, \beta)\} f_\beta(x_j, \beta) + \sum_{j=1}^n \left[ \frac{\{Y_j - f(x_j, \beta)\}^2}{\sigma^2 g^2(\beta, \theta, x_j)} - 1 \right] \nu_\beta(\beta, \theta, x_j) = 0.$$

Multiplying both sides by $\sigma^2 > 0$ gives the final form

$$\sum_{j=1}^n g^{-2}(\beta, \theta, x_j) \{Y_j - f(x_j, \beta)\} f_\beta(x_j, \beta) + \sigma^2 \sum_{j=1}^n \left[ \frac{\{Y_j - f(x_j, \beta)\}^2}{\sigma^2 g^2(\beta, \theta, x_j)} - 1 \right] \nu_\beta(\beta, \theta, x_j) = 0. \tag{5.3}$$

REMARKS:

• From (5.3), the ML estimator solves an equation that has the form of the one we solve in the GLS approach plus an additional term.

• The additional term arises because of the dependence of $g$ on $\beta$.

• Note that solving this equation is complicated by the fact that $\sigma^2$ is no longer a multiplicative scale factor that may be eliminated. Clearly, solving this equation for $\beta$ must be carried out jointly with solving the equation for $\sigma^2$ obtained by differentiating the loglikelihood with respect to $\sigma^2$ (thus jointly maximizing the loglikelihood). We will discuss this shortly.

• Contrast this with the estimating equation for maximum likelihood under the scaled exponential family. As we saw in Chapter 4, in that case, differentiation of the loglikelihood leads to an equation that has the form of just the first term in (5.3), with no additional term as we have here. Heuristically, in the scaled exponential family, the mean and variance must be related in a very specific way, so that only certain $g$ functions arise.

In the normal distribution, mean and variance may be related in any way (or no way at all); almost any function $g$ that makes sense may be considered, and the dependence on $\beta$ need not even be through the mean. Because of the special relationship of mean and variance in the scaled exponential family class, all of the information about $\beta$ is “in the mean;” however, in the normal model above, because there is no special relationship, $g$ contains additional information about $\beta$. The second term in (5.3) may be thought of as reflecting this additional information.

• Note that if $g$ does not depend on $\beta$, then $\nu_\beta(\beta, \theta, x_j) \equiv 0$, and (5.3) reduces to the usual GLS equation. In this case, there is no additional information about $\beta$ in $g$.
In fact, if $\theta$ is known in this case, the “weights” $w_j^{-2} = g^{-2}(\beta, \theta, x_j)$ are known (a known function of $x_j$) and (5.3) is then just the usual WLS estimating equation (with known weights), as we have shown earlier.

IMPLICATIONS:

• If the data really are normally distributed, and the mean and variance models in (5.1) are correctly specified, then standard large sample theory implies that the estimator for $\beta$ solving (5.3) is “optimal” in the sense that it is most efficient (i.e., most precise).

Thus, if $g$ is such that it depends on $\beta$, so that the second term of (5.3) is non-zero, then solving the GLS equation and solving the ML equation (5.3) are two different methods of estimating $\beta$.

Immediately, then, if the mean-variance model is correct and the assumption of normality is valid, GLS
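
The algebra leading from (5.2) to (5.3) can be checked numerically. The following is a minimal sketch, not part of the notes: it assumes a hypothetical exponential mean $f(x, \beta) = \beta_1 e^{-\beta_2 x}$ with a power-of-the-mean variance function $g = f^\theta$, simulated data, and NumPy. All function names are illustrative. It compares the left-hand side of (5.3) with $\sigma^2$ times a finite-difference gradient of (5.2), and confirms that with $\theta = 0$ (so that $g$ does not depend on $\beta$) only the GLS-type first term remains.

```python
import numpy as np

# Hypothetical example model (an assumption, not taken from the notes):
# mean f(x, beta) = beta[0] * exp(-beta[1] * x), variance function g = f**theta.
def f(x, beta):
    return beta[0] * np.exp(-beta[1] * x)

def f_beta(x, beta):
    # n x p matrix of partial derivatives of f with respect to beta
    e = np.exp(-beta[1] * x)
    return np.column_stack([e, -beta[0] * x * e])

def g(x, beta, theta):
    return f(x, beta) ** theta

def nu_beta(x, beta, theta):
    # d/dbeta log g = theta * f_beta / f when g = f**theta
    return theta * f_beta(x, beta) / f(x, beta)[:, None]

def loglik(beta, sigma, theta, x, y):
    # Normal loglikelihood (5.2), constants ignored
    gj = g(x, beta, theta)
    r = y - f(x, beta)
    return (-len(y) * np.log(sigma) - np.sum(np.log(gj))
            - 0.5 * np.sum(r**2 / (sigma**2 * gj**2)))

def ml_equation(beta, sigma, theta, x, y):
    # Left-hand side of (5.3): GLS-type term plus the additional quadratic term
    gj = g(x, beta, theta)
    r = y - f(x, beta)
    gls_term = f_beta(x, beta).T @ (r / gj**2)
    extra_term = sigma**2 * (nu_beta(x, beta, theta).T
                             @ (r**2 / (sigma**2 * gj**2) - 1.0))
    return gls_term + extra_term

def numerical_score(beta, sigma, theta, x, y, h=1e-6):
    # Central-difference gradient of (5.2) with respect to beta
    grad = np.zeros_like(beta)
    for k in range(len(beta)):
        step = np.zeros_like(beta)
        step[k] = h
        grad[k] = (loglik(beta + step, sigma, theta, x, y)
                   - loglik(beta - step, sigma, theta, x, y)) / (2 * h)
    return grad

# Simulated data under the assumed model
rng = np.random.default_rng(0)
x = np.linspace(0.1, 5.0, 50)
beta_true = np.array([10.0, 0.5])
sigma, theta = 0.1, 1.0
y = f(x, beta_true) + sigma * g(x, beta_true, theta) * rng.normal(size=x.size)

# Evaluate away from the truth so that the score is not close to zero
beta_eval = np.array([9.0, 0.6])
lhs = ml_equation(beta_eval, sigma, theta, x, y)
num = sigma**2 * numerical_score(beta_eval, sigma, theta, x, y)
print("max abs difference:", np.max(np.abs(lhs - num)))

# With theta = 0, g is identically 1 and only the first term of (5.3) remains
gls_only = f_beta(x, beta_eval).T @ (y - f(x, beta_eval))
print("reduces to first term:",
      np.allclose(ml_equation(beta_eval, sigma, 0.0, x, y), gls_only))
```

The check is carried out at a fixed value of $\sigma$; it does not solve (5.3), which, as remarked above, would have to be done jointly with the equation for $\sigma^2$.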

