3 Implementation of generalized least squares

We have indicated that, for the general model

    E(Y_j | x_j) = f(x_j, β),    var(Y_j | x_j) = σ² g²(β, θ, x_j),    (3.1)

a popular method for estimating β in the mean specification is generalized least squares (GLS). We motivated the approach from the standpoint of solving an estimating equation of the form

    Σ_{j=1}^n w_j {Y_j − f(x_j, β)} f_β(x_j, β) = 0,

where the "weights" w_j are replaced by estimates. The weighting takes into account the differing precision of each response j, giving this approach an omnibus appeal. In fact, as we will see, the GLS approach corresponds to maximum likelihood estimation when the Y_j have distributions in a certain class.

Before we tackle these issues, it is worthwhile to discuss how this very popular approach may be implemented in practice. This will serve both to reinforce its generality and to introduce us to the computational strategy used to solve very general sets of estimating equations that may not be solvable in closed form.

We will assume for now that θ is known in the sense discussed in Chapter 2, so that the focus will be on estimation of β and σ² only. However, we will continue to highlight the dependence of the variance function g on θ, as later we will consider adding estimation of θ to the model-fitting task.

3.1 GLS algorithm

The conceptual scheme we will call the GLS algorithm (in the case that θ is known) may be written more precisely as follows.

(i) Estimate β by β̂^(0), where β̂^(0) is some initial estimate, for example β̂_OLS solving Σ_{j=1}^n {Y_j − f(x_j, β)} f_β(x_j, β) = 0. Set k = 0.

(ii) Form weights ŵ_j = g^{−2}(β̂^(k), θ, x_j).

(iii) Re-estimate β by solving

    Σ_{j=1}^n ŵ_j {Y_j − f(x_j, β)} f_β(x_j, β) = 0

to obtain β̂^(k+1). Set k = k + 1 and return to (ii).

Continue through C iterations, and adopt the Cth iterate as the estimator.

• Intuitively, we might expect (hope) that if C were "large," successive iterates β̂^(k) would be more and more similar. If we could iterate "forever," we would hope that successive iterates would coincide, so that the algorithm could be said to have "converged." We will denote this as the case "C = ∞."

• If C = ∞, then the β value appearing in the "weights" and that in the rest of the equation must coincide. Thus, the case C = ∞ corresponds to the case where we are solving

    Σ_{j=1}^n g^{−2}(β, θ, x_j) {Y_j − f(x_j, β)} f_β(x_j, β) = 0    (3.2)

in β.

As we will see, solving (3.2) may in fact be implemented using an approach different from (and more direct than) the GLS algorithm given above. However, we will continue for now to think of the general approach conceptually in terms of the GLS algorithm in steps (i)–(iii), as this will prove convenient when we generalize to the case where θ is also taken to be unknown and estimated.

3.2 Implementing steps (i) and (iii)

Assuming β̂^(0) is β̂_OLS in step (i), both steps (i) and (iii) of the GLS algorithm require solution of a (p × 1) set of estimating equations of the form

    Σ_{j=1}^n w_j {Y_j − f(x_j, β)} f_β(x_j, β) = 0,    (3.3)

where the w_j are a set of fixed, known constants. In the case of OLS, w_j ≡ 1 for all j, of course; in step (iii), the w_j are the current estimated values from step (ii), which are held fixed in (iii). In general, solving (3.3) with a set of fixed, known weights w_j, j = 1, ..., n, corresponds to the method of weighted least squares (WLS).

Thus, implementation of the GLS algorithm requires the ability to solve estimating equations of the "WLS" form.
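To fix ideas, the following is a minimal Python sketch of the GLS algorithm in steps (i)–(iii). The mean function f, its gradient f_beta, the variance function g, and the starting value are illustrative stand-ins rather than anything specific to these notes, and the inner WLS solve is simply handed to a generic least-squares optimizer for now.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical mean function f(x, beta) = beta_0 * exp(-beta_1 * x) and its
# (n x p) gradient matrix; any smooth nonlinear f could be substituted.
def f(x, beta):
    return beta[0] * np.exp(-beta[1] * x)

def f_beta(x, beta):
    e = np.exp(-beta[1] * x)
    return np.column_stack([e, -beta[0] * x * e])

# Hypothetical variance function g(beta, theta, x): power-of-the-mean model
def g(beta, theta, x):
    return np.abs(f(x, beta)) ** theta

def solve_wls(x, y, w, beta_start):
    """Solve (3.3), sum_j w_j {Y_j - f(x_j, beta)} f_beta(x_j, beta) = 0, with
    the weights w held fixed, by minimizing the weighted sum of squares
    sum_j w_j {Y_j - f(x_j, beta)}^2, which has the same stationary points."""
    sw = np.sqrt(w)
    fit = least_squares(lambda b: sw * (y - f(x, b)), beta_start,
                        jac=lambda b: -sw[:, None] * f_beta(x, b))
    return fit.x

def gls(x, y, theta, beta_start, C=3):
    # Step (i): initial estimate, e.g. OLS, i.e. WLS with all weights = 1
    beta = solve_wls(x, y, np.ones_like(y, dtype=float), beta_start)
    for _ in range(C):
        # Step (ii): form weights w_hat_j = g^{-2}(beta_hat^(k), theta, x_j)
        w = g(beta, theta, x) ** (-2)
        # Step (iii): re-estimate beta with the weights held fixed
        beta = solve_wls(x, y, w, beta)
    return beta
```

Note that solve_wls here delegates the work to a black-box optimizer; the remainder of this section considers how such a WLS solve may be carried out directly.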
We thus focus first on how this may be carried out.

Note that if f(x_j, β) were a linear function of β, i.e., f(x_j, β) = x_j^T β, then f_β(x_j, β) = x_j, and it is easy to see that (3.3) may be solved in closed form for β. In particular, under these conditions, it is easy to verify that the solution is

    β̂_WLS = ( Σ_{j=1}^n w_j x_j x_j^T )^{−1} Σ_{j=1}^n w_j x_j Y_j.

When linearity of f does not hold, and f is a general nonlinear function of β, it is clear that a closed-form solution is no longer possible in general. In some special cases, the forms of f and f_β may fortuitously admit an analytical solution, but this is very unusual. Accordingly, (3.3) must be solved numerically.

The basic method for numerical solution of the equation may be derived in different ways. Here is one way, a variant of an idea called the Gauss-Newton method in the nonlinear regression literature. We will discuss another way to motivate this method shortly.

By a Taylor series expansion, we may approximate f(x_j, β) and f_β(x_j, β) by linear functions of β. Taking the expansions about some value β* "close to" β, we have

    f(x_j, β) ≈ f(x_j, β*) + f_β^T(x_j, β*)(β − β*)    (3.4)

    f_β(x_j, β) ≈ f_β(x_j, β*) + f_ββ(x_j, β*)(β − β*)    (p × 1).    (3.5)

See Section 2.4 for an overview of this notation; here, f_ββ(x_j, β) is the (p × p) matrix of second partial derivatives. The underlying assumption behind the linear approximation is that, for β* "close to" β, the subsequent terms (quadratic and higher) in the Taylor series are sufficiently small as to be "negligible." Note also that these expressions carry implicit assumptions about the existence of the partial derivatives of f and f_β required for the relevance of Taylor's theorem.

• For nonlinear models, such approximations are used routinely. It must be kept in mind that these approximations involve assumptions that must hold for them to be relevant to a particular problem.

Substituting these expressions into (3.3), we get

    0 ≈ Σ_{j=1}^n w_j {Y_j − f(x_j, β*) − f_β^T(x_j, β*)(β − β*)} {f_β(x_j, β*) + f_ββ(x_j, β*)(β − β*)}

      ≈ Σ_{j=1}^n w_j {Y_j − f(x_j, β*)} f_β(x_j, β*) − Σ_{j=1}^n w_j f_β(x_j, β*) f_β^T(x_j, β*)(β − β*)

        + Σ_{j=1}^n w_j {Y_j − f(x_j, β*)} f_ββ(x_j, β*)(β − β*) + quadratic terms in (β − β*).

This expression forms the basis for a linear approximation to the (nonlinear) estimating equation.

• If β and β* are "close," we would expect the quadratic terms to be "small" relative to the other terms, which involve (β − β*) in only a linear way.

• Furthermore, under these conditions, we also expect E{Y_j − f(x_j, β*)} ≈ 0. Thus, the third term on the right-hand side of the approximation, which involves the product of (β − β*) and {Y_j − f(x_j, β*)}, might also be expected to be "small."

Considering the third and fourth terms to be "negligible," then, we have the linear approximation

    0 ≈ Σ_{j=1}^n w_j {Y_j − f(x_j, β*)} f_β(x_j, β*) − Σ_{j=1}^n w_j f_β(x_j, β*) f_β^T(x_j, β*)(β − β*).
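Solving this linear approximation for β gives the familiar Gauss-Newton update β̂ = β* + ( Σ_{j=1}^n w_j f_β f_β^T )^{−1} Σ_{j=1}^n w_j {Y_j − f(x_j, β*)} f_β, which may then be iterated with β* replaced by the current iterate. Below is a minimal sketch of this iteration, reusing the hypothetical callables f and f_beta from the earlier sketch; it is offered as an illustration of the linearization, not as a definitive implementation.

```python
import numpy as np

def gauss_newton_wls(x, y, w, f, f_beta, beta, n_iter=25, tol=1e-8):
    """Solve the WLS estimating equation (3.3) for fixed weights w by
    iterating the linearization above: expand about the current iterate,
    drop the "negligible" third and fourth terms, and solve the resulting
    linear equation for the next iterate."""
    for _ in range(n_iter):
        r = y - f(x, beta)            # residuals Y_j - f(x_j, beta*)
        F = f_beta(x, beta)           # (n x p); row j is f_beta^T(x_j, beta*)
        A = F.T @ (w[:, None] * F)    # sum_j w_j f_beta f_beta^T
        b = F.T @ (w * r)             # sum_j w_j {Y_j - f(x_j, beta*)} f_beta
        delta = np.linalg.solve(A, b) # the increment beta - beta*
        beta = beta + delta
        if np.max(np.abs(delta)) < tol:   # successive iterates coincide
            break
    return beta
```

When f(x_j, β) = x_j^T β, the matrix F has rows x_j^T and does not depend on β, and a single step reproduces the closed-form β̂_WLS above exactly.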

