Filename: SingNewt    February 5, 2007 7:50 am    Prof. W. Kahan    Math. 128B    Page 1/4

Notes on Nonlinear Newton Iterations at a Typical Singularity

Given a nonlinear n-vector-valued function ƒ(x) of an n-vector x , a solution z of the equation ƒ(z) = o is often sought by means of Newton’s Iteration  xₖ₊₁ := Nƒ(xₖ)  wherein Newton’s Iterating Function  Nƒ(x) := x – ƒ'(x)⁻¹·ƒ(x) . Here ƒ'(x) := ∂ƒ(x)/∂x is the n-by-n Jacobian Matrix of first partial derivatives; it is usually presumed adequately differentiable and invertible at all x in some open neighborhood of the desired solution z , and then

    Nƒ(x) – z = ƒ'(x)⁻¹·( (1/2)·ƒ"(x)·(x–z) + … )·(x–z) = O(‖x–z‖²)  as x → z .

Those presumptions imply at least Quadratic Convergence. This happens almost always.

The iteration’s behavior is not so easy to characterize in some singular situations. Among these the most common has det(ƒ'(x)) = 0 but ∂det(ƒ'(x))/∂x ≠ o` at x = z . Thus det(ƒ'(x)) = 0 along some curve (n = 2) or (hyper)surface (n ≥ 3) that passes through z but does not intersect itself there, not even tangentially. Call this locus “ $ ” . It divides some open neighborhood of z into two open regions in which det(ƒ'(x)) takes opposite signs. Nƒ(x) is either ambiguous or infinite on $ because ƒ'(x) is not invertible there. These notes explore the behavior of Newton’s iteration when it converges to z but no iterate falls into $ . Convergence turns out to be Linear rather than quadratic, and iterates approaching z usually tend to avoid $ , as we shall see. Our conclusions are summarized on p. 3 and illustrated by examples on p. 4.

The Derivative Row  h`(x) := ∂det(ƒ'(x))/∂x .
Jacobi’s Formula for the derivative of a determinant says  d det(B) = Trace(Adj(B)·dB) , wherein the Trace is the sum of all diagonal elements and Adj(B) is the Classical Adjoint or Adjugate :  Adj(B) := det(B)·B⁻¹ when det(B) ≠ 0 , and otherwise Adj(B) is defined by continuity, since it turns out to be a polynomial function of the elements of B satisfying  B·Adj(B) ≡ Adj(B)·B ≡ det(B)·I  in general. There are other equivalent definitions of Adj(B) in terms of determinants or the Characteristic Polynomial of B , but all we need from them is this fact:  Adj(B) ≠ O just when the n-by-n matrix B has Rank(B) ≥ n–1 , and then Rank(Adj(B)) = n – (n–1)·(n – Rank(B)) . Jacobi’s formula is derived at <www.cs.berkeley.edu/~wkahan/MathH110/jacobi.pdf> .

Jacobi’s formula says  d det(ƒ'(x)) = Trace(Adj(ƒ'(x))·ƒ"(x)·dx)  wherein the second derivative ƒ" is a bilinear operator that maps n-vectors y and z each linearly to an n-vector ƒ"(x)·y·z that, in general, varies nonlinearly with x and, if ƒ" is continuous, satisfies ƒ"(x)·y·z = ƒ"(x)·z·y . In particular ƒ"(x)·dx is a matrix whose every element depends linearly upon column dx , so there must be some row n-vector h`(x) = ∂det(ƒ'(x))/∂x satisfying  d det(ƒ'(x)) = h`(x)·dx  for all dx ; and if h`(z) ≠ o` then the equation of the plane tangent to $ at z is  h`(z)·(x–z) = 0 .

We assume henceforth that h`(z) ≠ o` and det(ƒ'(z)) = 0 , whence Rank(ƒ'(z)) = n–1 and hence Rank(Adj(ƒ'(z))) = 1 , so Adj(ƒ'(z)) = w·v` for two nonzero vectors satisfying ƒ'(z)·w = o and v`·ƒ'(z) = o` . Consequently

    h`(z)·dx = Trace(Adj(ƒ'(z))·ƒ"(z)·dx) = Trace(w·v`·ƒ"(z)·dx) = v`·ƒ"(z)·dx·w = v`·ƒ"(z)·w·dx ,

the last two steps by the cyclic property of the Trace and the symmetry of ƒ" , and therefore  h`(z) = v`·ƒ"(z)·w = ∂det(ƒ'(x))/∂x at x = z . We shall need this formula later.
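The two facts just used — Jacobi’s formula and the adjugate identity — are easy to spot-check numerically. The sketch below is illustrative only (it is not from these notes): it assumes NumPy is available and computes the adjugate as det(B)·B⁻¹, which is valid only when B is invertible, comparing Trace(Adj(B)·dB) against a central-difference approximation of d det(B).

```python
import numpy as np

def adjugate(B):
    # Adj(B) = det(B)·B^{-1} when det(B) != 0; the polynomial definition
    # extends this continuously to singular B (not needed here).
    return np.linalg.det(B) * np.linalg.inv(B)

rng = np.random.default_rng(0)
B  = rng.standard_normal((4, 4))   # almost surely invertible
dB = rng.standard_normal((4, 4))   # an arbitrary perturbation direction
t  = 1e-6

# Jacobi's formula: d det(B) = Trace(Adj(B)·dB)
fd     = (np.linalg.det(B + t*dB) - np.linalg.det(B - t*dB)) / (2*t)
jacobi = np.trace(adjugate(B) @ dB)
err = abs(fd - jacobi)             # should be tiny (O(t^2) discretization error)
```

The central difference agrees with Jacobi’s formula to roughly the square of the step size t, which is why the check is so tight.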
A Taylor Series expansion of ƒ will be presumed valid for x in some open neighborhood of z :

    ƒ(x) = o + ƒ'(z)·(x–z) + (1/2)·ƒ"(z)·(x–z)·(x–z) + (1/6)·ƒ'"(z)·(x–z)·(x–z)·(x–z) + … ;
    ƒ'(x) = ƒ'(z) + ƒ"(z)·(x–z) + (1/2)·ƒ'"(z)·(x–z)·(x–z) + … ;  det(ƒ'(z)) = 0 ;  Adj(ƒ'(z)) = w·v` ;
    det(ƒ'(x)) = h`·(x–z) + …  wherein  h` := v`·ƒ"(z)·w ≠ o` .

Needed next is a Taylor series expansion for Newton’s iterating function:

    Nƒ(x) = x – ƒ'(x)⁻¹·ƒ(x)
          = z + ( ƒ' + ƒ"·(x–z) + (1/2)·ƒ'"·(x–z)·(x–z) + … )⁻¹·( (1/2)·ƒ"·(x–z) + (1/3)·ƒ'"·(x–z)·(x–z) + … )·(x–z)

in which all derivatives of ƒ(x) are evaluated at x = z . This is too messy. It must be simplified.

A Change of Coordinates.

By a process akin to Gaussian Elimination with Pivotal Exchanges of both rows and columns, we can obtain a diagonal matrix  L⁻¹·ƒ'(z)·U⁻¹ = M := [ I o ; o` 0 ] , whose first n–1 diagonal elements are 1 and whose last is 0 , using suitably permuted lower- and upper-triangular matrices L and U . These figure in a simplifying change of variables from x to u := U·(x–z) . Let g(u) := L⁻¹·ƒ(z + U⁻¹·u) , so that Newton’s iterating function for g is  Ng(u) := u – g'(u)⁻¹·g(u) = U·(Nƒ(z + U⁻¹·u) – z) . In other words, the iteration xₖ₊₁ := Nƒ(xₖ) starting from x₀ is mimicked by the iteration uₖ₊₁ := Ng(uₖ) starting from u₀ := U·(x₀ – z) insofar as every uₖ = U·(xₖ – z) . Iterates xₖ → z just as fast (if at all) as uₖ → o .

Thus no generality is lost by assuming  z = o = ƒ(o) ,  ƒ'(o) = M := [ I o ; o` 0 ] ,  Adj(ƒ'(o)) = [ O o ; o` 1 ] = v·v`  in which  w` = v` = [o` 1] , and the foregoing Taylor series for Nƒ(x) simplifies to

    Nƒ(x) = ( M + ƒ"·x + (1/2)·ƒ'"·x·x + … )⁻¹·( (1/2)·ƒ"·x + (1/3)·ƒ'"·x·x + … )·x
          = x – ( M + ƒ"·x + (1/2)·ƒ'"·x·x + … )⁻¹·( M + (1/2)·ƒ"·x + (1/6)·ƒ'"·x·x + … )·x

in which all derivatives of ƒ(x) are evaluated at x = o . Further progress requires a partition of ƒ"·x in which the leading (n–1)-by-(n–1) block Xx has every element a linear function of x .
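The linear convergence claimed on p. 1 can be watched in the simplest instance of this singular situation. The sketch below is illustrative and not from these notes: it takes ƒ(x, y) = (x, y²) , for which z = o , ƒ'(o) = diag(1, 0) has rank n–1 = 1 , and h` = v`·ƒ"(o)·w = [0 2] ≠ o` ; NumPy is assumed available.

```python
import numpy as np

def newton_step(v):
    """One Newton step for f(x, y) = (x, y^2), whose root is the origin."""
    x, y = v
    f = np.array([x, y*y])
    J = np.array([[1.0, 0.0], [0.0, 2.0*y]])  # det(J) = 2y vanishes on the locus $ (the line y = 0)
    return v - np.linalg.solve(J, f)

v = np.array([0.3, 1.0])
errors = []
for _ in range(20):
    v = newton_step(v)
    errors.append(np.linalg.norm(v))

ratio = errors[-1] / errors[-2]   # successive error ratio; a limit of 1/2 means linear convergence
```

Here the first component is annihilated after one step, while the second obeys yₖ₊₁ = yₖ/2 exactly, so the iterates halve their distance to the root at every step and stay off $ — linear, not quadratic, convergence.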