**Unformatted text preview:**

Geometric diffusions as a tool for harmonic analysisand structure definition of data: Multiscale methodsR. R. Coifman*†, S. Lafon*, A. B. Lee*, M. Maggioni*, B. Nadler*, F. Warner*, and S. W. Zucker‡*Department of Mathematics, Program in Applied Mathematics, Yale University, 10 Hillhouse Avenue, New Haven, CT 06510; and‡Departmentof Computer Science, Yale University, 51 Prospect Street, New Haven, CT 06510Contributed by R. R. Coifman, February 2, 2005In the companion article, a framework for structural multiscalegeometric organization of subsets of ⺢nand of graphs was intro-duced. Here, diffusion semigroups are used to generate multiscaleanalyses in order to organize and represent complex structures. Weemphasize the multiscale nature of these problems and buildscaling functions of Markov matrices (describing local transitions)that lead to macroscopic descriptions at different scales. Theprocess of iterating or diffusing the Markov matrix is seen as ageneralization of some aspects of the Newtonian paradigm, inwhich local infinitesimal transitions of a system lead to globalmacroscopic descriptions by integration. This article deals with theconstruction of fast-order N algorithms for data representation andfor homogenization of heterogeneous structures.In the c ompanion article (1), it is shown that the eigenfunctionsof a diffusion operator, A, can be used to perform globalanalysis of the set and of functions on a set. Here, we present ac onstruction of a multiresolution analysis of functions on the setrelated to the dif fusion operator A. This allows one to performa local analysis at different dif fusion scales.This is motivated by the fact that in many situations one isinterested not in the data themselves but in functions on the data,and in general these functions exhibit different behaviors atdif ferent scales. This is the case in many problems in learning, inanalysis on graphs, in dynamical systems, etc. The analysisthrough the eigenfunctions of Laplacian considered in thec ompanion article (1) are global and are affected by globalcharacteristics of the space. It can be thought of as global Fourieranalysis. The multiscale analysis proposed here is in the spirit ofwavelet analysis.We refer the reader to (2–4) for further details and applica-tions of this construction, as well as a discussion of the manyrelationships bet ween this work and the work of many otherresearchers in several branches of mathematics and appliedmathematics. Here, we would like to at least mention therelationship with fast multiple methods (5, 6), algebraic multi-grid (7), and lifting (8, 9).Multiscale Analysis of DiffusionConstruction of the Multiresolution Analysis. Suppose we are givena self-adjoint diffusion operator A as in ref. 1 acting on L2of ametric measure space (X, d,) . We interpret A as a dilationoperator and use it to define a multiresolution analysis. It isnatural to discretize the semigroup {At}tⱖ0of the powers of Aat a logarithmic scale, for example at the timestj⫽ 1 ⫹ 2 ⫹ 22⫹ ···⫹ 2j⫽ 2j⫹1⫺ 1. [1]For a fixed 僆 (0,1), we define the approximation spaces byVj⫽ 具兵i:itjⱖ其典, [2]where theis are the eigenvectors of A, ordered by decreasingeigenvalue. We will denote by Pjthe orthogonal projection ontoVj. The set of subspaces {Vj}j僆⺪is a multiresolution analysis inthe sense that it satisfies the following properties:1. limj3 ⫺⬁Vj⫽ L2(X,) , limj3 ⫹⬁Vj⫽具{i:i⫽ 1}典.2. Vj⫹1債 Vjfor every j 僆 ⺪.3. {i:itjⱖ} is an orthonormal basis for Vj.We can also define the detail subspaces Wjas the orthogonalc omplement of Vjin Vj⫹1, so that we have the familiar relationbet ween approximation and detail subspaces as in the classicalwavelet multiresolution constructions:Vj⫹1⫽ VjQ⬜Wj.This is very much in the spirit of a Littlewood–Paley decompo-sition induced by the diffusion semigroup (10). However, in eachsubspace Vjand Wjwe have the orthonormal basis of eigenfunc-tions, but we would like to replace them with localized orthonormalbases of scaling functions as in wavelet theory. Generalized Heisen-berg principles (see Extension of Empirical Functions of the Data Set)put a lower bound on how much localization can be achieved at eachscale j, depending on the spectrum of the operator A and on thespace on which it acts. We would like to have basis elements as muchlocalized as allowed by the Heisenberg principle at each scale, andspanning (approximately) Vj. We do all this while avoiding com-putation of the eigenfunctions.We start by fixing a precision ⬎ 0 and assume that A isrepresented on the basis ⌽0⫽ {␦k}k僆X. We consider thec olumns of A, which can be interpreted as the set of functions⌽˜1⫽ {A␦k}k僆Xon X. We use a local multiscale Gram–Schmidtprocedure, described below, to carefully but efficiently orthonor-malize these columns into a basis ⌽1⫽ {1,k}k僆X1(X1is definedas this index set) for the range of A up to precision . This is alinear transformation we represent by a matrix G0. This yields asubspace that is -close to V1. Essentially, ⌽1is a basis for asubspace that is -close to the range of A, the basis elements thatare well localized and orthogonal. Obviously, 兩X1兩 ⱕ 兩X兩, but theinequalit y may already be strict since part of the range of A maybe below the precision . Whether this is the case or not, we havethen a map M0f rom X to X1, which is the composition of A withthe orthonormalization by G0. We can also represent A in thebasis ⌽1: We denote this matrix by A1and compute A12. See thediagram in Fig. 1.We now proceed by looking at the columns of A12, which are⌽˜2⫽ {A12␦k}k僆X1⫽ {A21,k}k僆X1up to precision , by unrav-eling the bases on which the various elements are represented.Again we can apply a local Gram–Schmidt procedure toorthonor malize this set: This yields a matrix G1and anorthonor mal basis ⌽2⫽ {2,k}k僆X2for the range of A12up toprecision , and hence for the range of A03up to precision 2.Moreover, depending on the decay of the spectr um of A,兩X2兩 ⬍⬍ 兩X1兩. The matrix M1, which is the composition of G1w ithA12, is then of size 兩X2兩 ⫻ 兩X1兩, and A22⫽ M1M1Tis a represen-t ation of A4acting on ⌽2.Af ter j steps in this fashion, we will have a representation ofA1⫹2⫹22⫹䡠䡠䡠⫹2j⫽ A2j⫹1⫺1onto a basis ⌽j⫽ {j,k}k僆Xjthat spans†To whom correspondence