Journal of Computer and System Sciences 66 (2003) 671–687
http://www.elsevier.com/locate/jcss

Database-friendly random projections: Johnson-Lindenstrauss with binary coins

Dimitris Achlioptas
Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA

Received 28 August 2001; revised 19 July 2002

Abstract

A classic result of Johnson and Lindenstrauss asserts that any set of n points in d-dimensional Euclidean space can be embedded into k-dimensional Euclidean space—where k is logarithmic in n and independent of d—so that all pairwise distances are maintained within an arbitrarily small factor. All known constructions of such embeddings involve projecting the n points onto a spherically random k-dimensional hyperplane through the origin. We give two constructions of such embeddings with the property that all elements of the projection matrix belong in {−1, 0, +1}. Such constructions are particularly well suited for database environments, as the computation of the embedding reduces to evaluating a single aggregate over k random partitions of the attributes.
© 2003 Elsevier Science (USA). All rights reserved.

1. Introduction

Consider projecting the points of your favorite sculpture first onto the plane and then onto a single line. The result amply demonstrates the power of dimensionality.

In general, given a high-dimensional pointset it is natural to ask if it could be embedded into a lower-dimensional space without suffering great distortion. In this paper, we consider this question for finite sets of points in Euclidean space.
It will be convenient to think of n points in R^d as an n × d matrix A, each point represented as a row (vector). Given such a matrix representation, one of the most commonly used embeddings is the one suggested by the singular value decomposition of A. That is, in order to embed the n points into R^k we project them onto the k-dimensional space spanned by the singular vectors corresponding to the k largest singular values of A. If one rewrites the result of this projection as a (rank k) n × d matrix A_k, we are guaranteed that any other k-dimensional pointset (represented as an n × d matrix D) satisfies

  ||A − A_k||_F ≤ ||A − D||_F,

where, for any matrix Q, ||Q||_F² = Σ_{i,j} Q_{i,j}². To interpret this result, observe that if moving a point by z takes energy proportional to z², then A_k represents the k-dimensional configuration reachable from A by expending the least energy.

E-mail address: [email protected]
doi:10.1016/S0022-0000(03)00025-4

In fact, A_k is an optimal rank-k approximation of A under many matrix norms. In particular, it is well known that for any rank-k matrix D and for any rotationally invariant norm

  ||A − A_k|| ≤ ||A − D||.

At the same time, this optimality implies no guarantees regarding local properties of the resulting embedding. For example, it is not hard to devise examples where the new distance between a pair of points is arbitrarily smaller than their original distance. For a number of problems where dimensionality reduction is clearly desirable, the absence of such local guarantees can make it hard to exploit embeddings algorithmically.

In a seminal paper, Linial et al. [12] were the first to consider algorithmic applications of embeddings that respect local properties. By now, embeddings of this type have become an important tool in algorithmic design. A real gem in this area has been the following result of Johnson and Lindenstrauss [9].

Lemma 1.1 (Johnson and Lindenstrauss [9]).
Given ε > 0 and an integer n, let k be a positive integer such that k ≥ k₀ = O(ε⁻² log n). For every set P of n points in R^d there exists f : R^d → R^k such that for all u, v ∈ P

  (1 − ε) ||u − v||² ≤ ||f(u) − f(v)||² ≤ (1 + ε) ||u − v||².

We will refer to embeddings providing a guarantee akin to that of Lemma 1.1 as JL-embeddings. In the last few years, such embeddings have been used in solving a variety of problems. The idea is as follows. By providing a low-dimensional representation of the data, JL-embeddings speed up certain algorithms dramatically, in particular algorithms whose run-time depends exponentially on the dimension of the working space. (For a number of practical problems the best-known algorithms indeed have such behavior.) At the same time, the provided guarantee regarding pairwise distances often allows one to establish that the solution found by working in the low-dimensional space is a good approximation to the solution in the original space. We give a few examples below.

Papadimitriou et al. [13] proved that embedding the points of A in a low-dimensional space can significantly speed up the computation of a low-rank approximation to A, without significantly affecting its quality. In [8], Indyk and Motwani showed that JL-embeddings are useful in solving the ε-approximate nearest-neighbor problem, where (after some preprocessing of the pointset P) one is to answer queries of the following type: "Given an arbitrary point x, find a point y ∈ P such that for every point z ∈ P, ||x − z|| ≥ (1 − ε)||x − y||." In a different vein, Schulman [14] used JL-embeddings as part of an approximation algorithm for the version of clustering where we seek to minimize the sum of the squares of intracluster distances. Recently, Indyk [7] showed that JL-embeddings can also be used in the context of "data-stream" computation, where one has limited memory and is allowed only a single pass over the data (stream).
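As a sanity check, the guarantee of Lemma 1.1 can be exercised numerically with a spherically random projection. The sketch below is illustrative and not from the paper: the pointset dimensions, the value of ε, and the particular expression for k (a Dasgupta–Gupta-style bound; the constant 4 is an assumption) are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, eps = 50, 1000, 0.5

# One common form of the target dimension k = O(eps^-2 log n);
# the exact constant here is illustrative, not taken from this paper.
k = int(np.ceil(4 * np.log(n) / (eps**2 / 2 - eps**3 / 3)))

P = rng.standard_normal((n, d))               # n points in R^d
R = rng.standard_normal((d, k)) / np.sqrt(k)  # spherically random projection,
f = P @ R                                     # scaled so squared norms are unbiased

# Count the pairs whose squared distance is preserved within (1 +/- eps).
good, pairs = 0, 0
for i in range(n):
    for j in range(i + 1, n):
        orig = np.sum((P[i] - P[j]) ** 2)
        new = np.sum((f[i] - f[j]) ** 2)
        good += (1 - eps) * orig <= new <= (1 + eps) * orig
        pairs += 1
frac = good / pairs
print(f"k = {k}, fraction of pairs preserved: {frac:.3f}")
```

With these parameters almost every pair satisfies the bound; the occasional failing pair is consistent with the lemma being a probabilistic guarantee over the choice of projection.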
1.1. Our contribution

Over the years, the probabilistic method has allowed for the original proof of Johnson and Lindenstrauss to be greatly simplified and sharpened, while at the same time giving conceptually simple randomized algorithms for constructing the embedding [5,6,8]. Roughly speaking, all such algorithms project the input points onto a spherically random hyperplane through the origin. While this is conceptually simple, in practical terms it amounts to multiplying the input matrix A with a
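The abstract promises projection matrices whose entries all lie in {−1, 0, +1}. As a hedged illustration, the sketch below draws entries +1 and −1 with probability 1/6 each and 0 with probability 2/3, scaled by √3; treat these probabilities and the scaling as an assumption about the paper's sparse construction, since the excerpt above breaks off before stating them, and the matrix sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 50, 1000, 300

A = rng.standard_normal((n, d))   # n points in R^d, one per row

# Projection matrix with entries in {-1, 0, +1}: +1 and -1 with
# probability 1/6 each, 0 with probability 2/3 (assumed distribution).
R = rng.choice([-1.0, 0.0, 1.0], size=(d, k), p=[1 / 6, 2 / 3, 1 / 6])

# The sqrt(3/k) scaling makes E[r^2] * 3/k = 1/k per coordinate, so
# squared lengths are preserved in expectation: E[||E_i||^2] = ||A_i||^2.
E = np.sqrt(3.0 / k) * (A @ R)

ratios = np.sum(E**2, axis=1) / np.sum(A**2, axis=1)
print(ratios.mean())  # concentrates near 1
```

Because each column of R merely picks attributes and attaches signs, every projected coordinate is a signed sum over a random partition of the attributes, which is what makes such a projection easy to express as ordinary database aggregates.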