Stanford CS 468 - Lossless Representation of Graphs using Distributions - D1731670

Home> Schools> Stanford University> Computer Science (CS) > CS 468> Lossless Representation of Graphs using Distributions

DOC PREVIEW

Stanford CS 468 - Lossless Representation of Graphs using Distributions

School name Stanford University

Course Cs 468- Topics in Geometric Algorithms:Data-Driven Shape Analysis

Pages 19

This preview shows page 1-2-3-4-5-6 out of 19 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 19 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 19 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 19 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 19 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 19 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 19 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 19 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

PreliminariesReconstructibility from the distribution of subtrianglesA partial isomorphismDistinct weightsSome statisticsReconstructibility from one-dimensional distributionsGraphs with edge weightsGraphs with node weightsLossless Representation of Graphs using Dist r ibutionsMireille Boutin and Gregor KemperOctober 4, 2007AbstractWe consider complete graphs with edge weights and/or node weights taking values in someset. In the first part of this paper, we show that a large number of graphs are completelydetermined, up to isomorphism, by the distribution of their sub-triangles. In the second part,we propose graph representations in terms of one-dimensional distributions (e.g., distributionof the node weights, sum of adjacent weights, etc.). For the case when the weights of the graphare real-valued vectors, we show that all graphs, except for a set of measure zero, are uniquelydetermined, up to isomorphism, from these distributions. The motivating application for thispaper is the problem of browsing through large sets of graphs.Introd uctionGraphs in general, and more particularly weighted or vector weighted gra phs, are often used torepresent complex structures such as 3D objects (Foulds 1992). This is often done, for example, whentrying to determine whether two objects have a similar structure: by using a graph repres e ntation,the problem is simplified into determining whether the two underlying graphs are similar. Theproblem of determining whether two graphs are the same, up to a relabeling of the nodes, is calledthe graph isomorphism problem. While graph isomorphism is not an NP hard problem, it is stillvery hard. In fact, the problem is sometimes assigned to a special complexity class called graphisomorphism complete (Skiena [29]).Graph isomorphism has been a mainstream research problem for more than 3 0 years (Gati [13],Read and Corneil [25]). Over these years, sever al solutions have been developed, including graphmatching methods (Cor neil and Gotlieb [8], Ullmann [32], Kitchen and Rosenfeld [17], Kim andKim [16], Falkenhainer et al. [11], Horaud and Skordas [15], Myaeng and Lopez -Lopez [23]), canon-ical lab e ling representation methods (McKay [19]), graphs invariants (Corneil and Kirk patrick [9],Umeyama [33], Chung [7], Mess mer and Bunke [21], Messmer and Bunke [22]), graph matchingbased on error-correcting isomorphism methods (Tsai and Fu [30], Shapiro and Haralick [28], Sanfe-liu and Fu [26], Eshera and Fu [10], Wong [35]), and approximate graph matching methods (Heraultet al. [14], Kittler et al. [18], Christmas et al. [6], Almohamad and Duffuaa [1]). In this paper, weshow that many graphs can be uniquely reconstructed from some simple distributions. For example,in the first section of the paper, we show that a large number of weighted graphs can be uniquely re-constructed from their distribution of subtriangles. In other words, their distribution of subtrianglesprovides a faithful (i.e., lossle ss) representation for these gr aphs. This means that isomorphism canbe detected simply by comparison of the respective distribution of triangles. For simpler comparisonand visualization, we also introduce graph representation in terms of one-dimensional distributions(e.g., node weights, edge weights and s um of adjacent edge weights). As we show in Section 3, formany graphs, these represe ntations are lossless. Actually, when the weights of the graphs take valuesinside R, the s et of exception form a set of measure zero. This work can b e viewed as an extensionto graphs of the repres entations we proposed in (Boutin & Kemper 2004) for point configurations.The r e sults we present are interesting both from a graph theory point p oint of view and from anapplication point of view. From an application point of view, having a faithful representation which12 M. Boutin, G. Kempercan be compared quickly is useful in the case where many graphs need to be c ompared in a shortamount of time. For example, an important problem is the problem of browsing for g raphs within alarge database. In this problem, being able to compare graphs quickly is a key issue. However, thisis not the only issue. Indeed, exhaustive sear ches are not e fficient ways to query a large databasebecause their complexity is proportional to the size of the database. For faster search, an indexstructure must be built. Da tabase indices exploit the presence of natural clusters in the da taset tosuccessively rule out large regions o f the data space. Unfortunately, some datasets do not exhibitnatural clusters, particularly high-dimensio nal ones (Beyer e t al. [2]). Indeed, while some high-dimensional datas ets can be dealt with successfully (Shaft and Ramakrishnan [27]) (e.g., w hen theunderlying dimensionality of the dataset is much lower than the dimension of the vectors), clusterscan usually only be found in low-dimensional projections of the high-dimensional vectors (Wang andYang [3 4]). However, projecting graph representations which are not invariant under isomorphismis unproductive because the labeling ambiguity blurs the distinction between the dimensions beingprojected. Thus an isomorphism invariant representation is needed. Moreover, a faithful repre-sentation guar antees the highest level of comparison accuracy. In particular, small distinguishingfeatures are guaranteed to be preserved. In other words, fo r a large set of graphs, the proposedrepresentations provide explicit coordinates to r epresent g raphs, and thus naturally lend themselvesto the arr ay of database projection and indexing techniques available in the literature. This is incontrast with many current graph indexing approaches which first approximate the structure witha lossy repres entation before indexing.From a graph theory point of view, there is a long tradition of considering subgraphs of a givengraph and asking how much information about the graph is contained in its subgraph structure.Perhaps the most important example of this approach is Ulam’s conjecture (see Ulam [31]), alsocalled the re c onstruction problem. The conjecture can be stated as follows: Let G be a simple non-directed graph with n ≥ 3 nodes (simple means for each pair of nodes there either is an edge betweenthem or not). Take the set of all isomo rphism classes of (n − 1)-subgr aphs of G, i.e., all graphsobtained from G by deleting one node. To each of these isomorphism classes assign its multiplicity,i.e., the number of nodes of G whose deletion leads to a

View Full Document