Unformatted text preview:

Slide 1Why graph clustering is useful?OutlineMin s-t cutMax flow problemCutsFlowsMax flow problemFlows and cutsFlows and cutsCertificate of optimalitySettingMin cut problemRandomized min-cut algorithmExample of contractionObservations on the algorithmAnalysis of the algorithmMultiway cut (analogue of s-t cut)Algorithm for multiway cutApproximation resultMinimum k-cutMinimum k-cut algorithmGomory-Hu TreeGomory-Hu treeMin-cuts againMeasuring connectivityGraph expansionSpectral analysisLaplacian Matrix propertiesThe second smallest eigenvalueSpectral orderingSpectral partitionFielder ValueConductanceConductance and random walksInterpretation of conductanceClustering ConductanceA clustering bi-criterionThe clustering problemA spectral algorithmA divide and merge methodologyMerge phase or dynamic-progamming on treesDynamic programming on treesGraph ClusteringWhy graph clustering is useful?•Distance matrices are graphs  as useful as any other clustering•Identification of communities in social networks•Webpage clustering for better data management of web dataOutline•Min s-t cut problem•Min cut problem•Multiway cut•Minimum k-cut•Other normalized cuts and spectral graph partitioningsMin s-t cut•Weighted graph G(V,E)•An s-t cut C = (S,T) of a graph G = (V, E) is a cut partition of V into S and T such that s S∈ and t T∈•Cost of a cut: Cost(C) = Σe(u,v) uЄS, v ЄT w(e)•Problem: Given G, s and t find the minimum cost s-t cutMax flow problem•Flow network–Abstraction for material flowing through the edges–G = (V,E) directed graph with no parallel edges–Two distinguished nodes: s = source, t= sink–c(e) = capacity of edge eCuts•An s-t cut is a partition (S,T) of V with sЄS and tЄT•capacity of a cut (S,T) is cap(S,T) = Σe out of Sc(e)•Find s-t cut with the minimum capacity: this problem can be solved optimally in polynomial time by using flow techniquesFlows•An s-t flow is a function that satisfies–For each eЄE 0≤f(e) ≤c(e) [capacity]–For each vЄV-{s,t}: Σe in to vf(e) = Σe out of vf(e) [conservation]•The value of a flow f is: v(f) = Σe out of s f(e)Max flow problem•Find s-t flow of maximum valueFlows and cuts•Flow value lemma: Let f be any flow and let (S,T) be any s-t cut. Then, the net flow sent across the cut is equal to the amount leaving sΣe out of S f(e) – Σe in to S f(e) = v(f)Flows and cuts•Weak duality: Let f be any flow and let (S,T) be any s-t cut. Then the value of the flow is at most the capacity of the cut defined by (S,T):v(f) ≤cap(S,T)Certificate of optimality•Let f be any flow and let (S,T) be any cut. If v(f) = cap(S,T) then f is a max flow and (S,T) is a min cut.•The min-cut max-flow problems can be solved optimally in polynomial time!Setting•Connected, undirected graph G=(V,E)•Assignment of weights to edges: w: ER+•Cut: Partition of V into two sets: V’, V-V’. The set of edges with one end point in V and the other in V’ define the cut•The removal of the cut disconnects G•Cost of a cut: sum of the weights of the edges that have one of their end point in V’ and the other in V-V’Min cut problem•Can we solve the min-cut problem using an algorithm for s-t cut?Randomized min-cut algorithm•Repeat : pick an edge uniformly at random and merge the two vertices at its end-points–If as a result there are several edges between some pairs of (newly-formed) vertices retain them all–Edges between vertices that are merged are removed (no self-loops)•Until only two vertices remain•The set of edges between these two vertices is a cut in G and is output as a candidate min-cutExample of contractioneObservations on the algorithm•Every cut in the graph at any intermediate stage is a cut in the original graphAnalysis of the algorithm•C the min-cut of size k  G has at least kn/2 edges–Why?•Ei: the event of not picking an edge of C at the i-th step for 1≤i ≤n-2•Step 1: –Probability that the edge randomly chosen is in C is at most 2k/(kn)=2/n  Pr(E1) ≥ 1-2/n•Step 2:–If E1 occurs, then there are at least n(n-1)/2 edges remaining–The probability of picking one from C is at most 2/(n-1)  Pr(E2|E1) = 1 – 2/(n-1)•Step i:–Number of remaining vertices: n-i+1–Number of remaining edges: k(n-i+1)/2 (since we never picked an edge from the cut)–Pr(Ei|Πj=1…i-1 Ej) ≥ 1 – 2/(n-i+1)–Probability that no edge in C is ever picked: Pr(Πi=1…n-2 Ei) ≥ Πi=1…n-2(1-2/(n-i+1))=2/(n2-n)•The probability of discovering a particular min-cut is larger than 2/n2•Repeat the above algorithm n2/2 times. The probability that a min-cut is not found is (1-2/n2)n^2/2 < 1/eMultiway cut (analogue of s-t cut)•Problem: Given a set of terminals S = {s1,…,sk} subset of V, a multiway cut is a set of edges whose removal disconnects the terminals from each other. The multiway cut problem asks for the minimum weight such set.•The multiway cut problem is NP-hard (for k>2)Algorithm for multiway cut•For each i=1,…,k, compute the minimum weight isolating cut for si, say Ci•Discard the heaviest of these cuts and output the union of the rest, say C•Isolating cut for si: The set of edges whose removal disconnects si from the rest of the terminals•How can we find a minimum-weight isolating cut?–Can we do it with a single s-t cut computation?Approximation result•The previous algorithm achieves an approximation guarantee of 2-2/k•ProofMinimum k-cut •A set of edges whose removal leaves k connected components is called a k-cut. The minimum k-cut problem asks for a minimum-weight k-cut•Recursively compute cuts in G (and the resulting connected components) until there are k components lef•This is a (2-2/k)-approximation algorithmMinimum k-cut algorithm•Compute the Gomory-Hu tree T for G•Output the union of the lightest k-1 cuts of the n-1 cuts associated with edges of T in G; let C be this union•The above algorithm is a (2-2/k)-approximation algorithmGomory-Hu Tree•T is a tree with vertex set V•The edges of T need not be in E•Let e be an edge in T; its removal from T creates two connected components with vertex sets (S,S’)•The cut in G defined by partition (S,S’) is the cut associated with e in GGomory-Hu tree•Tree T is said to be the Gomory-Hu tree for G if–For each pair of vertices u,v in V, the weight of a minimum u-v cut in G is the same as that in T–For each edge e in T, w’(e) is the weight of the cut associated with e


View Full Document

BU CS 565 - Graph Clustering

Documents in this Course
Load more
Download Graph Clustering
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Graph Clustering and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Graph Clustering 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?