UCSD CSE 252C - Hierarchical Clustering

HIERARCHICAL CLUSTERING
An iterative improvement procedure for hierarchical clustering
Andrew Rabinovich
CSE 252C, 10/12/04

Why Hierarchical Clustering and Not the Usual Clustering (k-means)?
• Obtain a hierarchy instead of an amorphous collection of groups
• No need to specify the number of clusters in advance
• Can view the data at many levels of granularity, all at the same time
• Simple heuristics exist for constructing hierarchical clusterings

Hierarchical clustering
Recursive partitioning of a data set.
[Figure: a dendrogram over five points; cutting it at successive levels yields the 1-clustering, 2-clustering, 3-clustering, 4-clustering, and 5-clustering.]

Problem
The whole enterprise of hierarchical clustering could use some more justification.

Conventional Hierarchical Techniques
• Agglomerative
• Divisive

Standard agglomerative heuristics
1. Initially, each point is its own cluster.
2. Repeatedly merge the two "closest" clusters.
This requires defining a distance between clusters:
• Single-linkage: distance between the closest pair of points
• Average-linkage: distance between the centroids
• Complete-linkage: distance between the farthest pair
(A sketch of this merge loop appears below.)
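To make the merge loop concrete, here is a minimal Python sketch, assuming only the three linkage definitions above; the function names and the naive all-pairs search are my own, chosen for clarity rather than speed.

```python
import numpy as np

def single_linkage(A, B):
    # distance between the closest pair of points across the clusters
    return min(np.linalg.norm(a - b) for a in A for b in B)

def average_linkage(A, B):
    # distance between the cluster centroids (as defined on the slide)
    return np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))

def complete_linkage(A, B):
    # distance between the farthest pair of points across the clusters
    return max(np.linalg.norm(a - b) for a in A for b in B)

def agglomerate(X, linkage):
    """Each point starts as its own cluster; repeatedly merge the two
    'closest' clusters. Returns the k-clustering for every k = n..1."""
    clusters = [X[i:i + 1] for i in range(len(X))]
    levels = [list(clusters)]
    while len(clusters) > 1:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = np.vstack((clusters[i], clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        levels.append(list(clusters))
    return levels

X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
for level in agglomerate(X, single_linkage):
    print([len(c) for c in level])
```

Even this toy version makes the later complexity remark plausible: every round scans all pairs of clusters, so the total work is at least quadratic in n.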
Single-linkage clustering
Suffers from a chaining effect. [Figure: points 1, 2, 3, …, j, j+1, …, n on a line, with the gap between points j and j+1 equal to 1 - j\delta.] The k-clustering will have diameter about n - k, instead of n/k.

Complete-linkage clustering
A similarly bad case can be constructed.

Average-linkage clustering
Take points in d-dimensional space, d = log_2(k), under an l_1 metric. The final radius should be 1, but is instead d. Therefore: off by a factor of log_2(k).

Average Linkage
The average-linkage method avoids the extremes of either overly large clusters or overly tight, compact clusters.

Average Linkage Cost Functions
For a merge of clusters S and T, with means \mu(S) and \mu(T):
• Sokal-Michener:

    \|\mu(S) - \mu(T)\|^2

• Average pairwise distance:

    \frac{1}{|S| \cdot |T|} \sum_{x \in S, y \in T} \|x - y\|^2

• Ward's Method:

    \frac{|S| \cdot |T|}{|S| + |T|} \|\mu(S) - \mu(T)\|^2

(These three merge costs are compared numerically below.)
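As a sanity check on the reconstructed formulas, here is a small Python comparison of the three merge costs on two toy clusters; the data and function names are illustrative, not from the talk.

```python
import numpy as np

def sokal_michener(S, T):
    # ||mu(S) - mu(T)||^2
    return float(np.sum((S.mean(axis=0) - T.mean(axis=0)) ** 2))

def average_pairwise(S, T):
    # (1 / (|S| |T|)) * sum_{x in S, y in T} ||x - y||^2
    diffs = S[:, None, :] - T[None, :, :]
    return float(np.sum(diffs ** 2)) / (len(S) * len(T))

def ward(S, T):
    # (|S| |T| / (|S| + |T|)) * ||mu(S) - mu(T)||^2
    return len(S) * len(T) / (len(S) + len(T)) * sokal_michener(S, T)

S = np.array([[0.0, 0.0], [1.0, 0.0]])
T = np.array([[4.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
print(sokal_michener(S, T), average_pairwise(S, T), ward(S, T))
```

Ward's cost is exactly the increase in total k-means cost caused by the merge, which is why it pairs naturally with the cost function introduced next.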
Divisive Methods
• Normalized Cut: a binary partition is performed recursively, increasing the similarity within clusters and the dissimilarity between clusters.

Evaluation of all agglomerative methods
• Can't trace back: a bad early merge is never revisited
• Complexity is at least O(n^2)

Local improvement algorithm for hierarchical clustering
Key challenge: efficiency.
Goal: a hierarchical analog of k-means.

Representation
An ordered binary tree. [Figure: a binary tree over leaves a, b, c, d, with each internal node labeled by its split number.]
Large search space: given n data points, there are

    \frac{n \, ((n-1)!)^2}{2^{n-1}}

hierarchical clusterings.

Cost of hierarchical clustering
Let C_k be the set of k clusters {S_1, S_2, …, S_k} defined by a flat clustering. The k-means cost is

    cost(C_k) = \sum_{j=1}^{k} \sum_{x \in S_j} \|x - \mu(S_j)\|^2

We define the cost of a hierarchical clustering as the weighted sum of the n clusterings defined by the hierarchy:

    hcost = \sum_{k=1}^{n} w_k \, cost(C_k)

Features of the cost function
• Inherently emphasizes smaller k
• Based on the k-means cost function
• The weights w_k can be set to emphasize particular values of k
• Using properties of squared Euclidean distance, it can be calculated in O(n) time
(A brute-force evaluation of hcost on a toy hierarchy follows.)
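For illustration, here is a brute-force evaluation of hcost on a four-point hierarchy, directly from the two definitions above; the point set, tree, and helper names are mine.

```python
import numpy as np

def kmeans_cost(clusters):
    # cost(C_k): within-cluster sum of squared distances to the mean
    return sum(float(np.sum((C - C.mean(axis=0)) ** 2)) for C in clusters)

def hcost(levels, w):
    # hcost = sum_{k=1}^{n} w_k * cost(C_k); levels[k-1] is the k-clustering
    return sum(w[k - 1] * kmeans_cost(Ck)
               for k, Ck in enumerate(levels, start=1))

points = {'a': [0.0, 0.0], 'b': [1.0, 0.0], 'c': [4.0, 0.0], 'd': [5.0, 0.0]}
X = {name: np.array(v) for name, v in points.items()}

def clustering(*groups):
    return [np.stack([X[name] for name in g]) for g in groups]

# A hierarchy on {a, b, c, d}: the first split separates {a, b} from {c, d}.
levels = [clustering(('a', 'b', 'c', 'd')),
          clustering(('a', 'b'), ('c', 'd')),
          clustering(('a', 'b'), ('c',), ('d',)),
          clustering(('a',), ('b',), ('c',), ('d',))]

print(hcost(levels, np.ones(len(levels))))   # uniform weights
```

This direct evaluation touches every one of the n clusterings and so costs O(n^2); the edge-counting machinery developed next reduces it to O(n).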


General Algorithm Idea
Use state-space search methods to optimize the criterion function, alternating two local moves:
• Keep the ordering fixed and restructure the tree optimally (NP-hard?)
• Keep the structure fixed and reorder the split order of the internal nodes optimally (a scheduling algorithm)

Analogy with k-means:
• Keep the partition fixed and recompute the means
• Keep the means fixed and optimize the partitioning

The Graft
Keep the ordering fixed and restructure the tree optimally. [Figure: in a tree with root r and leaves a, b, the subtree rooted at h (containing i, j, k) is detached and re-attached at a different edge.]

Making the graft efficient
Key idea: rewrite the hierarchical clustering cost locally, as a sum of contributions from individual tree edges. This can be done by exploiting special properties of squared Euclidean distance.

Lemma: Suppose cluster C has mean \mu_C; then for any other point \mu,

    \sum_{x \in C} \|x - \mu\|^2 = \sum_{x \in C} \|x - \mu_C\|^2 + |C| \cdot \|\mu - \mu_C\|^2

Applying this to a node with mean \mu whose children are the left tree L (mean \mu_L) and the right tree R (mean \mu_R):

    cost*(L \cup R) = \sum_{x \in L} \|x - \mu\|^2 + \sum_{x \in R} \|x - \mu\|^2
                    = cost*(L) + |L| \cdot \|\mu - \mu_L\|^2 + cost*(R) + |R| \cdot \|\mu - \mu_R\|^2

[Figure: the edges from the parent down to L and R, labeled \|\mu - \mu_L\|^2 and \|\mu - \mu_R\|^2.]

Calculating the k-means cost by counting tree edges
Recall

    hcost = \sum_{k=1}^{n} w_k \, cost(C_k)

Applied recursively, the lemma shows this is in fact just a linear combination of the squared lengths \|\mu_P - \mu_C\|^2 of tree edges, where P is the parent of C. The coefficient of the edge between P and C is

    |T_C| \cdot \sum_{i=1}^{\sigma(P)-1} w_i

where T_C is the subtree rooted at C and \sigma(P) is the time at which P is split. This yields O(n) computation time.

Optimal location for a sub-tree
Pick any subtree. What's the best place to move it (using a graft), while respecting split numbers? It can be determined in just O(n) time:
• Upward pass: compute the change in cost due to removing the subtree
• Downward pass: calculate the change in cost for each potential new location
(The lemma and the resulting edge decomposition are verified numerically below.)
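A quick numerical check of the lemma and of the parent-children decomposition it yields, on random data; this verification script is mine, not part of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.normal(size=(7, 3))      # a cluster of 7 points in R^3
mu = rng.normal(size=3)          # an arbitrary other point
mu_C = C.mean(axis=0)

# Lemma: sum ||x - mu||^2 = sum ||x - mu_C||^2 + |C| * ||mu - mu_C||^2
lhs = np.sum((C - mu) ** 2)
rhs = np.sum((C - mu_C) ** 2) + len(C) * np.sum((mu - mu_C) ** 2)
assert np.isclose(lhs, rhs)

# Decomposition: cost*(L u R) = cost*(L) + |L| ||mu - mu_L||^2
#                             + cost*(R) + |R| ||mu - mu_R||^2,
# where mu (= mu_C) is the mean of L u R.
L, R = C[:3], C[3:]
cost = lambda A: np.sum((A - A.mean(axis=0)) ** 2)
via_edges = (cost(L) + len(L) * np.sum((mu_C - L.mean(axis=0)) ** 2) +
             cost(R) + len(R) * np.sum((mu_C - R.mean(axis=0)) ** 2))
assert np.isclose(cost(C), via_edges)
print("lemma and edge decomposition hold")
```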

Reordering internal nodes
Keep the structure fixed and reorder the split order of the internal nodes optimally. [Figure: the same tree over leaves a, b, c, d with its internal nodes renumbered, e.g. split order (2, 3, 1, 4) becoming (2, 4, 1, 3).]

Optimal internal node ordering
Key idea: each split in the tree reduces the clustering cost (i.e., gives a finer approximation to the data). We want to place the best splits early on; this is a scheduling problem! (board example)

Optimal split ordering
Let g(·) be the reduction in cost obtained by splitting the cluster rooted at x, whose subtree T_x has children T_L and T_R:

    g(T_x) = cost(T_x) - (cost(T_L) + cost(T_R))

The overall hierarchical clustering cost is a time-weighted sum of these:

    hcost = \frac{1}{n} \sum_{x \in V} \sigma(x) \, g(x)

where \sigma(x) is the time (1 to n) at which T_x is split.

A scheduling algorithm
Choose the order \sigma(·) of the splits. A greedy choice based on g(·) alone doesn't work: a split whose immediate benefit is small might nevertheless enable future good splits. Horn [1972]: the next subtree T to split is the one which maximizes

    \frac{1}{|T|} \sum_{z \in T} g(z)

This can be implemented in O(n log n) time. (A naive sketch of the rule appears below.)
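A compact sketch of Horn's selection rule on an explicit gain-labeled tree. The Node scaffolding is mine, |T| is taken here as the number of splits in the subtree (a simplification of the slide's notation), and this naive version rescans all candidates instead of achieving the O(n log n) bound.

```python
class Node:
    def __init__(self, gain, left=None, right=None):
        self.gain = gain                    # g(x): cost reduction of this split
        self.left, self.right = left, right

def splits(t):
    # all internal (splittable) nodes in the subtree rooted at t
    if t is None or t.left is None:
        return []
    return [t] + splits(t.left) + splits(t.right)

def density(t):
    # Horn's score: average gain over all splits inside the subtree
    s = splits(t)
    return sum(z.gain for z in s) / len(s)

def horn_order(root):
    """Repeatedly perform the available split of maximum average gain.
    A naive O(n^2) rendering of the O(n log n) scheduling rule."""
    order, frontier = [], [root]
    while any(t.left is not None for t in frontier):
        t = max((t for t in frontier if t.left is not None), key=density)
        order.append(t.gain)
        frontier.remove(t)
        frontier += [t.left, t.right]
    return order

# The gain-1 split unlocks a gain-10 split, so Horn's rule schedules it
# before the standalone gain-2 split, while a greedy rule would not.
leaf = lambda: Node(0)
tree = Node(5,
            Node(1, Node(10, leaf(), leaf()), leaf()),
            Node(2, leaf(), leaf()))
print(horn_order(tree))     # -> [5, 1, 10, 2]
```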

Final Algorithm
Iterate between the two local moves:
• Keep the structure fixed and reorder the split order of the internal nodes optimally: O(n log n)
• Optimally relocate r sub-trees: O(r · n), since each relocation takes O(n) time (and r < n)

Example
[Figure: a five-leaf tree over a, b, c, d, e, before and after a graft.]
Initial tree: cost 62.25. After the graft: cost 27.0.

Example, continued
After reordering: cost 25.5. After one more graft, the final tree: cost 21.0.

Improvement over average linkage

                                          20 points         100 points
  Average linkage                         21.99             444.15
  Iterative algorithm, initialized        21.59             411.83
    with average linkage                  (8 iterations)    (233 iterations)
  Iterative algorithm, initialized        Best: 20.88       Best: 430.35
    randomly                              Worst: 24.56      Worst: 449.38
  Annealed algorithm, initialized         Best: 20.88       N/A
    with average linkage

• Random and annealed methods find better optima
• Simulated annealing is much more consistent than local search with random initialization
• The number of iterations needed to converge grows with the data size

[Figure: improvement over average linkage; k-means score of each k-clustering vs. % improvement.]
[Figure: weighted (40-60) vs. uniform weights; k-means score of each k-clustering vs. % change in cost.]
[Figure: convergence plot.]

Applications
• Gene Expression Analysis
• Microarray Data
• Image Segmentation
• Disease Classification

Summary
There is a basic existence question about hierarchical clustering which needs to be addressed: must there always exist a hierarchical clustering in which, for each k, …