# FSU CIS 5930r - Cluster Analysis (13 pages)


## Cluster Analysis


Pages: 13

School: Florida State University

Course: CIS 5930r - Selected Topics in Computer Science (13)

- What is Cluster Analysis
- Types of Data in Cluster Analysis
- A Categorization of Major Clustering Methods
- Partitioning Methods
- Hierarchical Methods
- Density-Based Methods
- Grid-Based Methods
- Model-Based Clustering Methods
- Outlier Analysis
- Summary

### Hierarchical Clustering

- Uses the distance matrix as the clustering criterion.
- This method does not require the number of clusters k as an input, but it needs a termination condition.

[Figure: objects a, b, c, d, e are merged step by step (Step 0 to Step 4) into ab, de, cde, and finally abcde by agglomerative clustering (AGNES); divisive clustering (DIANA) runs the same steps in reverse, from Step 4 back to Step 0.]

### AGNES (Agglomerative Nesting)

- Implemented in statistical analysis packages, e.g., Splus.
- Uses the single-link method and the dissimilarity matrix.
- Merges the objects that have the least dissimilarity.
- Proceeds in a non-descending fashion.
- Eventually all objects belong to the same cluster.

[Figure: three scatter plots, both axes from 0 to 10, showing successive merges.]

Single link: each time, merge the clusters (C1, C2) that are connected by the shortest single link between objects, i.e., $\min_{p \in C_1,\, q \in C_2} \operatorname{dist}(p, q)$.

### A Dendrogram Shows How the Clusters are Merged Hierarchically

- Decompose the data objects into several levels of nested partitioning (a tree of clusters), called a dendrogram.
- A clustering of the data objects is obtained by cutting the dendrogram at the desired level; each connected component then forms a cluster.
- E.g., level 1 gives 4 clusters: {a, b}, {c}, {d}, {e}; level 2 gives 3 clusters: {a, b}, {c}, {d, e}; level 3 gives 2 clusters: {a, b}, {c, d, e}; etc.

[Figure: dendrogram over objects a, b, c, d, e with cut levels 1 to 4.]

### DIANA (Divisive Analysis)

- Implemented in statistical analysis packages, e.g., Splus.
- Inverse order of AGNES.
- Eventually each node forms a cluster on its own.

[Figure: three scatter plots, both axes from 0 to 10, showing successive splits.]

### More on Hierarchical Clustering Methods

- Major weakness of agglomerative clustering methods: they do not scale well; time complexity of …
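The single-link merging rule and the dendrogram levels described above can be illustrated with a short sketch. This is a naive illustration, not the Splus implementation the slides mention: the function names `single_link` and `agnes` and the point coordinates are assumptions, with the coordinates chosen so that the merges follow the ab, de, cde, abcde order of the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def single_link(c1, c2, points):
    """Single-link dissimilarity: least distance between any p in c1 and q in c2."""
    return min(dist(points[p], points[q]) for p in c1 for q in c2)


def agnes(points):
    """Naive agglomerative sketch using the single-link rule.

    `points` maps a label to a coordinate tuple. Returns one partition per
    step, starting with all singletons and ending with a single cluster.
    """
    clusters = [frozenset([label]) for label in sorted(points)]
    history = [list(clusters)]
    while len(clusters) > 1:
        # Pick the pair of clusters with the least single-link dissimilarity.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: single_link(pair[0], pair[1], points),
        )
        # Replace the two chosen clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        history.append(list(clusters))
    return history


# Hypothetical coordinates (an assumption, not data from the slides).
points = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (7.0, 0.0),
    "d": (10.0, 0.0),
    "e": (11.5, 0.0),
}

for step, partition in enumerate(agnes(points)):
    print(step, sorted("".join(sorted(c)) for c in partition))
```

Each entry of the returned history corresponds to cutting the dendrogram at one level: with these coordinates, step 1 has 4 clusters, step 2 has 3, and step 3 has 2. The sketch recomputes every pairwise cluster distance at each step, so it is only suitable for small inputs.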
