UT Dallas CS 6375 - clustering

Slide 1: Clustering
Machine Learning, CS 6375, Spring 2015
Instructor: Yang Liu
Acknowledgement: slides adapted from those of Vincent Ng.

Slide 2: Clustering
• Partition unlabeled examples into disjoint subsets (clusters), such that:
  – examples within a cluster are very similar;
  – examples in different clusters are very different.
• Discover new categories in an unsupervised manner (no sample category labels provided).

Slide 3: Supervised Learning
• Decision trees
• Artificial neural nets
• Naïve Bayes
• K-nearest neighbor
• Linear regression
• Logistic regression
• Support vector machines
• ...

Slide 4: Supervised Learning
• F(x): true function (usually not known)
• D: training sample drawn from F(x), with the class label in the last column:

57,M,195,0,125,95,39,25,0,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0  0
78,M,160,1,130,100,37,40,1,0,0,0,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0  1
69,F,180,0,115,85,40,22,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0  0
18,M,165,0,110,80,41,30,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  0
54,F,135,0,115,95,39,35,1,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0  1
84,F,210,1,135,105,39,24,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0  0
89,F,135,0,120,95,36,28,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0  1
49,M,195,0,115,85,39,32,0,0,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0  0
40,M,205,0,115,90,37,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  0
74,M,250,1,130,100,38,26,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0  0
77,F,140,0,125,100,40,30,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1  1
…

Slide 5: Supervised Learning
• F(x): true function (usually not known)
• D: training sample drawn from F(x) (first rows as on the previous slide)
• G(x): model learned from training sample D, applied to a new, unlabeled instance:

71,M,160,1,130,105,38,20,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0  ?

• Goal: $E[(F(x) - G(x))^2]$ is small (near zero) for future samples drawn from F(x).

Slide 6: Supervised Learning
• Well-defined goal: learn a G(x) that is a good approximation to F(x) from the training sample D.
• Know how to measure error: accuracy, ...

Slide 7: Clustering = Unsupervised Learning

Slide 8: Unsupervised Learning
• Train set: the sample rows shown on slide 4.
• Test set: 71,M,160,1,130,105,38,20,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0  ?
• Sometimes we need to apply clustering results to new, unseen instances.

Slide 9: Clustering
• Useful for discovering data structure and patterns.
• Results depend on the algorithm.
• Issues:
  – What defines a cluster?
  – Similarity measure?
  – How many clusters?
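
To make slide 5's goal concrete, here is a minimal sketch (not from the slides) that estimates $E[(F(x) - G(x))^2]$ empirically. The true function F, the learned model G, and the input distribution are all toy assumptions for illustration; in practice F is unknown and G comes from a learning algorithm.

import random

def F(x):
    """Hypothetical 'true' function (in practice not known)."""
    return 3.0 * x + 1.0

def G(x):
    """Hypothetical model learned from a training sample D (imperfect on purpose)."""
    return 2.9 * x + 1.2

# Estimate E[(F(x) - G(x))^2] on fresh samples drawn from the input distribution.
samples = [random.uniform(0.0, 10.0) for _ in range(10_000)]
mse = sum((F(x) - G(x)) ** 2 for x in samples) / len(samples)
print(f"estimated E[(F(x) - G(x))^2] = {mse:.4f}")  # near zero means G approximates F well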
Slide 10: Clustering is subjective
• What is a natural grouping among these objects?
• [Figure: the same cartoon people grouped in different ways: as school employees vs. the Simpson's family, and as males vs. females.]

Slide 11: Hierarchical Clustering
• Build a tree-based hierarchical taxonomy from a set of unlabeled examples.
• Recursive application of a standard clustering algorithm can produce a hierarchical clustering.
• [Figure: a taxonomy rooted at "animal", splitting into "vertebrate" (fish, reptile, amphibian, mammal) and "invertebrate" (worm, insect, crustacean).]

Slide 12: Agglomerative vs. Divisive Clustering
• Agglomerative (bottom-up) methods start with each example in its own cluster and iteratively combine clusters to form larger and larger ones.
• Divisive (top-down) methods start from a single cluster and repeatedly split clusters.

Slide 13: Hierarchical Agglomerative Clustering (HAC)
• Assumes a similarity function for determining the similarity of two instances.
• Starts with each instance in a separate cluster and repeatedly joins the two most similar clusters until only one cluster remains.
• The history of merging forms a binary tree (hierarchy).
• Basic algorithm (a code sketch follows at the end of this preview):
  – Start with each instance in its own cluster.
  – Until there is only one cluster:
    – Among the current clusters, determine the two clusters $c_i$ and $c_j$ that are most similar.
    – Replace $c_i$ and $c_j$ with a single cluster $c_i \cup c_j$.

Slide 14: Similarity/Distance Metrics
For two instances $\vec{x}$ and $\vec{y}$ with $m$ features each:
• Euclidean distance (L2 norm): $L_2(\vec{x}, \vec{y}) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$
• L1 norm: $L_1(\vec{x}, \vec{y}) = \sum_{i=1}^{m} |x_i - y_i|$
• Cosine similarity: $sim(\vec{x}, \vec{y}) = \cos(\vec{x}, \vec{y}) = \frac{\vec{x} \cdot \vec{y}}{|\vec{x}|\,|\vec{y}|}$

Slide 15: Cluster Similarity
• How do we compute the similarity of two clusters, each possibly containing multiple instances?
  – Single link: similarity of the two most similar members.
  – Complete link: similarity of the two least similar members.
  – Group average: average similarity between members.

Slide 16: Single Link Agglomerative Clustering
• Use the maximum similarity over pairs:
  $sim(c_i, c_j) = \max_{x \in c_i,\, y \in c_j} sim(x, y)$
• Can result in "straggly" (long and thin) clusters due to a chaining effect.

Slide 17: Single Link Example
• [Figure: dendrogram produced by single-link clustering.]

Slide 18: Complete Link Agglomerative Clustering
• Use the minimum similarity over pairs:
  $sim(c_i, c_j) = \min_{x \in c_i,\, y \in c_j} sim(x, y)$
• Makes "tighter," more spherical clusters that are typically preferable.

Slide 19: Complete Link Example
• [Figure: dendrogram produced by complete-link clustering.]

Slide 20: Computing Cluster Similarity
After merging $c_i$ and $c_j$, the similarity of the resulting cluster to any other cluster $c_k$ can be computed by:
• Single link: $sim(c_i \cup c_j, c_k) = \max(sim(c_i, c_k),\, sim(c_j, c_k))$
• Complete link: $sim(c_i \cup c_j, c_k) = \min(sim(c_i, c_k),\, sim(c_j, c_k))$

Slide 21: Group Average Agglomerative Clustering
• Use the average similarity across all pairs within the merged cluster to measure the similarity of two clusters.
• A compromise between single and complete link.
• Averaged over all ordered pairs in the merged cluster, rather than over unordered pairs between the two clusters:
  $sim(c_i, c_j) = \frac{1}{|c_i \cup c_j|\,(|c_i \cup c_j| - 1)} \sum_{\vec{x} \in c_i \cup c_j} \sum_{\vec{y} \in c_i \cup c_j,\, \vec{y} \neq \vec{x}} sim(\vec{x}, \vec{y})$

Slide 22: Hierarchical Clustering
• No need to specify the number of clusters in advance.
• The hierarchical structure maps nicely onto human intuition for some domains.
• May be used to find outliers.
• Does not scale well.
• Local optima.
• Interpretation of results is subjective.

Slide 23: Non-Hierarchical Clustering
• Types of non-hierarchical (partitional) clustering:
  – Hard clustering (e.g., K-means clustering)
  – Soft clustering (e.g., EM)
• Typically must …
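
As a companion to the slide-14 formulas, here is a minimal sketch of the three metrics in plain Python. The function names are my own; vectors are plain lists of floats, assumed to have the same length and, for cosine similarity, nonzero norm.

import math

def l2_distance(x, y):
    """Euclidean (L2) distance: square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def l1_distance(x, y):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

def cosine_similarity(x, y):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y)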
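
And here is the HAC loop from slide 13 with the linkage criteria from slides 15, 16, 18, and 21, as a deliberately naive sketch (roughly cubic time) that recomputes pair similarities on every merge. It reuses the hypothetical cosine_similarity helper above; an efficient implementation would instead cache pairwise similarities and apply the slide-20 update rules after each merge.

def cluster_similarity(ci, cj, sim, linkage):
    """Similarity of clusters ci and cj (lists of vectors) under a linkage criterion."""
    pair_sims = [sim(x, y) for x in ci for y in cj]
    if linkage == "single":    # similarity of the two most similar members
        return max(pair_sims)
    if linkage == "complete":  # similarity of the two least similar members
        return min(pair_sims)
    if linkage == "average":   # average over ordered pairs in the merged cluster (slide 21)
        merged = ci + cj
        sims = [sim(x, y) for x in merged for y in merged if x is not y]
        return sum(sims) / len(sims)
    raise ValueError(f"unknown linkage: {linkage}")

def hac(instances, sim, linkage="single"):
    """Repeatedly merge the two most similar clusters; return the merge history."""
    clusters = [[x] for x in instances]  # start with each instance in its own cluster
    history = []
    while len(clusters) > 1:
        # Among the current clusters, find the pair (ci, cj) that is most similar.
        i, j = max(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: cluster_similarity(clusters[ab[0]], clusters[ab[1]], sim, linkage),
        )
        history.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]  # replace ci and cj with ci ∪ cj
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Tiny usage example: three 2-D points, complete-link HAC with cosine similarity.
points = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0]]
for ci, cj in hac(points, cosine_similarity, linkage="complete"):
    print("merge:", ci, "+", cj)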

