Clustering and the k-means Algorithm
David M. Blei
COS424
Princeton University
March 11, 2007

D. Blei Clustering 01 1 / 32

Clustering
• Goal: Automatically segment data into groups of similar points
• Question: When and why would we want to do this?
• Useful for:
  • Automatically organizing data
  • Understanding hidden structure in some data
  • Representing high-dimensional data in a low-dimensional space
• Examples:
  • Customers according to purchase histories
  • Genes according to expression profile
  • Search results according to topic
  • MySpace users according to interests
  • A museum catalog according to image similarity

D. Blei Clustering 01 2 / 32

Clustering set-up
• Our