UCI ICS 273A - Image Segmentation using k-means clustering, EM and Normalized Cuts - D132340

Home> Schools> University of California, Irvine> (ICS) > ICS 273A> Image Segmentation using k-means clustering, EM and Normalized Cuts

DOC PREVIEW

UCI ICS 273A - Image Segmentation using k-means clustering, EM and Normalized Cuts

School name University of California, Irvine

Course Ics 273a- Machine Learning

Pages 7

This preview shows page 1-2 out of 7 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 7 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 7 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 7 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Image Segmentation using k-means clustering,EM and Normalized CutsSuman TatirajuDepartment of EECSUniversity Of California - IrvineIrvine, CA [email protected] MehtaDepartment of EECSUniversity Of California - IrvineIrvine, CA [email protected] project addresses the problem of segmenting an image into differ-ent regions. We analyze two unsupervised learning algorithms namelythe K-means and EM and compare it with a graph based algorithm, theNormalized Cut algorithm. The K-means and EM are clustering algo-rithms,which partition a data set into clusters according to some defineddistance measure. The Normalized Cut criterion takes a measure of thesimilarity between data elements of a group and the dissimilarity betweendifferent groups for segmenting the images.1 IntroductionImages are considered as one of the most important medium of conveying information.Understanding images and extracting the information from them such that the informationcan be used for other tasks is an important aspect of Machine learning. An example of thesame would be the use of images for navigation of robots. Other applications like extract-ing malign tissues from body scans etc form integral part of Medical diagnosis. One of thefirst steps in direction of understanding images is to segment them and find out differentobjects in them. To do this, features like the histogram plots and the frequency domaintransform can be used. In this project, we look at three algorithms namely K Means clus-tering, Expectation Maximization and the Normalized cuts and compare them for imagesegmentation. The comparison is based on various error metrics and time complexity ofthe algorithms. It has been assumed that the number of segments in the image are knownand hence can be passed to the algorithm.The report is organized as follows.Section 2 describes each segmentation algorithm in de-tail. Results generated from the algorithms are presented in section 3. Finally, section 4draws some conclusions.2 Image Segmentation AlgorithmsImages can be segmented into regions by the following algorithms:2.1 K-means Clustering AlgorithmK-Means algorithm is an unsupervised clustering algorithm that classifies the input datapoints into multiple classes based on their inherent distance from each other. The algorithmassumes that the data features form a vector space and tries to find natural clustering inthem. The points are clustered around centroids µi∀i = 1 . . . k which are obtained byminimizing the objectiveV =kXi=1Xxj∈Si(xj− µi)2(1)where there are k clusters Si, i = 1, 2, . . . , k and µiis the centroid or mean point of all thepoints xj∈ SiAs a part of this project, an iterative version of the algorithm was implemented. The algo-rithm takes a 2 dimensional image as input. Various steps in the algorithm are as follows:1. Compute the intensity distribution(also called the histogram) of the intensities.2. Initialize the centroids with k random intensities.3. Repeat the following steps until the cluster labels of the image does not changeanymore.4. Cluster the points based on distance of their intensities from the centroid intensi-ties.c(i):= arg minj||x(i)− µj||2(2)5. Compute the new centroid for each of the clusters.µi:=Pmi=11{c(i)= j}x(i)Pmi=11{c(i)= j}(3)where k is a parameter of the algorithm (the number of clusters to be found), i iterates overthe all the intensities, j iterates over all the centroids and µiare the centroid intensities.2.2 EM AlgorithmExpectation Maximization(EM) is one of the most common algorithms used for densityestimation of data points in an unsupervised setting. The algorithm relies on finding themaximum likelihood estimates of parameters when the data model depends on certain latentvariables. In EM, alternating steps of Expectation (E) and Maximization (M) are performediteratively till the results converge.The E step computes an expectation of the likelihood byincluding the latent variables as if they were observed, and a maximization (M) step, whichcomputes the maximum likelihood estimates of the parameters by maximizing the expectedlikelihood found on the last E step[1]. The parameters found on the M step are then usedto begin another E step, and the process is repeated until convergence.Mathematically for a given training dataset {x(1), x(i2, . . . x(m)} and model p(x, z) wherez is the latent variable, We have:l(θ) =mX=1log p(x; θ) (4)=mXi=1logXzp(x, z; θ) (5)As can be seen from the above equation, The log likelihood is described in terms of x, zand θ. But since z, the latent variable is not known, We use approximations in its place.These approximations take the form of E & M steps mentioned above and formulatedmathematically below.E Step, for each i:Qi(z(i)) := p(z(i)|x(i); θ) (6)M Step, for all z:θ := arg maxθXiXz(i)Qi(z(i)) logp(x(i), z(i); θ)Qi(z(i))(7)where Qiis the posterior distribution of z(i)’s given the x(i)’s.Conceptually, The EM algorithm can be considered as a variant of the K Means algorithmwhere the membership of any given point to the clusters is not complete and can be frac-tional.2.3 Normalized Cuts-A graph Partitioning approachImage segmentation can also be viewed as an optimal partitioning of a graph. The image ispresented as a weighted undirected graph G = (V, E). This image graph can be partitionedinto two subgraphs A and B by modeling the partition as minimizing the cut as definedbelow:cut(A, B) =XuA,vBw(u, v) (8)where w(i, j), the weight of each edge is a function of the similarity between nodes iand j.However the minimum cut criteria favors cutting small sets of isolated nodes in thegraph.To overcome these outliers we can use a modified cost function, Normalized Cut asdefined below.Ncut(A, B) =cut(A, B)assoc(A, V )+cut(A, B)assoc(B, V ), assoc(A, V ) =XuA,tVw(u, t) (9)The association value, assoc(A,V), is the total connection from nodes A to all nodes in thegraph. Ncut value won’t be small for the cut that partitions isolating points, because the cutvalue will be a large percentage of the total connection from that set to the others.Given a partition of a graph V into two disjoint complementary sets A and B, let x bean N = |V | dimensional indication vector, xi= 1 if node i is in A, -1 otherwise. Letdi=PjW (i, j) be the total connection from node i to all other nodes. Ncuts can berewritten as (3).Ncut(A, B) =Pxi>0,xj<0−wijxixjPxi>0di+Pxi<0,xj>0−wijxixjPxi<0di(10)Let D = diag(d1, d2....dN) be an NxN diagonal matrix and W

View Full Document