SPARSE REPRESENTATION OF IMAGES WITH HYBRID LINEAR MODELS

This preview shows page 1 out of 4 pages.

Kun Huang, Allen Y. Yang, Yi Ma
Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801
Email: {kunhuang, yangyang, yima}@uiuc.edu

ABSTRACT

We propose a mixture of multiple linear models, also known as hybrid linear model, for a sparse representation of an image. This is a generalization of the conventional Karhunen-Loeve transform (KLT) or principal component analysis (PCA). We provide an algebraic algorithm based on generalized principal component analysis (GPCA) that gives a global and non-iterative solution to the identification of a hybrid linear model for any given image. We demonstrate the efficiency of the proposed hybrid linear model by experiments and comparison with other transforms such as the KLT, DCT, and wavelet transforms. Such an efficient representation can be very useful for later stages of image processing, especially in applications such as image segmentation and image compression.

1. INTRODUCTION

In image processing, one often seeks a more efficient representation for a digital image than the pixel-based representation. A typical approach is to divide the image into a set of blocks. If there is strong statistical correlation between the blocks, they can be represented as the superposition of a much smaller number of components, also known as the sparse components [17]. Traditional approaches to representing an image by a different basis include using the discrete Fourier transform (DFT) or the discrete cosine transform (DCT). A major drawback for these transforms is that, despite significantly different statistics in different images, the bases of these transforms are fixed. So there are reasons to believe that a more efficient representation can be achieved with the so-called adaptive basis [20]. One method to obtain such an adaptive basis is via the Karhunen-Loeve transform (KLT), also known as principal component analysis (PCA) in machine learning.
If a set of (random) vectors obey a common second-order statistical model, the basis (or the linear model) identified by the KLT is in fact optimal [10, 23]. But it will lose its optimality if a single image contains regions with significantly different textures that cannot be described by a single linear statistical model, which is, unfortunately, typically the case for a generic image.

This brings up the fundamental problem that this paper is about to address: How can we simultaneously segment the image into different regions and estimate an adaptive basis for each region? That is, to fit a mixture of linear models, also known as hybrid linear model, to the image, but without knowing a priori how many linear models to use, the dimension of each model, or which model applies to which blocks.

In the machine learning literature, the problem of simultaneously estimating a mixture of models and segmenting data into respective models was usually resolved via an incremental scheme that iterates between segmentation and estimation, e.g., the expectation maximization (EM) method. It has only recently been discovered that a non-iterative and global solution, called generalized principal component analysis (GPCA) [24, 13], exists for the segmentation and estimation of hybrid linear models. The idea that image segmentation may improve image compression is not new. However, we believe that GPCA is a method that can seamlessly combine these two key components in image processing. It offers the new capability to represent different image regions with different colors and textures by different linear models with different linear bases.

Relation to prior work. There is a vast amount of literature on finding adaptive bases (or transforms) for signals. Adaptive wavelet transforms and adapted wavelet packets have been extensively studied [4, 20, 15, 6, 18]. The idea is to search for an optimal transform from a limited (although large) set of possible transforms.
Another approach is to find some universal optimal transform based on the signals [10, 19, 6]. Spatially adapted bases have also been developed (e.g., [2, 21, 16]). The main purpose of this paper is to show that an adaptive basis based on a single model (such as the KLT and PCA) is not necessarily optimal for an efficient image representation, and hybrid models may be a much better choice.

The notion of a mixture of linear models and bases for image representation is closely related to sparse component analysis [17], whose goal is to identify a set of non-orthogonal base vectors for natural images such that the representation of the images is sparse. In the related work of [3, 11, 12, 8, 22], the main goal is to find a mixture of models such that the signals can be decomposed into multiple models and their overall representation is sparse. In that approach, the signals are expressed as a linear superposition of all the models while, in this paper, the signals will be segmented into mutually exclusive groups, and a sparse representation is found for each group.

Image segmentation based on local color and texture information extracted from various filter banks has been studied extensively in the computer vision literature (e.g., [1, 14, 7]). Since MPEG-4 has started to incorporate texture segmentation [9], we expect that the concept and method introduced in this paper will be useful for developing new image and video processing techniques.

2. REPRESENTATION OF AN IMAGE WITH A HYBRID LINEAR MODEL

In this section, we introduce the notion of hybrid linear models for images. Normally, we divide a digital image $I$ into a set of, say $N$, non-overlapping equal-size $l \times l$ blocks, that is, $I = \cup_{j=1}^{N} B_j$. Denote the number of color channels of the image by $c$.
For grayscale images $c = 1$, and for color images $c = 3$ (e.g., RGB, HSV, or YCbCr). Then we may represent each block $B$ by a vector $x$ that collects the pixel values, and the dimension of the vector $x$ is $K = c l^2$.

As we have contended in the introduction, a single linear model has its limitations when applied to a generic image, which often consists of regions with significantly different textures. However, it is reasonable to assume that a linear model is still valid at least for each region, if we know how to segment the image into such regions. That is, we assume that the image blocks can be segmented into multiple groups: $X = \cup_{i=1}^{n} X_i$, and for each group $X_i$ there exists a basis $B_i = \{b_{ij}\}_{j=1}^{k_i}$ such that $x = \sum_{j=1}^{k_i} \alpha_j b_{ij}$ if $x \in X_i$. We denote the subspace spanned by the basis $B_i$ by $S_i \doteq \mathrm{span}(B_i)$, $i = 1, 2, \ldots, n$. Let $k_i$ be the
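
The block-vector construction and the per-group bases described in this section can be illustrated with a minimal NumPy sketch. It is not the GPCA algorithm of the paper: it assumes the segmentation labels and the dimensions $k_i$ are already known, and it fits each basis by a plain SVD (which yields an orthonormal basis, one convenient choice among many). The function names `image_to_block_vectors`, `fit_group_bases`, and `reconstruct` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def image_to_block_vectors(image, l):
    """Cut an H x W (x c) image into non-overlapping l x l blocks.

    Returns an N x K array with one row per block, K = c * l**2.
    Assumes H and W are multiples of l (an assumption of this sketch).
    """
    if image.ndim == 2:                      # grayscale: add a channel axis, c = 1
        image = image[:, :, None]
    H, W, c = image.shape
    if H % l or W % l:
        raise ValueError("image height and width must be multiples of l")
    # (H/l, l, W/l, l, c) -> (H/l, W/l, l, l, c): one l x l x c tile per block
    blocks = image.reshape(H // l, l, W // l, l, c).swapaxes(1, 2)
    return blocks.reshape(-1, c * l * l)


def fit_group_bases(X, labels, dims):
    """Fit one linear basis per group, given a known segmentation.

    X      : N x K block vectors
    labels : length-N integer array, labels[j] = group index of block j
    dims   : dict mapping group index i -> subspace dimension k_i
    Returns a dict mapping i -> (k_i x K) matrix whose rows are basis vectors.
    """
    bases = {}
    for i, k in dims.items():
        Xi = X[labels == i].astype(float)
        # Right singular vectors span the row space of the group's data;
        # no mean is subtracted because the model is a linear subspace.
        _, _, Vt = np.linalg.svd(Xi, full_matrices=False)
        bases[i] = Vt[:k]
    return bases


def reconstruct(X, labels, bases):
    """Project every block vector onto the basis of its own group."""
    X_hat = np.zeros(X.shape, dtype=float)
    for i, B in bases.items():
        mask = labels == i
        alpha = X[mask] @ B.T                # coefficients alpha_j for each block
        X_hat[mask] = alpha @ B              # x = sum_j alpha_j * b_ij
    return X_hat
```

With an 8 x 8 RGB image and `l = 4`, `image_to_block_vectors` returns a 4 x 48 array, matching $N = 4$ and $K = c l^2 = 3 \cdot 16$. When every group truly lies in a subspace of dimension $k_i$, `reconstruct` reproduces the block vectors up to floating-point error while storing only $k_i$ coefficients per block.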

