Primal Sketch: Integrating Structure and Texture ⋆

Cheng-en Guo, Song-Chun Zhu, and Ying Nian Wu

Departments of Statistics and Computer Science
University of California, Los Angeles
Los Angeles, CA 90095

Abstract

This article proposes a generative image model, which is called “primal sketch,” following Marr’s insight and terminology. This model combines two prominent classes of generative models, namely, sparse coding model and Markov random field model, for representing geometric structures and stochastic textures respectively. Specifically, the image lattice is divided into structure domain and texture domain. The sparse coding model is used to represent image intensities on the structure domain, where edge and ridge segments are modeled by image coding functions with explicit geometric and photometric parameters. The edge and ridge segments form a sketch graph whose nodes are corners and junctions. The sketch graph is governed by a simple spatial prior model. The Markov random field model is used to summarize image intensities on the texture domain, where the texture patterns are characterized by feature statistics in the form of marginal histograms of responses from a set of linear filters. The Markov random fields in-paint the texture domain while interpolating the structure domain seamlessly. A sketch pursuit algorithm is proposed for model fitting. A number of experiments on real images are shown to demonstrate the model and the algorithm.

Key words: Sparse coding, Markov random fields, Image primitives, Sketch graphs, Lossy image coding

⋆ We thank Arthur Pece for pointing out the connection with vector quantization. We also thank him and an anonymous referee for detailed comments and suggestions that have greatly improved the presentation of the paper. We thank Alan Yuille, Zhuowen Tu, Feng Han, and Yizhou Wang for insightful discussions.
The work is supported by NSF IIS-0222967.

Email address: cguo,sczhu,[email protected] (Cheng-en Guo, Song-Chun Zhu, and Ying Nian Wu).

Preprint submitted to Elsevier Science 13 October 2005

1 Introduction

Geometric structures and stochastic textures are two ubiquitous classes of visual phenomena in natural scenes. Geometric structures appear simple, and can be represented by edges, ridges, and their compositions such as corners and junctions. Stochastic textures appear complex, and are often characterized by feature statistics. Despite their apparent distinctions, texture impressions are often caused by a large number of object structures that are either too small or too distant relative to camera resolution. Moreover, with the change of viewing distance or camera resolution, the same group of objects may appear either as structures or textures. It is therefore desirable to integrate structures and textures in a common representational and computational framework.

In this article, a generative model called “primal sketch” is proposed, following the insight and terminology of Marr [15]. The model combines two prominent classes of generative models. One is the sparse coding model, for representing geometric structures. The other is the Markov random field model, for characterizing stochastic textures.

Specifically, the image lattice is divided into structure domain and texture domain. The sparse coding model is used to represent image intensities on the structure domain, where the most common structures are boundaries of objects that are above a certain scale. Following Elder and Zucker [7], we model the image intensities of object boundaries by a small number of edge and ridge coding functions with explicit geometric and photometric parameters. For instance, an edge segment is modeled by an elongate step function convolved with a Gaussian kernel. A ridge segment is a composition of two parallel edge segments. These edge and ridge segments form a sketch graph, whose nodes are corners and junctions.
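
The edge and ridge coding functions described above can be illustrated with a minimal sketch. It relies on one standard fact: a step function convolved with a Gaussian kernel is the Gaussian cumulative distribution function, so the cross-section of a blurred edge has a closed form. The function names, the parameters (`left`, `middle`, `right`, `width`, `sigma`, `theta`), and the three-level ridge profile are illustrative assumptions, not the parameterization used by the authors.

```python
import math


def gaussian_cdf(x, sigma):
    """Unit step convolved with a zero-mean Gaussian of std `sigma`, at x."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def edge_profile(x, left, right, sigma):
    """Blurred step: intensity moves from `left` to `right` across x = 0."""
    return left + (right - left) * gaussian_cdf(x, sigma)


def ridge_profile(x, left, middle, right, width, sigma):
    """Two parallel blurred steps, at x = -width/2 and x = +width/2."""
    half = width / 2.0
    return (left
            + (middle - left) * gaussian_cdf(x + half, sigma)
            + (right - middle) * gaussian_cdf(x - half, sigma))


def edge_patch(height, width, theta, left, right, sigma):
    """Render an elongated edge segment on a height x width patch.

    The edge passes through the patch center; `theta` is the direction of
    the edge normal (hypothetical convention chosen for this sketch).
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    nx, ny = math.cos(theta), math.sin(theta)
    return [[edge_profile((c - cx) * nx + (r - cy) * ny, left, right, sigma)
             for c in range(width)]
            for r in range(height)]


# A 5 x 5 patch with a vertical edge: dark on the left, bright on the right.
patch = edge_patch(5, 5, theta=0.0, left=0.0, right=1.0, sigma=1.0)
```

Here the geometric parameters are the orientation and, for the ridge, the width, while the photometric parameters are the intensity levels and the blur scale. A full coding function would also carry a position and a length along the edge, which this sketch leaves out.
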
The sketch graph is regulated by a simple spatial prior model. The form of our sparse coding model is similar to vector quantization [9], where the coding functions serve as coding vectors.

The Markov random field model is used to summarize image intensities on the texture domain, where the texture patterns are characterized by feature statistics in the form of marginal histograms of responses from a set of linear filters. The Markov random fields in-paint the texture domain while interpolating the structure domain seamlessly.

Figure (1) shows an example. (a) is the observed image. (b) is the sketch graph, where each line segment represents an edge or ridge coding function. (c) is the reconstructed structure domain of the image using these edge and ridge coding functions. (d) is a segmentation of the remaining texture domain into a number of homogeneous texture regions, by clustering the local marginal histograms of filter responses. Here different regions are represented by different shades. (e) displays the synthesized textures in the segmented regions. (f) is the final synthesized image by putting (c) and (e) together. Because the textures are synthesized with the structure domain as boundary conditions, the textures interpolate the structure domain seamlessly.

Fig. 1. An example of the primal sketch model. (a) An observed image. (b) The sketch graph computed from the image. (c) The structure domain of the image. (d) The remaining texture domain is segmented into a number of homogeneous texture regions. (e) Synthesized textures on these regions. (f) The final synthesized image that integrates seamlessly the structure and texture parts.

In the primal sketch representation, the sparse coding model and the Markov random field model are intrinsically connected.
The elongate and oriented linear filters, such as Gabor filters [6] or Difference of Gaussian filters [25], are used to detect edges or ridges at different frequencies. On the structure domain where edge and ridge segments are present, the filters have large responses along the edge or ridge directions, and the optimally tuned filters are highly connected and aligned across space and frequency. Such regularities and redundancies in filter responses can be accounted for by the image coding functions for edge and ridge segments. On the remaining texture domain where the filters fail to detect edges or ridges, the filter responses are weak and they are not well aligned over space or frequency. So they can be pooled into the marginal histograms
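
The contrast drawn in this paragraph, strong filter responses on structure and weak ones on texture, can be illustrated with a toy sketch in plain Python. It builds a single odd-phase Gabor filter, applies it to a synthetic step-edge image and to a low-contrast noise image, and pools each set of responses into a marginal histogram. The kernel size, wavelength, bin range, and test images are all assumptions chosen for illustration; they are not taken from the paper.

```python
import math
import random


def gabor_kernel(size, theta, wavelength, sigma):
    """Odd-phase (sine) Gabor kernel of odd side `size`, tuned to angle theta."""
    half = size // 2
    kernel = []
    for r in range(-half, half + 1):
        row = []
        for c in range(-half, half + 1):
            u = c * math.cos(theta) + r * math.sin(theta)  # coordinate across the filter
            envelope = math.exp(-(r * r + c * c) / (2.0 * sigma ** 2))
            row.append(envelope * math.sin(2.0 * math.pi * u / wavelength))
        kernel.append(row)
    return kernel


def filter_responses(image, kernel):
    """Correlate the image with the kernel wherever the kernel fully fits."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    return [sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k) for j in range(k))
            for r in range(rows - k + 1)
            for c in range(cols - k + 1)]


def marginal_histogram(responses, bins, lo, hi):
    """Normalized histogram; values outside [lo, hi] fall into the end bins."""
    counts = [0] * bins
    for v in responses:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[min(max(idx, 0), bins - 1)] += 1
    total = float(len(responses))
    return [n / total for n in counts]


rng = random.Random(0)
size = 32
# Structure: a vertical step edge. Texture stand-in: low-contrast noise.
edge_image = [[0.0 if c < size // 2 else 1.0 for c in range(size)]
              for _ in range(size)]
noise_image = [[0.5 + 0.05 * rng.uniform(-1.0, 1.0) for _ in range(size)]
               for _ in range(size)]

kernel = gabor_kernel(size=7, theta=0.0, wavelength=8.0, sigma=2.0)
edge_resp = filter_responses(edge_image, kernel)
noise_resp = filter_responses(noise_image, kernel)

edge_hist = marginal_histogram(edge_resp, bins=16, lo=-8.0, hi=8.0)
noise_hist = marginal_histogram(noise_resp, bins=16, lo=-8.0, hi=8.0)
```

With these settings the edge image puts some histogram mass in the high-response bins, while the noise image concentrates all of its mass in the two bins around zero. A real filter bank would span several orientations and scales, and the resulting histograms would serve as the feature statistics of a texture region.
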

