UCLA STATS 238 - Unsupervised Learning of Probabilistic Grammar-Markov Models

Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories

Long (Leo) Zhu1, Yuanhao Chen2, Alan Yuille1
1Department of Statistics, University of California at Los Angeles, Los Angeles, CA 90095
{lzhu,yuille}@stat.ucla.edu
2University of Science and Technology of China, Hefei, Anhui 230026
[email protected]

Resubmitted 10/Oct/2007 to IEEE Transactions on Pattern Analysis and Machine Intelligence.

Abstract

We introduce a Probabilistic Grammar-Markov Model (PGMM) which couples probabilistic context free grammars and Markov Random Fields. These PGMMs are generative models defined over attributed features and are used to detect and classify objects in natural images. PGMMs are designed so that they can perform rapid inference, parameter learning, and the more difficult task of structure induction. PGMMs can deal with unknown pose (position, orientation, and scale) in both inference and learning, and with the different appearances, or aspects, of the model. They are learnt in an unsupervised manner where the input is a set of images with the object somewhere in the image (at different poses) and with variable background. We also extend this to learning PGMMs for classes of objects (where an unknown object from the class is present in each image). The goal of this paper is theoretical but, to provide proof of concept, we demonstrate results from this approach on a subset of the Caltech 101 dataset (learning on a training set and evaluating on a testing set).
Our results are generally comparable with the current state of the art, and our inference is performed in less than five seconds.

Index Terms

Computer Vision, Structural Models, Grammars, Markov Random Fields, Object Recognition

I. INTRODUCTION

Remarkable progress in the mathematics and computer science of probability is leading to a revolution in the scope of probabilistic models. There are exciting new probability models defined on structured relational systems, such as graphs or grammars [1]–[6]. Unlike more traditional models, such as Markov Random Fields (MRF's) [7] and Conditional Random Fields (CRF's) [2], these models are not restricted to having fixed graph structures. Their ability to deal with varying graph structure means that they can be applied to model a large range of complicated phenomena, as has been shown by their applications to natural languages [8], machine learning [6], and computer vision [9].

Our long-term goal is to provide a theoretical framework for the unsupervised learning of probabilistic models for generating, and interpreting (or parsing), natural images [9]. This is somewhat analogous to Klein and Manning's work on unsupervised learning of natural language grammars [3]. In particular, we hope that this paper can help bridge the gap between computer vision and related work on grammars in machine learning [8], [1], [6]. There are, however, major differences between vision and natural language processing. Firstly, images are arguably far more complex than sentences, so learning a probabilistic model to generate natural images is too ambitious to start with. Secondly, even if we restrict ourselves to the simpler task of generating an image containing a single object, we must deal with: (i) the cluttered background (similar to learning a natural language grammar when the input contains random symbols as well as words), (ii) the unknown pose (size, scale, and position) of the object, and (iii) the different appearances, or aspects, of the object.
Thirdly, the input is a set of image intensities and is considerably more complicated than the limited types of speech tags (e.g. nouns, verbs, etc.) used as input in [3].

In this paper, we address an important subproblem. We are given a set of images containing the same object but with unknown pose (position, scale, and orientation). The object is allowed to have several different appearances, or aspects. We represent these images in terms of attributed features (AF's). The task is to learn a probabilistic model for generating the AF's (both those of the object and the background). Our learning is unsupervised in the sense that we know that there is an object present in the image but we do not know its pose and, in some cases, we do not know the identity of the object. We require that the probability model must allow: (i) rapid inference, (ii) rapid parameter learning, and (iii) structure induction, where the structure of the model is unknown and must be grown in response to the data.

To address this subproblem, we develop a Probabilistic Grammar-Markov Model (PGMM) which is motivated by this goal and its requirements. The PGMM combines elements of MRF's [7] and probabilistic context free grammars (PCFG's) [8]. The requirement that we can deal with a variable number of AF's (e.g. caused by different aspects of the object) motivates the use of grammars (instead of fixed graph models like MRF's). But PCFG's, see figure (1), are inappropriate because they make independence assumptions on the production rules and hence must be supplemented by MRF's to model the spatial relationships between AF's of the object. The requirement that we deal with pose (both for learning and inference) motivates the use of oriented triangles of AF's as our basic building blocks for the probabilistic model, see figure (2). These oriented triangles are represented by features, such as the internal angles of the triangle, which are invariant to the pose of the object.
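The pose invariance of internal angles is easy to verify directly. The sketch below (illustrative only; the paper's full feature vector for an oriented triangle is richer than the angles alone, and the function names here are our own) computes the three internal angles of a triangle of feature points and checks that they survive an arbitrary rotation, uniform scaling, and translation:

```python
import math

def internal_angles(p1, p2, p3):
    """Internal angles (radians) of the triangle (p1, p2, p3).

    Invariant to translation, rotation, and uniform scaling, which is
    what makes triangle-based features usable under unknown pose.
    """
    def angle_at(a, b, c):
        # Angle at vertex a, between the rays a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        # Clamp to guard against floating-point drift outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

    return (angle_at(p1, p2, p3), angle_at(p2, p1, p3), angle_at(p3, p1, p2))

def similarity_transform(p, theta=0.7, s=3.0, t=(5.0, -2.0)):
    # Rotate by theta, scale by s, translate by t (a pose change).
    x = s * (math.cos(theta) * p[0] - math.sin(theta) * p[1]) + t[0]
    y = s * (math.sin(theta) * p[0] + math.cos(theta) * p[1]) + t[1]
    return (x, y)

tri = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
a0 = internal_angles(*tri)
a1 = internal_angles(*[similarity_transform(p) for p in tri])
assert all(abs(u - v) < 1e-9 for u, v in zip(a0, a1))
```

The angles sum to pi as expected, and the assertion confirms they are unchanged by the pose transform, so any potential defined on them is automatically pose-invariant.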
The requirement that we can perform rapid inference on new images is achieved by combining the triangle building blocks to enable dynamic programming. The ability to perform rapid inference ensures that parameter estimation and structure learning are practical.

We decompose the learning task into: (a) learning the structure of the model, and (b) learning the parameters of the model. Structure learning is the more challenging task [8], [1], [6] and we propose a structure induction (or structure pursuit) strategy which proceeds by building an AND-OR graph [4], [5] in an iterative way by adding more triangles or OR-nodes (for different aspects) to the model. We use clustering techniques to make proposals for adding
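The AND-OR structure described above can be pictured with a toy sketch. The class and node names below are hypothetical, and the paper's model attaches triangle potentials and probabilities to nodes, which we omit; here AND-nodes simply require every child to be present and OR-nodes (used for different aspects) require at least one:

```python
class Node:
    """Minimal AND-OR graph node (illustrative only)."""

    def __init__(self, kind, children=None, feature=None):
        assert kind in ("AND", "OR", "LEAF")
        self.kind = kind
        self.children = children or []
        self.feature = feature  # e.g. a triangle of attributed features

    def matches(self, observed):
        if self.kind == "LEAF":
            return self.feature in observed
        if self.kind == "AND":
            return all(c.matches(observed) for c in self.children)
        return any(c.matches(observed) for c in self.children)  # OR

# Structure pursuit, schematically: start from one aspect and iteratively
# add triangles (as AND children) or whole new aspects (as OR children).
front = Node("AND", [Node("LEAF", feature="tri_front_1"),
                     Node("LEAF", feature="tri_front_2")])
side = Node("AND", [Node("LEAF", feature="tri_side_1")])

model = Node("OR", [front])   # model with a single aspect
model.children.append(side)   # pursuit step: add a second aspect

assert model.matches({"tri_front_1", "tri_front_2"})  # front aspect
assert model.matches({"tri_side_1"})                  # side aspect
assert not model.matches({"tri_front_1"})             # incomplete aspect
```

This is only the combinatorial skeleton; in the actual PGMM the choice of which triangle or OR-node to add is scored probabilistically, with clustering supplying the candidate proposals.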

