UCLA STATS 238 - Unsupervised Learning of Probabilistic Grammar-Markov Models

Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories

Long (Leo) Zhu (1), Yuanhao Chen (2), Alan Yuille (1)
(1) Department of Statistics, University of California at Los Angeles, Los Angeles, CA 90095, {lzhu,yuille}@stat.ucla.edu
(2) University of Science and Technology of China, Hefei, Anhui 230026, [email protected]

Abstract: We introduce a Probabilistic Grammar-Markov Model (PGMM) which couples probabilistic context free grammars and Markov Random Fields. We propose an unsupervised structure induction method for learning a PGMM for object categories. Our approach is invariant to the scale and rotation of the objects. We illustrate our approach using thirteen objects from the Caltech 101 database. In addition, we learn the model of a hybrid object class where we do not know the specific object or its position, scale, or pose. This is illustrated by learning a hybrid class consisting of faces, motorbikes, and airplanes. The individual objects can be recovered as different aspects of the grammar for the object class. In all cases, we validate our results by learning the probabilistic grammars from training datasets and evaluating them on the test datasets. We compare our method to alternative approaches. The advantages of our approach are the speed of inference (under one second), the parsing of the object, and increased accuracy of performance. Moreover, our approach is very general and can be applied to a large range of objects and structures.

Index Terms: Computer Vision, Structural Models, Grammars, Markov Random Fields, Object Recognition

I. INTRODUCTION

Remarkable progress in the mathematics and computer science of probability is leading to a revolution in the scope of probabilistic models. In particular, there are exciting new probability models [1]–[6] defined on structured relational systems such as graphs or grammars. These new formulations subsume more traditional models, such as Markov Random Fields (MRF's) [7], and have growing applications to natural languages [8], machine learning (Ref!!), and computer vision [9].

Although these models have enormous representational power, there are many practical drawbacks which must be overcome before using them. In particular, we need efficient algorithms to learn the models from training data and to perform inference on new examples. This problem is particularly difficult when the structure of the representation is unknown and needs to be induced from the data.

In this paper we present a Probabilistic Grammar-Markov Model (PGMM) which couples probabilistic context free grammars and Markov Random Fields. We develop an algorithm called "structure induction" (or "structure pursuit") which we use to learn the PGMM in an unsupervised manner from a set of training data. This algorithm proceeds by building an AND-OR graph [4], [5] in an iterative way. The form of the resulting graph structure ensures that inference can be performed rapidly for new data.

Our application is to the detection, recognition, and parsing of objects in images. The training data consists of a set of images where the target object is present but at an unknown location. This topic has been much studied [10]–[13]. An extremely compressed version of part of our work appeared in NIPS [14].

Our approach has the following four properties. Firstly, a wide range of applicability, which we demonstrate by learning models for 13 object categories from the Caltech-101 [10], see Figures (1, 8). Secondly, the approach is invariant to rotation and a large range of scale of the objects. Thirdly, the approach is able to deal with object classes, which we illustrate by learning a hybrid class consisting of faces, motorbikes, and airplanes. Fourthly, the inference is performed rapidly, in under a second.

Fig. 1. Ten of the object categories from Caltech 101 which we learn in this paper.

This paper is organized as follows. We first review the background in section (II). Section (III) describes the features we use to represent the images. Section (IV) describes the representation of the PGMM. In section (V), we present how to learn the model and use it for detection and parsing. In section (VI), we illustrate our approach by learning models for 13 objects, demonstrate invariance to scale and rotation, and perform learning for object classes.
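To make the AND-OR representation concrete, the following is a minimal sketch in Python of the kind of graph that structure induction builds up. It is a hypothetical illustration, not the authors' implementation: the class and function names (Node, sample_parse) and the toy "face"/"bike" aspects are invented here, and the MRF potentials that would score the spatial layout of the parts are omitted.

```python
# A minimal sketch (not the paper's code) of an AND-OR graph for a PGMM.
# An OR node chooses exactly one child (an "aspect" of the category, with
# mixing probabilities); an AND node composes all of its children (parts
# whose relative positions an MRF would constrain).

import random
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    kind: str                                          # "AND", "OR", or "LEAF"
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)   # branch probs for OR nodes
    name: str = ""

def sample_parse(node):
    """Sample one parse (a list of leaf part names) top-down."""
    if node.kind == "LEAF":
        return [node.name]
    if node.kind == "OR":
        # An OR node commits to exactly one branch.
        child = random.choices(node.children, weights=node.probs)[0]
        return sample_parse(child)
    # An AND node expands all children; MRF terms over the resulting
    # leaves (omitted here) would score their spatial configuration.
    parts = []
    for child in node.children:
        parts.extend(sample_parse(child))
    return parts

# Toy hybrid class: the root OR node selects among object aspects.
face = Node("AND", [Node("LEAF", name="eye"), Node("LEAF", name="mouth")])
bike = Node("AND", [Node("LEAF", name="wheel"), Node("LEAF", name="handlebar")])
root = Node("OR", [face, bike], probs=[0.5, 0.5])
print(sample_parse(root))
```

Because each OR node commits to exactly one branch, a parse selects one aspect and one configuration of parts, which is consistent with the claim above that the individual objects of a hybrid class can be recovered as different aspects of the learned grammar.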
II. BACKGROUND

Structured models define a probability distribution on structured relational systems such as graphs or grammars. This includes many standard models of probability distributions defined on graphs: for example, graphs with fixed structure, such as MRF's [7] or Conditional Random Fields [2], or Probabilistic Context Free Grammars (PCFG's) [3], where the structure is variable. Attempts have been made to unify these approaches under a common formulation. For example, Case-Factor Diagrams [1] have recently been proposed as a framework which subsumes both MRF's and PCFG's. In this paper, we will be concerned with models that combine probabilistic grammars with MRF's. The grammars are based on AND-OR graphs [1], [4], [5], which relate to mixtures of trees [15]. This merging of MRF's with probabilistic grammars results in structured models which have great representational power.

There has been considerable interest in inference algorithms for these structured models. For example, McAllester et al. [1] describe how dynamic programming algorithms (e.g., Viterbi and inside-outside) can be used to rapidly compute properties of interest. Our paper is concerned with the task of unsupervised learning of structured models for applications to detecting, recognizing, and representing visual objects. In this paper, we restrict ourselves to a special case of probabilistic grammars with OR nodes, and MRF's. This is simpler than the full cases studied by McAllester but is more complex than the MRF models standardly used for this problem.

For MRF models, the number of graph nodes is fixed, and structure induction consists of determining the connections between the nodes and the forms of the corresponding potentials. For these graphs, an effective strategy is feature induction [16], which is also known as feature pursuit [17]. A similar strategy is also used to learn CRF's [18]. In both cases, the learning is fully supervised. For Bayesian networks, there is work on learning the structure using the EM algorithm [19].

Learning the structure of grammars in an unsupervised way is more difficult. Klein and Manning [3] have developed unsupervised learning of PCFG's for parsing natural language, but
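As a concrete illustration of the dynamic programming mentioned above, here is a minimal sketch of the inside algorithm for a PCFG in Chomsky normal form. The toy grammar, sentence, and function name are hypothetical; the point is only to show why inference in grammar-based models can be computed efficiently by dynamic programming, not to reproduce any model from this paper.

```python
# A minimal sketch of the inside algorithm for a CNF PCFG (hypothetical
# grammar, not from the paper). inside[(i, j)][A] accumulates the
# probability that nonterminal A derives words[i:j].

from collections import defaultdict

# Binary rules (A, B, C) -> P(A -> B C); unary rules (A, word) -> P(A -> word)
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
unary = {("NP", "dogs"): 0.5, ("NP", "cats"): 0.5, ("V", "chase"): 1.0}

def inside_probability(words, root="S"):
    n = len(words)
    inside = defaultdict(lambda: defaultdict(float))
    # Base case: width-1 spans covered by lexical rules.
    for i, w in enumerate(words):
        for (A, word), p in unary.items():
            if word == w:
                inside[(i, i + 1)][A] += p
    # Recursion: combine adjacent spans at every split point k.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    inside[(i, j)][A] += p * inside[(i, k)][B] * inside[(k, j)][C]
    return inside[(0, n)][root]

print(inside_probability(["dogs", "chase", "cats"]))  # prints 0.25
```

The triple loop over spans and split points runs in cubic time in the length of the input; replacing the sums by maximizations gives the Viterbi parse. The paper's restriction to grammars with OR nodes plays an analogous role in keeping inference fast (under one second, as claimed in the introduction).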
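The feature induction / feature pursuit strategy cited above ([16], [17]) can likewise be made concrete with a sketch. The loop below is a hypothetical stand-in, not the procedure of [16]-[18]: the toy data, the crude gradient fit, and all names are invented here. It only illustrates the greedy principle of scoring each candidate feature by the gain in the training objective and adding the best one.

```python
# A minimal, hypothetical sketch of greedy feature pursuit: repeatedly add
# the candidate feature whose inclusion most improves the log-likelihood.

import math

# Toy binary-labeled data: each example is a set of active feature names.
data = [({"round", "red"}, 1), ({"round"}, 1), ({"square"}, 0), ({"red"}, 0)]
candidates = ["round", "red", "square"]

def log_likelihood(features, data):
    """Log-likelihood of a logistic model over the chosen features,
    with weights fit crudely by a few epochs of gradient ascent."""
    weights = {f: 0.0 for f in features}
    bias = 0.0
    for _ in range(200):  # crude fit; a real system would optimize properly
        for x, y in data:
            score = bias + sum(weights[f] for f in features if f in x)
            p = 1.0 / (1.0 + math.exp(-score))
            g = y - p
            bias += 0.1 * g
            for f in features:
                if f in x:
                    weights[f] += 0.1 * g
    ll = 0.0
    for x, y in data:
        score = bias + sum(weights[f] for f in features if f in x)
        p = 1.0 / (1.0 + math.exp(-score))
        ll += math.log(p if y == 1 else 1.0 - p)
    return ll

chosen = []
base = log_likelihood(chosen, data)
for _ in range(2):  # greedily add the two most useful features
    gains = {c: log_likelihood(chosen + [c], data) - base
             for c in candidates if c not in chosen}
    best = max(gains, key=gains.get)
    chosen.append(best)
    base += gains[best]
    print("added", best, "gain", round(gains[best], 3))
```

The same greedy pattern, applied to graph structure rather than feature sets, is the spirit of the "structure pursuit" algorithm this paper develops, with the added difficulty that the learning here is unsupervised.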

