Berkeley COMPSCI 294 - BLOG - Probabilistic Models with Unknown Objects

BLOG: Probabilistic Models with Unknown Objects∗

Brian Milch, Bhaskara Marthi, Stuart Russell, David Sontag, Daniel L. Ong and Andrey Kolobov
Computer Science Division
University of California
Berkeley, CA 94720-1776
{milch, bhaskara, russell, dsontag, dlong, karaya1}@cs.berkeley.edu

∗ This work was supported by DARPA under award 03-000219, and by an NSF Graduate Research Fellowship to B. Milch.

Abstract

This paper introduces and illustrates BLOG, a formal language for defining probability models over worlds with unknown objects and identity uncertainty. BLOG unifies and extends several existing approaches. Subject to certain acyclicity constraints, every BLOG model specifies a unique probability distribution over first-order model structures that can contain varying and unbounded numbers of objects. Furthermore, complete inference algorithms exist for a large fragment of the language. We also introduce a probabilistic form of Skolemization for handling evidence.

1 Introduction

Human beings and AI systems must convert sensory input into some understanding of what’s out there and what’s going on in the world. That is, they must make inferences about the objects and events that underlie their observations. No pre-specified list of objects is given; the agent must infer the existence of objects that were not known initially to exist.

In many AI systems, this problem of unknown objects is engineered away or resolved in a preprocessing step. However, there are important applications where the problem is unavoidable. Population estimation, for example, involves counting a population by sampling from it randomly and measuring how often the same object is resampled; this would be pointless if the set of objects were known in advance. Record linkage, a task undertaken by an industry of more than 300 companies, involves matching entries across multiple databases. These companies exist because of uncertainty about the mapping from observations to underlying objects. Finally, multi-target tracking systems perform data association, connecting, say, radar blips to hypothesized aircraft.

Probability models for such tasks are not new: Bayesian models for data association have been used since the 1960s [Sittler, 1964]. The models are written in English and mathematical notation and converted by hand into special-purpose code. In recent years, formal representation languages such as graphical models [Pearl, 1988] have led to general inference algorithms, more sophisticated models, and automated model selection (structure learning). In Sec. 7, we review several first-order probabilistic languages (FOPLs) that explicitly represent objects and the relations between them. However, most FOPLs only deal with fixed sets of objects, or deal with unknown objects in limited and ad hoc ways. This paper introduces BLOG (Bayesian LOGic), a compact and intuitive language for defining probability distributions over outcomes with varying sets of objects.

We begin in Sec. 2 with three example problems, each of which involves possible worlds with varying object sets and identity uncertainty. We describe generative processes that produce such worlds, and give the corresponding BLOG models. Sec. 3 observes that these possible worlds are naturally viewed as model structures of first-order logic. It then defines precisely the set of possible worlds corresponding to a BLOG model. The key idea is a generative process that constructs a world by adding objects whose existence and properties depend on those of objects already created. In such a process, the existence of objects may be governed by many random variables, not just a single population size variable. Sec. 4 discusses how a BLOG model specifies a probability distribution over possible worlds.

Sec. 5 solves a previously unnoticed “probabilistic Skolemization” problem: how to specify evidence about objects (such as radar blips) that one didn’t know existed. Finally, Sec. 6 briefly discusses inference in unbounded outcome spaces, stating a sampling algorithm and a completeness theorem for a large class of BLOG models, and giving experimental results on one particular model.

2 Examples

In this section we examine three typical scenarios with unknown objects: simplified versions of the population estimation, record linkage, and multitarget tracking problems mentioned above. In each case, we provide a short BLOG model that, when combined with a suitable inference engine, constitutes a working solution for the problem in question.

Example 1. An urn contains an unknown number of balls (say, a number chosen from a Poisson distribution). Balls are equally likely to be blue or green. We draw some balls from the urn, observing the color of each and replacing it. We cannot tell two identically colored balls apart; furthermore, observed colors are wrong with probability 0.2. How many balls are in the urn? Was the same ball drawn twice?

     1  type Color; type Ball; type Draw;
     2  random Color TrueColor(Ball);
     3  random Ball BallDrawn(Draw);
     4  random Color ObsColor(Draw);
     5  guaranteed Color Blue, Green;
     6  guaranteed Draw Draw1, Draw2, Draw3, Draw4;
     7  #Ball ∼ Poisson[6]();
     8  TrueColor(b) ∼ TabularCPD[[0.5, 0.5]]();
     9  BallDrawn(d) ∼ Uniform({Ball b});
    10  ObsColor(d)
    11    if (BallDrawn(d) != null) then
    12      ∼ TabularCPD[[0.8, 0.2], [0.2, 0.8]]
    13        (TrueColor(BallDrawn(d)));

Figure 1: BLOG model for the urn-and-balls scenario of Ex. 1 with four draws.

The BLOG model for this problem, shown in Fig. 1, describes a stochastic process for generating worlds. The first 4 lines introduce the types of objects in these worlds (colors, balls, and draws) and the functions that can be applied to these objects. Lines 5–7 specify what objects may exist in each world. In every world, the colors are blue and green and there are four draws; these are the guaranteed objects. On the other hand, different worlds have different numbers of balls, so the number of balls that exist is chosen from a prior, a Poisson with mean 6. Each ball is then given a color, as specified on line 8. Properties of the four draws are filled in by choosing a ball (line 9) and an observed color for that ball (lines 10–13). The probability of the generated world is the product of the probabilities of all the choices made.

Example 2. We have a collection of citations that refer to publications in a certain field. What publications and researchers exist, with what titles and names? Who wrote which publication, and which publication does each citation refer to? For simplicity, we just
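
Returning to Example 1, the generative process that Fig. 1 describes can be mimicked in ordinary code. The following Python sketch is an illustration only: it is not part of the paper and is not a BLOG inference engine. The names `sample_world` and `world_log_prob` are hypothetical, and the sketch assumes that a draw from an empty urn yields null (here `None`) with probability 1, in which case no observed color is generated. It samples one world and evaluates its probability as the product of the probabilities of the individual choices.

```python
import math
import random

COLORS = ("Blue", "Green")
OBS_CORRECT = 0.8  # observed colors are wrong with probability 0.2


def sample_poisson(mean, rng):
    """Draw from a Poisson distribution (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def sample_world(rng, mean_balls=6.0, num_draws=4):
    """Sample one possible world following the order of Fig. 1."""
    n = sample_poisson(mean_balls, rng)                 # line 7: #Ball
    true_color = [rng.choice(COLORS) for _ in range(n)]  # line 8
    ball_drawn, obs_color = [], []
    for _ in range(num_draws):
        if n == 0:
            # Assumption of this sketch: an empty urn gives a null draw.
            ball_drawn.append(None)
            obs_color.append(None)
            continue
        b = rng.randrange(n)                             # line 9
        truth = true_color[b]
        other = COLORS[1 - COLORS.index(truth)]
        obs = truth if rng.random() < OBS_CORRECT else other  # lines 10-13
        ball_drawn.append(b)
        obs_color.append(obs)
    return {"true_color": true_color,
            "ball_drawn": ball_drawn,
            "obs_color": obs_color}


def world_log_prob(world, mean_balls=6.0):
    """Log-probability of a world: the sum of the logs of every choice made."""
    n = len(world["true_color"])
    # Poisson prior on the number of balls, computed in log space.
    lp = -mean_balls + n * math.log(mean_balls) - math.lgamma(n + 1)
    lp += n * math.log(0.5)  # each ball's color is Blue or Green, 0.5 each
    for b, obs in zip(world["ball_drawn"], world["obs_color"]):
        if b is None:
            continue  # null draw: probability 1 under the assumption above
        lp += math.log(1.0 / n)  # uniform choice among the n balls
        correct = obs == world["true_color"][b]
        lp += math.log(OBS_CORRECT if correct else 1.0 - OBS_CORRECT)
    return lp


if __name__ == "__main__":
    rng = random.Random(0)
    world = sample_world(rng)
    print(world)
    print(world_log_prob(world))
```

Because the number of balls is itself sampled, two runs can produce worlds with different object sets, which is the situation the BLOG model is designed to describe. Answering the queries in Example 1 would additionally require conditioning on the observed colors, which this sketch does not attempt.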