15-780: Grad AI
Lecture 18: Probability, planning, graphical models
Geoff Gordon (this lecture), Tuomas Sandholm
TAs: Erik Zawadzki, Abe Othman

Admin
Reminder: project milestone reports due 2 weeks from today

Review: probability
Independence, correlation
Expectation, conditional expectation, linearity of expectation, iterated expectation, independence & expectation
Experiment, prior, posterior
Estimators (bias, variance, asymptotic behavior)
Bayes Rule
Model selection

Review: probability & AI
PSTRIPS
QBF and "QBF+"
PSTRIPS-to-QBF+ translation: Q1 X1 . Q2 X2 . Q3 X3 . ... F(X1, X2, X3, ...), where each quantifier is max, min, or mean

Example: got cake?
¬have1 ∧ gatebake1 ∧ bake2 → Cbake2
have1 ∧ gateeat1 ∧ eat2 → Ceat2
have1 ∧ eat2 → Ceat'2
[Cbake2 ⇒ have3] ∧ [Ceat2 ⇒ eaten3] ∧ [Ceat'2 ⇒ ¬have3]
0.8:gatebake1 ∧ 0.9:gateeat1

Example: got cake?
have3 ⇒ [Cbake2 ∨ (¬Ceat'2 ∧ have1)]
¬have3 ⇒ [Ceat'2 ∨ (¬Cbake2 ∧ ¬have1)]
eaten3 ⇒ [Ceat2 ∨ eaten1]
¬eaten3 ⇒ [¬eaten1]

Example: got cake?
¬bake2 ∨ ¬eat2
(the pattern from the past few slides is repeated for each action level with its adjacent state levels)

Example: got cake?
Initial state: ¬have1 ∧ ¬eaten1
Goal: haveT ∧ eatenT

Simple QBF+ example
Formula: (¬x ∨ z) ∧ (¬y ∨ u) ∧ (x ∨ ¬y), with p(y) = p(z) = 0.5
How can we solve?
Scenario trick
‣ transform to PBI or 0-1 ILP
Dynamic programming
‣ related to algorithms for SAT, #SAT
‣ also to belief propagation in graphical models (next)

Solving exactly by scenarios
Original formula: (¬x ∨ z) ∧ (¬y ∨ u) ∧ (x ∨ ¬y)
Replicate u to uYZ: u00, u01, u10, u11
Replicate clauses: share x; set y, z by index; replace u by uYZ; write aYZ for the truth value
a00 → [(¬x ∨ 0) ∧ (¬0 ∨ u00) ∧ (x ∨ ¬0)] ∧
a01 → [(¬x ∨ 1) ∧ (¬0 ∨ u01) ∧ (x ∨ ¬0)] ∧ ...
Add a PBI: a00 + a01 + a10 + a11 ≥ 4 * threshold

Solving by sampling scenarios
Original formula: (¬x ∨ z) ∧ (¬y ∨ u) ∧ (x ∨ ¬y)
Sample a subset of the values of y, z (e.g., {11, 01}):
‣ a11 → [(¬x ∨ 1) ∧ (¬1 ∨ u11) ∧ (x ∨ ¬1)] ∧
‣ a01 → [(¬x ∨ 1) ∧ (¬0 ∨ u01) ∧ (x ∨ ¬0)]
Adjust the PBI: a11 + a01 ≥ 2 * threshold

Combining PSTRIPS w/ scenarios
Generate M samples of Nature (gatebake1, gateeat1, gatebake3, gateeat3, gatebake5, ...)
Replicate state-level vars M times
One copy of action vars bake2, eat2, bake4, ...
Replicate clauses M times (share actions)
Replace goal constraints with the constraint that all goals must be satisfied in at least y% of scenarios (a PBI)
Give to MiniSAT+ (fixed y) or CPLEX (max y)

Dynamic programming
Consider the simpler problem (all p = 0.5): this is essentially an instance of #SAT
Structure:
Dynamic programming for variable elimination

Variable elimination
In general:
Pick a variable ordering
Repeat: say the next variable is z
‣ move the sum over z inward as far as it goes
‣ make a new table by multiplying all old tables containing z, then summing out z
‣ the arguments of the new table are the "neighbors" of z
Cost: O(size of biggest table * # of sums)
‣ sadly: the biggest table can be exponentially large
‣ but often not: low-treewidth formulas

Connections
Scenarios are related to your current HW
DP is related to belief propagation in graphical models (next)
Can generalize DP to multiple quantifier types (not just sum or expectation)
‣ handles PSTRIPS

Graphical models

Why do we need graphical models?
So far, the only way we've seen to write down a distribution is as a big table
Gets unwieldy fast!
‣ E.g., 10 RVs, each with 10 settings
‣ Table size = 10^10
Graphical model: a way to write a distribution compactly using diagrams & numbers
Typical GMs are huge (10^10 is a small one), but we'll use tiny ones for examples

Bayes nets
Best-known type of graphical model
Two parts: a DAG and CPTs

Rusty robot: the DAG
(Metal, Rains, Outside are roots; Rains, Outside → Wet; Metal, Wet → Rusty)

Rusty robot: the CPTs
For each RV (say X), there is one CPT specifying P(X | pa(X))
P(Metal) = 0.9
P(Rains) = 0.7
P(Outside) = 0.2
P(Wet | Rains, Outside): TT: 0.9, TF: 0.1, FT: 0.1, FF: 0.1
P(Rusty | Metal, Wet): TT: 0.8, TF: 0.1, FT: 0, FF: 0
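To make the CPT representation concrete, here is a minimal Python sketch (not from the lecture; the dictionary layout and the bern/joint/marginal names are illustrative assumptions) that stores the rusty-robot CPTs and does brute-force inference by enumerating the joint P(M, Ra, O, W, Ru) = P(M) P(Ra) P(O) P(W | Ra, O) P(Ru | M, W):

```python
from itertools import product

# Rusty-robot CPTs from the slide above.  P(X = True) for the root variables:
p_metal, p_rains, p_outside = 0.9, 0.7, 0.2
# P(Wet = True | Rains, Outside), indexed by (Rains, Outside):
p_wet = {(True, True): 0.9, (True, False): 0.1,
         (False, True): 0.1, (False, False): 0.1}
# P(Rusty = True | Metal, Wet), indexed by (Metal, Wet):
p_rusty = {(True, True): 0.8, (True, False): 0.1,
           (False, True): 0.0, (False, False): 0.0}

def bern(p_true, value):
    """Probability of a Boolean value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(m, ra, o, w, ru):
    """P(M, Ra, O, W, Ru) = P(M) P(Ra) P(O) P(W | Ra, O) P(Ru | M, W)."""
    return (bern(p_metal, m) * bern(p_rains, ra) * bern(p_outside, o)
            * bern(p_wet[(ra, o)], w) * bern(p_rusty[(m, w)], ru))

def marginal(query_vars):
    """Brute-force marginal: sum the joint over all settings of the other variables."""
    names = ["M", "Ra", "O", "W", "Ru"]
    idx = [names.index(v) for v in query_vars]
    table = {}
    for assignment in product([True, False], repeat=len(names)):
        key = tuple(assignment[i] for i in idx)
        table[key] = table.get(key, 0.0) + joint(*assignment)
    return table

# Example: the marginal of (M, O) asked for on the "Inference example" slide below.
# M and O are independent in this net, so the result factors as P(M) * P(O).
print(marginal(["M", "O"]))
```

With only five Boolean variables the 2^5 = 32 joint terms are trivial to enumerate; variable elimination, as discussed above, is what lets the same computation scale to larger networks.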
Interpreting it

Benefits
11 vs. 31 numbers
Fewer parameters to learn
Efficient inference = computation of marginals, conditionals ⇒ posteriors

Comparison to prop logic + random causes
Can simulate any Bayes net with propositional logic + random causes: one cause per CPT entry
E.g.:

Inference Qs
Is Z > 0?
What is P(E)?
What is P(E1 | E2)?
Sample a random configuration according to P(.) or P(. | E)
Hard part: taking sums over r.v.s (e.g., summing over all values to get the normalizer)

Inference example
P(M, Ra, O, W, Ru) = P(M) P(Ra) P(O) P(W | Ra, O) P(Ru | M, W)
Find the marginal of M, O

Independence
Showed M ⊥ O
Any other independences?
Didn't use the CPTs: some independences depend only on graph structure
May also be "accidental" independences
‣ i.e., ones that depend on the values in the CPTs

Conditional independence
How about O, Ru? Is O ⊥ Ru?
Suppose we know we're not wet
P(M, Ra, O, W, Ru) = P(M) P(Ra) P(O) P(W | Ra, O) P(Ru | M, W)
Condition on W = F, find the marginal of O, Ru

Conditional independence
This is generally true
‣ conditioning can make or break independences
‣ many conditional independences can be derived from graph structure alone
‣ accidental ones are often considered less interesting
We derived them by looking for factorizations
‣ it turns out there is a purely graphical test
‣ one of the key contributions of Bayes nets

Blocking
Shaded = observed (by convention)

Example: explaining away
Intuitively:

Markov blanket
Markov blanket of C = the minimal set of observations that makes C independent of the rest of the graph

Learning Bayes nets
(see 10-708)
Data (T = true, F = false):
  M  Ra  O  W  Ru
  T  F   T  T  F
  T  T   T  T  T
  F  T   T  F  F
  T  F   F  F  T
  F  F   T  F  T
Estimate: P(M) = ?, P(Ra) = ?, P(O) = ?, P(W | Ra, O) = ?, P(Ru | M, W) = ?

Laplace smoothing
Same data; estimate P(M), P(Ra), P(O), P(W | Ra, O), P(Ru | M, W) with Laplace (add-one) smoothing

Advantages of Laplace
No division by zero
No extreme probabilities
‣ No near-extreme probabilities unless there is lots of evidence

Limitations of counting and Laplace smoothing
Work only when all variables are observed in all examples
If there are hidden or latent variables, a more complicated algorithm is needed (see 10-708)
‣ or just use a
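As a companion to the counting and Laplace-smoothing slides, here is a small sketch (again illustrative, not the lecture's code; the estimate helper, the alpha parameter, and the row-by-row reading of the data table above are my own assumptions) that estimates CPT entries from the five training rows, with alpha = 0 giving plain maximum-likelihood counting and alpha = 1 giving Laplace smoothing:

```python
from itertools import product

# Training rows from the "Learning Bayes nets" slide above; columns M, Ra, O, W, Ru.
data = [
    (True,  False, True,  True,  False),
    (True,  True,  True,  True,  True),
    (False, True,  True,  False, False),
    (True,  False, False, False, True),
    (False, False, True,  False, True),
]
M, RA, O, W, RU = range(5)  # column indices

def estimate(child, parents, rows, alpha=0.0):
    """Estimate P(child = True | parents) by counting, with add-alpha smoothing.
    alpha = 0 is plain maximum likelihood; alpha = 1 is Laplace smoothing."""
    cpt = {}
    for pa_vals in product([True, False], repeat=len(parents)):
        matching = [r for r in rows
                    if all(r[p] == v for p, v in zip(parents, pa_vals))]
        true_count = sum(r[child] for r in matching)
        # The child has two values, hence 2 * alpha in the denominator.
        cpt[pa_vals] = (true_count + alpha) / (len(matching) + 2 * alpha)
    return cpt

print("P(Ra = T), Laplace:", estimate(RA, [], data, alpha=1))
print("P(W = T | Ra, O), Laplace:", estimate(W, [RA, O], data, alpha=1))
print("P(Ru = T | M, W), Laplace:", estimate(RU, [M, W], data, alpha=1))
```

For parent settings that never occur in the data (e.g., Metal = F, Wet = T for Rusty), plain counting divides by zero, while alpha = 1 returns 0.5; this is the "no division by zero / no extreme probabilities" advantage noted on the slide above.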

