Slide 1: Logical Bayesian Networks
A knowledge representation view on Probabilistic Logical Models
Daan Fierens, Hendrik Blockeel, Jan Ramon, Maurice Bruynooghe
Katholieke Universiteit Leuven, Belgium

Slide 2: Probabilistic Logical Models
There is a variety of PLMs:
•Origin in Bayesian networks (knowledge-based model construction):
  •Probabilistic Relational Models (PRMs)
  •Bayesian Logic Programs (BLPs)
  •CLP(BN)
  •…
•Origin in logic programming:
  •PRISM
  •Stochastic Logic Programs
  •…
Slide annotations: this talk covers the first family (which supports learning), in particular PRMs, the best known, and BLPs, the most developed.

Slide 3: Combining PRMs and BLPs
PRMs:
•+ easy to understand, intuitive
•- somewhat restricted (compared to BLPs)
BLPs:
•+ more general, more expressive
•- not always intuitive
Can we combine the strengths of both models in one model? We propose Logical Bayesian Networks (PRMs + BLPs).

Slide 4: Overview of this Talk
•Example
•Probabilistic Relational Models
•Bayesian Logic Programs
•Combining PRMs and BLPs: why and how?
•Logical Bayesian Networks

Slide 5: Example [Koller et al.]
University domain:
•students (with an IQ) and courses (with a rating)
•students take courses (and get a grade)
•a grade depends on the student's IQ
•a rating depends on the sum of the IQs of the students taking the course
Specific situation: jeff takes ai; pete and rick take lp; no student takes db.

Slide 6: Bayesian network structure
[Figure: the Bayesian network for this situation. Nodes: iq(jeff), iq(pete), iq(rick), rating(ai), rating(lp), rating(db), grade(jeff,ai), grade(pete,lp), grade(rick,lp). iq(jeff) is a parent of rating(ai) and grade(jeff,ai); iq(pete) and iq(rick) are parents of rating(lp) and of grade(pete,lp) and grade(rick,lp) respectively; rating(db) has no parents.]

Slide 7: PRMs [Koller et al.]
A PRM consists of a relational schema and a dependency structure (plus aggregates and CPDs).
Schema: Student(key, iq), Course(key, rating), Takes(key, student, course, grade).
Dependencies: Takes.grade depends on Student.iq (with a CPT); Course.rating depends on Student.iq through an aggregate (aggregate + CPT).

Slide 8: PRMs (2)
Semantics: a PRM induces a Bayesian network on the relational skeleton.
Skeleton: Student = {jeff, pete, rick} (iq unknown); Course = {ai, lp, db} (rating unknown); Takes = {(f1, jeff, ai), (f2, pete, lp), (f3, rick, lp)} (grade unknown).

Slide 9: PRMs: BN structure (3)
[Figure: the Bayesian network induced by the PRM; identical to the network on slide 6.]

Slide 10: PRMs: Pros & Cons (4)
+ Easy to understand and interpret.
- Limited expressiveness compared to BLPs, …:
•it is not possible to combine selection and aggregation [Blockeel & Bruynooghe, SRL workshop '03]
  •e.g. add an extra attribute sex for students: "rating depends on the sum of the IQs of the female students" cannot be expressed
•no specification of logical background knowledge (no functors, no constants)

Slide 11: BLPs [Kersting, De Raedt]
Definite logic programs + Bayesian networks:
•Bayesian predicates (each with a range of values)
•a random variable is a ground Bayesian atom, e.g. iq(jeff)
•a BLP is a set of clauses with CPTs, e.g.
  rating(C) | iq(S), takes(S,C).
  with range {low, high}, a CPT, and a combining rule (which can be anything)
Semantics: a Bayesian network in which
•the random variables are the ground atoms in the least Herbrand model
•the dependencies come from the grounding of the BLP

Slide 12: BLPs (2)
student(pete). …, course(lp). …, takes(rick,lp).
iq(S) | student(S).
rating(C) | course(C).
rating(C) | iq(S), takes(S,C).
grade(S,C) | iq(S), takes(S,C).
BLPs do not distinguish probabilistic from logical/certain/structural knowledge:
•this affects the readability of the clauses
•and what about the resulting Bayesian network?

Slide 13: BLPs: BN structure (3)
[Figure: a fragment of the induced network with nodes student(jeff), takes(jeff,ai), iq(jeff), grade(jeff,ai), where student(jeff) is a parent of iq(jeff). The CPD for iq/1 must then also specify a distribution for the case student(jeff) = false. What should that be?]

Slide 14: BLPs: BN structure (3, continued)
[Figure: the same fragment, where takes(jeff,ai) is a parent of grade(jeff,ai). The CPD for grade/2 is a function of iq(jeff), but must also specify a distribution for the case takes(jeff,ai) = false.]

Slide 15: BLPs: Pros & Cons (4)
+ High expressiveness:
•definite logic programs (functors, …)
•selection and aggregation can be combined (via combining rules)
- Not always easy to interpret:
•the clauses
•the resulting Bayesian network

Slide 16: Combining PRMs and BLPs
Why? One model that is both intuitive and highly expressive.
How?
•Expressiveness (from BLPs):
  •logic programming
•Intuitiveness (from PRMs):
  •distinguish probabilistic from logical/certain knowledge
  •distinct components (in PRMs, the schema determines the random variables and the dependency structure)
  •(general vs. specific knowledge)

Slide 17: Logical Bayesian Networks
Probabilistic predicates (which determine the random variables and their ranges) vs. logical predicates.
LBN components:
•relational schema: V (random variable declarations)
•dependency structure: DE
•CPDs + aggregates: DI
•relational skeleton: a logic program Pl describing the domain of discourse and the deterministic information

Slide 18: Logical Bayesian Networks
Semantics: an LBN induces a Bayesian network on the random variables determined by Pl and V.

Slide 19: Normal logic program Pl
student(jeff). course(ai). takes(jeff,ai).
student(pete). course(lp). takes(pete,lp).
student(rick). course(db). takes(rick,lp).
Semantics: the well-founded model WFM(Pl) (when there is no negation, this is the least Herbrand model).

Slide 20: V
iq(S) <= student(S).
rating(C) <= course(C).
grade(S,C) <= takes(S,C).
Semantics: V determines the random variables:
•each ground probabilistic atom in WFM(Pl ∪ V) is a random variable
•here: iq(jeff), …, rating(lp), …, grade(rick,lp)
•non-monotonic negation is allowed (not available in PRMs or BLPs), e.g.
  grade(S,C) <= takes(S,C), not(absent(S,C)).

Slide 21: DE
grade(S,C) | iq(S).
rating(C) | iq(S) <- takes(S,C).
Semantics: DE determines the conditional dependencies:
•the ground instances whose context holds in WFM(Pl)
•e.g. rating(lp) | iq(pete) <- takes(pete,lp)
•e.g. rating(lp) | iq(rick) <- takes(rick,lp)

Slide 22: V + DE
iq(S) <= student(S).
rating(C) <= course(C).
grade(S,C) <= takes(S,C).
grade(S,C) | iq(S).
rating(C) | iq(S) <- takes(S,C).

Slide 23: LBNs: BN structure
[Figure: the Bayesian network induced by the LBN; identical to the network on slide 6. No logical atoms appear as nodes.]

Slide 24: DI
The quantitative component:
•corresponds to aggregates + CPDs in PRMs, and to CPDs + combining rules in BLPs
•for each probabilistic predicate p, a logical CPD: a function whose input is a set of pairs (ground probabilistic atom, value) and whose output is a probability distribution for p
•Semantics: DI determines the CPDs for all random variables about p.

Slide 25: DI (2)
•e.g. for rating/1 (the inputs are atoms about iq/1):
  If the sum of the values Val over all input pairs (iq(S), Val) exceeds 1000
  Then 0.7 high / 0.3 low, Else 0.5 high / 0.5 low
•This can be written as a logical probability tree (TILDE) with root test
  sum(Val, iq(S,Val), Sum), Sum > 1000
  yielding 0.7 / 0.3 if the test succeeds and 0.5 / 0.5 otherwise
•cf. [Van Assche et al., SRL workshop '04]

Slide 26: DI (3)
DI determines the CPDs:
•e.g. the CPD for rating(lp) is a function of iq(pete) and iq(rick)
•what is the entry in that CPD for iq(pete)=100 and iq(rick)=120?
•apply the logical CPD for rating/1 (from slide 25) to {(iq(pete),100), (iq(rick),120)}
•since 100 + 120 does not exceed 1000, the result is the distribution 0.5 high / 0.5 low

Slide 27: DI (4)
Logical CPDs can combine selection and aggregation:
•e.g. "rating depends on the sum of the IQs of the female students", with test
  sum(Val, (iq(S,Val), sex(S,fem)), Sum),
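The way V and DE induce a Bayesian network structure on slides 19 through 23 can be sketched as plain set computations. This is our own illustrative encoding, not part of the talk: atoms are represented as strings, and since Pl here contains no negation, WFM(Pl) is simply the set of facts.

```python
# Sketch (hypothetical encoding): grounding V and DE over WFM(Pl).
# Facts of Pl (slide 19); no negation, so WFM(Pl) is just these facts.
takes = {("jeff", "ai"), ("pete", "lp"), ("rick", "lp")}
students = {"jeff", "pete", "rick"}
courses = {"ai", "lp", "db"}

# V (slide 20): each ground probabilistic atom in WFM(Pl ∪ V)
# becomes a random variable.
variables = (
    {f"iq({s})" for s in students}
    | {f"rating({c})" for c in courses}
    | {f"grade({s},{c})" for s, c in takes}
)

# DE (slide 21): keep the ground instances whose context holds in WFM(Pl).
#   grade(S,C) | iq(S).              (context: grade(S,C) must be a variable)
#   rating(C)  | iq(S) <- takes(S,C).
dependencies = (
    {(f"grade({s},{c})", f"iq({s})") for s, c in takes}
    | {(f"rating({c})", f"iq({s})") for s, c in takes}
)

# Parents of rating(lp) are iq(pete) and iq(rick); rating(db) has none,
# because no student takes db (slide 23).
parents_of_rating_lp = sorted(p for v, p in dependencies if v == "rating(lp)")
```

Note how the logical atoms student/1, course/1, and takes/2 shape the network but never appear as nodes, which is exactly the contrast with the BLP fragments on slides 13 and 14.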
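The logical CPD for rating/1 on slides 25 and 26 can likewise be sketched as an ordinary function from input pairs to a distribution. The threshold 1000 and the 0.7/0.3 and 0.5/0.5 distributions come from the slides; the function name and the string encoding of atoms are our assumptions.

```python
def rating_cpd(inputs):
    """Logical CPD for rating/1 (slides 25-26), sketched in Python.

    inputs: a set of (ground atom, value) pairs about iq/1,
            e.g. {("iq(pete)", 100), ("iq(rick)", 120)}.
    Returns a probability distribution over the range {high, low}.
    """
    total_iq = sum(val for _atom, val in inputs)  # the aggregate
    if total_iq > 1000:
        return {"high": 0.7, "low": 0.3}
    return {"high": 0.5, "low": 0.5}

# Slide 26: the CPD entry for rating(lp) at iq(pete)=100, iq(rick)=120.
print(rating_cpd({("iq(pete)", 100), ("iq(rick)", 120)}))
# -> {'high': 0.5, 'low': 0.5}
```

The selection-plus-aggregation variant of slide 27 would only change the aggregate line, e.g. summing `val` over the pairs whose student satisfies `sex(S,fem)`, which is the combination PRMs cannot express.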