Learning P-maps Param. LearningPerfect maps (P-maps)Inexistence of P-maps 1Obtaining a P-mapI-EquivalenceSkeleton of a BNWhat about V-structures?Same V-structures not necessaryImmoralities & I-EquivalenceSlide 10Identifying the skeleton 1Identifying the skeleton 2Identifying immoralitiesFrom immoralities and skeleton to BN structuresWhat you need to knowAnnouncementsReviewThumbtack – Binomial DistributionMaximum Likelihood EstimationYour first learning algorithmLearning Bayes netsLearning the CPTsSlide 23Maximum likelihood estimation (MLE) of BN parameters – exampleMaximum likelihood estimation (MLE) of BN parameters – General caseTaking derivatives of MLE of BN parameters – General caseGeneral MLE for a CPTCan we really trust MLE?Bayesian LearningBayesian Learning for ThumbtackBeta prior distribution – P()Posterior distributionConjugate priorUsing Bayesian posteriorBayesian prediction of a new coin flipAsymptotic behavior and equivalent sample sizeBayesian learning corresponds to smoothingBayesian learning for multinomialBayesian learning for two-node BNVery important assumption on prior: Global parameter independenceGlobal parameter independence, d-separation and local predictionWithin a CPTPriors for BN CPTs (more when we talk about structure learning)An exampleWhat you need to know about parameter learning1Learning P-mapsParam. LearningGraphical Models – 10708Carlos GuestrinCarnegie Mellon UniversitySeptember 24th, 2008Readings:K&F: 3.3, 3.4, 16.1, 16.2, 16.3, 16.410-708 – Carlos Guestrin 2006-200810-708 – Carlos Guestrin 2006-20082Perfect maps (P-maps)I-maps are not unique and often not simple enoughDefine “simplest” G that is I-map for PA BN structure G is a perfect map for a distribution P if I(P) = I(G) Our goal:Find a perfect map!Must address equivalent BNs10-708 – Carlos Guestrin 2006-20083Inexistence of P-maps 1XOR (this is a hint for the homework)10-708 – Carlos Guestrin 2006-20084Obtaining a P-mapGiven the independence assertions that are true for PAssume that there exists a perfect map G*Want to find G*Many structures may encode same independencies as G*, when are we done?Find all equivalent structures simultaneously!10-708 – Carlos Guestrin 2006-20085I-EquivalenceTwo graphs G1 and G2 are I-equivalent if I(G1) = I(G2)Equivalence class of BN structuresMutually-exclusive and exhaustive partition of graphsHow do we characterize these equivalence classes?10-708 – Carlos Guestrin 2006-20086Skeleton of a BNSkeleton of a BN structure G is an undirected graph over the same variables that has an edge X–Y for every X!Y or Y!X in G(Little) Lemma: Two I-equivalent BN structures must have the same skeletonAHCEGDBFKJI10-708 – Carlos Guestrin 2006-20087What about V-structures?V-structures are key property of BN structureTheorem: If G1 and G2 have the same skeleton and V-structures, then G1 and G2 are I-equivalentAHCEGDBFKJI10-708 – Carlos Guestrin 2006-20088Same V-structures not necessaryTheorem: If G1 and G2 have the same skeleton and V-structures, then G1 and G2 are I-equivalentThough sufficient, same V-structures not necessary10-708 – Carlos Guestrin 2006-20089Immoralities & I-EquivalenceKey concept not V-structures, but “immoralities” (unmarried parents )X ! Z Y, with no arrow between X and YImportant pattern: X and Y independent given their parents, but not given Z(If edge exists between X and Y, we have covered the V-structure)Theorem: G1 and G2 have the same skeleton and immoralities if and only if G1 and G2 are I-equivalent10-708 – Carlos Guestrin 2006-200810Obtaining a P-mapGiven the independence assertions that are true for PObtain skeletonObtain immoralitiesFrom skeleton and immoralities, obtain every (and any) BN structure from the equivalence class10-708 – Carlos Guestrin 2006-200811Identifying the skeleton 1When is there an edge between X and Y?When is there no edge between X and Y?10-708 – Carlos Guestrin 2006-200812Identifying the skeleton 2Assume d is max number of parents (d could be n)For each Xi and XjEij trueFor each U X – {Xi,Xj}, |U|≤dIs (Xi Xj | U) ?Eij falseIf Eij is trueAdd edge X – Y to skeleton10-708 – Carlos Guestrin 2006-200813Identifying immoralitiesConsider X – Z – Y in skeleton, when should it be an immorality?Must be X ! Z Y (immorality):When X and Y are never independent given U, if Z2UMust not be X ! Z Y (not immorality):When there exists U with Z2U, such that X and Y are independent given U10-708 – Carlos Guestrin 2006-200814From immoralities and skeleton to BN structuresRepresenting BN equivalence class as a partially-directed acyclic graph (PDAG)Immoralities force direction on some other BN edgesFull (polynomial-time) procedure described in reading10-708 – Carlos Guestrin 2006-200815What you need to knowMinimal I-map every P has one, but usually manyPerfect mapbetter choice for BN structurenot every P has onecan find one (if it exists) by considering I-equivalenceTwo structures are I-equivalent if they have same skeleton and immoralitiesAnnouncementsRecitation tomorrowDon’t miss it!No class on Monday 1610-708 – Carlos Guestrin 2006-200810-708 – Carlos Guestrin 2006-200817ReviewBayesian Networks Compact representation for probability distributionsExponential reduction in number of parametersExploits independencies Next – Learn BNsparametersstructureFluAllergySinusHeadacheNose10-708 – Carlos Guestrin 2006-200818Thumbtack – Binomial DistributionP(Heads) = , P(Tails) = 1-Flips are i.i.d.:Independent eventsIdentically distributed according to Binomial distributionSequence D of H Heads and T Tails10-708 – Carlos Guestrin 2006-200819Maximum Likelihood EstimationData: Observed set D of H Heads and T Tails Hypothesis: Binomial distribution Learning is an optimization problemWhat’s the objective function?MLE: Choose that maximizes the probability of observed data:10-708 – Carlos Guestrin 2006-200820Your first learning algorithmSet derivative to zero:10-708 – Carlos Guestrin 2006-200821Learning Bayes netsKnown structure Unknown structureFully observable dataMissing datax(1)… x(m)Datastructure parametersCPTs
View Full Document