CMU CS 10708 - Lecture

This preview shows page 1-2-3-4-5 out of 14 pages.

Infinite Mixture and Dirichlet Process
Probabilistic Graphical Models (10-708)
Lecture 20, Nov 28, 2007
Eric Xing
School of Computer Science

[Figure: signaling-pathway graphical model with nodes Receptor A, Receptor B, Kinase C, Kinase D, Kinase E, TF F, Gene G, Gene H, labeled X1 to X8]

Clustering

Object Recognition and Tracking
[Figure: tracked objects at t=1, t=2, t=3 with feature vectors (1.8, 7.4, 2.3), (1.9, 9.0, 2.1), (1.9, 6.1, 2.2), (0.9, 5.8, 3.1), (0.7, 5.1, 3.2), (0.6, 5.9, 3.2)]

Modeling The Mind
[Figure: timeline from t=1 to t=T. Latent brain processes: read sentence, view picture, decide whether consistent. fMRI scan: ∑]

The Evolution of Science
[Figure: PNAS papers, research topics, and research circles (CS, Bio, Phy) from 1900 to 2000 and beyond (?)]

Partially Observed, Open and Evolving Possible Worlds
- Unbounded # of objects/trajectories
- Changing attributes
- Birth/death, merge/split
- Relational ambiguity
- The parametric paradigm:
  - Finite
  - Structurally unambiguous
[Figure: entity space and observation space, with states $\Xi^*_{t|t}$ and $\Xi^*_{t+1|t+1}$; sensor model $p(x \mid \{\phi_k\})$; motion model $p(\{\phi_k\}_{t+1} \mid \{\phi_k\}_t)$; event model $p(\{\phi_k\}_0)$ or $p(\{\phi_k\}_{1:T})$]
How to open it up?

A Classical Approach
- Clustering as mixture modeling
- Then "model selection"

Model Selection vs. Posterior Inference
- Model selection (parsimony, Occam's razor)
  - "intelligent" guess: ???
  - cross validation: data-hungry
  - information theoretic: AIC, TIC, MDL:
    $\arg\min_K \mathrm{KL}\big(f(\cdot) \,\|\, g(\cdot \mid \hat{\theta}_{ML}, K)\big)$
  - Bayes factor: need to compute data likelihood
- Posterior inference: we want to handle uncertainty of model complexity explicitly
    $p(M \mid D) \propto p(D \mid M)\, p(M)$, where $M \equiv \{\theta, K\}$
  - we favor a distribution that does not constrain $M$ in a "closed" space!

Two "Recent" Developments
- First order probabilistic languages (FOPLs)
  - Examples: PRM, BLOG, ...
  - Lift graphical models to an "open" world (#rv, relation, index, lifespan, ...)
  - Focus on complete, consistent, and operating rules to instantiate possible worlds, and on a formal language for expressing such rules
  - Operational way of defining distributions over possible worlds, via sampling methods
- Bayesian nonparametrics
  - Examples: Dirichlet processes, stick-breaking processes, ...
  - From finite, to infinite mixture, to more complex constructions (hierarchies, spatial/temporal sequences, ...)
  - Focus on the laws and behaviors of both the generative formalisms and the resulting distributions
  - Often offer an explicit expression of the distributions and expose their structure, which motivates various approximation schemes

Clustering
- How to label them?
- How many clusters???

Genetic Demography
- Are there genetic prototypes among them?
- What are they?
- How many? (How many ancestors do we have?)

Genetic Polymorphisms

Biological Terms
- Genetic polymorphism: a difference in DNA sequence among individuals, groups, or populations
- Single Nucleotide Polymorphism (SNP): DNA sequence variation occurring when a single nucleotide (A, T, C, or G) differs between members of the species
  - Each variant is called an "allele"
  - Almost always bi-allelic
  - Account for most of the genetic diversity among different (normal) individuals, e.g. drug response, disease susceptibility

From SNPs to Haplotypes
- Alleles of adjacent SNPs on a chromosome form haplotypes
- Powerful in the study of disease association or genetic evolution

Haplotype and Genotype
- Haplotype: a collection of alleles derived from the same chromosome
[Figure: genotypes (chromosome phase is unknown) and haplotypes (chromosome phase is known), related by haplotype reconstruction]

Ancestral Inference
[Figure: graphical model with ancestors $A_k$ and parameters $\theta_k$ (number of ancestors unknown, "?"), haplotypes $H_{n_1}$, $H_{n_2}$, and genotype $G_n$, in a plate over N individuals]
Essentially a clustering problem, but ...
- Better recovery of the ancestors leads to better haplotyping results (because of more accurate grouping of common haplotypes)
- True haplotypes are obtainable at high cost, but they can validate the model more objectively (as opposed to examining saliency of clustering)
- Many other biological/scientific utilities

A Finite (Mixture of) Allele Model
[Figure: haplotypes $H_{n_1}$, $H_{n_2}$ and genotype $G_n$]
- The probability of a genotype g:
    $p(g) = \sum_{h_1, h_2 \in H} p(h_1, h_2)\, p(g \mid h_1, h_2)$
  where $p(h_1, h_2)$ is the haplotype model, $p(g \mid h_1, h_2)$ is the genotyping model, and $H$ is the population haplotype pool
- Standard settings:
  - $|H| = K \ll 2^J$: fixed-sized population haplotype pool
  - $p(h_1, h_2) = p(h_1)\, p(h_2) = f_1 f_2$: Hardy-Weinberg equilibrium
- Problem: K? H?

An Infinite (Mixture of) Allele Model
[Figure: graphical model with $A_k$, $\theta_k$ in a plate of size ∞, and $H_{n_1}$, $H_{n_2}$, $G_n$ in a plate over N individuals]
- How?
- Via a nonparametric hierarchical Bayesian formalism!

Stick-breaking Process
    $G = \sum_{k=1}^{\infty} \pi_k\, \delta_{\theta_k}$
  Location: $\theta_k \sim G_0$
  Mass: $\sum_{k=1}^{\infty} \pi_k = 1$, with $\pi_k = \beta_k \prod_{j=1}^{k-1} (1 - \beta_j)$ and $\beta_k \sim \mathrm{Beta}(1, \alpha)$
[Figure: breaking a unit stick. $\beta$ = 0.4, 0.5, 0.8 gives $\pi$ = 0.4, 0.3, 0.24, with 0.6 and then 0.3 of the stick remaining after the first and second breaks]

Graphical Model
[Figure: graphical model with $A_k$, $\theta_k$ in a plate of size ∞, and $H_{n_1}$, $H_{n_2}$, $G_n$ in a plate over N individuals]

Chinese Restaurant Process
$P(c_i = k \mid \mathbf{c}_{-i})$ for successive customers; each row lists the probabilities of joining the existing tables (with parameters $\theta_1, \theta_2, \ldots$) in order, and its last nonzero entry is the probability of opening a new table:
    customer 1:  1,  0,  0
    customer 2:  1/(1+α),  α/(1+α),  0
    customer 3:  1/(2+α),  1/(2+α),  α/(2+α)
    customer 4:  1/(3+α),  2/(3+α),  α/(3+α)
    customer i:  m_1/(i+α-1),  m_2/(i+α-1),  ...,  α/(i+α-1)
The CRP defines an exchangeable distribution on partitions over an (infinite) sequence of samples; such a distribution is formally known as the Dirichlet Process (DP).

The DP Mixture of Ancestral Haplotypes
[Figure: customers 1 to 9 seated around tables, each table labeled {A, θ}]
- The customers around a table form a cluster
  - associate a mixture component (i.e., a population haplotype) with a table
  - sample {a, θ} at each table from a base measure $G_0$ to obtain the population haplotype and nucleotide substitution frequency for that component
- With $p(h \mid \{A, \theta\})$ and $p(g \mid h_1, h_2)$, the CRP yields a posterior distribution on the number of population haplotypes (and on the haplotype configurations and the nucleotide substitution frequencies)

DP-haplotyper
[Figure: graphical model with $\alpha$ and $G_0$ defining the DP draw $G$; infinite mixture components $A$, $\theta$ (for population haplotypes, plate K); likelihood model $H_{n_1}$, $H_{n_2}$, $G_n$ (for individual haplotypes and genotypes, plate N)]
- Inference: Markov Chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings

Model components
- Choice of base measure:
    $G_0 \sim \prod_j \mathrm{Unif}(a_j) \cdot \mathrm{Beta}(\theta_j)$
- Nucleotide-substitution model:
    $p(h_i \mid \{a_k, \theta_k\}) = \prod_j p(h_{i,j} \mid a_{k,j}, \theta_{k,j})$,
    where $p(h_{i,j} \mid a_{k,j}, \theta_{k,j}) = \theta_{k,j}$ if $h_{i,j} = a_{k,j}$, and $1 - \theta_{k,j}$ if $h_{i,j} \neq a_{k,j}$
- Noisy genotyping model:
    $p(g_i \mid h_{i_1}, h_{i_2}) = \prod_j p(g_{i,j} \mid h_{i_1,j}, h_{i_2,j})$,
    where $p(g_{i,j} \mid h_{i_1,j}, h_{i_2,j}) = \gamma$ if $h_{i_1,j} \oplus h_{i_2,j} = g_{i,j}$, and $(1 - \gamma)/2$ if $h_{i_1,j} \oplus h_{i_2,j} \neq g_{i,j}$

Gibbs
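As a rough illustration of the stick-breaking process from the lecture, the sketch below turns break fractions β_k into mixture weights π_k = β_k ∏_{j<k} (1 − β_j). The function names, the truncation to a finite number of components, and the use of Python's `random.betavariate` for Beta(1, α) draws are assumptions made for this example, not part of the slides.

```python
import random


def stick_breaking_weights(betas):
    """Map break fractions beta_k to weights pi_k = beta_k * prod_{j<k} (1 - beta_j).

    Returns the weights and the length of stick left unassigned.
    """
    weights = []
    remaining = 1.0  # portion of the unit stick not yet broken off
    for beta in betas:
        weights.append(beta * remaining)
        remaining *= 1.0 - beta
    return weights, remaining


def sample_truncated_weights(alpha, num_components, rng):
    """Draw beta_k ~ Beta(1, alpha) and break the stick num_components times.

    Truncating at num_components is an approximation assumed here; the
    process in the lecture has infinitely many components.
    """
    betas = [rng.betavariate(1.0, alpha) for _ in range(num_components)]
    return stick_breaking_weights(betas)


# The fractions 0.4, 0.5, 0.8 from the slide's figure give weights of
# about 0.4, 0.3, 0.24, leaving about 0.06 of the stick.
weights, leftover = stick_breaking_weights([0.4, 0.5, 0.8])
```

Smaller values of α push the Beta(1, α) draws toward 1, so most of the mass lands on the first few components; larger α spreads it over many.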

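The Chinese restaurant process seating rule can be sketched in a few lines: customer i joins existing table k with probability m_k / (i − 1 + α) and opens a new table with probability α / (i − 1 + α). The helper names and the sequential sampler below are hypothetical, written only to make the rule concrete, and they assume α > 0.

```python
import random


def crp_seating_probabilities(table_counts, alpha):
    """Return P(c_i = k | c_-i) for each existing table, then for a new table.

    table_counts[k] is m_k, the number of customers already at table k.
    """
    total = sum(table_counts) + alpha  # equals (i - 1) + alpha for customer i
    probs = [count / total for count in table_counts]
    probs.append(alpha / total)  # last entry: open a new table
    return probs


def sample_crp_partition(num_customers, alpha, rng):
    """Seat customers one at a time and return (assignments, table_counts)."""
    table_counts = []
    assignments = []
    for _ in range(num_customers):
        probs = crp_seating_probabilities(table_counts, alpha)
        table = rng.choices(range(len(probs)), weights=probs)[0]
        if table == len(table_counts):
            table_counts.append(1)  # new table opened
        else:
            table_counts[table] += 1
        assignments.append(table)
    return assignments, table_counts


# Fourth customer, with one person at table 1 and two at table 2, alpha = 1:
# probabilities 1/(3+alpha), 2/(3+alpha), alpha/(3+alpha).
fourth = crp_seating_probabilities([1, 2], 1.0)  # [0.25, 0.5, 0.25]
assignments, counts = sample_crp_partition(20, 1.0, random.Random(0))
```

The number of occupied tables in `counts` is random, which is how the CRP places a distribution on the number of clusters rather than fixing it in advance.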

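To make the finite allele model concrete, here is a minimal sketch of the genotype probability p(g) = Σ_{h1,h2 ∈ H} p(h1) p(h2) p(g | h1, h2) under Hardy-Weinberg equilibrium, combined with the noisy genotyping model (γ for a matching locus, (1 − γ)/2 otherwise). Several details are assumptions for illustration only: alleles are coded 0/1, a genotype locus is coded as the number of 1-alleles (0, 1, or 2), the ⊕ operation is taken to be the sum of the two alleles, and all names are hypothetical.

```python
from itertools import product


def locus_genotype_prob(g, allele1, allele2, gamma):
    """Noisy genotyping at one locus: gamma if the haplotype pair explains g.

    Assumes three possible genotype codes, so each wrong code gets (1 - gamma) / 2.
    """
    return gamma if allele1 + allele2 == g else (1.0 - gamma) / 2.0


def genotype_given_haplotypes(genotype, hap1, hap2, gamma):
    """p(g | h1, h2) as a product over loci."""
    prob = 1.0
    for g, allele1, allele2 in zip(genotype, hap1, hap2):
        prob *= locus_genotype_prob(g, allele1, allele2, gamma)
    return prob


def genotype_probability(genotype, pool, freqs, gamma):
    """p(g): sum over ordered haplotype pairs from the pool, weighted by f1 * f2."""
    weighted_pool = list(zip(pool, freqs))
    total = 0.0
    for (hap1, f1), (hap2, f2) in product(weighted_pool, repeat=2):
        total += f1 * f2 * genotype_given_haplotypes(genotype, hap1, hap2, gamma)
    return total


# Toy pool of K = 2 haplotypes over J = 2 loci with equal frequencies.
pool = [(0, 0), (1, 1)]
freqs = [0.5, 0.5]
# With noise-free genotyping (gamma = 1), the doubly heterozygous genotype
# (1, 1) arises only from the two ordered pairings of the pool haplotypes.
p_het = genotype_probability((1, 1), pool, freqs, 1.0)  # 0.5
```

The double loop costs K² evaluations per genotype, which is one reason the pool size K matters in the finite model.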