Machine Learning, Srihari

Genetic Inheritance and Bayesian Networks
Sargur Srihari
[email protected]

Genetics Pedigree Example
• One of the earliest uses of Bayesian networks
  – Predates the definition of the general framework
• Local independencies are intuitive
• Models transmission of certain properties, such as blood type, from parent to child

Phenotype and Genotype
• Some background on genetics is needed to model this properly
• Blood type is an observable quantity that depends on genetic makeup
  – Called a phenotype
• The genetic makeup of a person is called a genotype

DNA Basics
[Figure: electron photomicrograph of an actual chromosome, with a locus marked]
• Single chromosome: ~10^8 base pairs
• Genome: sequence of 3 x 10^9 base pairs (nucleotides A, C, G, T)
  – Represents the full set of chromosomes
• Genome has 46 chromosomes (22 pairs of autosomes plus the sex chromosomes, XX or XY)
• Large portions of DNA (98.5%) have no survival function and have variations useful for identification
• TH01 is a location on the short arm of chromosome 11: short tandem repeats (STR) of the same base sequence AATG
• Variant forms (alleles) differ between individuals

Genetic Model
• Human genetic material
  – 22 pairs of autosomal chromosomes
  – One pair of sex chromosomes (X and Y)
• Each chromosome contains genetic material that determines a person's properties
• Locus: region of a chromosome of interest
  – Blood type is determined at a particular locus
• Alleles: variants of a locus
  – Blood type has three variants: A, B, O

Independence Assumptions
• Arise from biology
• Once we know
  – The genotype of a person
    • Additional evidence about other family members provides no new information about that person's blood type
  – The genotype of both parents
    • These determine what is passed to offspring
    • Additional ancestral information is not needed
• These independencies can be captured in a BN for a family tree
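The independence assumptions above can be written as a factorization of the joint distribution that the Bayesian network encodes. This is a sketch under notation introduced here rather than taken from the slides: G(c) is the genotype and B(c) the blood type of person c, p(c) and m(c) are c's father and mother, and "founders" are individuals whose parents are not in the pedigree, who receive a prior instead of a transmission term.

```latex
P(G, B) =
  \prod_{c \in \text{founders}} P\big(G(c)\big)
  \prod_{c \notin \text{founders}} P\big(G(c) \mid G(p(c)),\, G(m(c))\big)
  \prod_{c} P\big(B(c) \mid G(c)\big)
```

Each person's blood type depends only on that person's genotype, and each non-founder genotype depends only on the two parental genotypes, which is exactly what the two "once we know" statements assert.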
A small family tree
[Figure: family tree; the only name recoverable from the figure is Harry]

BN for Genetic Inheritance
[Figure: Bayesian network for the family tree. G: Genotype, B: Blood Type]

Autosomal Chromosome
• In each pair,
  – Paternal: inherited from father
  – Maternal: inherited from mother
• A person's genotype is an ordered pair (X,Y)
  – Each element has three possible values (A, B, O)
  – There are nine possible values, such as (A,B)
• Blood type phenotype is a function of both copies
  – E.g., genotype (A,O) gives blood type A
  – (O,O) gives blood type O

CPDs for Genetic Inheritance
• Penetrance model P(B(c)|G(c))
  – Probabilities of different phenotypes given a person's genotype
    • Deterministic for blood type
• Transmission model P(G(c)|G(p),G(m))
  – Each parent is equally likely to transmit either of its two alleles to the child
• Genotype priors P(G(c))
  – Genotype frequencies in the population

Real models more complex
• Phenotypes for late-onset diseases are not a deterministic function of genotype
  – A particular genotype may carry a higher probability of a disease
• The genetic makeup of an individual is determined by many genes
• Some phenotypes depend on many genes
• Multiple phenotypes depend on many genes

Modeling multi-locus inheritance
• Inheritance patterns of different genes are not independent of each other
• Need to take adjacent loci into account
• Introduce selector variables S(l,c,m)
  – 1 if locus l in c's maternal chromosome is inherited from c's maternal grandmother
  – 2 if the locus is inherited from c's maternal grandfather
• Model correlations between the selector variables of adjacent loci l and l'

Use of Genetic Inheritance Model
• Extensively used in
  1. Genetic counseling and prediction
  2. Linkage analysis
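The penetrance and transmission CPDs described earlier for the single-locus blood-type model can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation. It assumes that a genotype is stored as a tuple (paternal allele, maternal allele), that p and m in P(G(c)|G(p),G(m)) denote the father and mother (the slides do not say so explicitly), and that genotype (A,B) gives blood type AB. The function names are hypothetical.

```python
from itertools import product

ALLELES = ("A", "B", "O")
# All nine ordered genotypes (paternal allele, maternal allele).
GENOTYPES = list(product(ALLELES, repeat=2))


def blood_type(genotype):
    """Deterministic genotype -> phenotype map for blood type."""
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"
    if "A" in alleles:
        return "A"
    if "B" in alleles:
        return "B"
    return "O"


def penetrance(b, genotype):
    """P(B(c) = b | G(c) = genotype): 1 or 0, since the map is deterministic."""
    return 1.0 if blood_type(genotype) == b else 0.0


def transmission(child, father, mother):
    """P(G(c) = child | G(p) = father, G(m) = mother).

    Each parent passes on either of its two alleles with probability 1/2,
    independently of the other parent.
    """
    paternal, maternal = child
    p_from_father = father.count(paternal) / 2
    p_from_mother = mother.count(maternal) / 2
    return p_from_father * p_from_mother


if __name__ == "__main__":
    # Father (A,B), mother (O,O): each possible child genotype and its probability.
    for g in GENOTYPES:
        p = transmission(g, ("A", "B"), ("O", "O"))
        if p > 0:
            print(g, blood_type(g), p)
```

For any fixed pair of parental genotypes, the transmission probabilities sum to 1 over the nine child genotypes, which is a quick sanity check on the CPD.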
Genetic Counseling and Prediction
• Take a phenotype with known loci, together with observed phenotype and genotype data for some individuals
  – Infer the genotype and phenotype of another person (e.g., a planned child)
• Genetic data
  – Direct measurements of the relevant disease loci, or of nearby loci that are correlated with the disease loci

Linkage Analysis
• Harder task
• Identifying disease genes from pedigree data, using several pedigrees
  – Several individuals exhibit the disease phenotype
  – Available data
    • Phenotype information for many individuals in the pedigree
    • Genotype information for known locations in the chromosome
  – Use the inheritance model to evaluate likelihood
  – Pinpoint the area linked to the disease, then analyze the genes in that area further
• Allows focusing on 1/10,000 of the genome

Sparse BN in genetic inheritance
• Allows reasoning about large pedigrees and multiple loci
• Allows use of model learning algorithms to understand recombination rates in different regions and penetrance probabilities for different diseases
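The counseling query described above, predicting the phenotype of a planned child from observations on relatives, can be sketched for the blood-type model by brute-force enumeration over genotypes. This is an illustration only. The allele frequencies are hypothetical values chosen for the example, the genotype prior is assumed to be the product of the two allele frequencies (the slides say only that priors come from population genotype frequencies), and exhaustive enumeration stands in for the sparse Bayesian network inference that a realistic pedigree would need.

```python
from itertools import product

ALLELES = ("A", "B", "O")
GENOTYPES = list(product(ALLELES, repeat=2))  # (paternal, maternal) pairs

# Hypothetical allele frequencies, for illustration only.
ALLELE_FREQ = {"A": 0.3, "B": 0.1, "O": 0.6}


def blood_type(g):
    """Deterministic penetrance: genotype -> blood type."""
    s = set(g)
    if s == {"A", "B"}:
        return "AB"
    return "A" if "A" in s else "B" if "B" in s else "O"


def prior(g):
    """Assumed genotype prior: independent draws of the two alleles."""
    return ALLELE_FREQ[g[0]] * ALLELE_FREQ[g[1]]


def transmission(child, father, mother):
    """Each parent transmits either of its two alleles with probability 1/2."""
    return (father.count(child[0]) / 2) * (mother.count(child[1]) / 2)


def child_blood_type_posterior(father_bt, mother_bt):
    """P(B(child) | B(father) = father_bt, B(mother) = mother_bt)."""
    scores = {}
    for gf, gm, gc in product(GENOTYPES, repeat=3):
        # Keep only parental genotypes consistent with the observed phenotypes.
        if blood_type(gf) != father_bt or blood_type(gm) != mother_bt:
            continue
        weight = prior(gf) * prior(gm) * transmission(gc, gf, gm)
        if weight == 0.0:
            continue
        bt = blood_type(gc)
        scores[bt] = scores.get(bt, 0.0) + weight
    total = sum(scores.values())
    return {bt: w / total for bt, w in scores.items()}


if __name__ == "__main__":
    print(child_blood_type_posterior("A", "B"))
    print(child_blood_type_posterior("AB", "O"))
```

Only the parents' phenotypes are observed here, so the sum runs over every parental genotype compatible with them. Adding observed genotypes or further relatives would add factors to the same product.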


UB CSE 574 - Genetic Inheritance and Bayesian Networks
