Stat 246, Lecture 2, Part B
A quick survey of human 2-point linkage analysis

Mendel's two laws

Modern genetics began with Mendel's experiments on garden peas. He studied seven contrasting pairs of characters, including:
  The form of ripe seeds: round, wrinkled
  The color of the seed albumen: yellow, green
  The length of the stem: long, short

Mendel's first law
Characters are controlled by pairs of genes which separate during the formation of the reproductive cells (meiosis).
  Aa -> A, a

Mendel's second law
When two or more pairs of genes segregate simultaneously, they do so independently.
  Aa Bb -> AB, Ab, aB, ab

Exceptions to Mendel's Second Law

Morgan's fruitfly data (1909): 2,839 flies
  Eye color: A = red, a = purple
  Wing length: B = normal, b = vestigial

  AABB x aabb -> AaBb
  AaBb x aabb -> offspring as follows:

  Genotype   Exp    Obs
  AaBb       710    1,339
  Aabb       710      151
  aaBb       710      154
  aabb       710    1,195

Morgan's explanation

[Figure: chromosomes in the F1 and F2 generations. Crossover has taken place.]

  Parental types: AaBb, aabb
  Recombinants: Aabb, aaBb

The proportion of recombinants between the two genes (or characters) is called the recombination fraction between these two genes. It is usually denoted by r or θ.

For Morgan's traits: r = (151 + 154)/2839 = 0.107
  If r < 1/2: the two genes are said to be linked.
  If r = 1/2: independent segregation (Mendel's second law).

Now we move on to (small) pedigrees.

One locus: founder probabilities

Founders are individuals whose parents are not in the pedigree. They may or may not be typed. Either way, we need to assign probabilities to their actual or possible genotypes. This is usually done by assuming Hardy-Weinberg (H-W) equilibrium. (There is a good story here.)

If the frequency of D is .01, H-W says
  pr(Dd) = 2 x .01 x .99

Genotypes of founder couples are (usually) treated as independent. For a couple in which pop (1) is Dd and mom (2) is dd:
  pr(pop Dd, mom dd) = (2 x .01 x .99) x (.99)^2

One locus: transmission probabilities

Children get their genes from their parents' genes, independently, according to Mendel's laws; also independently for different children.

[Pedigree: pop 1 (Dd) and mom 2 (Dd); kid 3 (dd).]

  pr(kid 3 dd | pop 1 Dd, mom 2 Dd) = 1/2 x 1/2
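The calculations so far (Morgan's recombination fraction, Hardy-Weinberg founder probabilities, and the transmission probability for a single child) can be checked with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, genotypes are written as two-character strings such as "Dd", and exact fractions are used so that the results match the hand calculations.

```python
from fractions import Fraction
from itertools import product


def recombination_fraction(parental, recombinant):
    """Proportion of recombinant offspring among all offspring."""
    total = sum(parental) + sum(recombinant)
    return sum(recombinant) / total


def hardy_weinberg(p):
    """Genotype probabilities at a two-allele locus where pr(D) = p."""
    q = 1 - p
    return {"DD": p * p, "Dd": 2 * p * q, "dd": q * q}


def transmission(child, father, mother):
    """pr(child genotype | parental genotypes) under Mendel's first law.

    Each parent passes on one of its two alleles with probability 1/2,
    so we count which of the four equally likely allele pairs give the child.
    """
    hits = sum(
        1 for a, b in product(father, mother) if sorted(a + b) == sorted(child)
    )
    return Fraction(hits, 4)


if __name__ == "__main__":
    # Morgan's counts: parental types AaBb, aabb; recombinants Aabb, aaBb.
    r = recombination_fraction([1339, 1195], [151, 154])
    print(round(r, 3))  # 0.107

    # Founder couple: pop Dd, mom dd, treated as independent.
    hw = hardy_weinberg(Fraction(1, 100))
    print(hw["Dd"] * hw["dd"])

    # Kid 3 is dd, both parents Dd: 1/2 x 1/2.
    print(transmission("dd", "Dd", "Dd"))  # 1/4
```

Counting the four equally likely allele pairs also gives 1/2 for a Dd child of two Dd parents, since there are two ways to receive one D and one d.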
One locus: transmission probabilities II

[Pedigree: parents 1 (Dd) and 2 (Dd); children 3 (dd), 4 (Dd) and 5 (DD).]

  pr(3 dd & 4 Dd & 5 DD | 1 Dd & 2 Dd)
    = (1/2 x 1/2) x (2 x 1/2 x 1/2) x (1/2 x 1/2)

The factor 2 comes from summing over the two mutually exclusive and equiprobable ways 4 can get a D and a d.

One locus: penetrance probabilities

Pedigree analyses usually suppose that, given the genotype at all loci, and in some cases age and sex, the chance of having a particular phenotype depends only on genotype at one locus, and is independent of all other factors: genotypes at other loci, environment, genotypes and phenotypes of relatives, etc.

  Complete penetrance: pr(affected | DD) = 1
  Incomplete penetrance: pr(affected | DD) = .8

One locus: penetrance II

Age and sex-dependent penetrance (see liability classes):
  pr(affected | DD, male, 45 y.o.) = .6

One locus: putting it all together

In general, shaded means affected, blank means unaffected.

[Pedigree: parents 1 (Dd) and 2 (Dd); children 3 (dd), 4 (Dd) and 5 (DD).]

Assume penetrances
  pr(affected | DD) = .8
  pr(affected | Dd) = .3
  pr(affected | dd) = .1
and that allele D has frequency .01.

The probability of this pedigree is the product:
  (2 x .01 x .99 x .7) x (2 x .01 x .99 x .3)
    x (1/2 x 1/2 x .9) x (2 x 1/2 x 1/2 x .7) x (1/2 x 1/2 x .8)

One locus: putting it all together II

Note that we begin by multiplying founder gene frequencies, followed by founder penetrances. Next we multiply transmission probabilities, followed by penetrance probabilities of offspring, using their independence given parental genotypes.

If there are missing or incomplete data, we must sum over all mutually exclusive possibilities compatible with the observed data.

The general strategy of beginning with founders, then non-founders, and multiplying and summing as appropriate, has been codified in what is known as the Elston-Stewart algorithm for calculating probabilities over pedigrees. It is one of the two widely used approaches. The other is termed the Lander-Green algorithm and takes a quite different approach. Both are hidden Markov models, both have compute time/space limitations with multiple individuals/loci (see next), and
extending them beyond their current limits is the ongoing outstanding problem.

Two loci: linkage and recombination

[Pedigree: parents 1 (DD, TT) and 2 (dd, tt); son 3 (Dd, Tt).]

Son 3 produces sperm with D-T, D-t, d-T or d-t in proportions:

          D          d
  T    (1-θ)/2     θ/2       1/2
  t     θ/2      (1-θ)/2     1/2
        1/2        1/2

Two loci: linkage and recombination II

Son produces sperm with DT, Dt, dT or dt in proportions:

          D          d
  T    (1-θ)/2     θ/2       1/2
  t     θ/2      (1-θ)/2     1/2
        1/2        1/2

  θ = 1/2: independent assortment (cf. Mendel), unlinked loci
  θ < 1/2: linked loci
  θ ≈ 0: tightly linked loci

Note: θ > 1/2 is never observed.

If the loci are linked, then D-T and d-t are parental haplotypes, and D-t and d-T are recombinant.

Two loci: estimation of recombination fractions

[Figure: three-generation pedigree. Grandparents DD TT and dd tt; father Dd Tt; mother dd tt; children typed at both loci.]

Recombination is only discernible in the father. Here the estimate of θ is 1/4.

This is called the phase-known double backcross pedigree (why?).

Two loci: phase

Suppose we have data on two linked loci as follows:

[Pedigree: father Dd Tt; mother dd tt; daughter Dd Tt.]

Was the daughter's D-T combination from her father parental or recombinant?

This is the problem of phase: did father get D-T from one parent and d-t from the other? If so, then the daughter's paternally derived haplotype is parental. If father got D-t from one parent and d-T from the other, these would be parental, and daughter's paternally derived haplotype would be recombinant.

Two loci: dealing with phase

Phase is incompleteness in genetic information, specifically in the parental origin of alleles at heterozygous loci.

Often it can be inferred with certainty from genotype data on parents.

Often it can be inferred with high probability from genotype data on several children.

In general, genotype data on relatives helps, but does not necessarily determine phase.

In practice, probabilities must be calculated under all phases compatible with the observed data, and added together. The need to do so is the main reason linkage analysis is computationally intensive, especially with multilocus analyses.

Two loci …
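The gamete proportions, the phase-known estimate of θ, and the summation over phases described in the two-locus slides above can be sketched in Python as follows. The function names are hypothetical, and the list of paternal gametes is illustrative data chosen to contain one recombinant in four, not the actual pedigree from the lecture. The phase-unknown calculation assumes the two possible phases are equally likely a priori, which is an added assumption rather than something stated in the slides.

```python
from fractions import Fraction


def gamete_probs(theta):
    """Gamete proportions for a Dd Tt parent whose parental haplotypes are DT and dt."""
    return {
        "DT": (1 - theta) / 2,  # parental
        "dt": (1 - theta) / 2,  # parental
        "Dt": theta / 2,        # recombinant
        "dT": theta / 2,        # recombinant
    }


def estimate_theta(gametes):
    """Phase-known estimate: fraction of transmitted gametes that are recombinant."""
    recombinants = sum(1 for g in gametes if g in ("Dt", "dT"))
    return Fraction(recombinants, len(gametes))


def phase_known_likelihood(theta, gametes):
    """Probability of the transmitted gametes given phase DT/dt in the parent."""
    probs = gamete_probs(theta)
    likelihood = 1
    for g in gametes:
        likelihood *= probs[g]
    return likelihood


def phase_unknown_likelihood(theta, gametes):
    """Sum over both phases, each weighted 1/2 (assumed prior)."""
    # Under the alternative phase (Dt/dT parental), each gamete plays the
    # role of the gamete with the other T-locus allele under phase DT/dt.
    swap = {"DT": "Dt", "Dt": "DT", "dt": "dT", "dT": "dt"}
    alternative = [swap[g] for g in gametes]
    half = Fraction(1, 2)
    return half * phase_known_likelihood(theta, gametes) + half * phase_known_likelihood(
        theta, alternative
    )


if __name__ == "__main__":
    paternal_gametes = ["DT", "DT", "dt", "Dt"]  # illustrative: one recombinant
    print(estimate_theta(paternal_gametes))  # 1/4

    # A single child with unknown paternal phase gives the same likelihood
    # for every theta, so it carries no information about linkage.
    for theta in (Fraction(1, 10), Fraction(1, 4), Fraction(1, 2)):
        print(theta, phase_unknown_likelihood(theta, ["DT"]))
```

With phase unknown and a single child, the two phase terms add to 1/4 whatever the value of θ, which is one way to see why data on parents or on several children are needed before phase, and hence recombination, can be inferred.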