Berkeley STATISTICS 246 - Lecture Notes


How many genes?
Mapping mouse traits, cont.

Lecture 3, Statistics 246
January 27, 2004

Inferring linkage and mapping markers

We now turn to deciding when two marker loci are linked, and if so, estimating the map distance between them. Then we go on to create a full (marker) map of each chromosome, relative to which we can map trait genes. With these preliminaries completed, we can map trait loci.

The LOD score

Suppose that we have two marker loci, and we don't know whether or not they are linked. A natural way to address this question is to carry out a formal test of the null hypothesis H: r = 1/2 against the alternative K: r < 1/2, using the marker data from our cross. The test statistic almost always used in this context is log10 of the ratio of the likelihood at the maximum likelihood estimate $\hat{r}$ to that at the null, r = 1/2, i.e.

$$\mathrm{LOD} = \log_{10}\left\{ \frac{L(\hat{r})}{L(1/2)} \right\}$$

Calculating the LOD score

Recall that the (log) likelihood here is based on the multinomial distribution for the allocation of n = 132 intercross mice into their nine 2-locus genotypic categories. As we saw earlier, it can be written

$$\log_{10} L(r) = \sum_i n_i \log_{10} p_i(r),$$

and so we take the difference between this function evaluated at $\hat{r}$ and at r = 1/2, which is

$$\mathrm{LOD} = \sum_i n_i \log_{10}\left\{ p_i(\hat{r}) / q_i \right\},$$

where $q_i$ is 1/16, 1/8 or 1/4, depending on i.

Null probabilities of 2-locus genotypes

                 L2
             A      H      B
    L1  A   1/16   1/8    1/16
        H   1/8    1/4    1/8
        B   1/16   1/8    1/16

This is just putting r = 1/2 in an earlier table.

Exercise: Suggest some different test statistics to discriminate between the null H and the alternative K. How do they perform in comparison to the LOD?

Using the LOD score

Normal statistical practice would have us setting a type 1 error in a given context (cross, sample size), and determining the cut-off for the LOD which would achieve approximately the desired error under the null hypothesis.
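The LOD calculation described above can be sketched in Python as follows. This is a minimal illustration, not the course's own code: the genotype counts are hypothetical (chosen only to sum to n = 132), the genotype probabilities p_i(r) are the standard intercross formulas, assumed to match the earlier table the notes refer to, and the grid search for the maximum likelihood estimate is a simplification.

```python
import math

# Genotype order (locus 1, locus 2): AA, AH, AB, HA, HH, HB, BA, BH, BB.
def genotype_probs(r):
    """Two-locus intercross genotype probabilities at recombination fraction r
    (standard formulas; assumed to match the notes' earlier table)."""
    par = (1 - r) ** 2 / 4              # AA, BB: two non-recombinant gametes
    one = r * (1 - r) / 2               # AH, HA, HB, BH: exactly one recombinant gamete
    dbl = r ** 2 / 4                    # AB, BA: two recombinant gametes
    hh = ((1 - r) ** 2 + r ** 2) / 2    # HH: both gametes, or neither, recombinant
    return [par, one, dbl, one, hh, one, dbl, one, par]

def log10_lik(counts, r):
    """Multinomial log10-likelihood up to a constant; zero counts contribute nothing."""
    return sum(n * math.log10(p)
               for n, p in zip(counts, genotype_probs(r)) if n > 0)

def lod_score(counts, grid_size=1000):
    """Return (r_hat, LOD), maximizing over a grid on 0 < r <= 1/2."""
    grid = [0.5 * k / grid_size for k in range(1, grid_size + 1)]
    r_hat = max(grid, key=lambda r: log10_lik(counts, r))
    return r_hat, log10_lik(counts, r_hat) - log10_lik(counts, 0.5)

# Hypothetical counts for the nine genotype classes (sum = 132).
counts = [28, 4, 0, 5, 58, 4, 0, 5, 28]
r_hat, lod = lod_score(counts)
print(f"r_hat = {r_hat:.3f}, LOD = {lod:.2f}")
```

Setting r = 1/2 in `genotype_probs` reproduces the null probabilities 1/16, 1/8 and 1/4, so the LOD is zero at the null by construction.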
This approach is rarely adopted in genetics, where tradition dictates the use of more stringent thresholds, which take into account the multiple testing common in linkage mapping. It was originally motivated by a Bayesian argument, and in fact, Bayesian approaches to linkage analysis are increasingly popular. Let us make use of Bayes' formula in the form

    log10 posterior odds = log10 prior odds + LOD,

where the odds are for linkage. With 20 chromosomes, which we might assume to be approximately the same size, and not too long, the prior probability of two random loci being on the same chromosome, and hence linked, is about 1/20. In order to overcome these prior odds against linkage, and achieve reasonable posterior odds, say 100:1, we would want a LOD of at least 3. (Prior odds of 1:19 give log10 prior odds of about -1.3, so posterior odds of 100:1 correspond to a LOD of about 3.3.)

Linkage groups

And so it has come to pass that a LOD must be > 3 to get people's attention. We'll be a little more precise later.

The next step is to define what are called linkage groups. These partition the markers into classes, every pair of markers in a class being either closely linked (i.e. r ≈ 0), or being connected by a chain of markers, each consecutive pair of which is closely linked. In practice, we might define closely linked to be something like a) $\hat{r}$ < c1, and b) LOD($\hat{r}$) > c2, where e.g. c1 = 0.2, c2 = 3.

Forming linkage groups, cont.

When one tries to form linkage groups, it is not unusual to have to vary c1 and c2 a little, until all markers fall into a group of more than just one marker. When this is done, it is hoped that the linkage groups correspond to chromosomes. If the chromosome number of the species is known, and that coincides with the number of linkage groups, this is a reasonable presumption.
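The chaining rule in the definition of linkage groups amounts to finding the connected components of a graph whose edges join closely linked marker pairs. A minimal Python sketch, assuming pairwise estimates of the recombination fraction and LOD are already available; the function name `linkage_groups`, the layout of `pair_stats` and all numbers are hypothetical, for illustration only:

```python
def linkage_groups(markers, pair_stats, c1=0.2, c2=3.0):
    """Group markers by chaining closely linked pairs (union-find).

    pair_stats maps (marker_a, marker_b) -> (r_hat, lod); a pair counts as
    closely linked when r_hat < c1 and lod > c2.
    """
    parent = {m: m for m in markers}

    def find(m):
        # Follow parent pointers to the group representative (path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (m1, m2), (r_hat, lod) in pair_stats.items():
        if r_hat < c1 and lod > c2:
            parent[find(m1)] = find(m2)   # merge the two groups

    groups = {}
    for m in markers:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())

# Hypothetical pairwise results: M1-M3 fail the thresholds directly,
# but are chained together through M2.
markers = ["M1", "M2", "M3", "M4", "M5"]
pair_stats = {
    ("M1", "M2"): (0.05, 12.0),
    ("M2", "M3"): (0.10, 8.0),
    ("M1", "M3"): (0.25, 2.5),
    ("M4", "M5"): (0.08, 9.5),
    ("M3", "M4"): (0.45, 0.1),
}
groups = linkage_groups(markers, pair_stats)
print(groups)  # [['M1', 'M2', 'M3'], ['M4', 'M5']]
```

Varying `c1` and `c2` and rerunning mirrors the trial-and-error adjustment of thresholds until no marker is left in a group of its own.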
But much can happen to dash this hope: one may have two linkage groups corresponding to different arms of the same chromosome, and not know that; one can have a marker at the end of one chromosome "linked" to a marker at the end of another chromosome, though this should be rare if there is plenty of data; and so on.

Ordering linkage groups

Next we want to order the markers in a linkage group (ideally, on a chromosome). How do we do that? An initial ordering can be made by starting with one of the markers, M1 say, from the most distant pair, distance here being recombination fraction or map distance. Call M2 the closest marker to M1, and continue in this way.

Now we want to confirm our ordering. One way is to calculate a (maximized) log likelihood for every ordering, and select the one with the largest log likelihood. But if we have (say) 11 markers on a chromosome, this is 11! ≈ 4×10^7 orders. What people often do is take moving k-tuples of markers, and optimize the order of each, e.g. with k = 3 or 4. Whichever strategy one adopts, multi (i.e. > 2) locus methods are needed.

Likelihoods for 3-locus data

Suppose that we have 3 markers M1, M2 and M3, in that order. How do we calculate the log likelihood of the associated 3-locus marker data from our intercross?

Recalling the discussion preceding the Punnett square of the last lecture, the parental haplotypes here are a1a2a3 and b1b2b3, while there are no fewer than 6 forms of recombinant haplotypes: the four single recombinants a1a2b3, a1b2b3, b1b2a3 and b1a2a3, and the two double recombinants a1b2a3 and b1a2b3.

Proceeding as before, we calculate the probability of each of these in terms of the recombination fractions r1 and r2 across intervals M1-M2 and M2-M3, respectively. For simplicity, we assume the Poisson model, with independence of recombination across disjoint intervals. For example, a1a2a3 would have probability (1-r1)(1-r2)/2, a1a2b3 would have probability (1-r1)r2/2, while a1b2a3 would have probability r1r2/2.
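The haplotype probabilities just described can be tabulated with a short Python sketch. It assumes independence of recombination across the two intervals, as stated above, and a factor of 1/2 per haplotype (either parental strand is equally likely at M1), so that the eight probabilities sum to one. The function name `haplotype_probs` and the values of r1 and r2 are hypothetical.

```python
from itertools import product

def haplotype_probs(r1, r2):
    """Probabilities of the 8 three-locus gamete haplotypes from an
    a1a2a3 / b1b2b3 parent, assuming independent recombination in the
    intervals M1-M2 (fraction r1) and M2-M3 (fraction r2)."""
    probs = {}
    for hap in product("ab", repeat=3):
        p = 0.5                                   # allele at M1: a or b, equally likely
        p *= r1 if hap[0] != hap[1] else 1 - r1   # switch (or not) between M1 and M2
        p *= r2 if hap[1] != hap[2] else 1 - r2   # switch (or not) between M2 and M3
        probs["".join(hap)] = p
    return probs

# Hypothetical recombination fractions, for illustration only.
probs = haplotype_probs(0.1, 0.2)
for hap, p in sorted(probs.items()):
    print(hap, round(p, 4))
```

Here the string "aab" stands for the haplotype a1a2b3, and so on; the two parental haplotypes are "aaa" and "bbb", and the two double recombinants are "aba" and "bab".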
We would do this for every one of the 8 paternal and 8 maternal haplotypes, and then collect them up to assign probabilities for each of the 3^3 = 27 3-locus genotypes (AAA, AAH, …, BBB), and maximize the multinomial likelihood in the parameters r1 and r2. This is just as in the 2-locus case.

Multilocus linkage: #loci > 3

It should have become clear by now that the strategy just outlined is not going to work too easily when there are (say) 11 loci in a linkage group.

In that case, haplotypes are strings of the form a1a2b3 … a10b11, where there are just 2 parental and 2^11 - 2 distinct recombinant haplotypes. The number of paternal-maternal haplotype combinations is the square of the total number of haplotypes, 2^11, and they must be mapped into 3^11 11-locus genotypes, and a multinomial MLE carried out to estimate 10 recombination fractions. What can be done?

In 1987 the first large scale human genetic map was

