Stanford CS 374 - Study Notes

Am. J. Hum. Genet. 70:1434–1445, 2002

Minimum-Recombinant Haplotyping in Pedigrees

Dajun Qian¹ and Lars Beckmann²

¹Department of Preventive Medicine, University of Southern California, Los Angeles; and ²Deutsches Krebs Forschungs Zentrum, Heidelberg

Received November 30, 2001; accepted for publication March 6, 2002; electronically published April 25, 2002.
Address for correspondence and reprints: Mr. Dajun Qian, Department of Preventive Medicine, University of Southern California, 1540 Alcazar Street, CHP 218, Los Angeles, CA 90089-9010. E-mail: [email protected]
© 2002 by The American Society of Human Genetics. All rights reserved. 0002-9297/2002/7006-0006$15.00

This article presents a six-rule algorithm for the reconstruction of multiple minimum-recombinant haplotype configurations in pedigrees. The algorithm has three major features: First, it allows exhaustive search of all possible haplotype configurations under the criterion that there are minimum recombinants between markers. Second, its computational requirement is on the order of $O(J^2L^3)$ in current implementation, where J is the family size and L is the number of marker loci under analysis. Third, it applies to various pedigree structures, with and without consanguinity relationship, and allows missing alleles to be imputed, during the haplotyping process, from their identical-by-descent copies. Haplotyping examples are provided using both published and simulated data sets.

Introduction

Haplotyping analysis in pedigrees refers to the reconstruction of haplotypes from phase-unknown genotype data within each pedigree. Haplotype data are extremely valuable in the mapping of disease-susceptibility genes, particularly in the identification of genes related to complex diseases. The technique for experimentally derived whole-genome haplotypes is becoming available (Douglas et al. 2001), and such exact haplotype data are expected to have significant impact on future gene-mapping studies, especially in unrelated individuals. However, the reconstruction of haplotypes from conventional genotype data is still the major choice in most haplotype-based studies, because of both the lower cost of genotyping and the availability of fast and accurate haplotyping algorithms.

Haplotyping analysis in a pedigree involves the consideration of the whole space H of all possible distinct haplotype configurations. The whole space H can be partitioned into subsets $H_r$, where $r \geq 0$ and $H_r$ is the space of haplotype configurations with r recombinants. We denote $H_{\min}$ as the space of all possible minimum-recombinant haplotype configurations (MRHCs). Tapadar et al. (2000) proposed a minimum-recombinant haplotyping (MRH) algorithm that is based on certain evolutionary principles, to reconstruct at least one MRHC in each run, but their method seems difficult to extend to the handling of missing genotypes and is not expected to find all MRHCs in limited computations. We have successfully utilized the MRHCs reconstructed by their evolution-based algorithm in several haplotype-based analyses (Qian and Thomas 2001), although the rationale for and pitfalls of ignoring haplotype configurations in the space $H - H_{\min}$ require further investigation. Wijsman (1987) proposed a 20-rule algorithm, and O’Connell (2000) described a genotype-elimination algorithm; both can be used for the reconstruction of zero-recombinant haplotypes in large pedigrees. These two methods are designed to reconstruct haplotypes without recombinants and can be used to analyze SNP data in a region that is small enough that the expected number of recombinants in the pedigree is very close to 0. Likelihood-based haplotyping methods are often flexible enough to tackle large and complex pedigrees (Sobel et al. 1996; Lin and Speed 1997; Thomas et al.
2000), but the price of this flexibility is complexity and slowness in computation.

The present article presents a six-rule MRH algorithm that exhaustively searches all possible MRHCs in large pedigrees with many markers and allows missing genotype data to be imputed from the identical-by-descent (IBD) alleles during the haplotyping process. Haplotyping results of data for a published pedigree (Litt et al. 1994) are compared with those reported in other articles. Results in simulated data sets are compared between our rule-based MRH method and Tapadar’s evolution-based MRH method.

Methods

Definitions, Notation, and Assumptions

To describe the haplotyping algorithm, we consider a pedigree of J family members and a set of L linked marker loci, and we define several terms under consideration: A “parent” is a family member with at least one child, a “founder” is a parent without his or her own parent, an “offspring” is a family member with at least one parent, and an “individual” is any family member. An individual is defined as “genotyped” at locus l if the genotype data at locus l either are experimentally derived from the DNA sample or can be determined from the first-degree relatives. A genotyped parent is defined as “informative” if the individual has at least one genotyped offspring. An ungenotyped parent is defined as “informative” if the individual has a genotyped spouse and has transmitted both haplotypes to multiple offspring. An ungenotyped parent is defined as “partially informative” if the individual has a genotyped spouse and has transmitted one haplotype to an offspring. A genotyped offspring is defined as “informative” if the individual has at least one genotyped parent.

Parental source (PS) and grandparental source (GS) are the two types of information identified for the two constituent alleles at each locus in each family member in the haplotyping analysis.
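
As an illustration of the role definitions given earlier in this subsection (“parent,” “founder,” “offspring,” and “individual”), the following is a minimal sketch and not part of the article's algorithm. The `Member` record, the `classify_roles` function, and the example pedigree are all hypothetical names introduced here; the sketch assumes each member records at most one father and one mother, with `None` meaning that parent is absent from the pedigree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Member:
    """Hypothetical pedigree record; a parent ID of None means 'not in the pedigree'."""
    id: str
    father: Optional[str] = None
    mother: Optional[str] = None


def classify_roles(members):
    """Map each member ID to the set of roles it holds under the quoted definitions."""
    # Anyone listed as some member's father or mother has at least one child.
    has_child = set()
    for m in members:
        for parent_id in (m.father, m.mother):
            if parent_id is not None:
                has_child.add(parent_id)

    roles = {}
    for m in members:
        member_roles = {"individual"}  # every family member is an individual
        has_parent = m.father is not None or m.mother is not None
        if m.id in has_child:
            member_roles.add("parent")
            if not has_parent:
                member_roles.add("founder")  # a parent without his or her own parent
        if has_parent:
            member_roles.add("offspring")
        roles[m.id] = member_roles
    return roles


# Hypothetical three-generation pedigree (IDs are made up for illustration).
pedigree = [
    Member("gf"),
    Member("gm"),
    Member("mo", father="gf", mother="gm"),
    Member("fa"),
    Member("kid", father="fa", mother="mo"),
]
roles = classify_roles(pedigree)
```

Note that the roles overlap: in this example `mo` is both a “parent” and an “offspring,” while `gf`, `gm`, and `fa` are founders.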
“PS” refers to the allele that is paternally or maternally inherited, and “GS” refers to the PS of each parental allele. An individual is haplotyped at locus l if the PS of the two constituent alleles at locus l has been assigned.

For a nuclear family or a parent-offspring trio, with both parents ($j = 1, 2$) and N offspring ($j = 3, \ldots, N + 2$) at locus l, we use the following definitions:

$a_l, b_l$ denote the two constituent alleles of parent 1;
$c_l, d_l$ denote the two constituent alleles of parent 2;
$e_{j,l}, f_{j,l}$ denote the two constituent alleles of offspring j;
$A_l, B_l$ denote the paternal and maternal alleles of parent 1;
$C_l, D_l$ denote the paternal and maternal alleles of parent 2;
$E_{j,l}, F_{j,l}$ denote the paternal and maternal alleles of offspring j;
denote
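
To make the idea of assigning a parental source concrete, here is a minimal single-locus sketch for one parent-offspring trio. It is not the six-rule algorithm of the article: it only enumerates which ordered (paternal, maternal) pairs for the offspring are consistent with the two parental genotypes, with no missing data. The function name `parental_source_assignments` is hypothetical, and treating parent 1 as the father and parent 2 as the mother is an assumption made only for this illustration.

```python
def parental_source_assignments(parent1, parent2, offspring):
    """List ordered (paternal, maternal) allele pairs consistent with a trio.

    Each argument is an unordered pair of allele labels at one locus.
    Assumption (not from the source): parent1 is the father, parent2 the mother.
    """
    e, f = offspring
    # The two possible PS orderings of the offspring's constituent alleles.
    # A set is used so that a homozygous offspring yields one candidate, not two.
    candidates = {(e, f), (f, e)}
    return sorted(
        (paternal, maternal)
        for paternal, maternal in candidates
        if paternal in parent1 and maternal in parent2
    )


# PS is fully determined: allele 1 can only come from parent 1, allele 3 from parent 2.
resolved = parental_source_assignments((1, 2), (3, 4), (1, 3))

# PS is ambiguous: both parents and the offspring are heterozygous 1/2.
ambiguous = parental_source_assignments((1, 2), (1, 2), (1, 2))

# No consistent assignment: the offspring cannot be 1/1 if parent 2 carries only allele 2.
inconsistent = parental_source_assignments((1, 1), (2, 2), (1, 1))
```

An offspring with exactly one returned pair would count as haplotyped at this locus in the sense defined above; two returned pairs mean the PS cannot be assigned from this trio and locus alone.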

