Berkeley STATISTICS 246 - Genes and MS in Tasmania


Slide titles:
• Nature and number of relatives needed to give accurate haplotypes
• Simulation Study
• Genotyping
• Data collected in the Tasmanian MS Study
• MS study in Tasmania: data
• Some issues associated with the data preparation
• Errors, errors and errors
• Error checking: a little detail
• Unforeseen Problem
• Example output showing partial allele slippage
• Isolated bin (probably slippage)
• Example of highly polymorphic marker
• Box 1 - all alleles shifted +1bp
• Box 1 Alleles 150, 154 shifted +1bp
• Fixing allele calls
• Obtaining haplotypes
• Genehunter Output
• Extracting untransmitted haplotypes from GENEHUNTER
• Assessing haplotype sharing
• Nonparametric haplotype sharing analysis
• Haplotype sharing statistics for genome wide scan data cf. fine mapping
• Towards a sharing statistic
• Sharing drop-off & allelic heterogeneity
• Haplo_clusters (Melanie Bahlo)

Genes and MS in Tasmania, cont.
Lecture 6, Statistics 246
February 5, 2004

Nature and number of relatives needed to give accurate haplotypes

Exercise. Explain why it is that when we have both sets of parental genotypes, and the markers are reasonably polymorphic, we can reconstruct an individual’s haplotypes with high probability. What are the difficult cases?

If we have no parents, or just one parent, and grandparents’, siblings’ or offspring’s genotypes are available, which are most informative for an individual’s haplotype reconstruction?

Simulation Study

Simulated many different types of pedigrees 300 times each to see which constellations of relatives give the best chance of reconstructing haplotypes correctly. Ranking the contributions in order of importance (assuming that the proband has been genotyped):
1. Parents
2. Grandparents & Siblings
3. Offspring

Genotyping

We used STR (short tandem repeat) markers, also known as microsatellite markers.

…AGCTAGCGCGC….GCGCGGCATTA…
…AGCTAGCGCGC….GCGCGGCGCATTA…

Eventual plan: 5 cM genome-wide scan (~800 markers) with dinucleotide STRs

Data collected in the Tasmanian MS Study

MS study in Tasmania: data

• Collected 170 (out of an estimated 300) MS cases and 105 controls, and a constellation of ~4 relatives for each
• Created a case/control study with 338 case haplotypes and 208 control haplotypes
• Genotyping carried out at the Australian Genome Research Facility: almost 1 million genotypes (the 2nd largest genotyping project ever carried out in Australia)

Relatives of cases and controls

               Cases (170)    Controls (105)
Grandparents   29 (4%)        12 (3%)
Parents        174 (22%)      123 (30%)
Siblings       374 (46%)      215 (52%)
Spouse         67 (7%)        14 (3%)*
Offspring      168 (20%)      50 (12%)
Other          17 (1%)        0
Total          809            414*

Some issues associated with the data preparation

Errors, errors and errors

• Marker location errors: allocation to wrong chromosome, wrong order, map distances out. Maps: Généthon (Dib et al, 1996), Marshfield (Broman et al, 1998), DeCODE (Kong et al, 2001), which included a physical map
• Pedigree (relationship) errors: PREST (McPeek & Sun)
• Genotyping errors (caused by assay or analysis): ones causing Mendelian inconsistencies and ones which don’t. Tools: PEDCHECK (O’Connell), SIBMED (Douglas et al), MERLIN (Abecasis et al)
• Data handling errors (e.g. mixed up samples)
• Binning (allele labelling) errors: inconsistencies over time

Error checking: a little detail

• With genome-wide genotypes, moderately close relationships can be confirmed or falsified: 7 parentage errors (6 incorrect fathers, 1 incorrect mother)
• Mix-ups typically stand out (2 DNA sample swaps, 2 duplicate samples, 1 case of contaminated DNA, 1 adopted child unrelated to anyone else)
• Mendelian checks picked up many genotyping errors: 1,472 inconsistencies (0.15% genotyping errors); 15 markers removed; using Mendel on the X found 3 data entry errors and 4 cases where the recorded sex was wrong
• Multilocus methods can pick up more, in effect identifying close double recombinants: 58 errors inferred by this method and set to missing
• Other errors demanded special methods.

Unforeseen Problem

Marker binning was not consistent over time. Genotyping at 796 markers took over 2 years.

Heuristic approach:
• Look at all markers with allele bin differences of 1 bp
• Seek large frequency differences: χ² test, allele by box
• Carry out allele binning slippage test (for pairs of adjacent alleles and boxes): χ² test
• Markers were flagged if any of the above applied, and were then examined for systematic trends

A founder is an individual with no parent in the sample.

Example output showing partial allele slippage

Figure annotations:
• Absolute frequencies for a given allele (106) in each box are shown in (time) order of genotyping
• Alleles in size order
• Summary information
• Note slippage of allele 104 into allele 106 for Box 7 (yellow)
• (Time) order of genotyping: Box 1, 2+3, 5, 6, 7, 21-23, 24-27
• Numbers indicate number of individuals in each box

Isolated bin (probably slippage)

Example of highly polymorphic marker

Box 1 - all alleles shifted +1bp

Box 1 Alleles 150, 154 shifted +1bp

Fixing allele calls

Need to track changes carefully

Obtaining haplotypes

Haplotypes were reconstructed using the Lander-Green-Kruglyak algorithm (Genehunter/Merlin/Allegro). We’ll go into the details of the algorithm later this lecture, or in the next. Appropriate case and control datasets with these haplotypes were then prepared. Here’s how, from Genehunter output.

Genehunter Output

• The genotype data for family MS003 (input)

MS003 301 303 302 2 2 5 8 10 11 5 7 3 4 3 5 7 9 1 8 1 6 2 3 5 6 5 5
MS003 302 0 0 2 1 2 8 9 11 0 0 7 4 0 0 0 0 0 0 1 6 0 0 0 0 5 6
MS003 303 0 0 1 1 5 5 4 10 0 0 7 3 0 0 0 0 0 0 1 1 0 0 0 0 5 8
MS003 304 303 302 2 0 2 5 4 9 5 5 7 7 1 4 4 6 4 5 1 1 3 4 4 6 6 8
MS003 305 303 302 1 0 5 8 10 11 5 7 3 4 3 5 7 9 1 8 1 6 2 3 5 6 5 5

• The haplotype

***** MS003 0.000
302 0 0 1      8 11 7 4 5 9 8 6 3 6 5
               2 9 5 7 4 6 5 1 4 6 6
303 0 0 1      5 10 5 3 3 7 1 1 2 5 5
               5 4 5 7 1 4 4 1 3 4 8
301 303 302 2  5 10 5 3 3 7 1 1 2 5 5
               8 11 7 4 5 9 8 6 3 6 5
304 303 302 0  5 4 5 7 1 4 4 1 3 4 8
               2 9 5 7 4 6 5 1 4 6 6
305 303 302 0  5 10 5 3 3 7 1 1 2 5 5
               8 11 7 4 5 9 8 6 3 6 5
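The Mendelian checks mentioned under "Error checking" can be illustrated on the family MS003 input genotypes shown above. The following is a minimal sketch, not the procedure used in the study and not what PEDCHECK does in full. It assumes the input columns are family, individual, father, mother, sex and affection status, followed by one pair of alleles per marker, with 0 meaning untyped; that reading is inferred from the data. The names parse_ped_line, is_mendelian and mendelian_errors are hypothetical helpers introduced here. The check covers only autosomal parent-child trios in which both parents are in the sample.

```python
# Input genotype lines for family MS003, copied from the slide.
PED_LINES = """\
MS003 301 303 302 2 2 5 8 10 11 5 7 3 4 3 5 7 9 1 8 1 6 2 3 5 6 5 5
MS003 302 0 0 2 1 2 8 9 11 0 0 7 4 0 0 0 0 0 0 1 6 0 0 0 0 5 6
MS003 303 0 0 1 1 5 5 4 10 0 0 7 3 0 0 0 0 0 0 1 1 0 0 0 0 5 8
MS003 304 303 302 2 0 2 5 4 9 5 5 7 7 1 4 4 6 4 5 1 1 3 4 4 6 6 8
MS003 305 303 302 1 0 5 8 10 11 5 7 3 4 3 5 7 9 1 8 1 6 2 3 5 6 5 5
""".splitlines()

MISSING = 0  # assumed code for an untyped allele


def parse_ped_line(line):
    """Split one pedigree line into identifiers and a list of allele pairs."""
    fields = line.split()
    alleles = [int(a) for a in fields[6:]]
    return {
        "family": fields[0],
        "id": fields[1],
        "father": fields[2],
        "mother": fields[3],
        "sex": int(fields[4]),
        "status": int(fields[5]),
        # Consecutive allele columns form one genotype per marker.
        "genotypes": list(zip(alleles[0::2], alleles[1::2])),
    }


def is_mendelian(child, father, mother):
    """True if the child's genotype can be formed from one paternal and one
    maternal allele. Untyped genotypes are treated as uninformative."""
    if MISSING in child:
        return True

    def can_give(parent, allele):
        return MISSING in parent or allele in parent

    a, b = child
    return (can_give(father, a) and can_give(mother, b)) or (
        can_give(father, b) and can_give(mother, a)
    )


def mendelian_errors(lines):
    """Return (individual id, 1-based marker index) for each inconsistency."""
    people = {p["id"]: p for p in map(parse_ped_line, lines)}
    errors = []
    for person in people.values():
        father = people.get(person["father"])
        mother = people.get(person["mother"])
        if father is None or mother is None:
            continue  # founder, or a parent who is not in the sample
        for k, genotype in enumerate(person["genotypes"]):
            if not is_mendelian(genotype, father["genotypes"][k], mother["genotypes"][k]):
                errors.append((person["id"], k + 1))
    return errors


if __name__ == "__main__":
    print(mendelian_errors(PED_LINES))
```

On the MS003 lines as given, this check reports no inconsistencies. Changing one allele of a child to a value carried by neither parent makes that individual and marker appear in the returned list. Errors that do not produce a Mendelian inconsistency, such as the close double recombinants mentioned earlier, need the multilocus methods and are outside the scope of this sketch.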

