CMSC423: Bioinformatic Algorithms, Databases and Tools
Lecture 16: Genetics
CMSC423 Fall 2008

Reading assignment: Chapter 13

Slide 3. Gene association studies
- Goal: identify genes/markers associated with disease
- Example: BRCA1 is associated with risk of breast cancer
- Lots of hype in the news recently: companies promise to sequence your genome and tell you your likely ancestry and risk for disease
- Examples: www.23andme.com, www.decodeme.com, and many others

Slide 4. First definitions
- Genotype: genetic composition of our genome
- Phenotype: observable consequence of genotype, e.g., skin/hair color, IQ, disease state, etc.
- We have two copies of each chromosome (homologous chromosomes), each received from one of the parents
- Each gene can thus have two forms (alleles), e.g., A1, A2
- Each gene may be associated with a phenotype
- Dominant gene: phenotype of A1/A2 is the same as phenotype of A1/A1
- Recessive gene: otherwise

Slide 5. More definitions
- Genotype A/A is called homozygous: both chromosomes have the same allele
- Genotype A/B is called heterozygous: mother's and father's chromosomes disagree
- Notes:
  - phenotypes are not necessarily directly correlated with a single gene (polygenic traits)
  - probability that a gene correlates with a phenotype: penetrance
  - the link between genotype and phenotype can be qualitative (gene form matters) or quantitative (gene dosage matters)

Slide 6. Technology: what we measure
- The definition of allele/genotype depends on what we can measure (constantly changing)
- We are looking for things that differ within a population: polymorphic markers
- Restriction fragment length polymorphism (RFLP): measures presence/absence of particular sites in the genome
- Variable number tandem repeats (VNTR): specific repeat elements that occur in different copy numbers
- Single nucleotide polymorphisms (SNPs): single-letter differences between chromosomes; 500,000 characterized
- Copy number variants (CNV): genomic regions whose copy number differs between individuals

Slide 7. Allele frequencies
- Population genetics questions:
  - what alleles exist in a certain population?
  - what is the relative abundance of the alleles?
  - how diverse is a population?
- Given a locus (gene or genomic region), assume there are K possible alleles in a population and allele i occurs with frequency $p_i$
- How uniform is the locus in the population?
- Likelihood that two random individuals have the same allele (homozygosity):
  $F = \sum_{i=1}^{K} p_i^2$

Slide 8. Allele frequencies
- Usually we focus on the differences (heterozygosity):
  $H = 1 - F = 1 - \sum_{i=1}^{K} p_i^2$
- Interesting tidbit: most variation occurs within populations rather than between them, e.g., two Africans are more different from each other than the average African is from the average Chinese (see book for details)
- However, allele frequencies can be used to infer population membership for an individual

Slide 9. Who am I?
- My alleles are A1, B2, C1, D3 (assume homozygous for clarity)
- Am I European or Asian?
- Need to know $p_{A1,Europe}$, $p_{B2,Europe}$, $p_{C1,Europe}$, $p_{D3,Europe}$ and $p_{A1,Asia}$, $p_{B2,Asia}$, $p_{C1,Asia}$, $p_{D3,Asia}$
- $p(me \mid European) = p_{A1,Europe}^2 \times p_{B2,Europe}^2 \times p_{C1,Europe}^2 \times p_{D3,Europe}^2$
- similarly for $p(me \mid Asian)$
- if $p(me \mid European) > p(me \mid Asian)$, I can infer that I have European ancestry

Slide 10. Who am I?
- Inferring ancestry as described is overly simplistic
- Can do more fancy statistics
- However, any statistical approach is error prone: the answer is associated with a level of confidence, i.e., the probability that the answer is wrong (remember P-values)
- Beware of anyone who claims to infer your ancestry from genotype
- Beware of anyone who claims to infer disease susceptibility from genotype: need genetic risk counselors, not companies providing information "for entertainment purposes"

Slide 11. Recombination
- Genetic change is not only caused by mutations
- Recombination: DNA "jumps" between homologous chromosomes due to cross-over events
- [Figure: original haplotypes A1 B1 C1 D1 and A2 B2 C2 D2; haplotypes shown after cross-over: A1 B1 C2 D2 and A2 B2 C1 D1, A1 B1 C2 D1 and A2 B2 C1 D2]

Slide 12. Association studies
- The set of alleles on the same chromosome: haplotype
- If a particular allele of a gene is always associated with a phenotype (disease), is this gene causing the disease?
- Most likely the gene is associated with (located near) the gene causing the disease: their alleles always appear on the same haplotype
- Due to recombination, a set of original haplotypes rapidly becomes broken apart
- How likely is it that two alleles remain on the same haplotype (are linked) during evolution?

Slide 13. Linkage analysis
- Preservation of linkage depends on the distance between the genes and the rate of recombination
- Given two genes A, B, how can we estimate whether recombination occurred between them?
- How likely is it that A1 and B1 are both on the same haplotype by chance? $p(A_1)\,p(B_1)$
- How different is this from the observed ratios? Linkage disequilibrium:
  $D = p(A_1B_1) - p(A_1)\,p(B_1)$
  $D = p(A_2B_2) - p(A_2)\,p(B_2)$
  $D = p(A_1B_1)\,p(A_2B_2) - p(A_1B_2)\,p(A_2B_1)$

Slide 14. Linkage analysis
- Linkage disequilibrium is usually measured as a ratio to the maximum possible disequilibrium:
  $D' = D / D_{max}$
  $D_{max} = \min(p(A_2)\,p(B_1),\ p(A_1)\,p(B_2))$ if $D > 0$
  $D_{max} = \min(p(A_1)\,p(B_1),\ p(A_2)\,p(B_2))$ if $D < 0$
- Another measure, Pearson's correlation coefficient:
  $r^2 = \dfrac{D^2}{p(A_1)\,p(A_2)\,p(B_1)\,p(B_2)}$

Slide 15. Additional resources
- www.hapmap.org
- www.1000genomes.org
- www.personalgenomes.org
- http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim

Slide 16. Homework
- Prove the equalities on slide 13 (D)
- Derive the formula for Dmax on slide 14 (problem 3 5 in book)
- Due Tuesday, Nov. 4; submit by e-mail to me and Mohammad
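
The formulas on the Allele frequencies, Who am I, and Linkage analysis slides can be tried out numerically. The following is a minimal Python sketch, not code from the lecture: the function names and all example frequencies are made-up assumptions for illustration, and the haplotype frequencies are assumed to be given directly rather than estimated from genotype data.

```python
from math import prod


def homozygosity(freqs):
    """F = sum of p_i^2 over the K alleles at one locus."""
    return sum(p * p for p in freqs)


def heterozygosity(freqs):
    """H = 1 - F."""
    return 1.0 - homozygosity(freqs)


def genotype_likelihood(allele_freqs):
    """p(me | population) for a genotype that is homozygous at every locus:
    the product of p^2 over the loci, as on the 'Who am I' slide."""
    return prod(p ** 2 for p in allele_freqs)


def linkage_disequilibrium(p_a1b1, p_a1b2, p_a2b1, p_a2b2):
    """Return (D, D', r^2) from the four haplotype frequencies (summing to 1)."""
    # Allele frequencies are the marginals of the haplotype frequencies.
    p_a1 = p_a1b1 + p_a1b2
    p_a2 = p_a2b1 + p_a2b2
    p_b1 = p_a1b1 + p_a2b1
    p_b2 = p_a1b2 + p_a2b2

    d = p_a1b1 - p_a1 * p_b1

    # Dmax as given on the slide, chosen by the sign of D.
    if d > 0:
        d_max = min(p_a2 * p_b1, p_a1 * p_b2)
    else:
        d_max = min(p_a1 * p_b1, p_a2 * p_b2)
    # Guard against monomorphic loci (assumption: report 0 in that case).
    d_prime = d / d_max if d_max > 0 else 0.0

    denom = p_a1 * p_a2 * p_b1 * p_b2
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical allele frequencies at one locus with K = 3 alleles.
    locus = [0.5, 0.3, 0.2]
    print("F =", homozygosity(locus), "H =", heterozygosity(locus))

    # Hypothetical frequencies of my four alleles in two populations.
    pop_1 = [0.8, 0.5, 0.6, 0.3]
    pop_2 = [0.4, 0.5, 0.7, 0.2]
    print("p(me | pop_1) =", genotype_likelihood(pop_1))
    print("p(me | pop_2) =", genotype_likelihood(pop_2))

    # Hypothetical haplotype frequencies for A1B1, A1B2, A2B1, A2B2.
    d, d_prime, r2 = linkage_disequilibrium(0.5, 0.1, 0.1, 0.3)
    print("D =", d, "D' =", d_prime, "r^2 =", r2)
```

With D computed from the first equality, the other two forms of D can be checked against it for any set of haplotype frequencies that sums to 1; such a check is a sanity test, not a proof of the equalities.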