Chapter 20

1. How Are Complete Genomes Sequenced?
   a. Genomes range in size from about half a million base pairs to several billion.
   b. A single sequencing reaction can analyze only about 1000 base pairs.

2. Shotgun Sequencing (draw Figure 20.2)
   a. A genome is broken up into a set of overlapping fragments small enough to be sequenced.
   b. Steps for shotgun sequencing:
      1. High-frequency sound waves randomly break the genome into pieces about 160 kilobases (kb) long.
      2. Each 160 kb piece is inserted into a plasmid called a bacterial artificial chromosome (BAC), which is then inserted into E. coli. This creates a BAC library and isolates large numbers of each 160 kb fragment.
      3. The DNA is broken again, into fragments about 1 kb long.
      4. The 1 kb fragments are inserted into plasmids and placed into bacterial cells, so large numbers of each 1 kb fragment are available for sequencing reactions.
      5. Find regions where different fragments overlap.
      6. Assemble all the 1 kb fragments from each original 160 kb fragment by matching overlapping ends.
      7. Assemble all the 160 kb fragments by matching overlapping ends.
   c. The point of the shotgun strategy: break the genome into tiny fragments, sequence them, and put the sequence data back into the correct order.

3. Impact of Next-Generation Sequencing Strategies
   a. Pyrosequencing takes place on a single DNA fragment.
   b. In pyrosequencing, a genome is sheared into many tiny fragments, which are then separated and sequenced directly.
   c. It only works with fragments about 100 base pairs long.

4. Bioinformatics
   a. A field that fuses computer science and biology to manage, analyze, and interpret biological information.

5. Genomes Being Sequenced
   a. Haemophilus influenzae lives in the upper respiratory tract and causes earaches and respiratory tract infections in children.
   b. Over 800 genomes have been sequenced.
   c. Most organisms chosen for whole-genome sequencing cause disease or have other interesting properties.

6. Which Sequences Are Genes?
   a. The most basic task in annotating (interpreting) a genome is to identify which bases constitute genes.
   b. In eukaryotes, identifying genes is much more difficult.

7. Identifying Genes in Bacterial and Archaeal Genomes
   a. Because codons consist of 3 bases, 3 reading frames are possible on each strand, for a total of 6 possible reading frames.
   b. A long stretch of codons that lacks a stop codon is a good indication of a coding sequence.
   c. Computer programs also look for sequences typical of promoters, operators, or other regulatory sites. DNA segments identified this way are called open reading frames (ORFs).
   d. Similarities between genes in different species are due to homology. Genes that are similar in structure, location, and function are homologous genes.

8. Identifying Genes in Eukaryotic Genomes
   a. Scanning for ORFs does not work in eukaryotic genomes because coding regions are broken up by introns.
   b. To identify eukaryotic genes: isolate mRNA and reverse-transcribe it into DNA, sequence a portion of the resulting molecule to make an expressed sequence tag (EST), then use the EST to find the matching sequence in genomic DNA.

Bacterial and Archaeal Genomes

1. History of Prokaryotic Genomes
   a. A genome sequencer must record what is in a genome: the number, type, and organization of genes.
   b. In bacteria, genome size correlates with the metabolic capabilities of the organism.
   c. There is a lot of genetic diversity among bacteria and archaea. About 15% of the genes in each species' genome are unique.
   d. Many genes are identical in certain bacteria.
   e. Multiple chromosomes are common in bacteria, and some bacteria have linear chromosomes.
   f. Many species contain plasmids.

2. Lateral Gene Transfer
   a. Movement of DNA from one species to another.
   b. 2 ways to identify genes that were acquired via lateral gene transfer:
      1. The gene is more closely related to genes in distant species than to those in closely related species.
      2. The base composition of the gene (its proportion of G-C to A-T base pairs) differs from the rest of the genome.
   c. Plasmids appear to be responsible for lateral gene transfer.
   d. Chlamydia trachomatis has genes that resemble eukaryotic genes. It parasitizes eukaryotes and has incorporated genes from them into its own genome.

3. Environmental Sequencing (Metagenomics)
   a. Cataloging all of the genes present in a community of bacteria and archaea.
   b. A research group in the Bahamas collected cells from different water depths and locations, isolated DNA from the samples, and sequenced it.
   c. They found genes that code for proteins similar to rhodopsin, which is found in cells of the human retina. Bacterial cells use the same protein to absorb light to help produce ATP.

1. Eukaryotic Genomes
   a. Problems with sequencing eukaryotic genomes: the enormous size of the genomes, and coping with noncoding sequences that are repeated many times.
   b. Repeated sequences: repeated DNA sequences that occur between genes or inside introns and do not code for products. Why do these exist?

2. Parasitic and Repeated Sequences
   a. About 50% of the average eukaryotic genome consists of repeated sequences that do not code for a product used by the cell.
   b. Many repeated sequences are derived from transposable elements: segments of DNA that can be inserted into new locations (transposed) in a genome.
   c. Transposable elements, unlike viruses, never leave the host cell.
   d. Transposable elements and viruses are parasitic. It takes time and resources to copy them, and they disrupt gene function when they insert in a new location, so they decrease their host's fitness.

3. How Transposable Elements Work (draw Figure 20.5)
   a. There are distinct types of transposable elements.
   b. A well-studied type is the long interspersed nuclear element (LINE), found in humans and other eukaryotes.
   c. LINEs are hypothesized to have evolved from retroviruses.
   d. Most LINEs in the human genome do not function because they lack a promoter or the genes for reverse transcriptase or integrase.
   e. Many genomes are riddled with parasitic sequences.

4. Repeated Sequences and DNA Fingerprinting
   a. Eukaryotes have several thousand loci called short tandem repeats (STRs): small sequences repeated one after another, contiguously, along part of a chromosome.
   b. Two classes of STR:
      1. Repeating units just 1 to 5 bases long are known as microsatellites, or simple sequence repeats.
      2. Repeating units 6 to 500 bases long are known as minisatellites, or variable number tandem repeats.
   c. Microsatellite sequences originate when DNA polymerase skips or mistakenly adds extra bases during replication.
   d. Minisatellites and microsatellites have so many different alleles because of unequal crossover at meiosis I (draw from Figure 20.6). When unequal crossover occurs, the resulting chromosomes have different numbers of repeats.

5. Fingerprinting
   a. The genome of every individual has at least 1 new allele.
   b. The variation in repeat number among individuals is the basis of most DNA fingerprinting.
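Worked sketch for the shotgun sequencing notes (steps 5 to 7, matching overlapping ends). This is a minimal greedy illustration, not how real assemblers work. It assumes error-free reads from a single strand, and the names `overlap`, `greedy_assemble`, and `min_len` are made up for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Hypothetical helper: assumes reads have no errors and all come
    from the same strand.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left that overlaps; return separate contigs
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three overlapping reads taken from a made-up 22-base "genome".
reads = ["AGCCGATAAGC", "ATGCGTACG", "TACGTTAGCC"]
contigs = greedy_assemble(reads)
print(contigs)
```

The same idea applies at both levels in the notes: first the 1 kb fragments within each BAC, then the 160 kb fragments themselves.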
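Worked sketch for the notes on identifying genes in bacterial and archaeal genomes (6 reading frames, long runs of codons with no stop codon). This is a simplified illustration: it assumes ORFs begin at ATG and end at the first in-frame stop codon, and the function names and the `min_codons` cutoff are invented for the example. Real gene finders also use the regulatory signals mentioned in the notes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all 6 reading frames (3 per strand) for open reading frames.

    Returns (strand, frame, orf) tuples. Each ORF runs from an ATG through
    the first in-frame stop codon, and is kept only if it has at least
    min_codons codons before the stop (the ATG counts as one of them).
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            codons = [s[i:i + 3] for i in range(frame, len(s) - 2, 3)]
            start = None
            for idx, codon in enumerate(codons):
                if start is None and codon == "ATG":
                    start = idx
                elif start is not None and codon in STOP_CODONS:
                    if idx - start >= min_codons:
                        orfs.append((strand, frame, "".join(codons[start:idx + 1])))
                    start = None
    return orfs


# Made-up 19-base sequence with one short ORF in frame 2 of the + strand.
print(find_orfs("CCATGAAACCCGGGTAACC"))
```

In a real genome the cutoff would be far larger than 3 codons, since short stop-free stretches occur by chance in every frame.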
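Worked sketch for the lateral gene transfer notes (second test: a gene whose base makeup differs from the rest of the genome). This assumes the comparison is on G-C content, and the 0.10 threshold, the function names, and the toy sequences are all invented for illustration.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_genes(genes, genome, threshold=0.10):
    """Names of genes whose GC fraction differs from the genome-wide value
    by more than threshold. The threshold is an arbitrary assumption."""
    genome_gc = gc_fraction(genome)
    return [
        name
        for name, seq in genes.items()
        if abs(gc_fraction(seq) - genome_gc) > threshold
    ]


# Toy data: the genome is 50% GC, geneB is 100% GC.
genome = "ATGC" * 5
genes = {"geneA": "ATGCATGC", "geneB": "GGGCCCGCGC"}
print(flag_atypical_genes(genes, genome))
```

A flagged gene is only a candidate. The first test in the notes (closer relationship to genes in distant species) would still be needed to support lateral transfer.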
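Worked sketch for the STR and fingerprinting notes (individuals differ in how many times a repeat unit occurs at a locus). This is a toy comparison on made-up sequences. The repeat units, the sequences, and the function names are assumptions, and real fingerprinting measures fragment lengths at known loci rather than searching raw sequence.

```python
import re


def max_tandem_repeats(seq, unit):
    """Largest number of back-to-back copies of unit found anywhere in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def str_profile(seq, units):
    """Repeat count for each unit: a crude stand-in for a DNA fingerprint."""
    return {unit: max_tandem_repeats(seq, unit) for unit in units}


# Two made-up individuals who differ only in the number of CA repeats.
person_a = "GG" + "CA" * 4 + "TT" + "GATA" * 3 + "CC"
person_b = "GG" + "CA" * 6 + "TT" + "GATA" * 3 + "CC"

units = ["CA", "GATA"]
print(str_profile(person_a, units))
print(str_profile(person_b, units))
```

The two profiles match at the GATA locus and differ at the CA locus, which is the kind of variation the notes attribute to polymerase slippage and unequal crossover.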