Stanford CS 262 - Lecture 9 DNA Sequencing

This preview shows page 1-2-3-20-21-40-41-42 out of 42 pages.

DNA Sequencing
Lecture 9, Tuesday April 29, 2003

Reading
Basic:
• ARACHNE: A Whole-Genome Shotgun Assembler
• Euler: A shotgun assembler based on finding Eulerian paths
Optional:
• Transposons; Genome Sizes;
• ARACHNE 2: Assembly of the mouse genome
Skim through following 2 free Nature issues:
• Mouse (December 2002);
• 50 year anniversary (last week!)

DNA sequencing – vectors
[Figure: DNA is shaken into DNA fragments; DNA fragment + vector (circular genome: bacterium, plasmid), inserted at a known location (restriction site)]

DNA sequencing – gel electrophoresis
• Start at primer (restriction site)
• Grow DNA chain
• Include dideoxynucleoside (modified a, c, g, t)
• Stops reaction at all possible points
• Separate products with length, using gel electrophoresis

Output of PHRAP: a read
A read: 500-700 nucleotides
A  C  G  A  A  T  C  A  G  ….  A
16 18 21 23 25 15 28 30 32 21
Quality scores: -10 log10 Prob(Error)
Reads can be obtained from leftmost, rightmost ends of the insert
Double-barreled sequencing: Both leftmost & rightmost ends are sequenced

Method to sequence segments longer than 500
[Figure: genomic segment, cut many times at random (Shotgun); get one or two reads from each segment; ~500 bp, ~500 bp]

Reconstructing the Sequence (Fragment Assembly)
Cover region with ~7-fold redundancy (7X)
Overlap reads and extend to reconstruct the original genomic region
[Figure: reads]

Challenges with Fragment Assembly
• Sequencing errors: ~1-2% of bases are wrong
• Repeats
• Computation: ~ O(N^2) where N = # reads
[Figure: false overlap due to repeat]

Strategies for sequencing a whole genome
1. Hierarchical – Clone-by-clone
   i. Break genome into many long pieces
   ii. Map each long piece onto the genome
   iii. Sequence each piece with shotgun
   Example: Yeast, Worm, Human, Rat
2. Online version of (1) – Walking
   i. Break genome into many long pieces
   ii. Start sequencing each piece with shotgun
   iii. Construct map as you go
   Example: Rice genome
3. Whole genome shotgun
   One large shotgun pass on the whole genome
   Example: Drosophila, Human (Celera), Neurospora, Mouse, Rat, Fugu

Hierarchical Sequencing Strategy
1. Obtain a large collection of BAC clones
2. Map them onto the genome (Physical Mapping)
3. Select a minimum tiling path
4. Sequence each clone in the path with shotgun
5. Assemble
6. Put everything together
[Figure: a BAC clone, map, genome]

2. Digestion
Restriction enzymes cut DNA where specific words appear
1. Cut each clone separately with an enzyme
2. Run fragments on a gel and measure length
3. Clones Ca, Cb have fragments of length { li, lj, lk } → overlap
Double digestion: Cut with enzyme A, enzyme B, then enzymes A + B

Online Clone-by-clone: The Walking Method

The Walking Method
1. Build a very redundant library of BACs with sequenced clone-ends (cheap to build)
2. Sequence some “seed” clones
3. “Walk” from seeds using clone-ends to pick library clones that extend left & right

Walking: An Example
[Figure]

Advantages & Disadvantages of Hierarchical Sequencing
Hierarchical Sequencing
– ADV. Easy assembly
– DIS. Build library & physical map; redundant sequencing
Whole Genome Shotgun (WGS)
– ADV. No mapping, no redundant sequencing
– DIS. Difficult to assemble and resolve repeats
The Walking method – motivation
Sequence the genome clone-by-clone without a physical map
The only costs involved are:
– Library of end-sequenced clones (CHEAP)
– Sequencing

Walking off a Single Seed
• Low redundant sequencing
• Many sequential steps

Walking off a single clone is impractical
Cycle time to process one clone: 1-2 months
1. Grow clone
2. Prepare & Shear DNA
3. Prepare shotgun library & perform shotgun
4. Assemble in a computer
5. Close remaining gaps
A mammalian genome would need 15,000 walking steps!

Walking off Several Seeds in Parallel
• Few sequential steps
• Additional redundant sequencing
In general, can sequence a genome in ~5 walking steps, with <20% redundant sequencing
[Figure: Efficient vs. Inefficient]

Using Two Libraries
Solution: Use a second library of small clones
Most inefficiency comes from closing a small ocean with a much larger clone

Whole-Genome Shotgun Sequencing

Whole Genome Shotgun Sequencing
[Figure: genome, cut many times at random; forward-reverse linked reads from plasmids (2 – 10 Kbp) and cosmids (40 Kbp), separated by a known dist; ~500 bp, ~500 bp]

ARACHNE: Steps to Assemble a Genome
1. Find overlapping reads
2. Merge good pairs of reads into longer contigs
3. Link contigs to form supercontigs
4. Derive consensus sequence  ..ACGATTACAATAGGTT..

1. Find Overlapping Reads
• Sort all k-mers in reads (k ~ 24)
• Find pairs of reads sharing a k-mer
• Extend to full alignment – throw away if not >95% similar

TAGATTACACAGATTAC
|||||||||||||||||
TAGATTACACAGATTAC

[Figure: extension of the k-mer match into the flanking bases]

1. Find Overlapping Reads
One caveat: repeats
A k-mer that appears N times, initiates N^2 comparisons
ALU: 1,000,000 times
Solution: Discard all k-mers that appear more than c × Coverage, (c ~ 10)

1. Find Overlapping Reads
Create local multiple alignments from the
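
The k-mer seeding and repeat filter from step 1 can be illustrated with a minimal Python sketch. It is not ARACHNE's implementation: it swaps the sorting of k-mers for a hash index, ignores reverse-complement strands, and stops before the full-alignment step. The function names and the max_occurrences parameter (standing in for c × Coverage) are hypothetical, chosen only for this example.

```python
from collections import defaultdict
from itertools import combinations


def kmer_index(reads, k):
    """Map each k-mer to the list of read indices where it occurs.

    A read index is repeated if the k-mer occurs more than once in that read,
    so len(index[kmer]) is the total number of occurrences.
    """
    index = defaultdict(list)
    for read_id, read in enumerate(reads):
        for start in range(len(read) - k + 1):
            index[read[start:start + k]].append(read_id)
    return index


def candidate_pairs(reads, k, max_occurrences):
    """Return pairs of reads that share at least one non-repetitive k-mer.

    max_occurrences plays the role of c x Coverage on the slide (assumption):
    k-mers seen more often than this are treated as repeats and discarded.
    """
    pairs = set()
    for read_ids in kmer_index(reads, k).values():
        if len(read_ids) > max_occurrences:
            continue  # likely repeat: would trigger ~N^2 comparisons
        pairs.update(combinations(sorted(set(read_ids)), 2))
    return pairs


if __name__ == "__main__":
    reads = ["TAGATTACACAGATTAC", "ACACAGATTACTTTGG", "GGGGCCCCGGGGCCCC"]
    print(candidate_pairs(reads, k=8, max_occurrences=10))
```

Each returned pair is only a candidate overlap; in the procedure described above it would still have to be extended to a full alignment and discarded if it is not >95% similar.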

