CS 294-8 Computational Biology for Computer Scientists, Spring 2003
Lecture 4: January 30
Lecturer: Gene Myers    Scribe: Evan Chang

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor.

4.1 Polymerase Chain Reaction (PCR)

The Polymerase Chain Reaction (PCR) is a technique for selective amplification of DNA in vitro. In essence, PCR amplification utilizes the natural machinery of polymerase, the ability to synthesize short oligonucleotides, and some sequence knowledge to obtain a pure sample of a specific fragment of DNA. In principle, one molecule of DNA would be enough to carry out the reaction (though we expect at least a few copies).

Recall that DNA polymerase requires a primer (a short existing sequence from which to extend) and proceeds 5'→3' (i.e. 3'→5' along the template strand). PCR combines the original duplex, an excess of synthetic oligonucleotide primers (usually 18-22 nucleotides in length, see figure 4.1), dNTPs (the four deoxynucleoside triphosphates), and a DNA polymerase capable of withstanding high temperature. It proceeds in the following cycle:

1. Heat the solution to denature the double-stranded DNA.
2. Cool to allow hybridization of the primers followed by elongation. The primers are much more likely to hybridize than the original strands are to re-hybridize with each other, because the primers are in great excess.
3. Repeat.

Figure 4.1: PCR Primers. The green segments indicate the primer sequence (usually 18-22 nucleotides) that could be synthesized to replicate the pink region.

Because every strand present can serve as a template in each cycle, the number of copies of the DNA ideally doubles per cycle and therefore grows exponentially in the number of cycles.
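
The exponential growth can be made concrete with a small bookkeeping sketch. It assumes an idealized reaction in which every strand is copied exactly once per cycle (real reactions are less efficient), and the function name and the three strand categories are illustrative choices, not notation from the lecture. Copies of an original strand start at a primer and run past the opposite primer site; copies of those, and copies of target-length strands, are bounded by both primer sites.

```python
def pcr_strand_counts(cycles, duplexes=1):
    """Count strands by type after each cycle of an idealized PCR.

    Assumption: 100% efficiency, i.e. every strand is copied once per cycle.
    Returns a list of (original, long, target) tuples, one per cycle.
    """
    original = 2 * duplexes  # strands of the starting duplex
    long_ = 0                # start at one primer, extend past the other primer site
    target = 0               # bounded by both primer sites
    history = []
    for _ in range(cycles):
        new_long = original          # each original strand templates one long product
        new_target = long_ + target  # long and target strands both template target-length copies
        long_ += new_long
        target += new_target
        history.append((original, long_, target))
    return history


if __name__ == "__main__":
    for cycle, counts in enumerate(pcr_strand_counts(5), start=1):
        print(cycle, counts, "total strands:", sum(counts))
```

Under these assumptions the total number of strands after n cycles is 2^(n+1) per starting duplex, while the long products grow only linearly (2n), so target-length strands quickly dominate the mixture.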
Figure 4.2 illustrates the PCR process.

Figure 4.2: PCR. (Source: Hartl and Jones, page 61 [HL01])
Panels: (A) Original duplex DNA. (B) First cycle, primers attached. (C) Elongation (DNA synthesis). (D) Second cycle, after elongation. (E) Third cycle, after elongation.
Source caption (Hartl and Jones, Figure 2.20): Role of primer sequences in PCR amplification. (A) Target DNA duplex (blue), showing sequences chosen as the primer-binding sites flanking the region to be amplified. (B) Primer (green) bound to denatured strands of target DNA. (C) First round of amplification. Newly synthesized DNA is shown in pink. Note that each primer is extended beyond the other primer site. (D) Second round of amplification (only one strand shown); in this round, the newly synthesized strand terminates at the opposite primer site. (E) Third round of amplification (only one strand shown); in this round, both strands are truncated at the primer sites. Primer sequences are normally about twice as long as shown here.

4.2 Application: FBI CODIS Fingerprints

DNA fingerprinting is based on a specific type of DNA polymorphism called simple tandem repeat polymorphism (STRP). With STRPs, genetic differences result in a different number of copies of some short sequence at particular loci in the genome. Each STRP may differ in the sequence, the length of the repeating unit, and the minimum and maximum number of repeats that occur in the population. By knowing the sequence around the STRP site, PCR primers can be designed to amplify an STRP site (as well as to control the range of the expected fragment size).
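
As a rough illustration of how primer placement fixes the fragment size, the sketch below builds a toy STRP locus and computes the length of the PCR product for different repeat counts. All sequences, the repeat unit `GATA`, and the helper names are hypothetical and chosen only for the example; real primers are matched by hybridization, not by exact string search.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def amplicon_length(template, fwd_primer, rev_primer):
    """Length of the product bounded by the two primers, or None if a site is missing.

    template is the top strand written 5'->3'. The forward primer matches the
    top strand directly; the reverse primer anneals to the top strand, so its
    reverse complement is what appears in the template string.
    """
    start = template.find(fwd_primer)
    if start < 0:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end < 0:
        return None
    return end + len(rev_site) - start


# Hypothetical 20-nt flanking sequences and repeat unit (not real CODIS loci).
LEFT_FLANK = "ACGTTGCAAGCTTGACCTGA"
RIGHT_FLANK = "TTCGGATCCAGTCAATGCCA"
REPEAT_UNIT = "GATA"

FWD_PRIMER = LEFT_FLANK
REV_PRIMER = reverse_complement(RIGHT_FLANK)


def make_locus(repeat_count):
    """Toy locus: padding, left flank, tandem repeats, right flank, padding."""
    return "TTAG" + LEFT_FLANK + REPEAT_UNIT * repeat_count + RIGHT_FLANK + "CTAA"


if __name__ == "__main__":
    for n in (7, 11):
        print(n, "repeats ->", amplicon_length(make_locus(n), FWD_PRIMER, REV_PRIMER), "bp")
```

In this toy setup the product length is the two primer sites plus four bases per repeat, so alleles with different repeat counts yield fragments of different sizes.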
Size differences between the resulting fragments can be used to distinguish individuals. The FBI CODIS test uses 13 such STRP sites (along with information about the distributions of the STRPs among different ethnicities). Slides about CODIS from lecture are available at http://inst.eecs.berkeley.edu/~cs294-8/Materials/CODIS.ppt.

4.3 DNA Sequencing

The basic idea of the Sanger ladder sequencing method is to produce DNA copies of varying length that stop at a particular base. For example, consider a synthesis reaction that was always forced to end at an adenine (A) residue. Then, if the length of a particular daughter fragment is n, position n in the complement sequence must be A (i.e. T in the template sequence). Stopping the reaction at a particular base is accomplished by using dideoxyribonucleoside triphosphates (ddNTPs). Dideoxyribose lacks the 3'-hydroxyl group, and this absence prevents attachment of the next nucleotide.

Thus, sequencing is accomplished by inserting the desired fragment into a vector (e.g. a plasmid), combining this with polymerase, a sequencing primer (designed from the known sequence of the vector), dNTPs, and a small amount (≈ 1%) of ddNTPs with a particular fluorescent dye for each base, and then carrying out electrophoresis in a sequencing machine. These machines detect the fluorescent dye by laser light as the fragments run off the gel. This process generates traces that can be read to determine the sequence (see figure 4.3). Because a fixed fraction of the growing strands terminates at every position, fewer strands survive to greater lengths, so the signal gets weaker as the length of the sequence increases. Currently, we are able to sequence only up to 600-800 bp per reaction.

Figure 4.3: Sequencing Traces. (Source: Hartl and Jones, page 245 [HL01])

4.3.1 Paired-End Sequencing

A slight variation of the sequencing protocol described above, called paired-end sequencing, can be used to obtain some information about the distance between sequenced fragments.
Using a double-stranded insert and a sequencing primer at each end in two separate reactions, we sequence 600-800 bp from each end. Knowing the length of the insert (typically 10,000 bp), we can also determine the approximate distance between these two sequenced fragments.

4.3.2 Shotgun Sequencing

Since we are limited to sequencing DNA fragments of 600-800 bp in length, a technique called shotgun sequencing has been developed to sequence longer segments (or whole genomes). The basic idea is to take a random sampling of fragments and assemble them based on overlaps. We define the coverage c:

$$c = \frac{R \cdot \bar{L}}{G}$$

where $\bar{L}$ is the average number of base pairs per read (typically 500 bp), R is the number of sequencing reactions (typically 2 million), and G is the length of the genome. To reduce the likelihood of gaps to an acceptable level, the parameters are chosen such that c = 10.

A summary of the shotgun sequencing protocol is
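
Returning to the coverage definition in Section 4.3.2, the following sketch plugs in the typical values quoted there. The function names are illustrative. The gap estimate uses the standard Poisson approximation, in which a given base is missed by all reads with probability about e^(-c); that approximation is an added assumption and is not stated in these notes.

```python
import math


def coverage(num_reads, mean_read_len, genome_len):
    """Coverage c = R * L / G."""
    return num_reads * mean_read_len / genome_len


def genome_length_for(cov, num_reads, mean_read_len):
    """Genome length G that a given number of reads covers at depth cov."""
    return num_reads * mean_read_len / cov


def expected_uncovered_fraction(cov):
    """Poisson approximation (an assumption, not from the notes): P(base uncovered) ~ exp(-c)."""
    return math.exp(-cov)


if __name__ == "__main__":
    R = 2_000_000  # sequencing reactions (typical value from the notes)
    L = 500        # base pairs per read (typical value from the notes)
    G = genome_length_for(10, R, L)
    print("genome length at c = 10:", G)
    print("coverage check:", coverage(R, L, G))
    print("expected uncovered bases:", expected_uncovered_fraction(10) * G)
```

With these values, 2 million reads of 500 bp give tenfold coverage of a genome of about 100 million base pairs.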

