DOC PREVIEW
UVA CS 302 - Computing Genomes, Genomes Computing

This preview shows page 1-2-3-20-21-40-41-42 out of 42 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Slide 1Final Exam Sneak PreviewMenuGenome Assembly ProblemDNACentral Dogma of BiologyHuman GenomeHuman Genome RaceReading the GenomeGene Reading MachinesSlide 11Decision ProblemIs GA In Class NP?Is GA NP-Complete?NP-CompleteTo Prove NP-HardnessPossible Choices…Busy Beaver Challenge Ruixin YangDoubly-Infinite Tape (Regular BB)One-Way Infinite Tape (Challenge)Winning (3, 2) MachineCodeSlide 23Reducing HAMPATH to GAConsider Each NodeSlide 26Simple NodesGenerating the SubstringsStart and EndWhat is m?Slide 31ApproachesResult: DrawGenomes ComputingSolving HAMPATH with DNAEncoding The ProblemPath BindingGetting the SolutionIs Church-Turning Wrong?DNA-Enhanced PCWhere to Go From HereThanks!Class 26: Class 26: Computing Computing Genomes, Genomes, Genomes Genomes ComputingComputingDavid EvansDavid Evanshttp://www.cs.virginia.edu/evanshttp://www.cs.virginia.edu/evanscs302: Theory of Computationcs302: Theory of ComputationUniversity of Virginia Computer ScienceUniversity of Virginia Computer Science2Lecture 26: Computing Genomes and Vice VersaFinal Exam Sneak Preview•Handout available now•Honor policy: you may discuss these problems with others and use any resources you want until the Final•No notes or other resources may be used during the final•Intent is to give you an idea what to expect on the final and a chance to start thinking about some problems–Don’t attempt to memorize answers: need to understand things since the actual questions may be different3Lecture 26: Computing Genomes and Vice VersaMenu•Computing Genomes (PS6, Problem 6)–Crash course in biology•Busy Beaver result! •Computing with Genomes•Conclusion4Lecture 26: Computing Genomes and Vice VersaGenome Assembly ProblemIn order to assemble a genome, it is necessary to combine snippets from many reads into a single sequence. The input is a set of n genome snippets, each of which is a string of up to k symbols. The output is the smallest single string that contains all of the input snippets as substrings.5Lecture 26: Computing Genomes and Vice VersaDNA•Sequence of nucleotides: adenine (A), guanine (G), cytosine (C), and thymine (T)•Two strands, A must attach to T and G must attach to CAGTCC6Lecture 26: Computing Genomes and Vice VersaCentral Dogma of Biology•RNA makes copies of DNA segments•RNA describes sequences of amino acids•Chains of amino acids make proteins•Proteins make usDNATranscriptionRNATranslationProteinImage from http://www.umich.edu/~protein/7Lecture 26: Computing Genomes and Vice VersaHuman Genome•3 Billion Base Pairs–Each nucleotide is 2 bits (4 possibilities)–3 B pairs * 1 byte/4 pairs = 750 MB•Every sequence of 3 base pairs one of 20 amino acids (or stop codon)–21 possible codons, but 43 = 64 possible –So, really only 750MB * (21/64) ~ 250 MB•Much of it (> 95%) is may be junk (doesn’t encode proteins, but some might be important)1 CD ~ 650 MB8Lecture 26: Computing Genomes and Vice VersaHuman Genome Race•UVa CLAS 1970•Yale PhD•Tenured Professor at U. MichiganFrancis Collins(Director of public National Center for Human Genome Research)(Picture from UVa Graduation 2001)Craig Venter(President of Celera Genomics)vs.•San Mateo College•Court-martialed•Denied tenure at SUNY Buffalo9Lecture 26: Computing Genomes and Vice VersaReading the GenomeWhitehead Institute, MIT10Lecture 26: Computing Genomes and Vice VersaGene Reading Machines•One read: about 700 base pairs•But…don’t know where they are on the chromosomeAGGCATACCAGAATACCCGTGATCCAGAATAAGCActualGenomeACCAGAATACCRead 1TCCAGAATAARead 2TACCCGTGATCCARead 311Lecture 26: Computing Genomes and Vice VersaGenome Assembly ProblemInput: Genome fragments (but without knowing where they are from)Ouput: The full genomeACCAGAATACCRead 1TCCAGAATAARead 2TACCCGTGATCCARead 312Lecture 26: Computing Genomes and Vice VersaDecision ProblemGA = { < { x1, x2, …, xn }, m > | where each xi is a string and there is a string X of length m that includes all of the xi stringsas substrings }If we had a decider for GA, can we find the length of the shortest common superstring?Yes. Try all possible m values from 1, 2, …, Σ|xi|.13Lecture 26: Computing Genomes and Vice VersaIs GA In Class NP?GA = { < { x1, x2, …, xn }, m > | where each xi is a string and there is a string X of length m that includes all of the xi stringsas substrings }Yes. The string X is a polynomial-verifiable certificate.- Check it includes each substring- Check its length is  m14Lecture 26: Computing Genomes and Vice VersaIs GA NP-Complete?GA = { < { x1, x2, …, xn }, m > | where each xi is a string and there is a string X of length m that includes all of the xi stringsas substrings }15Lecture 26: Computing Genomes and Vice VersaNP-CompleteA language B is in NP-complete if:2. There is a polynomial-time reduction from every problem A  NP to B.1. B  NPBNPBNP?16Lecture 26: Computing Genomes and Vice VersaTo Prove NP-Hardness•Pick some known NP-Complete problem X.•Show that a polynomial-time solver for Y could be used to build a polynomial-time solver for X.•This proves that there is no polynomial-time solver for Y (unless P = NP).17Lecture 26: Computing Genomes and Vice VersaPossible Choices…HAMPATHABCDE3SAT(a  b  c)  (a  b   c) …SUBSET-SUM{<{3, 5, 12, 13, 17}, 30}By definition, all must work. Every NP-Complete problem can be reduced to every NP-Complete problem.In practice, some will work much more easily than others. Try to pick a problem “close” to the target problem.18Lecture 26: Computing Genomes and Vice VersaBusy Beaver ChallengeRuixin Yang19Lecture 26: Computing Genomes and Vice VersaDoubly-Infinite Tape (Regular BB)20Lecture 26: Computing Genomes and Vice VersaOne-Way Infinite Tape (Challenge)21Lecture 26: Computing Genomes and Vice VersaWinning (3, 2) Machine22Lecture 26: Computing Genomes and Vice VersaCodeBusyBeaver32.cppBusyBeaver.cpp23Lecture 26: Computing Genomes and Vice VersaIs GA NP-Complete?GA = { < { x1, x2, …, xn }, m > | where each xi is a string and there is a string X of length m that includes all of the xi stringsas substrings }24Lecture 26: Computing Genomes and Vice VersaReducing HAMPATH to GA•Take an arbitrary input to HAMPATH:HAMPATH = { G, s, t | G represents a graph, and there is a path between s and t in G that includes every node in G exactly once }•Construct a corresponding input to GA such that the input is in GA if and only if the original input is in


View Full Document
Download Computing Genomes, Genomes Computing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Computing Genomes, Genomes Computing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Computing Genomes, Genomes Computing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?