Algorithms in Biology
Welcome to CS374
Prof. Serafim Batzoglou
Marc Schaub
Eugene Fratkin
cs374.stanford.edu algorithms in biology:: April 1st, 2008

Why Algorithms in Biology?
• Biologists are collecting massive amounts of data:
  • Genomes
  • Genotypes
  • Gene expression
  • Protein-protein interaction
• There are many more new exciting assays in development...

Why Algorithms in Biology?
• There is a need to organize and analyze the data.
• Use computational approaches to suggest new hypotheses.
• Validate these hypotheses, and discover new biology.
• Computer science is becoming a key part of 21st century biology.

21st Century Biology
...ATCTGTATTCGATTCGTAAATCGGTTACATGGCTAGGTACCCGATAATTCTAGTACGTACGGTACGTATCTGCATTTATATTACTGACCATGTAAACGTAATACGAATCATTAGATTCGGGTATCTGCCCTTAACTAGTTAGTACTATAVATAGTGGCTAGGTACCCGATAATTCTAGTACGTACGGTACGTATCTGCATTTTACATGGCTAGGTACCCGATAATTCTAGTACGTACGGTACGTATCTGCAGCATTTATATTACTGACCATGTAAACGTAATACTATT...

Genome Sequencing
The CS problem: fragment assembly

Finding the needles in the haystack
• Genes
• Regulatory motifs
• DNA structure

Comparative Genomics
• Several hundred species have been sequenced
• Compare their genomes
  • Phylogeny
  • Find conserved regions
    • Likely to be functional
Source: genome.ucsd.edu

Population Genetics
CCCGATAATTCTAGTACGTACGGTACGTATCTGCATTTATAT
CCCGATTATTCTAGTACGTACGGTACGTATGTGCATTTATAT
CCCGATTATTCTAGTACGTACGTTACGTATCTGCATTTATAT

Population Genetics
• Differences between populations:
  • Origins of a population
  • Ancestry of an individual
• Case-control studies:
  • Find the genetic component of diseases
• Identify signs of recent evolution

And what is coming next?
• Personalized genetics:
  • You can get genotyped for ~$1000
• Sequencing individual genomes:
  • Jim Watson
  • Craig Venter

And more CS problems...
• Physical models of Protein Folding
  • Folding@home
• Models in Systems Biology
• Analysis of gene expression data
• Data integration
• Data visualization
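
A small illustration of the Population Genetics slides: the three 42-base sequences shown there are nearly identical, and the interesting signal is the handful of positions where they differ. The Python sketch below lists those positions. It assumes the sequences are already aligned and of equal length; the function name `variable_sites` is hypothetical and not taken from the course material.

```python
def variable_sites(seqs):
    """Return the 0-based column indices at which aligned sequences disagree."""
    # Assumption: the input is already aligned, so every sequence has the same length.
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences must have equal length")
    # zip(*seqs) walks the alignment column by column; a column is variable
    # when it contains more than one distinct base.
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


# The three sequences from the Population Genetics slide.
SEQS = [
    "CCCGATAATTCTAGTACGTACGGTACGTATCTGCATTTATAT",
    "CCCGATTATTCTAGTACGTACGGTACGTATGTGCATTTATAT",
    "CCCGATTATTCTAGTACGTACGTTACGTATCTGCATTTATAT",
]

if __name__ == "__main__":
    for i in variable_sites(SEQS):
        # Print the position and the base each individual carries there.
        print(i, [s[i] for s in SEQS])
```

A column-wise scan like this only works once the sequences are aligned; on real genomes, insertions and deletions mean an alignment step has to come first.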