CMU CS 10810 - lecture

This preview shows page 1-2-3-24-25-26-27-49-50-51 out of 51 pages.


10-810 /02-710 Computational Genomics
Topics
Grades
Introduction to Molecular Biology
The Eukaryotic Cell
Cells Type
Slide 7
Genome
Comparison of Different Organisms
Assigning function to genes / proteins
Function from sequence homology
Sequence analysis techniques
Genes
Slide 14
Example of a Gene: Gal4 DNA
Genes Encode for Proteins
Example of a Gene: Gal4 AA
Number of Genes in Public Databases
Structure of Genes in Mammalian Cells
Slide 20
Identifying Genes in Sequence Data
Comparative genomics
Slide 23
Regulatory Regions
Promoter
DNA Binding Motifs
Example of Motifs
Messenger RNAs (mRNAs)
RNA
Messenger RNA
Slide 31
Slide 32
The Ribosome
Slide 34
Perturbation
Perturbations: RNAi
Proteins
Slide 38
Secondary Structure: Alpha Helix
Secondary Structure: Beta Sheet
Protein Structure
Domains of a Protein
Assigning Function to Proteins
Protein Interaction
Putting it all together: Systems biology
High throughput data
Slide 47
Reverse engineering of regulatory networks
Dynamic regulatory networks
Physical networks
What you should remember

10-810 /02-710
Computational Genomics

Eric [email protected] 4127
http://www.cs.cmu.edu/~epxing/Class/10810-07/
Ziv [email protected] 4107
Takis [email protected] BST3 (Pitt)

Topics
• Introduction (1 week)
• Genetics (3 weeks)
• Sequence analysis and evolution (4 weeks)
• Gene expression (3 weeks)
• Systems biology (4 weeks)

Grades
• 4 Problem sets: 36%
• Midterm: 24%
• Projects: 30%
• Class participation and reading: 10%

Introduction to Molecular Biology
• Genomes
• Genes
• Regulation
• mRNAs
• Proteins
• Systems

The Eukaryotic Cell

Cells Type
• Eukaryotes:
 - Plants, animals, humans
 - DNA resides in the nucleus
 - Also contain other compartments
• Prokaryotes:
 - Bacteria
 - Do not contain compartments

Central dogma
DNA -> (transcription) -> mRNA -> (translation) -> Protein
DNA:     CCTGAGCCAACTATTGATGAA
mRNA:    CCUGAGCCAACUAUUGAUGAA
Protein: PEPTIDE

Genome
• A genome is an organism’s complete set of DNA (including its genes).
• However, in humans less than 3% of the genome actually encodes for genes.
• A part of the rest of the genome serves as control regions (though that’s also a small part).
• The goal of the rest of the genome is unknown (a possible project …).

Comparison of Different Organisms
Organism   Genome size   Num. of genes
E. coli    0.05*10^8     4,200
Yeast      0.15*10^8     6,000
Worm       1*10^8        18,400
Fly        1.8*10^8      13,600
Human      30*10^8       25,000
Plant      1.3*10^8      25,000

Assigning function to genes / proteins
• One of the main goals of molecular (and computational) biology.
• There are 25,000 human genes and the vast majority of their functions are still unknown.
• Several ways to determine function (labeled on the slide from "Hard" to "Easier"):
 - Direct experiments (knockout, overexpression)
 - Interacting partners
 - 3D structures
 - Sequence homology

Function from sequence homology
• We have a query gene: ACTGGTGTACCGAT
• Given a database of genes with known function, our goal is to find another gene with a similar sequence (possibly in another organism).
• When we find such a gene, we predict the function of the query gene to be similar to that of the resulting database gene.
• Problems
 - How do we determine similarity?

Sequence analysis techniques
• A major area of research within computational biology.
• Initially, based on deterministic (dynamic programming) or heuristic (Blast) alignment methods.
• More recently, based on probabilistic inference methods (HMMs).

Genes
What is a gene?
Genomic DNA: Promoter | Protein coding sequence | Terminator

Example of a Gene: Gal4 DNA
ATGAAGCTACTGTCTTCTATCGAACAAGCATGCGATATTTGCCGACTTAAAAAGCTCAAG
TGCTCCAAAGAAAAACCGAAGTGCGCCAAGTGTCTGAAGAACAACTGGGAGTGTCGCTAC
TCTCCCAAAACCAAAAGGTCTCCGCTGACTAGGGCACATCTGACAGAAGTGGAATCAAGG
CTAGAAAGACTGGAACAGCTATTTCTACTGATTTTTCCTCGAGAAGACCTTGACATGATT
TTGAAAATGGATTCTTTACAGGATATAAAAGCATTGTTAACAGGATTATTTGTACAAGAT
AATGTGAATAAAGATGCCGTCACAGATAGATTGGCTTCAGTGGAGACTGATATGCCTCTA
ACATTGAGACAGCATAGAATAAGTGCGACATCATCATCGGAAGAGAGTAGTAACAAAGGT
CAAAGACAGTTGACTGTATCGATTGACTCGGCAGCTCATCATGATAACTCCACAATTCCG
TTGGATTTTATGCCCAGGGATGCTCTTCATGGATTTGATTGGTCTGAAGAGGATGACATG
TCGGATGGCTTGCCCTTCCTGAAAACGGACCCCAACAATAATGGGTTCTTTGGCGACGGT
TCTCTCTTATGTATTCTTCGATCTATTGGCTTTAAACCGGAAAATTACACGAACTCTAAC
GTTAACAGGCTCCCGACCATGATTACGGATAGATACACGTTGGCTTCTAGATCCACAACA
TCCCGTTTACTTCAAAGTTATCTCAATAATTTTCACCCCTACTGCCCTATCGTGCACTCA
CCGACGCTAATGATGTTGTATAATAACCAGATTGAAATCGCGTCGAAGGATCAATGGCAA
ATCCTTTTTAACTGCATATTAGCCATTGGAGCCTGGTGTATAGAGGGGGAATCTACTGAT
ATAGATGTTTTTTACTATCAAAATGCTAAATCTCATTTGACGAGCAAGGTCTTCGAGTCA

Genes Encode for Proteins

Example of a Gene: Gal4 AA
MKLLSSIEQACDICRLKKLKCSKEKPKCAKCLKNNWECRYSPKTKRSPLTRAHLTEVESR
LERLEQLFLLIFPREDLDMILKMDSLQDIKALLTGLFVQDNVNKDAVTDRLASVETDMPL
TLRQHRISATSSSEESSNKGQRQLTVSIDSAAHHDNSTIPLDFMPRDALHGFDWSEEDDM
SDGLPFLKTDPNNNGFFGDGSLLCILRSIGFKPENYTNSNVNRLPTMITDRYTLASRSTT
SRLLQSYLNNFHPYCPIVHSPTLMMLYNNQIEIASKDQWQILFNCILAIGAWCIEGESTD
IDVFYYQNAKSHLTSKVFESGSIILVTALHLLSRYTQWRQKTNTSYNFHSFSIRMAISLG
LNRDLPSSFSDSSILEQRRRIWWSVYSWEIQLSLLYGRSIQLSQNTISFPSSVDDVQRTT
TGPTIYHGIIETARLLQVFTKIYELDKTVTAEKSPICAKKCLMICNEIEEVSRQAPKFLQ
MDISTTALTNLLKEHPWLSFTRFELKWKQLSLIIYVLRDFFTNFTQKKSQLEQDQNDHQS
YEVKRCSIMLSDAAQRTVMSVSSYMDNHNVTPYFAWNCSYYLFNAVLVPIKTLLSNSKSN
AENNETAQLLQQINTVLMLLKKLATFKIQTCEKYIQVLEEVCAPFLLSQCAIPLPHISYN
NSNGSAIKNIVGSATIAQYPTLPEENVNNISVKYVSPGSVGPSPVPLKSGASFSDLVKLL
SNRPPSRNSPVTIPRSTPSHRSVTPFLGQQQQLQSLVPLTPSALFGGANFNQSGNIADSS

Number of Genes in Public Databases

Structure of Genes in Mammalian Cells
• Within coding DNA genes there can be un-translated regions (introns).
• Exons are segments of DNA that contain the gene’s information coding for a protein.
• Introns need to be cut out of the RNA and the exons spliced together before the protein can be made.
• Alternative splicing increases the potential number of different proteins, allowing the generation of millions of proteins from a small number of genes.

Identifying Genes in Sequence Data
• Predicting the start and end of genes, as well as the introns and exons in each gene, is one of the basic problems in computational biology.
• Gene prediction methods look for ORFs (Open Reading Frames).
• These are (relatively long) DNA segments that start with the start codon, end with one of the end codons, and do not contain any other end codon in between.
• Splice site prediction has received a lot of attention in the literature.

Comparative genomics

Regulatory Regions

Promoter
The promoter is the place where RNA polymerase binds to start transcription. This is what determines which strand is the coding strand.

DNA Binding Motifs
• In order to recruit the transcriptional machinery, a transcription factor (TF) needs to bind the DNA
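
The ORF definition given under "Identifying Genes in Sequence Data" (a segment that starts with the start codon, ends with an end codon, and contains no other end codon in between) can be illustrated with a minimal Python sketch. This is not the method used in the course. It assumes the standard genetic code, ATG as the only start codon, the forward strand only, and no introns. The names `CODON_TABLE`, `translate`, and `find_orfs` are hypothetical and chosen for illustration.

```python
from itertools import product

# Standard genetic code, amino acids listed for codons in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop (end) codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate DNA codon by codon from position 0, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def find_orfs(dna, min_codons=1):
    """Return (start, end) index pairs of ORFs in the three forward frames.

    An ORF here runs from an ATG to the first in-frame stop codon (inclusive).
    min_codons is the minimum number of codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


# First 60 bases of the Gal4 DNA sequence shown in the slides.
gal4_prefix = "ATGAAGCTACTGTCTTCTATCGAACAAGCATGCGATATTTGCCGACTTAAAAAGCTCAAG"
print(translate(gal4_prefix))      # first 20 amino acids of the Gal4 protein
print(find_orfs("CCATGAAATGACC"))  # toy sequence with one complete ORF
```

Translating the first 60 bases of the Gal4 DNA sequence reproduces the first 20 residues of the Gal4 amino-acid sequence from the slides (MKLLSSIEQACDICRLKKLK). Real gene finders also scan the reverse strand, handle introns and splice sites, and typically require a minimum ORF length, none of which this sketch attempts.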

