
GU BIOL 152 - From Protein Sequence to Function




Outline
From Protein Sequence to Function: Functional Analysis of Protein Sequences and Protein Classification
Overview
Functional Genomics and Proteomics
Proteomics
Bioinformatics and Genomics/Proteomics
Most new proteins come from genome sequencing projects
Advantages of knowing the complete genome sequence
The changing face of protein science
Properties of the natural protein set
Problems in functional assignments for “knowns”
Dealing with “hypothetical” proteins
Functional prediction: computational analysis
Poorly characterized protein families
Functional prediction: computational analysis
Functional Prediction: Role of Structural Genomics
Structural Genomics: Structure-Based Functional Assignments
Functional prediction: problem areas
“Unknown unknowns”
To deal with the ocean of new sequences, need “natural” protein classification
The ideal system would be:
Protein Evolution
Orthologs and Paralogs
Levels of Protein Classification
Protein Family-Domain-Motif
Protein Evolution: Sequence Change vs. Domain Shuffling
Recent Domain Shuffling
Practical classification of proteins: setting realistic goals
Complementary approaches
Protein Family Databases to be discussed
PIR Web Site (http://pir.georgetown.edu)
PIRSF protein classification system
Whole protein functional annotation
PIRSF classification is based on evolution
Levels of protein classification
PIRSF curation and annotation
iProClass PIRSF report
Systematic correction of annotation errors: Chorismate mutase
Chorismate Mutase
Systematic correction of annotation errors: IMPDH
Propagation of protein annotation within UniProt (under development)
InterPro (at EBI)
InterPro Entry
PIR Superfamilies are being integrated into InterPro
COGs (Clusters of Orthologous Groups) (at NCBI)
Construction of COGs:
Construction of COGs: Add all homologs
What to do with a new protein sequence
Examples for analysis:
Examples for analysis:

From Protein Sequence to Function: Functional Analysis of Protein Sequences and Protein Classification
Anastasia Nikolskaya
Assistant Professor (Research)
Protein Information Resource
Department of Biochemistry and Molecular Biology
Georgetown University Medical Center

Overview
• Role of Bioinformatics/Computational Biology in Proteomics Research
• Genomics
• Functional Annotation of Proteins
• Classification of Proteins
Bioinformatics Databases and Analytical Tools: Dr. Mazumder and Dr. Hu
Sequence → function

Functional Genomics and Proteomics
Proteomics studies biological systems based on global knowledge of protein sets (proteomes).
Functional genomics studies the biological functions of proteins, complexes, and pathways based on the analysis of genome sequences.
Includes functional assignments for protein sequences.
Genome → Transcriptome → Proteome → Metabolome

Proteomics
• Data: Gene Expression Profiling - Genome-Wide Analyses of Gene Expression
• Data: Structural Genomics - Determine 3D Structures of All Protein Families
• Data: Genome Projects (Sequencing) - Functional genomics - Knowing the complete genome sequences of a number of organisms is the basis of proteomics research

Bioinformatics and Genomics/Proteomics
[Figure: workflow from genomic DNA sequence (promoter, 5' UTR, exons 1-3, introns, 3' UTR) through gene recognition to protein sequence, then on to structure determination, function analysis, family classification, molecular evolution, metabolic pathways, and gene networks; sequence and other data feed pathways and regulatory circuits in a hypothetical cell.]
DNA Sequence → Gene → Protein Sequence → Function
Work with protein sequence, not DNA sequence

Most new proteins come from genome sequencing projects
• Mycoplasma genitalium - 484 proteins
• Escherichia coli - 4,288 proteins
• S. cerevisiae (yeast) - 5,932 proteins
• C. elegans (worm) ~ 19,000 proteins
• Homo sapiens ~ 40,000 proteins
... and have unknown functions

Advantages of knowing the complete genome sequence
• All encoded proteins can be predicted and identified
• The missing functions can be identified and analyzed
• Peculiarities and novelties in each organism can be studied
• Predictions can be made and verified

The changing face of protein science
20th century
• Few well-studied proteins
• Mostly globular with enzymatic activity
• Biased protein set
21st century
• Many “hypothetical” proteins
• Various, often with no enzymatic activity
• Natural protein set

Properties of the natural protein set
• Unexpected diversity of even common enzymes (analogous, paralogous, and xenologous enzymes)
• Conservation of the reaction chemistry, but not the substrate specificity
• Functional diversity in closely related proteins
• Abundance of new structures

                               E. coli   M. jannaschii   S. cerevisiae (yeast)   H. sapiens (human)
Characterized experimentally   2046      97              3307                    10189
Characterized by similarity    1083      1025            1055                    10901
Unknown, conserved             285       211             1007                    2723
Unknown, no similarity         874       411             966                     7965
From Koonin and Galperin, 2003, with modifications

Protein Sequence → Function
From new genomes
Automatic assignment based on sequence similarity: gene name, protein name, function
To avoid mistakes, need human intervention (manual annotation)

Functional annotation of proteins (protein sequence databases)
Best annotated protein databases: SwissProt, PIR-1
Now part of UniProt – unified protein knowledgebase

Objectives of functional analysis for different groups of proteins
• Experimentally characterized
– Up-to-date information, manually annotated (curated database!)
• “Knowns” = Characterized by similarity (closely related to experimentally characterized)
– Make sure the assignment is plausible
• Function can be predicted
– Extract maximum possible information
– Avoid errors and overpredictions
– Fill the gaps in metabolic pathways
• “Unknowns” (conserved or unique)
– Rank by importance

Problems in functional assignments for “knowns”
• Previous low quality annotations
- misinterpreted experimental results (e.g. suppressors, cofactors)
- biologically senseless annotations
Arabidopsis: separation anxiety protein-like
Helicobacter: brute force
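
The annotation-status counts quoted above (from Koonin and Galperin, 2003, with modifications) are easier to compare across genomes as fractions of each protein set. The following is a minimal Python sketch of that arithmetic. The names CATEGORIES, COUNTS, annotation_fractions, and summarize are illustrative assumptions and do not come from the slides; only the numbers do.

```python
# Annotation categories, in the order used in the slide's table.
CATEGORIES = (
    "Characterized experimentally",
    "Characterized by similarity",
    "Unknown, conserved",
    "Unknown, no similarity",
)

# Protein counts per organism, copied from the table
# (Koonin and Galperin, 2003, with modifications).
COUNTS = {
    "E. coli": (2046, 1083, 285, 874),
    "M. jannaschii": (97, 1025, 211, 411),
    "S. cerevisiae": (3307, 1055, 1007, 966),
    "H. sapiens": (10189, 10901, 2723, 7965),
}


def annotation_fractions(counts):
    """Return (total, fractions) for one organism's category counts."""
    total = sum(counts)
    return total, [count / total for count in counts]


def summarize():
    """Return a list of (organism, total, {category: fraction}) rows."""
    rows = []
    for organism, counts in COUNTS.items():
        total, fractions = annotation_fractions(counts)
        rows.append((organism, total, dict(zip(CATEGORIES, fractions))))
    return rows


if __name__ == "__main__":
    for organism, total, fractions in summarize():
        print(f"{organism}: {total} proteins")
        for category, fraction in fractions.items():
            print(f"  {category:<30}{fraction:6.1%}")
```

The E. coli column sums to 4,288, which matches the protein count given for Escherichia coli on the earlier slide. The totals for the other organisms come only from this table and need not agree with the approximate counts listed there.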

