OSU BMI 731 - Learning objectives for Sequence Analysis (11 pages)

Pages: 11
School: Ohio State University
Course: BMI 731 - Advanced Topics in Biomedical Data Management

Learning Objectives for Sequence Analysis
Lecture 01/05/06
by Dr. Ilya Ioshikhes
Department of Biomedical Informatics
3172c Graves Hall
Tel. 292-8929
E-mail: ioschikhes 1 medctr osu edu

Building blocks of DNA and RNA
1. What are the building blocks of DNA?
2. What nucleotides are involved in DNA and RNA structure?
3. How is information transferred from DNA to proteins?

Building blocks of proteins
4. What are the building blocks of proteins?
5. What is the primary factor determining a protein's shape and structure?

DNA and proteins (polypeptides) as sequences
6. What components are needed to build a sequence?
7. How are DNA and proteins considered in sequence analysis?

Basic approaches in sequence analysis
8. What are the basic approaches to comparing two sequences?
9. What are global and local alignment, and what is the difference between them?
10. In which cases do we need to compare entire sequences, and in which only their segments?
11. Which kind of alignment is usually more useful for gene detection?

Algorithms and software for pair-wise sequence comparisons
12. What are the most popular algorithms you know?
13. How are sequences with mismatches compared?
14. What algorithm would you use to find regions conserved between two proteins?

Multiple sequence alignment (MSA) and molecular evolution
15. What is the difference between MSA and pair-wise sequence alignment?
16. Explain the relationship between molecular evolution and MSA.
17. How are sequence mutations represented in MSA?
18. What are the basic types of sequence mutations?
19. How is pair-wise sequence alignment used in MSA?

Basic approaches in MSA
20. What are the basic steps necessary for the comparison of multiple sequences?
21. Explain the difference between progressive and iterative approaches to MSA.
22. In which cases do we need to compare entire sequences, and in which only their segments?
23. Which of the described MSA approaches are based on global alignment, and which on local alignment?
24. What kind of alignment is more useful for finding conserved regions in protein sequences?

Algorithms and software for comparisons of multiple sequences
25. What are the most popular algorithms you know?
26. Explain the basic idea of MSA scoring.
27. What algorithm would you use to find regions conserved between several homologous genes?

Searching databases for similar sequences
28. What is the most common type of database similarity search?
29. Why are strict algorithms for pair-wise sequence comparison not the best choice when we want to compare query sequences with a large database?
30. Which type of pair-wise comparison is more useful for database searches?
31. Which database search algorithms do you know?
32. Try to explain the principles of how they work.

Key information elements

Biological background for sequence analysis
1. There are four nucleotides (A, C, G, T) serving as building blocks for DNA molecules, and 20 amino acids serving as building blocks for proteins.
2. Genes are DNA segments encoding information for the synthesis of proteins.
3. The triplet code governs the transmission of information from genes to proteins: 3 nucleotides encode 1 amino acid.
4. The two major steps of information transmission are transcription (synthesis of mRNA from DNA) and translation (synthesis of proteins from mRNA). There are also other stages in this process (splicing, etc.).
5. DNA is a double-stranded molecule. For most purposes of sequence analysis, however, knowledge of one strand is sufficient, because the other can be restored by the rules of complementarity: A is complementary to T, and C to G.
6. The shape (structure) of a protein molecule is primarily determined by its amino acid sequence.
7. A sequence is the order of the constituent subunits of a large biological molecule, for example the order of amino acids in a protein or the order of nucleotides in a nucleic acid. Two components compose a sequence: its subunits (building blocks) and their order.
8. DNA molecules are sequences of nucleotides. Protein molecules (polypeptides) are sequences of amino acids. The properties of these molecules are largely defined by their sequences, so to study and compare the molecules we must study and compare their sequences.

Basic terms and approaches for sequence analysis
9. Sequence comparison starts with the comparison of two sequences (pair-wise comparison).
10. Basic approaches to pair-wise comparison: (1) dot matrix; (2) exhaustive alignment, including all possible combinations (practically infeasible); (3) dynamic pair-wise alignment, letter by letter; and (4) alignment by word methods.
11. Global and local alignments. In global alignment, the entire sequences are aligned, using as many characters as possible, up to both ends of each sequence. In local alignment, the sequence segments with the highest density of matches are aligned.
12. Global alignment is best suited for quite similar sequences. Local alignment is best suited for sequences with only local similarity.
13. The optimal alignment is the one with the best total score over matches, mismatches, and gaps (insertions and deletions). The score is typically a sum of amino acid or nucleotide pair scores minus penalties for gaps (for their opening and length). Special matrices are used for the pair scoring.
14. Dynamic programming is the progressive building of an alignment by comparing two residues at a time, moving through all matching positions from one end of each sequence (segment) to the other and scoring each point; the alignment with the highest score is chosen.
15. Special matrices are used for scoring an alignment. Dayhoff amino acid substitution matrices (Percent Accepted Mutation, or PAM, matrices) list the likelihood of change from one amino acid to another in homologous protein sequences over a certain period of evolutionary time. Blocks amino acid substitution matrices (BLOSUM), by Henikoff and Henikoff, are based on the observed amino acid substitutions in a large set of 2000 conserved amino acid patterns, called blocks.
16. In word methods, sequences are broken down into short words, and combinations of the words are then compared to find similar regions. These methods are used mostly for database searches.
17. The most popular software for pair-wise sequence comparisons: (1) DotPlot and Compare (dot matrix approach); (2) GAP (global alignment, Needleman-Wunsch dynamic algorithm); (3) BestFit (local alignment, Smith-Waterman dynamic algorithm); (4) LALIGN (finding multiple unique nonintersecting local alignments); (5) FASTA and BLAST (word algorithms for database searches).
18. The following scheme is helpful for resolving problems of pair-wise sequence comparison.

Multiple sequence alignment (MSA)
19.
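To illustrate the dynamic programming approach to pair-wise alignment described in the key information elements (items 11 to 14), here is a minimal sketch of global (Needleman-Wunsch style) and local (Smith-Waterman style) scoring. It makes simplifying assumptions that are not from the lecture: a flat match/mismatch score instead of a PAM or BLOSUM matrix, and a single linear gap penalty instead of separate opening and length penalties. The function names and default values are hypothetical, and the code is not how GAP or BestFit are implemented.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best score for aligning a and b end to end (global alignment)."""
    # Row 0: aligning an empty prefix of a against b costs one gap per letter.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # a[i-1] against a gap
                            curr[j - 1] + gap))   # b[j-1] against a gap
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best score over all pairs of segments of a and b (local alignment)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # The floor of 0 lets an alignment restart anywhere, which is
            # what restricts the result to the best-matching segments.
            cell = max(0,
                       prev[j - 1] + pair,
                       prev[j] + gap,
                       curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    x = "TTTGATTACATTT"
    y = "CCGATTACACC"
    print("global:", global_score(x, y))
    print("local: ", local_score(x, y))
```

With these illustrative sequences, the two share only the central segment GATTACA, so the local score (7, one point per matched letter) is higher than the global score, which must also pay for the mismatched flanks and the length difference. This mirrors item 12: local alignment suits sequences with only local similarity.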

