OSU BMI 731 - Learning objectives for Sequence Analysis

Learning Objectives for Sequence Analysis Lecture. 01/05/06
by Dr. Ilya Ioshikhes, Department of Biomedical Informatics, 3172c Graves Hall, Tel. 292-8929, E-mail: [email protected]

Building blocks of DNA and RNA.
1. What are the building blocks of DNA?
2. What nucleotides are involved in DNA and RNA structure?
3. How is information transferred from DNA to proteins?

Building blocks of proteins.
4. What are the building blocks of proteins?
5. What is the primary factor determining a protein's shape and structure?

DNA and proteins (polypeptides) as sequences.
6. What components are needed to build a sequence?
7. How are DNA and proteins considered in sequence analysis?

Basic approaches in sequence analysis.
8. What are the basic approaches to comparing two sequences?
9. What are global and local alignment, and what is the difference?
10. In which cases do we need to compare entire sequences, and in which only their segments?
11. Which kind of alignment is usually more useful for gene detection?

Algorithms and software for pair-wise sequence comparisons.
12. What are the most popular algorithms you know?
13. How are sequences with mismatches compared?
14. What algorithm would you use to find regions conserved between two proteins?

Multiple sequence alignment (MSA) and molecular evolution.
15. What is the difference between MSA and pair-wise sequence alignment?
16. Explain the relationship between molecular evolution and MSA.
17. How are sequence mutations represented in an MSA?
18. What are the basic types of sequence mutations?
19. How is pair-wise sequence alignment used in MSA?

Basic approaches in MSA.
20. What are the basic steps necessary for comparison of multiple sequences?
21. Explain the difference between progressive and iterative approaches to MSA.
22. In which cases do we need to compare entire sequences, and in which only their segments?
23. Which of the described MSA approaches are based on global alignment, and which on local alignment?
24. What kind of alignment is more useful for finding conserved regions in protein sequences?

Algorithms and software for comparisons of multiple sequences.
25. What are the most popular algorithms you know?
26. Explain the basic idea of MSA scoring.
27. What algorithm would you use to find regions conserved between several homologous genes?

Searching databases for similar sequences.
28. What is the most common type of database similarity search?
29. Why are strict algorithms for pair-wise sequence comparison not the best choice when we want to compare query sequences with a large database?
30. Which type of pair-wise comparison is more useful for database searches?
31. What database search algorithms do you know?
32. Try to explain the principles of how they work.

Key information elements.

Biological background for sequence analysis.
1. There are four nucleotides (A, C, G, T) serving as building blocks for DNA molecules, and 20 amino acids serving as building blocks for proteins.
2. Genes are DNA segments encoding information for synthesis of proteins.
3. The triplet code governs the transmission of information from genes to proteins: 3 nucleotides encode 1 amino acid.
4. The two major steps of information transmission are transcription (synthesis of mRNA from DNA) and translation (synthesis of proteins from mRNA). There are also other stages in this process (splicing etc.).
5. DNA is a double-stranded molecule. For most purposes of sequence analysis, however, knowledge of one strand is sufficient, because the other strand can be restored by the rules of complementarity (A is complementary to T, and C to G).
6. The shape (structure) of a protein molecule is primarily determined by its amino acid sequence.
7. A sequence is the order of the constituent subunits of a large biological molecule, for example, the order of amino acids in a protein or the order of nucleotides in a nucleic acid. Two components compose a sequence: its subunits (building blocks) and their order.
8. DNA molecules are sequences of nucleotides. Protein molecules (polypeptides) are sequences of amino acids. Properties of these molecules are largely defined by their sequences, so to study and compare the molecules, we must study and compare their sequences.

Basic terms and approaches for sequence analysis.
9. Sequence comparison starts with the comparison of two sequences (pair-wise comparison).
10. Basic approaches to pair-wise comparison: 1) dot matrix, 2) exhaustive alignment including all possible combinations (practically infeasible), 3) dynamic programming pair-wise alignment (letter by letter) and 4) alignment by word methods.
11. Global and local alignments. In global alignment, the entire sequences are aligned, using as many characters as possible, up to both ends of each sequence. In local alignment, the sequence segments with the highest density of matches are aligned.
12. Global alignment is best suited for quite similar sequences. Local alignment is best suited for sequences with only local similarity.
13. The optimal alignment is the one with the best total score over its matches, mismatches and gaps (insertions and deletions). The score is typically a sum of amino acid (or nucleotide) pair scores minus penalties for gaps (their opening and length). Special matrices are used for the pair scoring.
14. Dynamic programming is the progressive building of an alignment by comparing two residues at a time, moving through all matching positions from one end of each sequence (segment) to the other and scoring each point; the alignment with the highest score is chosen.
15. For scoring an alignment, special matrices are used. Dayhoff amino acid substitution matrices (Percent Accepted Mutation, or PAM, matrices) list the likelihood of change from one amino acid to another in homologous protein sequences during evolution, for a certain period of evolutionary time. Blocks amino acid substitution matrices (BLOSUM) by Henikoff and Henikoff are based on the observed amino acid substitutions in a large set of ~2000 conserved amino acid patterns, called blocks.
16. In word methods, sequences are broken down into short words, and combinations of the words are then compared to find similar regions. These methods are used mostly for database searches.
17. The most popular software for pair-wise sequence comparisons: 1) DotPlot and Compare (dot matrix approach);
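
Key information elements 3 and 5 above (the triplet code and the rules of complementarity) can be illustrated with a short Python sketch. This is not part of the lecture material: the names reverse_complement and translate are hypothetical, the sketch assumes the standard genetic code, and it reads codons directly from the coding DNA strand (T in place of U), skipping transcription and splicing.

```python
# Illustrative sketch only; names are hypothetical, not taken from the lecture.

# Rules of complementarity: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    """Restore the opposite strand, written in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


# Standard genetic code, with codons enumerated in base order T, C, A, G.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_dna):
    """Translate codon by codon (3 nucleotides -> 1 amino acid) until a stop codon."""
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(reverse_complement("ATGGCC"))  # GGCCAT
print(translate("ATGGCCAAATAA"))     # MAK
```

The second call reads ATG, GCC, AAA and then stops at TAA, so four codons yield three amino acids.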

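Key information elements 11 to 14 above (global versus local alignment, scoring, and dynamic programming) can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's own method: the function names are hypothetical, the match/mismatch scores stand in for a real substitution matrix such as PAM or BLOSUM, and a single linear gap penalty replaces the separate opening and length penalties mentioned in item 13. The two recurrences are the ones commonly known as Needleman-Wunsch (global) and Smith-Waterman (local); only the best score is returned, not the alignment itself.

```python
# Illustrative sketch only; names and default scores are assumptions.


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best score for aligning all of a against all of b (global alignment)."""
    # Row 0: aligning an empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # Column 0: a[:i] against an empty prefix of b.
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best score over all pairs of segments of a and b (local alignment)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere, which is
            # what restricts the result to the best-matching segments.
            cell = max(0,
                       prev[j - 1] + pair,
                       prev[j] + gap,
                       curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


seq1 = "AAAGATTACATTT"
seq2 = "CCCGATTACAGGG"
print(global_score(seq1, seq2))
print(local_score(seq1, seq2))
```

The two sequences share only the central segment GATTACA, so the local score reflects that segment alone, while the global score is pulled down by the dissimilar ends. This is the situation item 12 describes as best suited to local alignment.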

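The dot matrix approach (key information element 10) and word methods (element 16) can also be sketched in a few lines. Again this is only an illustration with hypothetical names: dot_matrix marks every position where two letters match, and shared_words finds the start positions of identical k-letter words in two sequences. The word length k=3 is an arbitrary assumption, and real word-based search tools add scoring and extension steps that are not shown here.

```python
# Illustrative sketch only; names and the default word length are assumptions.
from collections import defaultdict


def dot_matrix(a, b):
    """Cell [i][j] is True where a[i] == b[j]; diagonal runs mark similar segments."""
    return [[x == y for y in b] for x in a]


def shared_words(a, b, k=3):
    """Return sorted (i, j) pairs where a[i:i+k] and b[j:j+k] are the same word."""
    # Index every k-letter word of a by its start position.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    # Look up every k-letter word of b in that index.
    hits = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], []):
            hits.append((i, j))
    return sorted(hits)


# Print the dot matrix as text: "*" for a match, "." otherwise.
for row in dot_matrix("GATTACA", "TTACAG"):
    print("".join("*" if cell else "." for cell in row))

print(shared_words("GATTACA", "TTAC"))  # [(2, 0), (3, 1)]
```

Indexing the words of one sequence once and then looking up the words of the other avoids comparing every position with every other position, which is one reason word methods suit searches against large databases.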