Bob Gross, Bio 39/139

Introduction

Computational Biology? Bioinformatics? Genomics?
• explosion of new molecular biological information
  • sequence information - genomes
  • microarray information - expression
• far too much for humans to comprehend
• rapidly increasing progress in computer technology
• how to combine the two so that they work together?
• very exciting time to be at the cutting edge in two fields

Computational Biology
• computational biology - seeks to use computational methods to analyze biological processes
• includes extraction of information directly from sequences, sometimes called Bioinformatics
• includes comparing sequences and structures to each other and to databases to look for relationships
• includes discerning evolutionary relationships among organisms through sequence analysis
• includes comparing whole genomes to each other, sometimes called Functional Genomics
• includes predicting structures and interactions based on those structures, sometimes called Structural Biology
• includes finding functional relationships among sets of genes
• includes understanding networks of regulatory interactions

Looking for Similar Sequences
• what sequence is similar to my sequence?
• is part of it similar or is the whole thing similar (local vs global similarity)?
• what if sequences are of vastly differing length?
• what about rearrangements? [AxxxxBxxx vs. xxxBxxAxx]
• how do you define similarity (need to define a scoring system)?
• runtime - need to consider how fast the comparison runs - some compromises have to be made
• similar structure even if sequence is not very similar?

Looking for Similar Structures
• many of the same issues as for sequence comparisons
• local vs global similarity
• granularity - what size should the basic structural unit be [e.g. an amino acid? an alpha helix?]
• need scoring system(s)

Defining Function
• function of whole molecule? or of pieces?
• consider binding to DNA
  • can be the main function of the molecule (e.g. histones)
  • can be part of a molecule’s function - transcription factor (interacts with DNA and with other TFs)
  • can be part of a DNA polymerase or RNA polymerase

Looking for Patterns
• repeated sequences
  AAAA; ACAC; ACCACCACC; ACCnACCnnACC
• inverted repeated sequences - sometimes with spaces
  ACGCGT; ACCnnnGGT
• patterns based on physical characteristics - hydrophobic or hydrophilic AAs
• repeated arrangements of AAs that lead to higher order structures like alpha-helices or beta-sheets, leucine zippers
• repeated appearances of higher order structures such as helix-turn-helix found in DNA binding proteins or beta-barrels found in some enzymes
• patterns of regulatory sequences upstream of genes

Gene Expression Patterns
• using microarray data or SAGE
• cell cycle
• cancer vs. normal tissue
• during development
• drug treatments
• how do you measure similarity of expression?

Data Mining
• The process of exploring large amounts of data to look for information and relationships is called data mining.
• It is more similar to other branches of research than to traditional molecular biology research.
• Finding relationships among sequences or other patterns might lend some insight into the function of the molecules being studied.
• Available to anyone on the Internet.

What is a Database?
• stores information and allows for rapid retrieval and searching
• provides for comparisons and complex searches
• information can be in many formats
• database information is stored in records
• a record can contain a number of individual pieces of information
• a relational database allows for connections between different records or pieces of data

Information Retrieval
• all records of people living in zip code 03755
• all DNA sequences from humans
• all human DNA sequences entered by a specific author
• literature references to papers by that author
• all genes for ATPases found in both humans and yeast but not Arabidopsis

Data Warehousing
• Different databases are established at different locations (and times) by different investigators, but much of the information is interrelated.
• How can all of these different databases be “related” to each other in such a way that information can be obtained easily from all of them?
• Data warehousing projects are designed to address the issues involved in tying the different databases together.

Reconciling Data Formats
• GenBank stores sequence information associated with a specific GeneID.
• KEGG databases store enzymes by specific IDs that are not the same as GenBank IDs.
• Structure databases use an older GenBank name or an entirely different name.
• Author names can be stored differently
  • Gross, Robert H.
  • Gross, R.H.
  • Gross, RH
  • R.H. Gross

Some Day...
• Find sequence similarities for all genes whose expression is inhibited by a specific drug
• Find similarities in gene expression patterns for all genes that are phosphorylated in both humans and yeast after inhibition of cell division after G1
• Find common promoter motifs for all genes mentioned in a set of published papers on a particular topic

Computer Architecture and Operating Systems

The CPU
• The Central Processing Unit
• Intel, PowerPC
• Has a number of “registers” that can store values
• Calculations occur by adding or subtracting values in registers or to a value in memory
• Works in binary (0 or 1 are the only allowed values)
• Has a clock speed that is usually measured in billions of cycles per second (gigahertz, GHz)
• Many computers have multiple CPUs which can work in parallel

Bits and Bytes
• the smallest unit of memory is called a bit. It stores a value of 1 or 0.
• 2 bits can have 4 different configurations - 00, 01, 10, 11
  the rightmost digit is the 2^0 place, the leftmost is the 2^1 place
  00 => 0, 01 => 1, 10 => 2, 11 => 3
• typically, bits are combined into larger units to enable us to think
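
The two-bit example on the Bits and Bytes slide can be checked with a short Python sketch. It is only an illustration: the helper names `bits_to_int` and `all_configurations` are hypothetical and do not come from the course materials. Each digit contributes its value times a power of two, with the rightmost digit in the 2^0 place.

```python
from itertools import product


def bits_to_int(bits: str) -> int:
    """Convert a string of '0'/'1' characters to its unsigned integer value."""
    value = 0
    for position, digit in enumerate(reversed(bits)):
        # The rightmost digit has position 0, so its weight is 2**0.
        value += int(digit) * 2 ** position
    return value


def all_configurations(n_bits: int) -> list[str]:
    """List every configuration of n_bits bits, in counting order."""
    return ["".join(combo) for combo in product("01", repeat=n_bits)]


if __name__ == "__main__":
    # Two bits give four configurations, matching the slide.
    for config in all_configurations(2):
        print(config, "=>", bits_to_int(config))
```

Calling `all_configurations(n)` with a larger `n` shows that each extra bit doubles the number of configurations, so `n` bits can represent 2^n distinct values.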

DARTMOUTH BIOL 039 - Introduction