UCSD CSE 182 - Lecture

Slide titles: CSE/Beng/BIMM 182: Biological Data Analysis; Today; Introduction to the class: Databases; Life begins with Cell; All life depends on 3 critical molecules; The molecules of Life and Bioinformatics; History of GenBank; Sequence data; Slide 9; How do we query a sequence database?; Quiz: DNA sequence databases; BLAST; Quiz: BLAST; Non sequence based queries; Protein Sequences have structure; Ex2: Sequences have motifs; Quiz: Protein Sequence Analysis; Database of Protein Motifs; Slide 19; Quiz: Biology; DNA, RNA and flow of information; DNA, RNA, and the Flow of Information; Quiz:; Quiz: Transcription?; Quiz: Translation; Slide 26; RNA sequences have Structure; Quiz: RNA; Packaging; Genome Sequencing; Quiz: Sequencing; 1997; 2001; Sequencing Populations; Personalized genomics; 23andMe; Slide 37; Quiz: Population genetics; Variations in DNA; How do these individual differences occur?; Mutations; Recombination; Genotypes and Haplotypes; SNP databases; Summary; Course Outline; Perl/Python; Grading; Assignment 1; Project

CSE/Beng/BIMM 182: Biological Data Analysis
Instructor: Vineet Bafna
TA: Nitin Udpa
www.cse.ucsd.edu/classes/fa09/cse182

Today
•We will explore the syllabus through a series of questions.
•Please ASK
•All logistical information will be given at the end
Is this on the test?
Can I get an extension on my homework?

Introduction to the class: Databases
•Biological databases are diverse
–Often, little more than large text files
•Database technology is about formally representing data and the inter-relationships among the data objects.
•This course is not about databases, but about the data itself.
•We will ‘look’ at many biological databases (keep a count!) but not at their formal structure. Instead, we will ask:
–How can we represent the data?
–How can we query this data?
•In order to understand the data, we need to know a little Biology.

Life begins with Cell
•A cell is the smallest structural unit of an organism that is capable of independent functioning
•All cells have some common features

All life depends on 3 critical molecules
•Protein
–Form enzymes, send signals to other cells, regulate gene activity.
–Form the body’s major components (e.g. hair, skin, etc.).
•DNA
–Hold information on how the cell works
•RNA
–Act to transfer short pieces of information to different parts of the cell
–Provide templates to synthesize into protein

The molecules of Life and Bioinformatics
•DNA, RNA, and Proteins can all be represented as strings!
•DNA/RNA are strings over a 4-letter alphabet (A, C, G, T/U).
•Protein sequences are strings over a 20-letter alphabet.
•This allows us to store and query them as text.

History of GenBank
•In 1982 Goad's efforts were rewarded when the National Institutes of Health funded his proposal for the creation of GenBank, a national nucleic acid sequence data bank. By the end of 1983 more than 2,000 sequences (about two million base pairs) were annotated and stored in GenBank.
Walter Goad, 1925-2000

Sequence data

How do we query a sequence database?
•By name
•By sequence
•‘Relational’ queries are barely applicable

Quiz: DNA sequence databases
Suppose you have a 100nt sequence, and you want to know if it is human. What will you do? How much time will it take? Or, how many steps? (Query = m, Database = n)
•What if you were interested in identifying the human homolog of a mouse sequence (85% identical)? How much time will it take? What if the query was 10Kbp? What if it was the entire genome?
Database: ACGGATCGGCGAATCGAATCGTGGGCCTTA
Query: AATCGT

BLAST
•Allows querying sequence databases with sequence queries.
•It is the prototypical search tool.
•The paper describing it was the most cited paper in the 90s.

Quiz: BLAST
What do you do if BLAST does not return a ‘hit’?
What does it mean if BLAST returns a sequence that is 60% identical? Is that significant (are the sequences evolutionarily related)?
Suppose protein sequences A & B are 40% identical, and A & C are 40% identical. If we know that A & B are evolutionarily related, what does that say about A & C?

Non sequence based queries
•Biological databases are not limited to sequences.

Protein Sequences have structure
Quiz: Can you search using a structure query?

Ex2: Sequences have motifs
How to represent and query such motifs?

Quiz: Protein Sequence Analysis
•You are interested in all protein sequences that have the following pattern:
–[AC]-x-V-x(4)-{ED}
•This pattern is translated as: [Ala or Cys]-any-Val-any-any-any-any-{any but Glu or Asp}
•How can you search a protein sequence database for any such pattern?
•What if the database was a collection of patterns?

Database of Protein Motifs

Quiz: Protein Sequence Analysis
Proteins fold into a complex 3D shape. Can you predict the fold by looking at the sequence?
What is a domain? How can you represent a domain? How can you query?

Quiz: Biology
•DNA is the only inherited material. Proteins do most of the work, so DNA must somehow contain information about the proteins.
•How is the information about proteins encoded in DNA? What is the region encoding this information called?

DNA, RNA and flow of information
•A gene is expressed in two steps
1) Transcription: RNA synthesis
2) Translation: Protein synthesis

DNA, RNA, and the Flow of Information
[Figure: Replication, Transcription, Translation]

Quiz:
How would you find genes in genomic sequence?
What is splicing? Alternative splicing? How can you (computationally) tell if a gene has alternative splice forms?
What is a gene?

Quiz: Transcription?
•What causes transcription to switch on or off? How can we find transcription factor binding sites?
•The number of transcripts of a gene is indicative of the activity of the gene. Can we count the number of transcripts? Can we tell if the number of copies is abnormally high, or abnormally low?

Quiz: Translation
•How is protein sequencing done?
Many proteins are post-translationally modified. How can you identify those proteins?
•What is a mass spectrometer?

Quiz: Translation
•Are all genes translated?
•Can you predict non-coding genes in the genome? Can you predict structure for RNA?
•What is special about RNA?

RNA sequences have Structure

Quiz: RNA
•How can you predict secondary and tertiary structure of RNA?
•Given an RNA query (sequence + structure), can you find structural homologs in a database? Ex: tRNA

Packaging
•All of the transcripts are encoded in DNA, which is packaged into the genome.
•Many databases (much of sequence) are devoted to storing entire genomic sequences.

Genome Sequencing
•How is the genome sequence
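
Returning to the "Quiz: DNA sequence databases" slide: one baseline answer to "how many steps?" is a naive exact-match scan, sketched below in Python. The database and query strings are the ones shown on the slide; the function name `find_exact` is made up for illustration, and this is not how BLAST works. The scan tries each of the n - m + 1 alignments and compares up to m characters at each, so the worst case is roughly m·n character comparisons.

```python
def find_exact(database: str, query: str) -> list[int]:
    """Return every 0-based offset at which query occurs in database."""
    n, m = len(database), len(query)
    hits = []
    # Slide the query along the database, one position at a time.
    for start in range(n - m + 1):
        # Each slice comparison costs up to m character comparisons.
        if database[start:start + m] == query:
            hits.append(start)
    return hits


# Strings taken from the slide.
database = "ACGGATCGGCGAATCGAATCGTGGGCCTTA"
query = "AATCGT"

hits = find_exact(database, query)
print(hits)
```

On the slide's example the scan reports a single hit at offset 16 (0-based). Exact matching of this kind cannot find an 85%-identical homolog, which is one reason the later quiz questions point toward approximate search.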
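
For the motif quiz ("How can you search a protein sequence database for any such pattern?"), one possible approach is to translate the pattern into a regular expression and scan each sequence with it. The Python sketch below assumes the slide's notation follows PROSITE-style conventions and handles only the elements that appear in the slide's pattern. The function name `pattern_to_regex` and the three protein strings are invented for illustration and do not come from any real database.

```python
import re


def pattern_to_regex(pattern: str) -> str:
    """Translate a dash-separated motif pattern into a Python regex.

    Handles only the elements used on the slide: a literal residue (V),
    a wildcard (x), an allowed set ([AC]), a forbidden set ({ED}),
    and a repeat count such as x(4).
    """
    parts = []
    for element in pattern.split("-"):
        # Split off an optional trailing repeat count: "x(4)" -> ("x", "4").
        match = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element)
        core, count = match.group(1), match.group(2)
        if core == "x":
            piece = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            piece = "[^" + core[1:-1] + "]"  # any residue except those listed
        else:
            piece = core  # "[AC]" and single letters are already valid regex
        if count is not None:
            piece += "{" + count + "}"
        parts.append(piece)
    return "".join(parts)


# Hypothetical toy "database" of protein sequences (not real entries).
proteins = {
    "seq1": "AKVLLLLG",    # A, any, V, four residues, then G
    "seq2": "AKVLLLLE",    # same, but ends in E, which {ED} forbids
    "seq3": "MCAVGGGGKD",  # motif starts at the C in position 1
}

regex = re.compile(pattern_to_regex("[AC]-x-V-x(4)-{ED}"))
matches = [name for name, seq in proteins.items() if regex.search(seq)]
print(regex.pattern, matches)
```

This scans every sequence against one pattern. The slide's follow-up question, where the database is itself a collection of patterns, would reverse the loop: compile each stored pattern once and test the single query sequence against all of them.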

