CMU BSC 03510 - Lectures

Computational Biology, Part 2
Searching with Entrez/Sequence Motifs
Robert F. Murphy
Copyright © 1996, 1999-2008. All rights reserved.

Slide titles:
Computational Biology, Part 2: Searching with Entrez/Sequence Motifs
Example Entrez Session
Example Entrez Session: home of Entrez
Example Entrez Session: search OMIM for ‘cystic fibrosis’
Example Entrez Session: first hit is CFTR
Example Entrez Session: after clicking Links → Nucleotide
Example Entrez Session: after clicking Links → Protein
Example Entrez Session: Protein sequence from original cDNA
Example Entrez Session: change ‘Send to’ to ‘File’
Example Entrez Session: Links → PubMed
Example Entrez Session: paper in PubMed that is related
Example Entrez Session: Related Articles
Computation of related articles
Computation of related articles: words considered
View the MeSH terms: change ‘Display’ to ‘Citation’
Computation of related articles: weight of each word
Computation of related articles: similarity score of two articles
Example Entrez Session: search Nucleotide for cftr
Example Entrez Session: 1249 hits related to cftr
Example Entrez Session: set limits as title and mRNA
Example Entrez Session: 46 hits with limits
Example Entrez Session: further narrow it down to human
Block Diagram for Entrez Literature Searching
Sequence Analysis Tasks
Definition
Sequence features
Consensus sequences
Finding occurrences of consensus sequences
Interactive Demonstration
Block Diagram for Search with a Consensus Sequence
Describing features using frequency matrices
Slide 32
Frequency matrices (continued)
Slide 34
Frequency Matrices, PSSMs, and Profiles
Methods for converting frequency matrices to PSSMs
Pseudo-counts
Finding occurrences of a sequence feature using a Profile
Slide 39
Block Diagram for Building a PSSM
Block Diagram for Searching with a PSSM
Block Diagram for Searching for sequences related to a family with a PSSM
Consensus sequences vs. frequency matrices
Slide 44
Reading for next class

Example Entrez Session
Goal: Find literature and sequences for cystic fibrosis genes
  - Use OMIM with Keyword searching.
  - Switch to the Nucleotide database to see the sequence.
  - Switch to the Protein database to see the sequence.
  - Change to GenPept format to save the sequence.
  - Use Links to find related literature in PubMed.
  - Use Related Articles to find similar articles.
  - Search the Nucleotide database by gene name.
  - Set Limits to narrow down the search.

Example Entrez Session: home of Entrez
Example Entrez Session: search OMIM for ‘cystic fibrosis’
Example Entrez Session: first hit is CFTR
Example Entrez Session: after clicking Links → Nucleotide
Example Entrez Session: after clicking Links → Protein
Example Entrez Session: Protein sequence from original cDNA
Example Entrez Session: change ‘Send to’ to ‘File’
Example Entrez Session: Links → PubMed
Example Entrez Session: paper in PubMed that is related
Example Entrez Session: Related Articles

Computation of related articles
Similarity between documents is measured by the words they have in common:
  - Which words are considered?
  - What is the weight of each word?
  - How do we calculate a similarity score for two articles?

Computation of related articles: words considered
  - Remove stopwords (uninformative words)
  - Stem words
  - Words from the abstract are “text words”
  - Words from the title are put in twice
  - Words from the MeSH terms
      - Maintained by the U.S. National Library of Medicine
      - Vocabulary used for indexing articles
      - Consistent way to retrieve information

View the MeSH terms: change ‘Display’ to ‘Citation’

Computation of related articles: weight of each word
  - Global weight: greater if the word is less frequent in the whole database
  - Local weight: greater if the word is more frequent in the document
      - Longer documents are not favored

Computation of related articles: similarity score of two articles
  - Weight of one common word: local wt1 * local wt2 * global wt
  - Similarity of two articles: sum of the weights of all common words
  - The higher the score, the closer the two articles
  - Similarity scores are pre-computed

Example Entrez Session: search Nucleotide for cftr
Example Entrez Session: 1249 hits related to cftr
Example Entrez Session: set limits as title and mRNA
Example Entrez Session: 46 hits with limits
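
The related-articles scoring described in the slides above (drop stopwords, stem, count title words twice, then sum local wt1 * local wt2 * global wt over the common words) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PubMed's actual implementation: the stopword list, the one-rule stemmer, the log(N/df) global weight, and the length-normalized local weight are all hypothetical choices, since the slides give only the qualitative behavior of each weight. MeSH terms are left out for brevity.

```python
import math
from collections import Counter

# Hypothetical stopword list; the list PubMed actually uses is not given in the slides.
STOPWORDS = {"a", "an", "and", "by", "for", "in", "is", "of", "the", "to", "with"}


def stem(word):
    """Crude illustrative stemmer (assumption): strip one trailing 's'."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def terms(title, abstract):
    """Count stemmed, non-stopword terms; title words are put in twice."""
    words = (title + " " + title + " " + abstract).lower().split()
    cleaned = (w.strip(".,;:()'\"") for w in words)
    return Counter(stem(w) for w in cleaned if w and w not in STOPWORDS)


def global_weights(docs):
    """Assumed global weight: log(N / document frequency), so rarer words weigh more."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    return {term: math.log(n / count) for term, count in df.items()}


def local_weight(doc, term):
    """Assumed local weight: term count divided by document length,
    so a longer document is not favored."""
    return doc[term] / sum(doc.values())


def similarity(doc1, doc2, gw):
    """Sum of local wt1 * local wt2 * global wt over the words both documents share."""
    common = doc1.keys() & doc2.keys()
    return sum(local_weight(doc1, t) * local_weight(doc2, t) * gw[t] for t in common)


if __name__ == "__main__":
    a = terms("CFTR mutations in cystic fibrosis",
              "Cystic fibrosis is caused by mutations in the CFTR gene.")
    b = terms("Cystic fibrosis gene therapy",
              "Gene therapy for cystic fibrosis targets CFTR.")
    c = terms("Yeast cell cycle",
              "The cell cycle of yeast is regulated by cyclins.")
    gw = global_weights([a, b, c])
    print("a vs b:", similarity(a, b, gw))
    print("a vs c:", similarity(a, c, gw))
```

In this toy corpus the two cystic fibrosis abstracts share several terms and score above zero, while the yeast abstract shares none with the first and scores zero. Because the scores depend only on the stored documents, they can be computed once ahead of time, which is consistent with the slide's note that similarity scores are pre-computed.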

