UIUC MCB 432 - Assign_01_key


Due January 30, 2014
MCB 432                                   Name: key (1 pt for name)
Assignment 1 (25 pts total; 2 free)

Searching and Navigating in Entrez

The National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/) has created an integrated system called Entrez, composed of many databases and links between them. There is an accompanying handout for this assignment on the course WWW site. Please read it completely; although not everything will be required for this assignment, some of it will be helpful later.

Go to the global cross-database search page, enter "Archaea" (without the quotation marks) in the search box, and press the return key or click GO.

1. How many matches do you get in each of the following databases? (3 pt: 3 pt for all, 2 pt for most, 1 pt for any. These numbers may change daily.)

    Nucleotide        499,569
    Structure           9,531
    PubMed             23,870
    Protein         3,103,832
    Taxonomy                1
    PubMed Central     28,883

It is easy to understand the publications and protein sequences, but what about Taxonomy? Follow the link to the Taxonomy search results page, which lists the Taxonomy entries that match the term Archaea. Click on the link to the first matching term (which just happens to be Archaea). This takes you into the taxonomy itself. Note the fine print: "Click on organism name to get more information." Do so to get to the detailed data on the taxonomic group Archaea.

2. How many Entrez records in these databases are in the subtree links [1] of this Taxonomy entry? (3 pt: 3 pt for all, 2 pt for most, 1 pt for any. These numbers may change daily.)

    Nucleotide        373,308
    Structure           3,608
    PubMed               none
    Protein         2,011,592
    Taxonomy            8,075
    PubMed Central     18,474

The results in this table are substantially different from those in question 1. This query is more specific: it requires that the relationship to Archaea be through the organism, not just any random bit of text.

You should see 2 Direct links from the Archaea entry in the Taxonomy database to entries in the Structure database. Since this seems quite vague, let's try to find the species name [2] of the actual organism(s). Follow this link to the Structure database. You will get a list of the 2 matching entries.

3a. What is the MMDB (Molecular Modeling DataBase) ID of the first matching structure? (1 pt)
    Answer: 20257

3b. What is the PDB (Protein Data Bank) ID of the first matching structure? [3] (1 pt)
    Answer: 1LP6

3c. The second match should be MMDB ID 20252. What is the Taxonomy shown for this structure? (1 pt)
    Answer: Archaea (hence the direct link)

3d. In the query box at the top of the page is the exact Entrez query to the Structure database that gave these results. What is the query? (1 pt)
    Answer: txid2157[Organism:noexp] (so it is the taxonomy ID, in the Organism field, not expanded)

3e. Change the query to be "Archaea[Organism:noexp]" (without the quotation marks) and press return or click Search. Did this query give the same search results or not? (1 pt)
    Answer: The same (so the identifier and the name both work)

[1] A direct link has Archaea as its organism, whereas a subtree link has any member of the Archaea (e.g., Methanosarcina acetivorans) as an organism. This can be extremely helpful if you want to find information for a broader taxonomic group.
[2] A species name (e.g., Escherichia coli) is composed of the genus name (Escherichia) and the specific epithet (coli).
[3] The fact that there is a PDB ID means that the original data come from the Protein Data Bank. Remember, Entrez is a collection of data from other databases. Knowing the source database can be useful.

Click on the image or the title to follow the link to the Structure database entry for PDB ID 1LOQ.

4a. What is the title of the structure entry (in bold above the Citation)? (1 pt)
    Answer: Crystal Structure of Orotidine Monophosphate Decarboxylase Complexed With Product UMP

4b. What is the Source Organism listed for the entry (this should be no surprise)? (1 pt)
    Answer: Archaea

4c. To find more information than is included with the structure entry, you might look at the original publication (makes sense, doesn't it?). Follow the link to the Citation (click on the title). What is the PubMed ID (PMID) of this paper? (1 pt)
    Answer: 12011084

4d. Structure papers often do not include the organism name in the abstract, but this one does. What is the species name of the organism from which this enzyme comes? (1 pt)
    Answer: Methanobacterium thermoautotrophicum (which is now called Methanothermobacter thermautotrophicus)

Go back to the Taxonomy page for Archaea (the one with all of the links to other Entrez databases). The largest number of Direct links are to PubMed Central. This makes sense because many papers will talk about the Domain [4] of Archaea, which is what this Taxonomy entry is. The next largest number seems to be Proteins. Are these the same as the structures (entering the domain where the organism should have been placed)? Or is it something else?

5a. Follow the Protein Direct links to the list of matching entries. For me, the first entry is Accession WP_021061062.1. If it is not for you, it might be necessary to use your browser's text search function to find this entry. What is the title of this entry? (1 pt)
    Answer: MULTISPECIES: 50S ribosomal protein L13 [Archaea]

5b. What is the GI (GenInfo) number of this entry? (1 pt)
    Answer: 544622762

5c. Follow the link to the entry. This is a protein sequence displayed in a GenBank-like format (one of the formats that we will see repeatedly in this course, so get used to it). What is the DEFINITION? (1 pt)
    Answer: MULTISPECIES: 50S ribosomal protein L13 [Archaea]

5d. What is the SOURCE? [5] (1 pt)
    Answer: Archaea

5e. What is the ORGANISM? [6] (1 pt)
    Answer: Archaea

If you look at the COMMENT, you will see an explanation of why the sequence is not attributed to any more specific organism. This is quite different from the structure case.

6a. At the top of the page is the standard Entrez query box with the database pop-up set to Protein. As a query, enter the number 22219261 (this is an Entrez GenInfo number for a protein sequence) and press return or click Search. Although you have not seen this page before, it should ring some bells. What is the ACCESSION? (1 pt)
    Answer: 1LOQ_A (i.e., molecule A in structure 1LOQ; structures can have multiple molecules)

[4] Okay, they call it a superkingdom. Sigh. Someplace in Entrez they call it a kingdom, but I have lost track of where.
[5] This is a free-text description and can be informal in its terminology.
[6] This is the formal name of the source organism in the NCBI Entrez taxonomy. The second and subsequent lines of this are the taxonomic hierarchy leading to this name. In this case, there are no higher categories.

6b. Under DBSOURCE what is Pdb id 1 (1 pt) …
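The direct versus subtree distinction from questions 2 and 3d can also be written down as query strings. The sketch below is not part of the assignment, which uses the Entrez web pages throughout. It assumes NCBI's E-utilities `esearch.fcgi` endpoint as a scripted stand-in for the query box, and the helper names `organism_term` and `esearch_url` are hypothetical. It only builds the query term and URL; it does not contact NCBI, and any counts you would get back will differ from the key because the numbers change daily.

```python
from urllib.parse import urlencode

# Assumption: the public E-utilities search endpoint, used here in place of
# the Entrez web query box that the assignment itself relies on.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def organism_term(taxon, subtree):
    """Build an Organism-field query (hypothetical helper).

    subtree=False gives the "noexp" form from question 3d (direct links only);
    subtree=True gives the "exp" form, which also matches records whose
    organism is any member of the taxon (subtree links).
    """
    mode = "exp" if subtree else "noexp"
    return f"{taxon}[Organism:{mode}]"


def esearch_url(db, term):
    """Return an esearch URL that asks only for the number of matches."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "rettype": "count"})


direct = organism_term("txid2157", subtree=False)
whole_subtree = organism_term("txid2157", subtree=True)

print(direct)
print(whole_subtree)
print(esearch_url("structure", direct))
```

As in question 3e, the name can stand in for the identifier: `organism_term("Archaea", subtree=False)` yields `Archaea[Organism:noexp]`.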
