UIUC MCB 432 - Assign_07_key

MCB 432                                                        Name:
Assignment 7 (1 pt each unless noted; 50 pt total)
Analyzing Sequences Using NCBI PSI-BLAST
Due Mar 18, 2014

This assignment requires additional instructions in the file Assign_07_suppl.

1a. What is the RID (Request ID) for your search?
    Answer: (varies)
1b. What was the length of the query sequence?
    Answer: 88 amino acids
1c. What is the accession number of the best matching sequence?
    Answer: YP_001582204.1 (the ".1" is optional)
1d. What is the score of this match in bits?
    Answer: 175
1e. What is the E value?
    Answer: 9e-55 or 9 × 10^-55, but not 9e^-55
1f. Is the observed match significantly better than random?
    Answer: Yes
1g. From what organism does the database sequence come (include the strain)?
    Answer: Nitrosopumilus maritimus SCM1
1h. What is the description of the best matching sequence?
    Answer: hypothetical protein Nmar_0870 (I would normally say "no organism name", but I forgot, and the table column is "Description")
1i. How many entries have an E value < 10^-10 to the query?
    Answer: 8
1j. What is the most common description of sequences with an E value < 10^-10 to the query?
    Answer: hypothetical protein (in this case, including an organism makes no sense)

2. The output table is divided into 2 sections: alignments with an E value better than threshold (I get 16 of these) and alignments with an E value WORSE than threshold. Let us consider sequences in the second group (E value > 0.005). For these 3 accession numbers, what are the E value and the description? (3 pt total)

    Accession        E value  Description
    YP_002466353.1   0.007    glutamyl-tRNA(Gln) amidotransferase subunit C
    WP_003001437.1   1.1      ribosome biogenesis GTPase A
    WP_004077429.1   1.9      glutamyl amidotransferase

3. As you work through the following steps (successive iterations of PSI-BLAST), for each of these 4 accession numbers record the bit score, the E value, and the percent identity (Max ident). You currently have the values for iteration 1. The values are filled in for NP_613926.1. If data are not available for a particular accession number, put NA in the table cells. (15 pt total)

               YP_002466353.1        WP_003001437.1        WP_004077429.1        YP_004037241.1
    Iteration  score  E val  ident   score  E val  ident   score  E val  ident   score  E val  ident
    1          40.4   0.007  27      35.8   1.1    33      33.5   1.9    25      NA     NA     NA
    2          62.3   8e-11  24      NA     NA     NA      56.9   8e-09  25      53.9   1e-07  21
    3          NA     NA     NA      NA     NA     NA      NA     NA     NA      NA     NA     NA
    3          94.1   5e-23  23      NA     NA     NA      89.9   3e-21  25      96.0   1e-23  21

4. After filling in the iteration 1 data, click a Go button that says "Run PSI-Blast iteration 2"; use the results to answer the following questions and to fill in the next row in the table above.

4a. What is the accession number of the best match?
    Answer: YP_001582204.1 (the ".1" is optional)
4b. What is the score of this match in bits?
    Answer: 102
4c. What is the E value?
    Answer: 5e-26 or 5 × 10^-26, but not 5e^-26
4d. Is this more or less significant than the match to this sequence in iteration 1?
    Answer: Less
4e. How many sequences are displayed with a light yellow background?
    Answer: approximately 500
4f. How many entries have an E value < 10^-10 to the query?
    Answer: approximately 85
4g. Is the match to YP_002466353.1 more or less significant than in the previous iteration?
    Answer: More
4h. Is the match to WP_003001437.1 more or less significant than in the previous iteration?
    Answer: Less
4i. What is the Request ID (RID) of this page (you will need it)?
    Answer: (varies)
4j. What is the highest E value on the search results page?
    Answer: 6e-07 or 6 × 10^-7, but not 6e^-07

5. After filling in the iteration 2 data, click a Go button that says "Run PSI-Blast iteration 3"; use the results to fill in the next row in the table above.

5a. What is the accession number of the best match?
    Answer: WP_004092718.1 (the ".1" is optional)
5b. What is the score of this match in bits?
    Answer: 138
5c. What is the E value?
    Answer: 2e-40 or 2 × 10^-40, but not 2e^-40
5d. What is the percent sequence identity of the best match to pep2 (the query)?
    Answer: 22
5e. What is the highest E value on the search results page?
    Answer: 2e-28 or 2 × 10^-28, but not 2e^-28

6. Something very bad has happened: the iteration 2 search found hundreds of bacterial sequences, whereas most of what we had been watching were archaeal sequences. Searching with a PSSM (position-specific scoring matrix) dominated by Bacteria gives more Bacteria, not Archaea. Filtering the results will not help because there are thousands of matching bacterial sequences. Go back to the iteration 2 search results (using the RID in 4i, or the Recent Results entry listed as Program "psi-blastp 2"). Near the top of the page, click "Formatting options", enter Archaea (no quotes) as the Organism, and click "Reformat". Although it might change slightly, I get 61 matching sequences after this filtering. Now click "Run PSI-Blast iteration 3". (Note that the results will still be filtered to Archaea, but this will not change the answers below.)

6a. What is the accession number of the best match?
    Answer: YP_005381459.1 (the ".1" is optional)
6b. What is the score of this match in bits?
    Answer: 118
6c. What is the E value?
    Answer: 2e-32 or 2 × 10^-32, but not 2e^-32
6d. How many entries have an E value < 10^-10 to the query?
    Answer: 125
6e. Is the match to YP_002466353.1 more or less significant than in iteration 2?
    Answer: More
6f. What is the highest E value on the search results page?
    Answer: 2e-16 or 2 × 10^-16, but not 2e^-16
6g. The sequence with 100% identity is not the most significant match. In a sentence or two, explain how this can be true.
    Answer: The significance of the match is based upon the match to the profile (which is a composite of data from many, in this case 61, sequences), not to the query per se. The profile could be thought of as an estimate of the ancestral sequence (this comes with many caveats); it is closer to distant sequences, but further from the original query. (Very flexible answers.)

So, if you have a few related sequences and want to find more distantly related proteins, PSI-BLAST can help you find them, though this does not always work as expected.
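The reasoning in the answer to 6g can be illustrated with a toy profile. The sketch below is not how PSI-BLAST actually builds its PSSM (the real program uses sequence weighting, observed background frequencies, and a more elaborate pseudocount scheme); it assumes a uniform background, a flat pseudocount, and a made-up five-residue alignment. The names `build_pssm`, `score`, `query`, and `family` are hypothetical and exist only for this illustration.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_pssm(aligned, pseudocount=1.0):
    """Build a toy log-odds profile: one {residue: score} dict per column.

    Assumptions for illustration only: uniform background frequency and a
    flat pseudocount added to every residue in every column.
    """
    background = 1.0 / len(ALPHABET)
    pssm = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return pssm


def score(seq, pssm):
    """Sum the per-column profile scores for an ungapped sequence."""
    return sum(column[aa] for aa, column in zip(seq, pssm))


# Made-up data: the query plus four relatives that resemble each other
# more than they resemble the query.
query = "MKTAW"
family = [query, "MRSLY", "MRSLY", "MRSLF", "MRSLY"]

pssm = build_pssm(family)
query_score = score(query, pssm)
relative_score = score("MRSLY", pssm)

print(f"query (100% identical to itself): {query_score:.2f}")
print(f"relative close to the consensus:  {relative_score:.2f}")
```

In this toy case the relative outscores the query even though the query is, by definition, 100% identical to itself: the profile has drifted toward the consensus of the sequences used to build it, which is the effect described in 6g and, with a bacteria-dominated profile, the problem described in step 6.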
