U of I CS 498 - Protein Sequencing and Identification

Motivation
• We want to know which proteins are present in a cell.
• Protein identification: given a protein sample, does it match some protein in a database?
• Protein sequencing: no database is available; find the sequence of the protein sample directly.

History
• The first protein sequencing was done by Nobel laureate Fred Sanger, who
– broke the insulin protein into pieces ("peptides")
– sequenced each resulting fragment separately
– reconstructed the entire insulin sequence by fragment assembly

Peptide Fragmentation
• Peptides tend to fragment along the backbone.
• Fragments can also lose neutral chemical groups such as NH3 and H2O.
[Figure: collision-induced dissociation. The backbone H...-HN-CH-CO ... NH-CH-CO-NH-CH-CO-...OH, with side chains R_{i-1}, R_i, R_{i+1} and a charge H+, breaks into a prefix fragment and a suffix fragment.]

Breaking Protein into Peptides and Peptides into Fragment Ions
• Proteases, e.g. trypsin, break a protein into peptides.
• A tandem mass spectrometer further breaks the peptides down into fragment ions and measures the mass of each piece.
• The mass spectrometer accelerates the fragment ions; heavier ions accelerate more slowly than lighter ones.
• The mass spectrometer measures the mass/charge ratio of an ion.

Terminal peptides and ion types
• Peptide mass (D): 57 + 97 + 147 + 114 = 415
• Mass (D) of the same peptide without a group of mass 18 (the mass of H2O): 57 + 97 + 147 + 114 - 18 = 397

N- and C-terminal Peptides
[Figure: a peptide of total mass 486 cut at each backbone position. N-terminal peptide masses: 57, 154, 301, 415. C-terminal peptide masses: 71, 185, 332, 429.]
• Goal: reconstruct the peptide from the set of masses of its fragment ions (the mass spectrum).

Peptide sequencing problem
• $A = \{a_1, a_2, \ldots, a_{20}\}$: the set of amino acids, each with mass $m(a_i)$
• A peptide $P = p_1 \ldots p_n$ is a sequence of amino acids with parent mass $m(P) = \sum_i m(p_i)$
• The partial N-terminal peptide $P_i = p_1 \ldots p_i$ has mass $m_i$
• The mass spectrum contains the masses of all partial N-terminal peptides, determined experimentally
– C-terminal peptides are ignored for simplicity
• A peptide may lose one or more smaller parts of itself (such as a water or an ammonia molecule)
• The mass spectrometer therefore measures fragment masses that may differ from the mass of the entire fragment $P_i$
• Assume $k$ different ion losses are possible
• Possible losses of mass: $\Delta = \{\delta_1, \ldots, \delta_k\}$

Theoretical spectrum
• The theoretical spectrum $T(P)$ of a peptide $P$ is calculated by subtracting all possible mass losses $\delta_1, \ldots, \delta_k$ from the masses of all partial peptides of $P$
• Each partial peptide generates $k$ masses in the theoretical spectrum

Match between Spectra and the Shared Peak Count
• The match between two spectra is the number of masses (peaks) they share (Shared Peak Count, or SPC)
• In practice, mass spectrometrists use a weighted SPC that reflects the intensities of the peaks
• The match between experimental and theoretical spectra is defined similarly

Peptide Sequencing Problem
Goal: Find a peptide whose theoretical spectrum has the maximal match with an experimental spectrum.
Input:
– $S$: experimental spectrum
– $\Delta$: set of possible ion types
– $m$: parent mass
Output:
– $P$: a peptide of mass $m$ whose theoretical spectrum best matches the experimental spectrum $S$

Vertices of Spectrum Graph
• Vertices are the masses of potential N-terminal peptides
• Vertices are generated by reverse shifts corresponding to the ion types $\Delta = \{\delta_1, \delta_2, \ldots, \delta_k\}$
• Every mass $s$ in an MS/MS spectrum generates $k$ vertices $V(s) = \{s + \delta_1, s + \delta_2, \ldots, s + \delta_k\}$ corresponding to potential N-terminal peptides
• Vertices of the spectrum graph: $\{\text{initial vertex}\} \cup V(s_1) \cup V(s_2) \cup \ldots \cup V(s_m) \cup \{\text{terminal vertex}\}$

Edges of Spectrum Graph
• Two vertices whose mass difference corresponds to an amino acid $A$:
– connect them with an edge labeled by $A$
• Gap edges are added for di- and tri-peptides

Paths
• Paths in the labeled graph spell out amino acid sequences
• There are many paths; how do we find the correct one?
• We need a scoring scheme to evaluate paths

Path Score
• $p(P, S)$ = the probability that peptide $P$ produces spectrum $S = \{s_1, s_2, \ldots, s_q\}$
• $p(P, s)$ = the probability that peptide $P$ generates a peak $s$
• Scoring = computing probabilities
• $p(P, S) = \prod_{s \in S} p(P, s)$

p(P, s)
• What is the probability that peptide $P$ will produce a fragment mass $s$?
• Each ion type $\delta_i$ has some probability of occurring, written as $q_i$
• A peptide has all $k$ peaks with probability $\prod_{i=1}^{k} q_i$
• and no peaks with probability $\prod_{i=1}^{k} (1 - q_i)$
• Suppose that a partial peptide $P_i$ produces ions $\delta_1, \ldots, \delta_l$ and does not produce ions $\delta_{l+1}, \ldots, \delta_k$
• Then $p(P, s) = \left( \prod_{i=1}^{l} q_i \right) \left( \prod_{i=l+1}^{k} (1 - q_i) \right)$
• A peptide also produces "random noise" with uniform probability $q_R$ in any position. The final expression on the slide scores the peak relative to that noise:
$\left( \prod_{i=1}^{l} \frac{q_i}{q_R} \right) \left( \prod_{i=l+1}^{k} \frac{1 - q_i}{1 - q_R} \right)$
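
The definitions of the theoretical spectrum and the Shared Peak Count above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation. The one-letter codes in `RESIDUE_MASS` are an assumption added here (the slides give only the integer masses 57, 97, 147, 114 and 71), and the function names are hypothetical.

```python
# Integer residue masses in daltons. The letter assignments are an
# assumption for illustration; the slides list only the numbers.
RESIDUE_MASS = {"G": 57, "A": 71, "P": 97, "N": 114, "F": 147}


def prefix_masses(peptide):
    """Return the masses m_1, ..., m_n of the partial N-terminal peptides."""
    masses = []
    total = 0
    for residue in peptide:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def theoretical_spectrum(peptide, losses):
    """Subtract every mass loss in `losses` from every partial-peptide mass.

    With k losses, each partial peptide contributes k masses.
    """
    return {m - d for m in prefix_masses(peptide) for d in losses}


def shared_peak_count(spectrum_a, spectrum_b):
    """Unweighted SPC: the number of masses the two spectra have in common."""
    return len(set(spectrum_a) & set(spectrum_b))


# Losses: 0 (intact fragment) and 18 (loss of water).
theory = theoretical_spectrum("GPFN", [0, 18])
print(sorted(theory))                                  # [39, 57, 136, 154, 283, 301, 397, 415]
print(shared_peak_count(theory, [57, 136, 397, 500]))  # 3
```

Exact set intersection works here only because the masses are integers; with real measured masses a sketch like this would need a mass tolerance, and the weighted SPC mentioned above would also need peak intensities.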
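
As a rough sketch of the spectrum graph construction described above, the following Python builds the vertices from reverse shifts, connects vertices whose mass difference equals a residue mass, and enumerates the sequences spelled by paths from the initial to the terminal vertex. Several things here are assumptions for illustration: the letter codes in `RESIDUE_MASS`, the function names, the use of exact integer masses, and the omission of gap edges for di- and tri-peptides.

```python
# Integer residue masses in daltons; letter assignments are illustrative.
RESIDUE_MASS = {"G": 57, "A": 71, "P": 97, "N": 114, "F": 147}


def spectrum_graph(spectrum, shifts, parent_mass):
    """Build adjacency lists {vertex: [(residue, next_vertex), ...]}.

    Vertices are 0 (initial), parent_mass (terminal), and s + delta for
    every peak s and every reverse shift delta in `shifts`.
    """
    vertices = {0, parent_mass}
    for s in spectrum:
        for delta in shifts:
            v = s + delta
            if 0 < v < parent_mass:
                vertices.add(v)

    edges = {v: [] for v in vertices}
    for v in vertices:
        for residue, mass in RESIDUE_MASS.items():
            if v + mass in vertices:
                edges[v].append((residue, v + mass))
    return edges


def spell_paths(edges, parent_mass):
    """Return every sequence spelled by a path from vertex 0 to parent_mass."""
    results = []

    def walk(vertex, prefix):
        if vertex == parent_mass:
            results.append(prefix)
            return
        for residue, nxt in edges[vertex]:
            walk(nxt, prefix + residue)

    walk(0, "")
    return sorted(results)


# Peaks 136 and 397 are water-loss versions of the prefixes 154 and 415,
# so the reverse shift +18 recovers the missing vertices.
graph = spectrum_graph([57, 136, 301, 397], shifts=[0, 18], parent_mass=486)
print(spell_paths(graph, 486))  # ['GPFNA']
```

Enumerating all paths is only practical for tiny examples like this one. Because every edge goes from a smaller mass to a larger one, the graph is acyclic, so a scored version could instead pick the best path by dynamic programming over vertices in increasing mass order.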
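
The two probability expressions for p(P, s) can be checked numerically. The sketch below assumes the ion-type probabilities q_i are given as a list and that the set of ion types actually produced is known; the function names and the example values of q_i and q_R are made up for illustration and do not come from the slides.

```python
import math


def ion_pattern_probability(q, produced):
    """Probability that exactly the ion types in `produced` occur.

    q        -- list of probabilities q_i, one per ion type
    produced -- set of indices i whose ion type is observed
    Multiplies q_i for produced ions and (1 - q_i) for the rest.
    """
    p = 1.0
    for i, q_i in enumerate(q):
        p *= q_i if i in produced else (1.0 - q_i)
    return p


def noise_relative_score(q, produced, q_noise):
    """Same product, with each factor divided by its random-noise counterpart."""
    score = 1.0
    for i, q_i in enumerate(q):
        if i in produced:
            score *= q_i / q_noise
        else:
            score *= (1.0 - q_i) / (1.0 - q_noise)
    return score


q = [0.5, 0.2]  # hypothetical probabilities for k = 2 ion types
print(ion_pattern_probability(q, {0}))     # 0.5 * (1 - 0.2) = 0.4
print(ion_pattern_probability(q, {0, 1}))  # all k peaks: 0.5 * 0.2 = 0.1
print(noise_relative_score(q, {0}, q_noise=0.1))
print(math.log(noise_relative_score(q, {0}, q_noise=0.1)))
```

A score above 1 (positive logarithm) means the observed ion pattern is better explained by the peptide than by noise under these assumed probabilities. Working with logarithms is a common choice in practice because products over many peaks quickly underflow.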

