DOC PREVIEW
U of I CS 498 - Multiple Sequence Alignment

This preview shows page 1-2-3-18-19-36-37-38 out of 38 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 38 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Multiple Sequence AlignmentMultiple Alignment versus Pairwise AlignmentGeneralizing the Notion of Pairwise AlignmentAligning Three Sequences2-D vs 3-D Alignment Grid2-D cell versus 2-D Alignment CellArchitecture of 3-D Alignment CellMultiple Alignment: Dynamic ProgrammingMultiple Alignment: Running TimePractically speakingMultiple Alignment Induces Pairwise AlignmentsReverse Problem: Constructing Multiple Alignment from Pairwise AlignmentsSlide 13Inferring Multiple Alignment from Pairwise AlignmentsProfile Representation of Multiple AlignmentMultiple Alignment: Greedy ApproachProgressive AlignmentClustalWStep 1: Pairwise AlignmentStep 2: Guide TreeStep 2: Guide Tree (cont’d)Step 3: Progressive AlignmentThreaded Blockset Aligner (TBA) Blanchette et al. 2004Drawbacks of usual alignmentNew approachPowerPoint PresentationSlide 27Definition: BlocksetDefinition: Ref-blocksetDefinition: Threaded blocksetThreaded blockset advantagesSlide 32Threaded Blockset Aligner (TBA)Slide 34Slide 35EvaluationSimulating evolutionEvaluating alignmentMultiple Sequence AlignmentMultiple Alignment versus Pairwise Alignment•Up until now we have only tried to align two sequences. •What about more than two? And what for? •A faint similarity between two sequences becomes significant if present in many•Multiple alignments can reveal subtle similarities that pairwise alignments do not revealGeneralizing the Notion of Pairwise Alignment•Alignment of 2 sequences is represented as a 2-row matrix•In a similar way, we represent alignment of 3 sequences as a 3-row matrix A T - G C G - A _ C G T - A A T C A C - A•Score: more conserved columns, better alignmentAligning Three Sequences•Same strategy as aligning two sequences•Use a 3-D “Manhattan Cube”, with each axis representing a sequence to align•For global alignments, go from source to sinksourcesink2-D vs 3-D Alignment GridVW2-D edit graph3-D edit graph2-D cell versus 2-D Alignment Cell In 3-D, 7 edges in each unit cubeIn 2-D, 3 edges in each unit squareArchitecture of 3-D Alignment Cell(i-1,j-1,k-1)(i,j-1,k-1)(i,j-1,k)(i-1,j-1,k)(i-1,j,k)(i,j,k)(i-1,j,k-1)(i,j,k-1)Multiple Alignment: Dynamic Programming•si,j,k = max(x, y, z) is an entry in the 3-D scoring matrixsi-1,j-1,k-1 + (vi, wj, uk)si-1,j-1,k +  (vi, wj, _ )si-1,j,k-1 +  (vi, _, uk)si,j-1,k-1 +  (_, wj, uk)si-1,j,k +  (vi, _ , _)si,j-1,k +  (_, wj, _)si,j,k-1 +  (_, _, uk)cube diagonal: no indelsface diagonal: one indeledge diagonal: two indelsMultiple Alignment: Running Time•For 3 sequences of length n, the run time is 7n3; O(n3)•For k sequences, build a k-dimensional Manhattan, with run time (2k-1)(nk); O(2knk)•Conclusion: dynamic programming approach for alignment between two sequences is easily extended to k sequences but it is impractical due to exponential running timePractically speaking•Multiple alignment is a hard problem•Yet, it is of extreme practical importance–Comparing several species is a very common task–Doing this for the entire genome is an increasingly common demand!•Several heuristic-based algorithms have been developed that employ greedy, divide-and-conquer, dynamic programming approaches in various combinations•The algorithms we will see today are actually used by current multiple alignersMultiple Alignment Induces Pairwise AlignmentsEvery multiple alignment induces pairwise alignments x: AC-GCGG-C y: AC-GC-GAG z: GCCGC-GAGInduces:x: ACGCGG-C; x: AC-GCGG-C; y: AC-GCGAGy: ACGC-GAC; z: GCCGC-GAG; z: GCCGCGAGReverse Problem: Constructing Multiple Alignment from Pairwise AlignmentsGiven 3 arbitrary pairwise alignments: x: ACGCTGG-C; x: AC-GCTGG-C; y: AC-GC-GAGy: ACGC--GAC; z: GCCGCA-GAG; z: GCCGCAGAG can we construct a multiple alignment that inducesthem?Reverse Problem: Constructing Multiple Alignment from Pairwise AlignmentsGiven 3 arbitrary pairwise alignments: x: ACGCTGG-C; x: AC-GCTGG-C; y: AC-GC-GAGy: ACGC--GAC; z: GCCGCA-GAG; z: GCCGCAGAG can we construct a multiple alignment that inducesthem? NOT ALWAYSPairwise alignments may be inconsistentInferring Multiple Alignment from Pairwise Alignments •From an optimal multiple alignment, we can infer pairwise alignments between all pairs of sequences, but they are not necessarily optimal•It is difficult to infer a ``good” multiple alignment from optimal pairwise alignments between all sequencesProfile Representation of Multiple AlignmentIn the past we were aligning a sequence against a sequenceCan we align a sequence against a profile? Can we align a profile against a profile? - A G G C T A T C A C C T G T A G – C T A C C A - - - G C A G – C T A C C A - - - G C A G – C T A T C A C – G G C A G – C T A T C G C – G G A 1 1 .8 C .6 1 .4 1 .6 .2G 1 .2 .2 .4 1T .2 1 .6 .2- .2 .8 .4 .8 .4Multiple Alignment: Greedy Approach•Choose most similar pair of strings and combine into a profile , thereby reducing alignment of k sequences to an alignment of of k-1 sequences/profiles. Repeat•This is a heuristic greedy methodu1= ACGTACGTACGT…u2 = TTAATTAATTAA…u3 = ACTACTACTACT……uk = CCGGCCGGCCGGu1= ACg/tTACg/tTACg/cT…u2 = TTAATTAATTAA……uk = CCGGCCGGCCGG…kk-1Progressive Alignment•Progressive alignment is a variation of greedy algorithm with a somewhat more intelligent strategy for choosing the order of alignments. •Use profiles to compare sequences•Gaps in consensus string are permanentClustalW•Popular multiple alignment tool today•‘W’ stands for ‘weighted’ (different parts of alignment are weighted differently).•Three-step process1.) Construct pairwise alignments2.) Build Guide Tree3.) Progressive Alignment guided by the treeStep 1: Pairwise Alignment•Aligns each sequence again each other giving a similarity matrix•Similarity = exact matches / sequence length (percent identity) v1 v2 v3 v4v1 -v2 .17 -v3 .87 .28 -v4 .59 .33 .62 -Step 2: Guide Tree•Create Guide Tree using the similarity matrix•ClustalW uses the “neighbor-joining method”•Guide tree roughly reflects evolutionary relationsStep 2: Guide Tree (cont’d)v1v3v4 v2Calculate:v1,3 = alignment (v1, v3)v1,3,4 = alignment((v1,3),v4)v1,2,3,4 = alignment((v1,3,4),v2) v1 v2 v3 v4v1 -v2 .17 -v3 .87


View Full Document

U of I CS 498 - Multiple Sequence Alignment

Documents in this Course
Lecture 5

Lecture 5

13 pages

LECTURE

LECTURE

39 pages

Assurance

Assurance

44 pages

LECTURE

LECTURE

36 pages

Pthreads

Pthreads

29 pages

Load more
Download Multiple Sequence Alignment
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Multiple Sequence Alignment and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Multiple Sequence Alignment 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?