Stanford CS 262 - Lecture Notes - D499338

Home> Schools> Stanford University> Computer Science (CS) > CS 262> Lecture Notes

DOC PREVIEW

Stanford CS 262 - Lecture Notes

School name Stanford University

Course Cs 262- Computational Genomics

Pages 16

This preview shows page 1-2-3-4-5 out of 16 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

CS262 Lecture 3: Sequence Alignment continued, BLAST Professor Serafim Batzoglou TA: Nadine Hussami Scribe: Brian Lao Lecture Date: 1/14/2014 !!!!!!!1!TABLE OF CONTENTS SCORING THE GAPS MORE ACCURATELY..…………………… 2-8 Convex Gap Dynamic Programming …..……………………..... 2-4 Affine Gap Penalty ……………………………………….…....... 4-6 Piecewise Linear Function …………………………………….... 6-7 Bounded Dynamic Programming ……………………………..... 7-8 LINEAR-SPACE ALIGNMENT ………………………………...….. 8-13 Sequences and Subsequences …………………............………….. 8 Hirschberg’s Algorithm ……………………………….....……. 8-13 HEURISTIC LOCAL ALIGNERS ……………………………….... 13-15 BLAST ……………………………………………………….... 13-15 !!!!!!!!!!!!!!!!!!!!!!!!2!START!OF!LECTURE!!SCORING!THE!GAPS!MORE!ACCURATELY! ! ! ! ! ! !!Previous)Model:!!Last!time,!we!defined!gap!penalties!as!a!given!constant!in!any!of!the!2!sequences.!Every!letter!that!goes!to!a!gap!incu rs!a!pen alty,!lets!call!it!d.!!!For!a!gap!of!length!n,!the!penalty!is!n!x!d.!!Figure!1.!!A!graph!illustrating!the!length!of!the!gap!vs.!the!magnitude!of!the!penalty.!As!the!gap!length!increases,!the!penalty!increases!linearly.!How)gaps)biologically)work)in)reality:!!However,!in!biological!sequences,!gaps!of!different!lengths!are!not!linearly!worse!because!gaps!usually!occur!in!bunches.!Longer!gaps!are!not!linearly!worse!than!shorter!gaps.!!Given!that!we’ve!opened!a!gap,!many!biological!mechanisms!will!make!it!easier!to!extend!the!gap.!!For!example,!the!presence!of!certain!repeats,!transp oson s!tha t!have!large!length.!!!How)do)we)handle)this?)Concave)gap)penalty)function)γ(n):!!As!gap!increases,!second!position!less!expensive,!third!position!even!less!expensive!etc.!!!γ(n):!for!all!n,!γ(n!+!1)!J!γ(n)!≤!γ(n)!J!γ(n!–!1)!!3!)Figure!2.!!A!concave!gap!penalty!function!graph!illustrating!the!length!of!the!gap!vs.!the!magnitude!of!the!penalty.!!As!the!gap!length!gets!longer,!the!marginal!penalty!decreases!for!each!successive!extension!of!the!gap.!Why)Original)Dynamic)Programming)does)not)work)for)nonDconstant)gap)scoring:!!Here,!the!original!dynamic!programming!for!global!alignment!does!not!work!if!the!gap!scoring!is!not!constant,!and!is!instead!decreasing!with!every!additional!position.!!!Why?!!Let’s!imagine!we’re!in!a!middle!cell!of!a!given!matrix,!at!position!(i,!j).!!In!the!original!algorithm,!we!would!look!at!th e!ma x!o f!the!three!positions!corresponding!to:!!Up:!F(i,!jJ1)!J!d!Left:!F(iJ1,!j)!J!d!Diagonal!upJleft:!F(iJ1,!jJ1)!+!s(xi,!yj)!However,!now!we!want!d!(the!gap!penalty)!to!be!dependent!on!how!long!the!gap!is.!!The!original!algorithm!does!no t!wo rk!here!b ecause!w e!ca nno t!simply!look!back!at!how!many!positions!we’ve!got!so!far!and!penalize!additional!increments!of!the!gap.!!When!we!“made!the!decision”!to!open!the!gap,!we!did!not!know!that!this!would!make!subsequent!gaps!up!to!the!current!position!cheaper,!and!we!don’t!know!that!the!optimum!did!not!come!from!somewhere!else.!!Thus!we!prefer!not!to!open!the!gap,!and!instead!just!take!the!diagonal!upJleft!direction.!!4!Convex)gap)dynamic)programming:!!Below!is!the!way!that!does!work.1!!!Initialization!and!termin atio n !a re!th e!s am e !as !b efore.!!However!the!iteration!is:!!We!maximize!over!all!possible!gap!lengths!for!gaps!that!end!in!this!position.!!For!example,!if!we!are!at!position!I,j,!then!we!say,!the!max!could!either!come!from!(iJ1,!jJ1),!and!I’m!assuming!the!other!possibilities!are!a!gap!of!a!given!length!ending!here.!!!How!long!could!this!gap!be?!Could!be!length!1!or!2,!etc.,!could!come!from!any!position!k,!from!0!to!iJ1,!and!could!be!from!beginning!of!sequence!to!iJ1,!and!there!would!be!a!gap!in!that!many!positions,!iJk!positions.!!This!would!cost!exactly!gamma!of!iJk.!!I!assume!it!closed!at!this!current!position!and!pay!for!it!Running!time!for!this!algorithm!is!O(N2M)!under!the!assumption!of!N>M,!and!the!Space!is!O(NM).!Compromise:)affine)gap)penalty2:!!We!penalize!a!gap !o f!l ength!n!by!a!gap!open!penalty!of!d!*!number!of!positions!in!the!gap,!and!can!see!why!this!makes!an!approximation!of!a!convex!scoring!function.!!Idea!is!that!its!bad!to!open!a!gap,!but!once!its!open,!can!extend!it!relatively!more!cheaply.!!The!gap!penalty!function!is:!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!1!According!to!lecture,!no!one!actually!uses!this!algorithm,!so!it!is!deemed!to!be!not!very!2!The!affine!gap!penalty!algorithm!is!more!extensively!used!and!is!the!most!common.!!5!Ex!.!an!upper!sequence!with!3!gaps,!a!lower!sequence!with!2!gaps,!where!there’s!10!matches:!the!score!of!the!alignment!is!10m!–!2d!–!3e.!!For!this,!d!and!e!are!constants.!!Figure!3.!!Graph!illustrating!the!length!of!the!gap!vs.!the!magnitude!of!the!gap!penalty.!!There!is!an!initial!gap!opening!penalty!of!d,!and!a!gap!penalty!of!(nJ1)*e!for!extensions!of!the!gap.!Will)have)3)Dynamic)Programming)Matrices)for)Affine)Gap)model:!!One!matrix,!matrix!F,!will!keep!the!scores!of!aligning!first!i!positions!of!sequence!x!to!first!j!positions!of!sequence!y,!under!assumption!th at!xi!aligns!to!yj.!!The!second!matrix,!Matrix!G,!will!be!the!optimal!score!of!alignment!assuming!xi!aligns!to!a!gap!after!yj.!!And!then!there!is!Matrix!H,!which!gives!optimal!score!again!assuming!yj!aligns!to!a!gap!after!xi.!!Matrix!V!will!be!the!max!of!the!above!three!matrices.!!! NeedlemanDWunsch)with)Affine)Gaps:))Why)do)we)need)the)three)matrices?:!!The!reason!for!three!matrices!is!b e ca u se !t h ere !are !sc e n ari o s!where!a!gap!at!i,!j!is!not!optimal!having!opened!the!gap!and!ending!it!there.!!But!if!we!go!one!further!position,!maybe!that!opening!of!the!gap!was!a!good!!6!investment!for!the!overall!score,!and!now!extendin g!the!gap !gives!the!o ptim al!alignment,!but!we!wouldn’t!know!that!unless!we!made!the!initially!subJoptimal!investment!of!opening!the!gap.!!You!don’t!know!at!the!stage!of!the!initial!open!gap,!but!when!extend!then!you !can!see!th at!it’s!the!o ptim al,!so!w e!need!separate!matrices!to!keep!track!of!just!the!scores!with!gaps.!! NeedlemanDWunsch)with)Affine)Gaps:))The)Algorithm:!!Intialization!is!similar!to!before.!!But!iteration!now!involves!keeping!track!of!three!DP!matrices,!and!then!choosing!the!max!out!of!the!three:!!!!

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-4-5 out of 16 pages.

Stanford CS 262 - Lecture Notes

Sign up for free to view:

Please select your school