DOC PREVIEW
U of I CS 498 - Tracking and Parsing

This preview shows page 1-2-3-19-20-39-40-41 out of 41 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Cute Tricks with Dynamic Programming:Tracking and ParsingD.A. ForsythRecall: Dynamic Programming•We had to choose a set of discrete variables to minimize a cost function•We did this by building a cost-to-go function•recursively•at X_(n-1)=cost of best path leaving X_n-1•at X_k= best of (cost of leaving X_k+cost to go(X_k+1))u1(X1)+b12(X1,X2)+u2(X2)+b23(X2,X3)+u3(X3)+...un(Xn)Human parsing•We are given an image•We have a model of a person•head+torso•head+torso+upper+lower arms•head+torso+upper+lower arms+legs•more, perhaps•could be 2D or 3D•We know a bunch of stuff about this model•eg color of each segment•eg how segments join up•We must report configuration of the modelApplications•KinectSimplest case•Model •2D•segments are image rectangles of known size•we know the color of the segments•we have two cost functions•segment to image match•eg sum of squared color differences•segment to segment compatibility•eg ends are close; possibly angles•in a tree (more details to follow)•Matching = dynamic programmingBuilding a trellis•Simple cases•head+torso•one column each•entries in the column are image rectangles•picture just like the texture picture•head+torso+two upper arms•one column per segment•entries are image rectangles•picture only slightly more interesting, because we have a treeIn notation•We have variables X1..Xn•one per body segment•each can be mapped to an image rectangle•for the moment, assume they form a chain•eg head+torso; head+torso+upper+lower leg•trees will be easy, as above•We have a cost function•with a special form (for the case of a chain)•We seekIn notation -II: the cost to go function•We have for the chain function:•Define the cost-to-go function by•this is the base of a recursion!•Notice that•is the same as (except for Xn, which is missing)In notation -III: the cost-to-go function-IIIn notation - IVSoDynamic programming for a chainHeavily simplified tracking model•We have a moving object, and want to trace its path•Two sources of information•Measurement•actual, but unknown, position in k’th image is x_k (2D, continuous)•measured position in k’th image is m_k (2D, continuous)•very close to actual position•m_k=x_k+(tiny position error)•far more complex models are possible•Dynamics•it isn’t moving much•i.e. x_k+1=x_k+(tiny position change)•far more complex models are possibleIn equationsxk+1= xk+ ζkmk= xk+ ηktiny position changescale is σptiny position changescale is σmMotion constraintMeasurement constraintFinding the path•To find the path, we must choose x_1, ..., x_n to minimize(m1−x1)T(m1−x1)σ2m+(x2−x1)T(x2−x1)σ2p+...(xn−xn−1)T(xn−xn−1)σ2p+(mn−xn)T(mn−xn)σ2mBuilding a cost-to-go function•Terms involving x_n:•Maximize with respect to x_n•set gradient wrt x_n to zero•rearrange•get(xn− xn−1)T(xn− xn−1)σ2p+(mn− xn)T(mn− xn)σ2mxn=�xn−1σ2p+mnσ2m��σ2mσ2pσ2m+ σ2p�We can substitute back(m1−x1)T(m1−x1)σ2m+(x2−x1)T(x2−x1)σ2p+...(xn−2−xn−1)T(xn−2−xn−1)σ2p+(xn−1−mn−1)T(xn−1−mn−1)σ2m+c(xn−1)(m1−x1)T(m1−x1)σ2m+(x2−x1)T(x2−x1)σ2p+...(xn−xn−1)T(xn−xn−1)σ2p+(mn−xn)T(mn−xn)σ2mAnd do this again and again...•To get some c(x_1)•minimize this•substitute x_1 in expression for x_2•x_2 in expression for x_3•...•x_n-1 in expression for x_n•This is a dynamic program, too•Can be extended to:•more complex state models•more complex measurement models•more complex dynamic models•Result: The Kalman FilterReminderDP works for forestsSome issues •How do you deal with two legs?•or two arms?•or, rather, how do you ensure the match has two?“Efficient Matching of Pictorial Structures,” by P. Felzenszwalb and D.P. Huttenlocher, Proc. IEEE CVPR 2000, c 2000, IEEE.What if you don’t know the color?•Easy answer for video:•look for lateral walking view with a classifier•read off appearance of arms, legs, etc•feed into model; now detect in each frameBuild and detect modelstorsoar mle gheadlabelpixelslearnlimbclassifiersgeneralposepictorialstructureunusual posesmall scale"Lola" likelihoodRamanan, Forsyth and Zisserman CVPR05Ramanan, Forsyth and Zisserman CVPR05Ramanan, Forsyth and Zisserman CVPR05What if you don’t know the color? - II•Hard answer for static images:•guess color; parse; reestimate color; reparse; and so onGuessing the colorFrom Ramanan, 03IteratingFrom Ramanan, 03IteratingFrom Ramanan, 03This is a search•And we can prune it•It’s easy to detect torso’s accurately•the arms can get only so far from the torsoPruned searchBut dynamic programming fails•We’ve already seen that getting two arms/legs is an issue•which we ducked, somewhat•Our model is wrong•arms tend to have the same color•legs tend to have the same color•this means there should be terms in the cost function that link left/right•this breaks dpBreaking dp - I•Cost function for a chain of three variables:•Picture:u1(X1)+b12(X1,X2)+u2(X2)+b23(X2,X3)+u3(X3)X X X1 2 3Breaking dp•New pictureXX X12 3u1(X1)+b12(X1,X2)+u2(X2)+b23(X2,X3)+u3(X3)+b13(X1,X3)Breaking dp - III•Try to build a cost-to-go function:•eliminate X3•but now our cost-to-go function depends on X1 and X2•so no benefit! •We need to check all triples!u1(X1)+b12(X1,X2)+u2(X2)+b23(X2,X3)+u3(X3)+b13(X1,X3)Breaking dp - IVTran and Forsyth 10All sorts of picturesXX X45 6XX X12 3XX X97 8Could you do dynamic programming here?Not in terms of X1...X9but in terms of G1, G2, G3G1=(X1, X2, X3)G2=(X4, X5, X6)G3=(X7, X8, X9)Strategy for non-dp cases•Approximation•true answer is intractable•Strategy:•Fix some u’s, to make a chain•do dp•now fix different u’s, iterate•One of many possible strategies•question is very richTran+Forsyth, 10Important points•Human parsing is useful•Can do with dp•Can lead to models where you can’t maximize with dp•or can’t maximize efficiently•What to do is an “algorithms” question•Approximate algorithms exist•Hard questions•“best” approximate algorithm•quality of


View Full Document

U of I CS 498 - Tracking and Parsing

Documents in this Course
Lecture 5

Lecture 5

13 pages

LECTURE

LECTURE

39 pages

Assurance

Assurance

44 pages

LECTURE

LECTURE

36 pages

Pthreads

Pthreads

29 pages

Load more
Download Tracking and Parsing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Tracking and Parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Tracking and Parsing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?