Cut-And-Stitch: Efficient Parallel Learning of Linear Dynamical Systems on SMPsLei Li, Wenjie Fu, Fan Guo, Todd C. Mowry, Christos FaloutsosComputer Science Department Carnegie Mellon [email protected] of Computer ScienceOutline• Motivation• Background• Proposed Method: Cut-And-Stitch• Experiments• Conclusion2Motivation• Common tool: Linear Dynamical System (LDS)• Challenge?– Learning LDS is slow3Need For SpeedMotion Capture, Human Motion Modelingfinancial data modeling, predictionsensor data, pattern miningRelated Work• Parallel mining of closed sequential patterns, [KDD 05, S. Cong, J. Han, D. Padua]• Mining terabytes of data for frequent itemsets, [PPoPP 07, G. Buehrer, S. Parthasarathy, S. Tatikonda, T. Kurc, J. Saltz]• Parallelize the discovery of frequent patterns in large graphs, [IPDPS 07, S. Reinhardt, G. Karypis]• Parallel algorithms for SVM – Cascade SVM, [NIPS 04, H. Graf, E. Cosatto, L. Bottou, I. Durdanovic, V. Vapnik]– Parallel Mixture of SVMs, [NIPS 02, R. Collobert, S. Bengio, Y. Bengio]– PSVM (open src), [NIPS 07, E. Chang, K. Zhu, H. Wang, H. Bai, J. Li, Z. Qiu, H. Cui]• Multicore Map-Reduce approach, [NIPS 06, C. Chu, S. Kim, Y. Lin, Y. Yu, G. Bradski, A. Ng, K. Olukotun]4Problem Definition• Problem: – Given a sequence of data, find the best model parameters for Linear Dynamical System• Traditional Method: – Maximum Likelihood Estimation via Expectation-Maximization(EM) algorithm• Objective:– Parallelize the learning algorithm• Assumption:– shared memory architecture5Linear Dynamical Systemaka. Kalman Filter6• Parameters: θ=(u0, V0, A, Γ, C, Σ)• Observation: y1…yn• Hidden variables: z1… zn6Z1Z1Z2Z2Z3Z3Z4Z4Z5Z5Y1Y2Y3Y4Y5(A∙z1, Γ)(u0, V0)(C∙z3, Σ)(A∙z2, Γ)(C∙z1, Σ) (C∙z2, Σ)(C∙z4, Σ)(A∙z3, Γ)(C∙z5, Σ)(A∙z4, Γ)Example7given positions, estimate dynamics (i.e. params)z1y1y2z3y3z4y4z5y5z6y6z2TimePosition of left elbowTraditional:How to learn LDS?8Sequential Learning (EM)z1z2z2y1y2y2z3z3y3y3z4z4y4y4z5z5y5y5z6z6y6y6Compute P(z1| y1)Time*MeasuredEstimatedPosition of left elbowSequential Learning (EM)10z1z2y1y2z3z3y3y3z4z4y4y4z5z5y5y5z6z6y6y6From P(z1| y1) Compute P(z2| y1 , y2)Time*Intuition: z2may be close to z1*MeasuredEstimatedPosition of left elbowSequential Learning (EM)11z1y1y2z3y3z4z4y4y4z5z5y5y5z6z6y6y6From P(z2| y1 , y2) Compute P(z3| y1 , y2 , y3)z2Time***MeasuredEstimatedPosition of left elbowSequential Learning (EM)12z1y1y2z3y3z4y4z5z5y5y5z6z6y6y6From P(z3| y1 , y2 , y3) Compute P(z4| y1 , y2 , y3 , y4)z212Time****MeasuredEstimatedPosition of left elbowSequential Learning (EM)13z1y1y2z3y3z4y4z5y5z6z6y6y6z2From P(z4| y1 , y2 , y3 , y4) Compute P(z5| y1 , y2 , y3 , y4 , y5)Time*****MeasuredEstimatedPosition of left elbowSequential Learning (EM)14z1y1y2z3y3z4y4z5y5z6y6z2From P(z5| y1 , y2 , y3 , y4 , y5) Compute P(z6| y1 , y2 , y3 , y4 , y5 , y6)Time******MeasuredEstimatedPosition of left elbow*Sequential Learning (EM)15z1y1y2z3y3z4y4z5y5z6y6z2From P(z6| y1 , y2 , y3 , y4 , y5 , y6) Compute P(z5| y1 , y2 , y3 , y4 , y5 , y6)15Time****MeasuredEstimated**Intuition: take the future backwardPosition of left elbowSequential Learning (EM)16z1y1y2z3y3z4y4z5y5z6y6z2From P(z6| y1 , y2 , y3 , y4 , y5 , y6) Compute P(z4| y1 , y2 , y3 , y4 , y5 , y6)*1616Time****Measured***Estimated*****Position of left elbowSequential Learning (EM)z1y1y2z3y3z4y4z5y5z6y6z2From P(z4| y1 , y2 , y3 , y4 , y5 , y6) Compute P(z3| y1 , y2 , y3 , y4 , y5 , y6)17*1717Time****Measured****Estimated*****Position of left elbowSequential Learning (EM)z1y1y2z3y3z4y4z5y5z6y6z2From P(z3| y1 , y2 , y3 , y4 , y5 , y6) Compute P(z2| y1 , y2 , y3 , y4 , y5 , y6)18*1818Time****Measured*****Estimated*****Position of left elbowSequential Learning (EM)19z1y1y2z3y3z4y4z5y5z6y6z2From P(z2| y1 , y2 , y3 , y4 , y5 , y6) Compute P(z1| y1 , y2 , y3 , y4 , y5 , y6)191919**1919Time****Measured*****Estimated*****Position of left elbowSequential Learning (EM)z1y1y2z3y3z4y4z5y5z6y6z2From all posterior z1 , z2 , z3 , z4 , z5 , z6P(z1| y1 , y2 , y3 , y4 , y5 , y6), P(z2| y1 , y2 , y3 , y4 , y5 , y6)…Compute sufficient statisticsE[zi]E[zizi’]E[zi-1zi’]Sequential Learning (EM)21z1y1y2z3y3z4y4z5y5z6y6z2**Time****Measured*****with sufficient statistics, compute argmax ←likelihood(θ) θreconstructed signalPosition of left elbowSequential Learning (EM)22z1y1y2z3y3z4y4z5y5z6y6z2From P(z6| y1 , y2 , y3 , y4 , y5 , y6) Compute P(z5| y1 , y2 , y3 , y4 , y5 , y6)22TimePosition*****MeasuredEstimated******How to parallelize it?23Speed Bottleneck: sequential computation of posteriorz1y1y2z3y3z4y4z5y5y6z2z6“Leap of faith”start computation without feedback from previous node (cut), and reconcile later (stitch)24Proposed Method: Cut-And-Stitch25z1y1y2z3y3z4y4z5y5y6z2z6υ2,Φ2,η2,Ψ2υ1,Φ1,η1,Ψ1z1y1y2z'2z2z3y3z4y4z'4z5y5y6z6υ3,Φ3,η3,Ψ3start computation without feedback from previous node (cut)reconcile later (stitch)Cut-And-Stitchυ2,Φ2,η2,Ψ2υ1,Φ1,η1,Ψ1z1y1y2y2z'2z'2z2z2z3y3z4z4y4y4z'4z'4z5y5y6y6z6z6υ3,Φ3,η3,Ψ3Cut step: Estimate posteriors (E)TimeMeasuredEstimatedIntuition: compute all three at once***P(z1| y1), P(z3| y3), P(z5| y5)Position of left elbowCut-And-Stitchυ2,Φ2,η2,Ψ2υ1,Φ1,η1,Ψ1z1y1y2z'2z'2z2z3y3z4y4z'4z'4z5y5y6z6υ3,Φ3,η3,Ψ3Cut step: Estimate posteriors (E)Time******MeasuredEstimatedPosition of left elbowCut-And-Stitchυ2,Φ2,η2,Ψ2υ1,Φ1,η1,Ψ1z1y1y2z'2z'2z2z3y3z4y4z'4z'4z5y5y6z6υ3,Φ3,η3,Ψ3Cut step: Estimate posteriors (E)TimeMeasured*********Intuition: backward adjust all at oncePosition of left elbowCut-And-Stitchυ2,Φ2,η2,Ψ2υ1,Φ1,η1,Ψ1z1y1y2z'2z'2z2z3y3z4y4z'4z'4z5y5y6z6υ3,Φ3,η3,Ψ3Stitch step: Collect sufficient statistics for each block (C)E[zi]E[zizi’]E[zi-1zi’]Cut-And-StitchStitch step: Collect sufficient Statistics (C) Maximize parameters (M)υ2,Φ2,η2,Ψ2υ1,Φ1,η1,Ψ1z1y1y2z'2z'2z2z3y3z4y4z'4z'4z5y5y6z6υ3,Φ3,η3,Ψ330TimeMeasured*********Position of left elbowCut-And-Stitchυ2,Φ2,η2,Ψ2υ1,Φ1,η1,Ψ1z1y1y2z'2z2z3y3z4y4z'4z5y5y6z6υ3,Φ3,η3,Ψ3Stitch together: Re-estimate block parameters (R)TimeMeasured************Intuition: exchange messages cross blockIterate…reconstructed signalPosition of left elbowExperimentsQ1: How much speed up can we get?Q2: How good is the reconstuction accuracy?32Experiments• Dataset:– 58 human motion sequences, 200 – 500
or
We will never post anything without your permission.
Don't have an account? Sign up