BU CS 565 - Time-series data analysis (28 pages)

Time-series data analysis

Pages:
28
School:
Boston University
Course:
Cs 565 - Advanced Java Programming

Time series data analysis

Why deal with sequential data?
• Because all data is sequential: all data items arrive in the data store in some order.
• Examples: transaction data, documents and words.
• In some (or many) cases the order does not matter.
• In many cases the order is of interest.

Time-series data
• Financial time series, process monitoring.

Questions
• What is the structure of sequential data?
• Can we represent this structure compactly and accurately?

Sequence segmentation
• Gives an accurate representation of the structure of sequential data.
• How? By trying to find homogeneous segments.
• Segmentation question: can a sequence $T = \{t_1, t_2, \ldots, t_n\}$ be described as a concatenation of subsequences $S_1, S_2, \ldots, S_k$ such that each $S_i$ is in some sense homogeneous?
• The corresponding notion of segmentation in unordered data is clustering.

Dynamic programming algorithm
• Sequence $T$ of length $n$, $k$ segments, cost function $E(\cdot)$, table $M$.
• For $i = 1$ to $n$: set $M[1, i] = E(T[1 \ldots i])$ (everything in one cluster).
• For $j = 1$ to $k$: set $M[j, j] = 0$ (each point in its own cluster).
• For $j = 2$ to $k$, for $i = j + 1$ to $n$: set $M[j, i] = \min_{i' < i} \{ M[j-1, i'] + E(T[i'+1 \ldots i]) \}$.
• To recover the actual segmentation (not just the optimal cost), also store the minimizing values $i'$.
• Takes time $O(n^2 k)$ and space $O(kn)$.

Example
[Figure: two plots with axes labeled R and t]

Basic definitions
• Sequence $T = \{t_1, t_2, \ldots, t_n\}$: an ordered set of $n$ $d$-dimensional real points $t_i \in \mathbb{R}^d$.
• A $k$-segmentation $S$: a partition of $T$ into $k$ contiguous segments $\{s_1, s_2, \ldots, s_k\}$.
• Each segment $s \in S$ is represented by a single value $\mu_s \in \mathbb{R}^d$ (the representative of the segment).
• Error $E_p(S)$: the error of replacing individual points with representatives:

$$E_p(S) = \left( \sum_{s \in S} \sum_{t \in s} |t - \mu_s|^p \right)^{1/p}$$

The k-segmentation problem
• Given a sequence $T$ of length $n$ and a value $k$, find a $k$-segmentation $S = \{s_1, s_2, \ldots, s_k\}$ of $T$ such that the $E_p$ error is minimized.
• Common cases for the error function $E_p$: $p = 1$ and $p = 2$.
• When $p = 1$, the best $\mu_s$ corresponds to the median of the points in segment $s$.
• When $p = 2$, the best $\mu_s$ corresponds to the mean of the points in segment $s$.

Optimal solution for the k-segmentation problem
• [Bellman '61] The k-segmentation problem can be solved optimally using
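
The dynamic programming recurrence from the algorithm slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's reference implementation: it assumes a one-dimensional sequence and the $p = 2$ case, and it minimizes the sum of squared deviations from each segment mean (which has the same minimizer as $E_2$, since the outer $1/p$ root is monotone). The names `segment`, `cost`, and `back` are hypothetical, and the code uses 0-based, half-open index ranges rather than the 1-based indices in the recurrence.

```python
from itertools import accumulate


def segment(seq, k):
    """Optimal k-segmentation of a 1-D sequence under squared error (p = 2).

    Returns (total_cost, boundaries), where boundaries is a list of
    half-open (start, end) index pairs, one per segment.
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")

    # Prefix sums of values and squared values give O(1) segment costs.
    s1 = [0.0] + list(accumulate(seq))
    s2 = [0.0] + list(accumulate(x * x for x in seq))

    def cost(a, b):
        """Sum of squared deviations from the mean over seq[a:b]."""
        total = s1[b] - s1[a]
        # Clamp at zero to absorb floating-point round-off.
        return max(0.0, (s2[b] - s2[a]) - total * total / (b - a))

    inf = float("inf")
    # M[j][i]: minimum cost of covering the first i points with j segments.
    M = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[j][i]: the minimizing split point i' for M[j][i].
    back = [[0] * (n + 1) for _ in range(k + 1)]
    M[0][0] = 0.0

    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for split in range(j - 1, i):
                c = M[j - 1][split] + cost(split, i)
                if c < M[j][i]:
                    M[j][i] = c
                    back[j][i] = split

    # Follow the stored split points backwards to recover the segments.
    bounds, i = [], n
    for j in range(k, 0, -1):
        bounds.append((back[j][i], i))
        i = back[j][i]
    return M[k][n], bounds[::-1]
```

Calling `segment([1, 1, 1, 5, 5, 5, 9, 9], 3)` splits the sequence into its three constant runs. The triple loop matches the $O(n^2 k)$ running time stated above, and the two tables account for the $O(kn)$ space; the prefix sums are a design choice that keeps each evaluation of the cost function constant-time.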
