Exact Indexing of Dynamic Time Warping

Exact Indexing of Dynamic Time Warping
Eamonn Keogh
Computer Science & Engineering Department, University of California - Riverside, Riverside, CA 92521
eamonn@cs.ucr.edu

Fair Use Agreement
If you use these slides (or any part thereof) for any lecture or class, please send me an email, if possible with a pointer to the relevant web page or document. eamonn@cs.ucr.edu

Outline of Talk
• Why do Time Series Similarity Matching?
• Limitations of Euclidean Distance
• Dynamic Time Warping
• Lower Bounding Dynamic Time Warping
• Indexing Dynamic Time Warping
• Experimental Evaluation
• Conclusions
• Questions

Why do Time Series Similarity Matching?
• Clustering
• Classification
• Rule Discovery
• Query by Content
[Figure: one example panel per task. The rule discovery panel is annotated "10 ⇒ s = 0.5 c = 0.3".]

Euclidean Vs Dynamic Time Warping
• Euclidean Distance: sequences are aligned "one to one".
• "Warped" Time Axis: nonlinear alignments are possible.

Limitations of Euclidean Distance I: Classification
Classification Experiment on Cylinder-Bell-Funnel Dataset
• Training data consists of 10 exemplars from each class.
• (One) Nearest Neighbor Algorithm
• "Leaving-one-out" evaluation, averaged over 100 runs
• Euclidean Distance Error rate: 26.10%
• Dynamic Time Warping Error rate: 2.87%
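The evaluation protocol named on this slide (one-nearest-neighbor with leaving-one-out) can be sketched as below. This is a minimal illustration, not the code used for the reported error rates: the function name `leave_one_out_error` and its parameters are hypothetical, and the distance measure (Euclidean, DTW, or anything else) is passed in by the caller.

```python
def leave_one_out_error(series, labels, distance):
    """Fraction of series whose nearest neighbor (excluding itself)
    carries a different class label. Needs at least two series."""
    errors = 0
    for i, query in enumerate(series):
        # Find the closest other series under the supplied distance.
        nearest = min(
            (j for j in range(len(series)) if j != i),
            key=lambda j: distance(query, series[j]),
        )
        if labels[nearest] != labels[i]:
            errors += 1
    return errors / len(series)
```

Swapping the `distance` argument is all that is needed to compare two measures on the same data.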

Limitations of Euclidean Distance II: Clustering
[Figure: clusterings of the days of the week (Friday, Monday, Tuesday, Thursday, Saturday, Sunday, Wednesday), one under Euclidean distance and one under Dynamic Time Warping.]
Wednesday was a national holiday.

Because of the robustness of Dynamic Time Warping compared to Euclidean Distance, it is used in…
• Bioinformatics: Aach, J. and Church, G. (2001). Aligning gene expression time series with time warping algorithms. Bioinformatics. Volume 17, pp 495-508.
• Robotics: Schmill, M., Oates, T. & Cohen, P. (1999). Learned models for continuous planning. In 7th International Workshop on Artificial Intelligence and Statistics.
• Medicine: Caiani, E.G., et. al. (1998) Warped-average template technique to track on a cycle-by-cycle basis the cardiac filling phases on left ventricular volume. IEEE Computers in Cardiology.
• Chemistry: Gollmer, K., & Posten, C. (1995) Detection of distorted pattern using dynamic time warping algorithm and application for supervision of bioprocesses. IFAC CHEMFAS-4
• Gesture Recognition: Gavrila, D. M. & Davis, L. S. (1995). Towards 3-d model-based tracking and recognition of human movement: a multi-view approach. In IEEE IWAFGR
• Meteorology / Tracking / Biometrics / Astronomy / Finance / Manufacturing …

How is DTW Calculated?
DTW(Q, C) = \min \left\{ \frac{\sqrt{\sum_{k=1}^{K} w_k}}{K} \right\}
\gamma(i, j) = d(q_i, c_j) + \min\{ \gamma(i-1, j-1), \gamma(i-1, j), \gamma(i, j-1) \}
[Figure: sequences C and Q and the warping path w.]
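The cumulative-distance recurrence on this slide can be sketched as a small dynamic program. This is an illustrative sketch under stated assumptions: `dtw_distance` is a hypothetical name, the local cost d is taken to be the squared difference, the square root of the final cumulative cost is returned, and any normalization by the warping path length is omitted.

```python
import math

def dtw_distance(q, c):
    """Unconstrained DTW between sequences q and c in O(len(q) * len(c)) time."""
    n, m = len(q), len(c)
    # gamma[i][j] is the cheapest cumulative cost of aligning q[:i] with c[:j].
    gamma = [[math.inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (q[i - 1] - c[j - 1]) ** 2  # assumed local cost: squared difference
            gamma[i][j] = d + min(
                gamma[i - 1][j - 1],  # advance in both sequences
                gamma[i - 1][j],      # advance in q only
                gamma[i][j - 1],      # advance in c only
            )
    return math.sqrt(gamma[n][m])
```

For example, `dtw_distance([0, 0, 1, 2, 1, 0], [0, 1, 2, 1, 0, 0])` is 0, because the warping path absorbs the one-step shift, whereas the one-to-one Euclidean distance between the same two sequences is 2.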

DTW is much better than Euclidean distance for classification, clustering, query by content etc. But is it not true that "dynamic time warping cannot be speeded up by indexing *", and is O(n^2)?
* Agrawal, R., Lin, K. I., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. VLDB pp. 490-501.

Global Constraints
• Slightly speed up the calculations
• Prevent pathological warpings
[Figure: the permitted warping region between C and Q for the Sakoe-Chiba Band and for the Itakura Parallelogram.]

A global constraint constrains the indices of the warping path w_k = (i, j)_k such that j - r ≤ i ≤ j + r, where r is a term defining the allowed range of warping for a given point in a sequence.
[Figure: r for the Sakoe-Chiba Band and for the Itakura Parallelogram.]
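A hedged sketch of how the constraint j - r ≤ i ≤ j + r restricts the dynamic program: only cells within r of the diagonal are filled in. The name `dtw_band` is hypothetical, r is assumed to be a constant (the Sakoe-Chiba case), the local cost is assumed to be the squared difference, and path-length normalization is omitted.

```python
import math

def dtw_band(q, c, r):
    """DTW restricted to warping paths with |i - j| <= r."""
    n, m = len(q), len(c)
    gamma = [[math.inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        # Visit only the columns j allowed by the band around row i.
        for j in range(max(1, i - r), min(m, i + r) + 1):
            d = (q[i - 1] - c[j - 1]) ** 2
            gamma[i][j] = d + min(
                gamma[i - 1][j - 1], gamma[i - 1][j], gamma[i][j - 1]
            )
    # Stays infinite if the band is too narrow to connect the two corners.
    return math.sqrt(gamma[n][m])
```

With r = 0 and equal-length inputs this reduces to the Euclidean distance. Each row touches at most 2r + 1 cells, which is where the modest speedup comes from.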

Lower Bounding
We can speed up similarity search under DTW by using a lower bounding function.
Intuition: Try to use a cheap lower bounding calculation as often as possible. Only do the expensive, full calculations when it is absolutely necessary.
Algorithm Lower_Bounding_Sequential_Scan(Q)
  best_so_far = infinity;
  for all sequences in database
    LB_dist = lower_bound_distance(C_i, Q);
    if LB_dist < best_so_far
      true_dist = DTW(C_i, Q);
      if true_dist < best_so_far
        best_so_far = true_dist;
        index_of_best_match = i;
      endif
    endif
  endfor
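The sequential scan above translates almost line for line into runnable code. In this sketch the lower bound and the true distance are passed in as functions, so the parameter names `lower_bound` and `true_distance` are placeholders of this example rather than anything defined in the slides.

```python
import math

def lower_bounding_sequential_scan(query, database, lower_bound, true_distance):
    """Return (index_of_best_match, best_so_far) for the nearest sequence.

    The result is exact only if lower_bound(c, q) <= true_distance(c, q)
    holds for every candidate c.
    """
    best_so_far = math.inf
    index_of_best_match = None
    for i, candidate in enumerate(database):
        # Cheap test first: skip candidates that cannot beat the current best.
        if lower_bound(candidate, query) < best_so_far:
            true_dist = true_distance(candidate, query)  # expensive step
            if true_dist < best_so_far:
                best_so_far = true_dist
                index_of_best_match = i
    return index_of_best_match, best_so_far
```

The tighter the lower bound, the more often the expensive call is skipped, which is why the tightness of the bound matters so much for this scan.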

Lower Bound of Kim et. al.
LB_Kim: The squared difference between the two sequences' first (A), last (D), minimum (B) and maximum (C) points is returned as the lower bound.
[Figure: two sequences with the points A, B, C and D marked.]
Kim, S, Park, S, & Chu, W. An index-based approach for similarity search supporting time warping in large sequence databases. ICDE 01, pp 607-614
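A rough sketch of a bound built from the four features named on this slide. The slide does not say how the four differences are combined, so this example makes an assumption: it returns the largest of the four absolute differences, which is a valid lower bound when DTW is defined as the square root of summed squared differences. The name `lb_kim` is hypothetical.

```python
def lb_kim(q, c):
    """Lower bound from the first, last, minimum and maximum points."""
    features = [
        abs(q[0] - c[0]),      # A: first points are always aligned
        abs(q[-1] - c[-1]),    # D: last points are always aligned
        abs(min(q) - min(c)),  # B: minimum points
        abs(max(q) - max(c)),  # C: maximum points
    ]
    # Assumed combination rule: keep the single largest difference.
    return max(features)
```

Only four numbers per sequence are needed, so the bound is very cheap, but it ignores almost all of the shape information.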

Lower Bound of Yi et. al.
LB_Yi: The sum of the squared lengths of the gray lines represents the minimum contribution of the corresponding points to the overall DTW distance, and thus can be returned as the lower bounding measure.
[Figure: sequence C drawn against the horizontal lines max(Q) and min(Q), with gray lines from the points of C that fall outside that range.]
Yi, B, Jagadish, H & Faloutsos, C. Efficient retrieval of similar time sequences under time warping. ICDE 98, pp 23-27.
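The description above can be sketched as follows: every point of C lying above max(Q) or below min(Q) contributes the squared length of its gray line. This is an illustration only; `lb_yi` is a hypothetical name, and taking the square root of the sum is an assumption made so that the result is comparable with a DTW defined as the square root of summed squared differences.

```python
import math

def lb_yi(q, c):
    """Lower bound from the points of c outside the range [min(q), max(q)]."""
    q_max, q_min = max(q), min(q)
    total = 0.0
    for value in c:
        if value > q_max:
            total += (value - q_max) ** 2  # gray line above max(Q)
        elif value < q_min:
            total += (value - q_min) ** 2  # gray line below min(Q)
    return math.sqrt(total)
```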

What we have seen so far…
• Dynamic Time Warping (DTW) is a very robust technique for measuring time series similarity.
• DTW is widely used in diverse fields.
• Since DTW is expensive to calculate, techniques to speed up similarity search have been introduced, including global constraints and two different lower bounding techniques.

A Novel Lower Bounding Technique I
U_i = \max(q_{i-r} : q_{i+r})
L_i = \min(q_{i-r} : q_{i+r})
[Figure: the query Q enclosed by its upper envelope U and lower envelope L, with a candidate C, shown for the Sakoe-Chiba Band and for the Itakura Parallelogram.]
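The two envelope definitions can be computed with a sliding window. A minimal sketch, assuming a constant r (the Sakoe-Chiba case) and assuming that windows are simply truncated at the ends of the sequence; `envelope` is a hypothetical name.

```python
def envelope(q, r):
    """Return (U, L), where U[i] and L[i] are the max and min of q[i-r .. i+r]."""
    n = len(q)
    upper, lower = [], []
    for i in range(n):
        # Truncate the window at the sequence boundaries (assumption).
        window = q[max(0, i - r): min(n, i + r + 1)]
        upper.append(max(window))
        lower.append(min(window))
    return upper, lower
```

For example, `envelope([0, 1, 3, 2], 1)` gives `([1, 3, 3, 3], [0, 0, 1, 2])`. With r = 0 both envelopes collapse onto q itself.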

A Novel Lower Bounding Technique II
LB\_Keogh(Q, C) = \sqrt{ \sum_{i=1}^{n} \begin{cases} (c_i - U_i)^2 & \text{if } c_i > U_i \\ (c_i - L_i)^2 & \text{if } c_i < L_i \\ 0 & \text{otherwise} \end{cases} }
[Figure: LB_Keogh shown as gray lines from C to the envelope (U, L) of Q, for the Sakoe-Chiba Band and for the Itakura Parallelogram.]
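A sketch of the LB_Keogh formula for equal-length sequences. It assumes a constant r (Sakoe-Chiba Band), truncates envelope windows at the sequence ends, and takes the square root of the summed squared differences. The function name `lb_keogh` is chosen for this example.

```python
import math

def lb_keogh(q, c, r):
    """Distance from c to the envelope of q built with warping range r."""
    if len(q) != len(c):
        raise ValueError("this sketch assumes sequences of equal length")
    n = len(q)
    total = 0.0
    for i in range(n):
        window = q[max(0, i - r): min(n, i + r + 1)]
        u_i, l_i = max(window), min(window)
        if c[i] > u_i:
            total += (c[i] - u_i) ** 2  # c pokes above the upper envelope
        elif c[i] < l_i:
            total += (c[i] - l_i) ** 2  # c dips below the lower envelope
        # Points inside the envelope contribute nothing.
    return math.sqrt(total)
```

With r = 0 the envelope is q itself and the value equals the Euclidean distance. A wider r loosens the envelope and can only lower the bound.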

The tightness of the lower bound for each technique is proportional to the length of gray lines used in the illustrations.
[Figure: gray lines for LB_Kim, LB_Yi, LB_Keogh (Sakoe-Chiba) and LB_Keogh (Itakura).]

Before we consider the problem of indexing, let us empirically evaluate the quality of the proposed lower bounding technique. This is a good idea, since it is an implementation-free measure of quality. First we must discuss our experimental philosophy…

Experimental Philosophy
• We tested on 32 datasets from such diverse fields as finance, medicine, biometrics, chemistry, astronomy, robotics, networking and industry. The datasets cover the complete spectrum of stationary/non-stationary, noisy/smooth, cyclical/non-cyclical, symmetric/asymmetric etc.
• Our experiments are completely reproducible. We saved every random number, every setting and all data.
• To ensure true randomness, we use random numbers created by a quantum mechanical process.
• We test with the Sakoe-Chiba Band, which is the worst case for us (the Itakura Parallelogram would give us much better results).

Tightness of Lower Bound Experiment
• We measured T:
T = \frac{\text{Lower Bound Estimate of Dynamic Time Warp Distance}}{\text{True Dynamic Time Warp Distance}}, \quad 0 \le T \le 1
The larger the better.
• For each dataset, we randomly extracted 50 sequences of length 256. We compared each sequence to the 49 others. (Query length of 256 is about the mean in the literature.)
• For each dataset we report T as the average ratio from the 1,225 (50*49/2) comparisons made.
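The measurement described on this slide can be sketched as an average over all unordered pairs of sequences. This is an illustration, not the original experimental code: `mean_tightness` and its parameters are hypothetical names, and skipping pairs whose true distance is zero is an assumption made here to avoid dividing by zero.

```python
from itertools import combinations

def mean_tightness(sequences, lower_bound, true_distance):
    """Average of lower_bound / true_distance over all unordered pairs."""
    ratios = []
    for a, b in combinations(sequences, 2):
        true_dist = true_distance(a, b)
        if true_dist > 0:  # assumption: skip identical pairs
            ratios.append(lower_bound(a, b) / true_dist)
    if not ratios:
        raise ValueError("no pair with a non-zero true distance")
    return sum(ratios) / len(ratios)
```

With 50 sequences, `combinations` yields 50*49/2 = 1,225 pairs, matching the count on the slide.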

[Figure: tightness of lower bound T (0 to 1.0) for LB_Keogh, LB_Yi and LB_Kim on each of the 32 datasets.]

Effect of Query Length on Tightness of Lower Bounds
[Figure: tightness of lower bound T (0 to 1.0) against query length (16, 32, 64, 128, 256, 512, 1024) for LB_Keogh, LB_Yi and LB_Kim.]
