How to Select an Appropriate Similarity Measure: Towards a Symmetry-Based Approach

Ildar Batyrshin (1), Thongchai Dumrongpokaphan (2), Vladik Kreinovich (3), and Olga Kosheleva (3)

(1) Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), México, D.F., batyr1@gmail.com
(2) Department of Mathematics, Chiang Mai University, Thailand, tcd43@hotmail.com
(3) University of Texas at El Paso, USA, vladik@utep.edu, olgak@utep.edu
1. Outline
• When practitioners analyze the similarity between time series, they often use correlation.
• Sometimes this works.
• However, sometimes this leads to counter-intuitive results.
• In such cases, other similarity measures are more appropriate.
• An important question is how to select an appropriate similarity measure.
• In this talk, we show, on simple examples, that the use of natural symmetries (scaling and shift) can help with such a selection.
2. Practitioners Routinely Use Correlation to Detect Similarities
• Practitioners are often interested in gauging similarity:
  – between two sets of related data or
  – between two time series.
• A natural idea seems to be to look for (sample) correlation:
  $$\rho(a, b) = \frac{C_{a,b}}{\sigma_a \cdot \sigma_b},$$
  where
  $$C_{a,b} \stackrel{\text{def}}{=} \frac{1}{n} \cdot \sum_{i=1}^{n} (a_i - \overline{a}) \cdot (b_i - \overline{b}), \quad \overline{a} \stackrel{\text{def}}{=} \frac{1}{n} \cdot \sum_{i=1}^{n} a_i, \quad \overline{b} \stackrel{\text{def}}{=} \frac{1}{n} \cdot \sum_{i=1}^{n} b_i,$$
  $$\sigma_a \stackrel{\text{def}}{=} \sqrt{V_a}, \quad \sigma_b \stackrel{\text{def}}{=} \sqrt{V_b}, \quad V_a \stackrel{\text{def}}{=} \frac{1}{n} \cdot \sum_{i=1}^{n} (a_i - \overline{a})^2, \quad V_b \stackrel{\text{def}}{=} \frac{1}{n} \cdot \sum_{i=1}^{n} (b_i - \overline{b})^2.$$
• Practitioners understand that correlation only detects linear dependence.
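
The definitions above translate almost line for line into code. The following is a minimal Python sketch, not taken from the slides: the function name `sample_correlation` is an illustrative choice, and the sketch assumes that both sequences have the same nonzero length and that neither is constant.

```python
import math


def sample_correlation(a, b):
    """Sample correlation rho(a, b) = C_ab / (sigma_a * sigma_b), with 1/n normalization."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must have the same nonzero length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Covariance C_ab and variances V_a, V_b, exactly as defined on the slide.
    cov_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    sigma_a = math.sqrt(var_a)
    sigma_b = math.sqrt(var_b)
    if sigma_a == 0 or sigma_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov_ab / (sigma_a * sigma_b)


# Illustrative data (not from the slides): b is an exact linear function of a.
print(sample_correlation([1, 2, 3, 4], [2, 4, 6, 8]))
```

Because the same factor 1/n appears in the covariance and in both variances, it cancels in the ratio, so using 1/(n - 1) throughout would give the same value of the correlation.
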
3. Limitations of Correlation
• In some cases, the dependence is non-linear.
• In such cases, simple correlation does not work.
• More complex methods are needed to detect dependence.
• Also, correlation assumes that the value $b_i$ is only affected by the value of $a_i$ at the same moment of time $i$.
• In real life, we may have a delayed effect, and the corresponding delay may depend on time.
• However, in simple linear no-delay cases, practitioners expect correlation to be a perfect measure of similarity.
• And often it is. But sometimes, it is not. Let us give two examples.
4. First Example
• We ask people to evaluate movies on a scale of 0–5.
• Persons $a$, $b$, and $c$ gave the following grades:
  $a_1 = 4,\ a_2 = 5,\ a_3 = 4,\ a_4 = 5,\ a_5 = 4,\ a_6 = 5$;
  $b_1 = 5,\ b_2 = 4,\ b_3 = 5,\ b_4 = 4,\ b_5 = 5,\ b_6 = 4$;
  $c_1 = 0,\ c_2 = 1,\ c_3 = 0,\ c_4 = 1,\ c_5 = 0,\ c_6 = 1$.
• From the common sense viewpoint, $a$ and $b$ have similar tastes: they like all the movies.
• However, between $a_i$ and $b_i$, there is a perfect anti-correlation $\rho = -1$.
• The opposite opinion is expressed by $c$, who does not like the movies.
• However, between $a_i$ and $c_i$, there is a perfect correlation $\rho = 1$; so, correlation is counter-intuitive.
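
The two correlation values in this example can be checked numerically. The sketch below is an illustration only: the helper `sample_correlation` is a hypothetical name, and it implements the sample correlation with 1/n normalization. The grades are the ones listed on the slide.

```python
import math


def sample_correlation(a, b):
    """Sample correlation with 1/n normalization; assumes equal-length, non-constant inputs."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return cov_ab / (math.sqrt(var_a) * math.sqrt(var_b))


# Grades from the slide.
a = [4, 5, 4, 5, 4, 5]
b = [5, 4, 5, 4, 5, 4]
c = [0, 1, 0, 1, 0, 1]

rho_ab = sample_correlation(a, b)
rho_ac = sample_correlation(a, c)

# Average grades: a and b both rate the movies highly, c rates them low.
mean_grades = {name: sum(v) / len(v) for name, v in (("a", a), ("b", b), ("c", c))}

print("rho(a, b) =", rho_ab)
print("rho(a, c) =", rho_ac)
print("mean grades:", mean_grades)
```

The average grades (4.5 for both $a$ and $b$, 0.5 for $c$) reflect the common-sense reading, while the correlation values point the other way.
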
5. Second Example
• Suppose that the US stock market shows periodic oscillations, with relative values
  $a_1 = 1.0,\ a_2 = 0.9,\ a_3 = 1.0,\ a_4 = 0.9$.
• The stock market in a small country X shows similar relative changes, but with a much higher amplitude:
  $b_1 = 1.0,\ b_2 = 0.5,\ b_3 = 1.0,\ b_4 = 0.5$.
• These sequences are somewhat similar, but not the same:
  – while the US stock market has relatively small 10% fluctuations,
  – the stock market of the country X changes by a factor of two.
• However, the two stock markets have a perfect positive correlation $\rho = 1$.
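
A quick numerical check of this example is sketched below, under the same assumptions as the correlation formula (1/n normalization). The helper name `sample_correlation` and the peak-to-trough ratio used to quantify the amplitude are illustrative choices, not part of the slides.

```python
import math


def sample_correlation(a, b):
    """Sample correlation with 1/n normalization; assumes equal-length, non-constant inputs."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return cov_ab / (math.sqrt(var_a) * math.sqrt(var_b))


# Relative values from the slide.
a = [1.0, 0.9, 1.0, 0.9]  # US stock market
b = [1.0, 0.5, 1.0, 0.5]  # stock market of country X

rho = sample_correlation(a, b)

# Peak-to-trough ratio as a simple (assumed) way to express the amplitude.
ratio_a = max(a) / min(a)
ratio_b = max(b) / min(b)

print("rho(a, b) =", rho)
print("peak/trough for a:", ratio_a)
print("peak/trough for b:", ratio_b)
```

The correlation equals 1 (up to rounding) because $b_i = 5 \cdot a_i - 4$ for every $i$, an exact linear dependence, even though the amplitudes differ greatly.
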
6. Other Similarity Measures
• The need to go beyond correlation is well known.
• Many effective similarity measures have been proposed.
• Most of these measures start:
  – either with correlation,
  – or with the Euclidean distance
    $$d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2},$$
  – or with a more general $l_p$-distance
    $$\left( \sum_{i=1}^{n} |a_i - b_i|^p \right)^{1/p}.$$
• Sometimes, a linear or nonlinear transformation is applied to the result, to make it more intuitive.
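
The two distance formulas above can be sketched in a few lines of Python. The function names `lp_distance` and `euclidean_distance` are illustrative, and restricting to $p \ge 1$ is an assumption made here so that the result is a genuine distance. The sample data are the movie grades from the first example.

```python
import math


def lp_distance(a, b, p=2.0):
    """l_p-distance: (sum_i |a_i - b_i|^p)^(1/p); p >= 1 is assumed."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean_distance(a, b):
    """Euclidean distance, i.e., the l_p-distance with p = 2."""
    return lp_distance(a, b, p=2.0)


# Movie grades from the first example.
a = [4, 5, 4, 5, 4, 5]
b = [5, 4, 5, 4, 5, 4]
c = [0, 1, 0, 1, 0, 1]

d_ab = euclidean_distance(a, b)
d_ac = euclidean_distance(a, c)
print("d(a, b) =", d_ab)
print("d(a, c) =", d_ac)
print("l_1(a, b) =", lp_distance(a, b, p=1.0))
```

On these grades, the Euclidean distance between $a$ and $b$ is $\sqrt{6}$, while between $a$ and $c$ it is $\sqrt{96}$, so here the distance-based ordering agrees with common sense.
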
7. Other Similarity Measures (cont-d)
• In other situations, modifications take care of the possible time lag in describing the dependence.
• For example, we may look for a correlation between $b_i$ and the delayed series $a_{i+c}$ for an appropriate $c$.
• More generally, we can look for a delay $c(i)$ that changes with time, i.e., for correlation between $b_i$ and $a_{i+c(i)}$.
• An example of such a similarity measure is the move-split-merge metric.
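
The constant-lag case can be sketched as a search over candidate values of $c$. This is only an illustration under stated assumptions: the names `lagged_correlation` and `best_lag`, the search range, and the sample data are hypothetical, and the sketch does not cover the time-varying delay $c(i)$ or the move-split-merge metric.

```python
import math


def sample_correlation(a, b):
    """Sample correlation with 1/n normalization; assumes equal-length, non-constant inputs."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return cov_ab / (math.sqrt(var_a) * math.sqrt(var_b))


def lagged_correlation(a, b, c):
    """Correlation between b_i and a_{i+c}, over the indices i where both exist."""
    n = len(a)
    pairs = [(a[i + c], b[i]) for i in range(n) if 0 <= i + c < n]
    shifted_a = [x for x, _ in pairs]
    kept_b = [y for _, y in pairs]
    return sample_correlation(shifted_a, kept_b)


def best_lag(a, b, max_lag):
    """Lag c in [-max_lag, max_lag] with the highest lagged correlation."""
    return max(range(-max_lag, max_lag + 1), key=lambda c: lagged_correlation(a, b, c))


# Hypothetical data: b_i = a_{i+2} for the first eight indices.
a = [1, 4, 2, 8, 5, 7, 3, 6, 9, 0]
b = [2, 8, 5, 7, 3, 6, 9, 0, 4, 1]

c_star = best_lag(a, b, max_lag=3)
print("best lag:", c_star)
print("correlation at best lag:", lagged_correlation(a, b, c_star))
```

Only the overlapping part of the two series is used for each lag, so larger lags are estimated from fewer points; in practice the search range would be kept small relative to the series length.
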
8. How to Select a Similarity Measure
• In different practical situations, different similarity measures are appropriate.
• It is therefore important to be able to select the most appropriate similarity measure for each given situation.
• There have been several papers comparing the effectiveness of different similarity measures in clustering.
• Another important practical case is when we simply have two time series.
• In this talk, we show that natural symmetries (shifts and scalings) can help.
• We only consider the no-time-lag linear case.
• We hope that symmetries will help in the general case as well.
9. Compared Values Come from Measurements
• We want to understand the discrepancy between the commonsense meaning of similarity and correlation.
• For this, let us recall how we get the values $a_i$ and $b_i$.
• Usually, we get these values from measurements.
• Sometimes, they come from expert estimates:
  – they can also be considered as measurements
  – performed by a human being as a measuring instrument.
• To perform a measurement, we need to select a starting point and a measuring unit.
• For example, we can measure temperature in the Fahrenheit (F) scale or in the Celsius (C) scale.
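
The temperature example can be made concrete: moving from Celsius to Fahrenheit changes both the measuring unit (a factor of 1.8) and the starting point (a shift of 32). The sketch below is an illustration only; the two temperature series are hypothetical, and the helper names are not from the slides. It compares how correlation and Euclidean distance react when the same measurements are re-expressed in the other scale.

```python
import math


def celsius_to_fahrenheit(t_c):
    """Change of unit (factor 1.8) and of starting point (shift 32)."""
    return 1.8 * t_c + 32.0


def sample_correlation(a, b):
    """Sample correlation with 1/n normalization; assumes equal-length, non-constant inputs."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return cov_ab / (math.sqrt(var_a) * math.sqrt(var_b))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical temperature readings at two locations, in Celsius.
series_1_c = [10.0, 15.0, 20.0, 25.0]
series_2_c = [12.0, 14.0, 21.0, 23.0]

# The same readings expressed in Fahrenheit.
series_1_f = [celsius_to_fahrenheit(t) for t in series_1_c]
series_2_f = [celsius_to_fahrenheit(t) for t in series_2_c]

rho_c = sample_correlation(series_1_c, series_2_c)
rho_f = sample_correlation(series_1_f, series_2_f)
d_c = euclidean_distance(series_1_c, series_2_c)
d_f = euclidean_distance(series_1_f, series_2_f)

print("correlation (C, F):", rho_c, rho_f)
print("Euclidean distance (C, F):", d_c, d_f)
```

In this sketch the correlation is the same in both scales, while the Euclidean distance is multiplied by 1.8: the shift cancels in the differences, but the change of unit does not.
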