Generalized Distances Between Rankings Ravi Kumar Sergei Vassilvitskii Yahoo! Research
Evaluation How to evaluate a set of results? - Use a Metric! NDCG, MAP, ERR, ... Distances Between Rankings WWW 2010
Evaluation
How to evaluate a set of results?
- Use a Metric! NDCG, MAP, MRR, ...
How to evaluate a measure?
1. Incremental improvement
   - Show a problem with current measure
   - Propose a new measure that fixes that (and only that) problem
2. Axiomatic approach
   - Define rules for good measures to follow
   - Find one that follows the rules
Desired Properties
• Richness
  – Support element weights, position weights, etc.
• Simplicity
  – Be simple to understand
• Generalization
  – Collapse to a natural metric when no weights are present
• Satisfy Basic Properties
  – Scale free, invariant under relabeling, triangle inequality...
• Correlation with other metrics
  – Should behave similarly to other approaches
  – Allows us to select a metric best suited to the problem
Kendall’s Tau
[Figure: Rank 1 and Rank 2 (σ)]
An Inversion: a pair of elements i and j such that i > j and σ(i) < σ(j).
Kendall’s Tau: count the total number of inversions in σ.
$K(\sigma) = \sum_{i<j} \mathbf{1}_{\sigma(i) > \sigma(j)}$
Example: the figure shows two inverted pairs, so Kendall’s Tau is 2.
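The inversion count can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes 0-indexed elements and a list `sigma` where `sigma[i]` is the Rank 2 position of the element at Rank 1 position `i`. The three-element rotation used as input is an assumption chosen to match the slide's count of 2.

```python
from itertools import combinations

def kendall_tau(sigma):
    """Count pairs i < j whose relative order is reversed by sigma."""
    n = len(sigma)
    # combinations(range(n), 2) yields every pair (i, j) with i < j.
    return sum(1 for i, j in combinations(range(n), 2) if sigma[i] > sigma[j])

# Hypothetical rotation: the last element moves to the front.
sigma = [1, 2, 0]
print(kendall_tau(sigma))  # 2
```

The quadratic loop mirrors the definition directly. A merge-sort based count would run in O(n log n), which matters only for long lists.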
Spearman’s Footrule
[Figure: Rank 1 and Rank 2 (σ)]
Displacement: the distance element i moved due to σ, $|i - \sigma(i)|$.
Spearman’s Footrule: total displacement of all elements:
$F(\sigma) = \sum_i |i - \sigma(i)|$
Example: Total Displacement = 1 + 1 + 2 = 4
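As a small sketch of the footrule, under the same assumed encoding (0-indexed, `sigma[i]` is the new position of the element originally at position `i`), the total displacement is a single sum. The input rotation is an assumption chosen to reproduce the slide's displacements 1, 1, and 2.

```python
def footrule(sigma):
    """Sum of |i - sigma(i)| over all elements."""
    return sum(abs(i - s) for i, s in enumerate(sigma))

# Hypothetical rotation with displacements 1, 1, 2.
sigma = [1, 2, 0]
print(footrule(sigma))  # 4
```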
Kendall vs. Spearman
Relationship: Diaconis and Graham proved that the two measures are always within a factor of two of each other:
$K(\sigma) \le F(\sigma) \le 2K(\sigma) \quad \forall \sigma$
The rotation in the previous example, with $K(\sigma) = 2$ and $F(\sigma) = 4$, attains the upper bound, so it is a worst case.
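The inequality can be checked by brute force on small lists. The sketch below is an illustration, not a proof. It enumerates every permutation of up to six elements, and the helper name `dg_holds` is hypothetical.

```python
from itertools import combinations, permutations

def kendall_tau(sigma):
    """Number of pairs i < j with sigma[i] > sigma[j]."""
    return sum(1 for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

def footrule(sigma):
    """Total displacement sum of |i - sigma[i]|."""
    return sum(abs(i - s) for i, s in enumerate(sigma))

def dg_holds(n):
    """True if K <= F <= 2K for every permutation of n elements."""
    return all(kendall_tau(s) <= footrule(s) <= 2 * kendall_tau(s)
               for s in permutations(range(n)))

for n in range(1, 7):
    print(n, dg_holds(n))

# The three-element rotation meets the upper bound with equality.
rotation = (1, 2, 0)
print(footrule(rotation) == 2 * kendall_tau(rotation))
```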
Weighted Versions
How to incorporate weights into the metric?
- Element weights: swapping two important elements vs. two inconsequential ones
- Position weights: swapping two elements near the head vs. near the tail of the list
- Pairwise similarity weights: swapping two similar elements vs. two very different elements
Element Weights
Swap two elements of weight $w_i$ and $w_j$. How much should the inversion count in Kendall’s tau?
- Average of the weights: $\frac{w_i + w_j}{2}$?
- Geometric average of the weights: $\sqrt{w_i w_j}$?
- Harmonic average of the weights: $\frac{1}{\frac{1}{w_i} + \frac{1}{w_j}}$?
- Some other monotonic function of the weights?
Element Weights
[Figure: Rank 1 and Rank 2 (σ)]
Treat element i as a collection of $w_i$ subelements of weight 1.
The subelements remain in the same order.
Then: the total number of inversions between subelements of i and j is $w_i w_j$.
Define: $K_w(\sigma) = \sum_{i<j} w_i w_j \, \mathbf{1}_{\sigma(i) > \sigma(j)}$
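The subelement argument can be made concrete with a short sketch. It assumes integer weights and the same 0-indexed `sigma` encoding as before. The helper `expand` is a hypothetical name that replaces each element by `w[i]` unit subelements kept in their original order, so that the plain inversion count on the expanded list should equal $K_w$.

```python
from itertools import combinations

def kendall_tau(sigma):
    """Unweighted inversion count."""
    return sum(1 for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

def weighted_kendall(sigma, w):
    """Each inverted pair (i, j) contributes w[i] * w[j]."""
    return sum(w[i] * w[j] for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

def expand(sigma, w):
    """Split element i into w[i] unit subelements (integer weights assumed)."""
    n = len(sigma)
    # Walk the elements in Rank 2 order to find where each block starts.
    start, pos = {}, 0
    for i in sorted(range(n), key=lambda i: sigma[i]):
        start[i] = pos
        pos += w[i]
    # Subelements are listed in Rank 1 order and keep their internal order.
    expanded = []
    for i in range(n):
        expanded.extend(start[i] + k for k in range(w[i]))
    return expanded

sigma, w = [1, 2, 0], [2, 1, 3]   # illustrative values, not from the slides
print(weighted_kendall(sigma, w))      # 9
print(kendall_tau(expand(sigma, w)))   # 9
```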
Element Weights
Using the same intuition, how do we define the displacement and the Footrule metric?
[Figure: Rank 1 and Rank 2 (σ)]
Each of the $w_i$ subelements is displaced by: $\left| \sum_{j<i} w_j - \sum_{\sigma(j)<\sigma(i)} w_j \right|$.
Therefore the total displacement for element i is: $w_i \left| \sum_{j<i} w_j - \sum_{\sigma(j)<\sigma(i)} w_j \right|$.
Weighted Footrule Distance:
$F_w(\sigma) = \sum_i w_i \left| \sum_{j<i} w_j - \sum_{\sigma(j)<\sigma(i)} w_j \right|$
Kendall vs. Spearman
Relationship: The Diaconis-Graham (DG) inequality extends to the weighted case:
$K_w(\sigma) \le F_w(\sigma) \le 2K_w(\sigma) \quad \forall \sigma$
Rotation remains the worst-case example.
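A minimal sketch of the weighted footrule, with a brute-force check of the weighted inequality on four elements. The weights and the rotation are illustrative assumptions, and `sigma[i]` is again the 0-indexed Rank 2 position of element `i`.

```python
from itertools import combinations, permutations

def weighted_kendall(sigma, w):
    """Each inverted pair (i, j) contributes w[i] * w[j]."""
    return sum(w[i] * w[j] for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

def weighted_footrule(sigma, w):
    """Sum over i of w[i] * |weight ahead of i in Rank 1 - weight ahead in Rank 2|."""
    n = len(sigma)
    total = 0
    for i in range(n):
        ahead_rank1 = sum(w[j] for j in range(i))
        ahead_rank2 = sum(w[j] for j in range(n) if sigma[j] < sigma[i])
        total += w[i] * abs(ahead_rank1 - ahead_rank2)
    return total

sigma, w = [1, 2, 0], [2, 1, 3]   # illustrative values, not from the slides
print(weighted_kendall(sigma, w), weighted_footrule(sigma, w))  # 9 18

# Exhaustive check of K_w <= F_w <= 2 K_w for one hypothetical weight vector.
w4 = [2, 1, 3, 5]
print(all(weighted_kendall(s, w4) <= weighted_footrule(s, w4)
          <= 2 * weighted_kendall(s, w4)
          for s in permutations(range(4))))
```

For the rotation, $F_w = 18 = 2K_w$, so the upper bound is met with equality here as well.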
Position Weights
How should we differentiate inversions near the head of the list versus those at the tail of the list?
- Let $\delta_i$ be the cost of swapping the element at position i-1 with the one at position i.
- In typical applications: $\delta_2 \ge \delta_3 \ge \dots \ge \delta_n$ (DCG sets $\delta_i = \frac{1}{\log i} - \frac{1}{\log(i+1)}$).
- Let $p_i = \sum_{j=2}^{i} \delta_j$, and let $\bar{p}_i(\sigma) = \frac{p_i - p_{\sigma(i)}}{i - \sigma(i)}$ be the average cost per swap charged to element i.
We can treat the $\bar{p}_i(\sigma)$ as if they were element weights, and define:
Kendall’s Tau: $K_\delta(\sigma) = \sum_{i<j} \bar{p}_i(\sigma) \, \bar{p}_j(\sigma) \, \mathbf{1}_{\sigma(i) > \sigma(j)}$
Footrule: $F_\delta(\sigma) = \sum_i \bar{p}_i(\sigma) \left| \sum_{j<i} \bar{p}_j(\sigma) - \sum_{\sigma(j)<\sigma(i)} \bar{p}_j(\sigma) \right|$
Conclude: $K_\delta(\sigma) \le F_\delta(\sigma) \le 2K_\delta(\sigma) \quad \forall \sigma$
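The position-weighted definitions can be sketched as follows. Positions are 1-indexed here to match the slide, so `sigma[i - 1]` holds σ(i). The ratio defining the average swap cost is 0/0 when σ(i) = i. The slide does not say how to handle that case, so the sketch assumes a weight of 1.0 for fixed points through the hypothetical parameter `fixed_point_weight`. All function names are illustrative.

```python
import math
from itertools import combinations, permutations

def dcg_deltas(n):
    """delta[i] = 1/log(i) - 1/log(i+1) for i = 2..n; entries 0 and 1 are unused."""
    delta = [0.0] * (n + 1)
    for i in range(2, n + 1):
        delta[i] = 1.0 / math.log(i) - 1.0 / math.log(i + 1)
    return delta

def avg_swap_costs(sigma, delta, fixed_point_weight=1.0):
    """Average per-swap cost of each element; sigma holds 1-indexed positions."""
    n = len(sigma)
    p = [0.0] * (n + 1)               # p[i] = delta[2] + ... + delta[i], p[1] = 0
    for i in range(2, n + 1):
        p[i] = p[i - 1] + delta[i]
    costs = []
    for i in range(1, n + 1):
        s = sigma[i - 1]
        # Assumption: an element that does not move gets fixed_point_weight.
        costs.append(fixed_point_weight if s == i else (p[i] - p[s]) / (i - s))
    return costs

def position_kendall(sigma, delta):
    """Inverted pairs weighted by the product of their average swap costs."""
    c = avg_swap_costs(sigma, delta)
    return sum(c[a] * c[b] for a, b in combinations(range(len(sigma)), 2)
               if sigma[a] > sigma[b])

def position_footrule(sigma, delta):
    """Weighted footrule with the average swap costs as element weights."""
    c = avg_swap_costs(sigma, delta)
    n = len(sigma)
    total = 0.0
    for a in range(n):
        ahead_rank1 = sum(c[b] for b in range(a))
        ahead_rank2 = sum(c[b] for b in range(n) if sigma[b] < sigma[a])
        total += c[a] * abs(ahead_rank1 - ahead_rank2)
    return total

delta = dcg_deltas(5)
for sigma in list(permutations(range(1, 6)))[:5]:
    print(sigma, position_kendall(sigma, delta), position_footrule(sigma, delta))
```

With all swap costs equal to 1, every moved element gets an average cost of 1, so both functions reduce to the unweighted Kendall's tau and footrule.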
Element Similarities
- Element weights model the cost of important versus inconsequential elements.
- Position weights model the different cost of inversions near the head or tail of the list.
- How do we model the cost of swapping similar elements versus different elements?
Element similarities
[Figure: three rankings, Rank L, Rank C, and Rank R]
With identical element and position weights, is L or R better?
In the extreme case L and C are identical, even though an inversion occurred.
Modeling Similarities
For two elements i and j, let $D_{ij}$ denote the distance between them.
We assume that $D$, defined on $[n] \times [n]$, forms a metric (follows the triangle inequality).
[Figure: Rank 1 and Rank 2 (σ)]
To define Kendall’s Tau: scale each inversion by the distance between the inverted elements.
In the example, K(σ) is the sum of the distances D(·,·) over the two inverted pairs.
Generally: $K_D(\sigma) = \sum_{i<j} D_{ij} \, \mathbf{1}_{\sigma(i) > \sigma(j)}$
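A minimal sketch of the similarity-scaled Kendall's tau follows. The distance matrix is an assumption made for illustration: elements are placed at hypothetical points on a line and `D[i][j]` is the absolute difference, which satisfies the triangle inequality.

```python
from itertools import combinations

def similarity_kendall(sigma, D):
    """Each inverted pair (i, j) contributes the distance D[i][j]."""
    return sum(D[i][j] for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

# Hypothetical embedding: elements 0 and 1 are similar, element 2 is far away.
x = [0, 1, 10]
D = [[abs(a - b) for b in x] for a in x]

print(similarity_kendall([1, 0, 2], D))  # 1: swapping two similar elements
print(similarity_kendall([0, 2, 1], D))  # 9: swapping two dissimilar elements
print(similarity_kendall([1, 2, 0], D))  # 19: rotation, pairs (0,2) and (1,2)
```

Setting every off-diagonal distance to 1 recovers the plain inversion count.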
Footrule with similarities
Defining Footrule with similarities
[Figure: Rank 1 and Rank 2 (σ), with the footrule built up as a sum of distances D(·,·) + D(·,·) + ...]