On Top-k Selection from m-wise Partial Rankings via Borda Counting
Wenjing Chen (1), Ruida Zhou (1), Chao Tian (1), Cong Shen (2)
(1) Department of Electrical and Computer Engineering, Texas A&M University
(2) Electrical and Computer Engineering Department, University of Virginia
ISIT, June 2020
Contents
1. Problem Setup
2. Borda Counting Procedure
3. Main Results
4. Proof
Motivation and Relevant Results
Ranking aggregation is important, e.g. in information retrieval and recommender systems.
A well-studied case is ranking aggregation using pairwise comparisons.
Algorithms based on a parametric model (e.g. BTL models) may perform poorly when the data do not match the model.
Shah et al., 2017 applied the Borda counting procedure to ranking aggregation from pairwise comparisons in a nonparametric setting.
Our work extends the pairwise comparison problem to ranking aggregation using m-wise comparisons.
Borda Counting Procedure: An Example, n = 6, m = 3
Problem Setup
A total of n items, indexed by [n] = {1, 2, ..., n}.
The noisy partial ranking samples are collected in r rounds.
In each round ℓ ∈ [r], every subset of m items, say $\mathcal{A} \subseteq [n]$, is compared.
Each ranking result is observed with probability p.
Borda Counting Procedure
In a comparison among the set $\mathcal{A}$, if the result is observed, the item ranked i-th receives score $\beta_i$, where $1 = \beta_1 \ge \beta_2 \ge \dots \ge \beta_m \ge 0$. Otherwise, each item in this comparison receives score 0.
$X^{(\ell)}_{a,\mathcal{A}^-}$: the score item-a receives in the ℓ-th round in the comparison among the set $\mathcal{A} = a \cup \mathcal{A}^-$.
After r rounds, the total score item-a receives is
$$W_a = \sum_{\ell \in [r]} \sum_{\mathcal{A}^- \subseteq [n] \setminus \{a\}} X^{(\ell)}_{a,\mathcal{A}^-}.$$
The top-k estimate ($\tilde{S}_k$): the k items which receive the highest empirical scores.
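The procedure above can be sketched in a few lines of Python. This is only an illustrative simulation, not the authors' code: the function names, the 0-based item indexing, and in particular `noisy_ranking` (a hypothetical noise model in which a lower index means a better item, perturbed by Gaussian noise) are assumptions made here. The slides themselves place no parametric model on how rankings are generated.

```python
import itertools
import random


def noisy_ranking(subset, rng, noise=0.3):
    """Hypothetical ranking model (an assumption, not from the slides):
    items with a lower index tend to be ranked first."""
    return sorted(subset, key=lambda a: a + rng.gauss(0.0, noise))


def borda_top_k(n, m, k, r, p, beta, sample_ranking, rng):
    """Accumulate Borda scores W_a over r rounds and return the top-k set."""
    assert len(beta) == m
    scores = [0.0] * n  # W_a for items 0..n-1
    for _ in range(r):
        # In every round, each m-subset of the n items is compared.
        for subset in itertools.combinations(range(n), m):
            if rng.random() >= p:
                continue  # result not observed: every item scores 0
            ranking = sample_ranking(subset, rng)
            for position, item in enumerate(ranking):
                scores[item] += beta[position]
    order = sorted(range(n), key=lambda a: scores[a], reverse=True)
    return set(order[:k]), scores
```

For example, `borda_top_k(6, 3, 2, 30, 0.9, [1.0, 0.5, 0.0], noisy_ranking, random.Random(0))` runs the n = 6, m = 3 setting with scores (1, 0.5, 0) and returns the estimated top-2 set together with the raw totals.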
Probabilistic Model
When the items in the set $\mathcal{A} = \{v_1, v_2, \dots, v_m\}$ are compared:
The probability of the order $\vec{v} = (v_1, v_2, \dots, v_m)$ occurring is denoted $M_{v_1 v_2 \dots v_m}$ or $M_{\vec{v}}$.
$\vec{v} \doteq \mathcal{A}$: the items in the vector $\vec{v}$ are those in the set $\mathcal{A}$.
Constraints: $M_{\vec{v}} \ge 0$ and $\sum_{\vec{v} \doteq \mathcal{A}} M_{\vec{v}} = 1$, for any $\vec{v}$ with distinct elements.
A Few Notations
$R_{a,\mathcal{A}^-}(t)$: the probability that item-a ranks at the t-th position in the set $\mathcal{A} = \{a\} \cup \mathcal{A}^-$.
Then the expected score of item-a relative to $\mathcal{A}^-$ is
$$\mathbb{E}\big[X^{(\ell)}_{a,\mathcal{A}^-}\big] = p \sum_{t=1}^{m} \beta_t R_{a,\mathcal{A}^-}(t), \quad \forall a \in [n],\ \forall \mathcal{A} = a \cup \mathcal{A}^-.$$
Associated score of any item-a (the average expected score):
$$\tau_a = \frac{1}{\rho_{n,m}} \sum_{\mathcal{A}^- \subseteq [n] \setminus \{a\}} \sum_{t=1}^{m} \beta_t R_{a,\mathcal{A}^-}(t), \quad \text{where } \rho_{n,m} = \binom{n-1}{m-1}.$$
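For a small instance, the associated scores $\tau_a$ can be computed exactly by enumerating every m-subset and every ordering of it. The sketch below does this under one loud assumption: the ordering probabilities $M_{\vec{v}}$ are taken from a Plackett-Luce model with hypothetical `weights`, purely to have a concrete $M$ to plug in. The slides work in a nonparametric setting and do not assume this model. Items are indexed from 0.

```python
import itertools
from math import comb


def pl_probability(order, weights):
    """Plackett-Luce probability of an ordering (hypothetical choice of M)."""
    prob = 1.0
    remaining = sum(weights[v] for v in order)
    for v in order:
        prob *= weights[v] / remaining
        remaining -= weights[v]
    return prob


def associated_scores(n, m, beta, weights):
    """tau_a = (1 / rho) * sum over A^- and t of beta_t * R_{a,A^-}(t)."""
    rho = comb(n - 1, m - 1)  # number of (m-1)-subsets A^- for a fixed item
    tau = [0.0] * n
    for subset in itertools.combinations(range(n), m):
        for order in itertools.permutations(subset):
            prob = pl_probability(order, weights)
            # The item at position t contributes beta_t * M_order to its
            # own R(t), and hence to its tau.
            for t, item in enumerate(order):
                tau[item] += beta[t] * prob
    return [x / rho for x in tau]
```

A quick sanity check that holds for any valid $M$: each comparison hands out $\sum_t \beta_t$ in total and there are $\binom{n}{m}$ subsets, so $\sum_a \tau_a = \frac{n}{m} \sum_t \beta_t$.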
Main Results: Upper Bound
Theorem 1
For any $\alpha > 0$, the probability of choosing incorrect top-k items using the Borda counting method with probability distribution $M \in \mathcal{F}_k(\alpha)$ is upper-bounded as
$$\sup_{M \in \mathcal{F}_k(\alpha)} \mathbb{P}_M\big[\tilde{S}_k \ne S^*_k\big] \le n^{-\alpha^2/4 + 2}.$$
$\tilde{S}_k$: the Borda counting estimator of the top-k subset.
$S^*_k$: the true top-k subset, i.e. the k items with the highest associated scores.
$\Delta_k = \tau_{(k)} - \tau_{(k+1)}$ is the k-th threshold of associated scores, with $\tau_{(k)}$ denoting the k-th largest associated score.
$\mathcal{F}_k(\alpha) = \Big\{ M \in \mathcal{M} : \Delta_k \ge \alpha \sqrt{\frac{\log n}{r p \rho_{n,m}}} \Big\}$: the set of "good" ranking probability distributions.
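To get a feel for the quantities in Theorem 1, the following minimal sketch evaluates the separation threshold $\alpha \sqrt{\log n / (r p \rho_{n,m})}$ and the error bound $n^{-\alpha^2/4+2}$ numerically. The function names are hypothetical, and the logarithm is assumed to be the natural logarithm.

```python
from math import comb, log, sqrt


def separation_threshold(n, m, r, p, alpha):
    """Smallest gap Delta_k allowed in F_k(alpha); natural log assumed."""
    rho = comb(n - 1, m - 1)
    return alpha * sqrt(log(n) / (r * p * rho))


def error_bound(n, alpha):
    """Theorem 1 upper bound n^(-alpha^2/4 + 2) on the error probability."""
    return n ** (-alpha ** 2 / 4 + 2)
```

Note that the bound is only informative when $\alpha > 2\sqrt{2}$, since otherwise the exponent $-\alpha^2/4 + 2$ is non-negative and the bound is at least 1.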
Main Results: Converse Part
Theorem 2
Let n, k where $2k \le n$ be chosen. If
$$\alpha \le \bar{\alpha}(g, m, \vec{\beta}) \triangleq \frac{\sqrt{2}}{7\, g(n, m, \vec{\beta})} \sqrt{\frac{1}{h(n)\, \rho_{n,m}}}, \quad p \ge \frac{\log n}{4 r h(n)}, \quad \text{and } n \ge 7,$$
then the error probability of any estimator $\hat{S}_k$ is lower bounded as
$$\sup_{M \in \mathcal{F}_k(\alpha)} \mathbb{P}_M\big[\hat{S}_k \ne S^*_k\big] \ge \frac{1}{7},$$
where $g(n, m, \vec{\beta})$ and $h(n)$ are two constants to be specified in the proof.
Example: Special Case of Theorem 1
Special case m = 2, $(\beta_1, \beta_2) = (1, 0)$: the case in Shah et al., 2017.
The set of ranking probabilities simplifies to
$$\mathcal{F}_k(\alpha) = \Big\{ M \in \mathcal{M} : \Delta_k \ge \alpha \sqrt{\frac{\log n}{r p (n-1)}} \Big\}.$$
The result in Shah et al., 2017 uses the set $\mathcal{F}^{(1)}_k(\alpha)$:
$$\mathcal{F}^{(1)}_k(\alpha) = \Big\{ M \in \mathcal{M} : \Delta_k \ge \alpha \sqrt{\frac{\log n}{r p (n-1)}} \sqrt{\frac{n}{n-1}} \Big\}.$$
If we set $\alpha \ge 8$, then the bound becomes
$$\sup_{M \in \mathcal{F}_k(\alpha)} \mathbb{P}_M\big[\tilde{S}_k \ne S^*_k\big] \le n^{-\alpha^2/4 + 2} \le n^{-14}.$$
The bound matches the result in Shah et al., 2017. Since $\mathcal{F}^{(1)}_k(\alpha) \subseteq \mathcal{F}_k(\alpha)$, Theorem 1 here is slightly tighter than that in Shah et al., 2017.
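The inclusion $\mathcal{F}^{(1)}_k(\alpha) \subseteq \mathcal{F}_k(\alpha)$ comes down to the extra factor $\sqrt{n/(n-1)} > 1$ in the threshold of Shah et al., 2017. A small numeric sketch for m = 2 follows; the function names are hypothetical and the natural logarithm is assumed.

```python
from math import log, sqrt


def threshold_this_work(n, r, p, alpha):
    """Gap required by F_k(alpha) when m = 2, so rho_{n,2} = n - 1."""
    return alpha * sqrt(log(n) / (r * p * (n - 1)))


def threshold_shah(n, r, p, alpha):
    """Gap required by F^(1)_k(alpha): same threshold times sqrt(n/(n-1))."""
    return threshold_this_work(n, r, p, alpha) * sqrt(n / (n - 1))
```

Any $M$ whose gap clears the larger threshold also clears the smaller one, which is exactly the set inclusion. The ratio of the two thresholds tends to 1 as n grows, which is why the improvement is slight.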
Example: Special Case of Theorem 2
When m = 2, the assumptions in Theorem 2 reduce to $\alpha \le \frac{1}{7} \sqrt{\frac{n-1}{n}}$ and $p \ge \frac{\log n}{2 r n}$.
Theorem 2 matches precisely the converse part in Shah et al., 2017.
Next: details of the proofs.
Assume w.l.o.g. that the underlying ranking is consistent with the index of the items. Then $S^*_k = [k] = \{1, 2, \dots, k\}$.
Proof: Upper Bound
Outline of the proof of Theorem 1.
From the fact that the event
$$\{\tilde{S}_k \ne S^*_k\} = \{\exists\, a \in S^*_k,\ b \in [n] \setminus S^*_k \text{ such that } a \text{ is ranked after } b\},$$
we know
$$\mathbb{P}_M\big[\tilde{S}_k \ne S^*_k\big] \le \sum_{a \in [k],\ b \in [n] \setminus [k]} \mathbb{P}(W_b - W_a > 0).$$
By applying Bernstein's inequality, for any $a \in [k]$, $b \in [n] \setminus [k]$, $\mathbb{P}(W_b - W_a > 0)$ can be upper bounded by $n^{-\alpha^2/4}$. Then
$$\mathbb{P}_M\big[\tilde{S}_k \ne S^*_k\big] \le k(n-k)\, n^{-\alpha^2/4} \le n^{-\alpha^2/4 + 2}.$$
Proof: Upper Bound
Difference from the pairwise case:
$X^{(\ell)}_{a,b}$ vs $X^{(\ell)}_{a,\mathcal{A}^-}$: $X^{(\ell)}_{a,b}$ is an indicator of whether a beats b in round ℓ. In the m-wise case the outcome is a full ranking of m items, so we use the more general score $X^{(\ell)}_{a,\mathcal{A}^-}$.
$M_{ab}$ vs $R_{a,\mathcal{A}^-}(t)$: $M_{ab}$ is the probability that a beats b. In the m-wise case, $R_{a,\mathcal{A}^-}(t)$ aggregates the probabilities of all orderings that place a at position t, which simplifies the analysis.
Lemma 3 (Bernstein's inequality)
Let $Y_1, \dots, Y_n$ be independent zero-mean random variables. Suppose that $|Y_i| \le M$ almost surely, for all i. Then, for all positive t, we have that
$$\mathbb{P}\Big[\sum_{i=1}^{n} Y_i > t\Big] \le \exp\Big(-\frac{\frac{1}{2} t^2}{\sum_{i} \mathbb{E}[Y_i^2] + \frac{1}{3} M t}\Big).$$
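Lemma 3 can be checked numerically on a toy example. The sketch below evaluates the Bernstein bound and compares it with a Monte Carlo estimate of the tail for independent variables uniform on [-1, 1] (zero mean, $|Y_i| \le 1$, $\mathbb{E}[Y_i^2] = 1/3$). The uniform distribution and the function names are assumptions chosen for illustration; they are unrelated to the score variables used in the proof.

```python
import math
import random


def bernstein_bound(second_moments, bound_m, t):
    """exp(-(t^2 / 2) / (sum_i E[Y_i^2] + M * t / 3)) from Lemma 3."""
    return math.exp(-0.5 * t * t / (sum(second_moments) + bound_m * t / 3.0))


def empirical_tail(num_vars, t, trials, rng):
    """Monte Carlo estimate of P(sum_i Y_i > t) for Y_i ~ Uniform[-1, 1]."""
    hits = 0
    for _ in range(trials):
        total = sum(rng.uniform(-1.0, 1.0) for _ in range(num_vars))
        if total > t:
            hits += 1
    return hits / trials
```

With 50 variables and t = 8, `bernstein_bound([1 / 3] * 50, 1.0, 8.0)` is roughly 0.19, comfortably above the tail frequency one observes in simulation, as the lemma requires.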
Proof: Upper Bound
How do we apply Bernstein's inequality? Notice that
$$W_b - W_a = \sum_{\ell \in [r]} \Big( \sum_{\mathcal{A}^- \subseteq [n] \setminus \{b\}} X^{(\ell)}_{b,\mathcal{A}^-} - \sum_{\mathcal{A}^- \subseteq [n] \setminus \{a\}} X^{(\ell)}_{a,\mathcal{A}^-} \Big).$$
We need the following two steps.
1. First, transform the r.v.s $X^{(\ell)}_{a,\mathcal{A}^-}$ and $X^{(\ell)}_{b,\mathcal{A}^-}$ on the RHS of the equation above into independent zero-mean r.v.s.
2. Bound the first-order and second-order moments of the new r.v.s obtained in the previous step.
Proof: Upper Bound
Step 1: define the centralized score:
$$\bar{X}^{(\ell)}_{a,\mathcal{A}^-} \triangleq X^{(\ell)}_{a,\mathcal{A}^-} - \mathbb{E}\big[X^{(\ell)}_{a,\mathcal{A}^-}\big] = X^{(\ell)}_{a,\mathcal{A}^-} - p \sum_{t=1}^{m} \beta_t R_{a,\mathcal{A}^-}(t), \quad \forall a \in [n],$$
and the centralized cross-score:
$$\bar{X}^{(\ell)}_{\{a,b\},\mathcal{A}^{--}} \triangleq X^{(\ell)}_{a,\{b,\mathcal{A}^{--}\}} - X^{(\ell)}_{b,\{a,\mathcal{A}^{--}\}} - \mathbb{E}\big[X^{(\ell)}_{a,\{b,\mathcal{A}^{--}\}}\big] + \mathbb{E}\big[X^{(\ell)}_{b,\{a,\mathcal{A}^{--}\}}\big]$$
$$= X^{(\ell)}_{a,\{b,\mathcal{A}^{--}\}} - X^{(\ell)}_{b,\{a,\mathcal{A}^{--}\}} - \Big( p \sum_{t=1}^{m} \beta_t R_{a,\{b,\mathcal{A}^{--}\}}(t) - p \sum_{t=1}^{m} \beta_t R_{b,\{a,\mathcal{A}^{--}\}}(t) \Big).$$
Proof: Upper Bound
Step 2: for the centralized scores:
First-order moment bound: $|\bar{X}^{(\ell)}_{a,\mathcal{A}^-}| \le \beta_1$ and $|\bar{X}^{(\ell)}_{\{a,b\},\mathcal{A}^{--}}| \le 2\beta_1$.
Bound the variance of the centralized score as
$$\mathbb{E}\Big[\big(\bar{X}^{(\ell)}_{b,\mathcal{A}^-}\big)^2\Big] = \mathbb{E}\Big[\big(X^{(\ell)}_{b,\mathcal{A}^-}\big)^2\Big] - \Big(\mathbb{E}\big[X^{(\ell)}_{b,\mathcal{A}^-}\big]\Big)^2 \le \mathbb{E}\Big[\big(X^{(\ell)}_{b,\mathcal{A}^-}\big)^2\Big] = p \sum_{t=1}^{m} \beta_t^2 R_{b,\mathcal{A}^-}(t).$$
Bound the variance of the centralized cross-score: ...
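The moment bounds for a single centralized score can be verified by direct calculation, since the score takes the value $\beta_t$ with probability $p R(t)$ and 0 otherwise. The sketch below is a minimal illustration; the function name and the example numbers in the usage note are hypothetical.

```python
def score_moments(p, beta, rank_probs):
    """Exact moments of one score X that equals beta_t w.p. p * R(t), else 0.

    Returns (mean, second_moment, variance, max_abs_centered), where the
    last entry is the largest possible value of |X - E[X]|.
    """
    mean = p * sum(b * q for b, q in zip(beta, rank_probs))
    second = p * sum(b * b * q for b, q in zip(beta, rank_probs))
    variance = second - mean ** 2  # equals E[(X - E[X])^2]
    # X can take any beta_t (if observed) or 0 (if unobserved).
    max_abs_centered = max(abs(v - mean) for v in list(beta) + [0.0])
    return mean, second, variance, max_abs_centered
```

For instance, with p = 0.6, scores (1, 0.5, 0) and rank probabilities (0.5, 0.3, 0.2), the mean is 0.39 and the second moment is 0.345, so the variance 0.1929 is below the second moment, and the centered score never exceeds $\beta_1 = 1$ in absolute value.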