Dimensionality Reduction and (Bucket) Ranking: a Mass Transportation Approach
Mastane Achab, Anna Korba, Stephan Clémençon
DA2PL'2018, Poznan, Poland
Outline
Introduction
Dimensionality Reduction on S_n
Empirical Distortion Minimization
Numerical Experiments on a Real-world Dataset
Introduction (1/2)
◮ Permutations over n items [n] = {1, ..., n}
◮ Number of permutations explodes: #S_n = n!
◮ Distribution P on S_n: n! − 1 parameters
Introduction (2/2)
◮ Question: "How to summarize P?"
◮ Answer: dimensionality reduction
◮ Problem: no vector space structure for permutations
Outline
Introduction
Dimensionality Reduction on S_n
Empirical Distortion Minimization
Numerical Experiments on a Real-world Dataset
Preliminaries (1/2)
Bucket order C = (C_1, ..., C_K): ordered partition of [n]
◮ the C_k's: disjoint non-empty subsets of [n]
◮ ∪_{k=1}^K C_k = [n]
◮ K: "size" of C
◮ (#C_1, ..., #C_K): "shape" of C
Partial order: "i is ranked lower than j in C" if there exist k < l s.t. (i, j) ∈ C_k × C_l.
Preliminaries (2/2)
P_C: set of all bucket distributions P' associated with C
◮ P' distribution on S_n
◮ p'_{i,j} = P(Σ'(i) < Σ'(j)) for Σ' ~ P'
◮ if (i, j) ∈ C_k × C_l (k < l), then p'_{j,i} = 0
◮ P' ∈ P_C described by d_C = Π_{k ≤ K} (#C_k)! − 1 ≤ n! − 1 parameters
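As a small sanity check, the dimension d_C above can be computed from the shape of a bucket order. This is a minimal sketch, assuming d_C equals the number of permutations consistent with C minus one (i.e. the product of the bucket-size factorials, minus one); the function name `bucket_dimension` is mine, not from the talk.

```python
from math import factorial

def bucket_dimension(shape):
    """Dimension d_C of the bucket distributions supported by a
    bucket order of the given shape (#C_1, ..., #C_K), assuming
    d_C = (prod of the #C_k!) - 1."""
    d = 1
    for size in shape:
        d *= factorial(size)
    return d - 1

print(bucket_dimension([1, 2, 3]))  # 1! * 2! * 3! - 1 = 11
print(bucket_dimension([1, 1, 1]))  # size-n bucket order: d_C = 0
```

The extreme case where every bucket is a singleton gives d_C = 0, matching the consensus-ranking case discussed later in the talk.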
Background on Consensus Ranking
Consensus ranking (or "ranking aggregation"): summarize permutations σ_1, ..., σ_N by a consensus/median ranking σ* ∈ S_n by solving:
    min_{σ ∈ S_n} Σ_{s=1}^N d(σ, σ_s).
If Σ_1, ..., Σ_N are i.i.d. sampled from P (Korba et al., 2017), solve:
    min_{σ ∈ S_n} E_{Σ~P}[d(Σ, σ)].
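For small n, the consensus-ranking objective above can be minimized by brute force over S_n. A minimal sketch, assuming rankings are encoded as tuples with σ(i) = rank of item i; the helper names `kendall_tau` and `kemeny_median` are mine, and a real implementation would use a smarter method than full enumeration.

```python
from itertools import permutations

def kendall_tau(sigma, sigma_p):
    """Kendall's tau distance: number of discordant pairs."""
    n = len(sigma)
    return sum((sigma[i] - sigma[j]) * (sigma_p[i] - sigma_p[j]) < 0
               for i in range(n) for j in range(i + 1, n))

def kemeny_median(sample):
    """Brute-force consensus: argmin over S_n of the total distance."""
    n = len(sample[0])
    return min(permutations(range(1, n + 1)),
               key=lambda s: sum(kendall_tau(s, x) for x in sample))

# three toy voters over n = 3 items
sample = [(1, 2, 3), (1, 3, 2), (2, 1, 3)]
print(kemeny_median(sample))  # (1, 2, 3)
```

Enumeration costs n! evaluations, which is exactly the blow-up that motivates the dimensionality-reduction viewpoint of this talk.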
Kemeny medians
Particular choice for the metric d:
◮ Kendall's τ distance: d_τ(σ, σ') = Σ_{i<j} I{(σ(i) − σ(j))(σ'(i) − σ'(j)) < 0}.
◮ Kemeny medians are the solutions of: min_{σ ∈ S_n} E_{Σ~P}[d_τ(Σ, σ)].
Unique Kemeny median σ*_P if P is strictly stochastically transitive:
◮ p_{i,j} ≥ 1/2 and p_{j,k} ≥ 1/2 ⇒ p_{i,k} ≥ 1/2
◮ p_{i,j} ≠ 1/2 for all i < j
◮ given by the Copeland ranking: σ*_P(i) = 1 + Σ_{j≠i} I{p_{i,j} < 1/2}.
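The Copeland ranking above can be read off directly from a pairwise probability matrix. A minimal sketch; the 3x3 matrix `p` is an illustrative toy example (not from the talk), chosen to be strictly stochastically transitive.

```python
import numpy as np

def copeland_ranking(p):
    """Copeland ranking from a pairwise probability matrix.

    p[i, j] = P(Sigma(i) < Sigma(j)); assumes strict stochastic
    transitivity, so p[i, j] != 1/2 for all i != j.
    """
    n = p.shape[0]
    sigma = np.empty(n, dtype=int)
    for i in range(n):
        # rank of item i: 1 + number of items that beat it
        sigma[i] = 1 + sum(p[i, j] < 0.5 for j in range(n) if j != i)
    return sigma

# toy example: item 0 beats 1 and 2, item 1 beats 2
p = np.array([[0.5, 0.9, 0.8],
              [0.1, 0.5, 0.7],
              [0.2, 0.3, 0.5]])
print(copeland_ranking(p))  # [1 2 3]
```

Under strict stochastic transitivity this ranking is the unique Kemeny median, so no optimization over S_n is needed.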
Bucket orders of size n
Consensus ranking: extreme case of a bucket order C of size n.
◮ C = ({σ*⁻¹(1)}, ..., {σ*⁻¹(n)})
◮ P_C = {δ_{σ*}}, hence dimension d_C = 0
Problem: generalize to any bucket order.
A Mass Transportation Approach
◮ Question: "How to quantify the approximation error between the original distribution P and a bucket distribution P' ∈ P_C?"
◮ Our answer: the Wasserstein distance W_{d,q}(P, P').
Definition
    W_{d,q}(P, P') = inf_{Σ~P, Σ'~P'} E[d^q(Σ, Σ')]
◮ Why: because it generalizes consensus ranking. Indeed:
    W_{d,1}(P, δ_σ) = E_{Σ~P}[d(Σ, σ)].
◮ Focus on d = d_τ and q = 1.
Distortion measure
A bucket order C represents P well if the distortion Λ_P(C) is small.
Definition
    Λ_P(C) = min_{P' ∈ P_C} W_{d_τ,1}(P, P')
Explicit expression for Λ_P(C):
Proposition
    Λ_P(C) = Σ_{1≤k<l≤K} Σ_{(i,j) ∈ C_k × C_l} p_{j,i}.
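The explicit expression in the Proposition makes the distortion trivial to evaluate: sum, over all pairs of items placed in different buckets, the probability that they are observed in the "wrong" order. A minimal sketch, reusing a toy 3x3 pairwise matrix (not from the talk) and 0-based item indices.

```python
import numpy as np

def distortion(p, buckets):
    """Distortion Lambda_P(C): sum over bucket pairs k < l of
    p[j, i] for (i, j) in C_k x C_l, with p[i, j] = P(Sigma(i) < Sigma(j))."""
    total = 0.0
    for k in range(len(buckets)):
        for l in range(k + 1, len(buckets)):
            for i in buckets[k]:
                for j in buckets[l]:
                    total += p[j, i]
    return total

p = np.array([[0.5, 0.9, 0.8],
              [0.1, 0.5, 0.7],
              [0.2, 0.3, 0.5]])
# C = ({0}, {1, 2}): item 0 alone in the top bucket
print(distortion(p, [[0], [1, 2]]))  # p[1,0] + p[2,0] ~= 0.3
```

Only inter-bucket pairs contribute: pairs inside a common bucket cost nothing, which is exactly why coarser bucket orders achieve lower distortion at the price of a higher dimension d_C.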
Outline
Introduction
Dimensionality Reduction on S_n
Empirical Distortion Minimization
Numerical Experiments on a Real-world Dataset
Empirical setting
Training sample: Σ_1, ..., Σ_N i.i.d. from P.
◮ Empirical pairwise probabilities: p̂_{i,j} = (1/N) Σ_{s=1}^N I{Σ_s(i) < Σ_s(j)}.
◮ Empirical distortion of any bucket order C: Λ̂_N(C) = Σ_{i ≺_C j} p̂_{j,i} = Λ_{P̂_N}(C).
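The empirical pairwise probabilities are simple frequency counts over the sample. A minimal sketch, assuming each ranking is encoded as a row with Σ_s(i) = rank of item i; plugging the resulting matrix into the distortion formula gives Λ̂_N(C).

```python
import numpy as np

def pairwise_probabilities(sample):
    """Empirical p_hat[i, j]: fraction of rankings with Sigma(i) < Sigma(j)."""
    sample = np.asarray(sample)
    N, n = sample.shape
    p_hat = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                p_hat[i, j] = np.mean(sample[:, i] < sample[:, j])
    return p_hat

# three toy rankings over n = 3 items
sample = [(1, 2, 3), (1, 3, 2), (2, 1, 3)]
print(pairwise_probabilities(sample))
```

Each entry is a mean of N i.i.d. indicator variables, which is what drives the O(1/sqrt(N)) rate in the bound on the next slide.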
Rate bound
The empirical distortion minimizer Ĉ_{K,λ} is a solution of:
    min_{C ∈ C_{K,λ}} Λ̂_N(C),
where C_{K,λ} is the set of bucket orders C of size K and shape λ (i.e. #C_k = λ_k for all 1 ≤ k ≤ K).
Theorem
For all δ ∈ (0, 1), we have with probability at least 1 − δ:
    Λ_P(Ĉ_{K,λ}) − inf_{C ∈ C_{K,λ}} Λ_P(C) ≤ β(n, λ) × √(log(1/δ) / N).
The Strong Stochastic Transitive Case
Assume that P is strongly (and strictly) stochastically transitive, i.e.:
    p_{i,j} ≥ 1/2 and p_{j,k} ≥ 1/2 ⇒ p_{i,k} ≥ max(p_{i,j}, p_{j,k}).
Theorem
(i) Λ_P(C) has a unique minimizer over C_{K,λ}, denoted C*_{K,λ}.
(ii) C*_{K,λ} is the unique bucket order in C_{K,λ} agreeing with the Kemeny median.
Consequence: agglomerative algorithm.

Outline
Introduction
Dimensionality Reduction on S_n
Empirical Distortion Minimization
Numerical Experiments on a Real-world Dataset
Experiments
Sushi dataset (Kamishima, 2003):
◮ n = 10 sushi dishes
◮ N = 5000 full rankings.
[Figure: distortion vs. dimension on the sushi dataset, one curve per bucket-order size K from 3 to 10.]
Thank you!