An Impossibility Theorem Seminar Algorithms Kevin Chang Eindhoven University of Technology June 19, 2018
Contents Motivation Definitions The Impossibility Theorem Centroid-Based Clustering and Consistency Relaxing the Properties
Motivation
Motivation
◮ The Mapper algorithm needs a ‘good’ clustering algorithm.
Motivation
Clustering:
◮ ‘Clustering’ cannot be precisely defined.
◮ Intuitively, group a set of objects that are ‘similar’.
◮ Unsupervised.
Motivation
◮ There is no universally good clustering algorithm.
◮ Every clustering algorithm assumes a certain model.
◮ e.g. k-means tends to generate hyperspherical clusters.
Motivation
Example (figures not reproduced): input points, the k-means result, and the single-link result, shown for two data sets.
Motivation
The idea that no universal clustering algorithm exists is partially captured by the impossibility theorem:
◮ No single clustering algorithm simultaneously satisfies a set of basic, intuitive axioms of data clustering.
Definitions
◮ S is a set of n points.
◮ A distance function is any function d : S × S → R such that, for all i, j ∈ S:
  ◮ d(i, j) ≥ 0;
  ◮ d(i, j) = 0 iff i = j;
  ◮ d(i, j) = d(j, i).
◮ A clustering function is any function f that takes a distance function d on S and returns a partition Γ of S.
  ◮ Points are not assumed to belong to any ambient space.
  ◮ The sets in Γ are called its clusters.
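The three conditions above can be checked mechanically on a finite set. The following is a minimal sketch; the names `is_distance_function`, `table`, and `d` are hypothetical and only serve the illustration. The example distances are chosen so that the triangle inequality fails, which the conditions as listed do not forbid.

```python
from itertools import product

def is_distance_function(points, d):
    """Check non-negativity, d(i, j) = 0 iff i = j, and symmetry on a finite set."""
    for i, j in product(points, repeat=2):
        if d(i, j) < 0:
            return False
        if (d(i, j) == 0) != (i == j):
            return False
        if d(i, j) != d(j, i):
            return False
    return True

# Illustrative data (an assumption, not taken from the slides).
points = ["a", "b", "c"]
table = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 5.0}

def d(i, j):
    """Symmetric lookup into `table`, with zero on the diagonal."""
    if i == j:
        return 0.0
    key = (i, j) if (i, j) in table else (j, i)
    return table[key]

print(is_distance_function(points, d))  # d(a, c) > d(a, b) + d(b, c) is allowed
```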
Definitions
Example (figures not reproduced): a set of points; a distance function on it; a partition of S consisting of 3 clusters.
Scale-Invariance
Axiom 1: Scale-Invariance
For any distance function d and any α > 0, we have f(d) = f(α · d).
◮ i.e. clustering functions should not have a built-in ‘length scale’.
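To see what a built-in length scale looks like, here is a small sketch of a clustering rule with a fixed threshold, which is not scale-invariant. The helper names `components` and `scaled`, the threshold 2.0, and the three example points are assumptions made for the illustration.

```python
def components(points, d, r):
    """Connected components of the graph whose edges are the pairs with d(i, j) <= r."""
    parent = {p: p for p in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in points:
        for j in points:
            if i != j and d(i, j) <= r:
                parent[find(i)] = find(j)

    groups = {}
    for p in points:
        groups.setdefault(find(p), set()).add(p)
    return frozenset(frozenset(g) for g in groups.values())

def scaled(d, alpha):
    """The distance function alpha * d."""
    return lambda i, j: alpha * d(i, j)

points = [0.0, 1.0, 5.0]            # points on a line, used as their own labels
d = lambda i, j: abs(i - j)
f = lambda dist: components(points, dist, r=2.0)  # fixed length scale r = 2

print(f(d))                 # 0.0 and 1.0 are joined, 5.0 is alone
print(f(scaled(d, 3.0)))    # after scaling by 3, every point is alone
```

Since f(d) and f(3 · d) differ, this particular f violates Axiom 1.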
Richness
Let Range(f) denote the set of all partitions Γ such that f(d) = Γ for some distance function d.
Axiom 2: Richness
Range(f) is equal to the set of all partitions of S.
◮ i.e. every partition of S is a possible output.
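Richness quantifies over all partitions of S, and that set grows quickly with n. As a rough sketch, the generator below (the name `all_partitions` is hypothetical) enumerates every partition of a small set, so the size of the target range can be inspected directly.

```python
def all_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in all_partitions(rest):
        # Place `first` into each existing block in turn ...
        for k in range(len(partial)):
            yield partial[:k] + [[first] + partial[k]] + partial[k + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partial

for n in range(1, 6):
    print(n, sum(1 for _ in all_partitions(range(n))))
```

A clustering function satisfying Richness on n points would need every one of these partitions in its range.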
Consistency
◮ Let Γ be a partition of S, and let d and d′ be two distance functions on S.
◮ d′ is a Γ-transformation of d if
  1. for all i, j ∈ S belonging to the same cluster of Γ, we have d′(i, j) ≤ d(i, j);
  2. for all i, j ∈ S belonging to different clusters of Γ, we have d′(i, j) ≥ d(i, j).
Axiom 3: Consistency
Let d and d′ be two distance functions. If f(d) = Γ and d′ is a Γ-transformation of d, then f(d′) = Γ.
◮ i.e. the clustering stays the same when distances within clusters shrink and distances between clusters grow.
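The definition of a Γ-transformation translates directly into a pairwise check. This is a sketch under the assumption that Γ is given as a list of sets and distance functions as Python callables; `same_cluster`, `is_gamma_transformation`, and the example data are illustrative names and values only.

```python
def same_cluster(gamma, i, j):
    """True if i and j lie in a common cluster of the partition gamma."""
    return any(i in c and j in c for c in gamma)

def is_gamma_transformation(points, gamma, d, d_new):
    """Check conditions 1 and 2: within-cluster distances do not grow,
    between-cluster distances do not shrink."""
    for i in points:
        for j in points:
            if i == j:
                continue
            if same_cluster(gamma, i, j):
                if d_new(i, j) > d(i, j):
                    return False
            elif d_new(i, j) < d(i, j):
                return False
    return True

points = [0, 1, 10, 11]
gamma = [{0, 1}, {10, 11}]
d = lambda i, j: abs(i - j)

def d_new(i, j):
    """Halve distances inside clusters, double distances across clusters."""
    factor = 0.5 if same_cluster(gamma, i, j) else 2.0
    return factor * abs(i - j)

print(is_gamma_transformation(points, gamma, d, d_new))  # d_new is a Γ-transformation of d
print(is_gamma_transformation(points, gamma, d_new, d))  # the reverse direction is not
```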
The Impossibility Theorem Theorem 2.1 For each n ≥ 2, there is no clustering function f that satisfies Scale-Invariance, Richness, and Consistency.
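Single-linkage
◮ Single-linkage is a family of clustering functions.
◮ Initialize each point as its own cluster.
◮ Repeatedly merge the pair of clusters at minimum distance from one another (the distance between two clusters being the smallest distance between a point of one and a point of the other), until a stopping condition is reached.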
Single-linkage Example:
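The merging procedure can be sketched as processing point pairs in order of increasing distance and merging the clusters they connect. The function below is one possible rendering, not the presenter's code: `single_linkage` and the callback `should_stop(n_clusters, next_weight)` are hypothetical names, and ties between equal distances are broken by input order, which is an assumption.

```python
from itertools import combinations

def single_linkage(points, d, should_stop):
    """Merge clusters along the cheapest remaining pair until should_stop says so.

    should_stop(n_clusters, next_weight) is consulted before each pair is used.
    """
    cluster_of = {p: frozenset([p]) for p in points}
    # sorted() is stable, so equal distances keep their input order.
    pairs = sorted(combinations(points, 2), key=lambda e: d(*e))
    for i, j in pairs:
        n_clusters = len(set(cluster_of.values()))
        if should_stop(n_clusters, d(i, j)):
            break
        if cluster_of[i] != cluster_of[j]:
            merged = cluster_of[i] | cluster_of[j]
            for p in merged:
                cluster_of[p] = merged
    return frozenset(cluster_of.values())

points = [0, 1, 2, 10, 11]
d = lambda i, j: abs(i - j)
# Illustrative stopping condition: stop once two clusters remain.
result = single_linkage(points, d, lambda n_clusters, w: n_clusters <= 2)
print(result)
```

Passing the stopping rule as a callback keeps the merge loop identical across the variants discussed on the following slides.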
Examples of Impossibility
◮ k-cluster stopping condition: stop adding edges when there are k connected components.
◮ For any k ≥ 1 and any n ≥ k, single-linkage with this stopping condition satisfies Scale-Invariance and Consistency.
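A small sketch of the k-cluster rule makes the trade-off visible: scaling all distances leaves the output unchanged, but every output has exactly k clusters, so for n ≥ 2 Richness cannot hold. The function name `k_cluster_single_linkage` and the example points are assumptions for the illustration; ties are broken by input order.

```python
from itertools import combinations

def k_cluster_single_linkage(points, d, k):
    """Single-linkage that stops as soon as k clusters remain."""
    cluster_of = {p: frozenset([p]) for p in points}
    n_clusters = len(points)
    for i, j in sorted(combinations(points, 2), key=lambda e: d(*e)):
        if n_clusters <= k:
            break
        if cluster_of[i] != cluster_of[j]:
            merged = cluster_of[i] | cluster_of[j]
            for p in merged:
                cluster_of[p] = merged
            n_clusters -= 1
    return frozenset(cluster_of.values())

points = [0, 1, 2, 10, 11]
d = lambda i, j: abs(i - j)
d_scaled = lambda i, j: 7.0 * abs(i - j)   # alpha = 7

two = k_cluster_single_linkage(points, d, k=2)
print(two == k_cluster_single_linkage(points, d_scaled, k=2))  # scaling changes nothing
print(len(two), len(k_cluster_single_linkage(points, d, k=3)))  # always exactly k clusters
```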
Examples of Impossibility
◮ distance-r stopping condition: only add edges of weight at most r.
◮ For any r > 0 and any n ≥ 2, single-linkage with this stopping condition satisfies Richness and Consistency.
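With the distance-r rule, the output is the set of connected components of the graph whose edges have weight at most r. The sketch below illustrates Richness by constructing, for a target partition, a distance function that produces it (within-cluster distance r/2, between-cluster distance 2r; this construction is one simple choice, not taken from the slides). The names `distance_r_single_linkage` and `distance_realizing` are hypothetical.

```python
def distance_r_single_linkage(points, d, r):
    """Single-linkage that only uses pairs at distance <= r."""
    cluster_of = {p: frozenset([p]) for p in points}
    for i in points:
        for j in points:
            if i != j and d(i, j) <= r and cluster_of[i] != cluster_of[j]:
                merged = cluster_of[i] | cluster_of[j]
                for p in merged:
                    cluster_of[p] = merged
    return frozenset(cluster_of.values())

def distance_realizing(gamma, r):
    """A distance function on which the distance-r rule outputs gamma."""
    def d(i, j):
        if i == j:
            return 0.0
        same = any(i in c and j in c for c in gamma)
        return r / 2 if same else 2 * r
    return d

points = [0, 1, 2]
r = 1.0
target = frozenset({frozenset({0, 2}), frozenset({1})})
d = distance_realizing(target, r)

print(distance_r_single_linkage(points, d, r) == target)
# Scaling by 4 pushes every distance above r, so the output changes:
print(distance_r_single_linkage(points, lambda i, j: 4 * d(i, j), r))
```

The second call shows why this rule gives up Scale-Invariance: the threshold r is a fixed length scale.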
Examples of Impossibility
◮ scale-α stopping condition: let p* denote the maximum pairwise distance; add only edges of weight at most α · p*.
◮ For any positive α < 1 and any n ≥ 3, single-linkage with this stopping condition satisfies Scale-Invariance and Richness.
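Because the threshold α · p* moves with the data, this rule is scale-invariant, but stretching one between-cluster distance raises p* and can pull other clusters together, which breaks Consistency. The sketch below shows one such case; `scale_alpha_single_linkage`, `from_table`, and the three-point distances are assumptions chosen for the illustration.

```python
from itertools import combinations

def scale_alpha_single_linkage(points, d, alpha):
    """Single-linkage that only uses pairs at distance <= alpha * (max pairwise distance)."""
    threshold = alpha * max(d(i, j) for i, j in combinations(points, 2))
    cluster_of = {p: frozenset([p]) for p in points}
    for i, j in combinations(points, 2):
        if d(i, j) <= threshold and cluster_of[i] != cluster_of[j]:
            merged = cluster_of[i] | cluster_of[j]
            for p in merged:
                cluster_of[p] = merged
    return frozenset(cluster_of.values())

def from_table(table):
    """Build a symmetric distance function from a dict of unordered pairs."""
    def d(i, j):
        if i == j:
            return 0.0
        return table[(i, j)] if (i, j) in table else table[(j, i)]
    return d

points = ["a", "b", "c"]
d = from_table({("a", "b"): 1.0, ("a", "c"): 10.0, ("b", "c"): 9.0})
# Gamma-transformation of d for Gamma = {{a, b}, {c}}: only d(a, c) is enlarged.
d_stretched = from_table({("a", "b"): 1.0, ("a", "c"): 100.0, ("b", "c"): 9.0})

gamma = scale_alpha_single_linkage(points, d, alpha=0.5)
print(gamma)                                                         # a and b together, c alone
print(scale_alpha_single_linkage(points, lambda i, j: 3 * d(i, j), 0.5) == gamma)  # scale-invariant
print(scale_alpha_single_linkage(points, d_stretched, alpha=0.5))    # now one cluster
```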
The Impossibility Theorem Proof Intuition
The Impossibility Theorem Proof
First some notions.
◮ A partition Γ′ is a refinement of a partition Γ if for every set C′ ∈ Γ′ there is a set C ∈ Γ such that C′ ⊆ C.
◮ A collection of partitions is an antichain if it does not contain two distinct partitions such that one is a refinement of the other.
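Both notions are easy to test on small examples. In this sketch partitions are frozensets of frozensets; the names `is_refinement`, `is_antichain`, and the helper `P` are hypothetical.

```python
def is_refinement(fine, coarse):
    """True if every block of `fine` is contained in some block of `coarse`."""
    return all(any(block <= c for c in coarse) for block in fine)

def is_antichain(partitions):
    """True if no two distinct partitions are related by refinement."""
    parts = list(partitions)
    return not any(a != b and is_refinement(a, b) for a in parts for b in parts)

def P(*blocks):
    """Convenience constructor: P([1, 2], [3]) is the partition {{1, 2}, {3}}."""
    return frozenset(frozenset(b) for b in blocks)

singletons = P([1], [2], [3])
one_cluster = P([1, 2, 3])

print(is_refinement(singletons, one_cluster))          # singletons refine everything
print(is_antichain([P([1, 2], [3]), P([1], [2, 3])]))  # neither refines the other
print(is_antichain([singletons, one_cluster]))         # not an antichain
```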
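The Impossibility Theorem Proof
The impossibility result follows from:
Theorem 3.1 If a clustering function f satisfies Scale-Invariance and Consistency, then Range(f) is an antichain.
◮ For n ≥ 2 the set of all partitions of S is not an antichain (the all-singletons partition is a refinement of the one-cluster partition), so such an f cannot also satisfy Richness.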
The Impossibility Theorem Proof
Some more notions needed to prove Theorem 3.1:
◮ For a partition Γ, a distance function d (a, b)-conforms to Γ if
  ◮ for all pairs of points i, j that belong to the same cluster of Γ, we have d(i, j) ≤ a;
  ◮ for all pairs of points i, j that belong to different clusters of Γ, we have d(i, j) ≥ b.
Example (figure not reproduced): a partition Γ with within-cluster distances at most 3 and between-cluster distances at least 5, so d (3, 5)-conforms to Γ.
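The (a, b)-conformance condition can be checked pair by pair. The sketch below uses four points on a line as made-up data (the figure's actual points are not recoverable); `conforms` is a hypothetical helper name.

```python
from itertools import combinations

def conforms(points, gamma, d, a, b):
    """True if d (a, b)-conforms to gamma: within-cluster distances <= a,
    between-cluster distances >= b."""
    for i, j in combinations(points, 2):
        same = any(i in c and j in c for c in gamma)
        if same and d(i, j) > a:
            return False
        if not same and d(i, j) < b:
            return False
    return True

# Illustrative data: positions on a line, two clusters.
points = [0, 3, 8, 10]
gamma = [{0, 3}, {8, 10}]
d = lambda i, j: abs(i - j)

print(conforms(points, gamma, d, a=3, b=5))  # within: 3 and 2; between: at least 5
print(conforms(points, gamma, d, a=2, b=5))  # fails, since d(0, 3) = 3 > 2
```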