
Guarantees for Spectral Clustering with Fairness Constraints

Guarantees for Spectral Clustering with Fairness Constraints. Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi & Jamie Morgenstern. Spectral Clustering (SC) and Fairness: SC is the method of choice for clustering the nodes of a graph.


  1. Guarantees for Spectral Clustering with Fairness Constraints. Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi & Jamie Morgenstern

  2. Spectral Clustering (SC) and Fairness. SC is the method of choice for clustering the nodes of a graph. Friendship network: SC can result in a highly unfair clustering with respect to the two demographic groups. Fair clustering (Chierichetti et al. 2017): in every cluster, each group $V_s$ should be represented with (approximately) the same fraction as in the whole data set $V$. Goal: study spectral clustering with fairness constraints.
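The fairness notion on this slide can be checked directly from cluster labels and group labels. The following is a minimal sketch, not code from the presentation; the function names `group_fractions` and `is_fair`, the tolerance parameter `tol`, and the toy data are assumptions made for illustration.

```python
from collections import Counter

def group_fractions(labels, groups):
    """Return {cluster: {group: fraction of that cluster's members in group}}."""
    by_cluster = {}
    for c, g in zip(labels, groups):
        by_cluster.setdefault(c, Counter())[g] += 1
    return {
        c: {g: n / sum(cnt.values()) for g, n in cnt.items()}
        for c, cnt in by_cluster.items()
    }

def is_fair(labels, groups, tol=0.0):
    """True if each group's share in every cluster is within tol of its overall share."""
    n = len(groups)
    overall = {g: m / n for g, m in Counter(groups).items()}
    per_cluster = group_fractions(labels, groups)
    return all(
        abs(fracs.get(g, 0.0) - overall[g]) <= tol
        for fracs in per_cluster.values()
        for g in overall
    )

# Hypothetical toy data: eight nodes, two demographic groups "a" and "b".
groups = ["a", "a", "b", "b", "a", "a", "b", "b"]
fair_labels = [0, 1, 0, 1, 0, 1, 0, 1]    # each cluster holds 2 of "a" and 2 of "b"
unfair_labels = [0, 0, 1, 1, 0, 0, 1, 1]  # clusters coincide with the groups
```

With `tol=0.0` the check demands exact proportionality; a positive `tol` corresponds to the "(approximately)" in the slide's definition.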

  3. Spectral Clustering. Goal: partition $V$ into $k$ clusters with minimum RatioCut objective value.
     ⋄ Encode a clustering $V = C_1 \,\dot\cup\, \ldots \,\dot\cup\, C_k$ by $H \in \mathbb{R}^{n \times k}$ with
       $$H_{il} = \begin{cases} 1/\sqrt{|C_l|}, & i \in C_l \\ 0, & i \notin C_l \end{cases} \qquad (1)$$
       Then $\mathrm{RatioCut}(C_1, \ldots, C_k) = \mathrm{Tr}(H^T L H)$, where $L$ is the graph Laplacian matrix.
     ⋄ The exact problem: $\min_{H \in \mathbb{R}^{n \times k}} \mathrm{Tr}(H^T L H)$ subject to $H$ being of form (1).
     ⋄ Solve the relaxed version: $\min_{H \in \mathbb{R}^{n \times k}} \mathrm{Tr}(H^T L H)$ subject to $H^T H = I_k$.
     ⋄ Apply $k$-means clustering to the rows of $H$.
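The steps on this slide can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the unnormalized Laplacian $L = D - W$ for a symmetric weight matrix $W$, uses a small hand-written Lloyd's k-means with a deterministic farthest-point initialization in place of a library routine, and all function names and the toy graph are hypothetical.

```python
import numpy as np

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def encode(labels, k):
    """Indicator matrix H of form (1): H[i, l] = 1/sqrt(|C_l|) if node i is in C_l."""
    H = np.zeros((len(labels), k))
    for l in range(k):
        members = labels == l
        H[members, l] = 1.0 / np.sqrt(members.sum())
    return H

def ratio_cut(W, labels, k):
    """RatioCut as the sum over clusters of cut(C_l, rest) / |C_l|."""
    total = 0.0
    for l in range(k):
        inside = labels == l
        total += W[np.ix_(inside, ~inside)].sum() / inside.sum()
    return total

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm with farthest-point initialization (deterministic)."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for l in range(k):
            if np.any(labels == l):
                centers[l] = X[labels == l].mean(axis=0)
    return labels

def spectral_clustering(W, k):
    """Relaxed problem: eigenvectors of L for the k smallest eigenvalues, then k-means."""
    _, vecs = np.linalg.eigh(laplacian(W))  # eigh returns eigenvalues in ascending order
    return kmeans(vecs[:, :k], k)

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by the single edge 2-3.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

labels = spectral_clustering(W, k=2)
H = encode(labels, k=2)
```

On this toy graph the two triangles are recovered, and `np.trace(H.T @ laplacian(W) @ H)` agrees with `ratio_cut(W, labels, 2)`, which illustrates the identity $\mathrm{RatioCut} = \mathrm{Tr}(H^T L H)$ for $H$ of form (1).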

  4. Spectral Clustering with Fairness Constraints. Approach: incorporate fairness as a linear constraint:
     $$\min_{H \in \mathbb{R}^{n \times k}} \mathrm{Tr}(H^T L H) \quad \text{subject to} \quad H^T H = I_k \ \text{and} \ F^T H = 0.$$
     Convert the program to the standard form and solve. Our approach is analogous to existing versions of constrained SC that try to incorporate must-link constraints (e.g. Yu and Shi '04). Friendship network: our algorithm finds a fair clustering with respect to the two demographic groups.
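One way to convert the constrained program to standard form is to substitute $H = ZY$, where the columns of $Z$ form an orthonormal basis of the null space of $F^T$; the problem then becomes $\min_Y \mathrm{Tr}(Y^T Z^T L Z Y)$ subject to $Y^T Y = I_k$, an ordinary eigenvalue problem. The sketch below follows that route. The slide does not define $F$, so the construction used here is an assumption: column $s$ is the indicator vector of group $V_s$ minus $|V_s|/n$, for all groups but one. Under that assumption, for $H$ of form (1), $F^T H = 0$ holds exactly when $|V_s \cap C_l| / |C_l| = |V_s| / n$ for every cluster. Function names and the toy data are hypothetical.

```python
import numpy as np

def fairness_matrix(groups):
    """ASSUMED construction of F (n x (h-1)): column s is the indicator of
    group s minus that group's overall fraction |V_s|/n (last group omitted)."""
    groups = np.asarray(groups)
    ids = np.unique(groups)
    cols = [(groups == g).astype(float) - np.mean(groups == g) for g in ids[:-1]]
    return np.column_stack(cols)

def null_space_basis(F):
    """Orthonormal basis Z of {x : F^T x = 0}, taken from the full SVD of F."""
    U, s, _ = np.linalg.svd(F, full_matrices=True)
    rank = int((s > 1e-10).sum())
    return U[:, rank:]

def fair_spectral_embedding(W, groups, k):
    """Solve min Tr(H^T L H) s.t. H^T H = I_k and F^T H = 0 via H = Z Y."""
    L = np.diag(W.sum(axis=1)) - W           # unnormalized Laplacian (assumption)
    Z = null_space_basis(fairness_matrix(groups))
    _, vecs = np.linalg.eigh(Z.T @ L @ Z)    # eigenvalues in ascending order
    Y = vecs[:, :k]
    return Z @ Y

# Hypothetical toy graph: two triangles joined by the edge 2-3, groups alternating.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
groups = [0, 1, 0, 1, 0, 1]

F = fairness_matrix(groups)
H = fair_spectral_embedding(W, groups, k=2)
```

The returned `H` satisfies both constraints up to floating-point error; applying k-means to its rows would then yield cluster labels, as in the unconstrained method.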

  5. Analysis on Variant of Stochastic Block Model. Given $V$ with a fair ground-truth clustering, e.g., $V = C_1 \,\dot\cup\, C_2$, let
     $$\Pr(i, j) = \begin{cases} a, & i \text{ and } j \text{ in same group and in same cluster} \\ b, & i \text{ and } j \text{ in same group, but in different clusters} \\ c, & i \text{ and } j \text{ in different groups, but in same cluster} \\ d, & i \text{ and } j \text{ in different groups, and in different clusters} \end{cases}$$
     for some $a > b > c > d$. [Figure: $V$ divided into groups $V_1$, $V_2$ and clusters $C_1$, $C_2$.]
     Theorem (informal): Fair SC recovers the ground-truth clustering $C_1 \,\dot\cup\, C_2$ with high probability. Standard SC is likely to return $V_1 \,\dot\cup\, V_2$.
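A graph from this model can be sampled as follows. This is a minimal sketch that assumes, as in a standard stochastic block model, that each edge is drawn independently with probability $\Pr(i, j)$; the function name `sample_fair_sbm`, the `seed` parameter, and the example values of $a, b, c, d$ are assumptions made for illustration.

```python
import numpy as np

def sample_fair_sbm(clusters, groups, a, b, c, d, seed=0):
    """Sample a symmetric 0/1 adjacency matrix A with independent edges.

    Pr(i, j) is a, b, c or d depending on whether i and j share a group
    and/or a cluster, following the four cases on the slide.
    """
    clusters = np.asarray(clusters)
    groups = np.asarray(groups)
    same_cluster = clusters[:, None] == clusters[None, :]
    same_group = groups[:, None] == groups[None, :]
    P = np.where(
        same_group,
        np.where(same_cluster, a, b),   # same group: a if same cluster, else b
        np.where(same_cluster, c, d),   # different groups: c if same cluster, else d
    )
    rng = np.random.default_rng(seed)
    # Draw each pair once (strict upper triangle), then mirror; no self-loops.
    upper = np.triu((rng.random(P.shape) < P).astype(float), k=1)
    A = upper + upper.T
    return A, P

# Hypothetical instance: 8 nodes, fair ground truth (each cluster has 2 nodes per group).
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
groups = [0, 0, 1, 1, 0, 0, 1, 1]
A, P = sample_fair_sbm(clusters, groups, a=0.8, b=0.6, c=0.4, d=0.2)
```

Because $b > c$, two nodes in the same group but different clusters are more likely to be linked than two nodes in the same cluster but different groups, which is consistent with the slide's remark that standard SC tends to return the groups rather than the fair clusters.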

  6. Experiments on Real Networks: FriendshipNet, FacebookNet, DrugNet. [Figure: three panels, FriendshipNet (gender), FacebookNet (gender) and DrugNet (ethnicity), each plotting Balance and RatioCut against k. Legend: Balance of data set, Normalized SC, Standard SC, Algorithm 1, Alg. 3.] Caption: average balance of clusters and RatioCut value as a function of the number of clusters.

  7. Thank you! Poster #195
