Fairness Constraints for Graph Embeddings*

William L. Hamilton: Assistant Professor at McGill University and Mila; Canada CIFAR Chair in AI; Visiting Researcher at Facebook AI Research

*Joint work with my PhD student Joey Bose, to appear in ICML 2019


  1. Fairness Constraints for Graph Embeddings*
     William L. Hamilton, Assistant Professor at McGill University and Mila, Canada CIFAR Chair in AI, Visiting Researcher at Facebook AI Research
     *Joint work with my PhD student Joey Bose, to appear in ICML 2019 (pdf)
     William L. Hamilton, McGill University and Mila

  2. Graph embeddings

  3. Application: Node classification
     [diagram: a machine-learning model fills in unknown (?) node labels]

  4. Application: Link prediction
     [diagram: a machine-learning model predicts whether missing (?) edges exist]

  5. Becoming ubiquitous in social applications
     § Graph embedding techniques are a powerful approach for social recommendations, bot detection, content screening, behavior prediction, and geo-localization.
     § E.g., Facebook, Huawei, Uber Eats, Pinterest, LinkedIn, WeChat.
     § Classic collaborative filtering approaches can be re-interpreted in a more general graph embedding framework.

  6. But what about fairness and privacy?
     § Graph embeddings are designed to capture everything that might ever be useful for the objective.
     § Even if we don't provide the model information about sensitive attributes (e.g., gender or age), the model will use this information.
     § What if a user doesn't want this information used?

  7. Fairness from a pragmatic perspective
     § Strict privacy and discrimination concerns are one motivation.
     § But what if users just don't want their recommendations to depend on certain attributes?
     § What if users want the system to "ignore" parts of their demographics or past behavior?

  8. Fairness in graph embeddings
     § Basic idea: How can we learn node embeddings that are invariant to particular sensitive attributes?
     § Challenges:
       § Graph data is not i.i.d.
       § There is not just one classification task that we are trying to enforce fairness on.
       § There are often many possible sensitive attributes.

  9. Our work: Fairness in graph embeddings

  10. Preliminaries and set-up
     § Learning an encoder function to map nodes to embeddings: $u \mapsto z_u$
     § Using these embeddings to "score" the likelihood of a relationship between nodes: $s(e)$

  11. Preliminaries and set-up
     § Learning an encoder function to map nodes to embeddings: $u \mapsto z_u$
     § Using these embeddings to "score" the likelihood of a relationship between nodes: $s(e) = s(\langle z_u, r, z_v \rangle)$
     § The score of a (possible) edge is a function of the two node embeddings and the relation type.

  12. Preliminaries and set-up
     § Learning an encoder function to map nodes to embeddings: $u \mapsto z_u$
     § Using these embeddings to "score" the likelihood of a relationship between nodes: $s(e) = s(\langle z_u, r, z_v \rangle)$
     § Goal: Train the embeddings (with a subset of the true edges) so that the score for all real edges is larger than all non-edges.
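As a concrete illustration of this set-up, here is a minimal NumPy sketch with an embedding-lookup encoder and a dot-product scorer (all names here are mine, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 100, 16

# Encoder: here just a trainable embedding lookup table mapping u -> z_u.
embeddings = rng.normal(size=(num_nodes, dim))

def enc(u):
    """Map a node ID to its embedding vector z_u."""
    return embeddings[u]

def score(u, v):
    """Score the likelihood of an edge (u, v) from the two node embeddings."""
    return enc(u) @ enc(v)

# A higher score means the model considers the edge (u, v) more likely.
s_uv = score(3, 7)
```

Training would then adjust `embeddings` so that real edges outscore non-edges, as the slide states.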

  13. Preliminaries and set-up
     § Generic loss function:
       $\sum_{e \in E_{\text{train}}} L_{\text{edge}}(s(e), s(e^-_1), \ldots, s(e^-_m))$
     § The sum runs over a (batch sample of) training edges; $s(e)$ is the score assigned to the positive/real edge; $s(e^-_1), \ldots, s(e^-_m)$ are the scores assigned to random negative sample edges; $L_{\text{edge}}$ is a task-specific loss function.

  14. Preliminaries and set-up: Concrete examples
     § Score functions:
     § Loss-functions:

  15. Preliminaries and set-up: Concrete examples
     § Score functions:
       § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
     § Loss-functions:

  16. Preliminaries and set-up: Concrete examples
     § Score functions:
       § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
       § TransE: $s(e) = s(\langle z_u, r, z_v \rangle) = -\| z_u + r - z_v \|_2^2$
     § Loss-functions:

  17. Preliminaries and set-up: Concrete examples
     § Score functions:
       § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
       § TransE: $s(e) = s(\langle z_u, r, z_v \rangle) = -\| z_u + r - z_v \|_2^2$
     § Loss-functions:
       § Max-margin: $L_{\text{edge}}(s(e), s(e^-_1), \ldots, s(e^-_m)) = \sum_{i=1}^m \max(1 - s(e) + s(e^-_i), 0)$

  18. Preliminaries and set-up: Concrete examples
     § Score functions:
       § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
       § TransE: $s(e) = s(\langle z_u, r, z_v \rangle) = -\| z_u + r - z_v \|_2^2$
     § Loss-functions:
       § Max-margin: $L_{\text{edge}}(s(e), s(e^-_1), \ldots, s(e^-_m)) = \sum_{i=1}^m \max(1 - s(e) + s(e^-_i), 0)$
       § Cross-entropy: $L_{\text{edge}}(s(e), s(e^-_1), \ldots, s(e^-_m)) = -\log(\sigma(s(e))) - \sum_{i=1}^m \log(1 - \sigma(s(e^-_i)))$
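These score and loss functions translate directly into code; a minimal NumPy sketch (function names are mine):

```python
import numpy as np

def dot_score(z_u, r, z_v):
    """Dot-product score: s(e) = z_u^T z_v (the relation vector r is unused)."""
    return z_u @ z_v

def transe_score(z_u, r, z_v):
    """TransE score: s(e) = -||z_u + r - z_v||_2^2."""
    diff = z_u + r - z_v
    return -np.dot(diff, diff)

def max_margin_loss(pos_score, neg_scores):
    """Max-margin loss: sum_i max(1 - s(e) + s(e_i^-), 0)."""
    return float(np.sum(np.maximum(1.0 - pos_score + np.asarray(neg_scores), 0.0)))

def cross_entropy_loss(pos_score, neg_scores):
    """Cross-entropy loss: -log sigma(s(e)) - sum_i log(1 - sigma(s(e_i^-)))."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    neg = np.asarray(neg_scores)
    return float(-np.log(sigmoid(pos_score)) - np.sum(np.log(1.0 - sigmoid(neg))))
```

Note that TransE scores a perfect match (z_u + r = z_v) as 0, and everything else negatively, so real edges are pushed toward the maximum.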

  19. Formalizing fairness
     § How do we ensure fairness in this context?

  20. Formalizing fairness
     § How do we ensure fairness in this context?
     § Solution: representational invariance.
     § We want the embeddings to be independent from the sensitive attributes: $z_u \perp a_u$.
     § This is equivalent to minimizing the mutual information between the embeddings and the attributes: $\min I(z_u; a_u)$.

  21. Enforcing fairness through an adversary

  22. Enforcing fairness through an adversary
     § Key component 1: Compositional encoder.
     § Given a set of attributes, it outputs "filtered" embeddings that should be invariant to those attributes.
     § Input: a node ID and a set of sensitive attributes. A trainable filter function (a neural network) outputs an embedding that is invariant to attribute k; the results are summed over all sensitive attributes.
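A sketch of the compositional encoder under my reading of this slide: one trainable filter per sensitive attribute, with the filtered views summed over the attributes to be removed (the linear filters here stand in for the slide's neural networks, and all names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 100, 16
sensitive_attrs = ["gender", "age", "occupation"]

embeddings = rng.normal(size=(num_nodes, dim))

# One trainable filter per sensitive attribute; a linear map keeps the sketch short.
filters = {k: rng.normal(size=(dim, dim)) / np.sqrt(dim) for k in sensitive_attrs}

def compositional_enc(u, attrs_to_filter):
    """Filtered embedding for node u: a sum of per-attribute filtered views,
    intended (after adversarial training) to be invariant to those attributes."""
    z_u = embeddings[u]
    return sum(z_u @ filters[k] for k in attrs_to_filter)

z_filtered = compositional_enc(5, ["gender", "age"])
```

Because the filters compose by summation, any subset of attributes can be filtered out at inference time, which is what makes the encoder "compositional".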

  23. Enforcing fairness through an adversary
     § Key component 2: Adversarial discriminators.
     § For each sensitive attribute, train an adversarial discriminator that tries to predict that sensitive attribute from the filtered embeddings.
     § Input: the filtered embedding for node u and an attribute value. Output: the likelihood that node u has that attribute value. One discriminator per sensitive attribute k.
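Each discriminator can be sketched as a small classifier from filtered embeddings to attribute-value likelihoods; here a minimal softmax-classifier stand-in for the slide's discriminator (names and sizes are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
dim, num_values = 16, 2  # e.g., a binary sensitive attribute

# Discriminator for one sensitive attribute k: filtered embedding -> likelihoods.
W = rng.normal(size=(dim, num_values))
b = np.zeros(num_values)

def discriminator(z_filtered):
    """Output: likelihood that the node has each value of attribute k."""
    logits = z_filtered @ W + b
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

probs = discriminator(rng.normal(size=dim))
```

In training, W and b are updated to predict the attribute as well as possible, while the encoder is updated to make that prediction fail.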

  24. Enforcing fairness through an adversary
     § Putting it all together in an adversarial loss:
     § The loss combines the original loss function for the edge prediction task with a penalty term: a constant that determines the strength of the fairness constraints, multiplied by the likelihood of the discriminators predicting the sensitive attributes.

  25. Enforcing fairness through an adversary
     § Putting it all together in an adversarial loss:
     § During training, the encoder tries to minimize this loss and the adversarial discriminators are trained to maximize it.
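The min-max objective on these two slides can be written as one combined loss; a sketch in which `edge_loss` and the discriminator log-likelihoods stand in for the pieces defined earlier, and `lam` is my name for the fairness-strength constant:

```python
def adversarial_loss(edge_loss, discriminator_log_likelihoods, lam):
    """Combined loss: the edge-prediction loss plus lam times the discriminators'
    log-likelihood of recovering the sensitive attributes.
    The encoder minimizes this; the discriminators are trained to maximize it
    (i.e., to predict the attributes as well as they can)."""
    return edge_loss + lam * sum(discriminator_log_likelihoods)

# If the discriminators can still recover the attributes (high log-likelihood),
# the encoder is penalized; lam = 0 recovers the plain edge-prediction loss.
loss = adversarial_loss(0.7, [-1.2, -0.4], lam=10.0)
```

In practice the two sides alternate gradient steps, as in other adversarial training schemes.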

  26. Enforcing fairness through an adversary

  27. Dataset 1: MovieLens-1M
     § Classic recommender system benchmark.
     § Bipartite graph between users and movies.
     § Nodes (~10,000): users and movies
     § Edges (~1,000,000): the rating a user gives a movie
     § Sensitive attributes: gender; age (binned to become a categorical attribute); occupation

  28. Dataset 2: Reddit
     § Derived from public Reddit comments.
     § Bipartite graph between users and communities.
     § Nodes (~300,000): users and communities
     § Edges (~7,000,000): whether a user commented on that community
     § Sensitive attributes: 50 randomly selected communities treated as "sensitive" communities

  29. Dataset 3: Freebase 15k-237
     § Derived from a classic knowledge base completion benchmark.
     § Knowledge graph over a set of typed entities.
     § Nodes (~15,000): entities
     § Edges (~150,000): 237 different relation types (e.g., married_to, born_in, capital_of, director_of)
     § Sensitive attributes: 3 randomly selected entity-type annotations (e.g., is_actor) treated as "sensitive attributes"

  30. Experiments: Three questions
     1. What is the cost of invariance?
     2. What is the impact of compositionality?
     3. Can we generalize to unseen combinations of attributes?

  31. MovieLens: Fairness results
     § How strongly can we enforce fairness?
     § Compare three approaches to enforcing fairness:
       § No adversary (i.e., just train on the recommendation task)
       § An independent adversarial model for each attribute
       § The full compositional model
