GEEM: An algorithm for Active Learning on Attributed Graphs
Florence Regol*, Soumyasundar Pal*, Yingxue Zhang**, Mark Coates*
* McGill University, Compnet Lab
** Huawei Noah’s Ark Lab, Montreal Research Center
July 14th 2020
Active Learning - Problem Setting
What is active learning?
• Access to unlabelled data.
• Query an oracle for labels/targets.
  → Expensive process.
Goal: Choose optimal queries to maximize performance.
[Figure: label/target plotted against feature space, with candidate query points marked "?".]
Active Learning for Node Classification
[Figure: nodes to be classified, marked "?".]
Active Learning Process
Pool-based active learning algorithm steps:
1. PREDICT: Infer $\hat{Y} = f_t(X)$, where $f_t$ is trained on $(X, Y_{\mathcal{L}_t})$ with the current labelled set $\mathcal{L}_t$.
2. QUERY: Select $q_t$ from the unlabelled set $\mathcal{U}_t$.
   Update $\mathcal{L}_{t+1} = \mathcal{L}_t \cup \{q_t\}$ and $\mathcal{U}_{t+1} = \mathcal{U}_t \setminus \{q_t\}$.
Repeat until the query budget $B$ has been reached.
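The PREDICT / QUERY loop can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the callables `fit_predict`, `select_query`, and `oracle`, and the function name itself, are hypothetical placeholders for the model, the query strategy, and the labelling oracle.

```python
def pool_based_active_learning(fit_predict, select_query, oracle,
                               nodes, initial_labels, budget):
    """Run a pool-based active learning loop.

    fit_predict(labels)               -> predictions of a model trained on `labels`
    select_query(predictions, pool)   -> one element of `pool`
    oracle(node)                      -> true label of `node` (the expensive call)
    All three callables are hypothetical placeholders.
    """
    labels = dict(initial_labels)            # Y_{L_t}, keyed by node
    unlabelled = set(nodes) - set(labels)    # U_t

    for _ in range(budget):                  # stop once the budget B is spent
        if not unlabelled:
            break
        predictions = fit_predict(labels)            # PREDICT
        q = select_query(predictions, unlabelled)    # QUERY
        labels[q] = oracle(q)                        # L_{t+1} = L_t ∪ {q_t}
        unlabelled.remove(q)                         # U_{t+1} = U_t \ {q_t}

    return labels, unlabelled
```

Any strategy discussed in these slides fits this skeleton by supplying a different `select_query`; only the scoring of candidate nodes changes.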
Active Learning on Graphs - Existing work
GCN-based models
State-of-the-art active learning strategies are based on GCN output (AGE [1] and ANRMAB [2]).
1. PREDICT: Infer $\hat{Y} = f_t(X)$.
   → Run one epoch of GCN.
   → Save the node embeddings output from the GCN.
2. QUERY: Select $q \in \mathcal{U}_t$.
   → Select $q$ based on metrics derived from GCN output.
[1] Cai et al. "Active learning for graph embedding", arXiv 2017
[2] Gao et al. "Active discriminative network representation learning", IJCAI 2018
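As an illustration of a "metric derived from GCN output", one generic choice is the entropy of the predicted class distribution: the node the model is least sure about is queried. The sketch below is a stand-in for this family of scores, not the scoring rule of AGE or ANRMAB; the function names and the `class_probs` mapping are assumptions made for the example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(class_probs, unlabelled):
    """Return the unlabelled node whose predicted distribution has the highest entropy.

    class_probs: mapping node -> list of class probabilities (e.g. softmax output).
    unlabelled:  iterable of candidate nodes (U_t).
    Candidates are sorted first so that ties are broken deterministically.
    """
    return max(sorted(unlabelled), key=lambda node: entropy(class_probs[node]))
```

A score like this depends entirely on the quality of the model's probabilities, which is why the hyperparameter sensitivity discussed on the following slides matters.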
Existing work - Results
GCN-based algorithms on Cora.
[Figure: accuracy versus number of nodes in labeled set (20 to 60) for AGE and ANRMAB. Reference line: accuracy of GCN without active learning with x = 120 nodes in labeled set.]
Limitation: Deep learning models generally rely on a sizable validation set for hyperparameter tuning.
Results with non-optimized GCN hyperparameters highlight this dependence.
Existing work - Non optimized model
Cora with non-optimized version of AGE.
[Figure: accuracy versus number of nodes in labeled set (20 to 60) for AGE, AGE non optimized, and ANRMAB.]
Existing work - Unseen dataset
Amazon-photo. Hyperparameters not fine-tuned to the dataset.
[Figure: accuracy versus number of nodes in labeled set (10 to 50) for AGE, AGE non optimized, and ANRMAB. Reference line: accuracy of GCN without active learning with x = 160 nodes in labeled set.]
Proposed Algorithm: Graph Expected Error Minimization (GEEM)
Proposed algorithm - GEEM
Expected Error Minimization (EEM)
Risk of $q$: the expected 0/1 error once $q$ is added to $\mathcal{L}_t$, denoted by $R^{+q}_{|Y_{\mathcal{L}_t}}$.
EEM selects the query $q$ that minimizes this risk:
$$q^* = \arg\min_{q \in \mathcal{U}_t} R^{+q}_{|Y_{\mathcal{L}_t}}$$
$$R^{+q}_{|Y_{\mathcal{L}_t}} = \frac{1}{|\mathcal{U}_t^{-q}|} \sum_{i \in \mathcal{U}_t^{-q}} \sum_{k \in K} \Big(1 - \max_{k' \in K} p(y_i = k' \mid Y_{\mathcal{L}_t}, y_q = k)\Big)\, p(y_q = k \mid Y_{\mathcal{L}_t})$$
where $\mathcal{U}_t^{-q}$ is the unlabelled set without $q$.
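The EEM risk is an average, over the remaining unlabelled nodes i and the possible labels k of q, of the 0/1 error (1 minus the largest posterior for node i after adding y_q = k), weighted by the current probability that y_q = k. A minimal sketch follows, assuming two hypothetical callables: `prior(q)` returns the list of p(y_q = k | Y_L), and `retrained(i, q, k)` returns the list of p(y_i = k' | Y_L, y_q = k). How these posteriors are obtained is left open here.

```python
def eem_risk(q, unlabelled, prior, retrained):
    """Expected 0/1 error over the unlabelled nodes other than q, if q were labelled.

    prior(q)           -> [p(y_q = k | Y_L) for k in classes]
    retrained(i, q, k) -> [p(y_i = k' | Y_L, y_q = k) for k' in classes]
    Both callables are hypothetical placeholders for the probabilistic model.
    """
    others = [i for i in unlabelled if i != q]       # U_t without q
    if not others:
        raise ValueError("risk is undefined when q is the only unlabelled node")

    total = 0.0
    for i in others:
        for k, p_k in enumerate(prior(q)):
            # 0/1 error of the best guess for node i, weighted by p(y_q = k)
            total += (1.0 - max(retrained(i, q, k))) * p_k
    return total / len(others)


def select_eem_query(unlabelled, prior, retrained):
    """Return the candidate with the smallest expected risk (ties broken by sort order)."""
    return min(sorted(unlabelled),
               key=lambda q: eem_risk(q, unlabelled, prior, retrained))
```

The nested loops make the cost visible: one posterior evaluation is needed for every combination of candidate q, hypothesised label k, and remaining node i.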
Proposed Algorithm - $p(y \mid \cdot)$
All that remains is to define $p(y \mid \cdot)$.
Simplified GCN [3]: Removes the non-linearities of GCNs to obtain a linearized logistic regression model.
Set
$$p(y_j = k \mid Y_{\mathcal{L}}) = \sigma(\tilde{x}_j W_{Y_{\mathcal{L}}})^{(k)}$$
GEEM:
$$R^{+q}_{|Y_{\mathcal{L}_t}} = \frac{1}{|\mathcal{U}_t^{-q}|} \sum_{i \in \mathcal{U}_t^{-q}} \sum_{k \in K} \Big(1 - \max_{k' \in K} \sigma(\tilde{x}_i W_{\mathcal{L}_t, +q, y_k})^{(k')}\Big)\, \sigma(\tilde{x}_q W_{Y_{\mathcal{L}_t}})^{(k)}$$
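A sketch of the linearized model may help make the notation concrete. It rests on assumptions the slide does not spell out: that the pre-processed features x̃ come from repeated multiplication by the symmetrically normalized adjacency matrix with self-loops (the usual Simplified GCN construction), and that σ is the softmax function. The function names and the `num_hops` parameter are illustrative, and fitting the weight matrix W is omitted.

```python
import numpy as np


def sgc_features(adjacency, features, num_hops=2):
    """Propagate features over the graph: x_tilde = S^num_hops X.

    S = D^{-1/2} (A + I) D^{-1/2} is the normalized adjacency with self-loops
    (assumed here; the slide only names the Simplified GCN model).
    """
    adjacency = np.asarray(adjacency, dtype=float)
    a_tilde = adjacency + np.eye(adjacency.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))       # degrees are >= 1
    s = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    x = np.asarray(features, dtype=float)
    for _ in range(num_hops):
        x = s @ x
    return x


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def class_probabilities(x_tilde, weights):
    """p(y_j = k | Y_L) = softmax(x_tilde_j W)^(k), one row per node."""
    return softmax(x_tilde @ weights)
```

Because the propagated features are computed once and do not depend on the labels, only the logistic regression weights change when a candidate label is added, which is what distinguishes the weights written with the subscript +q, y_k from those trained on the current labelled set.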
Results
Results - GEEM
Cora. GEEM outperforms GCN-based methods even when GCN hyperparameters are fine-tuned.
[Figure: accuracy versus number of nodes in labeled set (20 to 60) for AGE, AGE non optimized, ANRMAB, GEEM*, and Random.]
Results - GEEM
Amazon-photo. GEEM significantly outperforms GCN-based methods.
[Figure: accuracy versus number of nodes in labeled set (10 to 50).]