  1. Laplacian Regularized Few-Shot Learning (LaplacianShot). Imtiaz Masud Ziko, Jose Dolz, Eric Granger and Ismail Ben Ayed. ETS Montreal.

  2. Overview
  - Few-Shot Learning: what and why? The context; brief discussion of existing approaches.
  - Proposed LaplacianShot: proposed formulation; optimization; proposed algorithm.
  - Experiments: experimental setup; SOTA results on 5 different few-shot benchmarks.

  3. Few-Shot Learning (An example)

  4. Few-Shot Learning (An example) - Given C = 5 classes, each class c having 1 example. - From these examples, learn a model to classify this query (5-way 1-shot).

  5. Few-Shot Learning (An example) - Given C = 5 classes, each class c having 1 example. - From these examples, learn a model to classify this query (5-way 1-shot).

  6. Few-Shot Learning - Humans recognize new classes perfectly with only a few examples.

  7. Few-Shot Learning ❏ Modern ML methods generalize poorly from few examples. ❏ Need a better way.

  8. Few-Shot Learning - A very large body of recent work, mostly based on the meta-learning framework.

  9. Meta-Learning Framework

  10. Meta-Learning Framework - Training set with enough labeled data (base classes different from the test classes).

  11. Meta-Learning Framework - Training set with enough labeled data to learn an initial model.

  12. Meta-Learning Framework - Create episodes and do episodic training to learn a meta-learner. Vinyals et al. (NeurIPS '16), Snell et al. (NeurIPS '17), Sung et al. (CVPR '18), Finn et al. (ICML '17), Ravi et al. (ICLR '17), Lee et al. (CVPR '19), Hu et al. (ICLR '20), Ye et al. (CVPR '20), . . .

  13. Taking a few steps backward . . . Recently [Chen et al., ICLR '19; Wang et al., '19; Dhillon et al., ICLR '20]: simple baselines outperform the overly convoluted meta-learning based approaches.

  14. Baseline Framework - No need to meta-train.

  15. Baseline Framework - Simple conventional cross-entropy training. - The approaches mostly differ during inference.

  16. Inductive vs Transductive Inference - Support examples and a query/test point. - Vinyals et al., NeurIPS '16 (attention mechanism). - Snell et al., NeurIPS '17 (nearest prototype).

  17. Inductive vs Transductive Inference - Support examples and query/test points. - Liu et al., ICLR '19 (label propagation). - Dhillon et al., ICLR '20 (transductive fine-tuning). - Transductive: predict for all test points jointly, instead of one at a time.

  18. Proposed LaplacianShot - Laplacian-regularized objective over a latent assignment matrix for the N query samples. - A label assignment for each query. - Simplex constraints.
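The equations on this slide were images and did not survive the transcription. Following the LaplacianShot paper (Ziko et al., ICML 2020), the objective being described is, up to notation:

```latex
% Laplacian-regularized objective (notation mine, following the paper):
%   a_q    : vector of distances from query x_q to the C class prototypes
%   w(.,.) : pairwise affinity between query features
\min_{Y}\quad \sum_{q=1}^{N} \mathbf{y}_q^{\top}\mathbf{a}_q
\;+\; \frac{\lambda}{2}\sum_{q,p} w(\mathbf{x}_q,\mathbf{x}_p)\,
\lVert \mathbf{y}_q-\mathbf{y}_p\rVert^{2}
% with one label-assignment vector per query, relaxed to the simplex:
\qquad \mathbf{y}_q \in [0,1]^{C}, \qquad \mathbf{1}^{\top}\mathbf{y}_q = 1
```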

  19. Proposed LaplacianShot - The Laplacian-regularized objective has two terms: - Nearest-prototype classification: without the regularizer, similar to ProtoNet (Snell '17) or SimpleShot (Wang '19). - Laplacian regularization: well known via the graph Laplacian in spectral clustering (Shi '00, von Luxburg '07), SLK (Ziko '18), and SSL (Weston '12, Belkin '06).

  20. LaplacianShot Takeaways ✓ SOTA results without bells and whistles. ✓ Simple constrained graph clustering works very well. ✓ No network fine-tuning, nor meta-learning. ✓ Model agnostic. ✓ Fast transductive inference: almost inductive time.

  21. LaplacianShot: More Details

  22. Proposed LaplacianShot - Nearest-prototype classification: when the Laplacian term is dropped, the objective labels each query according to the nearest support prototype in the feature embedding. - The prototype can be: the support example (in 1-shot), or the simple mean of the support examples, or a weighted mean of both the support and the initially predicted query samples.
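The prototype choices listed on this slide can be sketched as follows. This is a minimal NumPy sketch; the function and variable names are mine, not from the paper's code, and the weighted-mean variant shows one plausible form of "weighted mean from both support and initially predicted query samples":

```python
import numpy as np

def prototypes(support_feats, support_labels, num_classes):
    """Class prototypes m_c as the mean of support embeddings.

    In 1-shot this reduces to the single support example itself."""
    return np.stack([
        support_feats[support_labels == c].mean(axis=0)
        for c in range(num_classes)
    ])

def rectified_prototypes(protos, query_feats, query_soft, shrink=0.5):
    """Weighted mean of support prototypes and softly-assigned queries
    (illustrative; `shrink` trades off support vs. query evidence)."""
    # query_soft: (N, C) soft assignments of queries to classes
    weights = query_soft / (query_soft.sum(axis=0, keepdims=True) + 1e-12)
    query_means = weights.T @ query_feats          # (C, d) per-class query mean
    return shrink * protos + (1 - shrink) * query_means
```

In 5-way 1-shot, `prototypes` simply returns the five support embeddings; with more shots it averages them per class.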

  23. Proposed LaplacianShot - Laplacian regularization, built from pairwise similarities: well known via the graph Laplacian. - Encourages nearby points to have similar assignments.
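A common way to build the pairwise affinities behind the Laplacian term is a sparse k-nearest-neighbour graph over the query features. The construction below is illustrative (names and the binary weighting are my choices, not necessarily the paper's exact recipe):

```python
import numpy as np

def knn_affinity(feats, k=3):
    """Symmetric binary k-NN affinity matrix W.

    w_qp = 1 if p is among the k nearest neighbours of q (or vice versa),
    else 0.  The Laplacian term then encourages neighbours to share
    label assignments."""
    # Pairwise squared Euclidean distances
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-loops
    nn = np.argsort(d2, axis=1)[:, :k]        # indices of k nearest neighbours
    W = np.zeros_like(d2)
    rows = np.repeat(np.arange(len(feats)), k)
    W[rows, nn.ravel()] = 1.0
    return np.maximum(W, W.T)                 # symmetrize
```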

  24. Proposed Optimization - The Laplacian-regularized objective is tricky to optimize due to:

  25. Proposed Optimization - Tricky to optimize due to: ✖ Simplex/integer constraints.

  26. Proposed Optimization - Tricky to optimize due to: ✖ Laplacian over discrete variables.

  27. Proposed Optimization - Relaxing the integer constraints ➢ a convex quadratic problem, but: ✖ Requires solving for all N×C variables together. ✖ Extra projection steps for the simplex constraints.

  28. Proposed Optimization - We do instead: ✓ Independent, closed-form updates for each assignment variable. ✓ Concave relaxation. ✓ Efficient bound optimization.

  29. Concave Laplacian

  30. Concave Laplacian - When the two assignments are equal, the pairwise penalty vanishes.

  31. Concave Laplacian - When the two assignments are not equal, the pairwise penalty is a positive constant.

  32. Concave Laplacian - Summing the pairwise penalties brings in the degree of each point.

  33. Concave Laplacian - Remove constant terms.

  34. Concave Laplacian - The remaining term is concave for a PSD affinity matrix.
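The equation steps on slides 30-34 were images; the derivation they walk through can be reconstructed as follows (my reconstruction, matching the slide keywords "when equal", "degree", "remove constant terms", "concave for PSD matrix"):

```latex
% For integer (one-hot) assignments, y_q^T y_q = 1, so
\lVert \mathbf{y}_q-\mathbf{y}_p\rVert^{2}
  \;=\; 2 - 2\,\mathbf{y}_q^{\top}\mathbf{y}_p
% i.e. 0 when the assignments are equal, 2 when they are not. Hence
\frac{1}{2}\sum_{q,p} w_{qp}\,\lVert \mathbf{y}_q-\mathbf{y}_p\rVert^{2}
  \;=\; \underbrace{\sum_{q} d_q}_{\text{constant, degrees } d_q=\sum_p w_{qp}}
  \;-\; \sum_{q,p} w_{qp}\,\mathbf{y}_q^{\top}\mathbf{y}_p
% Dropping the constant leaves  -\mathrm{tr}(Y^{\top} W Y),
% which is concave in the relaxed Y whenever W is PSD.
```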

  35. Concave-Convex Relaxation - Putting it all together with a convex barrier function: ● Avoids extra dual variables for the simplex constraint. ● Closed-form updates.

  36. Bound Optimization - First-order approximation of the concave term, with the unary term kept fixed.

  37. Bound Optimization - Iteratively optimizing, we get an iterative tight upper bound.

  38. Bound Optimization - The upper bound decomposes into independent terms, one per assignment variable.

  39. Bound Optimization - Minimize the independent upper bound: the KKT conditions bring closed-form updates.
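Up to notation, the closed-form update the KKT conditions yield (following the paper) is a per-query softmax:

```latex
y_{qc}^{(i+1)} \;=\;
\frac{\exp\!\big(-a_{qc} + \lambda \sum_{p} w_{qp}\, y_{pc}^{(i)}\big)}
     {\sum_{c'} \exp\!\big(-a_{qc'} + \lambda \sum_{p} w_{qp}\, y_{pc'}^{(i)}\big)}
```

Each assignment vector is updated independently and stays on the simplex by construction, which is why no extra projection step is needed.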

  40. LaplacianShot Algorithm
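The algorithm on this slide is an image; below is a compact NumPy sketch of the inference loop it describes, under my reading of the paper: unary distances to prototypes, softmax initialization, then iterated bound-optimization updates. Names are mine, and details such as feature normalization and prototype rectification are omitted; see the authors' repository for the exact recipe.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax with a max-shift for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def laplacianshot(query_feats, protos, W, lam=1.0, n_iters=20):
    """Transductive LaplacianShot inference (sketch).

    Minimizes  sum_q y_q . a_q + (lam/2) sum_{q,p} w_qp ||y_q - y_p||^2
    over soft assignments Y on the simplex, via independent closed-form
    (softmax) updates per query."""
    # Unary term: squared distance of each query to each class prototype
    A = ((query_feats[:, None, :] - protos[None, :, :]) ** 2).sum(-1)  # (N, C)
    Y = softmax(-A)                        # initialize from nearest prototypes
    for _ in range(n_iters):
        Y = softmax(-(A - lam * (W @ Y)))  # bound-optimization update
    return Y.argmax(axis=1), Y
```

Each iteration costs one sparse-matrix product plus a softmax, which is why transductive inference here runs in nearly inductive time.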

  41. Experiments - Datasets:
  1. Mini-ImageNet (generic classification): 64 base, 16 validation and 20 test classes.
  2. Tiered-ImageNet (generic classification): 351 base, 97 validation and 160 test classes.
  3. CUB-200-2011 (fine-grained classification): 100 base, 50 validation and 50 test classes.
  4. iNat.

  42. Experiments - Evaluation protocol: - 5-way 1-shot/5-shot. - 15 query samples per class (N = 75). - Average accuracy over 10,000 few-shot tasks with 95% confidence interval.

  43. Experiments - iNat: more realistic and challenging. - Recently introduced (Wertheimer & Hariharan, 2019). - Slight class distinction. - Imbalanced class distribution, with a variable number of support/query samples per class.

  44. Experiments - Evaluation protocol (iNat): - 227-way, multi-shot. - Top-1 accuracy averaged over the test images per class. - Top-1 accuracy averaged over all the test images (mean).

  45. Experiments - We do: cross-entropy training with the base classes, then LaplacianShot during inference.

  46. Results (Mini-ImageNet)

  47. Results (Mini-ImageNet)

  48. Results (Tiered-ImageNet)

  49. Results (CUB): Cross-Domain

  50. Results (iNat)

  51. Ablation: Choosing

  52. Ablation: Convergence

  53. Ablation: Average Transductive Inference Time

  54. LaplacianShot Takeaways ✓ SOTA results without bells and whistles. ✓ Simple constrained graph clustering works very well. ✓ No network fine-tuning, nor meta-learning. ✓ Model agnostic: works during inference with any training model, with gains of up to 4-5%! ✓ Fast transductive inference: almost inductive time.

  55. Thank you! Code: https://github.com/imtiazziko/LaplacianShot
