Constrained Video Face Clustering using 1NN Relations Vicky Kalogeiton Andrew Zisserman
Video face clustering
Input, Output. Video source: [Tapaswi ICVGIP 2014]
Why does it matter? Comfort Fun Access
Modern applications
Automatic story telling
"Grand entry of the king’s horses and men. ARYA, wearing a helm and cloak, pushes her way into a tall wagon for a better look…. In rides JOFFREY, followed by the HOUND"
Overview
C1C: Constrained 1NN Clustering
[Figure legend: must-link, cannot-link, 1st neighbor, min-cut, cluster]
Friends dataset
• season 3 (10h)
• ~25 episodes
• 17k head tracks
• 49 characters
Contributions
✓ No training required
✓ Scalable
✓ Low computational cost
✓ Outperforms state of the art
✓ Friends: challenging
Outline
• FINCH clustering method
• Self-supervised Constraints
• C1C pipeline
• Friends dataset
• Experimental Results
Related work
[Everingham BMVC 2006] [Cinbis ICCV 2011] [Bojanowski ICCV 2013] [Wu ICCV 2013, CVPR 2013]
S-Siam [Sharma T-BIOM 2020], BCL [Tapaswi ICCV 2019], CCL [Sharma FG 2020], T-Siam [Sharma FG 2019]
- Use HAC
- Require learning
- High computational cost
C1C: Constrained 1NN Clustering
Hierarchical clustering method; Self-supervised constraints
[Code] http://www.robots.ox.ac.uk/~vgg/research/c1c/
C1C: FINCH
FINCH [S. Sarfraz CVPR 19]
• Pairwise distances between all instances
• At every partition, link all first NN
• Merge instances that are first neighbors or have a common first neighbor
• Represent a cluster with the average of its instances
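The FINCH steps listed on this slide can be sketched in a few lines. This is a minimal illustration, not the released C1C or FINCH code: the function names `first_neighbor_partition` and `cluster_means`, the Euclidean distance, and the toy 2-D points are all assumptions made for the example. It computes one partition only.

```python
from math import dist


def first_neighbor_partition(points):
    """Return one cluster label per point using first-neighbor links only.

    Assumes at least two points and Euclidean distance (both are
    illustrative choices, not taken from the slides).
    """
    n = len(points)
    # First (nearest) neighbor of every point, excluding the point itself.
    first_nn = [
        min((j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]))
        for i in range(n)
    ]

    # Union-find: linking every point to its first neighbor also joins
    # points that share a common first neighbor.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in enumerate(first_nn):
        parent[find(i)] = find(j)

    # Relabel the union-find roots as 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


def cluster_means(points, labels):
    """Represent each cluster by the average of its instances."""
    sums, counts = {}, {}
    for p, c in zip(points, labels):
        acc = sums.setdefault(c, [0.0] * len(p))
        for d, value in enumerate(p):
            acc[d] += value
        counts[c] = counts.get(c, 0) + 1
    return {c: tuple(v / counts[c] for v in sums[c]) for c in sums}


# Toy data: three nearby points and a distant pair.
points = [(0.0, 0.0), (0.0, 1.0), (0.2, 0.4), (10.0, 10.0), (10.0, 11.0)]
labels = first_neighbor_partition(points)
means = cluster_means(points, labels)
print(labels)  # [0, 0, 0, 1, 1]
```

Under the same assumptions, the next level of the hierarchy would be obtained by calling `first_neighbor_partition` again on the cluster means.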
C1C: Self-supervised Constraints
Must-link, Cannot-link
C1C: Constrained 1NN Clustering
[Figure, built up over several slides, with panels (a), (b), (c). Legend: must-link, cannot-link, 1st neighbor, min-cut, cluster]
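The figure legend names a min-cut next to the first-neighbor, must-link and cannot-link relations. The slides do not spell out the procedure, so the following is only one plausible reading, sketched under stated assumptions: when two instances joined by a cannot-link end up in the same linked group, the group is split along a minimum cut separating them. The function name `min_cut_split`, the edge weights, and the toy graph are hypothetical. The max-flow routine is a plain Edmonds-Karp implementation, not the authors' code.

```python
from collections import defaultdict, deque


def min_cut_split(edges, s, t):
    """Split an undirected weighted graph along a minimum s-t cut.

    edges: iterable of (u, v, weight). Returns (s_side, t_side) as sets.
    """
    # Residual capacities; an undirected edge has capacity in both directions.
    cap = defaultdict(lambda: defaultdict(float))
    nodes = set()
    for u, v, w in edges:
        cap[u][v] += w
        cap[v][u] += w
        nodes.update((u, v))

    def reachable_from_s():
        # Breadth-first search over edges with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_s()
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        # Walk back from t to s to recover the augmenting path.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    # Nodes still reachable from s form the s side of a minimum cut.
    s_side = set(parent)
    return s_side, nodes - s_side


# Hypothetical group: two tight triangles joined by one weak link (c, d),
# with a cannot-link between "a" and "f".
edges = [("a", "b", 2), ("b", "c", 2), ("a", "c", 2),
         ("d", "e", 2), ("e", "f", 2), ("d", "f", 2),
         ("c", "d", 1)]
s_side, t_side = min_cut_split(edges, "a", "f")
print(sorted(s_side), sorted(t_side))  # ['a', 'b', 'c'] ['d', 'e', 'f']
```

In this toy case the cut removes only the weak link between the two triangles, so the cannot-link pair is separated while the strongly linked instances stay together.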
Friends dataset
• Friends season 3 (~10h, 25 episodes)
• 17k head tracks, 49 characters (six main, 43 secondary)
Main: Rachel, Monica, Ross, Joey, Phoebe
Secondary: Carol, Kate, Rachel date, Gunther, Janice
Datasets & Metrics
Datasets: Buffy the Vampire Slayer, The Big Bang Theory (BBT)
Implementation
• Architecture: ResNet-50
• Pre-trained: MS-Celeb-1M
• Fine-tuned: VGGFace2
WCP: Weighted Clustering Purity
$$\mathrm{WCP} = \frac{1}{N} \sum_{c=1}^{|C|} n_c \, p_c$$
• $p_c$: purity of cluster $c$
• $n_c$: #samples in cluster $c$
• $N$: total #samples, $|C|$: #clusters
Trade-off: Purity vs. # clusters
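The WCP metric on this slide can be computed directly from predicted cluster ids and ground-truth identities. The sketch below assumes the usual definition of purity (the fraction of a cluster's samples that belong to its most frequent identity), which the slide does not state explicitly. The function name `weighted_clustering_purity` and the toy labels are illustrative.

```python
from collections import Counter


def weighted_clustering_purity(cluster_ids, true_ids):
    """WCP = (1/N) * sum over clusters of n_c * p_c."""
    n_total = len(cluster_ids)

    # Collect the ground-truth identities that fall into each cluster.
    members = {}
    for c, identity in zip(cluster_ids, true_ids):
        members.setdefault(c, []).append(identity)

    weighted_sum = 0.0
    for identities in members.values():
        n_c = len(identities)
        # Assumed purity: share of the most frequent identity in the cluster.
        p_c = Counter(identities).most_common(1)[0][1] / n_c
        weighted_sum += n_c * p_c
    return weighted_sum / n_total


# Toy example: cluster 0 mixes two characters, cluster 1 is pure.
cluster_ids = [0, 0, 0, 1, 1]
true_ids = ["Ross", "Ross", "Joey", "Joey", "Joey"]
wcp = weighted_clustering_purity(cluster_ids, true_ids)
print(round(wcp, 3))  # 0.8
```

WCP reaches 1.0 trivially when every sample is its own cluster, which is why the slide pairs purity with the number of clusters as a trade-off.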
Quantitative Results (%WCP)
BBT: 90.8%, 92.9%, 95.3%
Buffy: 82.9%, 86.5%, 88.1%
Friends: 69.7%, 77.0%
Legend: FINCH [Sarfraz CVPR 19], BCL [Tapaswi ICCV 19], C1C [Ours]
✓ C1C: better than FINCH and BCL
✓ Friends: challenging
Qualitative results
Conclusions & Future work
• C1C:
  – links instances through 1NN relations
  – must-link and cannot-link constraints
• Advantages:
  – scalable
  – no training required
  – low computational cost
• Friends dataset
• State of the art results
• Overcome failures by using more context
• Automatically estimate #characters
Thank you