  1. Unsupervised Image Segmentation Using Comparative Reasoning and Random Walks
     Anuva Kulkarni (Carnegie Mellon University), Filipe Condessa (Carnegie Mellon University, IST-University of Lisbon), Jelena Kovacevic (Carnegie Mellon University)

  2. Outline
  • Motivation
    – Training-free methods
    – Comparative Reasoning
    – Related work
  • Approach
    – Winner Take All (WTA) Hash
    – Clustering based on Random Walks
  • Some experimental results

  3. Acknowledgements
  • Example and test images taken from
    – Berkeley Segmentation Dataset (BSDS)
    – The Prague Texture Segmentation Data Generator and Benchmark

  4. Motivation
  • Goals:
    – Segment images where the number of classes is unknown
    – Eliminate the need for training data (which may not be available)
    – Provide a fast pre-processing step for classification
  • Segmentation is a similarity search
  • Comparative reasoning is rank correlation using the machine-learning concept of "hashing"

  5. Hashing
  • Used to speed up the searching process
  • A "hash function" maps data values (keys) to "hash codes"
  • A hash table is a shortened representation of the data, e.g.:
    – 001 → Bird_type1
    – 010 → Bird_type2
    – 011 → Dog_type1
    – 100 → Fox_type1

  6. Hashing
  • Similar data points have the same (or nearby) hash keys or "hash codes"
  • Properties of hash functions
    – Always return a number for an object
    – Two equal objects always have the same number
    – Two unequal objects may not always have different numbers
  • Images from Wikipedia (https://upload.wikimedia.org/wikipedia/commons/3/32/House_sparrow04.jpg) and www.weknowyourdreams.com
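
  Not part of the slides: a minimal Python sketch of these properties using a toy quantization hash, where the bucket width and feature values are purely illustrative (the actual system uses the WTA hash introduced later).

```python
import numpy as np

def toy_hash(x, bucket=0.25):
    """Toy hash: quantize each feature into coarse buckets, so equal inputs
    always collide, nearby inputs usually collide, and far-apart inputs
    usually land in different buckets (collisions remain possible)."""
    return tuple(np.floor(np.asarray(x) / bucket).astype(int))

sparrow_a = [0.81, 0.42]     # two similar feature vectors
sparrow_b = [0.80, 0.44]
dog       = [0.10, 0.95]     # a very different feature vector
print(toy_hash(sparrow_a) == toy_hash(sparrow_b))   # True: same hash code
print(toy_hash(sparrow_a) == toy_hash(dog))         # False: different code
```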

  7. Hashing for Segmentation
  • Each pixel is described by a feature vector (e.g., color)
  • Hashing is used to cluster the pixels into groups: similar features are hashed into the same group

  8. Segmentation and Randomized Hashing
  • Random hashing: a hash code indicates the region in which a feature vector lies after the space is split by a set of randomly chosen splitting planes
  C. J. Taylor and A. Cowley, "Fast segmentation via randomized hashing," in BMVC, pp. 1–11, 2009.
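
  A hedged sketch of the general splitting-plane idea (sign bits from random hyperplanes), not Taylor and Cowley's exact construction; function names, parameter values, and the example colours are illustrative.

```python
import numpy as np

def random_plane_hash(features, n_planes=4, seed=0):
    """Hash d-dimensional feature vectors into binary codes by recording
    on which side of each randomly chosen splitting plane they fall."""
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    normals = rng.standard_normal((n_planes, d))   # random plane normals
    offsets = rng.standard_normal(n_planes)        # random plane offsets
    # bit b is 1 if the feature lies on the positive side of plane b
    bits = (features @ normals.T + offsets) > 0
    # pack the bits into one integer hash code per feature vector
    return (bits * (1 << np.arange(n_planes))).sum(axis=1)

# pixels with similar colour features tend to fall into the same cell
colours = np.array([[0.9, 0.1, 0.1], [0.88, 0.12, 0.1], [0.1, 0.2, 0.95]])
print(random_plane_hash(colours))
```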

  9. Winner Take All (WTA) Hash
  • A way to convert feature vectors into compact binary hash codes
  • The absolute value of a feature does not matter; only the ordering of values matters
  • Rank correlation is preserved – stability
  • Distance between hashes approximates rank correlation
  J. Yagnik, D. Strelow, D. A. Ross, and R.-S. Lin, "The power of comparative reasoning," in ICCV, pp. 2431–2438, IEEE, 2011.

  10. Calculating WTA Hash
  • Consider 3 feature vectors:
    – feature 1 = [13, 4, 2, 11, 5, 3]
    – feature 2 = [12, 5, 3, 10, 4, 2]
    – feature 3 = [1, 90, 44, 5, 15, 6]
  • Step 1: Create a random permutation vector θ = [3, 1, 5, 2, 6, 4] and permute each feature vector with θ:
    – feature 1 → [2, 13, 5, 4, 3, 11]
    – feature 2 → [3, 12, 4, 5, 2, 10]
    – feature 3 → [44, 1, 15, 90, 6, 5]

  11. Calculating WTA Hash
  • Step 2: Choose the first K entries of each permuted vector. Let K = 3:
    – feature 1 → [2, 13, 5]
    – feature 2 → [3, 12, 4]
    – feature 3 → [44, 1, 15]

  12. Calculating WTA Hash
  • Step 3: Pick the index of the maximum entry among the first K. This index is the hash code h of that feature vector:
    – feature 1: max of [2, 13, 5] is 13 → h = 2
    – feature 2: max of [3, 12, 4] is 12 → h = 2
    – feature 3: max of [44, 1, 15] is 44 → h = 1

  13. Calculating WTA Hash
  • Notice that feature 2 is just feature 1 perturbed by one, while feature 3 is very different
  • Feature 1 and feature 2 are similar and receive the same hash code (h = 2); feature 3 receives a different code (h = 1)
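
  A minimal Python sketch of the three WTA hash steps above; it reproduces the worked example from slides 10–13 (h = 2, 2, 1). Variable names are illustrative.

```python
import numpy as np

def wta_hash(x, theta, K=3):
    """Winner Take All hash of one feature vector: permute x with theta,
    keep the first K entries, and return the (1-indexed) position of the
    maximum among those K entries."""
    permuted = x[theta]                  # Step 1: apply the random permutation
    window = permuted[:K]                # Step 2: keep the first K entries
    return int(np.argmax(window)) + 1    # Step 3: index of the max (1-indexed)

theta = np.array([3, 1, 5, 2, 6, 4]) - 1   # permutation from the slides, 0-indexed
features = {
    "feature 1": np.array([13, 4, 2, 11, 5, 3]),
    "feature 2": np.array([12, 5, 3, 10, 4, 2]),   # feature 1 perturbed by one
    "feature 3": np.array([1, 90, 44, 5, 15, 6]),  # a very different vector
}
for name, x in features.items():
    print(name, "-> h =", wta_hash(x, theta))      # prints h = 2, 2, 1
```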

  14. Random Walks
  • A way of understanding proximity in graphs
  • Useful for propagation in graphs – creates probability maps
  • Analogous to an electrical network with voltages and resistances (seed nodes act as fixed voltage sources, e.g. +1 V and −1 V)
  • The algorithm is supervised: the user must specify seeds

  15. Our Approach
  • Block I (Similarity Search): transform the input image with the WTA hash over random projections
  • Block II: transform the hash codes into a graph (nodes, edges)
  • Block III: automatic seed selection feeds the RW algorithm, which produces probabilities; if the stopping criterion is not met, new seeds are selected and the RW algorithm is repeated; otherwise the segmented output is produced

  16. Block I: Similarity Search
  (Block diagram from slide 15, with Block I highlighted)

  17. WTA hash
  • Image dimensions: P × Q × d
  • Vectorize the image into a PQ × d matrix and project onto R randomly chosen pairs of points (hyperplanes)
  • Each point in the image then has an R-dimensional feature vector (one projection value per pair)

  18. WTA hash
  • Each point now has R features; run the WTA hash with K = 3 on each point in the image
  • Repeat this N times to get a PQ × N matrix of hash codes
  • With K = 3, each hash code takes one of only three values, so two bits suffice to store it (see the sketch below)
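
  A hedged sketch of Block I as described on slides 17–18. The exact form of the random projections ("R pairs of points") is an assumption here, taken as projecting onto the difference of two random points; all parameter defaults are illustrative.

```python
import numpy as np

def wta_codes(image, R=20, N=10, K=3, seed=0):
    """Vectorize a P x Q x d image, project every pixel onto R random
    directions (differences of random point pairs), then run the WTA hash
    N times to obtain a PQ x N matrix of hash codes."""
    rng = np.random.default_rng(seed)
    P, Q, d = image.shape
    X = image.reshape(P * Q, d)                  # PQ x d feature matrix
    pairs = rng.standard_normal((2, R, d))
    directions = pairs[0] - pairs[1]             # R projection directions
    proj = X @ directions.T                      # PQ x R projected features
    codes = np.empty((P * Q, N), dtype=np.uint8)
    for n in range(N):
        theta = rng.permutation(R)               # fresh permutation per run
        window = proj[:, theta[:K]]              # first K permuted entries
        codes[:, n] = np.argmax(window, axis=1)  # WTA hash code per pixel
    return codes

codes = wta_codes(np.random.rand(32, 48, 3))
print(codes.shape)   # (1536, 10): PQ x N matrix of hash codes
```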

  19. Block II: Create Graph
  (Block diagram from slide 15, with Block II highlighted)

  20. Create Graph
  • Run the WTA hash N times → each point has N hash codes
  • The image is transformed into a lattice graph
  • Calculate the edge weight between nodes i and j as
    ω_ij = exp(−β ν_ij),   ν_ij = d_H(i, j) / γ
    where:
    – d_H(i, j) = average Hamming distance over all N hash codes of i and j
    – γ = scaling factor
    – β = weight parameter for the RW algorithm
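
  A small sketch of this edge-weight computation. The placement of γ as a normalizing divisor and the β value used in the example are assumptions, not values from the slides.

```python
import numpy as np

def edge_weight(codes_i, codes_j, beta=5.0, gamma=1.0):
    """Weight of the edge between neighbouring pixels i and j: the average
    Hamming distance between their N hash codes, scaled by gamma (assumed
    here to act as a normalizing divisor), then passed through a decaying
    exponential so that similar pixels get strongly connected."""
    d_h = np.mean(codes_i != codes_j)   # avg. Hamming distance over N codes
    nu = d_h / gamma
    return np.exp(-beta * nu)

# two pixels whose hash codes agree in 8 of 10 runs get a fairly large weight
print(edge_weight(np.array([1, 2, 1, 0, 2, 1, 1, 0, 2, 1]),
                  np.array([1, 2, 1, 0, 2, 1, 1, 0, 1, 2])))
```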

  21. Block III: RW Algorithm
  (Block diagram from slide 15, with Block III highlighted)

  22. Seed Selection
  • The RW algorithm needs initial seeds to be defined
  • Seeds are drawn in an unsupervised way using Dirichlet processes
  • DP(G_0, α)
    – G_0 is the base distribution
    – α is the discovery parameter
  • A larger α leads to the discovery of more classes (illustrated on the slide with draws for α = 1, 10, 100)

  23. Seed Selection
  • The probability that a new seed belongs to a new class is proportional to α
  • Probability for the i-th sample having class label y_i (result by Blackwell and MacQueen, 1973):
    p(y_i = c | y_−i, α) = (n_c^{−i} + α / C_tot) / (n − 1 + α)
    where:
    – C_tot = total number of classes
    – y_i = class label c, with c ∈ {1, 2, …, C_tot}
    – y_−i = {y_j | j ≠ i}
    – n_c^{−i} = number of samples in the c-th class excluding the i-th sample

  24. Seed Selection
  • The setting is unsupervised, hence C_tot is taken to be infinite. For a non-empty class c (n_c^{−i} > 0):
    lim_{C_tot→∞} p(y_i = c | y_−i, α) = n_c^{−i} / (n − 1 + α) > 0
  • This is the "clustering effect" or "rich get richer" behaviour
  • Probability that a new class is discovered (class c is empty or new, n_c^{−i} = 0):
    lim_{C_tot→∞} p(y_i ≠ y_j for all j < i | y_−i, α) = α / (n − 1 + α)
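
  A minimal sketch of the Chinese-restaurant-process style sampling implied by the two limits above; how the sampled class labels are turned into image seeds is not shown on the slides, and the function name is illustrative.

```python
import numpy as np

def crp_labels(n_samples, alpha, rng=None):
    """Sample class labels one at a time: an existing class c is chosen with
    probability n_c / (n - 1 + alpha) and a brand-new class with probability
    alpha / (n - 1 + alpha), where n - 1 samples have been assigned so far."""
    if rng is None:
        rng = np.random.default_rng(0)
    labels, counts = [], []
    for i in range(n_samples):           # i samples already assigned (n - 1 = i)
        probs = np.array(counts + [alpha], dtype=float)
        probs /= i + alpha                # normalizes to 1
        c = rng.choice(len(probs), p=probs)
        if c == len(counts):              # a new class is discovered
            counts.append(1)
        else:
            counts[c] += 1                # "rich get richer": big classes grow
        labels.append(c)
    return np.array(labels)

# a larger alpha discovers more classes among 100 seed candidates
print(len(set(crp_labels(100, alpha=1))), len(set(crp_labels(100, alpha=10))))
```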

  25. Random Walks
  • The RW algorithm is used to generate probability maps in each iteration
  • Entropy is calculated from the probability maps
  • Entropy-based stopping criterion: as cluster purity increases, the average image entropy decreases (see the sketch below)
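
  A hedged sketch of the entropy computation behind this stopping criterion; the probability-map layout and the stopping threshold are assumptions for illustration, not values from the slides.

```python
import numpy as np

def mean_entropy(prob_maps, eps=1e-12):
    """Average per-pixel entropy of the class probability maps produced by
    the random-walker step; prob_maps has shape (n_classes, P, Q) and each
    pixel's probabilities sum to one."""
    p = np.clip(prob_maps, eps, 1.0)
    pixel_entropy = -(p * np.log2(p)).sum(axis=0)   # entropy per pixel
    return pixel_entropy.mean()

def should_stop(prob_maps, threshold=0.2):
    """Hypothetical stopping rule: stop refining seeds once the average
    image entropy falls below a chosen threshold (threshold is illustrative)."""
    return mean_entropy(prob_maps) < threshold
```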

  26. Experimental Results
  • Histology images with automatically picked seeds
  • Berkeley segmentation subset: avg. GCE of dataset = 0.186

  27. Experimental Results
  • TexGeo: avg. GCE of dataset = 0.134
  • TexBTF: avg. GCE of dataset = 0.061

  28. Experimental Results
  • Comparison measure: Global Consistency Error (GCE)* – lower GCE indicates lower error
  • GCE score by number of features:
    – 10 features: BSDSubset 0.179, TexBTF 0.063, TexColor 0.159, TexGeo 0.102
    – 20 features: BSDSubset 0.180, TexBTF 0.065, TexColor 0.159, TexGeo 0.129
    – 40 features: BSDSubset 0.186, TexBTF 0.061, TexColor 0.156, TexGeo 0.134
  *C. Fowlkes, D. Martin, and J. Malik, "Learning affinity functions for image segmentation: Combining patch-based and gradient-based approaches," in CVPR, vol. 2, pp. II-54, IEEE, 2003.
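
  A minimal sketch of the GCE measure reported in the table, following Martin et al.'s definition of local refinement error; this is not the authors' evaluation code, and the toy label maps at the end are illustrative.

```python
import numpy as np

def gce(seg1, seg2):
    """Global Consistency Error between two integer label maps of the same
    shape; lower values mean the two segmentations are more consistent."""
    s1, s2 = seg1.ravel(), seg2.ravel()
    n = s1.size
    # joint histogram: overlap count of every (segment in s1, segment in s2) pair
    _, inv1 = np.unique(s1, return_inverse=True)
    _, inv2 = np.unique(s2, return_inverse=True)
    joint = np.zeros((inv1.max() + 1, inv2.max() + 1))
    np.add.at(joint, (inv1, inv2), 1)
    n1 = joint.sum(axis=1, keepdims=True)     # sizes of segments in seg1
    n2 = joint.sum(axis=0, keepdims=True)     # sizes of segments in seg2
    # local refinement error summed over all pixels, in each direction
    e12 = (joint * (n1 - joint) / n1).sum()
    e21 = (joint * (n2 - joint) / n2).sum()
    return min(e12, e21) / n

a = np.array([[0, 0, 1], [0, 1, 1]])
b = np.array([[0, 0, 0], [1, 1, 1]])
print(round(gce(a, b), 3))   # ~0.444 for these two toy segmentations
```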
