Approximate Nearest Neighbors Search in High Dimensions and Locality-Sensitive Hashing


  1. Approximate Nearest Neighbors Search in High Dimensions and Locality-Sensitive Hashing. PAPERS: Piotr Indyk, Rajeev Motwani: Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. STOC 1998. Eyal Kushilevitz, Rafail Ostrovsky, Yuval Rabani: Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces. SIAM J. Comput., 2000. Mayur Datar, Nicole Immorlica, Piotr Indyk, Vahab S. Mirrokni: Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. Symposium on Computational Geometry 2004. Alexandr Andoni, Mayur Datar, Nicole Immorlica, Vahab S. Mirrokni, Piotr Indyk: Locality-Sensitive Hashing Using Stable Distributions. In: Nearest Neighbor Methods in Learning and Vision: Theory and Practice, 2006. CS 468 | Geometric Algorithms. Aneesh Sharma, Michael Wand

  2. Overview • Introduction • Locality Sensitive Hashing (Aneesh) • Hash Functions Based on p-Stable Distributions (Michael)

  3. Overview • Introduction • Nearest neighbor search problems • Higher dimensions • Johnson-Lindenstrauss lemma • Locality Sensitive Hashing (Aneesh) • Hash Functions Based on p-Stable Distributions (Michael)

  4. Problem

  5. Problem Statement. Today’s talks: NN-search in high-dimensional spaces • Given • a point set P = { p_1, …, p_n } • a query point q • Find • the [ε-approximate] nearest neighbor to q in P • Goals: • sublinear query time • “reasonable” preprocessing time & space • “reasonable” growth in d (exponential is not acceptable)
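
To ground the problem, here is the trivial exact baseline (a minimal sketch of our own, assuming numpy; `nearest_neighbor` is not from the slides): a linear scan answers every query in Θ(nd) time, which is the cost the sublinear-time methods below try to beat.

```python
import numpy as np

# Exact nearest neighbor by linear scan: Theta(n*d) per query.
def nearest_neighbor(P, q):
    dists = np.linalg.norm(P - q, axis=1)   # distance from q to every p_i
    i = int(np.argmin(dists))
    return i, dists[i]

P = np.random.rand(100_000, 64)   # n = 100000 points in d = 64 dimensions
q = np.random.rand(64)
print(nearest_neighbor(P, q))
```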

  6. Applications. Example application: feature spaces • Vectors x ∈ R^d represent characteristic features of objects • There are often many features • Use the nearest neighbor rule for classification / recognition [figure: car feature space with axes “top speed” and “mileage”, classes “sports car”, “sedan”, “SUV”, and query points “?”]

  7. Applications. “Real World” Example: Image Completion

  8. Applications. “Real World” Example: Image Completion • Iteratively fill in pixels with the best match (+ multi-scale) • Typically 5×5 … 9×9 neighborhoods, i.e. dimension 25 … 81 • Performance is limited by the nearest neighbor search • 3D version: dimension 81 … 729
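
A rough sketch of the patch-matching step (our own simplification, not the speakers’ pipeline; `patches` is a hypothetical helper): each w×w neighborhood becomes a d = w² feature vector, and completing a pixel is a nearest neighbor query over all known patches.

```python
import numpy as np

# Turn every w x w neighborhood of a grayscale image into a d = w*w vector.
def patches(img, w):
    H, W = img.shape
    return np.stack([img[y:y+w, x:x+w].ravel()
                     for y in range(H - w + 1)
                     for x in range(W - w + 1)])

img = np.random.rand(64, 64)
feats = patches(img, 5)               # d = 25 for 5x5, up to 81 for 9x9
best = 1 + np.argmin(np.linalg.norm(feats[1:] - feats[0], axis=1))
print(feats.shape, best)              # best match for the first patch
```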

  9. Higher Dimensions

  10. Higher Dimensions are Weird. Issues with high-dimensional spaces: • d-dimensional space: d independent neighboring directions at each point • Volume-distance ratio explodes: vol(r) ∈ Θ(r^d), d → ∞ [figure: balls for d = 1, d = 2, d = 3]
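
A one-line numeric illustration of that Θ(r^d) growth (our own, not from the slides): since volume scales like r^d, almost all of a unit d-ball’s volume sits in a thin outer shell.

```python
# vol(0.9)/vol(1) = 0.9**d, so the shell beyond radius 0.9 holds 1 - 0.9**d.
for d in (1, 2, 3, 10, 100):
    print(d, 1 - 0.9 ** d)   # 0.1, 0.19, 0.27, 0.65, ~0.99997
```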

  11. No Grid Tricks. Regular subdivision techniques fail: • Regular k-grids contain k^d cells • The “grid trick” does not work • Adaptive grids usually do not help either • Conventional integration with k subdivisions becomes infeasible (⇒ MC approximation) • Finite element function representations become infeasible

  12. Higher Dimensions are Weird. More weird effects: • Dart-throwing anomaly • Normal distributions gather probability mass in thin shells (d = 1…200) [Bishop 95] • Nearest neighbor ≈ farthest neighbor • for unstructured points (e.g. i.i.d. random) • not true for certain classes of structured data [Beyer et al. 99]
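
Both effects are easy to reproduce (a small experiment of our own, assuming numpy): Gaussian norms concentrate around √d, and the farthest-to-nearest distance ratio for i.i.d. points drifts toward 1 as d grows.

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (1, 10, 100, 200):
    X = rng.standard_normal((2000, d))
    norms = np.linalg.norm(X, axis=1)             # shell: std/mean shrinks
    dists = np.linalg.norm(X[1:] - X[0], axis=1)  # distances from one point
    print(d, norms.std() / norms.mean(), dists.max() / dists.min())
```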

  13. Johnson-Lindenstrauss Lemma

  14. Johnson-Lindenstrauss Lemma. JL-Lemma [Dasgupta et al. 99]: • Point set P in R^d, n := #P • There is a map f: R^d → R^k with k ∈ O(ε^-2 ln n) (precisely, k ≥ 4(ε²/2 − ε³/3)^-1 ln n) • … that preserves all inter-point distances up to a factor of (1+ε) • A random orthogonal linear projection works with probability ≥ 1 − 1/n
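
A minimal numerical check of the lemma (our own sketch, assuming numpy/scipy; the Gaussian matrix below is a standard stand-in for the random orthogonal projection in the slide):

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, eps = 500, 5000, 0.25
k = int(np.ceil(4 * np.log(n) / (eps**2 / 2 - eps**3 / 3)))  # lemma's bound

P = rng.standard_normal((n, d))
A = rng.standard_normal((d, k)) / np.sqrt(k)   # random linear map f
orig = pdist(P)                                # all pairwise distances
proj = pdist(P @ A)
print(k, np.max(np.abs(proj / orig - 1)))      # max distortion, < eps w.h.p.
```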

  15. This Means… What does the JL-Lemma imply? Pairwise distances in a small point set P (sub-exponential in d) can be well preserved in a low-dimensional embedding. What does it not say? It does not imply that the points themselves are well represented (only the pairwise distances are).

  16. Experiment

  17. Intuition. Difference vectors diff = u − v: • Normalize (relative error) • A pole yields a bad approximation • The non-pole area is much larger (high dimension) • Need a large number of poles (exponential in d) [figure: good vs. bad projections of diff, with a no-go area around the pole]

  18. Overview • Introduction • Locality Sensitive Hashing • Approximate Nearest Neighbors • Big picture • LSH on unit hypercube - Setup - Main idea - Analysis - Results • Hash Functions Based on p-Stable Distributions

  19. Approximate Nearest Neighbors

  20. ANN: Decision Version • Input: P, q, r • Output: • If there is a point within distance r of q (a NN), return yes and output a point within distance r(1+ε) (an ANN) • If there is no point within distance r(1+ε) (no ANN), return no • Otherwise, return either answer

  23. ANN: Decision Version • General ANN reduces to PLEB (the decision version): decision version + binary search over the radius r
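
A sketch of that reduction (our own pseudocode-style helper; `pleb` is a hypothetical decision oracle that returns a witness point on “yes” and None on “no”):

```python
# Binary search over the radius, shrinking [lo, hi] until hi/lo <= 1 + eps.
def ann_via_binary_search(pleb, q, r_min, r_max, eps):
    lo, hi = r_min, r_max
    answer = None
    while hi / lo > 1 + eps:
        mid = (lo * hi) ** 0.5        # geometric midpoint of the radius range
        p = pleb(q, mid)              # decision version at radius mid
        if p is not None:
            answer, hi = p, mid       # a witness exists: shrink from above
        else:
            lo = mid                  # no point within mid: shrink from below
    return answer if answer is not None else pleb(q, hi)
```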

  24. ANN: Previous Results
      Method   | Query time    | Space        | Preprocessing time
      Voronoi  | O(2^d log n)  | O(n^(d/2))   | O(n^(d/2))
      Kd-tree  | O(2^d log n)  | O(n)         | O(n log n)
      LSH      | O(n^ρ log n)  | O(n^(1+ρ))   | O(n^(1+ρ) log n)

  25. LSH: Big Picture

  26. Locality Sensitive Hashing • Remember: solving decision ANN • Input: • number of points: n • number of dimensions: d • point set: P • query point: q

  27. LSH: Big Picture • Family of hash functions: • close points to the same buckets • faraway points to different buckets • Choose a random function and hash P • Only store non-empty buckets

  28. LSH: Big Picture • Hash q into the table • Test every point in q’s bucket for an ANN • Problem: • q’s bucket may be empty

  29. LSH: Big Picture • Solution: • Use a number of hash tables! • We are done if any ANN is found

  30. LSH: Big Picture • Problem: • Poor resolution → too many candidates! • Stop after reaching a limit; the failure probability stays small

  31. LSH: Big Picture • Want to find a hash function h such that: • if u ∈ B(q, r) then Pr[h(u) = h(q)] ≥ α • if u ∉ B(q, R) then Pr[h(u) = h(q)] ≤ β • with r < R and α >> β • h is picked at random from a family • Choose R = r(1 + ε)

  32. LSH on the Unit Hypercube

  33. Setup: Unit Hypercube • Points lie on the hypercube: H^d = {0,1}^d • Every point is a binary string • Hamming distance (r): • number of differing coordinates

  35. Main Idea

  36. Hash Functions for the Hypercube • Define family F: given the hypercube H^d and a point b = (b_1, …, b_d), F = { h_i : h_i(b) = b_i, for b = (b_1, …, b_d) ∈ H^d, i = 1, …, d } • α = 1 − r/d, β = 1 − r(1+ε)/d • Intuition: compare a random coordinate • Called an (r, r(1+ε), α, β)-sensitive family
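
A minimal sketch of the family F (our own code; `sample_h` and `hamming` are illustrative names, not from the slides): picking a uniformly random coordinate gives exactly Pr[h(u) = h(q)] = 1 − dist(u, q)/d, which yields the α and β above.

```python
import random

def sample_h(d):
    """Draw h_i from F: h_i(b) = b_i for a uniformly random coordinate i."""
    i = random.randrange(d)
    return lambda b: b[i]

def hamming(u, v):
    """Hamming distance: number of differing coordinates."""
    return sum(x != y for x, y in zip(u, v))

d = 64
q = [random.randint(0, 1) for _ in range(d)]
u = q[:]
for i in random.sample(range(d), 8):       # a point at Hamming distance 8
    u[i] ^= 1
hs = [sample_h(d) for _ in range(10_000)]
est = sum(h(u) == h(q) for h in hs) / len(hs)
print(est, 1 - hamming(u, q) / d)          # both close to 1 - 8/64 = 0.875
```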

  37. Hash Functions for the Hypercube • Define family G: given b ∈ H^d and F, G = { g : {0,1}^d → {0,1}^k, g(b) = (h_1(b), …, h_k(b)), for h_i ∈ F } • α' = α^k = (1 − r/d)^k, β' = β^k = (1 − r(1+ε)/d)^k • Intuition: compare k random coordinates • Choose k later – logarithmic in n, cf. the J-L lemma

  38. Constructing Hash Tables • Choose g_1, …, g_τ uniformly at random from G • Construct τ hash tables and hash P into each • Will choose τ later [figure: tables for g_1, g_2, …, g_τ]
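
A sketch of the construction under the same assumptions (identifiers like `sample_g` and `build_tables` are ours): each table is keyed by g(b) = (h_1(b), …, h_k(b)), i.e. by k sampled coordinates, and only non-empty buckets are stored.

```python
import random
from collections import defaultdict

def sample_g(d, k):
    """Draw g from G: concatenate k random single-coordinate hashes."""
    coords = [random.randrange(d) for _ in range(k)]
    return lambda b: tuple(b[i] for i in coords)

def build_tables(P, d, k, tau):
    """Hash the point set P into tau independent hash tables."""
    gs = [sample_g(d, k) for _ in range(tau)]
    tables = []
    for g in gs:
        table = defaultdict(list)      # only non-empty buckets get stored
        for p in P:
            table[g(p)].append(p)
        tables.append(table)
    return gs, tables
```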

  39. LSH: ANN Algorithm • Hash q into each of g_1, …, g_τ • Check colliding points for an ANN • Stop if there are more than 4τ collisions and return fail [figure: q hashed into tables g_1, g_2, …, g_τ]
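
The matching query loop (again our own sketch, reusing `hamming` and `build_tables` from above; the 4τ cutoff is the collision limit from the slides):

```python
def query(q, gs, tables, R, tau):
    """Probe q's bucket in every table; give up after 4*tau collisions."""
    seen = 0
    for g, table in zip(gs, tables):
        for p in table.get(g(q), []):
            if hamming(p, q) <= R:     # R = r*(1+eps): p is an ANN, done
                return p
            seen += 1
            if seen > 4 * tau:         # too many collisions: return fail
                return None
    return None
```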

  40. Details…

  41. Choosing Parameters • Choose k and τ to ensure a constant probability of: • finding an ANN if there is a NN • few collisions (< 4τ) when there is no ANN • Define: ρ = ln(1/α) / ln(1/β) • Choose: k = ln n / ln(1/β), τ = 2 n^ρ
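
Putting those formulas into code (a hedged sketch of ours; the sample inputs and the ceilings are our choices, not the slides’):

```python
import math

def lsh_parameters(n, d, r, eps):
    alpha = 1 - r / d                    # collision prob. within radius r
    beta = 1 - r * (1 + eps) / d         # collision prob. beyond r*(1+eps)
    rho = math.log(1 / alpha) / math.log(1 / beta)   # rho = ln(1/a)/ln(1/b)
    k = math.ceil(math.log(n) / math.log(1 / beta))  # k = ln n / ln(1/beta)
    tau = math.ceil(2 * n ** rho)                    # tau = 2 * n**rho
    return k, tau, rho

print(lsh_parameters(n=100_000, d=128, r=8, eps=1.0))
```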
