Approximate Nearest Neighbors Search in High Dimensions and Locality-Sensitive Hashing

PAPERS
• Piotr Indyk, Rajeev Motwani: Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. STOC 1998.
• Eyal Kushilevitz, Rafail Ostrovsky, Yuval Rabani: Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces. SIAM J. Comput., 2000.
• Mayur Datar, Nicole Immorlica, Piotr Indyk, Vahab S. Mirrokni: Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. Symposium on Computational Geometry 2004.
• Alexandr Andoni, Mayur Datar, Nicole Immorlica, Vahab S. Mirrokni, Piotr Indyk: Locality-Sensitive Hashing Using Stable Distributions. In: Nearest Neighbor Methods in Learning and Vision: Theory and Practice, 2006.

CS 468 | Geometric Algorithms
Aneesh Sharma, Michael Wand
Overview
• Introduction
• Locality Sensitive Hashing (Aneesh)
• Hash Functions Based on p-Stable Distributions (Michael)
Overview
• Introduction
  • Nearest neighbor search problems
  • Higher dimensions
  • Johnson-Lindenstrauss lemma
• Locality Sensitive Hashing (Aneesh)
• Hash Functions Based on p-Stable Distributions (Michael)
Problem
Problem Statement
Today's talks: NN-search in high-dimensional spaces
• Given
  • a point set P = {p_1, …, p_n}
  • a query point q
• Find
  • the [ε-approximate] nearest neighbor to q from P
• Goals:
  • Sublinear query time
  • "Reasonable" preprocessing time & space
  • "Reasonable" growth in d (exponential not acceptable)
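As a point of reference, the exact problem is always solvable by a linear scan. The sketch below (illustrative code, not from the talk) makes the O(nd) query cost explicit; that is the cost the techniques in this talk aim to beat.

```python
# Brute-force baseline: O(n d) work per query.
import numpy as np

def nn_linear_scan(P, q):
    """Return the index of the exact nearest neighbor of q in P."""
    dists = np.linalg.norm(P - q, axis=1)   # n distance computations
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
P = rng.random((10_000, 64))   # n = 10,000 points in d = 64 dimensions
q = rng.random(64)
print(nn_linear_scan(P, q))
```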
Applications
Example Application: Feature Spaces
• Vectors x ∈ R^d represent characteristic features of objects
• There are often many features
• Use the nearest neighbor rule for classification / recognition
[Figure: 2D feature space (top speed vs. mileage) with labeled classes sports car, sedan, SUV, and an unlabeled query point "?"]
Applications
"Real World" Example: Image Completion
[Figure: image completion example]
Applications
"Real World" Example: Image Completion
• Iteratively fill in pixels with the best match (+ multi-scale)
• Typically 5×5 … 9×9 neighborhoods, i.e., dimension 25 … 81
• Performance is limited by nearest neighbor search
• 3D version: dimension 81 … 729
Higher Dimensions
Higher Dimensions are Weird
Issues with High-Dimensional Spaces:
• d-dimensional space: d independent neighboring directions at each point
• The volume-distance ratio explodes: vol(r) ∈ Θ(r^d)
[Figure: balls for d = 1, d = 2, d = 3, …, d → ∞]
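A quick numeric illustration of this effect (a sketch; the helper name is ours): the ball inscribed in the unit cube occupies a vanishing fraction of the cube's volume as d grows.

```python
# Fraction of the unit cube [0,1]^d covered by the inscribed ball of radius 1/2.
import math

def ball_fraction(d, r=0.5):
    """Volume of the radius-r ball divided by the unit cube's volume (= 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d

for d in (1, 2, 3, 10, 50):
    print(d, ball_fraction(d))   # 1.0, 0.785, 0.524, ~0.0025, ~1.5e-28
```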
No Grid Tricks
Regular Subdivision Techniques Fail
• Regular k-grids contain k^d cells
• The "grid trick" does not work
• Adaptive grids usually do not help either
• Conventional integration with k subdivisions per axis becomes infeasible (⇒ Monte Carlo approximation)
• Finite element function representations become infeasible
Higher Dimensions are Weird
More Weird Effects:
• Dart-throwing anomaly
• Normal distributions gather probability mass in thin shells [Bishop 95]
  [Figure: norm histograms for d = 1..200]
• Nearest neighbor distance ~ farthest neighbor distance [Beyer et al. 99]
  • For unstructured points (e.g., i.i.d. random)
  • Not true for certain classes of structured data
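The thin-shell effect is easy to reproduce numerically; a minimal sketch (assuming nothing beyond NumPy), showing that the norms of standard Gaussian samples concentrate around √d while their spread stays roughly constant:

```python
# Norms of d-dimensional standard Gaussian samples concentrate in a thin shell
# of radius ~ sqrt(d) whose width does not grow with d.
import numpy as np

rng = np.random.default_rng(0)
for d in (1, 10, 100, 200):
    norms = np.linalg.norm(rng.standard_normal((100_000, d)), axis=1)
    print(f"d={d:4d}  mean |x| = {norms.mean():7.3f}  "
          f"std = {norms.std():.3f}  sqrt(d) = {d**0.5:.3f}")
```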
Johnson-Lindenstrauss Lemma
Johnson-Lindenstrauss Lemma
JL-Lemma: [Dasgupta et al. 99]
• Point set P in R^d, n := #P
• There is f: R^d → R^k with k ∈ O(ε⁻² ln n) (specifically, k ≥ 4(ε²/2 − ε³/3)⁻¹ ln n)
• …that preserves all inter-point distances up to a factor of (1+ε)
• A random orthogonal linear projection works with probability ≥ 1 − 1/n
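A minimal numerical sketch of the lemma (illustrative names; it uses a Gaussian random projection, which satisfies the same distortion bound as the orthogonal projection stated above):

```python
# Random projection R^d -> R^k with k from the JL bound; check that all
# pairwise distance ratios land in [1 - eps, 1 + eps] (holds w.h.p.).
import numpy as np

def pairwise_dists(X):
    """All pairwise Euclidean distances (upper triangle) via the Gram trick."""
    G = X @ X.T
    sq = np.diag(G)
    D2 = sq[:, None] + sq[None, :] - 2.0 * G
    iu = np.triu_indices(len(X), k=1)
    return np.sqrt(np.maximum(D2[iu], 0.0))

rng = np.random.default_rng(0)
n, d, eps = 200, 10_000, 0.25
k = int(np.ceil(4 * np.log(n) / (eps**2 / 2 - eps**3 / 3)))  # bound from the lemma

P = rng.standard_normal((n, d))
f = rng.standard_normal((d, k)) / np.sqrt(k)   # random linear map R^d -> R^k
ratios = pairwise_dists(P @ f) / pairwise_dists(P)
print(f"k = {k}, distortion in [{ratios.min():.3f}, {ratios.max():.3f}]")
```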
This Means…
What does the JL-Lemma imply?
• Pairwise distances in a small point set P (sub-exponential in d) can be well-preserved in a low-dimensional embedding
What does it not say?
• It does not imply that the points themselves are well-represented (just the pairwise distances)
Experiment
Intuition
Difference Vectors
• Normalize (relative error)
• The pole yields a bad approximation
• The non-pole area is much larger (high dimension)
• Need a large number of poles (exponential in d)
[Figure: sphere of difference vectors diff = u − v, with good/bad projection regions and a no-go area around the pole]
Overview
• Introduction
• Locality Sensitive Hashing
  • Approximate Nearest Neighbors
  • Big picture
  • LSH on unit hypercube
    - Setup
    - Main idea
    - Analysis
    - Results
• Hash Functions Based on p-Stable Distributions
Approximate Nearest Neighbors
ANN: Decision Version
Input: P, q, r
Output:
• If there is a point within distance r of q (a NN), return yes and output a point within distance r(1+ε) (an ANN)
• If there is no point within distance r(1+ε) of q (no ANN), return no
• Otherwise, either answer is acceptable
ANN: Decision Version
General ANN reduces to the decision version (c-PLEB) plus a binary search over the radius r
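To make the reduction concrete, here is a minimal sketch. The `pleb` oracle is a hypothetical stand-in for the decision procedure above; the published reductions search over a prepared sequence of radii rather than a plain interval bisection.

```python
# Sketch: recover a general ANN from a radius-r decision oracle.
# pleb(q, r) is assumed to return an approximate neighbor or None.
def ann_by_binary_search(pleb, q, r_min, r_max, steps=32):
    best = None
    lo, hi = r_min, r_max
    for _ in range(steps):
        mid = (lo + hi) / 2
        hit = pleb(q, mid)          # decision version at radius mid
        if hit is not None:
            best, hi = hit, mid     # a neighbor exists within ~mid: shrink
        else:
            lo = mid                # no neighbor within mid: grow
    return best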
ANN: Previous Results
            Query time       Space used       Preprocessing time
Voronoi     O(2^d log n)     O(n^(d/2))       O(n^(d/2))
Kd-tree     O(2^d log n)     O(n)             O(n log n)
LSH         O(n^ρ log n)     O(n^(1+ρ))       O(n^(1+ρ) log n)
LSH: Big Picture
Locality Sensitive Hashing
• Remember: solving decision ANN
• Input:
  • Number of points: n
  • Number of dimensions: d
  • Point set: P
  • Query point: q
LSH: Big Picture
• Family of hash functions:
  • Close points map to the same buckets
  • Faraway points map to different buckets
• Choose a random function and hash P
• Only store non-empty buckets
LSH: Big Picture
• Hash q into the table
• Test every point in q's bucket for an ANN
• Problem:
  • q's bucket may be empty
LSH: Big Picture
• Solution:
  • Use a number of hash tables!
  • We are done if any ANN is found
LSH: Big Picture
• Problem:
  • Poor resolution ⇒ too many candidates!
  • Stop after reaching a limit; this fails only with small probability
LSH: Big Picture
• Want to find a hash function such that:
  • If u ∈ B(q, r) then Pr[h(u) = h(q)] ≥ α
  • If u ∉ B(q, R) then Pr[h(u) = h(q)] ≤ β
  • where r < R and α >> β
• h is randomly picked from a family
• Choose R = r(1 + ε)
LSH on Unit Hypercube
Setup: Unit Hypercube
• Points lie on the hypercube: H^d = {0,1}^d
• Every point is a binary string
• Hamming distance (r):
  • Number of differing coordinates
Main Idea
Hash Functions for the Hypercube
• Define family F:
  Given: hypercube H^d, point b = (b_1, …, b_d)
  h_i ∈ F: h_i(b) = b_i, for b = (b_1, …, b_d) ∈ H^d and i = 1, …, d
  α = 1 − r/d,   β = 1 − r(1+ε)/d
• Intuition: compare a random coordinate
• Called an (r, r(1+ε), α, β)-sensitive family
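A sketch of the family F in code, with an empirical check of the collision probability (helper names and parameters are illustrative):

```python
# F: project onto one random coordinate. Two points collide iff they agree
# on the sampled coordinate, so Pr[h(u) = h(q)] = 1 - hamming(u, q)/d.
import random

def sample_h(d):
    """Draw h_i from F uniformly at random."""
    i = random.randrange(d)
    return lambda b: b[i]

d, r, trials = 100, 10, 100_000
q = (0,) * d
u = (1,) * r + (0,) * (d - r)        # hamming(u, q) = r
hits = 0
for _ in range(trials):
    h = sample_h(d)                  # the SAME h is applied to both points
    hits += h(u) == h(q)
print(hits / trials, 1 - r / d)      # both close to 0.9
```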
Hash Functions for the Hypercube
• Define family G:
  Given: b ∈ H^d, family F
  g ∈ G: g: {0,1}^d → {0,1}^k, g(b) = (h_{i_1}(b), …, h_{i_k}(b)), for h_{i_j} ∈ F
  α' = α^k = (1 − r/d)^k,   β' = β^k = (1 − r(1+ε)/d)^k
• Intuition: compare k random coordinates
• Choose k later: logarithmic in n (cf. the J-L lemma)
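The corresponding sketch for G, built from F above:

```python
# G: concatenate k independent draws from F. Collision probability drops to
# (1 - r/d)^k for near points and (1 - r(1+eps)/d)^k for far points.
import random

def sample_g(d, k):
    """Draw g from G: compare k random coordinates (with replacement)."""
    idx = [random.randrange(d) for _ in range(k)]
    return lambda b: tuple(b[i] for i in idx)
```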
Constructing Hash Tables
• Choose g_1, …, g_τ uniformly at random from G
• Construct τ hash tables; hash P into each
• Will choose τ later
[Figure: τ hash tables keyed by g_1, g_2, …, g_τ]
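A sketch of this preprocessing step (helper names are ours; `sample_g` is as in the previous sketch):

```python
# Build tau independent tables, each keyed by its own g drawn from G.
from collections import defaultdict
import random

def sample_g(d, k):                   # repeated from the previous sketch
    idx = [random.randrange(d) for _ in range(k)]
    return lambda b: tuple(b[i] for i in idx)

def build_tables(P, d, k, tau):
    """Hash the point set P into tau independent tables."""
    gs = [sample_g(d, k) for _ in range(tau)]
    tables = []
    for g in gs:
        table = defaultdict(list)     # only non-empty buckets are stored
        for idx, p in enumerate(P):
            table[g(p)].append(idx)   # bucket point indices by hash value
        tables.append(table)
    return gs, tables
```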
LSH: ANN Algorithm
• Hash q into each of g_1, …, g_τ
• Check colliding points for an ANN
• Stop if there are more than 4τ collisions; return fail
[Figure: querying the τ hash tables]
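A matching sketch of the query procedure, continuing the structures built above:

```python
# Probe q's bucket in each table, test candidates against radius r(1+eps),
# and give up after 4*tau collisions (fails with small probability).
def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def query(q, gs, tables, P, r, eps, tau):
    checked = 0
    for g, table in zip(gs, tables):
        for idx in table.get(g(q), []):
            if hamming(P[idx], q) <= r * (1 + eps):
                return idx            # found an ANN within r(1+eps)
            checked += 1
            if checked > 4 * tau:
                return None           # fail
    return None
```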
Details…
Choosing Parameters
• Choose k and τ to ensure a constant probability of:
  • Finding an ANN if there is a NN
  • Few (< 4τ) collisions when there is no ANN

Define: ρ = ln(1/α) / ln(1/β)
Choose: k = ln n / ln(1/β),   τ = 2 n^ρ
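As straight arithmetic, these choices look as follows (a sketch; the sample values are illustrative):

```python
# Compute rho, k, tau from the slide's formulas for given n, d, r, eps.
import math

def lsh_params(n, d, r, eps):
    alpha = 1 - r / d
    beta = 1 - r * (1 + eps) / d
    rho = math.log(1 / alpha) / math.log(1 / beta)   # rho < 1 whenever eps > 0
    k = math.ceil(math.log(n) / math.log(1 / beta))
    tau = math.ceil(2 * n ** rho)
    return k, tau, rho

print(lsh_params(n=100_000, d=256, r=16, eps=0.5))   # ~ (117, 3760, 0.66)
```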