Structured filters
• Construct concatenated code
• Normalize (only for example)
• Construct Voronoi cells
• Defines partition
• ...with efficient decoding
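The construction above can be illustrated with a minimal sketch. It assumes the concatenated code is the product of several small random spherical codes, one per block of coordinates, so that finding the Voronoi cell of a vector reduces to an independent nearest-code-word search in each block. The function names, the block sizes, and the use of Gaussian sampling for the random points are illustrative assumptions, not details taken from the slides.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a uniformly random unit vector by normalizing a Gaussian sample."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def build_block_codes(d, num_blocks, words_per_block, seed=0):
    """Hypothetical helper: one small random spherical code per block.

    Assumes d is divisible by num_blocks. The concatenated code is the set of
    all combinations (one word per block), which is never stored explicitly.
    """
    rng = random.Random(seed)
    block_dim = d // num_blocks
    return [
        [random_unit_vector(block_dim, rng) for _ in range(words_per_block)]
        for _ in range(num_blocks)
    ]


def decode(x, codes):
    """Return the index tuple of the concatenated code word closest to x.

    The inner product with a concatenated word is the sum of the per-block
    inner products, so the maximum is found block by block.
    """
    block_dim = len(codes[0][0])
    cell = []
    for b, code in enumerate(codes):
        chunk = x[b * block_dim:(b + 1) * block_dim]
        scores = [sum(a * c for a, c in zip(chunk, word)) for word in code]
        cell.append(max(range(len(code)), key=scores.__getitem__))
    return tuple(cell)
```

With m blocks of k words each, this sketch inspects m·k block code words while the code itself has k^m words, which is the sense in which the structure makes decoding cheap.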
Structured filters
Techniques
• Idea 1: Increase number of regions to $2^{\Theta(d)}$
◮ Number of hash tables increases to $2^{\Theta(d)}$ (ok for $n = 2^{\Theta(d)}$)
◮ Decoding cost potentially too large
• Idea 2: Use structured codes for random regions
◮ Spherical/Voronoi LSH with dependent random points
◮ Concatenated code of $\log d$ low-dim. spherical codes
◮ Allows for efficient list-decoding
• Idea 3: Replace partitions with filters
◮ Relaxation: filters need not partition the space
◮ Simplified analysis
◮ Might not be needed to achieve improvement
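To make "efficient list-decoding" of a concatenated code concrete, here is a small sketch. It is a plain branch-and-bound enumeration and not necessarily the procedure used in the talk. It assumes the per-block inner products have already been computed, and that a code word is relevant when its summed score reaches a threshold; any normalization factor is assumed to be folded into that threshold. The name `list_decode` and its arguments are hypothetical.

```python
def list_decode(block_scores, threshold):
    """Return all index tuples whose summed block scores reach the threshold.

    block_scores[b][j] is the inner product of block b of the input vector
    with code word j of the b-th low-dimensional code (assumed precomputed).
    """
    m = len(block_scores)

    # best_rest[b] is the largest score still achievable from blocks b..m-1.
    best_rest = [0.0] * (m + 1)
    for b in range(m - 1, -1, -1):
        best_rest[b] = best_rest[b + 1] + max(block_scores[b])

    results = []

    def extend(b, prefix, partial):
        # Prune: even the best completion cannot reach the threshold.
        if partial + best_rest[b] < threshold:
            return
        if b == m:
            results.append(tuple(prefix))
            return
        for j, score in enumerate(block_scores[b]):
            extend(b + 1, prefix + [j], partial + score)

    extend(0, [], 0.0)
    return results
```

Because a vector may match several code words or none, the output corresponds to the filter relaxation of Idea 3 rather than to a partition.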
Structured filters
Results
• For random sparse settings ($n = 2^{o(d)}$), query time $O(n^{\rho})$ with
$$\rho = \frac{1}{2c^2 - 1} \left(1 + o_d(1)\right).$$
• For random dense settings ($n = 2^{\kappa d}$ with small $\kappa$), we obtain
$$\rho = \frac{1 - \kappa}{2c^2 - 1} \left(1 + o_{d,\kappa}(1)\right).$$
• For random dense settings ($n = 2^{\kappa d}$ with large $\kappa$), we obtain
$$\rho = \frac{-1}{2\kappa} \log\left(1 - \frac{1}{2c^2 - 1}\right) \left(1 + o_d(1)\right).$$
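The three exponents can be compared numerically with a short sketch that evaluates only the leading terms and drops the $(1 + o(1))$ factors. The slides do not state the base of the logarithm; base 2 is assumed here because $n = 2^{\kappa d}$, and it is exposed as a parameter so the assumption is easy to change. The function names are illustrative.

```python
import math


def rho_sparse(c):
    """Leading term for n = 2^{o(d)}: 1 / (2c^2 - 1)."""
    return 1.0 / (2.0 * c * c - 1.0)


def rho_dense_small_kappa(c, kappa):
    """Leading term for n = 2^{kappa d}, small kappa: (1 - kappa) / (2c^2 - 1)."""
    return (1.0 - kappa) / (2.0 * c * c - 1.0)


def rho_dense_large_kappa(c, kappa, log_base=2.0):
    """Leading term for n = 2^{kappa d}, large kappa.

    log_base=2 is an assumption; the slide only writes "log".
    Requires c > 1 so that the argument of the logarithm is positive.
    """
    inner = 1.0 - 1.0 / (2.0 * c * c - 1.0)
    return -math.log(inner, log_base) / (2.0 * kappa)
```

For example, at $c = \sqrt{2}$ the sparse exponent is $1/3$, and the small-$\kappa$ expression reduces to the sparse one as $\kappa \to 0$.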
Asymmetric nearest neighbors
Previous results: symmetric NNS
• Query time: $O(n^{\rho})$
• Update time: $O(n^{\rho})$
• Preprocessing time: $O(n^{1+\rho})$
• Space complexity: $O(n^{1+\rho})$
Can we get a tradeoff between these costs?
Asymmetric nearest neighbors
• Voronoi regions
• Spherical cap
• Cap height $\alpha$
• Smaller $\alpha$ ⇒ larger caps, more work
• Larger $\alpha$ ⇒ smaller caps, less work
• $\alpha_q > \alpha_u$ ⇒ faster queries, slower updates
• $\alpha_q < \alpha_u$ ⇒ slower queries, faster updates
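The role of the two cap heights can be sketched as follows. The sketch assumes a vector lies in the cap around a center when their inner product is at least the cap height, and it finds matching caps by a linear scan over all centers, which is for illustration only and ignores the efficient decoding discussed earlier. The class `CapIndex` and the helper names are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def caps_containing(x, centers, alpha):
    """Indices of the caps (center v, height alpha) that contain x."""
    return [i for i, v in enumerate(centers) if dot(x, v) >= alpha]


class CapIndex:
    """Toy index with separate cap heights for updates and queries."""

    def __init__(self, centers, alpha_u, alpha_q):
        self.centers = centers
        self.alpha_u = alpha_u  # cap height used when inserting points
        self.alpha_q = alpha_q  # cap height used when querying
        self.buckets = [[] for _ in centers]

    def update(self, point):
        # A larger alpha_u means smaller caps, so fewer buckets are touched.
        for i in caps_containing(point, self.centers, self.alpha_u):
            self.buckets[i].append(point)

    def query(self, q):
        # A larger alpha_q means fewer buckets are inspected per query.
        candidates = []
        for i in caps_containing(q, self.centers, self.alpha_q):
            candidates.extend(self.buckets[i])
        return candidates
```

A point that shares several caps with the query appears more than once in the candidate list; a real implementation would deduplicate and then compare the candidates to the query directly.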
Asymmetric nearest neighbors
Results
• Minimize space ($\alpha_q/\alpha_u = \cos\theta$)
◮ General expressions: $\rho_q = (2c^2 - 1)/c^4$, $\rho_u = 0$
◮ Small $c = 1 + \varepsilon$: $\rho_q = 1 - 4\varepsilon^2 + O(\varepsilon^3)$, $\rho_u = 0$
◮ Large $c \to \infty$: $\rho_q = 2/c^2 + O(1/c^4)$, $\rho_u = 0$
• Balance costs ($\alpha_q/\alpha_u = 1$)
◮ General expressions: $\rho_q = 1/(2c^2 - 1)$, $\rho_u = 1/(2c^2 - 1)$
◮ Small $c = 1 + \varepsilon$: $\rho_q = 1 - 4\varepsilon + O(\varepsilon^2)$, $\rho_u = 1 - 4\varepsilon + O(\varepsilon^2)$
◮ Large $c \to \infty$: $\rho_q = 1/(2c^2) + O(1/c^4)$, $\rho_u = 1/(2c^2) + O(1/c^4)$
• Minimize time ($\alpha_q/\alpha_u = 1/\cos\theta$)
◮ General expressions: $\rho_q = 0$, $\rho_u = (2c^2 - 1)/(c^2 - 1)^2$
◮ Small $c = 1 + \varepsilon$: $\rho_q = 0$, $\rho_u = 1/(4\varepsilon^2) + O(1/\varepsilon)$
◮ Large $c \to \infty$: $\rho_q = 0$, $\rho_u = 2/c^2 + O(1/c^4)$
Query time $O(n^{\rho_q})$, update time $O(n^{\rho_u})$, preprocessing time $O(n^{1+\rho_u})$, space complexity $O(n^{1+\rho_u})$
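As a quick sanity check on the general expressions and their expansions, the following sketch evaluates the exponent pairs at the three named points of the tradeoff. The function names are illustrative, and only the closed-form general expressions from the slide are used.

```python
def rho_min_space(c):
    """(rho_q, rho_u) when space is minimized: alpha_q / alpha_u = cos(theta)."""
    return ((2.0 * c * c - 1.0) / c ** 4, 0.0)


def rho_balanced(c):
    """(rho_q, rho_u) when costs are balanced: alpha_q / alpha_u = 1."""
    r = 1.0 / (2.0 * c * c - 1.0)
    return (r, r)


def rho_min_time(c):
    """(rho_q, rho_u) when query time is minimized: alpha_q / alpha_u = 1 / cos(theta).

    Requires c > 1, since the update exponent diverges as c approaches 1.
    """
    return (0.0, (2.0 * c * c - 1.0) / (c * c - 1.0) ** 2)


if __name__ == "__main__":
    for c in (1.01, 2.0, 10.0):
        print(c, rho_min_space(c), rho_balanced(c), rho_min_time(c))
```

At $c = 2$ the three pairs are $(7/16, 0)$, $(1/7, 1/7)$ and $(0, 7/9)$, which shows how query cost can be traded against update and space cost.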
Asymmetric nearest neighbors
Tradeoffs
[Figure with labels $\alpha_q$ and $\alpha_u$]
Conclusions
Main result: Allow using more regions with list-decodable codes
• For $n = 2^{o(d)}$, non-asymptotic improvement
• For $n = 2^{\Theta(d)}$, asymptotic improvement
• Corollary: Lower bounds for $n = 2^{o(d)}$ do not hold for $n = 2^{\Theta(d)}$
• Improved tradeoffs between query and update complexities