
Proximity in the Age of Distraction: Robust Approximate Nearest Neighbor Search

Sariel Har-Peled (UIUC), Sepideh Mahabadi (MIT)


  1. Proximity in the Age of Distraction: Robust Approximate Nearest Neighbor Search. Sariel Har-Peled (UIUC), Sepideh Mahabadi (MIT)

  2. Nearest Neighbor Problem

  3. Nearest Neighbor: Dataset of $n$ points $P$ in a metric space $(X, d_X)$, e.g. $\mathbb{R}^d$

  4. Nearest Neighbor: Dataset of $n$ points $P$ in a metric space $(X, d_X)$, e.g. $\mathbb{R}^d$. A query point $q$ comes online

  5. Nearest Neighbor: Dataset of $n$ points $P$ in a metric space $(X, d_X)$, e.g. $\mathbb{R}^d$. A query point $q$ comes online. Goal: • Find the nearest data point $p^*$

  6. Nearest Neighbor: Dataset of $n$ points $P$ in a metric space $(X, d_X)$, e.g. $\mathbb{R}^d$. A query point $q$ comes online. Goal: • Find the nearest data point $p^*$ • Do it in sub-linear time and small space

  7. Approximate Nearest Neighbor: Dataset of $n$ points $P$ in a metric space $(X, d_X)$, e.g. $\mathbb{R}^d$. A query point $q$ comes online. Goal: • Find the nearest data point $p^*$ • Do it in sub-linear time and small space • Approximate Nearest Neighbor ─ If the optimal distance is $r$, report a point $p$ within distance $cr$ for $c = 1 + \epsilon$

  8. Approximate Nearest Neighbor: Dataset of $n$ points $P$ in a metric space $(X, d_X)$, e.g. $\mathbb{R}^d$. A query point $q$ comes online. Goal: • Find the nearest data point $p^*$ • Do it in sub-linear time and small space • Approximate Nearest Neighbor ─ If the optimal distance is $r$, report a point $p$ within distance $cr$ for $c = 1 + \epsilon$ ─ For Hamming (and $L_1$) query time is $n^{1/O(c)}$ [IM98] ─ and for Euclidean ($L_2$) it is $n^{1/O(c^2)}$ [AI08]
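The $c$-approximation guarantee stated above can be checked against a brute-force scan. The following is a minimal sketch, assuming the $L_1$ metric; the helper names (`l1`, `nearest`, `is_c_approximate`) are hypothetical and do not come from the slides. It is a linear-time baseline for reference, not one of the sub-linear data structures cited above.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest(points, q, dist=l1):
    """Exact nearest neighbor by linear scan over all points."""
    return min(points, key=lambda p: dist(p, q))


def is_c_approximate(points, q, candidate, c, dist=l1):
    """True if `candidate` lies within c times the optimal distance r."""
    r = dist(nearest(points, q, dist), q)
    return dist(candidate, q) <= c * r


points = [(0, 0), (3, 0), (10, 10)]
q = (1, 0)
exact = nearest(points, q)                          # optimal point, r = 1
ok = is_c_approximate(points, q, (3, 0), c=2)       # distance 2 <= 2 * 1
too_far = is_c_approximate(points, q, (3, 0), c=1.5)
```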

  9. Applications of NN: Searching for the closest object

  10. Robust NN Problem

  11. Robustness: The data points are:

  12. Robustness: The data points are: • corrupted, noisy ─ Image denoising

  13. Robustness: The data points are: • corrupted, noisy ─ Image denoising • incomplete ─ Recommendation: sparse matrix [Figure: Users × Movies matrix with entries 1, 0, and missing values (-)]

  14. Robustness: The data points are: • corrupted, noisy ─ Image denoising • incomplete ─ Recommendation: sparse matrix [Figure: Users × Movies matrix with entries 1, 0, and missing values (-)] • irrelevant ─ Occluded image

  15. The Robust NN problem • Dataset of $n$ points $P$ in $\mathbb{R}^d$. Example ($n=3$): $p_1 = (3,4,0,5)$, $p_2 = (3,2,1,2)$, $p_3 = (2,3,3,1)$

  16. The Robust NN problem • Dataset of $n$ points $P$ in $\mathbb{R}^d$ • A parameter $k$. Example ($n=3$, $k=2$): $p_1 = (3,4,0,5)$, $p_2 = (3,2,1,2)$, $p_3 = (2,3,3,1)$

  17. The Robust NN problem • Dataset of $n$ points $P$ in $\mathbb{R}^d$ • A parameter $k$ • A query point $q$ comes online • Find the closest point after removing $k$ coordinates. Example ($n=3$, $k=2$): $q = (1,2,1,5)$, $p_1 = (3,4,0,5)$, $p_2 = (3,2,1,2)$, $p_3 = (2,3,3,1)$

  18. The Robust NN problem • Dataset of $n$ points $P$ in $\mathbb{R}^d$ • A parameter $k$ • A query point $q$ comes online • Find the closest point after removing $k$ coordinates. Example ($n=3$, $k=2$): $q = (1,2,1,5)$, $p_1 = (3,4,0,5)$ (dist=1), $p_2 = (3,2,1,2)$ (dist=0), $p_3 = (2,3,3,1)$ (dist=2)

  20. The Robust NN problem • Dataset of $n$ points $P$ in $\mathbb{R}^d$ • A parameter $k$ • A query point $q$ comes online • Find the closest point after removing $k$ coordinates ─ Different set of coordinates for different points ─ Applying this naively would require $\binom{d}{k} \approx d^k$. Example ($n=3$, $k=2$): $q = (1,2,1,5)$, $p_1 = (3,4,0,5)$ (dist=1), $p_2 = (3,2,1,2)$ (dist=0), $p_3 = (2,3,3,1)$ (dist=2)
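The distances in the example above can be reproduced with a short brute-force computation. This is a minimal sketch, assuming the $L_1$ metric (which matches the numbers on the slide); the function names `robust_l1` and `robust_nn` are hypothetical. For a single pair of points the best $k$ coordinates to ignore are simply those with the largest absolute difference, and each data point may pick a different set, which is why no single fixed projection works.

```python
def robust_l1(q, p, k):
    """L1 distance between q and p after ignoring the k coordinates
    with the largest absolute difference (the optimal k to drop)."""
    diffs = sorted(abs(a - b) for a, b in zip(q, p))
    return sum(diffs[:max(0, len(diffs) - k)])


def robust_nn(points, q, k):
    """Linear scan: (index, distance) of the robust nearest neighbor."""
    return min(((i, robust_l1(q, p, k)) for i, p in enumerate(points)),
               key=lambda t: t[1])


P = [(3, 4, 0, 5), (3, 2, 1, 2), (2, 3, 3, 1)]
q = (1, 2, 1, 5)
dists = [robust_l1(q, p, 2) for p in P]  # [1, 0, 2], as on the slide
best = robust_nn(P, q, 2)                # (1, 0): index 1 is p_2
```

This scan takes time linear in $n$ per query; the point of the talk is to avoid both that and the $\binom{d}{k}$ blow-up of the naive approach.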

  21. Budgeted Version • Dataset of $n$ points $P$ in $\mathbb{R}^d$ • $d$ weights $w = (w_1, w_2, \ldots, w_d) \in [0,1]^d$. Example ($n=3$): $w = (0.5, 0.5, 0.8, 0.3)$, $p_1 = (1,4,0,3)$, $p_2 = (3,2,4,2)$, $p_3 = (4,6,3,4)$

  22. Budgeted Version • Dataset of $n$ points $P$ in $\mathbb{R}^d$ • $d$ weights $w = (w_1, w_2, \ldots, w_d) \in [0,1]^d$ • A query point $q$ comes online • Find the closest point after removing a set of coordinates $B$ of weight at most $1$. Example ($n=3$): $w = (0.5, 0.5, 0.8, 0.3)$, $q = (1,2,5,5)$, $p_1 = (1,4,0,3)$, $p_2 = (3,2,4,2)$, $p_3 = (4,6,3,4)$

  24. Budgeted Version • Dataset of $n$ points $P$ in $\mathbb{R}^d$ • $d$ weights $w = (w_1, w_2, \ldots, w_d) \in [0,1]^d$ • A query point $q$ comes online • Find the closest point after removing a set of coordinates $B$ of weight at most $1$. Example ($n=3$): $w = (0.5, 0.5, 0.8, 0.3)$, $q = (1,2,5,5)$, $p_1 = (1,4,0,3)$ (dist=4), $p_2 = (3,2,4,2)$ (dist=1), $p_3 = (4,6,3,4)$ (dist=3)
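The budgeted distances in this example can be verified by exhaustive search over coordinate subsets. The sketch below assumes the $L_1$ metric and a budget of 1, consistent with the numbers on the slide; the name `budgeted_l1` is hypothetical. It enumerates all $2^d$ subsets, so it is only meant for tiny $d$ and is not the algorithm from the talk.

```python
from itertools import combinations


def budgeted_l1(q, p, w, budget=1.0):
    """Smallest L1 distance between q and p after ignoring a set B of
    coordinates whose total weight is at most `budget`."""
    d = len(q)
    diffs = [abs(a - b) for a, b in zip(q, p)]
    total = sum(diffs)
    best = total  # B = empty set is always allowed
    for size in range(1, d + 1):
        for B in combinations(range(d), size):
            # Small tolerance guards against floating-point round-off.
            if sum(w[i] for i in B) <= budget + 1e-9:
                best = min(best, total - sum(diffs[i] for i in B))
    return best


w = (0.5, 0.5, 0.8, 0.3)
P = [(1, 4, 0, 3), (3, 2, 4, 2), (4, 6, 3, 4)]
q = (1, 2, 5, 5)
dists = [budgeted_l1(q, p, w) for p in P]  # [4, 1, 3], as on the slide
```

For instance, for $p_2$ the best choice is to ignore coordinates 1 and 4 (weight $0.5 + 0.3 = 0.8$), which leaves a distance of 1.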

  26. Results: Bicriterion Approximation, for $L_1$ norm • Suppose that for $p^* \in P$ we have $\mathrm{dist}(q, p^*) = r$ after ignoring $k$ coordinates

  27. Results: Bicriterion Approximation, for $L_1$ norm • Suppose that for $p^* \in P$ we have $\mathrm{dist}(q, p^*) = r$ after ignoring $k$ coordinates • For $\delta \in (0,1)$ ─ Report a point $p$ s.t. $\mathrm{dist}(q, p) = O(r/\delta)$ after ignoring $O(k/\delta)$ coordinates. ─ Query time equals $n^\delta$ queries in a 2-ANN data structure

  28. Results: Bicriterion Approximation, for $L_1$ norm • Suppose that for $p^* \in P$ we have $\mathrm{dist}(q, p^*) = r$ after ignoring $k$ coordinates • For $\delta \in (0,1)$ ─ Report a point $p$ s.t. $\mathrm{dist}(q, p) = O(r/\delta)$ after ignoring $O(k/\delta)$ coordinates. ─ Query time equals $n^\delta$ queries in a 2-ANN data structure. Why not single criterion? • Equivalent to exact near neighbor in Hamming: there is a point within distance $r$ of the query iff there is a point within distance $0$ after ignoring $k = r$ coordinates
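The equivalence behind "Why not single criterion?" can be checked exhaustively on small binary inputs. The sketch below is illustrative only: the helper names and the tiny dataset are assumptions, not taken from the slides. For 0/1 vectors, a point is within Hamming distance $r$ of the query exactly when its $L_1$ distance drops to 0 after ignoring the $r$ coordinates with the largest differences.

```python
from itertools import product


def hamming(a, b):
    """Number of coordinates in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def robust_l1(q, p, k):
    """L1 distance after ignoring the k largest coordinate differences."""
    diffs = sorted(abs(a - b) for a, b in zip(q, p))
    return sum(diffs[:max(0, len(diffs) - k)])


def check_equivalence(points, d):
    """Compare both predicates for every binary query and every r in 0..d."""
    for q in product((0, 1), repeat=d):
        for r in range(d + 1):
            near = any(hamming(p, q) <= r for p in points)
            zero = any(robust_l1(q, p, r) == 0 for p in points)
            if near != zero:
                return False
    return True


points = [(0, 0, 0, 0), (1, 1, 0, 1)]
equivalent = check_equivalence(points, 4)
```

So an exact single-criterion answer (distance 0 with exactly $k$ ignored coordinates) would solve exact near neighbor in Hamming space, which motivates relaxing both the distance and the number of ignored coordinates.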

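  29. Results

      |     | Distance | #Ignored coordinates | Query time: #queries | Query time: query type |
      |-----|----------|----------------------|----------------------|------------------------|
      | Opt | $r$ | $k$ | | |

  30. Results

      |     | Distance | #Ignored coordinates | Query time: #queries | Query time: query type |
      |-----|----------|----------------------|----------------------|------------------------|
      | Opt | $r$ | $k$ | | |
      | $L_1$ | $O(r/\delta)$ | $O(k/\delta)$ | $n^\delta$ | 2-ANN |

  31. Results

      |     | Distance | #Ignored coordinates | Query time: #queries | Query time: query type |
      |-----|----------|----------------------|----------------------|------------------------|
      | Opt | $r$ | $k$ | | |
      | $L_1$ | $O(r/\delta)$ | $O(k/\delta)$ | $n^\delta$ | 2-ANN |
      | $L_p$ | $O(r(c + 1/\delta)^{1/p})$ | $O(k(c + 1/\delta))$ | $n^\delta$ | $c^{1/p}$-ANN |

  32. Results

      |     | Distance | #Ignored coordinates | Query time: #queries | Query time: query type |
      |-----|----------|----------------------|----------------------|------------------------|
      | Opt | $r$ | $k$ | | |
      | $L_1$ | $O(r/\delta)$ | $O(k/\delta)$ | $n^\delta$ | 2-ANN |
      | $L_p$ | $O(r(c + 1/\delta)^{1/p})$ | $O(k(c + 1/\delta))$ | $n^\delta$ | $c^{1/p}$-ANN |
      | $(1+\epsilon)$-approximation | $r(1+\epsilon)$ | $O(k/(\epsilon\delta))$ | $O(n^\delta/\epsilon)$ | $(1+\epsilon)$-ANN |

  33. Results

      |     | Distance | #Ignored coordinates | Query time: #queries | Query time: query type |
      |-----|----------|----------------------|----------------------|------------------------|
      | Opt | $r$ | $k$ | | |
      | $L_1$ | $O(r/\delta)$ | $O(k/\delta)$ | $n^\delta$ | 2-ANN |
      | $L_p$ | $O(r(c + 1/\delta)^{1/p})$ | $O(k(c + 1/\delta))$ | $n^\delta$ | $c^{1/p}$-ANN |
      | $(1+\epsilon)$-approximation | $r(1+\epsilon)$ | $O(k/(\epsilon\delta))$ | $O(n^\delta/\epsilon)$ | $(1+\epsilon)$-ANN |
      | Budgeted Version | $O(r)$ | Weight of $O(1)$ | $n^\delta$ | 2-ANN, $+\,O(n^\delta d^4)$ |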

  34. Algorithm

  35. High Level Algorithm. Theorem. If for a point $p^* \in P$, the $L_1$ distance of $q$ and $p^*$ is at most $r$ after removing $k$ coordinates, there exists an algorithm which reports a point $p$ whose distance to $q$ is $O(r/\delta)$ after removing $O(k/\delta)$ coordinates.

  36. High Level Algorithm. Theorem. If for a point $p^* \in P$, the $L_1$ distance of $q$ and $p^*$ is at most $r$ after removing $k$ coordinates, there exists an algorithm which reports a point $p$ whose distance to $q$ is $O(r/\delta)$ after removing $O(k/\delta)$ coordinates. • Cannot apply randomized dimensionality reduction, e.g. Johnson-Lindenstrauss

  37. High Level Algorithm. Theorem. If for a point $p^* \in P$, the $L_1$ distance of $q$ and $p^*$ is at most $r$ after removing $k$ coordinates, there exists an algorithm which reports a point $p$ whose distance to $q$ is $O(r/\delta)$ after removing $O(k/\delta)$ coordinates. • Cannot apply randomized dimensionality reduction, e.g. Johnson-Lindenstrauss • A set of randomized maps $f_1, f_2, \ldots, f_m : \mathbb{R}^d \to \mathbb{R}^{d'}$ • All of them map far points from the query to far points • At least one of them maps a close point to a close point
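The slide above does not say how the maps $f_i$ are built. One simple family with the "at least one map sends a close point to a close point" flavour, assumed here purely for illustration, is random coordinate subsampling: each map keeps every coordinate independently with some probability. The names `make_maps`, `project`, and `good_maps`, and the parameter `keep_prob`, are hypothetical and are not taken from the talk. The sketch only illustrates the "close stays close" property; it makes no claim about the "far stays far" property or about the parameter choices needed for the stated bounds.

```python
import random


def make_maps(d, m, keep_prob, seed=0):
    """m random coordinate masks; mask[i] is True if coordinate i is kept."""
    rng = random.Random(seed)
    return [[rng.random() < keep_prob for _ in range(d)] for _ in range(m)]


def project(x, mask):
    """Apply one map: keep only the coordinates selected by the mask."""
    return tuple(v for v, keep in zip(x, mask) if keep)


def l1(a, b):
    """L1 distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def good_maps(maps, bad_coords):
    """Indices of the maps that drop every coordinate in bad_coords."""
    return [i for i, mask in enumerate(maps)
            if not any(mask[j] for j in bad_coords)]


q = (1, 2, 1, 5)
p_star = (3, 2, 1, 2)   # agrees with q outside coordinates 0 and 3
bad = (0, 3)            # the k = 2 coordinates worth ignoring

maps = make_maps(d=4, m=16, keep_prob=0.5)
hits = good_maps(maps, bad)
# Under any map that drops both bad coordinates, p_star lands on top of q.
hit_dists = [l1(project(q, maps[i]), project(p_star, maps[i])) for i in hits]
```

Whenever a map happens to drop all of the corrupted coordinates, the projected distance between the query and $p^*$ is exactly the robust distance on the kept coordinates or less (0 in this example), so an ordinary ANN query in that projected space can find it.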
