Learning Nearest Neighbor Graphs from Noisy Distance Samples Blake Mason, Ardhendu Tripathy, & Robert Nowak
Motivation • Wish to learn the ‘most similar’ or ‘closest’ items to a given item from noisy measurements (image sources: amazon.com/discover, Fujitsu white paper) • We don’t know the given item a priori • We want to answer ‘closest’ queries for any item quickly!
The Nearest Neighbor Graph Problem Sharma et al. (2015)
Preliminaries and Notation
Outline of ANNTri
Elimination via the triangle inequality (figure: points i, j, k, l)
Triangle Inequality Bounds
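The idea behind this slide can be illustrated with a minimal sketch. It assumes we already hold confidence intervals on d(i, k) and d(k, j) and derives an interval on d(i, j) from the triangle inequality alone. The function names `triangle_bounds` and `can_eliminate` are hypothetical and the interval form is a simplification; this is not necessarily the exact bound used by ANNTri.

```python
def triangle_bounds(lo_ik, hi_ik, lo_kj, hi_kj):
    """Interval on d(i, j) implied by intervals on d(i, k) and d(k, j).

    Upper bound: d(i, j) <= d(i, k) + d(k, j).
    Lower bound: d(i, j) >= |d(i, k) - d(k, j)|, evaluated at the
    interval endpoints that make the gap as small as possible.
    """
    upper = hi_ik + hi_kj
    lower = max(0.0, lo_ik - hi_kj, lo_kj - hi_ik)
    return lower, upper


def can_eliminate(lower_ij, upper_il):
    """j cannot be i's nearest neighbor if some l is provably closer."""
    return lower_ij > upper_il


# Illustrative numbers (assumed): k is near i, and j is far from k.
lower_ij, upper_ij = triangle_bounds(0.9, 1.1, 4.8, 5.2)

# k itself is within 1.1 of i, so j can be ruled out without ever
# querying the pair (i, j).
eliminated = can_eliminate(lower_ij, 1.1)
```

In this toy case, j is eliminated as a candidate neighbor of i using only samples of d(i, k) and d(k, j), which is where query savings can come from.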
Theoretical Results • Worst case complexity is always O(n²) • In general, order matters
Theoretical Results • Often, we can do better:
Theoretical Results • An example of separation:
Theoretical Results
Experimental Results • Simulated data • 100 points in ℝ² • 10 clusters of 10 points • Euclidean distance • Gaussian noise, σ² = 0.1
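A setup of this kind could be simulated roughly as follows. This is a sketch under stated assumptions: the slide does not give the range of the cluster centers or the within-cluster spread, so the values below (`uniform(0, 10)` and `spread=0.1`) are made up, and the helper names are hypothetical.

```python
import math
import random


def make_clustered_points(n_clusters=10, per_cluster=10, spread=0.1, seed=0):
    """Draw clustered 2D points. Center range and spread are assumptions."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_clusters):
        cx, cy = rng.uniform(0, 10), rng.uniform(0, 10)
        for _ in range(per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
    return points


def make_noisy_oracle(points, noise_var=0.1, seed=1):
    """Return query(i, j): Euclidean distance plus zero-mean Gaussian noise."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_var)  # random.gauss expects a standard deviation

    def query(i, j):
        return math.dist(points[i], points[j]) + rng.gauss(0.0, sigma)

    return query


points = make_clustered_points()
oracle = make_noisy_oracle(points)

# Repeated queries of the same pair average out the noise.
samples = [oracle(0, 1) for _ in range(2000)]
mean_estimate = sum(samples) / len(samples)
```

A single query is unreliable at this noise level, but the sample mean concentrates around the true distance, which is what makes confidence intervals on each pair possible.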
Experimental Results • Compare against random sampling • Test effect of triangle inequality
Experimental Results • The metric is (2d) Euclidean • We can compare against (distance) matrix completion • With a distance matrix, the graph can be computed easily
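The last bullet can be made concrete with a short sketch: once a full distance matrix is available (for example, from matrix completion), each node's nearest neighbor is a row-wise argmin that skips the diagonal. The function name and the 4-point matrix are illustrative assumptions, not data from the experiments.

```python
def nearest_neighbor_graph(dist):
    """Map each node i to its closest other node j under dist[i][j]."""
    n = len(dist)
    graph = {}
    for i in range(n):
        candidates = (j for j in range(n) if j != i)
        graph[i] = min(candidates, key=lambda j: dist[i][j])
    return graph


# Toy symmetric distance matrix with two tight pairs: (0, 1) and (2, 3).
dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.5, 4.5],
    [4.0, 3.5, 0.0, 1.2],
    [5.0, 4.5, 1.2, 0.0],
]
nn = nearest_neighbor_graph(dist)
```

Note that the result is a directed graph: i's nearest neighbor being j does not in general imply that j's nearest neighbor is i, although it happens to hold in this toy matrix.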
Experimental Results • What shoes are most similar? • 85 images from UTZappos50K dataset • Human judgements collected by Heim et al. (2015)
Main takeaways for ANNTri 1. ANNTri finds the nearest neighbor graph for general metrics using the triangle inequality 2. It only requires access to a noisy oracle 3. In favorable settings, it requires O(n log n · Δ⁻²) queries versus the O(n² Δ⁻²) needed by brute force!