
  1. Active Embedding Search via Noisy Paired Comparisons
     Gregory H. Canal, Georgia Institute of Technology
     In collaboration with: Andrew K. Massimino, Mark A. Davenport, Christopher J. Rozell
     Sensory Information Processing Lab

  2. Estimating preferences in similarity embedding

  5. Estimating preferences in similarity embedding
     [Figure: embedding with a user point; items labeled 1-4 by preference rank]
     • Item preferences ranked by distance to user
     • Continuous user point: hypothetical ideal item (not necessarily in dataset)

  8. Method of paired comparisons
     • Learn preferences via the method of paired comparisons (David, 1963): "Which of these two foods do you prefer to eat?"
       – Direct comparisons may be explicitly solicited
       – Comparisons are implicitly solicited everywhere
       – In practice, responses are noisy and inconsistent

  9. Ideal point model
     • Pairwise search: estimate user vector $x \in \mathbb{R}^d$ based on paired comparisons between items
     • Ideal point model: continuous point $x$ encodes an ideal item that is preferred over all other items (Coombs, 1950)
     • Paired comparison $(q, r)$: user at $x$ prefers item $q$ over item $r$ if and only if $\|x - q\| < \|x - r\|$
     [Figure: user point $x$; nearer item $q$ is more preferred, farther item $r$ is less preferred]
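As a concrete illustration of the (noiseless) ideal point model above, here is a minimal Python/NumPy sketch; the item coordinates and user point are made-up placeholders, not data from the talk.

```python
import numpy as np

def prefers_q_over_r(x, q, r):
    """Noiseless ideal point model: the user at x prefers item q over item r
    iff x is closer (in Euclidean distance) to q than to r."""
    return np.linalg.norm(x - q) < np.linalg.norm(x - r)

# Toy example with made-up 2-D embedding coordinates
x = np.array([0.2, 0.7])   # hypothetical user (ideal) point
q = np.array([0.1, 0.6])   # item q
r = np.array([0.9, 0.1])   # item r
print(prefers_q_over_r(x, q, r))  # True: q is closer to x, so q is preferred
```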

  11. Prior work
     How can paired comparisons (hyperplanes) be selected?
     • Query as few pairs as possible
     • Linear models (e.g., learning to rank, latent factors) are unsuitable for the nonlinear ideal point model (Wu et al., 2017; Qian et al., 2015)
     • Feasible region tracking
       – Query pairs adaptively
       – Add slack variables to the feasible region (Massimino & Davenport, 2018)
       – Repeat comparisons and take a majority vote (Jamieson & Nowak, 2011)
       – Previous methods do not incorporate noise into pair selection

  12. Modeling response noise
     • $b_{qr} \in \mathbb{R}^d$, $c_{qr} \in \mathbb{R}$: weights and threshold of the hyperplane bisecting items $q$ and $r$
     • Model noise with a logistic response probability (a small code sketch of this model follows below):
       $P(q \prec r) = \dfrac{1}{1 + e^{-k_{qr}(b_{qr}^T x - c_{qr})}}$
       – $k_{qr}$: noise constant, represents a signal-to-noise ratio
       – User estimated as the posterior mean (MMSE estimator)
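A minimal sketch of the logistic response model above. The parameterization of the bisecting hyperplane from the two item vectors is one standard choice (an assumption consistent with the slide's description, not taken verbatim from it), and the coordinates and noise constant are made up.

```python
import numpy as np

def bisecting_hyperplane(q, r):
    """One standard parameterization of the hyperplane bisecting items q and r:
    b^T x - c > 0 exactly when x is closer to q than to r."""
    b = 2.0 * (q - r)
    c = np.dot(q, q) - np.dot(r, r)
    return b, c

def prob_prefers_q(x, q, r, k_qr):
    """Logistic response probability P(q < r is preferred) with noise constant k_qr."""
    b, c = bisecting_hyperplane(q, r)
    return 1.0 / (1.0 + np.exp(-k_qr * (b @ x - c)))

# Toy example (made-up coordinates and noise constant)
x = np.array([0.2, 0.7]); q = np.array([0.1, 0.6]); r = np.array([0.9, 0.1])
print(prob_prefers_q(x, q, r, k_qr=5.0))  # close to 1: q is clearly nearer to x
```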

  13. Our contribution
     • Directly incorporate the noise model into adaptive selection of pairs
     • Strategy 1: InfoGain
     • Strategy 2: EPMV – analytically tractable
     • Strategy 3: MCMV – computationally tractable

  14. Strategy 1: Maximize information gain (InfoGain)
     • $Z_i$: binary response to the $i$-th paired comparison
     • $h_i(X)$: differential entropy of the posterior after $i$ responses
     • InfoGain: choose queries that maximize the expected decrease in posterior entropy, i.e. the information gain:
       $I(X; Z_i \mid z_{1:i-1}) = h_{i-1}(X) - \mathbb{E}_{Z_i}[h_i(X) \mid z_{1:i-1}]$
     • No closed-form expression; estimate with samples from the posterior (a Monte Carlo sketch follows below)
       – Computationally expensive: scales with the product of the # of posterior samples and the # of candidate pairs
     • Difficult to analyze convergence
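A rough sketch of how the information gain for one candidate pair could be estimated from posterior samples, using the symmetry $I(X;Z) = H(Z) - \mathbb{E}_X[H(Z \mid X)]$ discussed on the next slide. The helper names and the sampling setup are illustrative assumptions, not the authors' code.

```python
import numpy as np

def binary_entropy(p, eps=1e-12):
    """Entropy (in bits) of a Bernoulli(p) response."""
    p = np.clip(p, eps, 1 - eps)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def info_gain_estimate(samples, b, c, k):
    """Monte Carlo estimate of I(X; Z) for one candidate pair, via
    I(X; Z) = H(Z) - E_X[H(Z | X)], using posterior samples of the user point.
    samples: (n, d) array drawn from the current posterior;
    (b, c, k): the pair's hyperplane weights, threshold, and noise constant."""
    p_given_x = 1.0 / (1.0 + np.exp(-k * (samples @ b - c)))  # P(Z=1 | X = x_j)
    p_marginal = p_given_x.mean()                             # P(Z=1) under the posterior
    return binary_entropy(p_marginal) - binary_entropy(p_given_x).mean()

# Selection then evaluates every candidate pair against every posterior sample
# (the product cost noted above) and picks the argmax.
```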

  15. Information gain intuition
     • Symmetry of mutual information:
       $I(X; Z_i \mid z_{1:i-1}) = H(Z_i \mid z_{1:i-1}) - H(Z_i \mid X, z_{1:i-1})$
     • The first term promotes selection of comparisons whose outcome is non-obvious given previous responses
       – Maximized when the comparison response is equiprobable, i.e. the probability of picking each item in the pair is 1/2

  16. Information gain intuition
     • The second term promotes selection of comparisons that would have predictable outcomes if $x$ were known
       – When $x$ is close to the hyperplane, the response is unpredictable
       [Figure: response probability vs. $b_{qr}^T x - c_{qr}$, crossing 0.5 at the hyperplane]
     • Choose queries where $x$ is far from the hyperplane in expectation
       – i.e. the posterior variance orthogonal to the hyperplane (the projected variance) is large (illustrated in the sketch below)
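To make "projected variance" concrete, a minimal sketch assuming $\Sigma$ denotes the posterior covariance and $b$ the hyperplane weights; the numbers are made up for illustration.

```python
import numpy as np

def projected_variance(Sigma, b):
    """Posterior variance orthogonal to the hyperplane b^T x = c,
    i.e. the variance of b^T X rescaled to the unit normal direction."""
    b = np.asarray(b, dtype=float)
    return (b @ Sigma @ b) / (b @ b)

# Toy example with a made-up posterior covariance
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
print(projected_variance(Sigma, np.array([1.0, 0.0])))  # 1.0: large spread across this cut
print(projected_variance(Sigma, np.array([0.0, 1.0])))  # 0.5: smaller spread across this cut
```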

  17. Strategy 2: Equiprobable, max-variance (EPMV)
     $I(X; Z_i \mid z_{1:i-1}) = H(Z_i \mid z_{1:i-1}) - H(Z_i \mid X, z_{1:i-1})$
     • Equiprobable: the response is equally likely to be either item, $P(q \prec r) = 1/2$
       – Determines the hyperplane threshold
     • Max-variance: the comparison cuts in the direction of maximum projected variance
       – Determines the hyperplane weights

  18. EPMV theory
     Proposition. For an equiprobable comparison with hyperplane weights $b_{qr}$,
       $I(X; Z_i \mid z_{1:i-1}) \geq M_1\!\left(b_{qr}^T \Sigma_{X \mid z_{1:i-1}} b_{qr}\right)$
     where $M_1(\cdot)$ is a monotonically increasing function.
     ⟹ EPMV approximates InfoGain

  19. EPMV theory
     Theorem. For the EPMV query scheme with each selected query satisfying $k_{qr}\|b_{qr}\| \geq k_{\min} > 0$ and stopping threshold $\varepsilon > 0$, consider the stopping time $T_\varepsilon = \min\{i : |\Sigma_{X \mid z_{1:i}}|^{1/d} \leq \varepsilon\}$. We have
       $\mathbb{E}[T_\varepsilon] = O\!\left(d \log\tfrac{d}{\varepsilon} + \tfrac{d^{3/2}}{k_{\min}\sqrt{\varepsilon}} \log\tfrac{d}{\varepsilon}\right)$.
     Furthermore, for any query scheme, $\mathbb{E}[T_\varepsilon] = \Omega\!\left(d \log\tfrac{d}{\varepsilon}\right)$.
     ⟹ For large noise constants ($k_{\min} \gg 0$), EPMV reduces the posterior volume at a nearly optimal rate.

  20. EPMV in practice
     • Often one selects a pair from a pool of items, rather than querying arbitrary hyperplanes
     • Select the pair that maximizes an approximate EPMV utility function, for a trade-off parameter $\mu > 0$ (a sketch of such a pool selection follows below)
       [Figure: utility balances a term preferring equiprobable queries against a term preferring max-variance queries]
     • Computationally expensive – same utility evaluation cost as InfoGain
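The exact utility is not reproduced in the extracted slide, so the scoring rule below is only an illustrative stand-in: it rewards large projected variance and penalizes deviation of the predicted response probability from 1/2, traded off by $\mu$. It is a sketch of the general recipe, not the paper's formula.

```python
import numpy as np

def epmv_style_select(pairs, samples, mu):
    """Illustrative EPMV-style pool selection (not the paper's exact utility).
    pairs: list of (b, c, k) hyperplane parameters for candidate item pairs.
    samples: (n, d) posterior samples of the user point.
    Scores reward large projected variance and penalize deviation from an
    equiprobable (P = 1/2) predicted response, traded off by mu > 0."""
    Sigma = np.cov(samples, rowvar=False)
    best_idx, best_score = None, -np.inf
    for idx, (b, c, k) in enumerate(pairs):
        p = np.mean(1.0 / (1.0 + np.exp(-k * (samples @ b - c))))  # predicted P(q preferred)
        proj_var = (b @ Sigma @ b) / (b @ b)                        # projected variance
        score = k * np.sqrt(proj_var) - mu * abs(p - 0.5)
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx
```

Note that the equiprobable term still requires a pass over all posterior samples for every candidate pair, which is the product-scaling cost the slide flags.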

  21. Strategy 3: Mean-cut, max-variance (MCMV)
     $I(X; Z_i \mid z_{1:i-1}) = H(Z_i \mid z_{1:i-1}) - H(Z_i \mid X, z_{1:i-1})$
     • The computational bottleneck in EPMV is evaluating the equiprobable property
       – Approximate the equiprobable property with the mean-cut property, i.e. the hyperplane passes through the posterior mean:
         $b_{qr}^T \, \mathbb{E}[X \mid z_{1:i-1}] - c_{qr} = 0$

  22. MCMV theory
     Proposition. For mean-cut comparisons with $b_{qr}^T \Sigma_{X \mid z_{1:i-1}} b_{qr} \gg 0$, the response is approximately equiprobable.
     ⟹ MCMV approximates EPMV
     Proposition. For a mean-cut comparison with hyperplane weights $b_{qr}$,
       $I(X; Z_i \mid z_{1:i-1}) \geq M_2\!\left(b_{qr}^T \Sigma_{X \mid z_{1:i-1}} b_{qr}\right)$
     where $M_2(\cdot)$ is a monotonically increasing function.
     ⟹ MCMV approximates InfoGain

  23. MCMV in practice
     • Select the pair that maximizes a utility function, for a trade-off parameter $\mu > 0$, balancing a term that prefers max-variance queries against a term that prefers mean-cut queries (a sketch follows below)
     • Computational cost is much cheaper than InfoGain and EPMV
       – Scales with the sum of the # of posterior samples and the # of candidate pairs, rather than their product
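Again the exact utility is not in the extracted slide, so this is an illustrative sketch: the mean-cut term uses the distance of the posterior mean to the candidate hyperplane, and the max-variance term uses the projected posterior spread. The key point it demonstrates is the cost structure: mean and covariance are computed once from the samples, after which each pair is scored in closed form.

```python
import numpy as np

def mcmv_style_select(pairs, samples, mu):
    """Illustrative MCMV-style pool selection (not the paper's exact utility).
    The posterior mean and covariance are computed once from the samples;
    each candidate pair is then scored with cheap closed-form quantities,
    so the cost is (# samples) + (# pairs) rather than their product."""
    mean = samples.mean(axis=0)                 # one pass over posterior samples
    Sigma = np.cov(samples, rowvar=False)
    best_idx, best_score = None, -np.inf
    for idx, (b, c, k) in enumerate(pairs):     # one pass over candidate pairs
        b_norm = np.linalg.norm(b)
        mean_cut_dist = abs(b @ mean - c) / b_norm     # distance of posterior mean to hyperplane
        proj_std = np.sqrt(b @ Sigma @ b) / b_norm     # posterior spread across the hyperplane
        score = k * proj_std - mu * mean_cut_dist
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx
```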

  24. Methods overview
     Method    | Advantages                           | Limitations
     InfoGain  | Directly minimizes posterior volume  | Computationally expensive; difficult to analyze
     EPMV      | Convergence guarantee                | Computationally expensive
     MCMV      | Computationally cheap                | No convergence guarantee (future work)

  25. Simulated results
     • Item embedding constructed from the Yummly Food-10k dataset (Wilber et al., 2015; 2014)
       – 10,000 food items
       – ~1 million human comparisons between items
     • Simulated pairwise search (see the sketch below for the loop structure)
       – Noise constant $k_{qr}$ estimated from training comparisons
       – User preference point drawn uniformly from a hypercube, $d = 4$
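A schematic of how such a simulated pairwise-search experiment could be wired together, assuming the pair-selection sketches above and a crude importance-resampling posterior in place of whatever sampler the authors actually use; it is not the authors' experiment code, and `sample_posterior` here is a simplified stand-in.

```python
import numpy as np

def sample_posterior(responses, d, n=2000, rng=None):
    """Crude importance-resampling posterior over a hypercube prior;
    a stand-in for the MCMC sampler a real implementation would use."""
    rng = rng or np.random.default_rng(0)
    xs = rng.uniform(-1, 1, size=(n, d))
    logw = np.zeros(n)
    for (b, c, k, z) in responses:                       # logistic likelihood of each response
        p = 1.0 / (1.0 + np.exp(-k * (xs @ b - c)))
        logw += np.log(np.where(z, p, 1.0 - p) + 1e-12)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return xs[rng.choice(n, size=n, p=w)]

def simulate_search(pairs, d, n_queries, select_pair, rng):
    """Schematic simulated pairwise search: draw a user point, adaptively query
    pairs with a selection strategy (e.g. the MCMV-style sketch above), collect
    noisy responses, and return the posterior-mean estimate of the user point."""
    x_true = rng.uniform(-1, 1, size=d)                   # user point from a hypercube
    responses = []
    for _ in range(n_queries):
        samples = sample_posterior(responses, d, rng=rng)
        b, c, k = pairs[select_pair(pairs, samples, mu=1.0)]
        p = 1.0 / (1.0 + np.exp(-k * (b @ x_true - c)))   # simulate a noisy response
        responses.append((b, c, k, rng.random() < p))
    return sample_posterior(responses, d, rng=rng).mean(axis=0)
```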

  26. Simulated results – baseline methods
     • Random
       – Pairs selected uniformly at random
       – User estimated as the posterior mean
     • GaussCloud (Massimino & Davenport, 2018)
       – Pairs chosen to approximate a Gaussian point cloud around the estimate, which shrinks over multiple stages
       – User estimated by approximately solving a non-convex program
     • ActRank (Jamieson & Nowak, 2011)
       – Pairs selected that intersect the feasible region of preference points
       – Each query repeated multiple times, majority vote taken
       – User estimated as the Chebyshev center (our addition)
