
Lecture 15: Exact Tensor Completion (Joint Work with David Steurer)



  1. Lecture 15: Exact Tensor Completion Joint Work with David Steurer

  2. Lecture Outline • Part I: Matrix Completion Problem • Part II: Matrix Completion via Nuclear Norm Minimization • Part III: Generalization to Tensor Completion • Part IV: SOS-symmetry to the Rescue • Part V: Finding Dual Certificate for Matrix Completion • Part VI: Open Problems

  3. Part I: Matrix Completion Problem

  4. Matrix Completion • Matrix Completion: Let Ω be a set of entries sampled at random. Given the entries {M_ab : (a,b) ∈ Ω} from a matrix M, can we determine the remaining entries of M? • Impossible in general, but tractable if M is low rank, i.e. M = Σ_{i=1}^{r} λ_i u_i v_i^T where r is not too large.

  5. Netflix Challenge • Canonical example of matrix completion: Netflix Challenge • Can we predict users’ preferences on other movies from their previous ratings?

  6. Netflix Challenge • [Slide shows a partially observed user-movie ratings matrix: known ratings such as 5, 6, 10, 8, 9, and 7.5 appear alongside unknown entries marked "?".]

  7. Solving Matrix Completion • Current best method in practice: Alternating minimization • Idea: Write M = Σ_{i=1}^{r} u_i v_i^T and alternate between optimizing {u_i} and {v_i} • Best known theoretical guarantees: Nuclear norm minimization • This lecture: We'll describe nuclear norm minimization and how it generalizes to tensor completion via SOS.
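To make the alternating-minimization idea concrete, here is a minimal numpy sketch (an illustration, not the algorithm analyzed in this lecture): fix one factor, solve a ridge-regularized least-squares problem for the other over the observed entries, and repeat. The function name, rank r, iteration count, and regularization are illustrative choices.

```python
import numpy as np

def alternating_minimization(M_obs, mask, r, iters=50, reg=1e-3):
    """Approximate a partially observed matrix by a rank-r factorization U V^T.

    M_obs: n x m array (values outside the mask are ignored)
    mask:  n x m boolean array, True on observed entries
    """
    n, m = M_obs.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n, r))
    V = rng.standard_normal((m, r))
    for _ in range(iters):
        # Fix V, solve a ridge least-squares problem for each row of U.
        for a in range(n):
            obs = mask[a]
            Vo = V[obs]
            G = Vo.T @ Vo + reg * np.eye(r)
            U[a] = np.linalg.solve(G, Vo.T @ M_obs[a, obs])
        # Fix U, solve for each row of V symmetrically.
        for b in range(m):
            obs = mask[:, b]
            Uo = U[obs]
            G = Uo.T @ Uo + reg * np.eye(r)
            V[b] = np.linalg.solve(G, Uo.T @ M_obs[obs, b])
    return U, V
```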

  8. Part II: Nuclear Norm Minimization

  9. Theorem Statement • Theorem [Rec11]: If M = Σ_{i=1}^{r} λ_i u_i v_i^T is an n × n matrix, then nuclear norm minimization requires O(n r μ_0 log^2 n) random samples to complete M with high probability • Note: μ_0 is a parameter related to how coherent the {u_i} and the {v_i} are (see appendix for the definition) • Example of why this is needed: If u_i = e_j then u_i v_i^T = e_j v_i^T can only be fully detected by sampling all of row j, which requires sampling almost everything!

  10. Nuclear Norm • Recall the singular value decomposition (SVD) of a matrix M • M = Σ_{i=1}^{r} λ_i u_i v_i^T where the {u_i} are orthonormal, the {v_i} are orthonormal, and λ_i ≥ 0 for all i. • The nuclear norm of M is ‖M‖_* = Σ_{i=1}^{r} λ_i
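Since the nuclear norm is just the sum of singular values, it can be computed directly from the SVD; a quick numpy check (purely illustrative):

```python
import numpy as np

M = np.random.default_rng(1).standard_normal((8, 8))
singular_values = np.linalg.svd(M, compute_uv=False)
nuclear_norm = singular_values.sum()   # ||M||_* = sum of singular values
print(nuclear_norm, np.allclose(nuclear_norm, np.linalg.norm(M, "nuc")))
```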

  11. Nuclear Norm Minimization • Matrix completion problem: Recover M given randomly sampled entries {M_ab : (a,b) ∈ Ω} • Nuclear norm minimization: Find the matrix X which minimizes ‖X‖_* while satisfying X_ab = M_ab whenever (a,b) ∈ Ω. • How do we minimize ‖X‖_*?

  12. Semidefinite Program • We can implement nuclear norm minimization with the following semidefinite program: • Minimize the trace of the block matrix [U X; X^T V] subject to [U X; X^T V] ⪰ 0 and X_ab = M_ab whenever (a,b) ∈ Ω • Why does this work? We'll first show that the true solution is a good solution. We'll then describe how to show the true solution is the optimal solution.
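Below is a small cvxpy sketch of this semidefinite program (an illustrative formulation, assuming the cvxpy modeling library; the function name and default solver are our choices): a single PSD variable plays the role of the block matrix [U X; X^T V], and its trace is minimized subject to X agreeing with M on Ω.

```python
import cvxpy as cp
import numpy as np

def complete_matrix_sdp(M_obs, mask):
    """Nuclear norm minimization via the PSD block-matrix SDP (illustrative)."""
    n, m = M_obs.shape
    Z = cp.Variable((n + m, n + m), PSD=True)   # block matrix [U X; X^T V]
    X = Z[:n, n:]                               # the off-diagonal block
    rows, cols = np.nonzero(mask)
    constraints = [X[a, b] == M_obs[a, b] for a, b in zip(rows, cols)]
    # tr(Z) = tr(U) + tr(V); at the optimum this equals 2 * ||X||_*.
    cp.Problem(cp.Minimize(cp.trace(Z)), constraints).solve()
    return X.value
```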

  13. True Solution • Program: Minimize the trace of [U X; X^T V] subject to [U X; X^T V] ⪰ 0 and X_ab = M_ab whenever (a,b) ∈ Ω • True solution: [U X; X^T V] = Σ_i λ_i [u_i; v_i][u_i; v_i]^T (recall that M = Σ_i λ_i u_i v_i^T) • Since tr(u_i u_i^T) = tr(v_i v_i^T) = 1 for all i, the trace of [U X; X^T V] is 2 Σ_i λ_i

  14. Dual Certificate • Program: Minimize the trace of [U X; X^T V] subject to [U X; X^T V] ⪰ 0 and X_ab = M_ab whenever (a,b) ∈ Ω • Dual Certificate: [Id −A; −A^T Id] ⪰ 0 • Recall that if M_1, M_2 ⪰ 0 then M_1 ⦁ M_2 ≥ 0 (where ⦁ is the entry-wise dot product) • Thus [Id −A; −A^T Id] ⦁ [U X; X^T V] ≥ 0 • If A_ab = 0 whenever (a,b) ∉ Ω, this lower bounds the trace.
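A quick numpy sanity check of the fact used here (illustrative): the entry-wise dot product of two PSD matrices equals tr(M_1 M_2), which is nonnegative.

```python
import numpy as np

rng = np.random.default_rng(2)
B1, B2 = rng.standard_normal((5, 5)), rng.standard_normal((5, 5))
M1, M2 = B1 @ B1.T, B2 @ B2.T          # both PSD by construction
dot = np.sum(M1 * M2)                  # entry-wise dot product M1 . M2
assert dot >= 0 and np.isclose(dot, np.trace(M1 @ M2))
```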

  15. True Solution Optimality • Dual Certificate: [Id −A; −A^T Id] ⪰ 0 where A_ab = 0 whenever (a,b) ∉ Ω • True solution [U X; X^T V] = Σ_i λ_i [u_i; v_i][u_i; v_i]^T is optimal if [Id −A; −A^T Id] ⦁ [U X; X^T V] = 0 • This occurs if [Id −A; −A^T Id] [u_i; v_i] = 0 for all i

  16. Conditions on A • We want A such that [Id −A; −A^T Id] ⪰ 0, A_ab = 0 whenever (a,b) ∉ Ω, and [Id −A; −A^T Id] [u_i; v_i] = 0 for all i • Necessary and sufficient conditions on A: 1. ‖A‖ ≤ 1 2. A_ab = 0 whenever (a,b) ∉ Ω 3. A v_i = u_i for all i 4. A^T u_i = v_i for all i

  17. Dual Certificate with all entries • Necessary and sufficient conditions on A: 1. ‖A‖ ≤ 1 2. A_ab = 0 whenever (a,b) ∉ Ω 3. A v_i = u_i for all i 4. A^T u_i = v_i for all i • If we have all entries (so we can ignore condition 2), we can take A = Σ_i u_i v_i^T • Challenge: Find A when we don't have all entries • Remark: This explains why the semidefinite program minimizes the nuclear norm.
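With all entries available, conditions 1, 3, and 4 for A = Σ_i u_i v_i^T can be verified numerically; a small numpy sketch (the choices of n and r are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 6, 3
# Random orthonormal {u_i} and {v_i} via QR.
U = np.linalg.qr(rng.standard_normal((n, r)))[0]
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
A = U @ V.T                                    # A = sum_i u_i v_i^T
assert np.linalg.norm(A, 2) <= 1 + 1e-9        # condition 1: ||A|| <= 1
assert np.allclose(A @ V, U)                   # condition 3: A v_i = u_i
assert np.allclose(A.T @ U, V)                 # condition 4: A^T u_i = v_i
```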

  18. Part III: Generalization to Tensor Completion

  19. Tensor Completion • Tensor Completion: Let Ω be a set of entries sampled at random. Given the entries {T_abc : (a,b,c) ∈ Ω} from a tensor T, can we determine the remaining entries of T? • More difficult problem: tensor rank is much more complicated

  20. Exact Tensor Completion Theorem • Theorem [PS17]: If T = Σ_{i=1}^{r} λ_i u_i ⊗ v_i ⊗ w_i where the {u_i} are orthogonal, the {v_i} are orthogonal, and the {w_i} are orthogonal, then with high probability we can recover T with O(r μ n^{3/2} polylog(n)) random samples • First algorithm to obtain exact tensor completion • Remark: The orthogonality condition is very restrictive but this result can likely be extended. • See appendix for the definition of μ.

  21. Semidefinite Program: First Attempt • Won't quite work, but we'll fix it later. • Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0 and X_abc = T_abc whenever (a,b,c) ∈ Ω • Here the top and left blocks are indexed by a and the bottom and right blocks are indexed by (b,c).

  22. True Solution • Program: Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0 and X_abc = T_abc whenever (a,b,c) ∈ Ω • True solution: [U X; X^T VW] = Σ_i λ_i [u_i; v_i ⊗ w_i][u_i; v_i ⊗ w_i]^T (recall that T = Σ_i λ_i u_i ⊗ v_i ⊗ w_i) • The trace of [U X; X^T VW] is 2 Σ_i λ_i
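A numpy sketch (illustrative) of this true solution makes the indexing concrete: rows of the flattening X are indexed by a, columns by (b, c), and the trace of the block matrix comes out to 2 Σ_i λ_i.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 4, 2
lam = rng.uniform(1, 2, size=r)
U = np.linalg.qr(rng.standard_normal((n, r)))[0]     # {u_i}
V = np.linalg.qr(rng.standard_normal((n, r)))[0]     # {v_i}
W = np.linalg.qr(rng.standard_normal((n, r)))[0]     # {w_i}

# T = sum_i lam_i u_i ⊗ v_i ⊗ w_i, flattened to an n x n^2 matrix X, columns indexed by (b, c).
X = sum(lam[i] * np.outer(U[:, i], np.kron(V[:, i], W[:, i])) for i in range(r))

# True-solution block matrix: sum_i lam_i [u_i; v_i ⊗ w_i][u_i; v_i ⊗ w_i]^T.
stacked = np.vstack([U, np.column_stack([np.kron(V[:, i], W[:, i]) for i in range(r)])])
block = (stacked * lam) @ stacked.T
assert np.allclose(block[:n, n:], X)                 # off-diagonal block is the flattening of T
assert np.isclose(np.trace(block), 2 * lam.sum())    # trace equals 2 * sum_i lam_i
```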

  23. Dual Certificate: First Attempt • Program: Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0 and X_abc = T_abc whenever (a,b,c) ∈ Ω • Dual Certificate: [Id −A; −A^T Id] ⪰ 0 where A_abc = 0 whenever (a,b,c) ∉ Ω • We want [Id −A; −A^T Id] [u_i; v_i ⊗ w_i] = 0 for all i

  24. Conditions on A • We want A such that [Id −A; −A^T Id] ⪰ 0, A_abc = 0 whenever (a,b,c) ∉ Ω, and [Id −A; −A^T Id] [u_i; v_i ⊗ w_i] = 0 for all i • Necessary and sufficient conditions on A: 1. ‖A‖ ≤ 1 2. A_abc = 0 whenever (a,b,c) ∉ Ω 3. A(v_i ⊗ w_i) = u_i for all i 4. A^T u_i = v_i ⊗ w_i for all i • TOO STRONG, requires Ω(n^2) samples!

  25. Part IV: SOS-symmetry to the Rescue

  26. SOS Program • Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0, X_abc = T_abc whenever (a,b,c) ∈ Ω, and VW is SOS-symmetric (i.e. VW_{bc,b'c'} = VW_{b'c,bc'} for all b, c, b', c')
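The SOS-symmetry constraint on the n² × n² block VW can be checked directly in code; a minimal numpy sketch (illustrative; the function name is ours), viewing VW as a 4-index array Q[b, c, b', c']:

```python
import numpy as np

def is_sos_symmetric(VW, n, tol=1e-9):
    """Check VW_{bc, b'c'} = VW_{b'c, bc'} for an n^2 x n^2 block VW."""
    Q = VW.reshape(n, n, n, n)                  # Q[b, c, bp, cp]
    # Swapping b with b' (axes 0 and 2) must leave the block unchanged.
    return np.allclose(Q, Q.transpose(2, 1, 0, 3), atol=tol)
```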

  27. Review: Matrix Polynomial q(Q) • Definition: Given a symmetric matrix Q indexed by monomials, define q(Q) = Σ_K (Σ_{I,J : I ∪ J = K (as multisets)} Q_{IJ}) x^K • Idea: M ⦁ Q = Ẽ[q(Q)], where M is the moment matrix of the pseudo-expectation Ẽ
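A small pure-Python sketch of this definition (illustrative; the function name and monomial encoding are ours): represent monomials as sorted tuples of variable indices and collect the coefficient of x^K by summing Q_{IJ} over all pairs with I ∪ J = K as multisets.

```python
from collections import defaultdict

def matrix_polynomial(Q, monomials):
    """Given a symmetric matrix Q indexed by the given monomials (tuples of
    variable indices), return the coefficients of q(Q) as a dict mapping the
    multiset K (sorted tuple) to the sum of Q[I][J] over I, J with I ∪ J = K."""
    coeffs = defaultdict(float)
    for I, row in zip(monomials, Q):
        for J, entry in zip(monomials, row):
            K = tuple(sorted(I + J))          # multiset union of I and J
            coeffs[K] += entry
    return dict(coeffs)

# Example: monomials {x_0, x_1}; q(Q) collects the degree-2 coefficients.
Q = [[1.0, 0.5],
     [0.5, 2.0]]
print(matrix_polynomial(Q, [(0,), (1,)]))     # {(0, 0): 1.0, (0, 1): 1.0, (1, 1): 2.0}
```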

  28. Dual Certificate • Program: Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0, X_abc = T_abc whenever (a,b,c) ∈ Ω, and VW is SOS-symmetric • Dual Certificate: [Id −A; −A^T B] ⪰ 0 where A_abc = 0 whenever (a,b,c) ∉ Ω and q(B) = q(Id) • We want [Id −A; −A^T B] [u_i; v_i ⊗ w_i] = 0 for all i
