
Lecture 15: Exact Tensor Completion (Joint Work with David Steurer)



  1. Lecture 15: Exact Tensor Completion Joint Work with David Steurer

  2. Lecture Outline • Part I: Matrix Completion Problem • Part II: Matrix Completion via Nuclear Norm Minimization • Part III: Generalization to Tensor Completion • Part IV: SOS-symmetry to the Rescue • Part V: Finding Dual Certificate for Matrix Completion • Part VI: Open Problems

  3. Part I: Matrix Completion Problem

  4. Matrix Completion • Matrix Completion: Let Ω be a set of entries sampled at random. Given the entries {M_ab : (a,b) ∈ Ω} from a matrix M, can we determine the remaining entries of M? • Impossible in general, but tractable if M is low rank, i.e. M = Σ_{i=1}^{r} λ_i u_i v_i^T where r is not too large.

  5. Netflix Challenge • Canonical example of matrix completion: Netflix Challenge • Can we predict users’ preferences on other movies from their previous ratings?

  6. Netflix Challenge • [Slide shows a partially observed user-movie ratings matrix: known ratings such as 5, 6, 10, 8, 9, and 7.5 appear alongside unknown entries marked "?".]

  7. Solving Matrix Completion • Current best method in practice: Alternating minimization • Idea: Write M = Σ_{i=1}^{r} u_i v_i^T and alternate between optimizing {u_i} and {v_i} • Best known theoretical guarantees: Nuclear norm minimization • This lecture: We'll describe nuclear norm minimization and how it generalizes to tensor completion via SOS.
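To make the alternating-minimization idea concrete, here is a minimal numpy sketch (an illustration, not the algorithm analyzed in this lecture): fix one factor, solve a ridge-regularized least-squares problem for the other over the observed entries, and repeat. The function name, rank r, iteration count, and regularization are illustrative choices.

```python
import numpy as np

def alternating_minimization(M_obs, mask, r, iters=50, reg=1e-3):
    """Approximate a partially observed matrix by a rank-r factorization U V^T.

    M_obs: n x m array (values outside the mask are ignored)
    mask:  n x m boolean array, True on observed entries
    """
    n, m = M_obs.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n, r))
    V = rng.standard_normal((m, r))
    for _ in range(iters):
        # Fix V, solve a ridge least-squares problem for each row of U.
        for a in range(n):
            obs = mask[a]
            Vo = V[obs]
            G = Vo.T @ Vo + reg * np.eye(r)
            U[a] = np.linalg.solve(G, Vo.T @ M_obs[a, obs])
        # Fix U, solve for each row of V symmetrically.
        for b in range(m):
            obs = mask[:, b]
            Uo = U[obs]
            G = Uo.T @ Uo + reg * np.eye(r)
            V[b] = np.linalg.solve(G, Uo.T @ M_obs[obs, b])
    return U, V
```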

  8. Part II: Nuclear Norm Minimization

  9. Theorem Statement • Theorem [Rec11]: If M = Σ_{i=1}^{r} λ_i u_i v_i^T is an n × n matrix, then nuclear norm minimization requires O(n r μ_0 log^2 n) random samples to complete M with high probability • Note: μ_0 is a parameter related to how coherent the {u_i} and the {v_i} are (see appendix for the definition) • Example of why this is needed: If u_i = e_j then u_i v_i^T = e_j v_i^T can only be fully detected by sampling all of row j, which requires sampling almost everything!

  10. Nuclear Norm • Recall the singular value decomposition (SVD) of a matrix M • M = Σ_{i=1}^{r} λ_i u_i v_i^T where the {u_i} are orthonormal, the {v_i} are orthonormal, and λ_i ≥ 0 for all i. • The nuclear norm of M is ‖M‖_* = Σ_{i=1}^{r} λ_i
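Since the nuclear norm is just the sum of singular values, it can be computed directly from the SVD; a quick numpy check (purely illustrative):

```python
import numpy as np

M = np.random.default_rng(1).standard_normal((8, 8))
singular_values = np.linalg.svd(M, compute_uv=False)
nuclear_norm = singular_values.sum()   # ||M||_* = sum of singular values
print(nuclear_norm, np.allclose(nuclear_norm, np.linalg.norm(M, "nuc")))
```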

  11. Nuclear Norm Minimization • Matrix completion problem: Recover M given randomly sampled entries {M_ab : (a,b) ∈ Ω} • Nuclear norm minimization: Find the matrix X which minimizes ‖X‖_* while satisfying X_ab = M_ab whenever (a,b) ∈ Ω. • How do we minimize ‖X‖_*?

  12. Semidefinite Program • We can implement nuclear norm minimization with the following semidefinite program: • Minimize the trace of the block matrix [U X; X^T V] subject to [U X; X^T V] ⪰ 0 and X_ab = M_ab whenever (a,b) ∈ Ω • Why does this work? We'll first show that the true solution is a good solution. We'll then describe how to show the true solution is the optimal solution.
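Below is a small cvxpy sketch of this semidefinite program (an illustrative formulation, assuming the cvxpy modeling library; the function name and default solver are our choices): a single PSD variable plays the role of the block matrix [U X; X^T V], and its trace is minimized subject to X agreeing with M on Ω.

```python
import cvxpy as cp
import numpy as np

def complete_matrix_sdp(M_obs, mask):
    """Nuclear norm minimization via the PSD block-matrix SDP (illustrative)."""
    n, m = M_obs.shape
    Z = cp.Variable((n + m, n + m), PSD=True)   # block matrix [U X; X^T V]
    X = Z[:n, n:]                               # the off-diagonal block
    rows, cols = np.nonzero(mask)
    constraints = [X[a, b] == M_obs[a, b] for a, b in zip(rows, cols)]
    # tr(Z) = tr(U) + tr(V); at the optimum this equals 2 * ||X||_*.
    cp.Problem(cp.Minimize(cp.trace(Z)), constraints).solve()
    return X.value
```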

  13. True Solution • Program: Minimize the trace of [U X; X^T V] subject to [U X; X^T V] ⪰ 0 and X_ab = M_ab whenever (a,b) ∈ Ω • True solution: [U X; X^T V] = Σ_i λ_i [u_i; v_i][u_i; v_i]^T (recall that M = Σ_i λ_i u_i v_i^T) • Since tr(u_i u_i^T) = tr(v_i v_i^T) = 1 for all i, the trace of [U X; X^T V] is 2 Σ_i λ_i

  14. Dual Certificate • Program: Minimize the trace of [U X; X^T V] subject to [U X; X^T V] ⪰ 0 and X_ab = M_ab whenever (a,b) ∈ Ω • Dual Certificate: [Id −A; −A^T Id] ⪰ 0 • Recall that if M_1, M_2 ⪰ 0 then M_1 ⦁ M_2 ≥ 0 (where ⦁ is the entry-wise dot product) • Thus [Id −A; −A^T Id] ⦁ [U X; X^T V] ≥ 0 • If A_ab = 0 whenever (a,b) ∉ Ω, this lower bounds the trace.
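A quick numpy sanity check of the fact used here (illustrative): the entry-wise dot product of two PSD matrices equals tr(M_1 M_2), which is nonnegative.

```python
import numpy as np

rng = np.random.default_rng(2)
B1, B2 = rng.standard_normal((5, 5)), rng.standard_normal((5, 5))
M1, M2 = B1 @ B1.T, B2 @ B2.T          # both PSD by construction
dot = np.sum(M1 * M2)                  # entry-wise dot product M1 . M2
assert dot >= 0 and np.isclose(dot, np.trace(M1 @ M2))
```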

  15. True Solution Optimality • Dual Certificate: [Id −A; −A^T Id] ⪰ 0 where A_ab = 0 whenever (a,b) ∉ Ω • True solution [U X; X^T V] = Σ_i λ_i [u_i; v_i][u_i; v_i]^T is optimal if [Id −A; −A^T Id] ⦁ [U X; X^T V] = 0 • This occurs if [Id −A; −A^T Id] [u_i; v_i] = 0 for all i

  16. Conditions on A • We want A such that [Id −A; −A^T Id] ⪰ 0, A_ab = 0 whenever (a,b) ∉ Ω, and [Id −A; −A^T Id] [u_i; v_i] = 0 for all i • Necessary and sufficient conditions on A: 1. ‖A‖ ≤ 1 2. A_ab = 0 whenever (a,b) ∉ Ω 3. A v_i = u_i for all i 4. A^T u_i = v_i for all i

  17. Dual Certificate with all entries • Necessary and sufficient conditions on A: 1. ‖A‖ ≤ 1 2. A_ab = 0 whenever (a,b) ∉ Ω 3. A v_i = u_i for all i 4. A^T u_i = v_i for all i • If we have all entries (so we can ignore condition 2), we can take A = Σ_i u_i v_i^T • Challenge: Find A when we don't have all entries • Remark: This explains why the semidefinite program minimizes the nuclear norm.
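With all entries available, conditions 1, 3, and 4 for A = Σ_i u_i v_i^T can be verified numerically; a small numpy sketch (the choices of n and r are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 6, 3
# Random orthonormal {u_i} and {v_i} via QR.
U = np.linalg.qr(rng.standard_normal((n, r)))[0]
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
A = U @ V.T                                    # A = sum_i u_i v_i^T
assert np.linalg.norm(A, 2) <= 1 + 1e-9        # condition 1: ||A|| <= 1
assert np.allclose(A @ V, U)                   # condition 3: A v_i = u_i
assert np.allclose(A.T @ U, V)                 # condition 4: A^T u_i = v_i
```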

  18. Part III: Generalization to Tensor Completion

  19. Tensor Completion • Tensor Completion: Let Ω be a set of entries sampled at random. Given the entries {T_abc : (a,b,c) ∈ Ω} from a tensor T, can we determine the remaining entries of T? • More difficult problem: tensor rank is much more complicated

  20. Exact Tensor Completion Theorem • Theorem [PS17]: If T = Σ_{i=1}^{r} λ_i u_i ⊗ v_i ⊗ w_i where the {u_i} are orthogonal, the {v_i} are orthogonal, and the {w_i} are orthogonal, then with high probability we can recover T with O(r μ n^{3/2} polylog(n)) random samples • First algorithm to obtain exact tensor completion • Remark: The orthogonality condition is very restrictive but this result can likely be extended. • See appendix for the definition of μ.

  21. Semidefinite Program: First Attempt • Won't quite work, but we'll fix it later. • Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0 and X_abc = T_abc whenever (a,b,c) ∈ Ω • Here the top and left blocks are indexed by a and the bottom and right blocks are indexed by (b,c).

  22. True Solution • Program: Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0 and X_abc = T_abc whenever (a,b,c) ∈ Ω • True solution: [U X; X^T VW] = Σ_i λ_i [u_i; v_i ⊗ w_i][u_i; v_i ⊗ w_i]^T (recall that T = Σ_i λ_i u_i ⊗ v_i ⊗ w_i) • The trace of [U X; X^T VW] is 2 Σ_i λ_i
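A numpy sketch (illustrative) of this true solution makes the indexing concrete: rows of the flattening X are indexed by a, columns by (b, c), and the trace of the block matrix comes out to 2 Σ_i λ_i.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 4, 2
lam = rng.uniform(1, 2, size=r)
U = np.linalg.qr(rng.standard_normal((n, r)))[0]     # {u_i}
V = np.linalg.qr(rng.standard_normal((n, r)))[0]     # {v_i}
W = np.linalg.qr(rng.standard_normal((n, r)))[0]     # {w_i}

# T = sum_i lam_i u_i ⊗ v_i ⊗ w_i, flattened to an n x n^2 matrix X, columns indexed by (b, c).
X = sum(lam[i] * np.outer(U[:, i], np.kron(V[:, i], W[:, i])) for i in range(r))

# True-solution block matrix: sum_i lam_i [u_i; v_i ⊗ w_i][u_i; v_i ⊗ w_i]^T.
stacked = np.vstack([U, np.column_stack([np.kron(V[:, i], W[:, i]) for i in range(r)])])
block = (stacked * lam) @ stacked.T
assert np.allclose(block[:n, n:], X)                 # off-diagonal block is the flattening of T
assert np.isclose(np.trace(block), 2 * lam.sum())    # trace equals 2 * sum_i lam_i
```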

  23. Dual Certificate: First Attempt • Program: Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0 and X_abc = T_abc whenever (a,b,c) ∈ Ω • Dual Certificate: [Id −A; −A^T Id] ⪰ 0 where A_abc = 0 whenever (a,b,c) ∉ Ω • We want [Id −A; −A^T Id] [u_i; v_i ⊗ w_i] = 0 for all i

  24. Conditions on A • We want A such that [Id −A; −A^T Id] ⪰ 0, A_abc = 0 whenever (a,b,c) ∉ Ω, and [Id −A; −A^T Id] [u_i; v_i ⊗ w_i] = 0 for all i • Necessary and sufficient conditions on A: 1. ‖A‖ ≤ 1 2. A_abc = 0 whenever (a,b,c) ∉ Ω 3. A(v_i ⊗ w_i) = u_i for all i 4. A^T u_i = v_i ⊗ w_i for all i • TOO STRONG, requires Ω(n^2) samples!

  25. Part IV: SOS-symmetry to the Rescue

  26. SOS Program • Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0, X_abc = T_abc whenever (a,b,c) ∈ Ω, and VW is SOS-symmetric (i.e. VW_{bc,b'c'} = VW_{b'c,bc'} for all b, c, b', c')
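The SOS-symmetry constraint on the n² × n² block VW can be checked directly in code; a minimal numpy sketch (illustrative; the function name is ours), viewing VW as a 4-index array Q[b, c, b', c']:

```python
import numpy as np

def is_sos_symmetric(VW, n, tol=1e-9):
    """Check VW_{bc, b'c'} = VW_{b'c, bc'} for an n^2 x n^2 block VW."""
    Q = VW.reshape(n, n, n, n)                  # Q[b, c, bp, cp]
    # Swapping b with b' (axes 0 and 2) must leave the block unchanged.
    return np.allclose(Q, Q.transpose(2, 1, 0, 3), atol=tol)
```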

  27. Review: Matrix Polynomial q(Q) • Definition: Given a symmetric matrix Q indexed by monomials, define q(Q) = Σ_K (Σ_{I,J : I ∪ J = K (as multisets)} Q_{IJ}) x^K • Idea: M ⦁ Q = Ẽ[q(Q)], where M is the moment matrix of the pseudo-expectation Ẽ
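A small pure-Python sketch of this definition (illustrative; the function name and monomial encoding are ours): represent monomials as sorted tuples of variable indices and collect the coefficient of x^K by summing Q_{IJ} over all pairs with I ∪ J = K as multisets.

```python
from collections import defaultdict

def matrix_polynomial(Q, monomials):
    """Given a symmetric matrix Q indexed by the given monomials (tuples of
    variable indices), return the coefficients of q(Q) as a dict mapping the
    multiset K (sorted tuple) to the sum of Q[I][J] over I, J with I ∪ J = K."""
    coeffs = defaultdict(float)
    for I, row in zip(monomials, Q):
        for J, entry in zip(monomials, row):
            K = tuple(sorted(I + J))          # multiset union of I and J
            coeffs[K] += entry
    return dict(coeffs)

# Example: monomials {x_0, x_1}; q(Q) collects the degree-2 coefficients.
Q = [[1.0, 0.5],
     [0.5, 2.0]]
print(matrix_polynomial(Q, [(0,), (1,)]))     # {(0, 0): 1.0, (0, 1): 1.0, (1, 1): 2.0}
```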

  28. Dual Certificate • Program: Minimize the trace of [U X; X^T VW] subject to [U X; X^T VW] ⪰ 0, X_abc = T_abc whenever (a,b,c) ∈ Ω, and VW is SOS-symmetric • Dual Certificate: [Id −A; −A^T B] ⪰ 0 where A_abc = 0 whenever (a,b,c) ∉ Ω and q(B) = q(Id) • We want [Id −A; −A^T B] [u_i; v_i ⊗ w_i] = 0 for all i
