Spectral Methods from Tensor Networks
Alex Wein, Courant Institute, NYU


  1. Spectral Methods from Tensor Networks. Alex Wein, Courant Institute, NYU. Joint work with Ankur Moitra (MIT).

  6. Outline
     ◮ Tensors
     ◮ Statistical problems involving tensors
     ◮ A general framework for designing algorithms for tensor problems: “spectral methods from tensor networks”
     ◮ Orbit recovery: a certain class of tensor problems
     ◮ Structured tensor decomposition
     ◮ Main result: the first polynomial-time algorithm for a certain orbit recovery problem

  7. I. Tensors and Tensor Networks

  10. What is a Tensor?
     An order-$p$ tensor is an $n_1 \times n_2 \times \cdots \times n_p$ multi-array $T = (T_{i_1, i_2, \dots, i_p})$ with $i_j \in \{1, 2, \dots, n_j\}$. An order-1 tensor is a vector; an order-2 tensor is a matrix.
     $T$ is symmetric if $n_1 = \cdots = n_p = n$ and $T_{i_1, \dots, i_p} = T_{i_{\pi(1)}, \dots, i_{\pi(p)}}$ for every permutation $\pi$.
     ◮ In this talk, all tensors will be symmetric.
     Given $p$ vectors $x_1, \dots, x_p$, the rank-1 tensor $x_1 \otimes x_2 \otimes \cdots \otimes x_p$ has entries $(x_1 \otimes x_2 \otimes \cdots \otimes x_p)_{i_1, \dots, i_p} = (x_1)_{i_1} (x_2)_{i_2} \cdots (x_p)_{i_p}$.
     ◮ Generalizes the rank-1 matrix $xy^\top$.
     ◮ Symmetric version: $x^{\otimes p} = x \otimes \cdots \otimes x$ ($p$ times).
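To make these definitions concrete, here is a minimal numpy sketch (my own illustration, not part of the slides) that builds a rank-1 tensor and checks that $x^{\otimes 3}$ is symmetric; all names and shapes are arbitrary choices.

```python
import numpy as np
from itertools import permutations

n = 4
rng = np.random.default_rng(0)

# Rank-1 tensor: (x1 ⊗ x2 ⊗ x3)_{i,j,k} = (x1)_i (x2)_j (x3)_k.
x1, x2, x3 = rng.standard_normal((3, n))
T = np.einsum('i,j,k->ijk', x1, x2, x3)

# Symmetric power x^{⊗3}: invariant under every permutation of its indices.
x = rng.standard_normal(n)
S = np.einsum('i,j,k->ijk', x, x, x)
assert all(np.allclose(S, S.transpose(p)) for p in permutations(range(3)))
```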

  13. Tensor Problems
     Some statistical problems involving tensors:
     ◮ Tensor PCA / Spiked Tensor Model [RM’14, HSS’15]: observe $T = \lambda x^{\otimes p} + Z$, where
       ◮ $x \in \mathbb{R}^n$ is the planted “signal” (norm 1),
       ◮ $\lambda > 0$ is the signal-to-noise parameter,
       ◮ $Z$ is “noise” (a tensor with i.i.d. Gaussian entries).
     Goal: given $T$, recover $x$. “Recover a rank-1 tensor buried in noise.”
     ◮ Tensor Decomposition [AGJ’14, BKS’15, GM’15, HSSS’16, MSS’16]: observe $T = \sum_{i=1}^{r} x_i^{\otimes p}$, where the $\{x_i\}$ are random vectors:
       ◮ $x_i \sim \mathcal{N}(0, I_n)$.
     Goal: given $T$, recover $\{x_1, \dots, x_r\}$. “Recover the components of a rank-$r$ tensor.”
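As a concrete illustration of the spiked tensor model, here is a hedged numpy sketch of drawing one observation $T = \lambda x^{\otimes 3} + Z$; symmetrizing the noise is one common convention (an assumption on my part), and the parameter values are arbitrary.

```python
import numpy as np

n, lam = 50, 5.0
rng = np.random.default_rng(1)

# Planted signal x, normalized to norm 1.
x = rng.standard_normal(n)
x /= np.linalg.norm(x)

# I.i.d. Gaussian noise; averaging over all index permutations
# (one common convention) keeps the observation symmetric.
G = rng.standard_normal((n, n, n))
Z = sum(G.transpose(p) for p in
        [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]) / 6

# Observation: T = lam * x^{⊗3} + Z.
T = lam * np.einsum('i,j,k->ijk', x, x, x) + Z
```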

  16. Tensor Network Notation
     A graphical representation for tensors (used, e.g., in quantum physics).
     An order-$p$ tensor has $p$ “legs,” one for each index:
     [Diagram: a node $T$ with three legs labeled $i$, $j$, $k$] ⇔ $T = (T_{i,j,k})$
     Two (or more) tensors can be attached by contracting indices:
     [Diagram: nodes $T$ and $U$ joined along a shared leg $i$; free legs $a, c$ on $T$ and $b, d$ on $U$] ⇔ $B = (B_{a,b,c,d})$, where $B_{a,b,c,d} = \sum_i T_{a,c,i} U_{b,d,i}$
     Rule: sum over “fully connected” indices (in this case, $i$).
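The contraction rule maps directly onto np.einsum, which sums over any index that appears in the inputs but not in the output. A small sketch with illustrative shapes (my own, not from the talk):

```python
import numpy as np

n = 5
rng = np.random.default_rng(2)
T = rng.standard_normal((n, n, n))
U = rng.standard_normal((n, n, n))

# B_{a,b,c,d} = sum_i T_{a,c,i} U_{b,d,i}  (shared leg i is contracted)
B = np.einsum('aci,bdi->abcd', T, U)
assert B.shape == (n, n, n, n)
```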

  19. More Examples
     A bigger example:
     [Diagram: three copies of $T$ and a vector $u$; internal legs $i, j, k$; free legs $a, b, c, d$] ⇔ $B = (B_{a,b,c,d})$, where $B_{a,b,c,d} = \sum_{i,j,k} T_{a,c,j} T_{b,d,k} T_{i,j,k} u_i$
     This framework generalizes matrix/vector multiplication:
     [Diagram: the chain $x - A - B - y$] ⇔ $x^\top A B y = \sum_{i,j,k} x_i A_{ij} B_{jk} y_k$
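Both diagrams above can be written as one-line einsums; a minimal sketch (variable names are mine):

```python
import numpy as np

n = 4
rng = np.random.default_rng(3)
T = rng.standard_normal((n, n, n))
u = rng.standard_normal(n)

# B_{a,b,c,d} = sum_{i,j,k} T_{a,c,j} T_{b,d,k} T_{i,j,k} u_i
B = np.einsum('acj,bdk,ijk,i->abcd', T, T, T, u)

# Matrix/vector multiplication as the chain x - A - B - y:
x, y = rng.standard_normal(n), rng.standard_normal(n)
A, Bm = rng.standard_normal((n, n)), rng.standard_normal((n, n))
assert np.isclose(np.einsum('i,ij,jk,k->', x, A, Bm, y), x @ A @ Bm @ y)
```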

  20. II. Spectral Methods from Tensor Networks

  21. Spectral Methods from Tensor Networks
     General framework for solving tensor problems:
     1. Given the input tensor $T$,
     2. build a new tensor $B$ by connecting copies of $T$ in a tensor network,
     3. flatten $B$ to form a symmetric matrix $M$,
        ◮ e.g., the $(\{a,b\},\{c,d\})$-flattening of $B = (B_{a,b,c,d})$ is the $n^2 \times n^2$ matrix $M_{(a,b),(c,d)} = B_{a,b,c,d}$,
     4. compute the leading eigenvector of $M$.
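Here is a minimal end-to-end sketch of steps 2-4 for an order-3 symmetric input, using the simple two-copy network that reappears in the trace-method example later; the choice of network is mine, for illustration only.

```python
import numpy as np

def spectral_method(T):
    """Steps 2-4 of the framework for one illustrative network."""
    n = T.shape[0]
    # Step 2: connect two copies of T along a shared index i:
    #   B_{a,b,c,d} = sum_i T_{a,c,i} T_{b,d,i}
    B = np.einsum('aci,bdi->abcd', T, T)
    # Step 3: the ({a,b},{c,d})-flattening is the n^2 x n^2 matrix
    #   M_{(a,b),(c,d)} = B_{a,b,c,d}; M is symmetric when T is.
    M = B.reshape(n * n, n * n)
    # Step 4: leading eigenvector (largest eigenvalue in magnitude).
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argmax(np.abs(vals))]
```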

  22. Prior Work
     Prior work has (implicitly) used this framework:
     [Diagrams: the tensor networks behind each method, built from copies of $T$ (and a random vector $u$), with free legs $a, b, c, d$]
     ◮ [Richard–Montanari’14, Hopkins–Shi–Steurer’15]: “tensor unfolding”
     ◮ [Hopkins–Shi–Steurer’15]: “spectral SoS”
     ◮ [Hopkins–Schramm–Shi–Steurer’16]: “spectral SoS with partial trace”
     ◮ [Hopkins–Schramm–Shi–Steurer’16]: “spectral tensor decomposition”
     Here $u$ is a random vector (used to break symmetry).

  24. Our Contribution
     We give the first polynomial-time algorithm for a particular tensor problem: heterogeneous continuous multi-reference alignment. The algorithm is a spectral method based on this tensor network:
     [Diagram: a larger network of copies of $T$ together with a vector $u$, with free legs $a, b, c, d$]
     Smaller tensor networks fail for this problem.

  27. General Analysis of Tensor Networks
     The main step of the analysis is to upper bound the largest eigenvalue of a matrix built from a tensor network.
     Trace moment method: for a symmetric matrix $M$ with eigenvalues $\{\lambda_i\}$ and $\lambda_{\max} = \max_i |\lambda_i|$,
     $$\mathrm{Tr}(M^{2k}) = \sum_i \lambda_i^{2k} \ge \lambda_{\max}^{2k},$$
     so compute $\mathbb{E}[\mathrm{Tr}(M^{2k})]$ and apply Markov’s inequality:
     $$\mathbb{P}(\lambda_{\max} \ge t) = \mathbb{P}(\lambda_{\max}^{2k} \ge t^{2k}) \le \frac{\mathbb{E}[\mathrm{Tr}(M^{2k})]}{t^{2k}}.$$
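A quick numeric sanity check of the inequality $\mathrm{Tr}(M^{2k}) \ge \lambda_{\max}^{2k}$ on an arbitrary symmetric matrix (a sketch of my own, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 6))
M = (A + A.T) / 2          # arbitrary symmetric matrix

k = 3
lam_max = np.max(np.abs(np.linalg.eigvalsh(M)))
# Tr(M^{2k}) = sum_i lambda_i^{2k} >= lambda_max^{2k}
assert np.trace(np.linalg.matrix_power(M, 2 * k)) >= lam_max ** (2 * k)
```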

  29. Trace Method for Tensor Networks
     Example: $T$ is an order-3 symmetric tensor with i.i.d. Rademacher (uniform $\pm 1$) entries, and we want to compute $\mathbb{E}[\mathrm{Tr}(M^6)]$, where $M$ is the $(\{a,b\},\{c,d\})$-flattening of this tensor:
     [Diagram: two copies of $T$ joined along one leg, with free legs $a, b$ on top and $c, d$ on the bottom]
     Note that $\mathrm{Tr}(M^6)$ is itself a tensor network: six copies of $M$ contracted in a cycle. So plug in the definition of $M$...

  32. Trace Method for Tensor Networks (Continued)
     [Diagram: the six-$M$ cycle with each $M$ expanded into its two copies of $T$, giving a ring of twelve $T$ nodes; one triple of edge labels $i, j, k$ is marked]
     So the computation of $\mathbb{E}[\mathrm{Tr}(M^6)]$ is reduced to a combinatorial question about this diagram. When $T$ is i.i.d. Rademacher, $\mathbb{E}[\mathrm{Tr}(M^6)]$ is the number of ways to label the edges of the diagram with elements of $[n]$ such that each triple $\{i, j, k\}$ appears incident to an even number of $T$'s.
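For tiny $n$, this expectation can also be estimated by direct Monte Carlo, which is a useful check on the combinatorial count; this sketch, including the convention for sampling a symmetric Rademacher tensor, is my own.

```python
import numpy as np
from itertools import permutations

def symmetric_rademacher(n, rng):
    """Symmetric order-3 tensor with ±1 entries: independent signs on
    sorted index triples, copied to all permutations (one natural
    convention, assumed here)."""
    T = np.zeros((n, n, n))
    for i in range(n):
        for j in range(i, n):
            for k in range(j, n):
                v = rng.choice([-1.0, 1.0])
                for p in set(permutations((i, j, k))):
                    T[p] = v
    return T

rng = np.random.default_rng(5)
n, trials = 3, 1000
total = 0.0
for _ in range(trials):
    T = symmetric_rademacher(n, rng)
    # M is the ({a,b},{c,d})-flattening of the two-copy network.
    M = np.einsum('aci,bdi->abcd', T, T).reshape(n * n, n * n)
    total += np.trace(np.linalg.matrix_power(M, 6))
print(total / trials)  # Monte Carlo estimate of E[Tr(M^6)]
```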

  33. III. Orbit Recovery Problems

  35. Image Alignment
     Given many noisy rotated copies of an image, recover the image. (Image credit: [Bandeira, PhD thesis ’15].)
     Application: cryo-EM (cryo-electron microscopy).
     ◮ Given many noisy pictures of a molecule taken from different unknown angles, recover the 3D structure of the molecule.

  37. Orbit Recovery
     Orbit Recovery Problem [APS17, BRW17, PWBRS17, BBKPWW17, APS18]:
     ◮ Let $x \in \mathbb{R}^n$ be an unknown “signal” (e.g., the image).
