

1. Tensor estimation with structured priors
Clément Luneau, Nicolas Macris
June 29, 2020
Laboratoire de Théorie des Communications, EPFL, Switzerland

2. Statistical model for tensor estimation
Noisy observations of a symmetric rank-one tensor.

Matrix case:
$$\forall\, 1 \le i \le j \le n: \quad Y_{ij} = \sqrt{\tfrac{\lambda}{n}}\, X_i X_j + Z_{ij} \;\Leftrightarrow\; Y = \sqrt{\tfrac{\lambda}{n}}\, XX^{T} + Z$$

Tensor case:
$$\forall\, 1 \le i \le j \le k \le n: \quad Y_{ijk} = \sqrt{\tfrac{\lambda}{n}}\, X_i X_j X_k + Z_{ijk} \;\Leftrightarrow\; Y = \sqrt{\tfrac{\lambda}{n}}\, X^{\otimes 3} + Z$$

• n-dimensional spike X ∈ ℝⁿ
• additive white Gaussian noise $Z_{ij(k)} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,1)$
• λ > 0, proportional to the signal-to-noise ratio

Goal: estimate the spike X and/or the underlying rank-one tensor X^⊗3.
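As a concrete illustration (not part of the original slides), here is a minimal NumPy sketch of the order-3 observation model, assuming a standard Gaussian prior on the spike; sizes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam = 30, 2.0                                 # illustrative size; lam plays the role of lambda

X = rng.standard_normal(n)                       # spike; swap in any prior P_X
Z = rng.standard_normal((n, n, n))               # Gaussian noise, used only where i <= j <= k
signal = np.sqrt(lam / n) * np.einsum('i,j,k->ijk', X, X, X)   # sqrt(lam/n) * X^{outer 3}

# Observations Y_ijk = sqrt(lam/n) X_i X_j X_k + Z_ijk for 1 <= i <= j <= k <= n
i, j, k = np.indices((n, n, n))
mask = (i <= j) & (j <= k)
Y = np.where(mask, signal + Z, 0.0)
```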

3. High-dimensional regime for i.i.d. prior
i.i.d. prior on the spike: $X_1, X_2, \ldots, X_n \overset{\text{i.i.d.}}{\sim} P_X$.
• Precise formula [1] for $\mathrm{MMSE} := \frac{1}{n}\,\mathbb{E}\,\lVert X - \mathbb{E}[X \mid Y]\rVert^2$ when n → +∞
• Performance of the Approximate Message Passing algorithm precisely tracked [2]

4. Algorithmic gap for low-sparsity prior
Bernoulli-Rademacher prior [3]:
$$P_X(1) = P_X(-1) = \rho/2, \qquad P_X(0) = 1 - \rho$$
Algorithmic gap even for matrix estimation if the sparsity ρ is low (below ρ = 0.05).

5. Structured prior
Data in nature has structure:
• A high-dimensional signal effectively lies on a low-dimensional manifold.
• Compressed sensing: the signal to estimate is sparse in some domain.

Recently [3], generative models have been used to encode structure:
$$X_i := \varphi\Big(\frac{(WS)_i}{\sqrt{p}}\Big)$$
• S: p-dimensional latent vector, $S_1, \ldots, S_p \overset{\text{i.i.d.}}{\sim} P_S$
• W: sensing matrix, $W_{ij} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,1)$
• φ: (nonlinear) activation function
Proposed by Aubin et al. [3] in the context of matrix estimation.
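A minimal sketch (added here for illustration) of sampling such a structured spike, assuming a Gaussian latent prior P_S and tanh as the nonlinear activation; both choices are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 600, 200                         # alpha = n / p = 3
S = rng.standard_normal(p)              # latent vector, here P_S = N(0, 1)
W = rng.standard_normal((n, p))         # sensing matrix with i.i.d. N(0, 1) entries
phi = np.tanh                           # illustrative nonlinear activation
X = phi(W @ S / np.sqrt(p))             # structured spike: X_i = phi((WS)_i / sqrt(p))
```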

6. Matrix estimation with generative priors
"No algorithmic gap with generative-model priors" [3]
High-dimensional limit n → +∞ with fixed ratio α := n/p.

Figure 1: MMSE as a function of Δ = 1/λ for linear (left), sign (centre) and ReLU (right) activations. Figure by Aubin et al. [3].

7. Tensor estimation with generative priors
Can we leverage generative priors in tensor estimation to obtain a finite algorithmic gap for a centered prior?

In this talk:
1. Formulas for the asymptotic mutual information and MMSE
2. Visualization of MMSE(X^⊗3) in different settings
3. Limit α := n/p → 0: a simplified equivalent model with i.i.d. prior

8. Asymptotic normalized mutual information
Theorem (asymptotic normalized mutual information) [4]. Consider
$$\forall\, 1 \le i \le j \le k \le n: \quad Y_{ijk} = \sqrt{\tfrac{\lambda}{n}}\, X_i X_j X_k + Z_{ijk} \quad \text{with} \quad \forall i: X_i := \varphi\Big(\frac{(WS)_i}{\sqrt{p}}\Big).$$
Then
$$\lim_{\substack{n \to +\infty \\ n/p \to \alpha}} \frac{I(X; Y \mid W)}{n} = \inf_{q_x \in [0,\rho_x]} \inf_{q_s \in [0,\rho_s]} \sup_{r_s \ge 0} \psi_{\lambda,\alpha}(q_x, q_s, r_s)$$
with potential function
$$\psi_{\lambda,\alpha}(q_x, q_s, r_s) := I\Big(U;\ \sqrt{\tfrac{\lambda q_x^2}{2}}\,\varphi\big(\sqrt{\rho_s - q_s}\,U + \sqrt{q_s}\,V\big) + \tilde{Z} \,\Big|\, V\Big) + \frac{1}{\alpha}\, I\big(S;\ \sqrt{r_s}\,S + Z\big) - \frac{r_s(\rho_s - q_s)}{2\alpha} + \frac{\lambda}{12}(\rho_x - q_x)^2(\rho_x + 2 q_x)$$
where $S \sim P_S$; $U, V, Z, \tilde{Z}$ are i.i.d. $\mathcal{N}(0,1)$; $\rho_s := \mathbb{E}\,S^2$ and $\rho_x := \mathbb{E}\,\varphi(\sqrt{\rho_s}\,U)^2$.
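The constants ρ_s and ρ_x entering the potential are plain scalar expectations, so they are easy to evaluate numerically. A quick Monte Carlo check (illustrative only, with tanh as an assumed activation):

```python
import numpy as np

rng = np.random.default_rng(0)
phi = np.tanh                                    # assumed activation, for illustration
rho_s = 1.0                                      # rho_s := E S^2, equal to 1 for P_S = N(0, 1)
U = rng.standard_normal(1_000_000)
rho_x = np.mean(phi(np.sqrt(rho_s) * U) ** 2)    # rho_x := E phi(sqrt(rho_s) U)^2
print(rho_x)
```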

9. Minimum mean square error
Theorem (asymptotic tensor MMSE) [4]. Define the set of minimizers
$$Q_x^*(\lambda) := \Big\{ q_x^* \in [0,\rho_x] : \inf_{q_s \in [0,\rho_s]} \sup_{r_s \ge 0} \psi_{\lambda,\alpha}(q_x^*, q_s, r_s) = \inf_{q_x \in [0,\rho_x]} \inf_{q_s \in [0,\rho_s]} \sup_{r_s \ge 0} \psi_{\lambda,\alpha}(q_x, q_s, r_s) \Big\}.$$
For almost every λ > 0, $Q_x^*(\lambda) = \{q_x^*(\lambda)\}$ is a singleton and
$$\lim_{\substack{n \to +\infty \\ n/p \to \alpha}} \frac{\mathbb{E}\,\big\lVert X^{\otimes 3} - \mathbb{E}[X^{\otimes 3} \mid Y, W] \big\rVert^2}{n^3} = \rho_x^3 - q_x^*(\lambda)^3.$$

10. Algorithmic gap
Critical point equation $\nabla \psi_{\lambda,\alpha}(q_x, q_s, r_s) = 0$ ⇕ fixed point equation $(q_x, q_s, r_s) = F_{\lambda,\alpha}(q_x, q_s, r_s)$.
• The fixed point with the lowest potential $\psi_{\lambda,\alpha}(q_x, q_s, r_s)$ is used to compute the asymptotic MMSE (see the iteration sketch below).
• Uninformative fixed point $q_x = 0$ iff φ is an odd function and $P_S$ is centered.
Strongly stable uninformative fixed point ⇒ infinite algorithmic gap persists.
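In practice the extremizers are found by iterating the fixed-point equation. The map $F_{\lambda,\alpha}$ is not spelled out on the slide, so the sketch below only shows the generic damped iteration one would wrap around it; `F` is a placeholder argument.

```python
import numpy as np

def iterate_fixed_point(F, q0, damping=0.5, tol=1e-9, max_iter=10_000):
    """Damped iteration q <- (1 - damping) q + damping F(q) until q = F(q).

    F stands in for the (unspecified here) map F_{lambda,alpha}. Running this
    from several initializations and keeping the fixed point with the lowest
    potential psi gives the asymptotic MMSE."""
    q = np.asarray(q0, dtype=float)
    for _ in range(max_iter):
        q_next = (1 - damping) * q + damping * np.asarray(F(q), dtype=float)
        if np.max(np.abs(q_next - q)) < tol:
            return q_next
        q = q_next
    return q
```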

11. Asymptotic MMSE in the plane (α, λ)
Figure 2: Asymptotic MMSE(X^⊗3) as a function of (α, λ) for φ(x) = x. Left: $P_S \sim \mathcal{N}(0,1)$. Right: $P_S \sim \frac{1}{2}(\delta_1 + \delta_{-1})$.
The information-theoretic threshold $\lambda_{\mathrm{IT}}$ decreases with the ratio α of signal-to-latent space dimensions.

12. Asymptotic MMSE
Figure 3: Asymptotic MMSE(X^⊗3) as a function of λ for φ(x) = sign(x), $P_S \sim \mathcal{N}(0,1)$ and different values of α. The limit α → 0⁺ is given by a tensor estimation problem with i.i.d. Rademacher prior.
The information-theoretic threshold $\lambda_{\mathrm{IT}}$ decreases with the ratio α of signal-to-latent space dimensions.

13. Limit of vanishing signal-to-latent space dimensions
Limit α → 0⁺ of the asymptotic mutual information:
$$\lim_{\alpha \to 0^+} \lim_{\substack{n \to +\infty \\ n/p \to \alpha}} \frac{I(X; Y \mid W)}{n} = \inf_{q_x \in [0,\rho_x]} \bigg\{ \frac{\lambda}{12}(\rho_x - q_x)^2(\rho_x + 2 q_x) + I\Big(U;\ \sqrt{\tfrac{\lambda q_x^2}{2}}\,\varphi\big(\sqrt{\rho_s - (\mathbb{E}S)^2}\,U + |\mathbb{E}S|\,V\big) + \tilde{Z} \,\Big|\, V\Big) \bigg\}$$
This is the same asymptotic mutual information as for the model
$$\tilde{Y}_{ijk} = \sqrt{\tfrac{\lambda}{n}}\, \tilde{X}_i \tilde{X}_j \tilde{X}_k + \tilde{Z}_{ijk}, \quad 1 \le i \le j \le k \le n,$$
with $\tilde{X}_i = \varphi\big(\sqrt{\rho_s - (\mathbb{E}S)^2}\,U_i + |\mathbb{E}S|\,V_i\big)$; $U, V$ i.i.d. $\sim \mathcal{N}(0, I_n)$; $V$ known.
• $\mathbb{E}_{S \sim P_S} S = 0$: i.i.d. prior $\tilde{X}_1, \ldots, \tilde{X}_n \overset{\text{i.i.d.}}{\sim} \varphi(\mathcal{N}(0, \rho_s))$
• $\mathbb{E}_{S \sim P_S} S \ne 0$: side information $V$; the proof in [1] is easily adapted
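A small sketch (added for illustration) of sampling the equivalent channel input, assuming the sign activation from the previous slide; for a centered P_S (E S = 0) the samples are exactly i.i.d. φ(N(0, ρ_s)), here Rademacher.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho_s, mean_s = 1000, 1.0, 0.0       # centered P_S: E S = 0, E S^2 = rho_s
phi = np.sign                           # activation from the sign example
U = rng.standard_normal(n)
V = rng.standard_normal(n)              # side information, only relevant when E S != 0
X_tilde = phi(np.sqrt(rho_s - mean_s**2) * U + abs(mean_s) * V)
# With mean_s = 0 this reduces to phi(N(0, rho_s)): an i.i.d. Rademacher prior.
```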

14. Limit of vanishing signal-to-latent space dimensions
Algorithmic gap for matrix estimation with generative prior:
1. Similar behavior for matrix estimation with generative priors.
2. We can choose φ to obtain any equivalent i.i.d. prior $\varphi(\mathcal{N}(0,\rho_s))$ when α → 0⁺, including a prior exhibiting an algorithmic gap.

Take $X = \varphi(WS/\sqrt{p})$ with $S_1, \ldots, S_p \overset{\text{i.i.d.}}{\sim} P_S$ centered with unit variance, and
$$\varphi(x) = \begin{cases} -1 & \text{if } x < -\epsilon \\ 0 & \text{if } -\epsilon < x < \epsilon \\ +1 & \text{if } x > \epsilon \end{cases}, \qquad \int_{-\infty}^{-\epsilon} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx = \frac{\rho}{2}.$$
This is equivalent to an i.i.d. Bernoulli-Rademacher prior when α → 0⁺:
$$\varphi(\mathcal{N}(0,\rho_s)) \sim (1 - \rho)\,\delta_0 + \frac{\rho}{2}\,\delta_1 + \frac{\rho}{2}\,\delta_{-1}.$$
"No algorithmic gap with generative-model priors" [3]?
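To make the construction concrete, here is an illustrative sketch (not from the slides) of the three-level activation, with the threshold ε obtained from the Gaussian quantile function, and an empirical check that φ(N(0, 1)) matches the Bernoulli-Rademacher law.

```python
import numpy as np
from scipy.stats import norm

rho = 0.05                                   # target sparsity
eps = -norm.ppf(rho / 2)                     # choose eps so that P(N(0,1) < -eps) = rho / 2

def phi(x):
    """Three-level activation: -1 below -eps, 0 on (-eps, eps), +1 above eps."""
    return np.sign(x) * (np.abs(x) > eps)

# Empirical check: phi(N(0,1)) ~ (1 - rho) delta_0 + (rho/2) delta_{+1} + (rho/2) delta_{-1}
rng = np.random.default_rng(0)
samples = phi(rng.standard_normal(1_000_000))
print({v: float(np.mean(samples == v)) for v in (-1.0, 0.0, 1.0)})
```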

15. Limit of vanishing signal-to-latent space dimensions
However, the regime α → 0⁺ does not correspond to a high-dimensional signal X lying on a lower-dimensional, p-dimensional space.
Does the algorithmic gap vanish when α increases?

16. References
[1] Lelarge, Marc and Léo Miolane. "Fundamental limits of symmetric low-rank matrix estimation". In: Probability Theory and Related Fields 173.3 (2019). ISSN: 1432-2064. DOI: 10.1007/s00440-018-0845-x.
[2] Lesieur, Thibault et al. "Statistical and computational phase transitions in spiked tensor estimation". In: 2017 IEEE International Symposium on Information Theory (ISIT) (2017). DOI: 10.1109/isit.2017.8006580.
[3] Aubin, Benjamin et al. "The spiked matrix model with generative priors". In: Advances in Neural Information Processing Systems 32. 2019, pp. 8366–8377.
[4] Luneau, Clément and Nicolas Macris. Tensor estimation with structured priors. 2020. arXiv: 2006.14989 [cs.IT].
