

  1. Tensor Decompositions: Models, Applications, Algorithms, Uniqueness
Dimitri Nion, Post-Doc fellow, KU Leuven, Kortrijk, Belgium
E-mail: Dimitri.Nion@kuleuven-kortrijk.be
Homepage: http://perso-etis.ensea.fr/~nion/
I3S, Sophia-Antipolis, December 11th, 2008

  2. Preliminary: Tensor Decompositions
Q: What are they? A: Powerful multilinear algebra tools that generalize matrix decompositions.
Q: Where are they useful? A: An increasing number of applications involve the manipulation of multi-way data, rather than 2-way data.
Q: How powerful are they compared to matrix decompositions? A: Uniqueness properties + better exploitation of the multi-dimensional nature of the data.
Key research axes:
- Development of new models/decompositions
- Development of algorithms to compute decompositions
- Uniqueness bounds of tensor decompositions
- New applications, or existing applications where the multi-way nature of the data was ignored until now

  3. Roadmap
I. Introduction
II. A few Tensor Decompositions: PARAFAC, HOSVD/Tucker, Block Decompositions
III. Algorithms to compute Tensor Decompositions
IV. Applications
V. Conclusion and Future Research

  4. I. Introduction: What is a tensor?
A tensor of order N is an array with N dimensions; for N > 2 one speaks of "higher-order tensors".
- y = 1st-order tensor (a vector)
- Y = 2nd-order tensor (a matrix)
- A three-way array is a 3rd-order tensor
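To make the orders concrete, here is a minimal numpy sketch (the sizes and names are purely illustrative): a tensor of order N is simply an N-dimensional array.

```python
import numpy as np

y = np.random.randn(5)         # 1st-order tensor: a vector
Y = np.random.randn(5, 4)      # 2nd-order tensor: a matrix
T = np.random.randn(5, 4, 3)   # 3rd-order tensor: an I x J x K array
print(y.ndim, Y.ndim, T.ndim)  # 1 2 3
```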

  5. I. Introduction: Multi-Way Processing, why?
General motivation for using tensor signal representation and processing: "If a signal is multi-dimensional by nature, then its tensor representation allows one to use multilinear algebra tools, which are more powerful than linear algebra tools."
Many signals are tensors:
- An (R,G,B) image can be represented as a tensor
- A video sequence is a tensor of consecutive frames
- Multi-variate signals, varying e.g. with time, temperature, illumination, sensor positions, etc.

  6. I. Introduction: Tensor models, an increasing number of applications
Various disciplines:
- Phonetics
- Psychometrics
- Chemometrics (spectroscopy, chromatography)
- Image and video compression and analysis
- Scientific programming
- Sensor analysis
- Multi-Way Principal Component Analysis (PCA)
- Blind Source Separation and Independent Component Analysis (ICA)
- Telecommunications (wireless communications)

  7. I. Introduction: Multi-Way Data
A set of K matrices of size I x J, i.e. one I x J matrix observed K times (K indexing e.g. time, or the number of sensors), forms a 3-way tensor ("third-order tensor"). With multiple variables this extends to N-way tensors.
How to perform Multi-Way Analysis?
- Via tensor-algebra tools (= multilinear algebra tools)
- Matrix tools (SVD, EVD, QR, LU) have to be generalized → Tensor Decompositions
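A minimal numpy sketch of this construction (sizes arbitrary): K observed matrices stacked into a third-order tensor.

```python
import numpy as np

I, J, K = 6, 5, 10                    # one I x J matrix observed K times
slices = [np.random.randn(I, J) for _ in range(K)]
T = np.stack(slices, axis=2)          # third-order tensor, shape (I, J, K)
print(T.shape)                        # (6, 5, 10)
```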

  8. I. Introduction: Tensor Unfolding ("matricization")
A third-order tensor $\mathcal{Y}$ ($I \times J \times K$) can be flattened into a matrix in three ways, by concatenating its slices:
$Y_{I \times KJ} = [\,Y_1 \; \cdots \; Y_K\,]$ (frontal slices side by side), and similarly $Y_{J \times IK}$ and $Y_{K \times JI}$ along the other two modes.
Multi-Way Analysis?
- One can choose one matrix representation of $\mathcal{Y}$ and apply matrix tools (e.g. the matrix SVD for Principal Component Analysis (PCA))
- Problem: the multi-way structure is then ignored
- Feature of N-way analysis: exploit the N matrices simultaneously
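A sketch of mode-n unfolding in numpy. The exact column ordering of an unfolding is a convention; the version below (move the unfolded mode to the front, flatten the rest) produces matrices of the sizes shown on the slide but may order the columns differently.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: bring axis `mode` first, then flatten the remaining axes."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

T = np.random.randn(4, 5, 6)     # I x J x K
print(unfold(T, 0).shape)        # (4, 30): I x KJ
print(unfold(T, 1).shape)        # (5, 24): J x IK
print(unfold(T, 2).shape)        # (6, 20): K x JI
```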

  9. Roadmap
I. Introduction
II. A few Tensor Decompositions: PARAFAC, HOSVD/Tucker, Block Decompositions
III. Algorithms to compute Tensor Decompositions
IV. Applications
V. Conclusion and Future Research

  10. II. Tensor Decompositions: Matrix Singular Value Decomposition (SVD)
$Y = U S V^H$, with $U$ ($I \times R$) and $V$ ($J \times R$) unitary ($U^H U = V^H V = I$) and $S = \mathrm{diag}(\sigma_1, \ldots, \sigma_R)$, the singular values sorted in decreasing order.
If rank(Y) > R, this truncated SVD is the best rank-R approximation of Y.
In general a matrix factorization $Y = U V^H$ is not unique: $Y = U V^H = (U P)(P^{-1} V^H)$.
The SVD is unique because of the unitary constraints on U and V and the ordering constraint on the singular values in S.
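The truncated SVD and its Frobenius-optimal rank-R property, sketched in numpy (sizes arbitrary):

```python
import numpy as np

Y = np.random.randn(8, 6)
U, s, Vh = np.linalg.svd(Y, full_matrices=False)  # Y = U @ diag(s) @ Vh
R = 3
Y_R = U[:, :R] @ np.diag(s[:R]) @ Vh[:R, :]       # best rank-R approximation of Y
# The Frobenius error equals the energy of the discarded singular values:
print(np.linalg.norm(Y - Y_R), np.sqrt(np.sum(s[R:] ** 2)))
```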

  11. II. Tensor Decompositions: Tucker-3 Decomposition [Tucker 1966]
$y_{ijk} = \sum_{l=1}^{L} \sum_{m=1}^{M} \sum_{n=1}^{N} a_{il}\, b_{jm}\, c_{kn}\, h_{lmn}$, i.e. $\mathcal{Y} = \mathcal{H} \times_1 A \times_2 B \times_3 C$.
- Tucker-3 = 3-way PCA, with one basis (A, B, C) per mode (Tucker-1, Tucker-2, ..., Tucker-N are possible).
- If A, B, C are unitary matrices, Tucker = HOSVD ("Higher-Order Singular Value Decomposition").
- $\mathcal{H}$ is the representation of $\mathcal{Y}$ in the reduced spaces.
- The number of principal components may be different in the three modes, i.e. $L \neq M \neq N$ is allowed and $\mathcal{H}$ is not diagonal (a difference with the matrix SVD).
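A minimal sketch of the Tucker-3 synthesis formula in numpy, via the mode-n product (dimensions arbitrary):

```python
import numpy as np

def mode_n_product(T, U, mode):
    """Mode-n product T x_n U: contract the columns of U with axis `mode` of T."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, mode)), 0, mode)

I, J, K, L, M, N = 6, 5, 4, 2, 3, 2
H = np.random.randn(L, M, N)                          # core tensor
A, B, C = np.random.randn(I, L), np.random.randn(J, M), np.random.randn(K, N)
Y = mode_n_product(mode_n_product(mode_n_product(H, A, 0), B, 1), C, 2)
print(Y.shape)                                        # (6, 5, 4)
```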

  12. II. Tensor Decompositions: Uniqueness of the Tucker-3 Decomposition
For any invertible $P_1$, $P_2$, $P_3$:
$\mathcal{Y} = \mathcal{H} \times_1 A \times_2 B \times_3 C = \big(\mathcal{H} \times_1 P_1^{-1} \times_2 P_2^{-1} \times_3 P_3^{-1}\big) \times_1 (A P_1) \times_2 (B P_2) \times_3 (C P_3)$,
which yields a new core tensor.
- Tucker is not unique: rotational freedom in each mode.
- A, B, C are not unique (only subspace estimates).
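A quick numerical illustration of this rotational freedom (self-contained sketch, sizes arbitrary): absorbing $P^{-1}$ into the core and $P$ into A leaves the tensor unchanged.

```python
import numpy as np

def mode_n_product(T, U, mode):
    return np.moveaxis(np.tensordot(U, T, axes=(1, mode)), 0, mode)

def tucker(H, A, B, C):
    return mode_n_product(mode_n_product(mode_n_product(H, A, 0), B, 1), C, 2)

H = np.random.randn(2, 3, 2)
A, B, C = np.random.randn(6, 2), np.random.randn(5, 3), np.random.randn(4, 2)
P = np.random.randn(2, 2)                    # any invertible matrix
H2 = mode_n_product(H, np.linalg.inv(P), 0)  # new core: H x_1 P^-1
print(np.allclose(tucker(H, A, B, C), tucker(H2, A @ P, B, C)))  # True
```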

  13. The best rank-(L,M,N) approximation [De Lathauwer, 2000]
Matrix case: the truncated SVD $Y_1 = U S V^H$ is the best lower-rank approximation of $Y$ in the Frobenius-norm sense: $\min \|Y - Y_1\|_F$ s.t. $Y_1$ is rank-R.
Question: is the truncated HOSVD the best rank-(L,M,N) approximation of $\mathcal{Y}$, i.e. the minimizer of $\|\mathcal{Y} - \mathcal{H} \times_1 A \times_2 B \times_3 C\|_F$? NO.
The truncated HOSVD is only a good rank-(L,M,N) approximation of $\mathcal{Y}$. To find the best one, one usually starts with the truncated HOSVD (initialization) and then alternates updates of the 3 subspace matrices A, B and C.
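One common way to implement this alternating scheme is the higher-order orthogonal iteration (HOOI); a minimal numpy sketch with truncated-HOSVD initialization (a fixed iteration count stands in for a proper convergence test, and the function name is illustrative):

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hooi(Y, ranks, n_iter=20):
    """Alternating subspace updates for the rank-(L,M,N) approximation."""
    # Truncated HOSVD initialization: leading left singular vectors per mode.
    U = [np.linalg.svd(unfold(Y, n))[0][:, :r] for n, r in enumerate(ranks)]
    for _ in range(n_iter):
        for n in range(3):
            # Project Y onto the other two subspaces, then refresh mode n.
            Z = Y
            for m in range(3):
                if m != n:
                    Z = np.moveaxis(np.tensordot(U[m].T, Z, axes=(1, m)), 0, m)
            U[n] = np.linalg.svd(unfold(Z, n))[0][:, :ranks[n]]
    core = Y
    for m in range(3):
        core = np.moveaxis(np.tensordot(U[m].T, core, axes=(1, m)), 0, m)
    return core, U

Y = np.random.randn(6, 5, 4)
core, (A, B, C) = hooi(Y, (2, 3, 2))
print(core.shape)          # (2, 3, 2)
```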

  14. II. Tensor Decompositions: PARAFAC Decomposition [Harshman 1970]
PARAFAC is the case where the core $\mathcal{H}$ ($R \times R \times R$) is diagonal ($h_{ijk} = 1$ if $i = j = k$, else $h_{ijk} = 0$), so $\mathcal{Y}$ is a sum of R rank-1 tensors:
$\mathcal{Y} = \sum_{r=1}^{R} a_r \circ b_r \circ c_r$
Equivalently, $\mathcal{Y}$ is the set of K matrix slices of the form $\mathcal{Y}(:,:,k) = A\, \mathrm{diag}(C(k,:))\, B^T$.
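A numpy sketch checking that the two forms agree (sum of rank-1 terms vs. the slice-wise formula); sizes arbitrary:

```python
import numpy as np

I, J, K, R = 6, 5, 4, 3
A, B, C = np.random.randn(I, R), np.random.randn(J, R), np.random.randn(K, R)

# Sum of R rank-1 terms: Y = sum_r a_r o b_r o c_r
Y = np.einsum('ir,jr,kr->ijk', A, B, C)

# Equivalent slice-wise form: Y(:,:,k) = A @ diag(C[k,:]) @ B.T
Y_slices = np.stack([A @ np.diag(C[k]) @ B.T for k in range(K)], axis=2)
print(np.allclose(Y, Y_slices))   # True
```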

  15. II. Tensor Decompositions: Uniqueness of the PARAFAC Decomposition (1)
If (A, B, C) fits the tensor, so does $(A \Pi D_1,\; B \Pi D_2,\; C \Pi D_3)$, with $\Pi$ a permutation matrix and $D_1, D_2, D_3$ diagonal scaling matrices satisfying $D_1 D_2 D_3 = I_R$.
- Under mild conditions (next slide) PARAFAC is unique: only these trivial ambiguities remain on A, B and C (permutation and scaling of columns).
- The PARAFAC decomposition gives the true matrices A, B and C (up to the trivial ambiguities); this is a key feature compared to the matrix SVD (which gives only subspaces).

  16. II. Tensor Decompositions: Uniqueness of the PARAFAC Decomposition (2)
Uniqueness condition [Kruskal, 1977]:
$k_A + k_B + k_C \ge 2R + 2$   (1)
where $k_A$ is the Kruskal-rank of A. Generically $k_A = \min(I, R)$, so (1) becomes
$\min(I,R) + \min(J,R) + \min(K,R) \ge 2(R+1)$   (2)
Relaxed bound, real and complex cases [De Lathauwer 2005]:
$J \ge R$ and $\dfrac{I(I-1)}{2} \cdot \dfrac{K(K-1)}{2} \ge \dfrac{R(R-1)}{2}$   (3)
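Since the generic condition (2) involves only the dimensions, it can be checked directly; a minimal sketch (the function name is illustrative, and note the condition is sufficient, not necessary):

```python
def parafac_generically_unique(I, J, K, R):
    """Generic form of Kruskal's sufficient condition: k_A = min(I, R), etc."""
    return min(I, R) + min(J, R) + min(K, R) >= 2 * R + 2

print(parafac_generically_unique(6, 5, 4, 3))  # True: essentially unique
print(parafac_generically_unique(2, 2, 2, 4))  # False: (2) gives no guarantee
```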

  17. II. Tensor Decompositions: PARAFAC vs Tucker-3
PARAFAC: $y_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}$
- The core $\mathcal{H}$ is diagonal.
- A, B and C have the same number of columns (L = M = N = R).
- Unique up to trivial ambiguities: only arbitrary scaling and permutation of columns remain.
Tucker-3: $y_{ijk} = \sum_{l=1}^{L} \sum_{m=1}^{M} \sum_{n=1}^{N} a_{il}\, b_{jm}\, c_{kn}\, h_{lmn}$
- The core $\mathcal{H}$ is not diagonal.
- A, B and C do not necessarily have the same number of columns ($L \neq M \neq N$ allowed).
- Not unique: rotational freedom remains.

  18. II. Tensor Decompositions: Block Component Decomposition in rank-(L_r,L_r,1) terms
BCD-(L_r,L_r,1): $\mathcal{Y} = \sum_{r=1}^{R} (A_r B_r^T) \circ c_r$, where $A_r$ is $I \times L_r$, $B_r$ is $J \times L_r$ and $c_r$ is the r-th column of C ($K \times R$).
- First generalization of PARAFAC in block terms [De Lathauwer, de Baynast, 2003].
- If $L_r = 1$ for all r, then BCD-(L_r,L_r,1) = PARAFAC.
- Unknown matrices: $A = [A_1 \cdots A_R]$, $B = [B_1 \cdots B_R]$, $C = [c_1 \cdots c_R]$.
- BCD-(L_r,L_r,1) is said to be unique if the only remaining ambiguities are: arbitrary permutation of the blocks in A and B and of the columns of C; rotational freedom within each block (block-wise subspace estimation), plus a scaling ambiguity on the columns of C.
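A minimal numpy sketch of the BCD-(L_r,L_r,1) synthesis, with two blocks of different sizes (all values random, purely illustrative):

```python
import numpy as np

I, J, K = 8, 7, 5
Ls = [2, 3]                                 # block sizes L_1, ..., L_R (here R = 2)
A = [np.random.randn(I, L) for L in Ls]     # blocks A_r (I x L_r)
B = [np.random.randn(J, L) for L in Ls]     # blocks B_r (J x L_r)
c = [np.random.randn(K) for _ in Ls]        # columns c_r of C

# Y = sum_r (A_r @ B_r.T) o c_r : each term has rank (L_r, L_r, 1)
Y = sum(np.einsum('ij,k->ijk', Ar @ Br.T, cr) for Ar, Br, cr in zip(A, B, c))
print(Y.shape)                              # (8, 7, 5)
```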

  19. II. Tensor Decompositions: Uniqueness of the BCD-(L,L,1) (i.e., $L_1 = L_2 = \ldots = L_R = L$)
Sufficient bound 1 [De Lathauwer, SIMAX 2008]:
$\min(\lfloor I/L \rfloor, R) + \min(\lfloor J/L \rfloor, R) + \min(K, R) \ge 2(R+1)$ and $LR \le IJ$   (1)
Sufficient bound 2 [Nion, PhD Thesis, 2007]:
$C_I^{L+1} \cdot C_J^{L+1} \ge C_{R+L}^{L+1}$ and $R \le \min(IJ/L,\, K)$   (2)
where $C_n^k = \dfrac{n!}{k!\,(n-k)!}$.
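Bound 1 involves only the dimensions, so it can be checked directly; a minimal sketch, assuming the form of bound 1 as stated above (the function name is illustrative):

```python
def bcd_lll1_bound1(I, J, K, L, R):
    """Sufficient bound 1 for BCD-(L,L,1) uniqueness, as stated above."""
    return (min(I // L, R) + min(J // L, R) + min(K, R) >= 2 * (R + 1)
            and L * R <= I * J)

print(bcd_lll1_bound1(I=12, J=12, K=8, L=2, R=4))  # True: bound satisfied
print(bcd_lll1_bound1(I=4, J=4, K=2, L=2, R=4))    # False: no guarantee
```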
