Multilinear Algebra and Tensor Decomposition
Qibin Zhao, Tensor Learning Unit, RIKEN AIP
2018-6-2 @ Waseda University
Self-Introduction • 2009, Ph.D. in Computer Science, Shanghai Jiao Tong University • 2009 - 2017, RIKEN Brain Science Institute • 2017 - Now, RIKEN AIP • Research Interests: • Brain computer interface, brain signal processing • Tensor decomposition and machine learning
Self-Introduction Brain computer interface
Self-Introduction
RECRUITMENT INFORMATION, MARCH 15, 2017 — RIKEN Center for Advanced Intelligence Project (AIP), http://www.riken.jp/en/research/labs/aip/
Tensor Learning Unit — Unit Leader: Dr. Qibin Zhao
Contact Information: Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan. Email: qibin.zhao@riken.jp
Research Field: Computer Science. Related Fields: Machine Learning, Computer Vision, Neuroscience.
Tensors are high-dimensional generalizations of vectors and matrices, which can provide a natural and scalable representation for multi-dimensional, multi-relational, or multi-aspect data with inherent structure and complex dependence. In our team, we investigate various tensor-based machine learning models, e.g., tensor decomposition, multilinear latent variable models, tensor regression and classification, tensor networks, deep tensor learning, and Bayesian tensor learning, with the aim of facilitating learning from high-dimensional structured data or large-scale latent parameter spaces. In addition, we develop scalable and efficient tensor learning algorithms supported by theoretical principles, with the goal of advancing existing machine learning approaches. Novel applications in computer vision and brain data analysis will also be exploited to provide new insights into tensor learning methods.
Research Subjects: Tensor Decomposition, Tensor Networks, Tensor Regression and Classification, Deep Tensor Learning, Bayesian Tensor Learning.
Opening Positions: (1) Postdoctoral Researcher (doctoral degree required); (2) Technical Staff (technical support for researchers); (3) Research Intern (Ph.D. students preferable).
We are seeking talented and creative researchers who are willing to solve challenging problems in machine learning. For research topics, please refer to the research subjects above. If you are interested in joining our team, please contact us at the address above.
Outline • Vector and linear algebra • Matrix and its decomposition • What is tensor? • Basic operations in tensor algebra • Classical tensor decomposition ✦ CP Decomposition ✦ Tucker Decomposition
Vectors
• We can think of vectors in two ways:
  - as points in a multidimensional space with respect to some coordinate system;
  - as translations of a point in a multidimensional space, e.g., a translation of the origin (0, 0).
• A vector in $\mathbb{R}^n$ is written $\mathbf{x} = (x_1, x_2, \ldots, x_n)$.
Dot Product or Scalar Product
• The dot product of two vectors is a scalar. Example in 2D: $s = \mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2$.
• In general, for $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$:
  $\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^{\top}\mathbf{y} = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$
• It is the projection of one vector onto another:
  $\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\|\,\|\mathbf{y}\| \cos\theta$, where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$.
Dot Product or Scalar Product (Properties)
• Commutative: $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$
• Distributive: $(\mathbf{x} + \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot \mathbf{z} + \mathbf{y} \cdot \mathbf{z}$
• Linearity: $(c\,\mathbf{x}) \cdot \mathbf{y} = c\,(\mathbf{x} \cdot \mathbf{y}) = \mathbf{x} \cdot (c\,\mathbf{y})$, and $(c_1\,\mathbf{x}) \cdot (c_2\,\mathbf{y}) = (c_1 c_2)\,(\mathbf{x} \cdot \mathbf{y})$
• Orthogonality: for $\mathbf{x} \neq \mathbf{0}$ and $\mathbf{y} \neq \mathbf{0}$, $\mathbf{x} \cdot \mathbf{y} = 0 \iff \mathbf{x} \perp \mathbf{y}$
Norms
• Euclidean norm (sometimes called 2-norm):
  $\|\mathbf{x}\|_2 = \sqrt{\mathbf{x} \cdot \mathbf{x}} = \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$
• The length of a vector is defined to be its (Euclidean) norm.
• A unit vector is of length 1.
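As a quick illustration (a minimal NumPy sketch, not part of the original slides), the dot product, norm, and angle formulas above can be computed directly:

```python
# Minimal NumPy sketch: dot product, Euclidean norm, and the angle between
# two vectors, matching the formulas above (synthetic example vectors).
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

s = np.dot(x, y)                   # s = x1*y1 + x2*y2 + ... + xn*yn
norm_x = np.linalg.norm(x)         # ||x||_2 = sqrt(x . x)
norm_y = np.linalg.norm(y)
cos_theta = s / (norm_x * norm_y)  # from x . y = ||x|| ||y|| cos(theta)

print(s, norm_x, cos_theta)
```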
Singular Value Decomposition
• A matrix $\mathbf{D} \in \mathbb{R}^{I_1 \times I_2}$ has a column space and a row space.
• The SVD orthogonalizes these spaces and decomposes the matrix as $\mathbf{D} = \mathbf{U}\mathbf{S}\mathbf{V}^{\top}$,
  where $\mathbf{U}$ contains the left singular vectors/eigenvectors, $\mathbf{V}$ contains the right singular vectors/eigenvectors, and $\mathbf{S}$ is diagonal with the singular values $\sigma_r$.
• Equivalently, $\mathbf{D}$ can be rewritten as a sum of a minimum number of rank-1 matrices:
  $\mathbf{D} = \sum_{r=1}^{R} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r$
Matrix SVD Properties
• Rank decomposition: the sum of the minimum number of rank-1 matrices,
  $\mathbf{D} = \sum_{r=1}^{R} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r = \sigma_1\,\mathbf{u}_1 \mathbf{v}_1^{\top} + \sigma_2\,\mathbf{u}_2 \mathbf{v}_2^{\top} + \cdots + \sigma_R\,\mathbf{u}_R \mathbf{v}_R^{\top}$
• Multilinear rank decomposition:
  $\mathbf{D} = \sum_{r_1=1}^{R_1} \sum_{r_2=1}^{R_2} s_{r_1 r_2}\, \mathbf{u}_{r_1} \circ \mathbf{v}_{r_2}$, i.e., $\mathbf{D} = \mathbf{U}\mathbf{S}\mathbf{V}^{\top}$
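A minimal NumPy sketch (illustrative, with a synthetic matrix) of the SVD and its expansion into rank-1 terms:

```python
# Illustrative sketch: compute the SVD of D and rebuild it as a sum of
# rank-1 outer products sigma_r * u_r v_r^T, as in the formulas above.
import numpy as np

D = np.random.randn(5, 4)
U, s, Vt = np.linalg.svd(D, full_matrices=False)   # D = U diag(s) V^T

# Full reconstruction from all R rank-1 terms.
D_rebuilt = sum(s[r] * np.outer(U[:, r], Vt[r, :]) for r in range(len(s)))
print(np.allclose(D, D_rebuilt))                   # True (up to round-off)

# Best rank-2 approximation: keep only the two largest singular values.
D_rank2 = sum(s[r] * np.outer(U[:, r], Vt[r, :]) for r in range(2))
```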
Matrix in Machine Learning
• Data is often available in matrix form, for example:
  - a features × samples matrix of coefficients;
  - a users × movies matrix of movie ratings;
  - a words × documents matrix of word counts.
Matrix Decomposition in Machine Learning
• Known under many names: dictionary learning, low-rank approximation, factor analysis, latent semantic analysis, ...
• Data X ≈ dictionary W × activations H
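As an illustrative sketch (assuming a synthetic data matrix X of size features × samples; not from the slides), a rank-K factorisation X ≈ W H can be read off the truncated SVD:

```python
# Illustrative sketch: rank-K approximation X ~ W H via truncated SVD.
# X is assumed to be features x samples; W plays the role of the dictionary
# and H the activations (names W, H, K follow the slides; data is synthetic).
import numpy as np

X = np.random.rand(100, 500)       # 100 features, 500 samples (synthetic)
K = 10

U, s, Vt = np.linalg.svd(X, full_matrices=False)
W = U[:, :K] * s[:K]               # dictionary, 100 x K
H = Vt[:K, :]                      # activations, K x 500

print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))   # relative error
```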
Matrix Decomposition in Machine Learning
• ... for dimensionality reduction (coding, low-dimensional embedding)
• ... for interpolation (collaborative filtering, image inpainting)
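A toy sketch of the interpolation idea (simple iterative SVD imputation on synthetic ratings; an illustrative stand-in, not a method described in the slides):

```python
# Toy low-rank matrix completion for collaborative filtering: observed entries
# are kept fixed, missing entries are repeatedly refilled with a rank-K
# reconstruction (synthetic users x movies data, hypothetical names).
import numpy as np

rng = np.random.default_rng(0)
true_ratings = rng.random((20, 3)) @ rng.random((3, 15))   # low-rank ground truth
mask = rng.random((20, 15)) < 0.5                          # True where observed
K = 3

X = np.where(mask, true_ratings, 0.0)                      # zeros in the gaps
for _ in range(100):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    low_rank = (U[:, :K] * s[:K]) @ Vt[:K, :]              # rank-K approximation
    X = np.where(mask, true_ratings, low_rank)             # keep observed, update missing

print(np.abs(X - true_ratings)[~mask].mean())              # error on held-out entries
```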
Basic Model of Matrix Decomposition
• $\mathbf{V} \approx \mathbf{W}\mathbf{H}$: the data matrix $\mathbf{V}$ ($F$ features × $N$ samples) is approximated by $\mathbf{W}$ ($F \times K$) times $\mathbf{H}$ ($K \times N$), where $K$ is the number of patterns.
Matrix Decomposition with Constraints
Different types of constraints have been considered in previous works:
• Sparsity constraints: either on W or H (e.g., Hoyer, 2004; Eggert and Korner, 2004);
• Shape constraints on w_k, e.g.:
  - convex NMF: w_k are convex combinations of inputs (Ding et al., 2010);
  - harmonic NMF: w_k are mixtures of harmonic spectra (Vincent et al., 2008);
• Spatial coherence or temporal constraints on h_k: activations are smooth (Virtanen, 2007; Jia and Qian, 2009; Essid and Fevotte, 2013);
• Cross-modal correspondence constraints: factorisations of related modalities are related, e.g., temporal activations are correlated (Seichepine et al., 2013; Liu et al., 2013; Yilmaz et al., 2011);
• Geometric constraints: e.g., select particular cones C_W (Klingenberg et al., 2009; Essid, 2012).
Matrix and Matrix Decomposition
• ICA (Independent Component Analysis): independence constraints
• SCA (Sparse Component Analysis): sparsity constraints
• NMF (Non-negative Matrix Factorization): non-negativity constraints
• MCA (Morphological Component Analysis): morphological features
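A hedged sketch of two of these constrained decompositions using scikit-learn's NMF and FastICA (scikit-learn assumed available; the data is synthetic, not the examples from the slides):

```python
# Illustrative sketch: non-negative and independent-component factorisations
# of a synthetic samples x features matrix with scikit-learn.
import numpy as np
from sklearn.decomposition import NMF, FastICA

X = np.abs(np.random.randn(200, 50))      # non-negative data, 200 samples x 50 features

nmf = NMF(n_components=5, init="nndsvda", max_iter=500)
W = nmf.fit_transform(X)                  # non-negative per-sample coefficients, 200 x 5
H = nmf.components_                       # non-negative basis patterns, 5 x 50

ica = FastICA(n_components=5, max_iter=500)
S = ica.fit_transform(X)                  # statistically independent components, 200 x 5
```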
49 images among 2429 from MIT’s CBCL face dataset
Principal Component Analysis
• Vectorised images $\mathbf{V}$ ≈ facial features $\mathbf{W}$ × importance of the features in each image $\mathbf{H}$
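A minimal sketch of PCA on vectorised images (random data standing in for the CBCL faces, which are 19 × 19 pixels; not the slides' experiment):

```python
# Illustrative sketch: PCA on vectorised images. Rows are images, columns
# are pixels; components_ play the role of the "facial feature" basis.
import numpy as np
from sklearn.decomposition import PCA

images = np.random.rand(49, 19 * 19)       # 49 synthetic images, vectorised

pca = PCA(n_components=10)
H = pca.fit_transform(images)              # importance of each feature per image, 49 x 10
W = pca.components_                        # principal components ("eigenfaces"), 10 x 361

reconstruction = pca.inverse_transform(H)  # low-dimensional approximation of the images
```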