Tensor Network Representation for Machine Learning - Recent Advances and Perspectives
Qibin ZHAO
Tensor Learning Unit, RIKEN AIP
AIP Symposium (Mar. 19, 2019)

Tensor Learning Unit - Members
Postdoctoral Researchers (2)
‣ Ming Hou, Chao Li
Part-timer (2)
‣ Longhao Yuan (PhD student), Xuyang Zhao (PhD student)
Interns (4)
‣ Canada, Japan, China
Visitors (9)
‣ Andrzej Cichocki, Toshihisa Tanaka, Jianting Cao
‣ Guillaume Rabusseau, Justin Dauwels, Danilo Mandic, Brahim Chib-draa, Cesar F. Caiafa, Jordi Sole Casals

Background and Problems: Kernel Learning
$f(x) = W \cdot \Phi(x)$
‣ Problems become easier when the data are mapped to a higher-dimensional space.
‣ Curse of dimensionality: the dimension grows exponentially.
‣ The weights $W$ can be exponentially large.
‣ "Kernelization" scales quadratically with the training set size. In the era of big data, this issue is cited as one reason why neural nets have overtaken kernel methods.
‣ Low generalization due to the representer theorem, which says that the exact solution is $W = \sum_j \alpha_j \Phi(x_j)$.
[Figure labels: Kernel Learning; Rank-1 tensor]
A perfect problem for tensor networks to solve.

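To make the exponential growth concrete, here is a minimal NumPy sketch. It assumes a hypothetical local feature map [1, x_j] for each input component and builds Φ(x) as their outer product, which is one way to obtain a rank-1 feature tensor; the slide does not specify the map, so `local_feature_map`, `rank1_feature_tensor`, and `product_kernel` are illustrative names only.

```python
import numpy as np

def local_feature_map(x_j):
    """Hypothetical local map (an assumption): scalar -> 2-vector [1, x_j]."""
    return np.array([1.0, x_j])

def rank1_feature_tensor(x):
    """Outer product of the local maps: an order-N tensor with 2**N entries."""
    phi = np.array(1.0)
    for x_j in x:
        phi = np.multiply.outer(phi, local_feature_map(x_j))
    return phi

def product_kernel(x, y):
    """Inner product <Phi(x), Phi(y)> computed without forming Phi."""
    return float(np.prod(1.0 + np.asarray(x) * np.asarray(y)))

x = np.array([0.5, 1.0, 2.0])
y = np.array([1.0, -2.0, 0.5])

phi_x = rank1_feature_tensor(x)   # shape (2, 2, 2): 2**3 entries
phi_y = rank1_feature_tensor(y)

# A weight tensor W must match the shape of Phi(x), so it also has 2**N entries.
W = np.ones_like(phi_x)
f_x = float(np.sum(W * phi_x))    # f(x) = W . Phi(x)

# The kernel trick avoids the 2**N-sized tensors, at the cost of an
# n-by-n Gram matrix over the n training points.
k_explicit = float(np.sum(phi_x * phi_y))
k_implicit = product_kernel(x, y)
```

With N inputs the explicit tensor has 2**N entries, while the kernel evaluation costs only O(N) per pair of points; the quadratic cost in the training set size comes from needing one such evaluation for every pair.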
Background and Problems: Neural Networks
$f(x) = \Phi_2\left(M_2 \, \Phi_1(M_1 x)\right)$
‣ The weight matrices are huge but highly redundant.
‣ Low-rank compression: limited compression rate
‣ Computationally inefficient due to the huge number of parameters
‣ Not applicable to small devices
Multi-modal deep learning, multi-task deep learning
[Figure label: Neural Nets]
Tensor networks are a natural tool for solving these problems.

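The low-rank compression baseline mentioned here can be sketched as a truncated SVD of a single weight matrix. This is an illustrative example under assumed sizes (256 x 512, rank 8), not the method of the talk; `low_rank_factors` is a hypothetical helper name.

```python
import numpy as np

def low_rank_factors(M, rank):
    """Truncated SVD: return A (m x r) and B (r x n) with M ~= A @ B."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # scale the kept left vectors by singular values
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)

# Assumed toy weight matrix that is exactly rank 8, so truncation is lossless.
M = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 512))

A, B = low_rank_factors(M, rank=8)

params_full = M.size              # 256 * 512
params_low = A.size + B.size      # 256 * 8 + 8 * 512
rel_err = np.linalg.norm(M - A @ B) / np.linalg.norm(M)
```

The parameter count of the factorization is r(m + n), so the compression rate is bounded by how small the rank r can be made before the approximation error becomes unacceptable.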
Neural Network (NN) vs. Tensor Network (TN)
Similarity
‣ Assembling simple units (neurons or tensors) into complicated functions
Difference
‣ Decision functions in ML vs. wavefunctions in quantum mechanics
‣ Nonlinear in NN vs. linear in TN
‣ NNs do nonlinear things in a low-dimensional space vs. TNs do linear things in a high-dimensional space

What Are Tensor Networks (TNs)?
‣ A powerful tool to describe strongly entangled quantum many-body systems in physics
‣ Decompose a high-order tensor into a collection of low-order tensors connected according to a network pattern
‣ Tensor network diagram
[Diagram: nodes for a scalar $a$, a vector $\mathbf{a}$ (one edge $I$), a matrix $A$ (edges $I$, $J$), a 3rd-order tensor, and a 3rd-order diagonal tensor $\Lambda$; connected edges denote contraction, e.g. $b = Ax$ and $C = AB$]
$\sum_{k=1}^{K} a_{i,j,k} \, b_{k,l,m,p} = c_{i,j,l,m,p}$

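The contraction written on this slide, summing over the shared index k, maps directly onto `numpy.einsum`. The following is a small sketch with arbitrarily chosen dimensions (the sizes are assumptions for illustration only); the matrix-vector product b = Ax is included as the simplest special case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative sizes for the indices i, j, k, l, m, p.
nI, nJ, nK, nL, nM, nP = 2, 3, 4, 2, 3, 2

a = rng.standard_normal((nI, nJ, nK))        # 3rd-order tensor a_{i,j,k}
b = rng.standard_normal((nK, nL, nM, nP))    # 4th-order tensor b_{k,l,m,p}

# c_{i,j,l,m,p} = sum_k a_{i,j,k} b_{k,l,m,p}
c = np.einsum("ijk,klmp->ijlmp", a, b)

# The same contraction via tensordot: last axis of a with first axis of b.
c_ref = np.tensordot(a, b, axes=([2], [0]))

# Matrix-vector product as a contraction over one shared edge.
A = rng.standard_normal((nI, nJ))
x = rng.standard_normal(nJ)
b_vec = np.einsum("ij,j->i", A, x)
```

Each connected edge in a tensor network diagram corresponds to one repeated letter in the `einsum` subscript string, and each open edge to a letter that survives in the output.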
TT/MPS Representation and Properties [I. V. Oseledets, SIAM J. Sci. Comput., 2011]
[Diagram: cores $G^{(1)}, G^{(2)}, \ldots, G^{(n)}, \ldots, G^{(N)}$ connected in a chain by ranks $R_1, R_2, \ldots, R_{N-1}$, with open edges $I_1, I_2, \ldots, I_N$; the entry at $(i_1, i_2, \ldots, i_N)$ is the product of slices $G^{(1)}_{i_1} G^{(2)}_{i_2} \cdots G^{(N)}_{i_N}$]
TT: tensor train decomposition; MPS: matrix product state
‣ Efficient to represent $I^N$ data values with $O(NIR^2)$ parameters
‣ Efficient to compute or optimize TT/MPS with the DMRG algorithm

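The parameter saving of the TT/MPS format can be checked numerically. The sketch below assumes random cores of shape (R_{n-1}, I_n, R_n) with boundary ranks equal to 1; the helper names `random_tt_cores`, `tt_entry`, and `tt_to_full` are hypothetical, and DMRG-style optimization is not shown.

```python
import numpy as np

def random_tt_cores(dims, rank, rng):
    """Random TT cores of shape (R_{n-1}, I_n, R_n), with R_0 = R_N = 1."""
    ranks = [1] + [rank] * (len(dims) - 1) + [1]
    return [rng.standard_normal((ranks[n], dims[n], ranks[n + 1]))
            for n in range(len(dims))]

def tt_entry(cores, index):
    """One tensor entry as a product of matrix slices G^(n)[:, i_n, :]."""
    v = np.ones((1, 1))
    for core, i in zip(cores, index):
        v = v @ core[:, i, :]
    return float(v[0, 0])

def tt_to_full(cores):
    """Contract all cores into the full tensor (only feasible for small N)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([full.ndim - 1], [0]))
    return full.reshape([c.shape[1] for c in cores])

rng = np.random.default_rng(0)
dims = (4,) * 6                      # N = 6 modes, each of size I = 4
cores = random_tt_cores(dims, rank=3, rng=rng)

n_params = sum(core.size for core in cores)   # grows like N * I * R**2
full = tt_to_full(cores)                      # I**N entries

index = (1, 0, 3, 2, 2, 1)
entry_from_cores = tt_entry(cores, index)
entry_from_full = float(full[index])
```

In this toy setting the cores hold 168 numbers while the full tensor holds 4096, and the gap widens exponentially as N grows with I and R fixed.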