(Some) Challenges in Tensor Mining
Evrim Acar, Sandia National Labs., Livermore, CA
Tensor Mining
• PARAFAC (unsupervised): X = [sum of rank-one components] + E, where X may be dense or sparse.
• Tucker (unsupervised).
• Supervised: given labels y_train for X_train, predict y_test for X_test.
App I: Social Networks Analysis
Joint work with T. G. Kolda and D. M. Dunlavy
• In social networks, we are interested in modeling relationships (links) evolving over time.
• Example:
– DBLP dataset: Authors x Conferences x Years (10K x 2K x 14; ~0.1% dense), where x_ijk = # of papers by the i-th author at the j-th conference in year k (1991, 1992, …, 2004).
• Q1: Can we use tensor decompositions to model the data and extract meaningful underlying factors?
• Q2: Can we predict who is going to publish at which conferences in the future? (Link prediction in time)
SIAM CS&E, March 2-6, 2009
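The author x conference x year count tensor described above can be assembled sparsely as (coordinate, value) pairs; a minimal sketch with hypothetical paper records (real DBLP data would be parsed into this form first):

```python
from collections import defaultdict

# Hypothetical paper records: (author_id, conference_id, year_index).
papers = [(0, 1, 0), (0, 1, 0), (2, 0, 1), (0, 1, 13)]

# x_ijk = number of papers by author i at conference j in year k,
# stored sparsely as a dict from (i, j, k) coordinates to counts.
counts = defaultdict(int)
for a, c, y in papers:
    counts[(a, c, y)] += 1

print(counts[(0, 1, 0)])  # the two duplicate records sum to a count of 2
```

Storing only nonzero coordinates is what makes the ~0.1%-dense 10K x 2K x 14 tensor tractable.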
Modeling DBLP using PARAFAC
X ≈ Σ_{r=1..R} a_r ∘ b_r ∘ c_r, with a_r over authors, b_r over conferences, c_r over years.
• Solve using a gradient-based optimization approach.
• Initialization:
– first two modes using SVD,
– last mode: random.
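The slide fits this model with gradient-based optimization; as a stand-in, here is a minimal alternating-least-squares (ALS) sketch of the same PARAFAC model in plain NumPy (random initialization for brevity, not the SVD scheme above):

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding: move the given mode to the front and flatten the rest."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Kronecker product of two factor matrices."""
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

def parafac_als(X, R, n_iter=500, seed=0):
    """Fit X ≈ sum_r a_r ∘ b_r ∘ c_r by alternating least squares."""
    rng = np.random.default_rng(seed)
    A, B, C = [rng.standard_normal((d, R)) for d in X.shape]
    for _ in range(n_iter):
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C
```

On the DBLP tensor, a_r, b_r, and c_r would give the author, conference, and year profiles of component r.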
Components make sense!
X ≈ Σ_r a_r ∘ b_r ∘ c_r (authors x conferences x years)
[Figure: coefficient plots of each component's author mode (a_r), conference mode (b_r), and time mode (c_r), 1992-2004. One component picks out AI researchers (e.g., Craig Boutilier, Daphne Koller) publishing at IJCAI; another picks out medical-imaging researchers (e.g., Thomas Martin Lehmann, Hans Peter Meinzer, Heinrich Niemann) publishing at BILDMED, CARS, and DAGM.]
What if data is a sparse tensor with missing entries?
• Sparse Data
• Missing Data [Kiers, 1997; Tomasi & Bro, 2005]: success reported with 70% randomly missing data [Tomasi & Bro, 2005]
• Sparse & Missing
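One standard way to fit PARAFAC with missing entries, in the spirit of the approaches cited above, is to alternate between imputing the missing cells from the current model and running an ordinary ALS sweep. A minimal NumPy sketch (the binary mask W and the EM-style loop are illustrative, not the cited implementations):

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(A, B):
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

def parafac_missing(X, W, R, n_outer=200, seed=0):
    """EM-style PARAFAC: W is 1 where X is observed, 0 where missing."""
    rng = np.random.default_rng(seed)
    A, B, C = [rng.standard_normal((d, R)) for d in X.shape]
    Xfill = np.where(W, X, 0.0)  # start by zero-filling the missing cells
    for _ in range(n_outer):
        # M-step: one ALS sweep on the completed tensor
        A = unfold(Xfill, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(Xfill, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(Xfill, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        # E-step: re-impute missing cells from the current model
        Xhat = np.einsum('ir,jr,kr->ijk', A, B, C)
        Xfill = np.where(W, X, Xhat)
    return A, B, C
```

The missing entries never enter the fit directly; they are repeatedly replaced by the current model's estimates, which is what makes high missing fractions (e.g., 70%) workable when the rank is low.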
App II: Understanding Epileptic Seizures
Joint work with R. Bro, B. Yener, C. A. Bingol, H. Bingol
[Figure: multi-channel EEG recording (channels x time)]
Epilepsy Tensors
• Raw data: x_ij = electrical potential at the i-th time sample on the j-th channel (time samples x channels).
• The data is rearranged as a three-way array using the continuous wavelet transform (CWT): let c_ijk be the wavelet coefficient at time sample i, at scale j, for the k-th channel.
• An Epilepsy Tensor is a three-way array X (time samples x scales (freq.) x channels) where each entry x_ijk = |c_ijk|^2 is the power of the wavelet coefficient at the i-th sample, j-th scale, and k-th channel.
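A sketch of that construction; the Morlet wavelet form, the w0 parameter, and the scale grid are illustrative assumptions (the paper would use a standard CWT implementation):

```python
import numpy as np

def morlet(t, scale, w0=6.0):
    """L2-normalized complex Morlet wavelet sampled at times t (simple textbook form)."""
    x = t / scale
    return np.exp(1j * w0 * x - x**2 / 2) / np.sqrt(scale)

def epilepsy_tensor(eeg, scales):
    """eeg: (n_samples, n_channels) -> X: (n_samples, n_scales, n_channels) of wavelet power."""
    n, n_chan = eeg.shape
    t = np.arange(-(n // 2), n // 2)
    X = np.empty((n, len(scales), n_chan))
    for j, s in enumerate(scales):
        psi = morlet(t, s)
        for k in range(n_chan):
            # correlate the channel with the scaled wavelet
            c = np.convolve(eeg[:, k], np.conj(psi[::-1]), mode='same')
            X[:, j, k] = np.abs(c) ** 2  # x_ijk: power of the wavelet coefficient
    return X
```

A sinusoidal channel produces its largest power at the scale whose center frequency matches the oscillation, which is exactly the time-frequency structure the tensor models are meant to pick up.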
Epilepsy Focus Localization
Fit a two-component PARAFAC model by ALS: X ≈ a_1 ∘ b_1 ∘ c_1 + a_2 ∘ b_2 ∘ c_2 over the time samples, scales, and channels modes. Each component gives a signature in the time domain, a signature in the freq. domain, and a signature in the electrodes domain.
[Figure: factor plots of the two components over time samples, scales, and the standard 10-20 electrodes (Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, C4, T4, T5, P4, Pz, T6, O1, O2).]
Acar et al. '07, De Vos et al. '07
How many components?
[Figure: grid of time-domain and scale-domain signatures extracted with different numbers of components, showing how the recovered factors change with the chosen rank.]
How to initialize?
[Figure: time-domain and scale-domain signatures obtained with HOSVD vs. random initialization.]
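The HOSVD initialization contrasted with random initialization above can be sketched by taking the leading left singular vectors of each mode's unfolding (a minimal sketch; the actual implementation may differ in details):

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding: move the given mode to the front and flatten the rest."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd_init(X, R):
    """Initial factor matrices: top-R left singular vectors of each mode-n unfolding."""
    return [np.linalg.svd(unfold(X, m), full_matrices=False)[0][:, :R]
            for m in range(X.ndim)]
```

Unlike a random start, this gives every run the same deterministic starting point, which is one way to sidestep the local-minima sensitivity shown in the figure.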
Understanding Epileptic Seizures
[Figure: multi-channel EEG recording (channels x time)]
Epilepsy Feature Tensor
• Construction of an Epilepsy Feature Tensor from multi-channel EEG:
– Raw data: x_ij = electrical potential at the i-th channel and j-th time sample (channels x time samples).
– For each time epoch and channel, compute a feature vector [f_1(s), f_2(s), …, f_n(s)] from the signal s in that epoch.
– Epilepsy Feature Tensor (time epochs x features x channels): x_ijk = value of the j-th feature at the i-th epoch recorded at the k-th channel.
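A sketch of that construction; the particular features (standard deviation, line length, mean power) are illustrative stand-ins, not the paper's feature set:

```python
import numpy as np

def feature_tensor(eeg, epoch_len):
    """eeg: (n_samples, n_channels) -> X: (n_epochs, n_features, n_channels)."""
    n, n_chan = eeg.shape
    n_epochs = n // epoch_len
    feats = [
        lambda s: s.std(),                   # amplitude spread
        lambda s: np.abs(np.diff(s)).sum(),  # line length
        lambda s: (s ** 2).mean(),           # mean power
    ]
    X = np.empty((n_epochs, len(feats), n_chan))
    for i in range(n_epochs):
        for k in range(n_chan):
            s = eeg[i * epoch_len:(i + 1) * epoch_len, k]
            X[i, :, k] = [f(s) for f in feats]  # x_ijk: j-th feature, epoch i, channel k
    return X
```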
Seizure Recognition
• Build a model using the training set X_train and the labels y_train.
• Predict the labels y_test of new recordings.
• Each recording is split into time epochs labeled seizure / non-seizure (Pre 1, Seizure 1, Post 1; Pre 2, Seizure 2, Post 2; Pre 3, Seizure 3, Post 3).
Multiway Classification(?)
• Potential Approaches:
– Modify multiway regression models, e.g., multilinear PLS [Bro, 1996; Bro et al., 2001], as classifiers: regress y_train on X_train with multilinear PLS, then apply Linear Discriminant Analysis to the scores T_train / T_test.
– Unfold the data (time epochs x (features · channels)) and apply two-way classification, e.g., SVM.
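The unfolding route can be sketched as below; a nearest-centroid rule stands in for the SVM so the example stays dependency-free (it is not the classifier the slide proposes):

```python
import numpy as np

def unfold_epochs(X):
    """(epochs, features, channels) -> (epochs, features * channels)."""
    return X.reshape(X.shape[0], -1)

def fit_centroids(Xmat, y):
    """Per-class mean vectors of the unfolded training epochs."""
    return {c: Xmat[y == c].mean(axis=0) for c in np.unique(y)}

def predict(Xmat, centroids):
    """Assign each unfolded epoch to the class with the nearest centroid."""
    classes = sorted(centroids)
    D = np.stack([np.linalg.norm(Xmat - centroids[c], axis=1) for c in classes])
    return np.array([classes[i] for i in D.argmin(axis=0)])
```

The unfolding step discards the multiway structure, which is precisely the trade-off against the multilinear-PLS route above.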
Some challenges are …
• Handling Sparse Data with Missing Entries:
– We need models that capture the underlying sparse factors in sparse tensors with missing entries.
• Determining the Rank:
– Important in practice as well.
• Initialization:
– Algorithms suffer from the local-minima problem; in practice, we may end up interpreting our results differently.
• Supervised Learning on Tensors:
– We need classification models for tensors as good as state-of-the-art two-way classification approaches such as SVMs.
Thank you!
• References:
– Social Networks Analysis [Tensor Toolbox & Poblano Toolbox (by Sandia)]:
• Acar, Kolda and Dunlavy, An Optimization Approach for Fitting Canonical Tensor Decompositions, SAND2009-0857, Feb. 2009.
– Understanding Epileptic Seizures [PLS Toolbox (by Eigenvector Research)]:
• Acar, Bingol, Bingol, Bro and Yener, Multiway Analysis of Epilepsy Tensors, Bioinformatics, 23(13): i10-i18, 2007.
• Acar, Bingol, Bingol, Bro and Yener, Seizure Recognition on Epilepsy Feature Tensor, Proc. 29th Int. Conf. of the IEEE Engineering in Medicine and Biology Society, 2007.
– Survey:
• Acar and Yener, Unsupervised Multiway Data Analysis: A Literature Survey, IEEE Transactions on Knowledge and Data Engineering, 21(1): 6-20, 2009.
• Contact: Evrim Acar, Sandia National Laboratories, eacarat@sandia.gov