SKT: A Computationally Efficient SUPANOVA: Spline Kernel Based Machine Learning Tool
Boleslaw Szymanski, Lijuan Zhu, Long Han and Mark Embrechts, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
Alexander Ross and Karsten Sternickel, Cardiomag Imaging, Inc., Schenectady, NY 12304, USA
Presented at the 11th Online World Conference on Soft Computing in Industrial Applications, September 18 – October 6, 2006
Presentation Outline
1. Introduction: SUPANOVA Kernels
2. Heuristic for Efficient SUPANOVA Kernel Computation
3. Results of SKT for Two Benchmarks:
- Iris Data
- Boston Housing Market
4. An Industrial Application: Automatic Analysis of Magnetocardiograms
5. Preprocessing Measurements into CMI Data
6. Results of Processing CMI Data under SKT
CMI/RPI: Spline Kernel Based Machine Learning
SUPANOVA Kernels
Represent the prediction function as a sum of kernels weighted by nonnegative coefficients:
$$f(x_0) = K(x_0, x) \cdot a = \sum_{j=0}^{M} c_j K_j(x_0, x) \cdot a, \qquad c_j \ge 0$$
ANOVA kernel:
$$K_{ANOVA}(u, v) = \prod_{i=1}^{m} \left[ 1 + k(u_i, v_i) \right] = 1 + \sum_{i=1}^{m} k(u_i, v_i) + \sum_{i<j} k(u_i, v_i)\, k(u_j, v_j) + \dots + \prod_{i=1}^{m} k(u_i, v_i)$$
- decomposes functions of the order m into a sum of terms: 1-ary, 2-ary, ..., m-ary order functions of the original arguments
- the kernels $K_0, \dots, K_M$ group by order: $\binom{m}{0}$ constant term, $\binom{m}{1}$ 1-ary terms, $\binom{m}{2}$ 2-ary terms, ..., $\binom{m}{m}$ m-ary term
Each kernel function in the ANOVA kernel is a spline kernel:
$$k_{spline}(u_i, v_i) = u_i v_i + \frac{(u_i + v_i) \min(u_i, v_i)}{2} - \frac{\min(u_i, v_i)^3}{6}$$
$c$ is a vector of dimension $M + 1 = 2^m$.
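The product form of the ANOVA kernel and its expansion into per-subset terms can be checked numerically. The following is a minimal sketch, not the SKT implementation: the function names are hypothetical, and the base kernel `k` is passed in as a parameter (the test below uses a simple product `u * v` as a stand-in rather than the spline kernel).

```python
from itertools import combinations
from math import prod


def anova_kernel_product(u, v, k):
    """ANOVA kernel in product form: prod_i [1 + k(u_i, v_i)]."""
    return prod(1.0 + k(ui, vi) for ui, vi in zip(u, v))


def anova_kernel_terms(u, v, k, max_order=None):
    """Expand the ANOVA kernel into one term per subset of variables.

    Returns a dict mapping a tuple of variable indices (the subset) to the
    product of base-kernel values over that subset. The empty tuple is the
    constant term K_0 = 1. max_order limits the expansion, e.g. 2 keeps
    only the constant, 1-ary and 2-ary terms.
    """
    m = len(u)
    if max_order is None:
        max_order = m
    base = [k(ui, vi) for ui, vi in zip(u, v)]
    terms = {}
    for order in range(max_order + 1):
        for subset in combinations(range(m), order):
            terms[subset] = prod(base[i] for i in subset)
    return terms
```

With all orders kept there are 2^m terms and their sum equals the product form. Truncating at order 2 keeps 1 + m + m(m-1)/2 terms, which for m = 13 variables is 92.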
SUPANOVA: Objective Function
• Objective function (S. Gunn and J. Kandola 2002)
- quadratic error: $\left\| y - \sum_{j=0}^{M} c_j K_j \cdot a \right\|_2^2$
- smoothness error: $\lambda_a \times \sum_{j=0}^{M} c_j\, a^T K_j\, a$
- approximate sparseness error with one-norm: $\lambda_c \sum_{j=0}^{M} c_j$
• Proposed objective function
- ideal zero-norm sparseness error:
$$\Phi(a, c) = \left\| y - \sum_{j=0}^{M} c_j K_j \cdot a \right\|_2^2 + \lambda_a \times \sum_{j=0}^{M} c_j\, a^T K_j\, a + \lambda_c \sum_{j=0}^{M} \| c_j \|_0, \qquad c_j \ge 0$$
$\lambda_a$ - weight for smoothness error
$\lambda_c$ - weight for sparseness error
$c$ - sparseness vector
$a$ - same parameter as used in traditional kernels
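The three terms of the proposed objective can be evaluated directly for small inputs. This is a minimal sketch under stated assumptions: the zero-norm term is taken to count the non-zero entries of c, kernel matrices are plain lists of lists, and all function names are hypothetical rather than taken from SKT.

```python
def mat_vec(K, a):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(kij * aj for kij, aj in zip(row, a)) for row in K]


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def objective(y, kernels, a, c, lam_a, lam_c):
    """Quadratic error + smoothness error + zero-norm sparseness error."""
    # K_j . a for every kernel matrix K_j
    Ka = [mat_vec(K, a) for K in kernels]
    # Prediction: sum_j c_j K_j . a
    pred = [sum(cj * kja[i] for cj, kja in zip(c, Ka)) for i in range(len(y))]
    quadratic = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    # Smoothness: lam_a * sum_j c_j a^T K_j a
    smoothness = lam_a * sum(cj * dot(a, kja) for cj, kja in zip(c, Ka))
    # Assumption: the zero-norm counts the non-zero coefficients c_j
    sparseness = lam_c * sum(1 for cj in c if cj != 0)
    return quadratic + smoothness + sparseness
```

Because the sparseness term only counts non-zero coefficients, it changes in jumps as entries of c are added or removed, which is why it cannot be minimized by a smooth optimizer.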
Gunn's Iterative Solution
$$\Phi(a, c) = \left\| y - \sum_{j=0}^{M} c_j K_j \cdot a \right\|_2^2 + \lambda_a \times \sum_{j=0}^{M} c_j\, a^T K_j\, a + \lambda_c \sum_{j=0}^{M} \| c_j \|_0, \qquad c_j \ge 0$$
S0: $c' = 1$
S1: $a' = \arg\min_a \Phi(a, c')$; compute $\lambda_a$ starting with a large value and decreasing it until the minimum error is achieved.
S2: $c^* = \arg\min_c \Phi(a', c)$, where $\lambda_a = 0$ and $\lambda_c$ is set so the loss is the same as at initialization:
$$\lambda_c = \lambda_a \times \frac{\sum_{j=0}^{M} c_j \times a^T K_j\, a}{M}$$
S3: $a^* = \arg\min_a \Phi(a, c^*)$, where $\lambda_c = 0$ and $\lambda_a$ is computed starting with a large value and then decreasing it to minimize the error.
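Steps S1 and S3 both pick the smoothness weight by starting from a large value and decreasing it until the error stops improving. A minimal sketch of such a search follows. The geometric decrease factor, the stop-at-first-increase rule, the lower bound and the function name are all assumptions made for illustration; the slides do not specify the schedule.

```python
def decreasing_lambda_search(error_for, start=1000.0, factor=0.5, min_lambda=1e-6):
    """Decrease lambda geometrically from `start` while the error improves.

    error_for: callable returning the error obtained for a given lambda
               (hypothetical; in the slides this would involve solving for a).
    Returns the best lambda found and its error.
    """
    best_lam, best_err = start, error_for(start)
    lam = start * factor
    while lam >= min_lambda:
        err = error_for(lam)
        if err >= best_err:
            # Error stopped improving: keep the previous lambda
            break
        best_lam, best_err = lam, err
        lam *= factor
    return best_lam, best_err
```

Stopping at the first increase assumes the error has a single minimum along the schedule; a noisy error curve would call for a more tolerant stopping rule.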
Heuristic Method for Computing Sparse Vector c
In a single step:
$$\Phi(a', c) = \left\| y - \sum_{j=0}^{M} c_j K_j \cdot a' \right\|_2^2 + \lambda_c \sum_{j=0}^{M} \| c_j \|_0, \qquad c_j \ge 0$$
• Initialization
- create an empty set S for the selected elements of vector c
- create a set E that contains all the remaining elements of vector c
• Selection
- select, one by one, each element $c_i$ in set E and compute the minimal value of the loss function with this selection
- choose the element $c_i$ that achieves the smallest value of the loss function among all elements of set E
• Adjustment
- solve the set of linear equations to refit the c values in set S
• Control loop
- stop the heuristic when the process reaches the iteration limit or set E becomes empty
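The selection, adjustment and control-loop steps above can be sketched as a greedy forward selection. This is an illustration under stated assumptions, not the SKT code: each candidate is scored by a one-dimensional least-squares fit to the current residual, the adjustment solves the normal equations by Gaussian elimination, nonnegativity is enforced only in the scoring step, and the selected columns are assumed to be non-zero and linearly independent. The name `columns[j]` stands for the vector K_j · a' and, like the function names, is hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def greedy_sparse_c(y, columns, max_iter):
    """Greedy selection of non-zero entries of c; columns[j] = K_j . a'."""
    selected, coeffs = [], []               # set S and its fitted values
    remaining = list(range(len(columns)))   # set E
    residual = list(y)
    while remaining and len(selected) < max_iter:
        # Selection: squared error after adding candidate j alone.
        # The zero-norm penalty is equal for all candidates, so it is omitted.
        def loss_if_added(j):
            z = columns[j]
            cj = max(0.0, dot(residual, z) / dot(z, z))
            return sum((r - cj * zi) ** 2 for r, zi in zip(residual, z))

        best = min(remaining, key=loss_if_added)
        remaining.remove(best)
        selected.append(best)
        # Adjustment: refit all selected coefficients via normal equations
        gram = [[dot(columns[p], columns[q]) for q in selected] for p in selected]
        rhs = [dot(columns[p], y) for p in selected]
        coeffs = solve_linear(gram, rhs)
        residual = [yi - sum(cj * columns[j][i] for cj, j in zip(coeffs, selected))
                    for i, yi in enumerate(y)]
    c = [0.0] * len(columns)
    for j, cj in zip(selected, coeffs):
        c[j] = cj
    return c
```

Skipping the adjustment step would leave each coefficient at the value fitted when it was selected, which corresponds to the "without adjustment" variant compared in the results.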
SUPANOVA Experimental Results
• Iris data (4 variables, 50 samples, classification)
- all combinations of features
- q2 = 0.023, Q2 = 0.173, RMSE = 0.355
[Figure: Iris data. Left panel: predicted value versus target value. Right panel: target and predicted values versus sorted sequence number.]
• Boston Housing Market (13 variables, 506 samples, regression)
- up-to-binary combinations of features
- q2 = 0.214, Q2 = 0.217, RMSE = 4.768
[Figure: Boston Housing. Left panel: predicted value versus target value. Right panel: target and predicted values versus sorted sequence number.]
SUPANOVA Results Discussion
• Heuristic performance with adjustment versus without adjustment
[Figure: Step S2 error versus number of non-zero elements, with curves for with-adjustment and without-adjustment.]
• Up-to-binary versus up-to-ternary combinations of features
- up-to-binary: q2 = 0.214, Q2 = 0.217, RMSE = 4.768
- up-to-ternary: q2 = 0.205, Q2 = 0.213, RMSE = 4.717
[Figure: two panels of target and predicted values versus sorted sequence number, one per setting.]
An Industrial Application: Automatic Classification of Magnetocardiograms
CMI 2409: this is a 9-channel MCG system, capable of scanning a 20 cm x 20 cm region over the heart.
[Figure: visualization of magnetocardiograms (examples): Normal, LCX block, 3 vessel disease.]
Goals
• Identify cardiac ischemia from magnetocardiograms
• Accurate diagnosis of ischemia in patients with left and/or right bundle branch blocks
• Incorporate domain knowledge to enhance automatic classification of diseases
Technical Objectives
• Use computational intelligence to solve a "missing information" problem
• Reverse the process and identify the signature of cardiac diseases in magnetocardiograms
• Develop an online database for data exchange (pooling) between hospitals
• Provide cardiologists with a hypothesis testing module
Typical Time Series Data for T3-T4 MagnetoCardiogram 201
From Magnetocardiogram to Data Series for Analysis
Processing stages: 1. Recording, 2. Filtering, 3. Averaging, 4. Selecting, 5. Exporting
1. Preprocessing of the time series & averaging
2. Heart cycle interval selection: choose the T wave for the detection of ischemia
3. Automatically determine the window of interest
4. Data export to machine learning processing
CMI Delay Data and Resulting Sparse Vector c
• Training data: 241 patients
• Test data: 84 patients
• Features: 74 data representing delays of peak of polarization signal
• Heuristic parameters: binary combinations of features
• Kernel processing results:
- run time: 15 min (desktop)
- size of sparse vector c: 10 elements
All 10 selected features are combination features.
Code    Value
2517    15.28137
1431    16.20629
509     2.13248
1416    12.94138
579     5.35348
1319    13.27779
1256    8.21285
1492    10.90624
2506    16.66780
2207    6.72628
772     3.01011
CMI Data Processing Results: Prediction versus True Value
q2 = 0.620, Q2 = 0.627, RMSE = 0.786
[Figure: target and predicted values versus sorted sequence number.]
CMI Data Processing Results: ROC Curve
The ROC curve predicts a system response with different selections of the value of the true/false prediction threshold.
AZ_area = 0.8672
[Figure: ROC curve, True Positives versus False Positives.]
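Sweeping the true/false prediction threshold over the predicted values gives one (false positive rate, true positive rate) point per threshold, and the area under those points is the ROC area. The sketch below illustrates that construction with hypothetical function names and toy data; it assumes the label 1 marks a positive case and any other label a negative one, and it is not the code used to produce the reported area.

```python
def roc_curve_points(scores, labels):
    """Return ROC points (false positive rate, true positive rate).

    One point is produced per distinct threshold, moving the threshold
    from above the highest score down to the lowest score.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = len(pairs) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Everything scored at this threshold is predicted positive together
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points ordered by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An area of 1 means some threshold separates the two classes perfectly, while 0.5 corresponds to a ranking no better than chance.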
CMI Data Processing Results: Confusion Matrix and Distribution of True Values
Balance error: 83.67% (this figure equals the mean of the per-class accuracies, (32/37 + 38/47) / 2)
            predicted negative    predicted positive
negative    32                    5
positive    9                     38
[Figure: distribution of true values for Healthy and Sick cases over categories 1 to 5.]
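The 83.67% figure can be reproduced from the four confusion-matrix counts as the average of the accuracy on negative cases and the accuracy on positive cases. The function name below is hypothetical; the arithmetic is a sketch of that averaging, offered as a check rather than as the authors' definition.

```python
def balanced_accuracy(tn, fp, fn, tp):
    """Mean of the per-class accuracies from confusion-matrix counts."""
    specificity = tn / (tn + fp)   # fraction of negatives predicted negative
    sensitivity = tp / (tp + fn)   # fraction of positives predicted positive
    return (specificity + sensitivity) / 2


# Counts from the confusion matrix: 32 and 5 for negatives, 9 and 38 for positives
print(round(100 * balanced_accuracy(tn=32, fp=5, fn=9, tp=38), 2))  # 83.67
```

The four counts sum to 84, matching the number of test patients.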