Incremental Classification with Generalized Eigenvalues


  1. The Data Reference Model: Incremental Classification with Generalized Eigenvalues
     Mario Rosario Guarracino
     High Performance Computing and Networking Institute, National Research Council, Italy
     September 17, 2007

  2. People@ICAR
     - Researchers: Mario Guarracino, Pasqua D'Ambra, Ivan De Falco, Ernesto Tarantino
     - Associates: Daniela di Serafino (SUN), Francesca Perla (UniParth), Gerardo Toraldo (UniNa)
     - Collaborators: Franco Giannessi (UniPi), Claudio Cifarelli (HP), Panos Pardalos, Onur Seref (UFL), Oleg Prokopyev (U. Pittsburgh), Giuseppe Trautteur (UniNa), Francesca Del Vecchio Blanco (SUN), Antonio Della Cioppa (UniSa)
     - Students: Danilo Abbate, Francesco Antropoli, Giovanni Attratto, Tony De Vivo, Alessandra Vocca
     - Fellows: Davide Feminiano, Salvatore Cuciniello

  3. Agenda
     - Generalized eigenvalues classification
     - Purpose of incremental learning
     - Subset selection algorithm
     - Initial points selection
     - Accuracy results
     - More examples
     - Conclusion and future work

  4. Introduction
     - Supervised learning refers to the capability of a system to learn from examples (the training set).
     - The trained system is able to provide an answer (output) for each new question (input).
     - Supervised means that the desired output for the training set is provided by an external teacher.
     - Binary classification is among the most successful methods for supervised learning.

  5. Applications
     - Data produced in biomedical applications will increase exponentially in the coming years.
     - In genomic/proteomic applications, data are frequently updated, which poses problems for the training step.
     - Publicly available datasets contain gene expression data with tens of thousands of features.
     - Current classification methods can overfit the problem, producing models that do not generalize well.

  6. Linear discriminant planes
     - Consider a binary classification task with points in two linearly separable sets: there exists a plane that classifies all points in the two sets.
     - [Figure: two linearly separable sets A and B, separated by a plane.]
     - There are infinitely many planes that correctly classify the training data.

  7. Support vector machines formulation
     - To construct the plane furthest from both sets, we examine the convex hull of each set.
     - The closest pair of hull points c and d solves min ||c - d|| with c in conv(A) and d in conv(B).
     - [Figure: convex hulls of sets A and B, with closest hull points c and d.]
     - The best plane bisects the closest points (support vectors) in the convex hulls.
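A minimal sketch of this closest-points view, assuming two small illustrative sets (the point values below are hypothetical, not from the slides): the closest pair between the convex hulls is found as a small quadratic program over convex-combination weights, and the bisecting plane is then read off the two points.

```python
# Closest points between the convex hulls of two sets (hypothetical data).
# Variables u, v are convex-combination weights: c = u @ A, d = v @ B.
import numpy as np
from scipy.optimize import minimize

A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # class A (illustrative)
B = np.array([[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])   # class B (illustrative)
nA, nB = len(A), len(B)

def gap(z):
    u, v = z[:nA], z[nA:]
    return np.sum((u @ A - v @ B) ** 2)   # squared hull-to-hull distance

cons = [{"type": "eq", "fun": lambda z: np.sum(z[:nA]) - 1.0},
        {"type": "eq", "fun": lambda z: np.sum(z[nA:]) - 1.0}]
z0 = np.concatenate([np.full(nA, 1 / nA), np.full(nB, 1 / nB)])
res = minimize(gap, z0, bounds=[(0, 1)] * (nA + nB), constraints=cons)

u, v = res.x[:nA], res.x[nA:]
c, d = u @ A, v @ B                       # closest points (support vectors)
w = c - d                                 # plane normal
gamma = (w @ c + w @ d) / 2               # plane passes through the midpoint
print("c =", c, "d =", d, "plane: w'x =", gamma)
```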

  8. Support vector machines dual formulation
     - The dual formulation, yielding the same solution, is to maximize the margin between the support planes x'w = γ + 1 and x'w = γ - 1.
     - Each support plane leaves all points of one class on one side; the margin between them is 2/||w||.
     - [Figure: sets A and B with the two support planes and the margin between them.]
     - The support planes are pushed apart until they "bump" into a small set of data points (the support vectors).

  9. Support Vector Machine features
     - Support Vector Machines are the state of the art among existing classification methods.
     - Their robustness is due to the strong foundations of statistical learning theory.
     - Training relies on the optimization of a quadratic convex cost function, for which many methods are available.
       – Available software includes SVMlight and LIBSVM.
     - These techniques do not scale well with the size of the training set.
       – Training on 50,000 examples yields a Hessian matrix with 50,000² = 2.5 billion elements, about 20 GB of RAM in double precision.
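As a point of reference, scikit-learn's SVC class wraps LIBSVM, so a linear SVM baseline fits in a few lines; a hedged sketch on synthetic data (none of the benchmark datasets above):

```python
# Linear SVM via scikit-learn's SVC, which uses LIBSVM internally.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),    # class 0 around the origin
               rng.normal(4, 1, (50, 2))])   # class 1 around (4, 4)
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.support_vectors_.shape)   # only a few points define the plane
print(clf.score(X, y))              # training accuracy
```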

  10. A different approach
     - The problem can be restated as: find two hyperplanes, each the closest to one set and the furthest from the other.
     - The plane closest to A and furthest from B minimizes ||Aw - eγ||² / ||Bw - eγ||² over w, γ ≠ 0.
     - [Figure: sets A and B, each with its proximal plane.]
     - The binary classification problem can then be solved as a generalized eigenvalue computation (GEC).
     O. L. Mangasarian and E. W. Wild, Multisurface Proximal Support Vector Classification via Generalized Eigenvalues, Data Mining Institute Tech. Rep. 04-03, June 2004.

  11. GEC method
     - The plane x'w - γ = 0 closest to A and furthest from B solves:
         min ||Aw - eγ||² / ||Bw - eγ||²  over w, γ ≠ 0.
     - Let G = [A -e]'[A -e], H = [B -e]'[B -e], and z = [w' γ]'.
     - The previous problem becomes:
         min z'Gz / z'Hz,
       the Rayleigh quotient of the generalized eigenvalue problem Gx = λHx.
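A minimal numpy/scipy sketch of this construction, on illustrative data: build G and H from the augmented matrices [A -e] and [B -e], then solve the generalized eigenvalue problem with scipy.linalg.eig.

```python
# Linear GEC: G = [A -e]'[A -e], H = [B -e]'[B -e], solve G x = lambda H x.
import numpy as np
from scipy.linalg import eig

A = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.5]])   # class A (illustrative)
B = np.array([[2.0, 0.0], [3.0, 1.0], [4.0, 1.5]])   # class B (illustrative)

def augment(M):
    # Append the -e column so that [M -e] @ [w; gamma] = M w - gamma e.
    return np.hstack([M, -np.ones((len(M), 1))])

GA, GB = augment(A), augment(B)
G, H = GA.T @ GA, GB.T @ GB

vals, vecs = eig(G, H)                 # generalized eigenpairs
order = np.argsort(vals.real)
z_min = vecs[:, order[0]].real         # eigenvector of the minimum eigenvalue
z_max = vecs[:, order[-1]].real        # eigenvector of the maximum eigenvalue
w1, g1 = z_min[:-1], z_min[-1]         # plane closest to A: x'w1 - g1 = 0
w2, g2 = z_max[:-1], z_max[-1]         # plane closest to B: x'w2 - g2 = 0
print(w1, g1, w2, g2)
```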

  12. GEC method
     - Conversely, the plane closest to B and furthest from A solves:
         min ||Bw - eγ||² / ||Aw - eγ||²,  i.e.  min z'Hz / z'Gz.
     - This problem has the same eigenvectors as the previous one, with reciprocal eigenvalues.
     - We therefore only need to evaluate the eigenvectors related to the minimum and maximum eigenvalues of Gx = λHx.

  13. GEC method
     - Let [w1 γ1] and [w2 γ2] be the eigenvectors associated with the minimum and maximum eigenvalues of Gx = λHx.
     - Each a ∈ A is closer to x'w1 - γ1 = 0 than to x'w2 - γ2 = 0.
     - Each b ∈ B is closer to x'w2 - γ2 = 0 than to x'w1 - γ1 = 0.
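A short sketch of the resulting decision rule (the planes used below are those of the worked example on the next slide): a point is assigned to the class whose proximal plane is nearer in Euclidean distance.

```python
# Decision rule: assign x to A if it is closer to plane (w1, g1), else to B.
import numpy as np

def classify(x, w1, g1, w2, g2):
    dA = abs(x @ w1 - g1) / np.linalg.norm(w1)   # distance to A's plane
    dB = abs(x @ w2 - g2) / np.linalg.norm(w2)   # distance to B's plane
    return "A" if dA <= dB else "B"

# Planes from the worked example on the next slide:
w1, g1 = np.array([1.0, 0.0]), 2.0    # plane x - 2 = 0
w2, g2 = np.array([1.0, -1.0]), 0.0   # plane x - y = 0
print(classify(np.array([2.1, 0.0]), w1, g1, w2, g2))  # -> "A"
```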

  14. Example
     - Let A be a set of points lying on the line x = 2 and B a set of points lying on the line x = y.
     - Set G = [A -e]'[A -e] and H = [B -e]'[B -e].
     - The minimum and maximum eigenvalues of Gx = λHx are λ1 = 0 and λ3 = ∞, with corresponding eigenvectors x1 = [1 0 2] and x3 = [1 -1 0].
     - The resulting planes are x - 2 = 0 and x - y = 0.
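A numerical check of this example. The original point coordinates are illegible in this transcript, so the sets below are hypothetical points consistent with the stated planes (A on x = 2, B on x = y):

```python
# Verify: A on the line x = 2 and B on the line x = y give eigenvalues 0 and
# inf, with eigenvectors proportional to [1 0 2] and [1 -1 0].
import numpy as np
from scipy.linalg import eig

A = np.array([[2.0, 0.0], [2.0, 1.0], [2.0, 3.0]])   # hypothetical, on x = 2
B = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]])   # hypothetical, on x = y
GA = np.hstack([A, -np.ones((3, 1))])
GB = np.hstack([B, -np.ones((3, 1))])
G, H = GA.T @ GA, GB.T @ GB

vals, vecs = eig(G, H)
print(vals)   # contains 0 and inf (up to round-off)
# The eigenvector for lambda = 0 spans null(G) = span{[1, 0, 2]}: plane x - 2 = 0.
# The eigenvector for lambda = inf spans null(H) = span{[1, -1, 0]}: plane x - y = 0.
```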

  15. Classification accuracy: linear kernel

     Dataset          train   dim   ReGEC   GEPSVM   SVM
     NDC                300     7   87.60    86.70   89.00
     ClevelandHeart     297    13   86.05    81.80   83.60
     PimaIndians        768     8   74.91    73.60   75.70
     GalaxyBright      2462    14   98.24    98.60   98.30

     Accuracy results (%) using ten-fold cross validation.

  16. Nonlinear case
     - When the sets are not linearly separable, nonlinear discrimination is needed.
     - The data are nonlinearly mapped into another space to increase separability, and a linear discriminant is found in that space.

  17. Nonlinear case
     - A standard technique is to transform the points via kernel functions, such as the Gaussian kernel:
         K(xi, xj) = exp(-||xi - xj||² / σ).
     - Each element of the kernel matrix of two sets A and B is:
         K(A, B)ij = exp(-||Ai - Bj||² / σ),
       where Ai and Bj denote the i-th row of A and the j-th row of B.
     K. Bennett and O. Mangasarian, Robust Linear Programming Discrimination of Two Linearly Inseparable Sets, Optimization Methods and Software, 1, 23-34, 1992.
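A short sketch of assembling the Gaussian kernel matrix with numpy/scipy (sigma is a free width parameter; scipy's cdist computes the pairwise squared distances):

```python
# Gaussian kernel matrix: K[i, j] = exp(-||A_i - B_j||^2 / sigma).
import numpy as np
from scipy.spatial.distance import cdist

def gaussian_kernel(A, B, sigma=1.0):
    return np.exp(-cdist(A, B, "sqeuclidean") / sigma)

A = np.array([[0.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [2.0, 2.0]])
print(gaussian_kernel(A, B))      # 2 x 2 kernel matrix
```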

  18. Nonlinear case
     - Using the Gaussian kernel, with C = [A' B']' the matrix of all training points, the GEC problem can be formulated as:
         min ||K(A, C)u - eγ||² / ||K(B, C)u - eγ||²  over u, γ,
       in order to evaluate the proximal surfaces:
         K(x, C)u1 - γ1 = 0   and   K(x, C)u2 - γ2 = 0.
     - The associated GEC is ill posed.
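A quick numerical illustration of the ill-posedness, on hypothetical data: K(A, C) has only as many rows as A has points, so G = [K(A,C) -e]'[K(A,C) -e] is an (m+1) x (m+1) matrix of rank at most |A| and is therefore singular.

```python
# Ill-posedness of the kernelized GEC: G is singular because [K(A,C) -e]
# has n_A rows but m + 1 = n_A + n_B + 1 columns.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
A, B = rng.normal(0, 1, (5, 2)), rng.normal(3, 1, (5, 2))
C = np.vstack([A, B])                                  # all training points

KA = np.exp(-cdist(A, C, "sqeuclidean"))               # 5 x 10 kernel block
GA = np.hstack([KA, -np.ones((len(A), 1))])            # 5 x 11
G = GA.T @ GA                                          # 11 x 11, rank <= 5
print(np.linalg.matrix_rank(G))                        # 5 < 11: singular
```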
