Utilizing Diversity and Performance Measures for Ensemble Creation
Tuve Löfström
Licentiate Thesis
2009-03-24

Outline
• Introduction
  • Data Mining
  • Predictive Modelling
  • Ensembles
  • Diversity
  • Information Fusion
• Problem Statement
• Research and Results
  • Implicit Diversity
  • Estimating Ensemble Performance
  • Evaluating Optimization Criteria
  • Combining Measures
• Conclusions
• Discussion

Introduction
• “Data mining is the process of exploration and analysis, by automatic or semiautomatic means, of large quantities of data in order to discover meaningful patterns and rules” (Berry and Linoff 1997)
• The aim of data mining is to “be able to respond to the patterns, to act on them, ultimately turning the data into information, the information into action, and the action into value” (Berry and Linoff 1997)

Introduction
• Predictive modeling is one of the key tasks in data mining
• The objective when performing predictive modeling is to predict a value for a specific variable, the target variable
• Most often a predictive model is found from directed data mining
  • A top-down approach where a mapping from an input vector to a scalar output is learnt from samples

Introduction
• The task is either classification or regression
• When performing classification the target value must be any of a pre-defined set of values
• For regression, the target value is a continuous value
• The normal procedure is to use historical data with known target values to build models that could later be used for prediction

[Figure: example data set shown as a table; rows are instances, columns are variables, with the targets in the last column]
A decision tree
[Figure: decision tree]

A rule set
JChipper rules:
===========
IF ( petalwidth <= 0.6 ) THEN Iris-setosa [50/0]
IF ( petalwidth <= 1.7 ) AND ( petallength <= 5.0 ) THEN Iris-versicolor [50/2]
DEFAULT: Iris-virginica [50/2]
Number of Rules : 3
Number of Conditions : 4

A neural net
[Figure: neural network]

Ensembles
• An ensemble is a composite model, aggregating multiple base models into one predictive model
• An ensemble prediction, consequently, is a function of all included base models
• Both theory and a wealth of empirical studies have established that ensembles are generally more accurate than single predictive models
[Equation: the ensemble output y written as a function f of the base model outputs f_i]

An ensemble
[Figure: base models M1, M2, ..., Mn combined by a function F(M1, M2, ..., Mn) into the ensemble output y]

Diversity
• For the ensemble approach to work, the ensemble must contain diversity
• There would be no point in combining only models that always
  • Make the same mistakes
  • Add the same information
• We want models that perform well individually and complement each other
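To make "an ensemble prediction is a function of all included base models" concrete, here is a minimal sketch that uses majority voting as the combining function. Majority voting is an assumption; the slides do not say which combiner is used. `rule_set` transcribes the three JChipper rules from the rule-set slide. `width_only` and `length_only`, including the 2.5 threshold, are hypothetical base models invented purely for illustration.

```python
from collections import Counter


def rule_set(petalwidth, petallength):
    """The three JChipper rules from the rule-set slide, applied in order."""
    if petalwidth <= 0.6:
        return "Iris-setosa"
    if petalwidth <= 1.7 and petallength <= 5.0:
        return "Iris-versicolor"
    return "Iris-virginica"


def width_only(petalwidth, petallength):
    """Hypothetical base model that only looks at petal width."""
    if petalwidth <= 0.6:
        return "Iris-setosa"
    return "Iris-versicolor" if petalwidth <= 1.7 else "Iris-virginica"


def length_only(petalwidth, petallength):
    """Hypothetical base model that only looks at petal length."""
    if petallength <= 2.5:
        return "Iris-setosa"
    return "Iris-versicolor" if petallength <= 5.0 else "Iris-virginica"


def majority_vote(models, petalwidth, petallength):
    """Combine base models by returning the most frequent predicted class.

    On a tie, Counter.most_common returns the class that was voted for first.
    """
    votes = Counter(m(petalwidth, petallength) for m in models)
    return votes.most_common(1)[0][0]


# The base models disagree on this instance, so the vote carries information.
diverse = [rule_set, width_only, length_only]
print(majority_vote(diverse, petalwidth=1.6, petallength=5.2))

# Three copies of the same model can never outvote that model.
clones = [rule_set, rule_set, rule_set]
print(majority_vote(clones, petalwidth=1.6, petallength=5.2))
```

The second call illustrates the point of the Diversity slide: an ensemble of identical models makes exactly the same predictions, and therefore the same mistakes, as any single member.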
The need for diversity
• Overall ensemble error depends on the average error of the ensemble members and on diversity
• Increasing diversity decreases overall error
  • Provided it does not result in an increase in average error
• Unfortunately average error and diversity are highly correlated
$E = \bar{E} - \bar{A}$ (Krogh and Vedelsby, 1995), where $E$ is the ensemble error, $\bar{E}$ the average error of the members and $\bar{A}$ the average ambiguity (diversity)
[Figure: base models h1, h2, ..., hS combined by a function F(h1, h2, ..., hS) into the ensemble output y]

Diversity Measures
• Diversity is well defined for regression problems
  • Not for classification problems
• Several different heuristic diversity measures for a classification context have been proposed
• Two types of measures
  • Pairwise measures
    • Compare all pairs and average over the results
  • Non-pairwise measures
    • Measure all members together

Information Fusion
• Information fusion is the research about how to aid decision makers with different tasks, by combining data and information from various sources
• It is characterized by the necessity to gather data about objects or situations from multiple sources and combine them to enable effective decision support, often under severe time and resource constraints
• Each source can only provide information from its specific point of view and often only about some specific feature

Ensembles in Information Fusion
• One of the characteristics of information fusion is the need to combine data from several sources
  • To understand the whole picture from all the various fragments of data that are gathered
• Obviously, the use of ensembles is a very natural framework for information fusion
  • New base models can be added when new sources are added
  • Old models can be updated or dropped when they become too faulty or sources are removed or lost
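The error decomposition attributed to Krogh and Vedelsby (1995) on the need-for-diversity slide can be checked numerically. The sketch below assumes squared error, a single instance, and a uniformly weighted averaging ensemble; the member outputs and the target are made-up numbers. It computes the ensemble error E, the average member error E_bar and the average ambiguity A_bar, which should satisfy E = E_bar - A_bar.

```python
import math


def ambiguity_decomposition(member_outputs, target):
    """Return (E, E_bar, A_bar) for one instance.

    E     : squared error of the averaged ensemble output
    E_bar : average squared error of the individual members
    A_bar : average squared deviation of the members from the ensemble
            output (the ambiguity, i.e. the diversity term)
    """
    n = len(member_outputs)
    ensemble_output = sum(member_outputs) / n
    ensemble_error = (ensemble_output - target) ** 2
    avg_member_error = sum((f - target) ** 2 for f in member_outputs) / n
    avg_ambiguity = sum((f - ensemble_output) ** 2 for f in member_outputs) / n
    return ensemble_error, avg_member_error, avg_ambiguity


# Made-up regression outputs from three base models for one instance.
E, E_bar, A_bar = ambiguity_decomposition([2.0, 4.0, 6.0], target=5.0)
print(E, E_bar, A_bar)
print(math.isclose(E, E_bar - A_bar))

# With identical members the ambiguity is zero, so E equals E_bar.
E_same, E_bar_same, A_bar_same = ambiguity_decomposition([4.0, 4.0, 4.0], target=5.0)
print(E_same, E_bar_same, A_bar_same)
```

Because the ambiguity can never be negative, the ensemble error never exceeds the average member error in this setting. More diversity helps only as long as it does not raise the average member error by as much, which is the proviso stated on the slide.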
Diversity and Information Fusion
• Diversity in ensembles is achieved by dividing datasets into:
  • Different feature sets
  • Different subsets of data
  • Measurements of the problem from different perspectives
• The data used in Information Fusion often come:
  • from different kinds of sensors
  • with different intervals
  • from sensors at different positions

Problem Statement
• The main problem:
  How should ensembles be created to maximize predictive performance?
• The problem statement:
  How could measurements of diversity and predictive performance on available data be used when combining or selecting base classifiers in order to maximize ensemble predictive performance on unseen data?
• The final goal when building predictive models is to achieve as high predictive performance as possible; this is inherent in the need for a predictive model
• An ensemble can be formed either by simply combining available base classifiers, or by selecting a subset of base classifiers
  • This means that diversity and performance measures can be used either to guide the selection or as an implicit goal when creating the models to combine
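As a rough illustration of using performance and diversity measures on available data to select a subset of base classifiers, the sketch below scores every subset of a given size by its majority-vote accuracy plus an optional weight on the mean pairwise disagreement. This is not the method of the thesis. The scoring rule, the `diversity_weight` parameter, the classifier names and all predictions are assumptions made up for the example.

```python
from itertools import combinations


def accuracy(predictions, targets):
    """Fraction of instances predicted correctly."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


def ensemble_predictions(members):
    """Majority vote per instance; ties go to the label seen first."""
    result = []
    for votes in zip(*members):
        votes = list(votes)
        result.append(max(dict.fromkeys(votes), key=votes.count))
    return result


def disagreement(a, b):
    """Pairwise diversity: fraction of instances where two classifiers differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_disagreement(members):
    """Average disagreement over all pairs (needs at least two members)."""
    pairs = list(combinations(members, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


def select_subset(base_predictions, targets, size, diversity_weight=0.0):
    """Exhaustively score every subset of `size` classifiers, return the best."""
    def score(names):
        members = [base_predictions[n] for n in names]
        acc = accuracy(ensemble_predictions(members), targets)
        return acc + diversity_weight * mean_pairwise_disagreement(members)
    return max(combinations(sorted(base_predictions), size), key=score)


# Made-up predictions of four base classifiers on six labelled instances.
targets = [1, 1, 0, 0, 1, 0]
base_predictions = {
    "A": [1, 1, 0, 0, 0, 1],
    "B": [1, 1, 0, 0, 0, 1],  # identical to A, adds no new information
    "C": [0, 1, 0, 0, 1, 0],
    "D": [1, 0, 0, 1, 1, 0],
}

best = select_subset(base_predictions, targets, size=3)
print(best)
print(accuracy(ensemble_predictions([base_predictions[n] for n in best]), targets))
```

In this toy data every subset that contains both A and B repeats their shared mistakes, while a subset whose members err on different instances votes those mistakes away. Whether a score computed on available data also predicts performance on unseen data is exactly the question the problem statement raises, so the sketch should not be read as an answer to it.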