lazy associative classification

Lazy Associative Classification



  1. Lazy Associative Classification
     By Adriano Veloso, Wagner Meira Jr., Mohammad J. Zaki
     Presented by: Fariba Mahdavifard, Department of Computing Science, University of Alberta

     Contents:
     • Classification
     • Decision Tree Classifier
     • (Eager) Associative Classifier
     • Comparison between Decision Tree and Associative Classifier
     • Lazy Associative Classifier
     • Comparison between Lazy and Eager Associative Classifier
     • Shortcomings of Lazy Associative Classifier
     • Conclusion

     Classification: Model Construction and Prediction
     • Learning Step: The training data is used to construct a model which relates the feature variables to the class variable.
     • Test Step: The trained model is used to predict the class variable for test instances.
     [Diagram: Training Data -> Classification Algorithms -> Classifier (Model)]
     Example model: IF outlook = 'rainy' OR windy = 'false' THEN play = 'yes'

     Classification Models
     • Several models have been proposed over the years, such as neural networks, statistical models, decision trees (DT), genetic algorithms, etc.
     • The most suitable one for data mining is DT:
       - DTs can be constructed relatively fast.
       - DT models are simple and easy to understand.

  2. Decision Tree Classifier
     • A DT is built using a greedy recursive splitting strategy.
     • At each internal node, the best split is chosen according to the information gain criterion.
     • A decision tree can be considered as a set of disjoint decision rules, with one rule per leaf.
     • Such greedy (local) search may prune important rules!
     [Figure: decision tree applied to a test instance. Root: outlook (sunny, overcast, rainy); internal nodes: humidity (high, normal) and windy (true, false); leaves: yes / no]

     Eager Associative Classifier
     • Class association rules (CARs): χ → c
       - The antecedent (χ) is composed of feature variables.
       - The consequent (c) is the class.
     • CARs are essentially decision rules.
     • They are ranked in decreasing order of information gain.
     • During the testing phase, the associative classifier checks whether each CAR matches the test instance.
     • The class associated with the first match is chosen.
     Note:
     • A decision tree is a greedy search for CARs that only expands the current best rule.
     • The eager associative classifier mines all possible CARs with a given minimum support.
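The information gain criterion used to pick the split at each internal node can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the helper names (`entropy`, `information_gain`, `best_split`) and the four-row toy data set are assumptions made up for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, feature, target="play"):
    """Entropy of the class minus the weighted entropy after splitting on `feature`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder

def best_split(rows, features, target="play"):
    """Greedy choice: the feature with the highest information gain."""
    return max(features, key=lambda f: information_gain(rows, f, target))

# Hypothetical toy data, invented for this sketch (not taken from the slides).
toy_rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "yes"},
]
chosen = best_split(toy_rows, ["outlook", "windy"])
print(chosen)
```

In this toy data `outlook` separates the classes perfectly while `windy` carries no information, so the greedy step expands `outlook` and never examines rules that start from `windy`.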

  3. Eager Associative Classifier
     Steps:
     1. The algorithm mines all frequent CARs.
     2. Sort them in descending order of information gain.
     3. For each test instance, the first CAR matching it is used to predict the class.
     [Figure: mined CARs drawn as trees over outlook, windy, temperature and humidity, with yes / no leaves]
     Test instance: outlook=sunny, temperature=cool, humidity=high -> play???
     • The three CARs that match the test instance are:
       1. {windy=false and temperature=cool -> play=yes}
       2. {outlook=sunny and humidity=high -> play=no}
       3. {outlook=sunny and temperature=cool -> play=yes}
     The first rule would be selected, since it is the best ranked CAR.

     Comparison between Decision Tree and Associative Classifier
     • The test instance is recognized by only one rule in the decision tree.
     • The same test instance is recognized by three CARs in the associative classifier.
     • Intuitively, associative classifiers perform better than decision trees because they allow several CARs to cover the same portion of the training data.
     • Theorem 1: The rules derived from a decision tree are a subset of the CARs mined using an eager associative classifier based on information gain.
     • Theorem 2: CARs perform no worse than decision tree rules, according to the information gain principle.
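The third step (predict with the first matching CAR in a ranked list) can be sketched as below. The rule list copies the three CARs from the slide in their stated rank order. The function names are hypothetical, and the slide's test instance does not list `windy`, so the sketch assumes `windy=false` in order for all three rules to match as the slide states.

```python
def matches(antecedent, instance):
    """A CAR matches when every feature=value pair of its antecedent holds."""
    return all(instance.get(f) == v for f, v in antecedent.items())

def classify_first_match(ranked_cars, instance, default=None):
    """Return the class of the first (best ranked) CAR that matches."""
    for antecedent, label in ranked_cars:
        if matches(antecedent, instance):
            return label
    return default

# CARs from the slide, already sorted from best to worst rank.
ranked_cars = [
    ({"windy": "false", "temperature": "cool"}, "yes"),
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "sunny", "temperature": "cool"}, "yes"),
]

# Test instance from the slide; windy=false is an assumption (see above).
instance = {"outlook": "sunny", "temperature": "cool",
            "humidity": "high", "windy": "false"}

matching = [car for car in ranked_cars if matches(car[0], instance)]
prediction = classify_first_match(ranked_cars, instance)
print(len(matching), prediction)
```

A decision tree would reach exactly one leaf for this instance, whereas here three rules cover it and the ranking decides which one is used.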

  4. Lazy Learning Algorithms
     • Eager learning methods create the classification model during the learning phase using training data.
     • Lazy learning methods, in contrast, postpone generalization and building the classification model until a query is given.

     Lazy Associative Classifier
     The lazy associative classifier induces CARs specific to each test instance.
     1. The lazy associative classifier projects the training data only on features in the test instance (from all training instances, only the instances sharing at least one feature with the test instance are used).
     2. From this projected training data, CARs are induced and ranked, and the best CAR is used.
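The two steps above can be sketched as follows, under several assumptions: the function names and the six-row training set are invented for illustration, and CARs are ranked by confidence and then support rather than by information gain (a simplification, not the ranking described in the slides).

```python
from itertools import combinations

def project(training, instance, target="play"):
    """Step 1: keep rows sharing at least one feature value with the
    instance, restricted to the feature values they share with it."""
    projected = []
    for row in training:
        shared = {f: v for f, v in instance.items() if row.get(f) == v}
        if shared:
            projected.append((shared, row[target]))
    return projected

def induce_cars(projected, instance, min_support=0.4):
    """Step 2: enumerate CARs whose antecedent is a subset of the instance."""
    cars = []
    items = sorted(instance.items())
    for size in range(1, len(items) + 1):
        for antecedent in combinations(items, size):
            covered = [label for shared, label in projected
                       if all(shared.get(f) == v for f, v in antecedent)]
            for label in set(covered):
                hits = covered.count(label)
                support = hits / len(projected)
                if support >= min_support:
                    cars.append((dict(antecedent), label, support, hits / len(covered)))
    # Simplified ranking: confidence first, then support.
    return sorted(cars, key=lambda c: (c[3], c[2]), reverse=True)

def lazy_classify(training, instance, min_support=0.4):
    cars = induce_cars(project(training, instance), instance, min_support)
    return cars[0][1] if cars else None

# Hypothetical training data, invented for this sketch.
names = ("outlook", "temperature", "humidity", "play")
training = [dict(zip(names, r)) for r in [
    ("overcast", "hot", "high", "yes"), ("overcast", "cool", "normal", "yes"),
    ("sunny", "hot", "high", "no"), ("rainy", "mild", "high", "no"),
    ("rainy", "cool", "normal", "yes"), ("sunny", "mild", "normal", "yes"),
]]
instance = {"outlook": "overcast", "temperature": "hot", "humidity": "low"}
prediction = lazy_classify(training, instance)
print(prediction)
```

Because every candidate antecedent is drawn from the test instance itself, any rule that survives the support threshold is guaranteed to match the instance.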

  5. Comparison between Lazy and Eager Associative Classifier
     Test Instance: Outlook=overcast, Temperature=hot and Humidity=low -> play?
     • The set of CARs found by the eager classifier (minsup=40%) is composed of:
       1. {windy=false and humidity=normal -> play=yes}
       2. {windy=false and temperature=cool -> play=yes}
     Neither of the two CARs matches the test instance!
     • The lazy associative classifier projects the training data (D) by the features in the test instance A.
     • The projected training data (D_A) has fewer instances, therefore CARs not frequent in D may be frequent in D_A.
     • The lazy associative classifier found two CARs in D_A:
       1. {Outlook=overcast -> play=yes}
       2. {Temperature=hot -> play=yes}
     • The lazy CARs predict the correct class and they are also simpler compared to the eager ones.

     Intuitively, lazy classifiers perform better than eager classifiers because of two characteristics:
     1. Missing CARs:
        • Eager classifiers search for CARs in a large search space.
        • This strategy generates a large rule-set, but CARs that are important for some specific test instances may be missed!
        • Lazy classifiers focus the search for CARs on a much smaller search space, which is induced by the features of the test instance.
     2. Highly Disjunctive Spaces:
        • Eager classifiers often combine small disjuncts to generate more general predictions. This reduces classification performance in highly disjunctive spaces, where a single disjunct may be important for classifying specific instances.
        • Lazy classifiers generalize their training examples exactly as needed to cover the test instance. This is more appropriate in complex search spaces!
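The claim that a CAR which is not frequent in D may become frequent in D_A comes down to the support denominator shrinking. A small sketch, assuming a hypothetical ten-row data set (not the one behind the slides) and invented helper names:

```python
def support(rows, antecedent, label, target="play"):
    """Fraction of rows that satisfy the antecedent and carry the label."""
    hits = sum(1 for r in rows
               if r[target] == label
               and all(r.get(f) == v for f, v in antecedent.items()))
    return hits / len(rows)

def project_rows(rows, instance):
    """D_A: rows of D sharing at least one feature value with the instance."""
    return [r for r in rows if any(r.get(f) == v for f, v in instance.items())]

# Hypothetical data set D, invented for this sketch.
names = ("outlook", "temperature", "humidity", "play")
D = [dict(zip(names, r)) for r in [
    ("overcast", "hot", "high", "yes"), ("overcast", "cool", "normal", "yes"),
    ("sunny", "hot", "high", "no"), ("sunny", "mild", "high", "no"),
    ("sunny", "cool", "normal", "yes"), ("rainy", "mild", "high", "yes"),
    ("rainy", "cool", "normal", "yes"), ("rainy", "cool", "normal", "no"),
    ("sunny", "mild", "normal", "yes"), ("rainy", "mild", "normal", "yes"),
]]
A = {"outlook": "overcast", "temperature": "hot", "humidity": "low"}
D_A = project_rows(D, A)

rule = {"outlook": "overcast"}
support_in_D = support(D, rule, "yes")
support_in_D_A = support(D_A, rule, "yes")
print(support_in_D, support_in_D_A)
```

With minsup=40%, the rule {outlook=overcast -> play=yes} is discarded when mined over D in this toy data but kept when mined over D_A, because the same two supporting rows are divided by three rows instead of ten.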

  6. Shortcomings of Lazy Associative Classifier
     First Problem:
     • The more CARs are generated, the better the classifier??!
     • NO! It sometimes leads to overfitting, reducing generalization and affecting the classification accuracy.
     • Overfitting and high sensitivity to irrelevant features are shortcomings of the lazy classifier.
     • Features should be selected carefully.

     Second Problem:
     • The lazy classifier typically requires more work to classify all test instances.
     • A caching mechanism is used to decrease this workload.
     • The basic idea of caching: different test instances may induce different rule-sets, but different rule-sets may share common CARs.
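The caching idea can be sketched as a dictionary keyed by antecedent, so that a CAR shared by several test instances is counted only once. This is an assumed design, not the mechanism from the original work: the `CarCache` class, the `antecedents` helper and the three-row data set are all hypothetical. It caches raw class counts, which do not depend on the test instance because every row matching an antecedent drawn from the instance is also in that instance's projection.

```python
from collections import Counter
from itertools import combinations

class CarCache:
    """Caches class counts per antecedent so shared CARs are counted once."""

    def __init__(self, training, target="play"):
        self.training = training
        self.target = target
        self.counts = {}
        self.hits = 0
        self.misses = 0

    def class_counts(self, antecedent):
        """`antecedent` is a tuple of (feature, value) pairs."""
        key = frozenset(antecedent)
        if key in self.counts:
            self.hits += 1
        else:
            self.misses += 1
            self.counts[key] = Counter(
                row[self.target] for row in self.training
                if all(row.get(f) == v for f, v in antecedent)
            )
        return self.counts[key]

def antecedents(instance):
    """All non-empty subsets of the instance's feature=value pairs."""
    items = sorted(instance.items())
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

# Hypothetical training data and test instances, invented for this sketch.
training = [
    {"outlook": "overcast", "temperature": "hot", "play": "yes"},
    {"outlook": "overcast", "temperature": "cool", "play": "yes"},
    {"outlook": "sunny", "temperature": "hot", "play": "no"},
]
cache = CarCache(training)
for instance in ({"outlook": "overcast", "temperature": "hot"},
                 {"outlook": "overcast", "temperature": "cool"}):
    for antecedent in antecedents(instance):
        cache.class_counts(antecedent)
print(cache.hits, cache.misses)
```

The two test instances induce different rule-sets, but both contain the antecedent outlook=overcast, so its counts are computed for the first instance and reused for the second.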
