

  1. Emil Brissman & Kajsa Eriksson, 2011-12-07

     Agenda
     - Background
     - Techniques
     - Example
     - Applications
     - Summary

  2. Background: The problem

     Decision trees:
     - need to have a low prediction error
     - should not be overfitted to the training data
     - should generally have low complexity, for interpretability

     Background: Existing solutions

     - Pruning: a post-process that reduces overfitting and decision tree complexity. Risk: the tree may become underfitted.
     - Boosting: reduces prediction error by applying a series of separate classifiers and then combining them. The complexity of the resulting model increases drastically.

  3. Background: Grafting

     - The aim is to show that a more complex tree can have a lower prediction error without being overfitted to the training data.
     - The idea is to reclassify regions of the instance space that contain no training data, or only misclassified data.
     - Reclassification raises the probability of correctly classifying data that falls into empty regions.

     Techniques (1/2)

     - There are four grafting algorithms, each built upon the previous one.
     - All are post-processes for the C4.5 classification technique.
     - C4.5X is the first algorithm, developed just to test the theory of grafting.
     - C4.5+ is a formal grafting algorithm, developed because of the success of C4.5X.

  4. Techniques (2/2)

     - C4.5++ is a further development that is proven not to produce overfitting in the tree; in other words, it balances the bias and variance of the tree.
     - C4.5A is the fourth and final algorithm, a performance update of the previous ones: by considering a smaller set of data, the computational time is reduced.

     Example (1/4)

     [Figure: partition of the instance space produced by C4.5; the tree tests A > 7 / A <= 7 at the root, A > 2 / A <= 2 below it, then B > 5 / B <= 5, with leaves labelled ◊ and *. A nested-test sketch of this tree follows below.]
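     As a reading aid, here is the example partition written as nested tests in Java. The leaf labels are read off the slide figure where legible; the labels of the two B-split leaves could not be fully recovered from the extraction, so the ones below are assumptions, with one of them taken to be the *-classified "blue" leaf discussed on the next slide.

     ```java
     public class C45ExampleTree {
         // The C4.5 tree from the example, written as nested tests.
         // Leaf labels follow the slide figure where readable; the two
         // B-split leaf labels are assumptions for this sketch.
         static String classify(double a, double b) {
             if (a > 7) return "◊";
             if (a <= 2) return "*";
             // 2 < a <= 7: split on B
             if (b > 5) return "◊";   // assumed label
             return "*";              // assumed: the "blue" leaf containing the ? region
         }

         public static void main(String[] args) {
             System.out.println(classify(4, 3)); // falls in the blue leaf -> *
         }
     }
     ```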

  5. Example (2/4)

     - The blue region is a leaf of the induced decision tree that C4.5 classified as *.
     - But what is really the most likely class for the area marked with "?"
     - By applying grafting as a post-process, a new prediction for that area can be made.

     Example (3/4)

     - Step 1: For each leaf, the algorithm visits all ancestor nodes and tries to find possible cuts that split the leaf region.
     - Step 2: It chooses the cuts with the highest Laplacian accuracy estimate (a small sketch of this selection step follows below):
       Laplace = (P + 1) / (T + 2)
       where T is the number of instances below a certain ancestor and P is the number of instances of the majority class below that same ancestor.
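     A minimal sketch of the Laplace-based cut selection. The CandidateCut record and its field names are illustrative assumptions for this sketch, not names from the grafting papers; the real algorithm gathers P and T while walking the ancestor chain of each leaf.

     ```java
     import java.util.Comparator;
     import java.util.List;

     public class LaplaceCutSelection {

         // Laplacian accuracy estimate from the slide: (P + 1) / (T + 2),
         // where T = instances below an ancestor node and P = instances
         // of the majority class below that same ancestor.
         static double laplace(int p, int t) {
             return (p + 1.0) / (t + 2.0);
         }

         // Illustrative holder for a candidate cut found while visiting
         // the ancestors of a leaf (names are assumptions for this sketch).
         record CandidateCut(String attribute, double threshold, int p, int t) {
             double support() { return laplace(p, t); }
         }

         // Step 2: choose the candidate cut with the highest Laplace estimate.
         static CandidateCut bestCut(List<CandidateCut> candidates) {
             return candidates.stream()
                     .max(Comparator.comparingDouble(CandidateCut::support))
                     .orElseThrow();
         }

         public static void main(String[] args) {
             // e.g. a cut supported by 8 of 10 instances vs. 3 of 3 instances:
             System.out.println(laplace(8, 10)); // 0.75
             System.out.println(laplace(3, 3));  // 0.8 -> preferred
         }
     }
     ```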

  6. Example (4/4)

     - Step 3: The best-supported cuts are introduced into the decision tree as new branches and leaves with a more likely class.
     - Result: 3 new leaves:
       a - ◊
       b - *
       c - ◊
     - The region with the "?" now belongs to class ◊.

     Applications

     - Grafting as a post-process to C4.5 is implemented in Weka as J48graft (a minimal usage sketch follows below).
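     A minimal sketch of running J48graft through Weka's Java API, assuming a Weka distribution of the era that bundles J48graft in weka.classifiers.trees, and an ARFF file such as iris.arff (the file name is just an example):

     ```java
     import weka.classifiers.trees.J48graft;
     import weka.core.Instances;
     import weka.core.converters.ConverterUtils.DataSource;

     public class GraftDemo {
         public static void main(String[] args) throws Exception {
             // Load a dataset and mark the last attribute as the class.
             Instances data = DataSource.read("iris.arff");
             data.setClassIndex(data.numAttributes() - 1);

             // Build a C4.5-style (J48) tree with grafting as a post-process.
             J48graft tree = new J48graft();
             tree.buildClassifier(data);

             // Print the resulting grafted tree.
             System.out.println(tree);
         }
     }
     ```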

  7. Summary

     - Grafting is a post-process that successfully reduces the prediction error of a decision tree by re-evaluating areas of the instance space where no training data exists.
     - It has been proven that the increased complexity of a grafted tree does not mean that the tree is more overfitted.
     - Grafting together with pruning most often gives even better results, probably because the two algorithms complement each other.

     Bibliography

     - Kumar, V., Steinbach, M. & Tan, P.-N. (2006). Introduction to Data Mining. Pearson College Div.
     - Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Los Altos: Morgan Kaufmann.
     - University of Waikato. Weka 3: Data Mining Software in Java. http://www.cs.waikato.ac.nz/ml/weka/index.html [2011-12-01]
     - Webb, G.I. (1996). Further Experimental Evidence against the Utility of Occam's Razor. Journal of Artificial Intelligence Research, vol. 4, pp. 397-417.
     - Webb, G.I. (1997). Decision Tree Grafting. Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI'97), vol. 2, pp. 846-851.
     - Webb, G.I. (1999). Decision Tree Grafting From the All-Tests-But-One Partition. Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI'99), vol. 2, pp. 702-707.
