Pattern-Based Classification: A Unifying Perspective (LeGo)


  1. Pattern-Based Classification: A Unifying Perspective (LeGo). Bled, Slovenia, 07.09.2009. Albrecht Zimmermann, Siegfried Nijssen, Björn Bringmann. Katholieke Universiteit Leuven, Belgium.

  2. Observations: the LeGo schema. [Diagram: DB → Mining → Pattern Set (PS) → Selection → Feature Set (PS) → Induction → Model (M)] A general schema that augments/replaces the data mining step in KDD; the topic of this workshop.
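The DB → Mining → Pattern Set → Selection → Feature Set → Induction → Model pipeline on this slide can be sketched end-to-end in Python. Everything below is illustrative: the single-item miner, the top-k selector, and the majority-vote model are toy stand-ins I chose for the sketch, not the algorithms discussed in the talk.

```python
from collections import Counter, defaultdict

def mine(db, min_support):
    """Mining: return single-item patterns frequent in db (a toy
    stand-in for a real pattern miner)."""
    counts = Counter(item for t in db for item in t)
    return [frozenset([i]) for i, c in sorted(counts.items()) if c >= min_support]

def select(patterns, db, k):
    """Selection: keep the k patterns with the highest support."""
    support = lambda p: sum(p <= t for t in db)
    return sorted(patterns, key=support, reverse=True)[:k]

def induce(features, db, labels):
    """Induction: predict the majority label among training transactions
    sharing the same pattern signature (toy model)."""
    sig = lambda t: tuple(p <= t for p in features)
    votes = defaultdict(Counter)
    for t, y in zip(db, labels):
        votes[sig(t)][y] += 1
    return lambda t: votes[sig(t)].most_common(1)[0][0]

db = [frozenset("ab"), frozenset("ac"), frozenset("bc"), frozenset("c")]
labels = ["+", "+", "-", "-"]
patterns = mine(db, min_support=2)       # pattern set
features = select(patterns, db, k=2)     # feature set
model = induce(features, db, labels)     # model
```

Each function corresponds to one arrow of the schema, which is what makes the steps independently replaceable.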

  3. Observations (cont.) [Same LeGo diagram: DB → Mining → Pattern Set → Selection → Feature Set → Induction → Model] Instantiations at each step: exhaustive or heuristic mining; frequent, closed, or correlating patterns; decision tree, decision list, or SVM models.

  4. Observations (cont.) Problem: no overview of the field (Ramamohanarao et al. '07).

  5.-6. Observations (cont.) No overview → reinventions → revisited dead ends → lost progress.

  7. What patterns and how? Which pattern type: itemsets, multi-itemsets, sequences, trees, graphs. Which data structure: FP-Trees, ZBDDs, TID-Lists, Bit-Vectors.

  8. What patterns and how? (cont.) The results hold for lattices (itemsets) or even partial orders (graphs), independent of the pattern type: Sequences ⊂ Trees ⊂ Graphs.

  9. What patterns and how? (cont.) The results are independent of both the pattern type and the data structure.
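The TID-lists and bit-vectors named on these slides are easy to illustrate. In this sketch (toy data, my own example), a pattern's support is the size of the intersection of its items' TID sets; the bit-vector variant encodes each TID set as an integer and intersects with bitwise AND.

```python
db = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]

# TID-lists: item -> set of ids of the transactions containing it.
tid = {}
for t_id, t in enumerate(db):
    for item in t:
        tid.setdefault(item, set()).add(t_id)

def support(pattern):
    """Support of a pattern = |intersection of its items' TID lists|."""
    tids = set(range(len(db)))
    for item in pattern:
        tids &= tid.get(item, set())
    return len(tids)

# Bit-vector variant: bit i of bits[item] is set iff transaction i
# contains the item; intersection becomes a bitwise AND.
bits = {item: sum(1 << t for t in ts) for item, ts in tid.items()}

def support_bits(pattern):
    v = (1 << len(db)) - 1
    for item in pattern:
        v &= bits.get(item, 0)
    return bin(v).count("1")
```

Both representations answer the same support queries, which is why they are interchangeable independent of pattern type.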

  10.-11. Why mine explicit patterns? (Excursus: why should we care in the first place, apart from attending the workshop?) Traditional classification: attributes {A1, ..., Ad} with values V(A) = {v1, ..., vr}. Decision trees test attributes one at a time (A1=v2, then A4=v1 or A3=v2); rules are conjunctions: A1=v2 ∧ A4=v1 ⇒ +, A3=v2 ∧ A2=v1 ⇒ -.
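The two example rules on this slide can be encoded directly. The dictionary encoding below is hypothetical, and the decision-list semantics (first matching rule fires) and the "?" default label are assumptions made for the sketch.

```python
# The slide's rules: A1=v2 ∧ A4=v1 ⇒ +  and  A3=v2 ∧ A2=v1 ⇒ -
rules = [
    ({"A1": "v2", "A4": "v1"}, "+"),
    ({"A3": "v2", "A2": "v1"}, "-"),
]

def classify(instance, rules, default="?"):
    """Decision-list semantics: the first rule whose conditions all
    hold on the instance determines the prediction."""
    for conditions, label in rules:
        if all(instance.get(a) == v for a, v in conditions.items()):
            return label
    return default
```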

  12. Why mine explicit patterns? Pattern-based classification: transactions are structured, t ⊆ {i1, ..., in}. Patterns provide the instance description; models can be built independent of the data type; this yields interpretable classifiers, whereas the alternatives are opaque (kernels, NN, ...).
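Concretely, "patterns provide instance description" means each mined pattern becomes one binary feature: does the pattern occur in the transaction? A minimal sketch, with toy patterns assumed:

```python
# Two example patterns; in practice these come from the mining step.
patterns = [frozenset("a"), frozenset({"b", "c"})]

def featurize(t, patterns):
    """Map a transaction to a binary vector: entry i is 1 iff
    patterns[i] is a subset of t."""
    return [int(p <= t) for p in patterns]
```

The resulting vectors can be fed to any feature-based learner, which is exactly the model-independence the slide claims.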

  13. Thus: leverage pattern mining techniques. Advantages: 15 years of research → fast and scalable; patterns are described in a structured language → persistent, not opaque. Challenge(s): (re-)entangle instance description and classification.

  14. Roadmap: class-sensitive patterns and the mining thereof; model-independence (post-processing, iterative mining); model-dependence (post-processing, iterative mining).

  15. Roadmap (cont.) Disclaimer: we will probably miss some approaches that should have been included in the presentation, which just proves our point.

  16. Should we use frequent patterns? Pro: well-researched; frequent → expected to hold on unseen data; efficient mining. Con: which threshold?; frequent → no/anti-correlation with classes; (too) many patterns.
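To make the threshold question concrete, here is a minimal levelwise (Apriori-style) frequent-itemset miner; the anti-monotonicity of support justifies growing only frequent sets. This is a sketch for illustration, not the implementation of any system cited in the talk.

```python
from itertools import combinations

def frequent_itemsets(db, min_support):
    support = lambda s: sum(s <= t for t in db)
    items = sorted({i for t in db for i in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    result = list(level)
    while level:
        # Join: combine frequent sets that differ in exactly one item.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in sorted(candidates, key=sorted)
                 if support(c) >= min_support]
        result.extend(level)
    return result
```

Lowering min_support inflates the result set rapidly, which is the "(too) many patterns" problem on the slide.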

  17. New item! Class-sensitive patterns: taking the relationship to class labels into account. Timeline (1991-2009): Nuggets '94; Subgroup Descriptions '96 (SGD); Interesting Rules '98 (IR); Class-Association Rules '98 (CAR); Emerging Patterns '99 (EP); Contrast Sets '99 (CS); Correlating Patterns '00 (CP); Jumping Emerging Patterns '01 (JEP); Version Space Patterns '01; Discriminative Patterns '07 (DP). (Taking no sides / not subscribing to a particular universe.)

  18. Evaluating class-sensitivity: confidence, lift, WRAcc (novelty), χ², correlation coefficient, information gain, Fisher score. Some of them are mathematically equivalent, some semantically (Lavrač et al. '09).
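Several of these measures are simple functions of one 2×2 contingency table. A sketch, using the standard definitions (WRAcc here taken as coverage × (precision − prior)); it assumes all margins are nonzero:

```python
def measures(p, n, P, N):
    """The pattern covers p of P positives and n of N negatives."""
    cov, tot = p + n, P + N
    confidence = p / cov
    lift = confidence / (P / tot)
    wracc = (cov / tot) * (confidence - P / tot)
    chi2 = 0.0  # chi-squared over the pattern x class contingency table
    for obs, row, col in [(p, cov, P), (n, cov, N),
                          (P - p, tot - cov, P), (N - n, tot - cov, N)]:
        exp = row * col / tot
        chi2 += (obs - exp) ** 2 / exp
    return confidence, lift, wracc, chi2
```

That all four read off the same four counts (p, n, P, N) is what makes the measures comparable, and some of them equivalent.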

  19. How to mine them? Two families: mining frequent patterns & post-processing, or bounding a specific measure. Wrobel '97 (SGD); Liu et al. '98 (CAR); Bay et al. '99 (CS); Wang et al. '05 (CAR); Kavšek et al. '06 (SGD); Atzmüller et al. '06 (SGD); Arunasalam et al. '06 (CAR); Cheng et al. '07 (DP); Nowozin et al. '07 (CAR); Cheng et al. '08 (DP) (1 bound). Legend: CAR = Class-Association Rules, CS = Contrast Sets, DP = Discriminative Patterns, SGD = SubGroup Descriptions.

  20. How to? (cont.) General approaches. Branch-and-bound: Webb '95 (CAR), Klösgen '96 (SGD), Morishita et al. '00. Sequential sampling: Scheffer et al. '02 (SGD). Iterative deepening: Bringmann et al. '06 (CP), Cerf et al. '08 (CAR), Yan et al. '08 (DP) (2-bounds), Grosskreutz et al. '08 (SGD), Nijssen et al. '09 (4-bounds)*. Earlier than most of the measure-specific methods, and subsumes them! *) itemset-specific, constraint programming.
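The branch-and-bound idea common to these approaches: depth-first search over the pattern lattice, pruning a branch once an optimistic upper bound on any refinement's score cannot beat the best pattern found so far. The sketch below uses a deliberately simple score, p − n (covered positives minus covered negatives); refinement can only shrink coverage, so p itself is a valid upper bound. This is not the bound of any specific cited paper.

```python
def branch_and_bound(db, labels, items):
    best = (float("-inf"), None)

    def counts(pattern):
        p = sum(1 for t, y in zip(db, labels) if pattern <= t and y == "+")
        n = sum(1 for t, y in zip(db, labels) if pattern <= t and y == "-")
        return p, n

    def search(pattern, remaining):
        nonlocal best
        p, n = counts(pattern)
        if p - n > best[0]:
            best = (p - n, pattern)
        if p <= best[0]:  # optimistic bound: no refinement can win, prune
            return
        for i, item in enumerate(remaining):
            search(pattern | {item}, remaining[i + 1:])

    search(frozenset(), sorted(items))
    return best

db = [frozenset("ab"), frozenset("a"), frozenset("b"), frozenset("ab")]
labels = ["+", "+", "-", "-"]
score, pattern = branch_and_bound(db, labels, "ab")
```

Tighter, measure-specific bounds (e.g. for χ²) prune more, which is exactly what the cited "2-bounds"/"4-bounds" work refines.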

  21. What traversal strategy? Seriously?

  22. Result sets are still too big; they may include irrelevant patterns, and much redundancy.

  23. The (extended) LeGo. [Diagram: DB → Mining → Pattern Set (PS) → Selection → Feature Set (PS) → Induction → Model (M)] Annotations: pattern set constraint; model constraint.

  24. The (extended) LeGo (cont.) Annotations: mining constraint; optimisation criteria; model constraint.

  25.-26. The (extended) LeGo (cont.) Annotations: model-independent iterative mining; model-independent post-processing; mining constraint; optimisation criteria; model constraint.

  27. The (extended) LeGo (cont.) Annotations: model-independent iterative mining; model-independent post-processing; model-dependent post-processing; model-dependent iterative mining.

  28. Model-independence. Only patterns affect other patterns' selection. Modular: usable in any classifier (often an SVM).

  29. Model-independent post-processing. Mine a large set of patterns, then select a subset: exhaustively (too expensive) or heuristically (usually ordered). Use a measure to quantify the combined worth.

  30. Model-independent post-processing: pattern set scores. Pattern sets can be scored based on:
  • the TID lists of the patterns only (computable for all data types): significance, incorporating support/class-sensitivity; redundancy, as similarity between TID lists;
  • pattern structure & TID lists (requires specialization): using a pattern distance measure, or by computing how well the patterns compress the data.
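The "similarity between TID lists" redundancy score can be instantiated, for example, as Jaccard similarity; this is one possible choice for the sketch, not the slides' specific proposal.

```python
def tid_list(pattern, db):
    """TID list: ids of the transactions a pattern occurs in."""
    return frozenset(i for i, t in enumerate(db) if pattern <= t)

def redundancy(tids_a, tids_b):
    """Jaccard similarity of two TID lists; 1.0 = fully redundant."""
    union = tids_a | tids_b
    return len(tids_a & tids_b) / len(union) if union else 0.0

db = [{"a", "b"}, {"a"}, {"a", "b"}, {"b"}]
r = redundancy(tid_list(frozenset("a"), db), tid_list(frozenset("b"), db))
```

Because only TID lists are compared, the score works for any pattern type, as the slide notes.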

  31.-32. Model-independent post-processing: exhaustive. Disclaimer: the following algorithms should be considered illustrative examples, NOT recommendations! Other approaches vary.
  • Knobbe et al. '06: exhaustive enumeration; explicit size constraint; boundable pruning; implicit redundancy control.
  • De Raedt et al. '07: exhaustive enumeration; arbitrary constraints; monotone, boundable pruning; explicit redundancy control (entropy).
  Extremely large search space → scalability issues. Counter-intuitive result: all sets.

  33.-34. Model-independent post-processing: heuristic search strategies.
  • Fixed order: scan the patterns in a (possibly random) fixed order; add each pattern that improves the running score. O(n). Example order: P6, P1, P4, P2, P3, P5, P7, P8, P9.
  • Greedy: repeatedly reorder the patterns to pick the pattern that improves the score most. O(n²). Example pick order: P7, P5, P1, P3, P8, P9, P2, P4, P6.
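The two strategies translate almost directly into code. Here `score` stands for any pattern-set quality measure; the usage example plugs in plain transaction coverage, an assumption made for illustration.

```python
def fixed_order(patterns, score):
    """Fixed order: one pass; keep a pattern iff it improves the
    running set score -- O(n) score evaluations."""
    selected = []
    for p in patterns:
        if score(selected + [p]) > score(selected):
            selected.append(p)
    return selected

def greedy(patterns, score):
    """Greedy: repeatedly add the single best-improving pattern --
    O(n^2) score evaluations."""
    selected, remaining = [], list(patterns)
    while remaining:
        best = max(remaining, key=lambda p: score(selected + [p]))
        if score(selected + [best]) <= score(selected):
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Example score: number of transactions covered, with each pattern
# given directly as its TID set for simplicity.
coverage = lambda sel: len(set().union(*sel))
pats = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 2, 3})]
```

On this toy input the two strategies already disagree: fixed order keeps the first two patterns, greedy keeps only the third, which covers everything on its own.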
