

  1. For Monday • Read FOIL paper • No homework

  2. Program 2 • Questions?

  3. Rule Learning • Why learn rules?

  4. Propositional Rule Learning • Basic if-then rules • Condition is typically a conjunction of attribute tests

  5. Basic Approaches • Decision tree → rules • Neural network → rules (TREPAN) • Sequential covering algorithms – Top-down – Bottom-up – Hybrid

  6. Decision Tree Rules • Resulting rules may contain unnecessary antecedents, resulting in over-fitting. • Rules are post-pruned. • Resulting rules may lead to conflicting conclusions on some instances. • Sort rules by training (validation) accuracy to create an ordered decision list. • The first rule that applies is used to classify a test instance (see the sketch below). red ∧ circle → A (97% train accuracy) red ∧ big → B (95% train accuracy) … Test case: <big, red, circle> assigned to class A
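A minimal Python sketch of the ordered-decision-list idea above (all names are illustrative, not from the slides): rules are kept sorted by training accuracy, and the first rule whose condition matches classifies the instance.

```python
# Ordered decision list sketch. An instance is the set of attribute
# values that hold for it; a condition is a conjunction, i.e. a set
# of values that must all be present.

def classify(decision_list, instance, default=None):
    # Rules are assumed pre-sorted by descending training accuracy.
    for condition, label, _accuracy in decision_list:
        if condition <= instance:  # subset test: conjunction matches
            return label           # the first applicable rule wins
    return default

# The slides' example: both rules match <big, red, circle>, but the
# higher-accuracy rule is tried first, so the class is A.
rules = [
    ({"red", "circle"}, "A", 0.97),
    ({"red", "big"},    "B", 0.95),
]
print(classify(rules, {"big", "red", "circle"}))  # -> A
```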

  7. Sequential Covering

  8. Minimum Set Cover

  9–15. Greedy Sequential Covering Example • [Figure series: X–Y scatter plots of positive (+) examples; each slide adds a rule covering a cluster of positives and removes the covered points, until no positives remain.]
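The run above is the standard greedy loop: learn one good rule, remove the positives it covers, and repeat. A minimal sketch, assuming a caller-supplied learn_one_rule subroutine (hypothetical) that returns a predicate chosen to cover many remaining positives and few negatives:

```python
# Greedy sequential covering sketch. learn_one_rule is a hypothetical
# stand-in for any single-rule learner (top-down, bottom-up, ...).

def sequential_covering(positives, negatives, learn_one_rule):
    rules = []
    remaining = list(positives)
    while remaining:
        rule = learn_one_rule(remaining, negatives)  # rule(x) -> bool
        covered = [x for x in remaining if rule(x)]
        if not covered:
            break  # no useful rule could be found; stop early
        rules.append(rule)
        # Greedy step, as in the slides: drop the covered positives
        # and learn the next rule on whatever is left.
        remaining = [x for x in remaining if not rule(x)]
    return rules
```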

  16. Non-optimal Covering Example • [Figure: X–Y scatter plot of positive (+) examples laid out so that the greedy covering is not the minimum-size covering.]

  17–24. Greedy Sequential Covering Example • [Figure series: the greedy algorithm covers and removes the positives from slide 16 one rule at a time, ending with more rules than the optimal covering.]

  25. Learning a Rule • Two basic approaches: – Top-down – Bottom-up

  26–30. Top-Down Rule Learning Example • [Figure series: X–Y scatter plots; starting from the most general rule, the constraints Y > C₁, X > C₂, Y < C₃, and X < C₄ are added one at a time, shrinking the covered region around the positive (+) examples.]
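A sketch of the general-to-specific search those slides illustrate, for 2-D numeric points. The greedy scoring rule here (covered positives minus covered negatives) is an illustrative assumption, not the slides' exact criterion:

```python
# Top-down rule learning sketch: start with the most general rule
# (no tests) and greedily add the threshold test that best separates
# the remaining covered examples, mirroring the Y > C1, X > C2, ...
# specialization steps in the slides.

def covers(rule, point):
    return all(test(point) for test in rule)

def candidate_tests(examples):
    """Threshold tests on each coordinate of each example."""
    tests = []
    for x, y in examples:
        tests += [lambda p, c=x: p[0] > c, lambda p, c=x: p[0] < c,
                  lambda p, c=y: p[1] > c, lambda p, c=y: p[1] < c]
    return tests

def learn_rule_top_down(positives, negatives, max_literals=4):
    rule = []  # most general rule: covers every example
    for _ in range(max_literals):
        pos = [p for p in positives if covers(rule, p)]
        neg = [n for n in negatives if covers(rule, n)]
        if not neg:
            break  # the rule is already consistent with the data
        # Keep the most positives while excluding the most negatives.
        rule.append(max(candidate_tests(pos + neg),
                        key=lambda t: sum(t(p) for p in pos)
                                      - sum(t(n) for n in neg)))
    return rule
```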

  31–41. Bottom-Up Rule Learning Example • [Figure series: X–Y scatter plots of positive (+) examples; a rule starts as a single positive example and is generalized step by step to cover neighboring positives.]
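A matching specific-to-general sketch: the rule starts as a single positive example and is widened, here with a bounding box as the (assumed, illustrative) generalization operator, until taking in another positive would also admit a negative.

```python
# Bottom-up rule learning sketch for 2-D points. Assumes at least
# one positive example; the bounding-box operator is illustrative.

def box_covers(box, p):
    (x_lo, x_hi), (y_lo, y_hi) = box
    return x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi

def grow(box, p):
    """Smallest box containing the current box and point p."""
    (x_lo, x_hi), (y_lo, y_hi) = box
    return ((min(x_lo, p[0]), max(x_hi, p[0])),
            (min(y_lo, p[1]), max(y_hi, p[1])))

def learn_rule_bottom_up(positives, negatives):
    seed = positives[0]
    box = ((seed[0], seed[0]), (seed[1], seed[1]))  # most specific rule
    while True:
        uncovered = [p for p in positives if not box_covers(box, p)]
        if not uncovered:
            return box
        nearest = min(uncovered, key=lambda p: abs(p[0] - seed[0])
                                               + abs(p[1] - seed[1]))
        wider = grow(box, nearest)
        if any(box_covers(wider, n) for n in negatives):
            return box  # generalizing further would cover a negative
        box = wider
```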

  42. Algorithm Specifics • Metrics – How do we pick literals to add to our rules? • Handling continuous features • Pruning
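The assigned FOIL paper answers the "which literal?" question with an information-gain metric; a sketch of that standard formula (function and variable names are mine, not the paper's notation):

```python
from math import log2

def foil_gain(p0, n0, p1, n1):
    """FOIL information gain for adding one literal to a rule.

    p0, n0: positives/negatives covered before adding the literal
    p1, n1: positives/negatives covered after adding it
    (Propositional case, so the t in FOIL's formula equals p1.)
    """
    if p1 == 0:
        return float("-inf")  # the literal kills all positive coverage
    return p1 * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))

# A rule covering 6 positives and 6 negatives; a candidate literal
# keeps 5 positives and excludes all but 1 negative.
print(foil_gain(6, 6, 5, 1))  # positive gain: the rule gets purer
```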

  43. Rules vs. Trees

  44. Top-down vs Bottom-up

  45. Rule Learning vs. Knowledge Engineering • An influential experiment by Michalski (1980) compared an early rule-learning method (AQ) to knowledge engineering (acquiring rules by interviewing experts). • Experts are notoriously poor at articulating their own knowledge. • Knowledge-engineered rules: – Weights associated with each feature in a rule – Method for summing evidence similar to certainty factors – No explicit disjunction • Data for induction: – Examples of 15 soybean plant diseases described using 35 nominal and discrete ordered features, 630 total examples – 290 “best” (diverse) training examples selected for training; the remainder used for testing • What is wrong with this methodology?

  46. “Soft” Interpretation of Learned Rules • Certainty of match calculated for each category. • Scoring method: – Literals: 1 if matched, -1 if not – Terms (conjunctions in the antecedent): average of the literal scores – DNF (disjunction of rules): probabilistic sum c₁ + c₂ − c₁c₂ • Sample score for the instance A ∧ B ∧ ¬C ∧ D ∧ ¬E ∧ F: A ∧ B ∧ C → P scores (1 + 1 − 1)/3 = 0.333; D ∧ E ∧ F → P scores (1 − 1 + 1)/3 = 0.333; total score for P: 0.333 + 0.333 − 0.333 × 0.333 = 0.555 • Threshold of 0.8 certainty to include in the possible diagnosis set.
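The scoring method above is easy to check in code; a minimal sketch reproducing the slide's arithmetic (the data representation and the clamping of negative term scores are my assumptions):

```python
# "Soft" rule-match score: literals score +/-1, a conjunction scores
# the average of its literals, and rules for the same category are
# combined with the probabilistic sum c1 + c2 - c1*c2.

def term_score(instance, term):
    """term: list of (proposition, polarity); instance: set of true propositions."""
    scores = [1 if ((lit in instance) == pos) else -1 for lit, pos in term]
    return sum(scores) / len(scores)

def category_certainty(instance, rules):
    total = 0.0
    for term in rules:
        c = max(term_score(instance, term), 0.0)  # assumption: ignore negative scores
        total = total + c - total * c             # probabilistic sum
    return total

# Instance A ∧ B ∧ ¬C ∧ D ∧ ¬E ∧ F, i.e. {A, B, D, F} are true.
instance = {"A", "B", "D", "F"}
rules_for_P = [
    [("A", True), ("B", True), ("C", True)],  # (1 + 1 - 1)/3 = 0.333
    [("D", True), ("E", True), ("F", True)],  # (1 - 1 + 1)/3 = 0.333
]
print(round(category_certainty(instance, rules_for_P), 3))  # 0.556 (the slide's 0.555 uses rounded 0.333s)
```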

  47. Experimental Results • Rule construction time: – Human: 45 hours of expert consultation – AQ11: 4.5 minutes of training on an IBM 360/75 • What doesn’t this account for? • Test accuracy:

                1st choice correct   Some choice correct   Number of diagnoses
    AQ11              97.6%                100.0%                 2.64
    Manual KE         71.8%                 96.9%                 2.90
