  1. A meta-learning system for multi-instance classification. Gitte Vanwinckelen and Hendrik Blockeel, KU Leuven, Belgium

  2. Motivation ● We performed an extensive evaluation of multi-instance (MI) learners on datasets from different domains ● The performance of MI algorithms is very sensitive to the application domain ● Can we formalize this knowledge by learning a meta-model?

  3. Outline 1) Motivation 2) What is multi-instance learning? 3) Design principles of the meta-model 4) Performance evaluation of MI learners 5) Meta-learning results 6) Conclusion

  4. MI learning

  5. Relationship between instances and bags ● Traditional MI learning – At least one positive instance in a bag – Learn a concept that describes all positive instances (or bags) ● Generalized MI learning – All instances in a bag contribute to its label – Learn a concept that identifies the positive bags (see the sketch below)
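The two assumptions can be made concrete with a tiny sketch (illustrative only, not the authors' code); the 0.5 threshold in the collective variant is an arbitrary choice for the example.

```python
# Illustrative sketch: how a bag label can be derived from instance-level
# evidence under two MI assumptions.

def standard_mi_label(instance_labels):
    """Standard MI assumption: a bag is positive iff it contains
    at least one positive instance."""
    return int(any(instance_labels))

def collective_mi_label(instance_scores, threshold=0.5):
    """Collective (generalized) assumption: all instances contribute
    equally, e.g. via the average instance score. The 0.5 threshold
    is an arbitrary choice for illustration."""
    return int(sum(instance_scores) / len(instance_scores) >= threshold)

# Example: a bag with one clearly positive instance
print(standard_mi_label([0, 0, 1]))          # 1
print(collective_mi_label([0.1, 0.2, 0.9]))  # 0 (average 0.4 < 0.5)
```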

  6. Standard multi-instance learning: drug activity prediction – identifying musky molecule configurations [Dietterich, Artificial Intelligence 1997]

  7. Generalized multi-instance learning: which bags describe a beach? [J. Amores, Artificial Intelligence '13]

  8. Meta-learning ● Which learner performs best on which MI dataset? ● Construct meta-features from the original learning tasks – e.g. number of attributes, training set size, correlation with the output, ... (sketch below) ● Learn a model on the meta-dataset (decision tree) ● Landmarkers: fast algorithms [Pfahringer '00] – Their performance indicates that of more expensive algorithms
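A minimal sketch of what such meta-features could look like; the authors' exact feature set is not specified here, so the function below (simple_meta_features) is a hypothetical illustration restricted to the attributes named on the slide.

```python
import numpy as np

def simple_meta_features(X, y):
    """Toy meta-feature extractor (illustrative, not the authors' exact set):
    number of attributes, training set size, and the mean absolute Pearson
    correlation of each feature with the class label."""
    n_examples, n_attributes = X.shape
    corrs = []
    for j in range(n_attributes):
        col = X[:, j]
        if col.std() == 0:          # constant feature: correlation undefined
            continue
        corrs.append(abs(np.corrcoef(col, y)[0, 1]))
    return {
        "n_attributes": n_attributes,
        "n_examples": n_examples,
        "mean_abs_corr_with_output": float(np.mean(corrs)) if corrs else 0.0,
    }

# Example on random data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=100) > 0).astype(int)
print(simple_meta_features(X, y))
```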

  9. Meta-learning with landmarking ● Reduce MI datasets to single-instance datasets based on different MI assumptions (sketch below) ● Standard MI assumption – Label instances with the bag label – Yields a one-sided noisy dataset ● Collective assumption – All instances contribute equally to the bag label – Average feature values over all instances in a bag
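The two reductions described on the slide can be sketched as follows; this is a simplification, and the function names are ours.

```python
import numpy as np

def standard_mi_reduction(bags, bag_labels):
    """Standard MI assumption: copy the bag label to every instance.
    Negative bags are labeled correctly; positive bags yield a one-sided
    noisy single-instance dataset (some instances may really be negative)."""
    X = np.vstack(bags)
    y = np.concatenate([[label] * len(bag) for bag, label in zip(bags, bag_labels)])
    return X, y

def collective_mi_reduction(bags, bag_labels):
    """Collective assumption: all instances contribute equally, so each bag
    is summarized by the mean of its instances' feature vectors."""
    X = np.vstack([bag.mean(axis=0) for bag in bags])
    y = np.asarray(bag_labels)
    return X, y

# Example: two small bags with 3 features each
bags = [np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]),
        np.array([[0.9, 0.8, 0.7]])]
labels = [1, 0]
print(standard_mi_reduction(bags, labels))
print(collective_mi_reduction(bags, labels))
```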

  10. MI experiments: Datasets ● SIVAL image classification, CBIR (25) ● Synthetic newsgroups, text classification (20) ● Binary classification UCI datasets (27) – adult, tic-tac-toe, diabetes, transfusion, spam – Instances sampled i.i.d. to create bags (sketch below) – Bag configurations: ½, ⅓, ¼, … ● Evaluation: Area Under the ROC Curve (AUC)
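A hedged sketch of how a single-instance UCI dataset can be turned into bags by i.i.d. sampling. The slides do not spell out the exact bag configurations (½, ⅓, ¼, ...), so the fixed bag size and the standard-assumption labeling below are assumptions for illustration.

```python
import numpy as np

def make_bags(X, y, bag_size, rng=None):
    """Hedged sketch: sample instances i.i.d. into fixed-size bags and label
    a bag positive if it contains at least one positive instance (standard
    MI assumption). The paper's actual bag configurations may be defined
    differently; this is a simplification."""
    if rng is None:
        rng = np.random.default_rng()
    idx = rng.permutation(len(X))          # i.i.d. shuffle of the instances
    bags, bag_labels = [], []
    for start in range(0, len(idx) - bag_size + 1, bag_size):
        members = idx[start:start + bag_size]
        bags.append(X[members])
        bag_labels.append(int(y[members].max()))
    return bags, bag_labels
```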

  11. MI experiments: Algorithms ● Decision trees: SimpleMI-J48, MIWrapper-J48, Adaboost-MITI ● Rule inducer MIRI ● Nearest neighbors: CitationKNN ● OptimalBall ● Diverse Density: MDD, EM-DD, MIDD ● TLD ● Support Vector Machines: mi-SVM, MISMO (NSK) ● Logistic regression: MILR, MILR-C

  12. Performance overview of MI algorithms ● Comparison of classifiers over multiple datasets [Demsar '06] ● Are performance differences statistically significant? ● Friedman test with post-hoc Nemenyi test (sketch below) – Rank the algorithms for each dataset – Average the ranks over datasets from the same domain – Test the hypothesis that all algorithms perform equally well – The Nemenyi test identifies statistically equivalent groups of classifiers ● Critical difference diagram
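The testing procedure can be sketched with toy numbers; the AUC values below are made up for illustration, and the critical-difference formula CD = q_α·sqrt(k(k+1)/(6N)) follows Demsar '06, with q_0.05 ≈ 2.343 for k = 3 algorithms.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# aucs[i, j] = AUC of algorithm j on dataset i (toy numbers, for illustration)
aucs = np.array([[0.91, 0.85, 0.78],
                 [0.88, 0.90, 0.75],
                 [0.93, 0.84, 0.80],
                 [0.87, 0.86, 0.79]])
n_datasets, n_algorithms = aucs.shape

# Friedman test: are the algorithms' performances significantly different?
stat, p = friedmanchisquare(*aucs.T)
print(f"Friedman p-value: {p:.3f}")

# Average rank per algorithm (rank 1 = best AUC on a dataset)
ranks = np.vstack([rankdata(-row) for row in aucs])
avg_ranks = ranks.mean(axis=0)
print("average ranks:", avg_ranks)

# Nemenyi critical difference: two algorithms differ significantly if their
# average ranks differ by more than CD (q_alpha = 2.343 for k=3, alpha=0.05).
q_alpha = 2.343
cd = q_alpha * np.sqrt(n_algorithms * (n_algorithms + 1) / (6.0 * n_datasets))
print(f"critical difference: {cd:.2f}")
```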

  13. Critical difference diagrams (AUC) for the UCI, Text, and CBIR domains [figure]

  14. Meta-learning setup ● 14 learners → binary classification tasks for all combinations of learners (one vs. one; sketch below) ● Leave-one-out cross-validation ● Three dataset domains (CBIR, text, UCI datasets) ● Landmarkers (standard and collective assumption): – Naive Bayes – 1-nearest neighbor – Logistic regression – Decision stump
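A hedged sketch of the one-vs-one meta-task construction with leave-one-out cross-validation; the decision-tree meta-learner follows the earlier slide, but the helper pairwise_meta_tasks and its exact inputs are hypothetical simplifications.

```python
from itertools import combinations
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.tree import DecisionTreeClassifier

def pairwise_meta_tasks(meta_X, auc_table, learner_names):
    """Hedged sketch of the one-vs-one meta-learning setup: for every pair of
    MI learners, build a binary meta-task whose label says which learner
    achieved the higher AUC on each dataset, then estimate meta-model accuracy
    with leave-one-out cross-validation."""
    results = {}
    for a, b in combinations(range(len(learner_names)), 2):
        y_meta = (auc_table[:, a] > auc_table[:, b]).astype(int)
        correct = 0
        for train_idx, test_idx in LeaveOneOut().split(meta_X):
            clf = DecisionTreeClassifier().fit(meta_X[train_idx], y_meta[train_idx])
            correct += int(clf.predict(meta_X[test_idx])[0] == y_meta[test_idx][0])
        results[(learner_names[a], learner_names[b])] = correct / len(meta_X)
    return results
```

Called with a meta-feature matrix (one row per dataset) and the AUC table from the experiments, this returns a leave-one-out accuracy per learner pair, which can then be compared against the majority-class baseline as in the result slides.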

  15. UCI meta-model based on the number of features and the noise level [figure: regions where the meta-model wins vs. where the majority classifier wins]

  16. UCI meta-model: landmarker approach [figure: meta-model vs. majority classifier wins, with standard and collective MI landmarkers (decision stump, NB, 1NN, LR)]

  17. CBIR meta-model: landmarker approach [figure: meta-model vs. majority classifier wins, with standard and collective MI landmarkers]

  18. Relationship between landmarkers: logistic regression [figure: CBIR, UCI, and Text domains]

  19. Conclusions and future work ● Demonstrated large differences in MI learner performance across domains ● It is not sufficient to evaluate on multiple datasets from the same domain ● A larger meta-dataset is needed ● Define alternative MI assumptions and translate them to SI datasets – e.g. the meta-data assumption (NSK)
