On the Combination of Two Decompositive Multi-Label Classification Methods

Grigorios Tsoumakas (1), Eneldo Loza Mencía (2), Ioannis Katakis (1), Sang-Hyeun Park (2), and Johannes Fürnkranz (2)
(1) Aristotle University of Thessaloniki, Greece
(2) Technische Universität Darmstadt, Germany

11 September 2009
ECML PKDD 2009 Workshop on Preference Learning (PL-09)

Outline

Introduction
Background
QCLR
HOMER
Evaluation
Conclusions

Multi-Label Classification

Each object is assigned to a set of labels rather than to a single label (domains: text, biology, music, etc.).

Methods

A. Problem Adaptation
Extend algorithms so that they handle multi-label data directly (e.g. MLkNN, BPMLL).

B. Problem Transformation
Transform the learning task into one or more single-label classification tasks, e.g. Label Powerset (LP), Binary Relevance (BR).
Decompositive approaches focus on problems with a large number of labels, e.g. HOMER, QCLR.

Main idea of this work
Combine two state-of-the-art decompositive methods (HOMER + QCLR) to handle problems with a large number of labels more effectively and efficiently.
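
The two problem-transformation baselines named on this slide can be illustrated on a toy data set. The following is a minimal sketch, not the authors' implementation; the function names and the four example label sets are assumptions made for illustration.

```python
def binary_relevance(label_sets, labels):
    """BR: one binary target vector per label (True where the label is relevant)."""
    return {lab: [lab in relevant for relevant in label_sets] for lab in labels}


def label_powerset(label_sets):
    """LP: one multi-class target per example; each distinct label set is a class."""
    return [frozenset(relevant) for relevant in label_sets]


# Hypothetical toy data: the relevant label indices of four training examples.
label_sets = [{1, 4}, {3, 4}, {1}, {2, 3, 4}]

br = binary_relevance(label_sets, labels=[1, 2, 3, 4])
lp = label_powerset(label_sets)

print(br[4])         # binary targets for label 4
print(len(set(lp)))  # number of distinct LP classes
```

BR yields one binary problem per label, whereas LP yields a single problem whose number of classes can grow with the number of distinct label sets. Both become costly or sparse for many labels, which is the setting the decompositive approaches address.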

QWeighted Calibrated Label Ranking (QCLR) (1/4)

Based on Ranking by Pairwise Comparison (RPC) [Hüllermeier et al., AIJ08].

RPC - Transformation
Learns one binary model for each pair of labels.

Training examples (Ex #: label set):
Ex 1: {λ1, λ4}
Ex 2: {λ3, λ4}
Ex 3: {λ1}
Ex 4: {λ2, λ3, λ4}

Pairwise training sets (an example is used for "i vs j" only if exactly one of λi, λj is relevant; true means λi is the relevant one):
1vs2: Ex 1 true, Ex 3 true, Ex 4 false
1vs3: Ex 1 true, Ex 2 false, Ex 3 true, Ex 4 false
1vs4: Ex 2 false, Ex 3 true, Ex 4 false
2vs3: Ex 2 false
2vs4: Ex 1 false, Ex 2 false
3vs4: Ex 1 false
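
The RPC transformation of the four example label sets on this slide can be reproduced with a few lines of code. This is a sketch under the usual RPC convention (an example is kept for a pair only if exactly one of the two labels is relevant); the function name `rpc_transform` and the numbering of examples from 1 are assumptions.

```python
from itertools import combinations


def rpc_transform(label_sets, labels):
    """Build one binary training set per label pair (a, b).

    Each row is (example number, target); the target is True when label a
    is relevant and label b is not, False in the opposite case.
    """
    datasets = {}
    for a, b in combinations(labels, 2):
        rows = []
        for ex, relevant in enumerate(label_sets, start=1):
            # Skip examples where both or neither of the two labels is relevant.
            if (a in relevant) != (b in relevant):
                rows.append((ex, a in relevant))
        datasets[(a, b)] = rows
    return datasets


label_sets = [{1, 4}, {3, 4}, {1}, {2, 3, 4}]
datasets = rpc_transform(label_sets, labels=[1, 2, 3, 4])

for pair, rows in datasets.items():
    print(pair, rows)
```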

QCLR (2/4)

RPC - Classification

[Figure: a new instance x is passed to the pairwise models 1vs2, 1vs3, 1vs4, 2vs3, 2vs4 and 3vs4; the votes per label are counted and the labels are ranked by their votes.]

How to obtain a bipartition?
Introduce a virtual label λV that separates positive from negative labels (Calibrated Label Ranking) [Fürnkranz et al., MLJ08].
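
The voting step of RPC classification can be sketched as follows. The pairwise predictions used here are hypothetical and are not the values from the slide's figure; the function name `rank_by_votes` and the tie-breaking by label index are likewise assumptions.

```python
from collections import Counter


def rank_by_votes(pairwise_winners, labels):
    """Count one vote per pairwise model and rank labels by decreasing votes."""
    votes = Counter({lab: 0 for lab in labels})
    votes.update(pairwise_winners.values())
    # Ties are broken by label index (an arbitrary choice for this sketch).
    ranking = sorted(labels, key=lambda lab: (-votes[lab], lab))
    return ranking, votes


# Hypothetical outputs of the six pairwise models for one instance x:
# each entry maps a label pair to the label the model prefers.
winners = {(1, 2): 1, (1, 3): 1, (1, 4): 1, (2, 3): 2, (2, 4): 2, (3, 4): 3}

ranking, votes = rank_by_votes(winners, labels=[1, 2, 3, 4])
print(ranking, dict(votes))
```

A ranking alone does not say how many of the top-ranked labels are relevant, which is why a bipartition requires the additional calibration step.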

QCLR (3/4)

CLR - Transformation
Additional pairwise models are necessary: one per label against the virtual label λV.

Training examples (Ex #: label set):
Ex 1: {λ1, λ4}
Ex 2: {λ3, λ4}
Ex 3: {λ1}
Ex 4: {λ2, λ3, λ4}

Training sets against the virtual label (true means the label is relevant for the example):
1vsV: Ex 1 true, Ex 2 false, Ex 3 true, Ex 4 false
2vsV: Ex 1 false, Ex 2 false, Ex 3 false, Ex 4 true
3vsV: Ex 1 false, Ex 2 true, Ex 3 false, Ex 4 true
4vsV: Ex 1 true, Ex 2 true, Ex 3 false, Ex 4 true

The six pairwise training sets (1vs2, 1vs3, 1vs4, 2vs3, 2vs4, 3vs4) are the same as in the RPC transformation.
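
The additional training sets against the virtual label can be derived directly from the label sets, since every example takes part in every "label vs V" problem. The sketch below assumes this standard reading of calibrated label ranking; the function name `calibration_datasets` is hypothetical.

```python
def calibration_datasets(label_sets, labels):
    """One training set per label against the virtual label V.

    Every example is used; the target is True when the label is relevant
    (the label is preferred over V) and False otherwise.
    """
    return {
        lab: [(ex, lab in relevant) for ex, relevant in enumerate(label_sets, start=1)]
        for lab in labels
    }


label_sets = [{1, 4}, {3, 4}, {1}, {2, 3, 4}]
labels = [1, 2, 3, 4]
calib = calibration_datasets(label_sets, labels)

# Total number of models: n(n-1)/2 pairwise models plus n calibration models.
n = len(labels)
n_models = n * (n - 1) // 2 + n
print(calib[4], n_models)
```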

QCLR (4/4)

CLR - Classification

[Figure: a new instance x is passed to the pairwise models 1vs2 ... 3vs4 and to the calibration models 1vsV ... 4vsV; the votes per label, including λV, are counted and the labels are ranked, with λV splitting the ranking.]

Limitation: a quadratic number of models has to be queried.
Solution: Quick Weighted Voting [Loza Mencía et al., ESANN09].
Complexity is n + d n log(n), where n is the number of labels and d is the average number of relevant labels (cardinality).
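
Full (non-quick) CLR classification can be sketched by adding the virtual label to the vote count and cutting the ranking at its position. The model outputs below are hypothetical, not the values from the slide's figure, and treating a tie with the virtual label as negative is an assumption of this sketch. It queries every model and therefore shows the quadratic cost that Quick Weighted Voting is meant to avoid; it is not an implementation of QWeighted.

```python
from collections import Counter

VIRTUAL = "V"  # hypothetical name for the virtual label


def clr_predict(winners, labels):
    """Vote over all pairwise and calibration models, then split at the virtual label."""
    candidates = list(labels) + [VIRTUAL]
    votes = Counter({lab: 0 for lab in candidates})
    votes.update(winners.values())
    ranking = sorted(candidates, key=lambda lab: -votes[lab])
    # Labels with strictly more votes than the virtual label are predicted positive.
    positives = {lab for lab in labels if votes[lab] > votes[VIRTUAL]}
    return positives, ranking, votes


# Hypothetical model outputs for one instance x.
winners = {
    (1, 2): 1, (1, 3): 1, (1, 4): 1, (2, 3): 2, (2, 4): 2, (3, 4): 3,
    (1, VIRTUAL): 1, (2, VIRTUAL): 2, (3, VIRTUAL): VIRTUAL, (4, VIRTUAL): VIRTUAL,
}

positives, ranking, votes = clr_predict(winners, labels=[1, 2, 3, 4])
print(positives, ranking)
```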

HOMER - Hierarchy Of MultiLabel ClassifiERs (1/2)

Main Idea [Tsoumakas et al., ECMLPKDD08w]
The transformation of a multi-label problem with a large number of labels into many hierarchically structured, simpler sub-problems.

Step 1. Hierarchical Organization of Labels

Root: λ1, λ2, λ3, λ4, λ5, λ6, λ7, λ8
  µ1: λ1, λ6, λ8
  µ2: λ4, λ5, λ2
  µ3: λ7, λ3
Leaves: the single labels λ1, λ6, λ8, λ4, λ5, λ2, λ7, λ3

k: branching factor
Meta-label µn: represents the union of the labels of node n
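
Reading a meta-label as the union of its node's labels, an example is relevant for a meta-label whenever at least one label of that node is relevant for it. The sketch below applies this to the three-way split shown on the slide; the dictionary layout, the names `mu1` to `mu3` and the function `meta_label_set` are assumptions for illustration.

```python
# Label groups of the root's children, as in the slide's example hierarchy.
children = {"mu1": {1, 6, 8}, "mu2": {4, 5, 2}, "mu3": {7, 3}}


def meta_label_set(relevant, children):
    """Meta-labels of an example: every child node that contains a relevant label."""
    return {mu for mu, group in children.items() if group & relevant}


# An example annotated with labels 1 and 4 becomes relevant for mu1 and mu2
# in the root's (smaller) multi-label problem.
print(meta_label_set({1, 4}, children))
```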

HOMER - Hierarchy Of MultiLabel ClassifiERs (2/2)

Step 2. Assign a Multilabel Classifier at each internal node

Root (λ1, λ2, λ3, λ4, λ5, λ6, λ7, λ8): classifier h0 predicts among µ1, µ2, µ3
  µ1 (λ1, λ6, λ8): classifier h1
  µ2 (λ4, λ5, λ2): classifier h2
  µ3 (λ7, λ3): classifier h3

Prediction: an instance x is first given to h0 and is passed down only to the classifiers of the child nodes whose meta-labels were predicted.

Advantages
1. Classification Time - only a few classifiers of the hierarchy are invoked
2. Prediction Performance - balanced examples for each classifier
3. Training Time - smaller datasets at each node
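
The top-down prediction over the hierarchy can be sketched with stub classifiers in place of trained models. Everything here is illustrative: the nested-dictionary tree, the helper `make_stub` and the fixed outputs of h0 to h3 are assumptions, chosen only to show that classifiers below unpredicted meta-labels are never invoked.

```python
def make_stub(name, output, log):
    """Stand-in for a trained multi-label classifier that records each invocation."""
    def classify(x):
        log.append(name)
        return set(output)
    return classify


def homer_predict(node, x):
    """Follow only the children whose (meta-)labels the node's classifier predicts."""
    predicted = set()
    for key in node["classifier"](x):
        child = node["children"][key]
        if child is None:          # leaf: key is an original label
            predicted.add(key)
        else:                      # internal node: descend into the meta-label
            predicted |= homer_predict(child, x)
    return predicted


def leaves(labels):
    return {lab: None for lab in labels}


log = []
tree = {
    "classifier": make_stub("h0", {"mu1"}, log),
    "children": {
        "mu1": {"classifier": make_stub("h1", {1, 8}, log), "children": leaves([1, 6, 8])},
        "mu2": {"classifier": make_stub("h2", {4}, log), "children": leaves([4, 5, 2])},
        "mu3": {"classifier": make_stub("h3", {7}, log), "children": leaves([7, 3])},
    },
}

result = homer_predict(tree, x="some instance")
print(result, log)
```

In this run h2 and h3 are never called because h0 did not predict their meta-labels, which is the source of the classification-time advantage listed above.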

Label Distribution (1/2)

Open Issue
How should the labels of a node be distributed into its k children nodes (groups)?