MULTI-TARGET PREDICTION: CHALLENGES AND APPLICATIONS


  1. MULTI-TARGET PREDICTION: CHALLENGES AND APPLICATIONS
     Grigorios Tsoumakas, School of Informatics, Aristotle University of Thessaloniki
     With: N. Bassiliades, W. Groves, M. Laliotis, N. Markantonatos, F. Markatopoulou, C. & E. Papagiannopoulou, Y. Papanikolaou, E. Spyromitros-Xioufis, I. Tsamardinos, I. Vlahavas, A. Vrekou

  2. MULTI-TARGET PREDICTION
     Tasks: Multi-label learning; Multi-target regression; Label ranking; Multi-task learning; Collaborative filtering; Dyadic prediction
     Challenges: Exploiting dependencies among the targets; Scaling to extreme sizes of output spaces; Dealing with class imbalance; Target heterogeneity
     Applications: Multimedia annotation (video, image, audio, text); Gene function prediction; Ecological modelling; Demand forecasting; Ensemble pruning
     Source: Willem Waegeman, Krzysztof Dembczynski, Eyke Hüllermeier, Multi-Target Prediction, Tutorial @ ICML 2013

  4. OUTLINE
     Exploiting dependencies among the targets
     1. Deterministic label relationships
     2. From multi-label classification to multi-target regression
     Applications
     3. Semantic indexing of biomedical literature
     4. Multi-label classification for instance-based ensemble pruning

  5. OUTLINE
     1. Deterministic label relationships
        - Papagiannopoulou, C., Tsoumakas, G., Tsamardinos, I. (2015). Discovering and Exploiting Deterministic Label Relationships in Multi-Label Learning. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15)
        - Papagiannopoulou, E., Tsoumakas, G., Bassiliades, N. (2015). On Discovering Relationships in Multi-Label Learning via Linked Open Data. In Proceedings of the Know@LOD Workshop of ESWC
     2. From multi-label classification to multi-target regression
     3. Semantic indexing of biomedical literature
     4. Multi-label classification for instance-based ensemble pruning

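  6. MULTI-LABEL LEARNING
     p input variables X1 … Xp, q binary output variables Y1 … Yq

     |                   | X1   | X2 | … | Xp | Y1 | Y2 | … | Yq |
     |-------------------|------|----|---|----|----|----|---|----|
     | training examples | 0.12 | 1  | … | 12 | 0  | 1  | … | 1  |
     |                   | 2.34 | 9  | … | -5 | 1  | 1  | … | 0  |
     |                   | 1.22 | 3  | … | 40 | 1  | 0  | … | 1  |
     | unknown instances | 2.18 | 2  | … | 8  | ?  | ?  | … | ?  |
     |                   | 1.76 | 7  | … | 23 | ?  | ?  | … | ?  |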

  7. THE SEED
     ImageCLEF 2011 challenge
     - Automatic annotation of Flickr images
     - JPG, EXIF information & user tags
     - 99 concepts
     Example image labels: tower, sky, river, flowers

  8. THE QUESTION
     Can we post-process the probabilities in a sound way so that they obey the relationships?

     Label relationships
     - Positive entailment: River → Water; Car → Vehicle
     - Mutual exclusion: {Autumn, Winter, Spring, Summer}; {Single person, Small group, Big group, No persons}

     Sample output

     | Label  | Probability |
     |--------|-------------|
     | River  | 0.7         |
     | Water  | 0.5         |
     | Autumn | 0.6         |
     | Winter | 0.4         |
     | Spring | 0.2         |
     | Summer | 0.1         |
     | …      | …           |

  9. EXTRACTING RELATIONSHIPS
     Contingency table for labels A and B (cell counts over the training examples)

     |    | a | ¬a |
     |----|---|----|
     | b  | S | T  |
     | ¬b | U | V  |

     Positive entailment
     - a → b is extracted when U = 0
     - b → a is extracted when T = 0
     - The relationship's support is S
     Mutual exclusion
     - a → ¬b ∧ b → ¬a is extracted when S = 0
     - The relationship's support is T + U
     Higher-order relationships are extracted following the Apriori algorithm paradigm
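
The pairwise extraction rules can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `min_support` default and the three example labels (including the made-up `desert` label) are assumptions. The four counts are given descriptive names so that the conditions read directly: an entailment needs an empty "left label true, right label false" cell, a mutual exclusion needs an empty "both true" cell.

```python
from itertools import combinations

def contingency(col_x, col_y):
    """Count the four cells of the 2x2 contingency table of two binary labels."""
    both = sum(1 for x, y in zip(col_x, col_y) if x and y)
    only_x = sum(1 for x, y in zip(col_x, col_y) if x and not y)
    only_y = sum(1 for x, y in zip(col_x, col_y) if y and not x)
    neither = sum(1 for x, y in zip(col_x, col_y) if not x and not y)
    return both, only_x, only_y, neither

def pairwise_relationships(labels, min_support=2):
    """labels maps a label name to its 0/1 column; returns (kind, lhs, rhs, support)."""
    found = []
    for x, y in combinations(labels, 2):
        both, only_x, only_y, _ = contingency(labels[x], labels[y])
        if only_x == 0 and both >= min_support:
            found.append(("entails", x, y, both))        # x -> y, support = both
        if only_y == 0 and both >= min_support:
            found.append(("entails", y, x, both))        # y -> x, support = both
        if both == 0 and only_x + only_y >= min_support:
            found.append(("exclusive", x, y, only_x + only_y))
    return found

# Illustrative data only (five hypothetical training examples).
example = {
    "river":  [1, 1, 0, 0, 0],
    "water":  [1, 1, 1, 0, 0],
    "desert": [0, 0, 0, 1, 1],
}
for relationship in pairwise_relationships(example):
    print(relationship)
```

On this data the sketch reports that river entails water and that desert is mutually exclusive with both other labels.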

  10. TOY EXAMPLE (6 labels, 10 training examples)

      | A | B | C | D | E | F |
      |---|---|---|---|---|---|
      | 1 | 1 | 1 | 0 | 0 | 0 |
      | 1 | 1 | 1 | 1 | 0 | 0 |
      | 0 | 0 | 0 | 0 | 1 | 0 |
      | 0 | 1 | 1 | 0 | 1 | 0 |
      | 1 | 1 | 1 | 0 | 0 | 0 |
      | 0 | 1 | 1 | 1 | 0 | 1 |
      | 0 | 0 | 1 | 1 | 1 | 0 |
      | 0 | 0 | 0 | 0 | 0 | 1 |
      | 0 | 0 | 1 | 0 | 0 | 0 |
      | 0 | 0 | 0 | 0 | 0 | 1 |

      Positive entailments
      - a → b (support 3)
      - a → c (support 3)
      - b → c (support 5)
      - d → c (support 3)
      Mutual exclusion
      - {A, E, F} (support 9)
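
The toy table can be checked mechanically. The sketch below is an assumption-laden illustration rather than the authors' code: it uses the column names A to F of the table, treats an entailment x → y as "every example with x also has y", and grows mutually exclusive groups level by level in the spirit of Apriori, keeping a group only if all of its pairs never co-occur. The helper names and the `min_support` default are hypothetical.

```python
from itertools import combinations

NAMES = "ABCDEF"
ROWS = [
    (1, 1, 1, 0, 0, 0),
    (1, 1, 1, 1, 0, 0),
    (0, 0, 0, 0, 1, 0),
    (0, 1, 1, 0, 1, 0),
    (1, 1, 1, 0, 0, 0),
    (0, 1, 1, 1, 0, 1),
    (0, 0, 1, 1, 1, 0),
    (0, 0, 0, 0, 0, 1),
    (0, 0, 1, 0, 0, 0),
    (0, 0, 0, 0, 0, 1),
]

# For each label, the set of example indices in which it is present.
present = {n: {i for i, row in enumerate(ROWS) if row[j]} for j, n in enumerate(NAMES)}

def entailments(min_support=2):
    """x -> y holds when the examples of x are a subset of the examples of y."""
    return [(x, y, len(present[x]))
            for x in NAMES for y in NAMES
            if x != y and len(present[x]) >= min_support and present[x] <= present[y]]

def pairwise_disjoint(group):
    return all(not (present[a] & present[b]) for a, b in combinations(group, 2))

def maximal_exclusive_sets(min_support=2):
    """Level-wise growth of mutually exclusive label groups; returns the maximal ones."""
    level = [frozenset(p) for p in combinations(NAMES, 2) if pairwise_disjoint(p)]
    seen = []
    while level:
        seen.extend(level)
        level = list({s | {n} for s in level for n in NAMES
                      if n not in s and pairwise_disjoint(s | {n})})
    maximal = [s for s in seen if not any(s < t for t in seen)]
    result = []
    for s in maximal:
        support = sum(len(present[n]) for n in s)  # examples with exactly one label of s
        if support >= min_support:
            result.append((sorted(s), support))
    return result

print(entailments())
print(maximal_exclusive_sets())
```

Run on the table, the sketch finds four entailments (A → B and A → C with support 3, B → C with support 5, D → C with support 3) and one maximal mutually exclusive group, {A, E, F}, with support 9.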

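  16. EXPLOITING RELATIONSHIPS: POSITIVE ENTAILMENT
      Label A entails label B
      - a → b
      - Network: A → B, with conditional probability table

      |    | b | ¬b |
      |----|---|----|
      | a  | 1 | 0  |
      | ¬a | 0 | 1  |

      Generalization
      - a1 → b, … , ak → b
      - Network: A1, … , Ak and a leak node are the parents of B

      |                                | b | ¬b |
      |--------------------------------|---|----|
      | At least one parent of B true  | 1 | 0  |
      | otherwise                      | 0 | 1  |

      Leak node
      - To consider other causes of B
      - Virtual label that is true where B is true and all of its parents are false
      - False in all other training examples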
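
A short sketch of the leak construction for positive entailment, under stated assumptions: the function name is hypothetical, and the three columns are taken from the toy table with C as the child and B and D as its parents (C is true whenever B or D is). The leak label is true exactly where the child is true but no parent explains it, so that the child becomes a deterministic OR of its parents plus the leak on every training example.

```python
def entailment_leak(parents, child):
    """Virtual leak label: 1 where the child is 1 and every parent is 0, else 0."""
    return [int(c == 1 and not any(ps)) for c, ps in zip(child, zip(*parents))]

# Columns B, C and D of the toy table (10 training examples).
B = [1, 1, 0, 1, 1, 1, 0, 0, 0, 0]
C = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
D = [0, 1, 0, 0, 0, 1, 1, 0, 0, 0]

leak = entailment_leak([B, D], C)
print(leak)

# With the leak added, "at least one parent true" reproduces C on every example.
reconstructed = [int(b or d or l) for b, d, l in zip(B, D, leak)]
print(reconstructed == C)
```

Only the ninth example needs the leak: C is true there while both B and D are false.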

  17. EXPLOITING RELATIONSHIPS: MUTUAL EXCLUSION
      Among k labels A1, … , Ak
      - Network: A1, … , Ak and a leak node are the parents of a node B, with B = true

      |                              | b | ¬b |
      |------------------------------|---|----|
      | Only one parent of B true    | 1 | 0  |
      | otherwise                    | 0 | 1  |

      Leak node
      - To cover all training examples, i.e. to become exhaustive
      - Virtual label that is true where all other parents of B are false
      - False in all other examples
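
The leak label for mutual exclusion can be sketched the same way. The function name is an assumption, and the columns A, E and F are taken from the toy table, where they never co-occur. The leak is true exactly where none of the grouped labels is true, which makes "exactly one parent true" hold on every training example.

```python
def exclusion_leak(group):
    """Virtual leak label: 1 where every label of the group is 0, else 0."""
    return [int(not any(values)) for values in zip(*group)]

# Columns A, E and F of the toy table (10 training examples).
A = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
E = [0, 0, 1, 1, 0, 0, 1, 0, 0, 0]
F = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1]

leak = exclusion_leak([A, E, F])
print(leak)

# With the leak added, exactly one parent is true in each example.
exactly_one = [a + e + f + l == 1 for a, e, f, l in zip(A, E, F, leak)]
print(all(exactly_one))
```

Nine of the ten examples carry one of A, E or F (the group's support), and the leak covers the remaining one.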

  18. TOY EXAMPLE
      Positive entailments
      - a → b (support 3)
      - a → c (support 3)
      - b → c (support 5)
      - d → c (support 3)
      Mutual exclusion
      - {A, E, F} (support 9)

      | Node    | Before | After |
      |---------|--------|-------|
      | A       | 0.400  | 0.022 |
      | LeakA   | 0.350  | 0.082 |
      | B       | 0.250  | 0.096 |
      | D       | 0.600  | 0.031 |
      | LeakBD  | 0.010  | 0.050 |
      | C       | 0.200  | 0.345 |
      | F       | 0.300  | 0.064 |
      | E       | 0.850  | 0.850 |
      | LeakEFA | 0.300  | 0.064 |

  19. EMPIRICAL STUDY
      12 multi-label datasets
      Relationship discovery
      - Minimum support of 2, increased exponentially in case of memory issues
      Learning
      - Binary Relevance + Random Forest with 10 trees
      - Weka, Mulan
      Inference
      - Virtual evidence insertion, exact inference via clustering algorithm
      - jSMILE library

  20. POSITIVE ENTAILMENT IN “MEDICAL”
      3 entailment relationships extracted from 978 radiologists’ reports annotated with ICD-9 codes

      | Relationship                                                      | Sup. |
      |-------------------------------------------------------------------|------|
      | Congenital obstruction of ureteropelvic junction → Hydronephrosis | 4    |
      | Shortness of breath → Renal agenesis and dysgenesis               | 3    |
      | Vomiting alone → Renal agenesis and dysgenesis                    | 3    |

      Ureteropelvic junction obstruction is the most common pathologic cause of antenatally detected hydronephrosis

  21. MUTUAL EXCLUSION
      Emotions: quiet-still XOR amazed-surprised
      Enron: “Company Business, Strategy, etc.” XOR “friendship / affection”
      “In business, sir, one has no friends, only correspondents” ~Alexandre Dumas

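  22. RESULTS: POSITIVE ENTAILMENT
      Wilcoxon test p-value: 0.0156

      | Dataset       | Minimum Support | Number of Labels | Number of Relations | % MAP Improvement |
      |---------------|-----------------|------------------|---------------------|-------------------|
      | Bibtex        | 2               | 159              | 11                  | 0.279             |
      | Bookmarks     | 2               | 208              | 4                   | 0.068             |
      | Enron         | 2               | 53               | 4                   | 0.391             |
      | ImageCLEF2011 | 2               | 99               | 28                  | 2.977             |
      | ImageCLEF2012 | 2               | 94               | 1                   | 0.168             |
      | Medical       | 2               | 45               | 6                   | 2.284             |
      | Yeast         | 2               | 14               | 3                   | 1.584             |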
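
The reported p-value can be related to the seven improvements with a small exact computation. The sketch below assumes the slide's figure is an exact two-sided Wilcoxon signed-rank p-value on the % MAP improvements (the slide does not say which variant was used) and that there are no zero differences or tied magnitudes. It enumerates all sign assignments in pure Python; the function name is hypothetical.

```python
from itertools import product

def wilcoxon_exact_two_sided(diffs):
    """Exact two-sided signed-rank p-value (assumes no zeros and no tied magnitudes)."""
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = sum(ranks)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    observed = min(w_minus, total - w_minus)
    # Under the null hypothesis every sign pattern is equally likely.
    hits = 0
    for signs in product((0, 1), repeat=len(diffs)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** len(diffs)

# % MAP improvements for the seven datasets listed on the slide.
improvements = [0.279, 0.068, 0.391, 2.977, 0.168, 2.284, 1.584]
p_value = wilcoxon_exact_two_sided(improvements)
print(p_value)
```

Because all seven improvements are positive, only the two extreme sign patterns are at least as extreme as the observation, giving 2/128 = 0.015625, which agrees with the reported 0.0156 after rounding.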

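  23. RESULTS: MUTUAL EXCLUSION (1/2)
      Wilcoxon test p-value: 0.1099

      | Dataset       | Minimum Support | Number of Labels | Number of Relations | % MAP Improvement |
      |---------------|-----------------|------------------|---------------------|-------------------|
      | Bibtex        | 128             | 159              | 76                  | -1.626            |
      | Bookmarks     | 2048            | 208              | 1                   | -0.068            |
      | Emotions      | 8               | 6                | 1                   | 1.424             |
      | Enron         | 2               | 53               | 481                 | -8.434            |
      | ImageCLEF2011 | 32              | 99               | 325                 | 1.865             |
      | ImageCLEF2012 | 64              | 94               | 278                 | -2.862            |
      | IMDB          | 2               | 28               | 22                  | 4.222             |
      | Medical       | 16              | 45               | 31                  | 3.769             |
      | Scene         | 2               | 6                | 4                   | 3.023             |
      | Slashdot      | 2               | 22               | 23                  | 11.803            |
      | TMC2007       | 2               | 22               | 8                   | 6.044             |
      | Yeast         | 2               | 14               | 2                   | 1.760             |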

  24. RESULTS: MUTUAL EXCLUSION (2/2)
      Datasets: Bibtex, Enron, ImageCLEF2011, ImageCLEF2012 (two minimum-support settings each)
      Minimum support: 128, 256; 2, 32; 32, 128; 64, 256
      Number of relationships: 3 48 1 27 8 40 76 22 325 56
      % MAP improvement: -1.63, -8.4, 0.21, 1.87, 3.34, -2.87, 0.63, 0.60
