The Multilabel Naive Credal Classifier
Alessandro Antonucci and Giorgio Corani
{alessandro,giorgio}@idsia.ch
Istituto “Dalle Molle” di Studi sull'Intelligenza Artificiale - Lugano (Switzerland)
http://ipg.idsia.ch
ISIPTA '15, Pescara, July 21st, 2015
IPG ⊂ IDSIA ⊂ USI ∪ SUPSI ⊂ LUGANO
University of Applied Sciences and Arts of Southern Switzerland (supsi.ch)
Università della Svizzera Italiana (usi.ch)
Chronology (Acknowledgements)
- ISIPTA '01: credal version of the naive Bayes classifier by Marco (Zaffalon)
- ISIPTA '11: MAP algorithms for imprecise HMMs by Jasper (De Bock) & Gert (de Cooman)
- IJCAI-13: Bayesian nets as multilabel classifiers by Denis (Mauá) & us
- NIPS 14: MAP in generic credal nets by Jasper & Cassio (de Campos) & me
- ISIPTA '15: a credal classifier based on MAP tasks in credal nets by us
Single- vs. multi-label classification
A (fictitious) classifier to detect eye colour
SINGLE-LABEL: possible classes C := {brown, green, blue}, e.g., C = green
Heterochromia iridum: two (or more) colours
Possible values in 2^C, a multilabel task! e.g., C = {blue, brown}
Trivial approaches:
- standard classification over the power set: exponential in the number of labels!
- each label as a separate Boolean variable, a (standard) classifier for each label: relations among classes are ignored!
MULTI-LABEL: graphical models (GMs) to depict relations among class labels (and features)
Classification as (standard) inference in GMs
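The two trivial encodings above can be sketched in a few lines (label names and numbers are illustrative, reusing the eye-colour example):

```python
from itertools import chain, combinations

# Hypothetical label set for the eye-colour example.
labels = ["brown", "green", "blue"]

def power_set(labels):
    """Power-set view: one class per subset of labels (2^n classes)."""
    return [set(s) for s in chain.from_iterable(
        combinations(labels, k) for k in range(len(labels) + 1))]

def to_binary(active, labels):
    """Binary view: one Boolean variable per label, C = (C_1, ..., C_n)."""
    return [1 if lab in active else 0 for lab in labels]

print(len(power_set(labels)))                 # 8 = 2^3 classes: exponential blow-up
print(to_binary({"blue", "brown"}, labels))   # [1, 0, 1]
```

The binary encoding is linear in the number of labels, but treating each Boolean variable independently is exactly the second trivial approach that ignores relations among classes.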
Credal classifiers are not (yet) multilabel classifiers
Class variable C and (discrete) features F, a test instance f̃
Standard (single-label) classifiers are maps F → C:
learn P(C, F) from data and return c* := arg max_{c ∈ C} P(c, f̃)
Multi-label classifiers: F → 2^C
C = (C_1, ..., C_n) as an array of Boolean variables, one for each label
learn P(C, F) and solve the MAP task c* := arg max_{c ∈ {0,1}^n} P(c, f̃)
Credal (single-label) classifiers: F → 2^C
learn a credal set K(C, F) and return all c″ ∈ C s.t. ∄ c′ : P(c′, f̃) > P(c″, f̃) ∀ P(C, F) ∈ K(C, F)
Multilabel credal classifiers (MCC): F → 2^(2^C)
learn a credal set K(C, F) and return all sequences c″ s.t. ∄ c′ : P(c′, f̃) > P(c″, f̃) ∀ P(C, F) ∈ K(C, F)
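The maximality criterion above can be sketched for a credal set represented by finitely many extreme points (the probability values below are toy numbers, not from the paper):

```python
from itertools import product

def maximal_sequences(credal_set, n):
    """All c'' in {0,1}^n that are undominated under maximality:
    no c' with P(c', f) > P(c'', f) for every P in the credal set.
    `credal_set`: list of dicts mapping a tuple c to P(c, f);
    a finite set of extreme points stands in for K(C, F)."""
    seqs = list(product((0, 1), repeat=n))

    def dominated(c2):
        # c2 is dominated if some c1 beats it under every extreme point
        return any(all(P[c1] > P[c2] for P in credal_set) for c1 in seqs)

    return [c for c in seqs if not dominated(c)]

# Two extreme points over n = 2 labels (toy numbers):
P1 = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.30, (1, 1): 0.20}
P2 = {(0, 0): 0.20, (0, 1): 0.05, (1, 0): 0.30, (1, 1): 0.45}
print(maximal_sequences([P1, P2], 2))  # [(0, 0), (1, 0), (1, 1)]
```

Here (0, 1) is dropped because (0, 0) has higher probability under both extreme points; the other three sequences are all returned, which is exactly why the raw MCC output can be exponentially large.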
Compact Representation of the Output
The output of an MCC might be exponentially large
Jasper & Gert's idea to fix this with imprecise HMMs (Viterbi): decide, for each variable and each state, whether or not there is at least one optimal sequence with the variable in that state
With MCCs, for each class label, we can decide whether:
- the label is active for all the optimal sequences
- the label is inactive for all the optimal sequences
- there are optimal sequences with the label active, and others with the label inactive
Optimization task (for label l and state c_l ∈ {0, 1}):
min_{c″ : c″_l = c_l} max_{c′} inf_{P(C,F) ∈ K(C,F)} P(c′, f) / P(c″, f) ≤ 1
O(2^treewidth) for separately specified credal nets (e.g., local IDM)
More complex with non-separate specifications
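For a credal set given by finitely many extreme points, the per-label decision above reduces to a brute-force min-max-min over ratios (the numbers are toy values; the inner infimum is attained at an extreme point, as in linear-fractional optimization):

```python
from itertools import product

def label_state_is_optimal(credal_set, n, l, c_l):
    """Decide whether some maximal sequence has C_l = c_l by testing
    min_{c'': c''_l = c_l} max_{c'} min_{P in K} P(c', f)/P(c'', f) <= 1.
    `credal_set`: finite list of dicts c -> P(c, f) (extreme points)."""
    seqs = list(product((0, 1), repeat=n))
    score = min(
        max(min(P[c1] / P[c2] for P in credal_set) for c1 in seqs)
        for c2 in seqs if c2[l] == c_l)
    return score <= 1

# Toy credal set over n = 2 labels where the second label (index 1)
# is inactive in every maximal sequence:
P1 = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.40, (1, 1): 0.10}
P2 = {(0, 0): 0.20, (0, 1): 0.10, (1, 0): 0.45, (1, 1): 0.25}
print(label_state_is_optimal([P1, P2], 2, 1, 1))  # False
print(label_state_is_optimal([P1, P2], 2, 1, 0))  # True
```

This enumeration is exponential in n and only meant to illustrate the criterion; the paper's point is that for separately specified credal nets the same test costs O(2^treewidth).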
[Figure: the naive Bayes classifier (NBC): class C with children F_1, ..., F_m]
[Figure: NCC = NBC + IDM]
[Figure: multi-label? naive topology over the classes C_1, ..., C_n, with the features F_1, ..., F_m below; structural learning to bound the number of parents of the features and to select the super-class C_1]
[Figure: MNBC: the features are replicated (one copy per class) to obtain a tree topology]
[Figure: MNBC + IDM = MNCC]
During the poster session I can:
- explain some details about the learning of the structure
- explain the feature-replication trick (this makes inference simpler)
- explain the non-separate IDM-based quantification of the model
- explain the details of the (convex) optimization
- ...
MNCC: the algorithm
Input: test instance f (+ dataset D)
Output initialized: a 2 × n table (rows "active"/"inactive", columns C_1, ..., C_n), all entries set to 0
for l = 1, ..., n do
  for c_l = 0, 1 do
    if min_{c″ : c″_l = c_l} max_{c′} inf_t P_t(c′, f) / P_t(c″, f) ≤ 1 then
      Output(l, c_l) = 1
    end if
  end for
end for
The table is a linear-size representation of an (exponential) number of maximal sequences
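The 2 × n output table can be sketched as a simple compression of the set of maximal sequences (the sequences below are an illustrative toy example, not output of the actual MNCC inference):

```python
def summarize(maximal_seqs, n):
    """Compress a (possibly exponential) set of maximal sequences into
    a per-label summary: 'active' (on in all optimal sequences),
    'inactive' (off in all), or 'both' (on in some, off in others)."""
    table = []
    for l in range(n):
        states = {c[l] for c in maximal_seqs}
        table.append("both" if states == {0, 1}
                     else "active" if states == {1} else "inactive")
    return table

# Three maximal sequences over n = 3 labels (toy example):
print(summarize([(1, 1, 0), (1, 0, 0), (1, 0, 1)], 3))
# ['active', 'both', 'both']
```

The summary is linear in n even when the set of maximal sequences is exponential, which is exactly the point of the compact representation.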
Testing MNCC
Preliminary tests on real-world datasets:

Dataset      Classes   Features   Instances
Emotions     6         44/72      593
Scene        6         224/294    2407
E-mobility   10        14/18      4226
Slashdot     22        496/1079   3782

Performance described by:
- % of instances such that all the maximal sequences agree on the state of the label (determinacy)
- accuracy of the precise model when MNCC is determinate
- accuracy of the precise model when MNCC is indeterminate
[Figures: per-label results (bars on a 0-1 scale) for Emotions (C_1-C_6), Scene (C_1-C_6), E-mobility (C_1-C_10), and Slashdot (C_1-C_22)]
Conclusions, Outlooks and Acks
Among the first tools for robust multilabel classification
Still lots of things to do:
- extension to the multidimensional/hierarchical case
- extension to continuous variables (features)
- extension to continuous classes (multi-target interval-valued regression)
- more complex topologies (ETAN, de Campos, 2014)
- a variational approach to feature replication
- not only 0/1 losses (imprecise losses?)