Making Set-valued Predictions in Evidential Classification: A Comparison of Different Approaches
Liyao Ma & Thierry Denœux
ISIPTA 2019 - 5th July
Introduction
● Classification: predict a label from Ω = {ω_1, ..., ω_n}
● Uncertainty → set-valued predictions
● Uncertainty quantified by Dempster-Shafer theory
Decision making view of classification
● Precise assignments F = {f_{ω_1}, ..., f_{ω_n}}
● Precise assignments + complete preorder: Maximum Expected Utility principle
● The uncertain case
  ❍ Precise assignments + partial preorder
  ❍ Partial assignments + complete preorder
● Partial assignments F = {f_A, A ∈ 2^Ω \ {∅}}
Two families of decision strategies
● Precise assignments + partial preorder
  ❍ F = {f_{ω_1}, ..., f_{ω_n}}
  ❍ Interval dominance, maximality, weak dominance, ...
  ❍ Lack of information → interval-valued expectations $[\underline{E}_m(f_i), \overline{E}_m(f_i)]$
  ❍ Output: a set of non-dominated acts, e.g. F* = {f_{ω_1}, f_{ω_2}} (see the sketch below)
● Partial assignments + complete preorder
  ❍ F = {f_A, A ∈ 2^Ω \ {∅}}
  ❍ Generalized maximin, maximax, Hurwicz, minimax regret, ...
  ❍ Output: a single optimal act, e.g. F* = {f_{ω_1,ω_2}}
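To make the first family concrete, here is a minimal Python sketch (not from the paper) of the lower and upper expected utilities of a precise act under a mass function, and of the interval-dominance rule that keeps the non-dominated acts. The encoding of the mass function as a dict of frozensets and the toy numbers are assumptions for illustration only.

```python
def lower_upper_expectations(mass, utilities, act):
    """Lower/upper expected utility of the precise act f_act under a mass function.

    mass      : dict mapping frozensets of class indices (focal sets) to masses
    utilities : utilities[i][j] = utility of predicting class i when the truth is j
    act       : index i of the precise assignment f_{omega_i}
    """
    low = sum(m_B * min(utilities[act][j] for j in B) for B, m_B in mass.items())
    up = sum(m_B * max(utilities[act][j] for j in B) for B, m_B in mass.items())
    return low, up


def interval_dominance_set(mass, utilities, n_classes):
    """Non-dominated acts under interval dominance: f_i is discarded when some
    other act's lower expectation strictly exceeds f_i's upper expectation."""
    bounds = [lower_upper_expectations(mass, utilities, i) for i in range(n_classes)]
    return [i for i, (_, up_i) in enumerate(bounds)
            if not any(bounds[k][0] > up_i for k in range(n_classes) if k != i)]


# Toy example using the 3x3 utility matrix from the following slides and an
# illustrative (made-up) mass function.
U = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.2], [0.1, 0.2, 1.0]]
m = {frozenset({0}): 0.5, frozenset({0, 1}): 0.3, frozenset({0, 1, 2}): 0.2}
print(interval_dominance_set(m, U, 3))  # -> [0, 1]: the act "predict omega_3" is dominated
```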
Defining the utility of set-valued predictions

                      states of nature
acts                  ω_1      ω_2      ω_3
f_{ω_1}               1.0000   0.2000   0.1000
f_{ω_2}               0.2000   1.0000   0.2000
f_{ω_3}               0.1000   0.2000   1.0000
f_{ω_1,ω_2}           ?        ?        ?
f_{ω_1,ω_3}           ?        ?        ?
f_{ω_2,ω_3}           ?        ?        ?
f_{ω_1,ω_2,ω_3}       ?        ?        ?
Defining the utility of set-valued predictions
● Ordered Weighted Average (OWA) operator:
  $\hat{u}_{A,j} = F(\{u_{ij} \mid \omega_i \in A\}) = \sum_{k=1}^{|A|} w_k\, u_{A(k)j}$
● Tolerance degree of imprecision:
  $\mathrm{TOL}(w) = \sum_{k=1}^{|A|} \frac{|A|-k}{|A|-1}\, w_k$
● Weights obtained by maximizing the entropy:
  $\max_{w} \; \mathrm{ENT}(w) = -\sum_{k=1}^{|A|} w_k \log w_k \quad \text{s.t.} \quad \mathrm{TOL}(w) = \gamma, \quad \sum_{k=1}^{|A|} w_k = 1$
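As an illustration of this weight-selection step, the sketch below computes maximum-entropy OWA weights numerically with scipy. The function name owa_weights and the SLSQP-based solver are illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize


def owa_weights(size, gamma):
    """Maximum-entropy OWA weights for a subset of `size` classes and tolerance
    degree gamma: maximize ENT(w) = -sum_k w_k log w_k subject to
    TOL(w) = sum_k (|A|-k)/(|A|-1) w_k = gamma and sum_k w_k = 1."""
    if size == 1:
        return np.array([1.0])
    k = np.arange(1, size + 1)
    tol_coef = (size - k) / (size - 1)

    def neg_entropy(w):
        w = np.clip(w, 1e-12, None)  # guard against log(0)
        return float(np.sum(w * np.log(w)))

    constraints = [
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
        {"type": "eq", "fun": lambda w: float(tol_coef @ w) - gamma},
    ]
    res = minimize(neg_entropy, np.full(size, 1.0 / size),
                   bounds=[(0.0, 1.0)] * size, constraints=constraints,
                   method="SLSQP")
    return res.x


print(owa_weights(2, 0.8))  # ~[0.8, 0.2]: reproduces the pairwise value 0.84 = 0.8*1.0 + 0.2*0.2
print(owa_weights(3, 0.8))  # weights used for the full frame in the table below
```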
Defining the utility of set-valued predictions

Utility matrix extended by an OWA operator with γ = 0.8:

                      states of nature
acts                  ω_1      ω_2      ω_3
f_{ω_1}               1.0000   0.2000   0.1000
f_{ω_2}               0.2000   1.0000   0.2000
f_{ω_3}               0.1000   0.2000   1.0000
f_{ω_1,ω_2}           0.8400   0.8400   0.1800
f_{ω_1,ω_3}           0.8200   0.2000   0.8200
f_{ω_2,ω_3}           0.1800   0.8400   0.8400
f_{ω_1,ω_2,ω_3}       0.7373   0.7455   0.7373
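The table above can be reproduced by OWA-aggregating, for each subset A and each state of nature, the sorted utilities of the precise acts in A. The sketch below is illustrative and assumes the owa_weights helper from the previous sketch is available.

```python
from itertools import combinations

import numpy as np


def extend_utility_matrix(U, gamma, weight_fn):
    """Extend an n x n utility matrix to all non-empty subsets A of classes by
    OWA-aggregating, state by state, the utilities of the precise acts in A
    (values sorted in decreasing order, as the OWA operator requires)."""
    n = len(U)
    U = np.asarray(U, dtype=float)
    extended = {}
    for size in range(1, n + 1):
        w = weight_fn(size, gamma)
        for A in combinations(range(n), size):
            extended[A] = [float(np.dot(w, sorted(U[list(A), j], reverse=True)))
                           for j in range(n)]
    return extended


U = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.2], [0.1, 0.2, 1.0]]
ext = extend_utility_matrix(U, gamma=0.8, weight_fn=owa_weights)
for A, row in ext.items():
    print(A, [round(v, 4) for v in row])  # matches the extended table above
```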
Experimental Comparisons
● UCI and artificial Gaussian data sets
● Classification performance with varying γ
● Performance on noisy test sets
● Performance with increasing training set size
Conclusions
● Two approaches are contrasted:
  ❍ a partial preorder among precise assignments
  ❍ a complete preorder among partial assignments
● The utility of a set-valued prediction is defined via an OWA operator
● Experimental comparisons:
  ❍ set-valued predictions perform better than precise ones
  ❍ the most cautious rules should be preferred
Thank you!

Poster: Making Set-valued Predictions in Evidential Classification: A Comparison of Different Approaches
Liyao Ma, Thierry Denœux

Two families of set-valued decision strategies

Partial preorders among precise assignments
Patterns are assigned to one and only one of the n classes: F = {f_1, ..., f_n}, with lower and upper expected utilities
$\underline{E}_m(f_i) = \sum_{B \subseteq \Omega} m(B) \min_{\omega_j \in B} u_{ij}, \qquad \overline{E}_m(f_i) = \sum_{B \subseteq \Omega} m(B) \max_{\omega_j \in B} u_{ij}.$

decision criterion    preference relation
interval dominance    $f_i \succeq_{ID} f_j \iff \underline{E}_m(f_i) \geq \overline{E}_m(f_j)$
maximality            $f_i \succeq_{\max} f_j \iff \underline{E}_m(f_i - f_j) \geq 0$
weak dominance        $f_i \succeq_{WD} f_j \iff (\underline{E}_m(f_i) \geq \underline{E}_m(f_j)) \wedge (\overline{E}_m(f_i) \geq \overline{E}_m(f_j))$

Complete preorders among partial assignments
Patterns are assigned partially to a non-empty subset of Ω: F = {f_A, A ∈ 2^Ω \ {∅}}
- generalized maximin          $f_{A_i} \succeq_* f_{A_j} \iff \underline{E}_m(f_{A_i}) \geq \underline{E}_m(f_{A_j})$
- generalized maximax          $f_{A_i} \succeq^* f_{A_j} \iff \overline{E}_m(f_{A_i}) \geq \overline{E}_m(f_{A_j})$
- generalized Hurwicz          $f_{A_i} \succeq_\alpha f_{A_j} \iff E_{m,\alpha}(f_{A_i}) \geq E_{m,\alpha}(f_{A_j})$
- generalized OWA              $f_{A_i} \succeq_\beta f_{A_j} \iff E^{owa}_{m,\beta}(f_{A_i}) \geq E^{owa}_{m,\beta}(f_{A_j})$
- generalized minimax regret   $f_{A_i} \succeq_r f_{A_j} \iff R(f_{A_i}) \leq R(f_{A_j})$
- maximum expected utility     $f_{A_i} \succeq_m f_{A_j} \iff EU(f_{A_i}) \geq EU(f_{A_j})$
- pignistic criterion          $f_{A_i} \succeq_p f_{A_j} \iff E_p(f_{A_i}) \geq E_p(f_{A_j})$

Extending the utility matrix via an OWA operator
The extended utility matrix $\hat{U}_{(2^n-1) \times n}$ is crucial to both decision-making and performance evaluation. The utility of assigning an instance to a set A should intuitively be a function of the utilities of the precise assignments within A:
$\hat{u}_{A,j} = F(\{u_{ij} \mid \omega_i \in A\}) = \sum_{k=1}^{|A|} w_k\, u_{A(k)j}.$
Given the decision maker's tolerance degree of imprecision
$\mathrm{TOL}(w) = \sum_{k=1}^{|A|} \frac{|A|-k}{|A|-1}\, w_k = \gamma,$
the OWA weights are obtained by maximizing the entropy
$\mathrm{ENT}(w) = -\sum_{k=1}^{|A|} w_k \log w_k$
subject to $\mathrm{TOL}(w) = \gamma$ and $\sum_{k=1}^{|A|} w_k = 1$.
Example: the utility matrix extended by an OWA operator with γ = 0.8 (see the table in the slides above).

Evaluation of set-valued predictions
The classification performance is evaluated by the averaged utility over the test set T:
$\mathrm{Acc}(T) = \frac{1}{|T|} \sum_{i=1}^{|T|} \hat{u}_{F^*_i, i^*}.$

Experimental data
UCI Balance scale dataset and simulated Gaussian datasets.
[Figure: scatter plot of a simulated Gaussian dataset, attribute x vs. attribute y, three classes]

Experiments
Belief functions concerning the states of nature were generated by the DS theory-based neural network classifier.
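As a small illustration of the second family, the sketch below applies the generalized maximin criterion: it selects the partial assignment f_A with the largest lower expected utility computed from the extended utility matrix. The dictionary encodings and the toy mass function are assumptions, and `ext` refers to the extended matrix built in the earlier sketch.

```python
def generalized_maximin(mass, extended_utilities):
    """Select the partial assignment f_A with the largest lower expected utility
    E_lower(f_A) = sum_B m(B) min_{omega_j in B} u_hat_{A,j}."""
    def lower_expectation(row):
        return sum(m_B * min(row[j] for j in B) for B, m_B in mass.items())
    return max(extended_utilities, key=lambda A: lower_expectation(extended_utilities[A]))


# Toy mass function over three classes; `ext` is the extended utility matrix
# (gamma = 0.8) built in the previous sketch.
m = {frozenset({0}): 0.5, frozenset({0, 1}): 0.3, frozenset({0, 1, 2}): 0.2}
print(generalized_maximin(m, ext))  # with evidence this imprecise, the full set (0, 1, 2) is chosen
```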
Classification performances with varying γ (UCI Balance scale dataset)

Averaged utility:
        DC1     DC2     DC3     DC4     DC5     DC6     DC7     DC8     DC9
γ=0.5   0.9186  0.9188  0.9186  0.9186  0.9186  0.9186  0.9187  0.9187  0.9187
γ=0.6   0.9179  0.9184  0.9176  0.9179  0.9184  0.9176  0.9187  0.9188  0.9188
γ=0.7   0.9059  0.9064  0.9052  0.9059  0.9056  0.9054  0.9190  0.9190  0.9187
γ=0.8   0.9043  0.9032  0.9028  0.9043  0.9030  0.9024  0.9191  0.9191  0.9188
γ=0.9   0.9319  0.9325  0.9331  0.9319  0.9339  0.9339  0.9192  0.9192  0.9188
γ=1.0   1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  0.9194  0.9194  0.9188

% of precise predictions:
        DC1      DC2      DC3      DC4      DC5      DC6      DC7     DC8     DC9
γ=0.5   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  97.44%  97.44%  99.97%
γ=0.6   88.96%   89.47%   88.96%   88.96%   89.18%   89.06%   97.44%  97.44%  99.97%
γ=0.7   80.10%   80.77%   80.06%   80.10%   80.22%   80.26%   97.44%  97.44%  99.97%
γ=0.8   69.70%   70.14%   69.63%   69.70%   69.82%   69.63%   97.44%  97.44%  99.97%
γ=0.9   57.02%   57.76%   57.12%   57.02%   57.38%   57.12%   97.44%  97.44%  99.97%
γ=1.0   0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    97.44%  97.44%  99.97%

Performances with noised test sets (Gaussian dataset)
[Figure: averaged utility and % of precise predictions as a function of the noise parameter, for the complete-preorder rules F1 (maximin/minimax regret, maximax, pignistic, Hurwicz, OWA) and the partial-preorder rules F2 (interval dominance, maximality, weak dominance)]

Performances with increasing training set size (Gaussian dataset)
[Figure: averaged utility and % of precise predictions as a function of the number of training instances, for the same F1 and F2 rules]

Conclusions
The set-valued predictions induced by a partial preorder turn into precise ones when the information becomes more precise. In contrast, the criteria based on a complete preorder can provide set-valued predictions even when uncertainty is quantified by probabilities. Set-valued predictions perform better than precise ones on complex data sets: therefore, the most cautious rules should be preferred in highly uncertain environments.