Pattern Recognition. Bayesian and non-Bayesian Tasks. Petr Pošík

CZECH TECHNICAL UNIVERSITY IN PRAGUE
Faculty of Electrical Engineering
Department of Cybernetics

Pattern Recognition. Bayesian and non-Bayesian Tasks.
Petr Pošík

This lecture is based on the book Ten Lectures on Statistical and Structural Pattern Recognition by Michail I. Schlesinger and Václav Hlaváč (Kluwer, 2002). (The Czech edition of the book was published under the title Deset přednášek z teorie statistického a strukturálního rozpoznávání by the ČVUT publishing house in 1999.)

P. Pošík © 2014 Artificial Intelligence – 1 / 21

Pattern Recognition

Definitions of concepts

An object of interest is characterized by the following parameters:
■ observation x ∈ X (vector of numbers, graph, picture, sound, ECG, ...), and
■ hidden state k ∈ K.
■ k is often viewed as the object class, but it may be something different, e.g. when we seek the location k of an object based on the picture x taken by a camera.

Joint probability distribution $p_{XK} : X \times K \to \langle 0, 1 \rangle$
■ $p_{XK}(x, k)$ is the joint probability that the object is in the state k and we observe x.
■ $p_{XK}(x, k) = p_{X|K}(x \mid k) \cdot p_K(k)$

Decision strategy (or function or rule) $q : X \to D$
■ D is a set of possible decisions. (Very often D = K.)
■ q is a function that assigns a decision d = q(x), d ∈ D, to each x ∈ X.

Penalty function (or loss function) $W : K \times D \to \mathbb{R}$ (real numbers)
■ W(k, d) is a penalty for decision d if the object is in state k.

Risk $R : Q \to \mathbb{R}$, where Q denotes the set of strategies
■ R(q) is the mathematical expectation of the penalty which must be paid when using the strategy q.
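
The concepts above can be made concrete on a small discrete problem. The following is only a sketch under stated assumptions: the states, observations, probabilities, the 0/1 penalty, and the threshold strategy `q` are all hypothetical values chosen for illustration and do not come from the lecture.

```python
# Hypothetical toy problem (all names and numbers are illustrative assumptions).
K = ["healthy", "ill"]            # hidden states
X = ["low", "medium", "high"]     # possible observations
D = K                             # decisions coincide with states here (D = K)

p_K = {"healthy": 0.8, "ill": 0.2}                       # prior p_K(k)
p_X_given_K = {                                          # likelihood p_{X|K}(x | k)
    "healthy": {"low": 0.7, "medium": 0.2, "high": 0.1},
    "ill":     {"low": 0.1, "medium": 0.3, "high": 0.6},
}


def joint(x, k):
    """Joint probability p_XK(x, k) = p_{X|K}(x | k) * p_K(k)."""
    return p_X_given_K[k][x] * p_K[k]


def W(k, d):
    """Assumed 0/1 penalty: zero for a correct decision, one otherwise."""
    return 0.0 if k == d else 1.0


def q(x):
    """One arbitrary strategy: decide 'ill' only for a 'high' observation."""
    return "ill" if x == "high" else "healthy"


def risk(strategy):
    """Expected penalty of a strategy: sum over x and k of p_XK(x, k) * W(k, q(x))."""
    return sum(joint(x, k) * W(k, strategy(x)) for x in X for k in K)


print(f"{risk(q):.2f}")  # prints 0.16
```

With the 0/1 penalty assumed here, the risk equals the probability of a wrong decision; a different choice of W would weight the two kinds of error differently.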

Notes to decision tasks

In the following, we consider decision tasks where
■ the decisions do not influence the state of nature (unlike game theory or control theory),
■ a single decision is made and issues of time are ignored in the model (unlike control theory, where decisions are typically taken continuously in real time),
■ the costs of obtaining the observations are not modelled (unlike sequential decision theory).

The hidden parameter k (state, class) is considered not observable. Common situations are:
■ k can be observed, but at a high cost.
■ k is a future state (e.g. price of gold) and will be observed later.

Pattern recognition task examples

The description of the concepts is very general: so far we have not specified what the items of the sets X, K, and D actually are, or how they are represented.

Application                     Observation (measurement)           Decisions
------------------------------  ----------------------------------  --------------------
Coin value in a slot machine    x ∈ R^n                             Value
Cancerous tissue detection      Gene-expression profile, x ∈ R^n    {yes, no}
Medical diagnostics             Results of medical tests, x ∈ R^n   Diagnosis
Optical character recognition   2D bitmap, intensity image          Words, numbers
License plate recognition       2D bitmap, grey-level image         Characters, numbers
Fingerprint recognition         2D bitmap, grey-level image         Personal identity
Face detection                  2D bitmap                           {yes, no}
Speech recognition              x(t)                                Words
Speaker identification          x(t)                                Personal identity
Speaker verification            x(t)                                {yes, no}
EEG, ECG analysis               x(t)                                Diagnosis
Forfeit detection               Various                             {yes, no}

Two types of pattern recognition

1. Statistical pattern recognition
■ Objects are represented as points in a vector space.
■ The point (vector) x contains the individual observations (in a numerical form) as its coordinates.

2. Structural pattern recognition
■ The object observations contain a structure which is represented and used for recognition.
■ A typical example of the representation of a structure is a grammar.

Bayesian Decision Theory

Bayesian decision task

Given the sets X, K, and D, and functions $p_{XK} : X \times K \to \langle 0, 1 \rangle$ and $W : K \times D \to \mathbb{R}$, find a strategy $q : X \to D$ which minimizes the Bayesian risk

$$R(q) = \sum_{x \in X} \sum_{k \in K} p_{XK}(x, k) \cdot W(k, q(x)).$$

The optimal strategy q, denoted as $q^*$, is then called the Bayesian strategy.
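
For finite X, K, and D the task can be solved directly. The sketch below relies on the fact that the double sum in R(q) separates over x, so the best decision can be chosen for each observation independently; it then cross-checks the result against a brute-force search over all |D|^|X| strategies. The joint probabilities and the 0/1 penalty are hypothetical illustration values, and the function names are not taken from the lecture.

```python
from itertools import product

# Hypothetical toy problem (illustrative assumptions, not from the lecture).
K = ["healthy", "ill"]
X = ["low", "medium", "high"]
D = ["healthy", "ill"]

p_XK = {  # joint probabilities p_XK(x, k); they sum to 1
    ("low", "healthy"): 0.56,    ("low", "ill"): 0.02,
    ("medium", "healthy"): 0.16, ("medium", "ill"): 0.06,
    ("high", "healthy"): 0.08,   ("high", "ill"): 0.12,
}


def W(k, d):
    """Assumed 0/1 penalty: zero for a correct decision, one otherwise."""
    return 0.0 if k == d else 1.0


def risk(q):
    """Bayesian risk R(q) of a strategy given as a dict mapping x to d."""
    return sum(p_XK[x, k] * W(k, q[x]) for x in X for k in K)


def bayes_strategy():
    """For each x, pick the decision d minimizing sum_k p_XK(x, k) * W(k, d)."""
    return {
        x: min(D, key=lambda d: sum(p_XK[x, k] * W(k, d) for k in K))
        for x in X
    }


def brute_force_strategy():
    """Enumerate every strategy q: X -> D and return one with minimal risk."""
    candidates = (dict(zip(X, decisions)) for decisions in product(D, repeat=len(X)))
    return min(candidates, key=risk)


q_star = bayes_strategy()
print(q_star)                  # {'low': 'healthy', 'medium': 'healthy', 'high': 'ill'}
print(f"{risk(q_star):.2f}")   # prints 0.16
```

The brute-force search is only practical for very small problems; the per-observation minimization does the same job with |X| · |D| · |K| operations.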
