Introduction to Pattern Recognition (Reconnaissance de Formes, RF / « Pattern Recognition – PR »)
Dijana Petrovska-Delacrétaz
dijana.petrovska@telecom-sudparis.eu
(updated 2020-03-24)
Bibliography
1. Pattern Classification, 2nd ed.; R.O. Duda, P.E. Hart, D.G. Stork; John Wiley & Sons, 2001
2. Introduction to Pattern Recognition: Statistical, Structural, Neural and Fuzzy Logic Approaches; M. Friedman, A. Kandel; World Scientific, 1999
3. Reconnaissance des Formes et Analyse de Scènes, vol. 3; M. Kunt et al.; Presses Polytechniques et Universitaires Romandes, 2000
4. Statistical Pattern Recognition: A Review; A. Jain, R. Duin, J. Mao; IEEE Trans. PAMI, 2000 (most figures are taken from this reference)
5. Guide to Biometric Reference Systems and Performance Evaluation; D. Petrovska-Delacrétaz, G. Chollet, and B. Dorizzi, editors; Springer-Verlag, 2009. DOI: 10.1007/978-1-84800-292-0
6. Deep Learning with Python; F. Chollet; Manning Publications, 2017
7. http://www.jmlr.org/papers/volume11/erhan10a/erhan10a.pdf
Introduction
• Pattern recognition (PR) is the study of how machines can observe the environment, learn to distinguish patterns of interest from their background, and make sound and reasonable decisions about the categories of the patterns
• Humans are the best pattern recognizers, but "their" PR algorithms are mostly unknown
• Goal: make decisions automatically
• Many disciplines are concerned with PR: artificial intelligence, computer vision, machine learning, psychology, biology, medicine, …, as well as understanding human PR
• Pattern (definition by Watanabe, 1985): the opposite of chaos; an entity that could be given a name
Pattern examples
Profusion of "big" data
• More and more data are available in digital form: on Google, DNA sequences, YouTube videos, astronomy, personal photos, … hence the need to organize the data
• Supervised classification (e.g., discriminant analysis), in which the input pattern is identified as a member of a predefined class
• Unsupervised classification (e.g., clustering), in which the pattern is assigned to a hitherto unknown class
• Emerging applications: Google search, biometrics, speech recognition
• Made possible by rapidly growing computational and storage power
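The unsupervised case mentioned above can be sketched with a minimal k-means clustering routine. This is a toy illustration, not a production implementation; the 1-D data and the choice of k = 2 are assumptions made for the example:

```python
import random

def kmeans_1d(points, k=2, iters=20, seed=0):
    """Minimal k-means on 1-D data: assign each point to the
    nearest centroid, then recompute centroids, repeatedly."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[i].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two well-separated groups: the algorithm discovers the two
# "hitherto unknown" classes without any labels.
data = [0.9, 1.0, 1.1, 9.9, 10.0, 10.1]
print(kmeans_1d(data))  # approximately [1.0, 10.0]
```

No labels are used anywhere: the class structure is recovered from the data alone, which is exactly what distinguishes this from the supervised setting.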
Design of a PR system
• Modules of a PR system:
– Data acquisition
– Segmentation
– Data representation (feature extraction)
– Learning (classification)
– Post-processing (cost, fusion)
– Decision making
• Basic approaches to PR:
– Template matching
– Statistical classification
– Syntactic or structural matching
– Neural networks
Design Cycle
• Data collection
• Feature choice
• Model choice
• Training
• Evaluation: the split of the data should be done at the very beginning, into disjoint train, development and evaluation partitions (specific to the PR approach)
• Common evaluations, reference systems and reproducible results:
– Guide to Biometric Reference Systems and Performance Evaluation. D. Petrovska-Delacrétaz, G. Chollet, and B. Dorizzi, editors. Springer-Verlag, 2009. DOI: 10.1007/978-1-84800-292-0
Template matching
• In template matching, a template (typically, a 2D shape) or a prototype of the pattern to be recognized is available. The pattern to be recognized is matched (with a similarity measure) against the stored template while taking into account all allowable pose (translation and rotation) and scale changes.
• Examples: fingerprints, normalized face images (pixel correlation)
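The pixel-correlation matching mentioned above can be sketched in pure Python. The 1-D "image" row and template below are hypothetical toy data; real template matching would slide a 2-D template over an image, but the similarity measure is the same:

```python
import math

def ncc(template, patch):
    """Normalized cross-correlation between two equal-size
    pixel lists: 1.0 means a perfect (up to brightness/contrast) match."""
    n = len(template)
    mt = sum(template) / n
    mp = sum(patch) / n
    num = sum((t - mt) * (p - mp) for t, p in zip(template, patch))
    den = math.sqrt(sum((t - mt) ** 2 for t in template) *
                    sum((p - mp) ** 2 for p in patch))
    return num / den if den else 0.0

def best_match(image_row, template):
    """Slide the template along a 1-D 'image' and return the
    offset with the highest similarity score."""
    w = len(template)
    scores = [ncc(template, image_row[i:i + w])
              for i in range(len(image_row) - w + 1)]
    return max(range(len(scores)), key=scores.__getitem__)

row = [0, 0, 0, 5, 9, 5, 0, 0]
print(best_match(row, [5, 9, 5]))  # 3: the bump starts at offset 3
```

The sliding search over offsets is the translation part of the "allowable pose changes"; handling rotation and scale would require additional search dimensions.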
Statistical approach
• Each pattern is represented in terms of d-dimensional features or measurements and is viewed as a point in a d-dimensional space.
• The goal is to choose those features that allow pattern vectors belonging to different categories to occupy compact and disjoint regions in the d-dimensional feature space.
• Examples: Bayes classifiers, LDA, …
Syntactic approach
• For complex patterns, it is more appropriate to adopt a hierarchical perspective in which a pattern is viewed as being composed of simple subpatterns, themselves built from yet simpler subpatterns. The simplest/elementary subpatterns are called primitives, and the complex pattern is represented in terms of the interrelationships between these primitives.
• Assumes that pattern structure is quantifiable and extractable, so that the structural similarity of patterns can be assessed.
• Grammatical rules are needed for the classification.
• Examples: data represented by symbols: DNA sequences, atomic speech units (phonemes), data-driven speech units
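A toy sketch of the syntactic idea: symbols act as primitives and a single grammar rule decides membership. The "gene-like" rule below is a deliberately simplified, hypothetical grammar (it is loosely inspired by start/stop codons but is not real genomics):

```python
import re

# Hypothetical grammar over the primitives {A, C, G, T}:
# a "gene-like" string is ATG, one or more 3-symbol codons,
# then one of the stop codons.  (Toy rule for illustration only.)
GENE = re.compile(r"^ATG(?:[ACGT]{3})+(?:TAA|TAG|TGA)$")

def classify(seq):
    """Structural classification: accept iff the string of
    primitives is derivable by the grammar rule above."""
    return "gene-like" if GENE.match(seq) else "other"

print(classify("ATGGCTTAA"))   # gene-like: ATG + GCT + TAA
print(classify("GCTGCTGCT"))   # other: no start primitive
```

The point is that classification here rests on the *structure* of the symbol sequence (which rule derives it), not on distances in a feature space.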
Neural networks and co.
• Massively parallel computing systems consisting of an extremely large number of simple processors with many interconnections.
• Main characteristics of neural networks: the ability to learn complex nonlinear input-output relationships, the use of sequential training procedures, and the ability to adapt to the data
• Require huge amounts of training data and computational power
• Are beginning to perform well on real-world applications
• Are able to learn data structures
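The sequential, error-driven training mentioned above can be shown at its smallest scale: a single neuron trained with the classic perceptron rule. The AND function and the learning rate are assumptions chosen so the toy example converges; real networks stack many such units and use gradient descent instead:

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Rosenblatt perceptron rule: nudge the weights whenever the
    thresholded output disagrees with the target label."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Logical AND is linearly separable, so the perceptron converges.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
print([predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]
```

Training is sequential in exactly the sense of the slide: the weights are updated one example at a time, and the input-output relationship is learned rather than programmed.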
Various approaches in statistical PR
Our focus: Statistical Pattern Recognition
• The recognition system is operated in two modes: training (learning) and classification (testing)
Learning with generalization ability
• Data: train and test sets should be disjoint
• If test set = train set, the system merely learns "by heart" (rote learning) and its generalization ability cannot be measured
• Test on the test set only once (no tuning on it)
• Partition the data in an optimal way
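The disjoint train/development/evaluation partition described in the design cycle can be sketched as follows; the 60/20/20 split ratios and the fixed seed are assumptions for the example:

```python
import random

def split_data(samples, dev_frac=0.2, eval_frac=0.2, seed=42):
    """Shuffle once, then cut into three disjoint partitions.
    The evaluation set must be touched only once, at the very end."""
    data = list(samples)
    random.Random(seed).shuffle(data)
    n = len(data)
    n_eval = int(n * eval_frac)
    n_dev = int(n * dev_frac)
    eval_set = data[:n_eval]
    dev_set = data[n_eval:n_eval + n_dev]
    train_set = data[n_eval + n_dev:]
    return train_set, dev_set, eval_set

train, dev, ev = split_data(range(100))
print(len(train), len(dev), len(ev))  # 60 20 20
```

Tuning hyperparameters on the development set and reserving the evaluation set for a single final measurement is what prevents the "optimization on the test set" failure mode discussed below.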
Causes of poor generalization
• # features >> # training samples
• Too many parameters to estimate with too little data
• Optimization on the test set: overtraining
Known probability distributions
• The features are assumed to have a probability density or mass function (depending on whether the features are continuous or discrete) conditioned on the pattern class. Thus, a pattern vector x belonging to class wi is viewed as an observation drawn randomly from the class-conditional probability function p(x|wi).
• Optimal Bayes decision …
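A numerical sketch of the Bayes decision: with known class-conditional densities p(x|wi) and priors P(wi), pick the class maximizing the posterior p(wi|x) ∝ p(x|wi) P(wi). The two Gaussian classes below (means 0 and 3, equal priors) are hypothetical:

```python
import math

def gauss_pdf(x, mu, sigma):
    """Univariate Gaussian density N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_decide(x, classes):
    """Bayes rule: maximize p(x|wi) * P(wi); the common
    denominator p(x) does not affect the argmax."""
    return max(classes, key=lambda c: gauss_pdf(x, c["mu"], c["sigma"]) * c["prior"])

# Two hypothetical classes with equal priors:
w1 = {"name": "w1", "mu": 0.0, "sigma": 1.0, "prior": 0.5}
w2 = {"name": "w2", "mu": 3.0, "sigma": 1.0, "prior": 0.5}
print(bayes_decide(0.5, [w1, w2])["name"])  # w1: closer to mu = 0
print(bayes_decide(2.5, [w1, w2])["name"])  # w2: closer to mu = 3
```

With equal priors and equal variances the decision boundary falls midway between the two means (x = 1.5); unequal priors would shift it toward the less probable class.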
Dichotomy: supervised/unsupervised learning
• If the form of the class-conditional densities is not known, we operate in a nonparametric mode. In this case, we must either estimate the density function (e.g., the Parzen window approach) or use the k-nearest neighbor rule.
• For classification via the construction of decision boundaries: directly construct the decision boundary based on the training data (geometric approach)
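The k-nearest-neighbor rule mentioned above is easy to sketch without any density estimate; the 2-D training points and labels below are hypothetical toy data:

```python
from collections import Counter

def knn_classify(train, x, k=3):
    """k-nearest-neighbor rule: label x by a majority vote
    among the k closest training points (squared Euclidean
    distance; the square root does not change the ordering)."""
    dist = lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2
    nearest = sorted(train, key=dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
         ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
print(knn_classify(train, (0.5, 0.5)))  # A
print(knn_classify(train, (5.5, 5.5)))  # B
```

No density is ever written down: the rule works directly from the stored training samples, which is what makes it nonparametric.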
Feature extraction
• Dimensionality reduction methods: PCA, LDA, neural networks, …
• See the face example
2D versus 3D faces
• 2D faces:
– Easy acquisition
– Inexpensive devices
– Widespread coding formats
Required modules (steps)
Modules for:
• Face detection
• Localization of characteristic points (landmarks)
• Extraction of relevant features
• Creation of compact models
• Evaluation metrics
The need for data
• To have relevant training examples to "educate" the algorithms of each module (the domain of machine learning)
• To find good parameters for the operation of the individual modules
• To evaluate the results
• To find the operating limits of the algorithms
Face detection
• Most widespread algorithm: AdaBoost (the Viola-Jones detector); see OpenCV
Detection of characteristic points ("landmarks") of the face
• 58 landmarks
Normalization
• With some additional preprocessing:
– Face extraction with geometric normalization
– Gray-level extraction from the HSV color system
– Photometric normalization by anisotropic smoothing
– 128x128 pixels
The dimensionality problem
• E.g., a 2D image of 150x150 pixels:
– With 50 dimensions quantized to 20 levels, the space of possibilities is already 20^50
– Dimension = 22,500 gray or RGB values …
• High-resolution 3D data: 75,000 facets (texture) + shape …
• These data must be handled in a compact way
• There are correlations => how to exploit them => with PCA
• Creation of "manifolds"
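How PCA exploits such correlations can be shown on a 2-D toy example (real face images would have thousands of dimensions, but the 2x2 covariance matrix of 2-D data has a closed-form eigen-solution, so no linear-algebra library is needed). The data points below are hypothetical:

```python
import math

def pca_2d(points):
    """First principal component of 2-D data: build the 2x2
    covariance matrix, then solve its eigen-problem in closed form."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Largest eigenvalue of [[cxx, cxy], [cxy, cyy]]:
    tr, det = cxx + cyy, cxx * cyy - cxy * cxy
    lam = tr / 2 + math.sqrt(tr * tr / 4 - det)
    # Corresponding eigenvector, normalized to unit length:
    v = (cxy, lam - cxx) if abs(cxy) > 1e-12 else (1.0, 0.0)
    norm = math.hypot(*v)
    return lam, (v[0] / norm, v[1] / norm)

# Points nearly on the line y = x (strongly correlated coordinates):
# the principal axis comes out close to (1/sqrt(2), 1/sqrt(2)),
# so one coordinate along that axis captures almost all the variance.
pts = [(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9), (4, 4.1)]
lam, axis = pca_2d(pts)
print(axis)
```

Projecting each point onto this axis replaces two correlated coordinates by one, which is the same compression idea applied to the 22,500 correlated pixel values of a face image.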
The "manifold" of faces
• Pixels are highly correlated in certain regions
• Common features: eyes, nose, mouth, symmetry around the vertical axis …
• Face variability:
– Expressions
– Capture noise (camera resolution)
– Aging
– Changes of appearance …
• Difficulty: how to represent these variabilities
Representation of images
• PCA stack
Intrinsic space of image classes
• An arbitrary image is defined by its a x b = d pixels:
– Dimensionality d
• Information extraction to reduce the dimension: exploiting redundancies
• Goals:
– Smaller storage space
– For transmission => less bandwidth used and/or progressive display
– For pattern recognition => faster recognition on low-dimensional images
• Problem: what is the intrinsic dimensionality of a class of images?
Example of a straight line
A straight line in R^3:
• Representative points a = (x1, x2, x3): dimension 3
• The subspace formed by all the points of the line has one degree of freedom
• Representative subspace:
– the line f(x1, x2, x3) = a1 x1 + a2 x2 + a3 x3
• Representation of the points: translation along the line
– Dimension 1
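The slide's one-degree-of-freedom idea can be checked numerically: a point on a line in R^3 is fully described by a single scalar, its translation along the line. The origin, direction and test point below are hypothetical:

```python
def line_coordinate(p, origin, direction):
    """Project a 3-D point onto a line: the scalar t such that
    p = origin + t * direction is the point's 1-D coordinate."""
    d = [p[i] - origin[i] for i in range(3)]
    norm2 = sum(c * c for c in direction)
    return sum(d[i] * direction[i] for i in range(3)) / norm2

origin = (0.0, 0.0, 0.0)
direction = (1.0, 2.0, 2.0)   # a line through the origin in R^3
p = (2.0, 4.0, 4.0)           # = origin + 2 * direction
t = line_coordinate(p, origin, direction)
print(t)  # 2.0: one number suffices to recover the 3-D point
```

This is the intrinsic-dimensionality question in miniature: the ambient dimension is 3, but the class of points "on this line" has intrinsic dimension 1.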