
Constrained discriminative speaker verification specific to normalized i-vectors
P.M. Bousquet, J.F. Bonastre
LIA, University of Avignon
Odyssey 2016, June 21, 2016

Discriminative approach for i-vector: state of the art (SoA)
- Normalization: within-class covariance matrix W (centering and scaling), then length normalization
- Gaussian-PLDA modelling: parameters (µ, Φ, Λ), LLR score
- Discriminative classifier: logistic regression-based (SoA), trained either with the score coefficients or with the PLDA parameters (µ, Φ, Λ)

Discriminative approach for i-vector: proposed
- Normalization: within-class covariance matrix W (centering and scaling), length normalization, and an additional normalization procedure (intended to constrain the discriminative training)
- Gaussian-PLDA modelling: parameters (µ, Φ, Λ), LLR score
- Discriminative classifier: constrained (limited order of coefficients to optimize), either
  - logistic regression-based (SoA), with the score coefficients or with the PLDA parameters (µ, Φ, Λ), or
  - orthonormal discriminative classifier (a new approach)

Gaussian-PLDA model
A $d$-dimensional i-vector $w$ can be decomposed as follows:

$$w = \mu + \Phi y_s + \varepsilon \quad (1)$$

- $\Phi y_s$ and $\varepsilon$ are assumed to be statistically independent, and $\varepsilon$ follows a centered Gaussian distribution with full covariance matrix $\Lambda$.
- The speaker factor $y_s$ can be a full-rank $d$-vector (two-covariance model), or $\Phi y_s$ can be constrained to lie in the $r$-dimensional range of the $d \times r$ matrix $\Phi$ (eigenvoice subspace).
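As an illustration of Eq. (1), the following minimal sketch draws synthetic i-vectors from the Gaussian-PLDA model. It assumes a standard normal prior on the speaker factor, $y_s \sim \mathcal{N}(0, I_r)$, which the slide does not state explicitly; the function name `sample_ivectors` and its arguments are hypothetical.

```python
import numpy as np

def sample_ivectors(mu, Phi, Lam, n_speakers, n_sessions, rng):
    """Draw i-vectors w = mu + Phi y_s + eps, following Eq. (1).

    Assumption (not stated on the slide): y_s ~ N(0, I_r).
    Returns an (n_speakers * n_sessions, d) array and the speaker labels.
    """
    d, r = Phi.shape
    # One speaker factor per speaker, shared by all of that speaker's sessions.
    y = rng.standard_normal((n_speakers, r))
    # One residual per session, drawn from N(0, Lam).
    eps = rng.multivariate_normal(np.zeros(d), Lam, size=(n_speakers, n_sessions))
    W = mu + (y @ Phi.T)[:, None, :] + eps
    labels = np.repeat(np.arange(n_speakers), n_sessions)
    return W.reshape(-1, d), labels
```

Under this assumption the total covariance of the sampled i-vectors approaches $\Phi\Phi^t + \Lambda$ as the number of speakers grows.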

Gaussian-PLDA scoring
Closed-form solution of the LLR score, a second-degree polynomial function of the components of $w_i$ and $w_j$:

$$s_{i,j} = \log \frac{P(w_i, w_j \mid H_{tar})}{P(w_i, w_j \mid H_{non})} = w_i^t P w_j + \frac{1}{2}\left(w_i^t Q w_i + w_j^t Q w_j\right) - \mu^t (P + Q)(w_i + w_j) + \mu^t (P + Q)\mu + \frac{1}{2}\log|A_t| - \log|A_n| \quad (2)$$

where

$$P = \Lambda^{-1}\Phi\left(2\Phi^t\Lambda^{-1}\Phi + I_r\right)^{-1}\Phi^t\Lambda^{-1}$$
$$Q = P - \Lambda^{-1}\Phi\left(\Phi^t\Lambda^{-1}\Phi + I_r\right)^{-1}\Phi^t\Lambda^{-1}$$
$$A_t = \left(2\Phi^t\Lambda^{-1}\Phi + I_r\right)^{-1}$$
$$A_n = \left(\Phi^t\Lambda^{-1}\Phi + I_r\right)^{-1} \quad (3)$$
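The closed form of Eqs. (2) and (3) can be sketched in NumPy as follows. This is an illustrative implementation under the slide's definitions, not the authors' code; the function names `plda_score_matrices` and `plda_llr` are hypothetical.

```python
import numpy as np

def plda_score_matrices(Phi, Lam):
    """Return P, Q and the constant log-determinant term of Eqs. (2)-(3)."""
    r = Phi.shape[1]
    Lam_inv = np.linalg.inv(Lam)
    G = Phi.T @ Lam_inv @ Phi                    # r x r matrix Phi^t Lam^-1 Phi
    A_t = np.linalg.inv(2.0 * G + np.eye(r))
    A_n = np.linalg.inv(G + np.eye(r))
    L = Lam_inv @ Phi                            # d x r matrix Lam^-1 Phi
    P = L @ A_t @ L.T
    Q = P - L @ A_n @ L.T
    _, logdet_t = np.linalg.slogdet(A_t)
    _, logdet_n = np.linalg.slogdet(A_n)
    offset = 0.5 * logdet_t - logdet_n           # 1/2 log|A_t| - log|A_n|
    return P, Q, offset

def plda_llr(w_i, w_j, mu, P, Q, offset):
    """LLR score of Eq. (2) for a single trial (w_i, w_j)."""
    s = w_i @ P @ w_j + 0.5 * (w_i @ Q @ w_i + w_j @ Q @ w_j)
    s -= mu @ (P + Q) @ (w_i + w_j)
    s += mu @ (P + Q) @ mu
    return s + offset
```

The matrices P and Q and the offset depend only on the PLDA parameters, so they can be computed once and reused for every trial.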

Discriminative classifiers for speaker recognition
SoA: based on logistic regression
Given the dataset of target and non-target trials $\chi_{tar}$, $\chi_{non}$, with cardinalities $N_{tar}$, $N_{non}$ respectively, the probability of correctly classifying all training trials, normalized by class cardinality (total cross entropy, TCE), is:

$$TCE = \left(\prod_{t \in \chi_{non}} P(H_{non} \mid t)\right)^{\frac{1}{N_{non}}} \left(\prod_{t \in \chi_{tar}} P(H_{tar} \mid t)\right)^{\frac{1}{N_{tar}}} \quad (4)$$

Goal: maximize the (log-)TCE by gradient-based optimization with respect to some coefficients:
- PLDA LLR score coefficients (i.e. those of the score matrices $P$ and $Q$). The LLR score can be written as a dot product $\varphi_{i,j} \cdot \omega$ between an expanded vector of a trial, $\varphi_{i,j}$, and a vector $\omega$ initialized with the PLDA parameters [Burget et al., 2011].
- PLDA parameters $(\mu, \Phi, \Lambda)$ [Börgstrom and Mac Cree, 2013].
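The dot-product form of the score and the log of Eq. (4) can be sketched as below. Two points are assumptions rather than content of the slides: the layout of the expanded vector (one possible expansion, which may differ from the one used in [Burget et al., 2011]), and the posterior model, taken here as a sigmoid of the LLR score plus the prior log-odds. All function names are hypothetical.

```python
import numpy as np

def expand_trial(w_i, w_j):
    """Expanded trial vector phi_{i,j} (hypothetical layout)."""
    cross = np.outer(w_i, w_j).ravel()                          # pairs with P
    quad = (np.outer(w_i, w_i) + np.outer(w_j, w_j)).ravel()    # pairs with Q / 2
    return np.concatenate([cross, quad, w_i + w_j, [1.0]])

def omega_from_score(mu, P, Q, offset=0.0):
    """Coefficient vector omega such that phi . omega equals the score of Eq. (2)."""
    lin = -(P + Q).T @ mu
    const = mu @ (P + Q) @ mu + offset
    return np.concatenate([P.ravel(), 0.5 * Q.ravel(), lin, [const]])

def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    return -np.logaddexp(0.0, -x)

def log_tce(tar_scores, non_scores, prior_tar=0.5):
    """Log of Eq. (4): class-wise mean log-probability of a correct decision.

    Assumption: P(H_tar | t) = sigmoid(score + log-odds of the target prior).
    """
    log_odds = np.log(prior_tar) - np.log1p(-prior_tar)
    log_p_tar = log_sigmoid(np.asarray(tar_scores, dtype=float) + log_odds)
    log_p_non = log_sigmoid(-(np.asarray(non_scores, dtype=float) + log_odds))
    return log_p_tar.mean() + log_p_non.mean()
```

With this layout the score is linear in `omega`, so the log-TCE is a logistic-regression objective over `omega`, with `d^2 + d^2 + d + 1` coefficients in the unconstrained case.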


Discriminative classifiers for speaker recognition
Difficulties to overcome
Discriminative training (DT) can suffer from various limitations:
- data insufficiency
- over-fitting on development data
- compliance with metaparameter conditions: definiteness, positivity / negativity of the PLDA and LLR-score matrices ...

Constrained DT: training only a small number of parameters ⇒ order $O(d)$, or even $O(1)$, instead of $O(d^2)$.
Some solutions [Rohdin et al., 2016, Börgstrom and Mac Cree, 2013]:
- A single coefficient is optimized for each dimension of the i-vector or even for each of the four kinds of features that make up the score.
- Only the mean vector $\mu$ and the eigenvalues of the PLDA matrices $\Phi\Phi^t$ and $\Lambda$ are trained by DT, or even only their scaling factors.
- Metaparameter conditions: working with the singular value decomposition of $P$ and $Q$ / flooring of parameters.

Discriminative classifiers for speaker recognition
Difficulties to overcome (continued)
DT struggles to improve speaker detection when the i-vectors have first been normalized, even though normalization has been shown to give the best performance in speaker verification.


Normalization step
- Within-class covariance matrix W (centering and scaling)
- Length normalization
⇒ W is almost exactly isotropic, i.e. $W \approx \sigma I$, $\sigma > 0$.

Normalization step: proposed additional step
An additional normalization step, which does not modify distances between i-vectors, is applied after the two steps above: rotation by the eigenvector basis of the between-class covariance matrix $B$ of the training dataset.

$$B = P \Delta P^t \ \text{(SVD)}, \qquad w \leftarrow P^t w$$

Here $P$ denotes the eigenvector basis of $B$, not the score matrix $P$ of Eq. (3).
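The rotation can be sketched as follows. This is a minimal illustration, assuming that B is estimated as the session-weighted covariance of the speaker means (the slides do not give the estimator) and using a symmetric eigendecomposition, which coincides with the SVD for a symmetric positive semi-definite matrix. The names `between_class_covariance` and `b_rotation` are hypothetical.

```python
import numpy as np

def between_class_covariance(X, labels):
    """Between-class covariance of the rows of X (assumed estimator:
    speaker means weighted by their number of sessions)."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    B = np.zeros((d, d))
    for s in np.unique(labels):
        Xs = X[labels == s]
        diff = (Xs.mean(axis=0) - mean)[:, None]
        B += len(Xs) * (diff @ diff.T)
    return B / len(X)

def b_rotation(X, labels):
    """Rotate i-vectors onto the eigenvector basis of B: w <- P^t w."""
    B = between_class_covariance(X, labels)
    eigvals, basis = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder by decreasing eigenvalue
    basis = basis[:, order]
    # Rows of X are i-vectors, so w <- P^t w becomes X @ P.
    return X @ basis, basis
```

Because `basis` is orthogonal, pairwise distances are unchanged, and the between-class covariance of the rotated vectors is diagonal with decreasing entries.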

Normalization step: consequences of the rotation
After the rotation by the eigenvector basis of $B$:
⇒ $B$ is diagonal,
⇒ $W$ remains almost exactly isotropic (and therefore diagonal), since the $B$-eigenvector basis is orthogonal.
Assumptions: the PLDA matrices $\Phi\Phi^t$ and $\Lambda$ become almost diagonal, and $\Lambda$ even almost isotropic; as a consequence, the score matrices $P$ and $Q$ are almost diagonal.

Normalization step: link with LDA
Moreover, since $W \approx \sigma I$, $W^{-1}B \approx B$ up to the scale factor $1/\sigma$ ⇒ the LDA solution can be identified with the subspace spanned by the first $r$ eigenvectors of $B$. The first $r$ components of the rotated training i-vectors are approximately their projection onto the LDA $r$-subspace.

Normalization step
The score can be rewritten as a sum of $O(r)$ terms:

$$s_{i,j} = \sum_{k=1}^{r} \left[ p_k w_{i,k} w_{j,k} + \frac{1}{2} q_k \left(w_{i,k}^2 + w_{j,k}^2\right) - (p_k + q_k)\,\mu_k \left(w_{i,k} + w_{j,k}\right) \right] + res_{i,j} \quad (5)$$

where
- $r$ is the rank of the PLDA eigenvoice subspace,
- $p_k$ and $q_k$ are the $k$-th diagonal entries of $P$ and $Q$,
- $res_{i,j}$ sums all the diagonal terms beyond the $r$-th dimension, all the off-diagonal terms, and the offsets.

Thus, we assume that most of the variability in the LLR score is contained in the first $r$ terms of the sum above (the residual term is negligible).
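The sum in Eq. (5) can be sketched as below, taking the diagonals of the score matrices as inputs. The function name `diagonal_score` is hypothetical, and the residual term is simply omitted.

```python
import numpy as np

def diagonal_score(w_i, w_j, mu, p, q, r):
    """First r terms of Eq. (5), without the residual res_{i,j}.

    p and q hold the diagonal entries of the score matrices P and Q.
    """
    wi, wj, m = w_i[:r], w_j[:r], mu[:r]
    pk, qk = p[:r], q[:r]
    terms = (pk * wi * wj
             + 0.5 * qk * (wi ** 2 + wj ** 2)
             - (pk + qk) * m * (wi + wj))
    return terms.sum()
```

Only the `r` values of `p`, the `r` values of `q` and the first `r` components of `mu` enter this sum, which is what brings the number of coefficients to optimize down to order O(r).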

Normalization step
Table: Analysis of PLDA parameters before and after the additional normalization procedure ($B$-rotation).

                                                   before            after
                                                 male   female    male   female
Diagonality of ...
  PLDA eigenvoice subspace $\Phi\Phi^t$          0.23   0.15      0.95   0.97
  PLDA score matrix $P$                          0.48   0.25      0.98   0.96
  PLDA score matrix $Q$                          0.41   0.23      0.96   0.97
Isotropy of PLDA nuisance variability $\Lambda$  0.98   0.96      0.99   0.97
Residual variance                                0.29   0.42      0.004  0.004

- Diagonality of the symmetric matrix $\Phi\Phi^t$: $\dfrac{\mathrm{Tr}\left(\mathrm{diag}(\Phi\Phi^t)^2\right)}{\mathrm{Tr}\left((\Phi\Phi^t)^2\right)} \in [0, 1]$
- Isotropy of $\Lambda$: $\dfrac{d \times m_\Lambda^2}{\mathrm{Tr}(\Lambda^2)} \in [0, 1]$, where $m_\Lambda$ denotes the mean value of the diagonal of $\Lambda$
- Variance of the residual term: $\dfrac{\mathrm{var}(res)}{\mathrm{var}(score)} \in [0, 1]$
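The diagonality and isotropy measures defined under the table can be computed as in this short sketch. It assumes symmetric input matrices, as in the definitions; the function names are hypothetical.

```python
import numpy as np

def diagonality(M):
    """Tr(diag(M)^2) / Tr(M^2) for a symmetric matrix M; equals 1 iff M is diagonal."""
    diag = np.diag(M)
    return np.sum(diag ** 2) / np.trace(M @ M)

def isotropy(M):
    """d * m^2 / Tr(M^2), with m the mean of the diagonal of M;
    equals 1 iff M is a multiple of the identity."""
    d = M.shape[0]
    m = np.mean(np.diag(M))
    return d * m ** 2 / np.trace(M @ M)
```

For a symmetric matrix, `np.trace(M @ M)` is the sum of all squared entries, so both ratios are at most 1.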
