GMM-based classification from noisy features
Alexey Ozerov (1), Mathieu Lagrange (2) and Emmanuel Vincent (1)
(1) INRIA, Centre de Rennes - Bretagne Atlantique; (2) STMS Lab IRCAM - CNRS - UPMC
International Workshop on Machine Listening in Multisource Environments (CHiME 2011), Florence, Italy
1st September 2011
Outline
- Introduction
- GMM decoding from noisy data
- GMM learning from noisy data
- Experiments
- Conclusions and further work
Introduction
- Classification from noisy data: classification from noisy or multi-source audio
- Pipeline: noisy signal → feature extraction → noisy features → classification → decision
- Poor performance because of high noise variability
State of the art
- Signal level: noise suppression or source separation
- Pipeline: noisy signal → source separation → separated signal → feature extraction → classification → decision
State of the art
- Feature level: features robust to
  - additive or convolutive noise
  - errors produced by source separation
- Pipeline: noisy signal → source separation → separated signal → robust feature extraction → noisy features → classification → decision
State of the art
- Classifier level: classification that accounts for possible distortion of the features, given some information about this distortion [Cooke01, Barker05, Deng05, Kolossa10]
- Pipeline: noisy signal → source separation → separated signal → feature extraction → noisy features + information about feature distortion (uncertainty) → classification → decision
- Here: generative GMM-based classification
State-of-the-art limits and our contributions
- Limit 1: it is assumed that the clean data underlying the noisy observations were generated by the GMMs [Cooke01, Barker05, Deng05, Kolossa10]
- Contribution 1: introduction and investigation of a new data-driven criterion for GMM learning and decoding, as an alternative to the model-driven criterion
State-of-the-art limits and our contributions
- Limit 2: uncertainty is taken into account only at the decoding stage, assuming that the GMMs were trained on clean data [Cooke01, Barker05, Deng05, Kolossa10]
- Contribution 2: derivation of two new Expectation-Maximization (EM) algorithms that allow learning GMMs from noisy data with Gaussian uncertainty, one for each of the criteria considered
Outline
- Introduction
- GMM decoding from noisy data
- GMM learning from noisy data
- Experiments
- Conclusions and further work
GMM decoding from noisy data
- GMM model of the clean features
- Uncertainties:
  - Binary (each feature is either observed or missing) [Cooke01, Barker05]
  - Gaussian ("asymptotically" more general) [Deng05, Kolossa10]
- In the Gaussian case, the noisy feature and its uncertainty covariance are known, while the underlying clean feature is unknown
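(The equations on this slide did not survive extraction. A plausible reconstruction, in notation of my choosing: the clean features follow a GMM and each noisy observation carries a known Gaussian uncertainty.)

$$
p(x \mid \theta) = \sum_{k=1}^{K} \omega_k \, \mathcal{N}(x; \mu_k, \Sigma_k),
\qquad
x_n \sim \mathcal{N}(\bar{x}_n, \bar{\Sigma}_n),
$$

where the noisy feature $\bar{x}_n$ and its uncertainty covariance $\bar{\Sigma}_n$ are known, while the clean feature $x_n$ and its GMM component are unknown, consistent with the known/unknown annotations left behind on the slide.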
Criteria
- Criterion 1: model-driven criterion (likelihood integration) [state of the art] [Deng05, Kolossa10]
- The GMM likelihood is integrated (a "feature expectation" is taken) over the distribution of the missing clean feature
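(The formula was lost in extraction; judging from the surviving labels "GMM", "missing feature" and "feature expectation", and from [Deng05, Kolossa10], it is presumably the standard uncertainty-decoding integral, which for Gaussian uncertainty has a closed form in which the uncertainty covariance inflates each component covariance:)

$$
p(\bar{x}_n \mid \theta)
= \int \mathcal{N}(x; \bar{x}_n, \bar{\Sigma}_n)\, p(x \mid \theta)\, dx
= \sum_{k=1}^{K} \omega_k \, \mathcal{N}\big(\bar{x}_n;\, \mu_k,\, \Sigma_k + \bar{\Sigma}_n\big).
$$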
Criteria
- Criterion 2: data-driven criterion (log-likelihood integration) [proposed]
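(Again a reconstruction in the same notation: log-likelihood integration presumably averages the log-likelihood, rather than the likelihood, over the uncertainty distribution of the clean feature:)

$$
\widetilde{\ell}_n(\theta) = \int \mathcal{N}(x; \bar{x}_n, \bar{\Sigma}_n)\, \log p(x \mid \theta)\, dx .
$$

Unlike criterion 1, this integral has no closed form for a GMM. A minimal sketch of both decoding scores under these assumptions (diagonal covariances, criterion 2 approximated by Monte-Carlo sampling; illustrative code, not the authors' implementation):

```python
# Sketch of the two decoding criteria for a diagonal-covariance GMM
# under Gaussian feature uncertainty (illustrative, not the paper's code).
import numpy as np

def log_gauss_diag(x, mu, var):
    """Log-density of N(x; mu, diag(var))."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var, axis=-1)

def score_criterion1(x_bar, sig_bar, weights, means, covs):
    """Model-driven: log sum_k w_k N(x_bar; mu_k, Sigma_k + Sigma_bar_n)."""
    comp = np.log(weights) + log_gauss_diag(x_bar, means, covs + sig_bar)
    return np.logaddexp.reduce(comp)

def score_criterion2(x_bar, sig_bar, weights, means, covs, n_samples=1000, rng=None):
    """Data-driven: E_{x ~ N(x_bar, Sigma_bar_n)}[log p(x | theta)], Monte-Carlo."""
    rng = rng or np.random.default_rng(0)
    x = x_bar + np.sqrt(sig_bar) * rng.standard_normal((n_samples, x_bar.size))
    comp = np.log(weights)[None, :] + log_gauss_diag(x[:, None, :], means, covs)
    return np.logaddexp.reduce(comp, axis=1).mean()
```

Classification then selects the class whose GMM maximizes the chosen score summed over all frames of the utterance.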
Outline
- Introduction
- GMM decoding from noisy data
- GMM learning from noisy data
- Experiments
- Conclusions and further work
GMM learning from noisy data
- Binary uncertainty: EM algorithm [Ghahramani&Jordan94]
- Gaussian uncertainty: we derived two new EM algorithms, one for each of the criteria considered
GMM learning from noisy data
- The derivations needed some approximations
- The resulting algorithm "asymptotically" generalizes the binary-uncertainty EM [Ghahramani&Jordan94]
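(The update equations did not survive extraction, and the slide notes that approximations were needed, so the authors' exact algorithm is not reproduced here. As an illustration only, one EM iteration for the model-driven, likelihood-integration criterion can be derived generically by treating the clean features as hidden data; a diagonal-covariance sketch:)

```python
# Sketch (assumption, not the paper's exact algorithm): one EM iteration
# for GMM learning from noisy features under likelihood integration,
# treating the clean features as hidden variables. Diagonal covariances.
# X_bar: (N, D) noisy features; S_bar: (N, D) uncertainty variances.
import numpy as np

def em_step(X_bar, S_bar, weights, means, covs):
    N, D = X_bar.shape
    inflated = covs[None] + S_bar[:, None]                     # (N, K, D)
    # E-step 1: responsibilities from the variance-inflated likelihood
    log_r = np.log(weights)[None, :] - 0.5 * (
        np.log(2 * np.pi * inflated).sum(-1)
        + ((X_bar[:, None] - means[None]) ** 2 / inflated).sum(-1)
    )                                                          # (N, K)
    log_r -= np.logaddexp.reduce(log_r, axis=1, keepdims=True)
    r = np.exp(log_r)
    # E-step 2: Wiener-like posterior of the clean feature per component
    gain = covs[None] / inflated                               # (N, K, D)
    m = means[None] + gain * (X_bar[:, None] - means[None])    # posterior mean
    v = gain * S_bar[:, None]                                  # posterior variance
    # M-step: standard GMM updates on the posterior statistics
    Nk = r.sum(0)
    weights = Nk / N
    means = (r[:, :, None] * m).sum(0) / Nk[:, None]
    covs = (r[:, :, None] * (v + (m - means[None]) ** 2)).sum(0) / Nk[:, None]
    return weights, means, covs
```

When all uncertainty variances are zero this reduces to ordinary GMM EM, and letting a variance grow without bound recovers the missing-feature behaviour, in line with the "asymptotic" generalization claimed on the slide.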
Outline
- Introduction
- GMM decoding from noisy data
- GMM learning from noisy data
- Experiments
- Conclusions and further work
Artificial uncertainty (two-step sampling)
1. The feature noise is drawn from a Gaussian with a per-frame covariance
2. The per-frame covariance is itself drawn at random
- Artificial uncertainty:
  - gives us the possibility to control some characteristics of the uncertainty,
  - allows us to leave the study of the following situations for further work:
    - realistic feature-corrupting noise,
    - estimated uncertainty covariances.
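(A sketch of such a generator; the mapping from FNR and NVL, defined on the next slide, to the sampling parameters is my illustration, not necessarily the paper's recipe:)

```python
# Sketch (assumptions noted in comments): artificially corrupting clean
# features with a controlled feature-to-noise ratio (FNR) and noise
# variation level (NVL).
import numpy as np

def corrupt(X, fnr_db, nvl_db, rng=None):
    """X: (N, D) clean features. Returns noisy features and the
    per-frame uncertainty variances that are handed to the classifier."""
    rng = rng or np.random.default_rng(0)
    N, D = X.shape
    mean_noise_var = np.mean(X ** 2) / 10 ** (fnr_db / 10)  # fixes the FNR
    # Spread the per-frame noise variances around the mean level (in dB);
    # NVL = 0 dB gives a constant noise variance (assumed mapping).
    spread_db = rng.uniform(-nvl_db / 2, nvl_db / 2, size=(N, 1))
    S_bar = mean_noise_var * 10 ** (spread_db / 10) * np.ones((1, D))
    X_bar = X + np.sqrt(S_bar) * rng.standard_normal((N, D))
    return X_bar, S_bar
```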
Characteristics of the uncertainty
- Feature-to-Noise Ratio (FNR), in dB: overall ratio of feature energy to uncertainty (noise) energy
- Noise Variation Level (NVL), in dB: the amount by which the uncertainty varies from frame to frame
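(The formulas were lost in extraction. Definitions consistent with the names and units could look as follows; this is a reconstruction, not necessarily the authors' exact definitions:)

$$
\mathrm{FNR} = 10 \log_{10} \frac{\sum_n \|x_n\|^2}{\sum_n \operatorname{tr}\big(\bar{\Sigma}_n\big)} \quad \text{(dB)},
$$

while the NVL quantifies, in dB, the spread of the per-frame uncertainty variances around their mean level, NVL = 0 dB corresponding to a constant uncertainty over frames.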
Evaluated setups
- All possible combinations of the training and testing uncertainty characteristics (FNR and NVL values) and of the classification methods
- 375 setups in total
Artificial data
[Figure: the three class GMMs used for clean data generation; samples of clean data for classes 1-3; noisy data at NVL = 0 dB, FNR = 10 dB and at NVL = 8 dB, FNR = 10 dB]
Real data
- Speaker recognition task
- Setting quite similar to [Reynolds95]:
  - TIMIT database
  - 10 male speakers
  - 16-state GMMs
  - Feature space dimension = 20
- Differences with [Reynolds95]:
  - Features: logarithms of Mel-frequency filter-bank outputs (LMFFB) instead of MFCCs
  - GMMs with full covariance matrices
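(For concreteness, a minimal sketch of LMFFB extraction; all settings except the 20-dimensional feature space mentioned above are illustrative:)

```python
# Sketch: log Mel-frequency filter-bank (LMFFB) features.
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def lmffb(signal, sr=16000, n_fft=512, hop=256, n_filters=20):
    # Frame the signal, window it and take the power spectrum
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    # Triangular filters equally spaced on the Mel scale
    edges = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * edges / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return np.log(spec @ fbank.T + 1e-10)  # (n_frames, n_filters)
```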
Artificial data results
[Figure: correct classification rate vs. test FNR (left panel: impact of FNR, NVL train = NVL test = 0 dB, curves for FNR train = 0 and 20 dB) and vs. test NVL (right panel: impact of NVL, FNR train = FNR test = -10 dB, curves for NVL train = 0 and 8 dB), for likelihood integration ("Like int"), log-likelihood integration ("Log like int") and decoding without uncertainty ("No uncrt")]
Real data results
[Figure: correct classification rate vs. test FNR (left panel: impact of FNR, NVL train = NVL test = 0 dB, curves for FNR train = 10 and 20 dB) and vs. test NVL (right panel: impact of NVL, FNR train = FNR test = 0 dB, curves for NVL train = 0 and 8 dB), for likelihood integration ("Like int"), log-likelihood integration ("Log like int") and decoding without uncertainty ("No uncrt")]
Outline
- Introduction
- GMM decoding from noisy data
- GMM learning from noisy data
- Experiments
- Conclusions and further work
Conclusions and further work
- Conclusions:
  - We validate the model-driven uncertainty decoding approach, as compared to the proposed data-driven approach.
  - We show that taking the uncertainty into account allows us to:
    - handle the heterogeneity of noise between the training and testing sets,
    - exploit the variability of noise for improved performance.
- Further work:
  - Considering realistic feature-corrupting noise and the estimation of uncertainty covariances.
  - Considering log-likelihood integration within a GMM-based classification framework with discriminative training.