
Progress in the framework of the RESPITE project at DaimlerChrysler



  1. Research & Technology. Progress in the framework of the RESPITE project at DaimlerChrysler Research & Technology. Dr.-Ing. Fritz Class and Joan Marí. Martigny, Jan. 2002

  2. Contents
     - DaimlerChrysler off-line demonstrator
     - Block-diagram of our off-line demonstrator
     - Next evaluation experiments using our demonstrator
     - On-going research in Discriminative Feature Extraction
     - TANDEM acoustic modelling
     - Linear-Discriminant-Analysis-based (LDA) front-end
     - Quadratic-Discriminant-Analysis-based (QDA) front-end
     - A two-layer perceptron to generate state-posteriors from QDA features (RBFs)
     - Results

  3. DC off-line demonstrator: block-diagram
     [Figure: block diagram. Recoverable labels: DC ASR system (Feature Extraction, Acoustic Modelling, Viterbi Decoding); Evaluation with the SCLITE package, giving results against the AURORA REF; AURORA PCM; "to UNI_IO" conversions at the feature and likelihood level; a second Feature Extraction and Acoustic Modelling path using CTK/QUICKNET/MSTK.]

  4. DC off-line demonstrator: next steps
     - Evaluate the results of the IDIAP Multi-Stream toolkit on the AURORA 2000 database and compare them with those of the SPRACHcore and CTK toolkits
     - Determine, given the results of that evaluation and our system requirements, which technique is the most suitable for our purposes
     - Using our own in-car American English database, compare our baseline system with the selected technique

  5. Contents
     - DaimlerChrysler off-line demonstrator
     - Block-diagram of our off-line demonstrator
     - Next evaluation experiments using our demonstrator
     - On-going research in Discriminative Feature Extraction
     - TANDEM acoustic modelling
     - Linear-Discriminant-Analysis-based (LDA) front-end
     - Quadratic-Discriminant-Analysis-based (QDA) front-end
     - A two-layer perceptron to generate state-posteriors from QDA features (RBFs)
     - Results and Conclusions

  6. Discriminative Feature Extraction: TANDEM training
     Non-linear transform of the feature space.
     [Figure: training flow diagram. Recoverable labels: Database PCM, Feature Extraction, Database CMF, NN forward pass, PCA, VQ, HMM Forward-Backward; Neural Net training with the EBP algorithm (Database ALI, NN weights); PCA matrix computation (Database NN outputs, PCA matrix); Unsupervised Clustering (Database PCA, CB); CB inversion (CB inv).]

  7. Discriminative Feature Extraction: LDA training
     Linear transform to reduce dimensionality.
     [Figure: training flow diagram. Recoverable labels: Database PCM, Feature Extraction, Database CMF, LDA transform, VQ, HMM Forward-Backward; Supervised Clustering (Database ALI, CB); LDA matrix computation (LDA matrix); inverted transform and CB inversion (CB inv).]
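The "LDA matrix computation" step on this slide can be illustrated with the textbook Fisher LDA procedure. This is a minimal sketch under stated assumptions, not the DaimlerChrysler implementation: the function name `lda_matrix`, the toy data, and the use of plain within-class and between-class scatter matrices are all hypothetical choices made for illustration.

```python
import numpy as np

def lda_matrix(features, labels, n_components):
    """Textbook Fisher LDA: return a (dim x n_components) projection matrix."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    dim = features.shape[1]
    overall_mean = features.mean(axis=0)
    s_within = np.zeros((dim, dim))
    s_between = np.zeros((dim, dim))
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        centered = class_feats - class_mean
        s_within += centered.T @ centered
        diff = (class_mean - overall_mean)[:, None]
        s_between += len(class_feats) * (diff @ diff.T)
    # Eigenvectors of inv(S_w) S_b, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(s_within, s_between))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs.real[:, order]

# Hypothetical toy data: two classes separated along the first axis.
feats = np.array([[0.0, -1.0], [0.0, 1.0], [0.1, 0.0], [-0.1, 0.0],
                  [5.0, -1.0], [5.0, 1.0], [5.1, 0.0], [4.9, 0.0]])
labs = np.array([0, 0, 0, 0, 1, 1, 1, 1])
w = lda_matrix(feats, labs, n_components=1)
projected = feats @ w  # reduced-dimension features
```

In this toy case the classes differ only along the first axis, so the single retained direction lines up with that axis and the second dimension is discarded.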

  8. Discriminative Feature Extraction: LDA TANDEM training
     [Figure: LDA training over frames t-1 and t; NN weights trained with the EBP algorithm from Database ALI; supervised clustering of Database ALI into a codebook (CB); scatter plot of two classes (x and o) in the x1-x2 plane.]
     Bayes rule:
     $$P(q_i \mid x) = \frac{p(x \mid q_i)\, P(q_i)}{\sum_j p(x \mid q_j)\, P(q_j)}$$
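As a small numerical illustration of the Bayes rule on this slide, the sketch below turns class likelihoods p(x/q_i) and priors P(q_i) into posteriors P(q_i/x). The function name and the three-class values are hypothetical and are not taken from the presentation.

```python
import numpy as np

def bayes_posteriors(likelihoods, priors):
    """P(q_i|x) = p(x|q_i) P(q_i) / sum_j p(x|q_j) P(q_j)."""
    joint = np.asarray(likelihoods, dtype=float) * np.asarray(priors, dtype=float)
    return joint / joint.sum()

# Hypothetical likelihoods and priors for three classes.
post = bayes_posteriors([0.2, 0.5, 0.1], [0.5, 0.25, 0.25])
print(post)
```

The denominator is the same for every class, so it only rescales the joint terms so that the posteriors sum to one.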

  9. Discriminative Feature Extraction: QDA
     TANDEM can be interpreted as a kind of non-linear feature extraction. TANDEM features are obtained from log-posteriors.
     Applying Bayes rule as in the previous slide:
     $$\log P(q_i \mid x) = \log p(x \mid q_i) + \log P(q_i) - \log \sum_j p(x \mid q_j)\, P(q_j)$$
     A quadratic equation is obtained:
     $$\log P(q_i \mid x) \propto -\frac{1}{2}\, (x - \mu_i)'\, \Sigma_i^{-1}\, (x - \mu_i) - \log p(x)$$
     Key questions at this point are:
     - Is one Gaussian per cluster enough?
     - How many classes should be used?
     - Is the Gaussianity assumption always a good one?
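The log-posterior expression on this slide can be evaluated directly when each class is modelled by one Gaussian. The following is a minimal sketch assuming full-covariance Gaussians with known parameters; the function names and the two-class toy parameters are hypothetical. Unlike the proportional form on the slide, it keeps the log-determinant and prior terms so that the posteriors normalise exactly.

```python
import numpy as np

def gaussian_log_pdf(x, mean, cov):
    """Log-density of a multivariate Gaussian N(mean, cov) at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)  # (x - mu)' Sigma^-1 (x - mu)
    return -0.5 * (maha + logdet + x.shape[0] * np.log(2.0 * np.pi))

def qda_log_posteriors(x, means, covs, priors):
    """log P(q_i|x) = log p(x|q_i) + log P(q_i) - log sum_j p(x|q_j) P(q_j)."""
    log_joint = np.array([gaussian_log_pdf(x, m, c) + np.log(p)
                          for m, c, p in zip(means, covs, priors)])
    log_px = np.logaddexp.reduce(log_joint)  # log p(x), computed stably
    return log_joint - log_px

# Hypothetical two-class example with identity covariances and equal priors.
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
priors = [0.5, 0.5]
log_post_mid = qda_log_posteriors(np.array([1.0, 0.0]), means, covs, priors)
log_post_near0 = qda_log_posteriors(np.array([0.0, 0.0]), means, covs, priors)
```

A point halfway between the two means gets equal posteriors, while a point at the first mean favours the first class.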

  10. Discriminative Feature Extraction: RBFs
     RBFs are a compromise between connectionist and parametric modelling.
     Returning to the Bayes rule:
     $$P(q_i \mid x) = \frac{p(x \mid q_i)\, P(q_i)}{\sum_j p(x \mid q_j)\, P(q_j)}$$
     An RBF is thus obtained. We could express it as:
     $$P(q_i \mid x) = f_i\big(w_{ik},\, N(\mu_k, \Sigma_k)\big)$$
     where f is the softmax function and N is the Gaussian pdf.
     [Figure: network diagram with Gaussian units N(\mu_k, \Sigma_k) feeding softmax outputs f_i(.).]
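One way to read the RBF expression on this slide is as a two-layer network: Gaussian hidden units followed by a weighted softmax output layer. The sketch below is an assumption about that structure, not the authors' network. In particular, feeding Gaussian log-densities to the output layer and adding a bias vector are hypothetical choices; they are convenient because identity weights and log-prior biases then reproduce the Bayes rule exactly.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log-density of N(mean, cov) at x (hidden-unit activation)."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (maha + logdet + x.shape[0] * np.log(2.0 * np.pi))

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rbf_posteriors(x, means, covs, weights, biases):
    """Softmax over a weighted combination of Gaussian hidden units."""
    hidden = np.array([log_gaussian(x, m, c) for m, c in zip(means, covs)])
    return softmax(weights @ hidden + biases)

# Hypothetical parameters: one Gaussian per class, identity output weights,
# log-prior biases. With these settings the network equals the Bayes rule.
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
priors = np.array([0.75, 0.25])
rbf_post = rbf_posteriors(np.array([1.0, 0.0]), means, covs,
                          weights=np.eye(2), biases=np.log(priors))
```

At the midpoint between the two means the likelihoods are equal, so the output reduces to the priors. Training the weights w_ik discriminatively would move the network away from this purely parametric starting point.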

  11. Discriminative Feature Extraction: results
     Recognition results on AURORA 2000, Test A.
     [Figure: bar chart of WER (%) at 0 dB, 10 dB, 20 dB and Clean for four systems: ANN/HMM CMF+MSG, ANN/HMM PLP+MSG, DC baseline, LDA_MSG-LDA_CMF. Bar values are not recoverable from the extracted text.]

  12. Discriminative Feature Extraction: results
     Recognition results on AURORA 2000, Test B.
     [Figure: bar chart of WER (%) at 0 dB, 10 dB, 20 dB and Clean for four systems: ANN/HMM MFCC+MSG, ANN/HMM PLP+MSG, DC baseline, LDA_MSG-LDA_CMF. Bar values are not recoverable from the extracted text.]

  13. Discriminative Feature Extraction: results
     Recognition results on AURORA 2000, Test C.
     [Figure: bar chart of WER (%) at 0 dB, 10 dB, 20 dB and Clean for four systems: ANN/HMM MFCC+MSG, ANN/HMM PLP+MSG, DC baseline, LDA_MSG-LDA_CMF. Bar values are not recoverable from the extracted text.]

  14. Discriminative Feature Extraction: results
     Reduction of the dimensionality of the Neural Net. TESTB-STREET, WORD MLP (MFCC). Results per condition and their average.

     | Features     | MLP Type                | 0 dB | 10 dB | 20 dB | Clean | Average | Weights |
     |--------------|-------------------------|------|-------|-------|-------|---------|---------|
     | Double Delta | (3x9x11)=297 + 480 + 127 | 37.5 | 4.4   | 1.8   | 2.4   | 11.5    | 203,520 |
     | Double Delta | (3x9x11)=297 + 254 + 127 | 36.0 | 7.1   | 2.9   | 2.7   | 12.2    | 107,696 |
     | Delta        | (2x9x11)=198 + 480 + 127 | 35.2 | 6.9   | 2.6   | 2.6   | 11.8    | 156,000 |
     | Delta        | (2x9x11)=198 + 254 + 127 | 36.6 | 7.5   | 2.7   | 2.9   | 12.4    | 82,550  |
     | No Delta     | (17x11)=187 + 480 + 127  | 36.5 | 7.8   | 3.1   | 3.1   | 12.6    | 150,720 |
     | No Delta     | (17x11)=187 + 254 + 127  | 37.4 | 8.5   | 3.0   | 3.3   | 13.1    | 79,756  |
     | No Delta     | (13x11)=143 + 480 + 127  | 37.7 | 8.3   | 2.7   | 2.8   | 12.9    | 129,600 |
     | No Delta     | (13x11)=143 + 254 + 127  | 39.1 | 8.3   | 3.1   | 3.3   | 13.5    | 68,580  |
     | No Delta     | (9x11)=99 + 480 + 127    | 39.1 | 8.1   | 2.9   | 2.8   | 13.2    | 108,480 |
     | No Delta     | (9x11)=99 + 254 + 127    | 40.1 | 8.7   | 3.1   | 3.1   | 13.8    | 57,404  |
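The "Weights" column on this slide is consistent with counting only the connections of a two-layer perceptron, that is inputs x hidden units plus hidden units x outputs, with no bias terms. The sketch below checks that reading against the slide's figures; the function name is hypothetical, and the no-bias formula is inferred from the numbers rather than stated in the presentation.

```python
def mlp_weight_count(n_inputs, n_hidden, n_outputs):
    """Connection count of a two-layer perceptron, ignoring biases."""
    return n_inputs * n_hidden + n_hidden * n_outputs

# (inputs, hidden, outputs) -> weight count reported on the slide.
reported = {
    (297, 480, 127): 203520,
    (297, 254, 127): 107696,
    (198, 480, 127): 156000,
    (198, 254, 127): 82550,
    (187, 480, 127): 150720,
    (187, 254, 127): 79756,
    (143, 480, 127): 129600,
    (143, 254, 127): 68580,
    (99, 480, 127): 108480,
    (99, 254, 127): 57404,
}

mismatches = [cfg for cfg, count in reported.items()
              if mlp_weight_count(*cfg) != count]
print("configurations that disagree with the formula:", mismatches)
```

Under this reading, shrinking either the input context or the hidden layer reduces the network size roughly proportionally, which is the trade-off against WER that the table explores.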

  15. Discriminative Feature Extraction: conclusions
     - TANDEM acoustic modelling can also be performed with discriminant parametric models (QDA)
     - As a compromise between connectionist and parametric modelling, RBFs can be used for TANDEM
     - Concatenating LDA-PLP and LDA-MSG features gives a slight improvement over our baseline LDA system
     - Word-based hybrid ANN/HMMs are the best-performing systems
