voice activity detection

Voice Activity Detection - PowerPoint PPT Presentation


  1. Voice Activity Detection
     Victor Lenoir
     LRDE, Laboratoire de Recherche et Développement de l'EPITA
     July 3, 2011
     http://lrde.epita.fr/

  2. Outline
     Introduction (Voice Activity Detection, Speaker Recognition); Feature Extraction; Algorithms (Threshold VAD, Gaussian Mixture Model VAD); Experiments (Results, Discuss)

  3. Outline
     Introduction (Voice Activity Detection, Speaker Recognition); Feature Extraction; Algorithms; Experiments

  4. Voice Activity Detection
     In audio processing, we often want to remove noise and silence. Voice activity detection is used to detect human speech in an audio recording.

  5. Applications
     Voice activity detection has many applications, such as:
     ◮ Speech encoding (GSM)
     ◮ Audio conferencing
     ◮ Speech/speaker recognition

  6. Speaker Recognition

  7. Outline
     Introduction; Feature Extraction; Algorithms; Experiments

  8. Feature Extraction: Short-term Analysis

  9. Features
     ◮ Energy
     ◮ Zero-Crossing Rate
     ◮ Spectral Flatness Measure
     ◮ Mel-Frequency Cepstral Coefficients (MFCCs)
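
A minimal sketch of the first three features, computed per frame after short-term framing. The frame size and hop, the helper names, and the naive DFT are assumptions made for illustration and do not come from the slides; MFCCs are left out because they need a mel filterbank and a DCT on top of the spectrum.

```python
import cmath
import math

def frames(signal, size, hop):
    """Split a signal into overlapping frames (size and hop are assumed values)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def energy(frame):
    """Short-term energy: sum of squared samples."""
    return sum(x * x for x in frame)

def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum for bins 0..n/2 (fine for a short demo frame)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]

def spectral_flatness(frame, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum, in (0, 1]."""
    p = [v + eps for v in power_spectrum(frame)]
    geo = math.exp(sum(math.log(v) for v in p) / len(p))
    return geo / (sum(p) / len(p))
```

A pure tone concentrates its power in one bin, so its flatness is close to 0, while a white-noise-like frame has a flat spectrum and a flatness close to 1.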

  10. Outline
     Introduction; Feature Extraction; Algorithms (Threshold VAD, Gaussian Mixture Model VAD); Experiments

  11. Threshold VAD: Structure

  12. VAD Threshold: Feature Extraction
     New feature for each frame:
     $\Delta(X) = \dfrac{E(X)}{\delta + \mathrm{SFM}(X) \cdot Z(X)}$
     Where:
     ◮ $E(X)$: energy
     ◮ $Z(X)$: number of zero crossings
     ◮ $\mathrm{SFM}(X)$: Spectral Flatness Measure
     ◮ $\delta$: constant that prevents division by 0
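
The combined feature can be sketched directly from the formula. The value of the constant `delta` and the sample inputs below are hypothetical; the slides give neither.

```python
def delta_feature(energy, sfm, zero_crossings, delta=1e-6):
    """Delta(X) = E(X) / (delta + SFM(X) * Z(X)); delta's value is an assumption."""
    return energy / (delta + sfm * zero_crossings)

# Hypothetical frames: speech tends to have high energy, a peaky spectrum
# (low SFM) and few zero crossings; noise tends to show the opposite.
speech_like = delta_feature(energy=30.0, sfm=0.05, zero_crossings=10)
noise_like = delta_feature(energy=1.0, sfm=0.8, zero_crossings=40)
```

Because energy sits in the numerator while flatness and zero crossings sit in the denominator, the feature grows for speech-like frames and shrinks for noise-like ones, which is what makes a single threshold workable.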

  13. VAD Threshold: Initialization

  14. VAD Threshold: Initialization
     Speech Threshold = 45

  15. VAD Threshold: Initialization
     Speech Threshold = 45
     Noise Threshold = 5

  16. VAD Threshold: Learning
     Final Threshold = (45 + 5) / 2 = 25

  17. VAD Threshold: Learning
     Final Threshold = (45 + 5) / 2 = 25

  18. VAD Threshold: Learning
     Final Threshold = (45 + 5) / 2 = 25

  19. VAD Threshold: Learning
     Final Threshold = (45 + 5) / 2 = 25
     New Speech Threshold = (35 + 28 + 27 + 26 + 45) / 5 = 32 (integer part of 32.2)
     New Noise Threshold = (5 + 21 + 6 + 16 + 6) / 5 = 10 (integer part of 10.8)
     New Final Threshold = (32 + 10) / 2 = 21
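
The arithmetic on the learning slides can be reproduced with a short sketch. The update rule here is a reading of the numbers, not something the slides state: each new threshold is taken as the integer part of the mean of the previous threshold and the frames classified on that side of the final threshold. The frame values are the ones on the slide; their order and the function names are assumptions.

```python
def final_threshold(speech_threshold, noise_threshold):
    """Midpoint of the speech and noise thresholds (integer part)."""
    return (speech_threshold + noise_threshold) // 2

def learn(frame_values, speech_threshold=45, noise_threshold=5):
    """One learning pass: classify frames, then re-average both thresholds."""
    threshold = final_threshold(speech_threshold, noise_threshold)
    speech = [speech_threshold] + [v for v in frame_values if v > threshold]
    noise = [noise_threshold] + [v for v in frame_values if v <= threshold]
    new_speech = sum(speech) // len(speech)
    new_noise = sum(noise) // len(noise)
    return new_speech, new_noise, final_threshold(new_speech, new_noise)

# Frame feature values taken from the slide; the interleaving is hypothetical.
new_speech, new_noise, new_final = learn([35, 21, 28, 6, 27, 16, 26, 6])
```

With the initial thresholds 45 and 5, the pass classifies 35, 28, 27 and 26 as speech and 21, 6, 16 and 6 as noise, which reproduces the slide's values of 32, 10 and 21.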

  20. Gaussian Mixture Model
     A Gaussian Mixture Model (GMM) is a probabilistic model used to represent a probability distribution. It is defined as a weighted sum of Gaussian components.
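
Written out, the weighted sum takes the standard form below. The symbols ($K$ components with weights $w_k$, means $\mu_k$ and covariances $\Sigma_k$) are notation chosen here and do not appear on the slides.

```latex
p(x) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad w_k \ge 0, \qquad \sum_{k=1}^{K} w_k = 1
```

The constraints on the weights are what make $p(x)$ integrate to 1, so that it is itself a valid probability density.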

  21. Voice Activity Detection using Gaussian Mixture Models

  22. VAD GMM: Feature Extraction
     We extract the same features as in the previous algorithm, plus MFCCs. We also compute the same combined feature as in the previous algorithm:
     $\Delta(X) = \dfrac{E(X)}{\delta + \mathrm{SFM}(X) \cdot Z(X)}$
     Where:
     ◮ $E(X)$: energy
     ◮ $Z(X)$: number of zero crossings
     ◮ $\mathrm{SFM}(X)$: Spectral Flatness Measure
     ◮ $\delta$: constant that prevents division by 0

  23. VAD GMM: Initialization

  24. VAD GMM: Initialization

  25. VAD GMM: Initialization
     SpeechGMM = Expectation Maximization (E)

  26. VAD GMM: Initialization
     SpeechGMM = Expectation Maximization (E)
     NoiseGMM = Expectation Maximization (F)
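
A rough sketch of this initialization, assuming E and F denote the sets of frames initially taken as speech and as noise (the figure that defined them is not preserved). To stay short it fits one-dimensional GMMs on hypothetical values of the combined feature only, whereas the slides also use MFCCs. The component count, the starting means, the variance floor, and the likelihood comparison at the end are all assumptions.

```python
import math

def gauss(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm(data, k=2, iters=50, min_var=1.0):
    """Fit a 1-D GMM with EM; returns a list of (weight, mean, variance)."""
    data = sorted(data)
    n = len(data)
    # Deterministic start: means spread over the sorted data, shared variance.
    means = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mu = sum(data) / n
    var0 = max(sum((x - mu) ** 2 for x in data) / n, min_var)
    comps = [(1.0 / k, m, var0) for m in means]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gauss(x, m, v) for w, m, v in comps]
            total = sum(p) or 1e-300
            resp.append([pi / total for pi in p])
        # M-step: re-estimate weight, mean and variance from responsibilities.
        new = []
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            m = sum(r[j] * x for r, x in zip(resp, data)) / nj
            v = sum(r[j] * (x - m) ** 2 for r, x in zip(resp, data)) / nj
            new.append((nj / n, m, max(v, min_var)))
        comps = new
    return comps

def log_likelihood(x, comps):
    """Log density of x under the mixture (tiny floor avoids log(0))."""
    return math.log(sum(w * gauss(x, m, v) for w, m, v in comps) + 1e-300)

# Hypothetical feature values for the initial speech (E) and noise (F) frames.
E = [35, 28, 27, 26, 45, 40, 31, 38]
F = [5, 21, 6, 16, 6, 9, 12, 4]
speech_gmm = fit_gmm(E)
noise_gmm = fit_gmm(F)

def is_speech(x):
    """Label a frame as speech when SpeechGMM explains it better than NoiseGMM."""
    return log_likelihood(x, speech_gmm) > log_likelihood(x, noise_gmm)
```

Comparing the two log-likelihoods is one common way to turn a pair of class models into a segmentation rule; the slides do not say which decision rule was actually used.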

  27. VAD GMM: Learning and Segmentation

  28. Outline
     Introduction; Feature Extraction; Algorithms; Experiments (Results, Discuss)

  29. Compare
     ◮ VAD_Ref
     ◮ VAD_Threshold
     ◮ VAD_GMM
