
Odyssey 2016: The Speaker and Language Recognition Workshop

Odyssey 2016: The Speaker and Language Recognition Workshop, June 21-24, 2016, Bilbao, Spain. A new feature for automatic speaker verification anti-spoofing: constant Q cepstral coefficients. Massimiliano Todisco, Héctor Delgado and Nicholas Evans.


  1. Odyssey 2016: The Speaker and Language Recognition Workshop, June 21-24, 2016, Bilbao, Spain
     A new feature for automatic speaker verification anti-spoofing: constant Q cepstral coefficients
     Massimiliano Todisco, Héctor Delgado and Nicholas Evans
     Department of Digital Security, EURECOM, Sophia Antipolis, France
     The OCTAVE project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 647850.

  2. Introduction
     o Spoofing is the manipulation of a biometric system by a fraudulent user
     o Automatic speaker verification is vulnerable to spoofing
     o The spoofing algorithm cannot be known in advance
     o Hence the need for generalised countermeasures
     o A new feature based on the constant Q transform is proposed

  3. From Fourier to constant Q
     o The Fourier transform may lack frequency resolution
     o In the STFT, the time and frequency resolutions are constant
     o The constant Q transform (CQT) is an alternative which more closely reflects human perception
     o The CQT employs a variable time/frequency resolution:
       • greater time resolution for higher frequencies
       • greater frequency resolution for lower frequencies
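The variable resolution described on this slide can be illustrated with a minimal sketch of the standard constant Q bin layout. The function name `cqt_bins` and the parameter values (minimum frequency, bins per octave, sample rate) are assumptions chosen for illustration; they are not taken from the slides.

```python
import math

def cqt_bins(f_min, bins_per_octave, n_bins, sample_rate):
    """Return the Q factor and (centre frequency, bandwidth, window length) per bin.

    Centre frequencies are geometrically spaced, so the ratio of centre
    frequency to bandwidth (the Q factor) is the same for every bin.
    """
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    bins = []
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / bins_per_octave)    # geometric spacing
        bandwidth = f_k / q                           # widens with frequency
        window = math.ceil(sample_rate * q / f_k)     # shortens with frequency
        bins.append((f_k, bandwidth, window))
    return q, bins

# Hypothetical settings: 15 Hz minimum, 12 bins per octave, 16 kHz audio.
q, bins = cqt_bins(15.0, 12, 48, 16000)
for f_k, bandwidth, window in bins[::12]:
    print(f"{f_k:8.1f} Hz  bandwidth {bandwidth:6.2f} Hz  window {window} samples")
```

Low bins get long analysis windows (fine frequency resolution), and high bins get short windows (fine time resolution), whereas an STFT would use a single window length for all frequencies.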

  4. Constant Q Cepstral Coefficients (CQCC)
     Block diagram of CQCC feature extraction: speech signal → Constant-Q Transform → Power spectrum → LOG → Uniform resampling → DCT → CQCC
     o Combines the CQT (in place of the STFT) with traditional cepstral analysis
     o Issue: the discrete cosine transform (DCT) cannot be applied directly
       § CQT and DCT have different scales (geometric vs linear)
       § Geometric DCT bases are no longer orthogonal
     o Solution: uniformly resample the non-uniform frequency scale of the CQT to a linear frequency scale
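The pipeline on this slide can be sketched for a single frame as follows. This is a simplified illustration, not the authors' implementation: it starts from an already computed constant Q power spectrum, and it swaps in plain linear interpolation for the uniform resampling step. All function names and the sizes `n_linear` and `n_coeffs` are hypothetical.

```python
import math

def resample_to_linear(freqs, values, n_out):
    """Interpolate values sampled at increasing (geometric) freqs onto a uniform grid."""
    lo, hi = freqs[0], freqs[-1]
    out, j = [], 0
    for i in range(n_out):
        f = lo + (hi - lo) * i / (n_out - 1)
        # Advance to the pair of geometric bins that brackets f.
        while j < len(freqs) - 2 and freqs[j + 1] < f:
            j += 1
        t = (f - freqs[j]) / (freqs[j + 1] - freqs[j])
        out.append(values[j] + t * (values[j + 1] - values[j]))
    return out

def dct_ii(x, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(x)
    return [sum(x[m] * math.cos(math.pi * p * (m + 0.5) / n) for m in range(n))
            for p in range(n_coeffs)]

def cqcc_frame(power_spectrum, freqs, n_linear=64, n_coeffs=20):
    """Log power spectrum -> uniform resampling -> DCT, for one frame."""
    log_spec = [math.log(p + 1e-12) for p in power_spectrum]  # small floor avoids log(0)
    linear_log_spec = resample_to_linear(freqs, log_spec, n_linear)
    return dct_ii(linear_log_spec, n_coeffs)

# Toy input: 48 geometrically spaced bins with a made-up decaying power spectrum.
freqs = [15.0 * 2.0 ** (k / 12) for k in range(48)]
power = [1.0 / (1.0 + k) for k in range(48)]
coeffs = cqcc_frame(power, freqs)
print(len(coeffs))
```

Resampling before the DCT matters because the DCT basis functions are orthogonal only over uniformly spaced samples, which is the issue the slide raises.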

  5. Experiments & Results on ASVspoof 2015 Database
     Table: Comparison of results (EER [%]) on the ASVspoof 2015 database
     Front-end: CQCC-A (19 + 0th second-derivative coefficients)
     Back-end: 2 GMMs (512 components, EM training), one for human speech and one for spoofed speech
     o Known attacks: all the systems deliver excellent error rates
     o Unknown attacks: CQCC features give the best performance
       § Attack S10 (unit selection synthesis): EER = 1.065% → 87% relative improvement
     o Best spoofing detection performance (72% relative improvement) reported to date
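The two-model back-end and the EER metric on this slide can be sketched as below. The slides do not give the scoring rule, so the average log-likelihood ratio used here is an assumption (it is a common choice for two-class GMM detectors). The sketch does not perform EM training, and the function names and the single-component toy models are hypothetical stand-ins for the 512-component GMMs.

```python
import math

def gmm_log_likelihood(frame, gmm):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    logs = []
    for weight, means, variances in gmm:
        ll = math.log(weight)
        for x, mu, var in zip(frame, means, variances):
            ll += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        logs.append(ll)
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

def llr_score(frames, human_gmm, spoof_gmm):
    """Average per-frame log-likelihood ratio; higher means more human-like."""
    total = sum(gmm_log_likelihood(f, human_gmm) - gmm_log_likelihood(f, spoof_gmm)
                for f in frames)
    return total / len(frames)

def equal_error_rate(human_scores, spoof_scores):
    """Approximate EER: sweep observed scores as thresholds and take the point
    where the false acceptance and false rejection rates are closest."""
    best_gap, eer = float("inf"), None
    for thr in sorted(human_scores + spoof_scores):
        frr = sum(s < thr for s in human_scores) / len(human_scores)
        far = sum(s >= thr for s in spoof_scores) / len(spoof_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Toy one-dimensional models, purely illustrative.
human_gmm = [(1.0, [0.0], [1.0])]
spoof_gmm = [(1.0, [3.0], [1.0])]
print(llr_score([[0.2], [-0.1]], human_gmm, spoof_gmm))
```

A trial is accepted as human speech when its score exceeds a threshold; the EER is the error rate at the threshold where false acceptances and false rejections balance.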

  6. Poster place

  7. Poster place
