Speaker Recognition and the ETSI Standard

Speaker Recognition and the ETSI Standard Distributed Speech Recognition Front-End

Charles Broun, William Campbell
Motorola Labs, Human Interface Lab, Tempe, Arizona, USA

David Pearce, Holly Kelleher
Motorola Limited, Basingstoke, UK

Outline
• Background
• Speaker Verification
  – Embedded Process
  – Distributed Process
• Distributed Speech Recognition (DSR)
• Classifier
• Experimental Setup
• Results
• Conclusion

Background

Motivation
• Issues with Embedded Solutions
  – Mobile devices do not have the necessary memory or battery capacity
  – Updating software requires access to each device
  – Multiple devices may contain different speaker models
• Potential Benefits of Distributed Solutions
  – Server supports computation and memory requirements
  – Software updates are handled in a single location
  – A single speaker model may support multiple mobile devices, enabling a ‘portable’ interface
• Distributed Speech Recognition (DSR) Standard
  – This standard addresses the above issues for speech recognition
  – Can work on this standard be leveraged for speaker verification?

Speaker Verification

Embedded Process
• Feature extractor and classifier typically combined into a proprietary solution
• Can jointly optimize both components
• Target system must support computation & memory requirements of both components

[Figure: Input Speech Data → Feature Extractor → Classifier (using Speaker Model) → Score → Compare to Threshold, T → Accept (> T) or Reject (< T)]
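The final stage of the block diagram, comparing the score to a threshold, can be sketched as below. The function name `decide` and the example scores are hypothetical; the slide shows only the > T and < T branches, so treating a score exactly equal to T as a rejection is an assumption of this sketch.

```python
def decide(score: float, threshold: float) -> str:
    """Return 'accept' if the score exceeds the threshold, else 'reject'."""
    # Assumption: score == threshold is rejected; the slide does not
    # specify the behaviour at equality.
    return "accept" if score > threshold else "reject"


# Hypothetical scores, for illustration only.
print(decide(0.82, 0.5))
print(decide(0.31, 0.5))
```

Raising T trades false acceptances for false rejections, which is why results are usually reported at an operating point such as the equal error rate.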

Speaker Verification

Distributed Process
• Feature extractor is standardized
• Cannot jointly optimize both components
• Client only supports computational & memory requirements of feature extractor
• Server supports higher load of classifier

[Figure: Input Speech Data → Feature Extractor → Wireless Channel → Classifier (using Speaker Model) → Score → Compare to Threshold, T → Accept (> T) or Reject (< T)]

DSR

Background of DSR Standard
• Motivation of Standard Front-End
  – Potential benefits of distributed solutions for speech recognition
  – Eliminates voice/vocoder channel mismatch
• Activities
  – European Telecommunications Standards Institute (ETSI)
  – Aurora Working Group within ETSI
  – First standard published in February 2000

DSR

ETSI Standard DSR System Concept
• Terminal front-end targeted to mobile devices
• Features transmitted over a low-error data channel
• Speech recognizer runs on high-power server

[Figure: Terminal DSR Front-End (Parameterisation: Mel-Cepstrum → Compression: Split VQ → Frame Structure & Error Protection) → Wireless Data Channel, 4.8 kbit/s → Server DSR Back-End (Error Detection & Mitigation → Decompression → Recognition)]
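The 4.8 kbit/s channel rate can be checked with a short calculation. The framing constants below (10 ms frame shift, 44 bits per quantized frame, a 4-bit CRC per frame pair, 24 frames per multiframe, 48 bits of sync and header) are not stated on the slides; they are recalled from the published ETSI DSR front-end standard and should be treated as assumptions to verify against that document.

```python
# Framing constants: assumptions, not taken from the slides.
FRAME_SHIFT_MS = 10          # one feature vector every 10 ms
BITS_PER_FRAME = 44          # split-VQ indices for one frame
CRC_BITS_PER_PAIR = 4        # error protection per pair of frames
FRAMES_PER_MULTIFRAME = 24   # frames packed into one multiframe
HEADER_BITS = 48             # sync sequence plus header per multiframe


def dsr_bit_rate() -> float:
    """Bits per second implied by the assumed multiframe layout."""
    pairs = FRAMES_PER_MULTIFRAME // 2
    bits = HEADER_BITS + pairs * (2 * BITS_PER_FRAME + CRC_BITS_PER_PAIR)
    duration_ms = FRAMES_PER_MULTIFRAME * FRAME_SHIFT_MS
    return bits * 1000 / duration_ms


print(dsr_bit_rate())  # 4800.0
```

Under these assumptions the feature payload alone accounts for 4400 bit/s, and error protection and framing make up the remainder.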

DSR

ETSI Standard DSR Front-End
• Feature set consists of 12 mel-cepstrum coefficients, logE, and C0
• Quantization supports a data rate of 4800 b/s
• Error protection supports robustness to transmission errors

[Figure: front-end block diagram. Input Speech → ADC → Offcom → Framing → PE → W → FFT → MF → LOG → DCT, plus a logE block; the outputs feed Feature Compression → Bit Stream Formatting → To Transmission Channel]

Abbreviations:
ADC     Analog-to-digital conversion
Offcom  Offset compensation
PE      Pre-emphasis
logE    Energy measure computation
W       Windowing
FFT     Fast Fourier transform
MF      Mel-filtering
LOG     Non-linear transform
DCT     Discrete cosine transform
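The processing chain in the block diagram can be illustrated for a single frame with the simplified sketch below. It is not the ETSI reference implementation: every parameter value (8 kHz sampling, 256-point transform, 23 mel filters, 0.97 pre-emphasis), the use of mean removal as a stand-in for offset compensation, the filter shapes, and the point at which logE is taken are assumptions chosen for illustration. The function name `frame_features` is hypothetical.

```python
import cmath
import math


def mel(f_hz):
    """Map a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_inv(m):
    """Map a mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def frame_features(frame, fs=8000, n_fft=256, n_filters=23, pre=0.97):
    """Return [C1..C12, logE, C0] for one frame (illustrative only)."""
    n = len(frame)
    # Offcom: mean removal as a simple stand-in (assumption).
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    # logE: log of the frame energy, floored to avoid log(0).
    log_e = math.log(max(sum(s * s for s in x), 1e-10))
    # PE: first-order pre-emphasis.
    x = [x[0]] + [x[i] - pre * x[i - 1] for i in range(1, n)]
    # W: Hamming window.
    x = [x[i] * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i in range(n)]
    # FFT: magnitude spectrum via a direct DFT of the zero-padded frame.
    x = x + [0.0] * (n_fft - n)
    half = n_fft // 2
    mag = [abs(sum(x[i] * cmath.exp(-2j * math.pi * k * i / n_fft)
                   for i in range(n_fft)))
           for k in range(half + 1)]
    # MF: triangular filters equally spaced on the mel scale.
    step = mel(fs / 2) / (n_filters + 1)
    bins = [int(round(mel_inv(j * step) / fs * n_fft))
            for j in range(n_filters + 2)]
    fbank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        acc = 0.0
        for k in range(lo, hi + 1):
            if k <= mid:
                weight = (k - lo + 1) / (mid - lo + 1)
            else:
                weight = 1.0 - (k - mid) / (hi - mid + 1)
            acc += weight * mag[k]
        fbank.append(acc)
    # LOG: natural log of the filter-bank outputs, floored.
    log_fb = [math.log(max(v, 1e-10)) for v in fbank]
    # DCT: cepstral coefficients C0..C12.
    ceps = [sum(log_fb[j] * math.cos(math.pi * i * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for i in range(13)]
    return ceps[1:] + [log_e, ceps[0]]


# A 25 ms frame of a 440 Hz tone at 8 kHz, as a hypothetical input.
tone = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(200)]
print(len(frame_features(tone)))  # 14
```

The output has 14 values per frame, matching the feature set listed on the slide: 12 mel-cepstrum coefficients, logE, and C0.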

Classifier

Polynomial Classifier

Given $x = [x_1 \;\; x_2]^t$ and $K = 2$:
• Compute the polynomial basis vector
  $$p(x) = [\,1 \;\; x_1 \;\; x_2 \;\; x_1^2 \;\; x_1 x_2 \;\; x_2^2\,]^t$$
• Apply a polynomial discriminant function
  $$d(x, w) = w^t p(x)$$
• Compute the score as the average across all frames
  $$s = \frac{1}{M} \sum_{k=1}^{M} w^t p(x_k) = w^t \, \frac{1}{M} \sum_{k=1}^{M} p(x_k)$$

[Figure: DSR Feature Vectors $x_k$ → Polynomial Basis Vector $p(x)$ → Discriminant Function $d(x, w)$ (using Speaker Model $w$) → Average Σ → Score $s$]
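The three steps above can be sketched in a few lines. This is a minimal illustration, assuming the basis contains every monomial of degree 0 through K in the order shown on the slide; the function names and the example weights are hypothetical, and the sketch covers scoring only, not how the speaker model w is trained.

```python
from itertools import combinations_with_replacement
from math import prod


def poly_basis(x, order):
    """All monomials of the entries of x with degree 0..order."""
    terms = []
    for degree in range(order + 1):
        for idx in combinations_with_replacement(range(len(x)), degree):
            # Empty idx gives the constant term 1.
            terms.append(prod((x[i] for i in idx), start=1.0))
    return terms


def discriminant(x, w, order):
    """d(x, w) = w^t p(x)."""
    return sum(wi * pi for wi, pi in zip(w, poly_basis(x, order)))


def score(frames, w, order):
    """Average of the discriminant over all M frames."""
    return sum(discriminant(x, w, order) for x in frames) / len(frames)


print(poly_basis([2.0, 3.0], 2))  # [1.0, 2.0, 3.0, 4.0, 6.0, 9.0]

# Hypothetical model that simply picks out the x1 term.
w = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(score([[2.0, 3.0], [4.0, 5.0]], w, 2))  # 3.0
```

Because the discriminant is linear in w, the second form of the score equation lets a server average the basis vectors once and take a single inner product with each speaker model.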

Experimental Setup

YOHO Database
• 138 speakers
• Enrollment
  – 4 sessions
  – 24 phrases
  – “23-45-56”
• Testing
  – 10 sessions
  – 4 phrases
  – “45-23-56”

Speaker Verification System
• Classifier: 3rd-order polynomial
• Features: 12 MFCCs from the DSR front-end
• Channel: GSM bit-error masks
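The size of the speaker model implied by this configuration follows from a counting argument. Assuming the basis contains all monomials up to the polynomial order, as in the two-feature example on the classifier slide, the number of terms for n features and order K is the binomial coefficient C(n + K, K). The function name below is hypothetical.

```python
from math import comb


def num_poly_terms(n_features: int, order: int) -> int:
    """Number of monomials of degree 0..order in n_features variables."""
    return comb(n_features + order, order)


print(num_poly_terms(2, 2))   # 6, matching the two-feature example
print(num_poly_terms(12, 3))  # 455 terms for 12 MFCCs, 3rd order
```

Under that assumption, each speaker model in this setup is a vector of 455 weights.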

Results

Performance: Average Equal Error Rate (%) for a 1-Phrase Test

Enroll \ Verify   Unquantized   Error-Free   EP1    EP2    EP3
Unquantized       1.18          -            -      -      -
Error-Free        -             1.22         1.22   1.26   1.67
EP1               -             1.22         1.22   1.26   1.67
EP2               -             1.22         1.22   1.27   1.66
EP3               -             1.26         1.26   1.30   1.70
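For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from two lists of scores; it is not the authors' evaluation code, the scores are made up, and picking the candidate threshold where the two rates are closest is an assumption (other estimators interpolate between thresholds).

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER (%) from genuine and impostor trial scores."""
    best_gap, eer = None, None
    # Try every observed score as a threshold; accept when score > T.
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s <= t for s in genuine) / len(genuine)    # false rejects
        far = sum(s > t for s in impostor) / len(impostor)   # false accepts
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return 100.0 * eer


# Made-up scores for illustration; not data from the experiment.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine_scores, impostor_scores))  # 25.0
```

In the table, each cell would be this quantity averaged over speakers for one enrollment/verification channel pairing.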

Conclusion

Demonstrated that the ETSI Standard Distributed Speech Recognition Front-End is viable for speaker verification.
