Resource development and experiments in automatic South African broadcast news transcription
SLTU 2012, Cape Town, South Africa

Herman Kamper (1), Febe de Wet (1, 2), Thomas Hain (3), Thomas Niesler (1)
(1) Department of Electrical and Electronic Engineering, Stellenbosch University, South Africa
(2) Human Language Technology Competency Area, CSIR Meraka Institute, Pretoria, South Africa
(3) Department of Computer Science, University of Sheffield, United Kingdom

Introduction

Broadcast news domain:
- Provides a ready source of speech audio data
- Covers a variety of speech styles and quality, from careful newsreader speech to noisy spontaneous speech
- Useful as a component for subsequent speech technologies

South African (English) broadcast news:
- Several prevalent English accents
- South African English is an under-resourced variety of English

Motivation

Report on baseline results of a straightforward system:
- Use resources collected at Stellenbosch University (2000 to present)
- Aim is to use the baseline for comparative/interesting further studies

H. Kamper (Stellenbosch University), South African broadcast news (SABN), SLTU 2012, Cape Town, South Africa

Accents of English in South Africa

Five major accents of South African English are identified in the literature:
- Afrikaans English (AE)
- Black South African English (BE)
- Cape Flats English (CE)
- White South African English (EE)
- Indian South African English (IE)

[Pie chart over the five accents and "Other"; slice values as extracted: 5.7%, 1.6%, 2.3%, 77.8%, 3.8%, 8.8%]

South African broadcast news data

20 hours of SAFM broadcasts from 1996 to 2006:
- RD: newsreader speech, prepared; 27 speakers, 12.9 hours (BE, EE, IE)
- SI: studio interview speech, fairly spontaneous; 61 speakers, 0.6 hours
- NST: non-studio telephone speech, spontaneous; 262 speakers, 2.07 hours
- NS: non-studio wideband speech, noisy; 208 speakers, 1.54 hours

Accent is annotated for each sentence-level segment.
The test set (approximately 2.7 hours) is similar in composition to the training set.

System development

Speech recognition problem:

\[ \hat{W} = \arg\max_{W} P(W \mid X) = \arg\max_{W} p(X \mid W)\, P(W) \]

Models required:
1. Language model for P(W): 109M word corpus of newspaper text
2. Pronunciation dictionary for p(X | W): 60k word pronunciation dictionary
3. Acoustic model for p(X | W): 20h SABN corpus (previous slide)
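The step between the two forms of the decision rule above is Bayes' rule. Spelled out, and assuming the usual reading in which X is the sequence of acoustic observations and W a candidate word sequence (the slide does not define the symbols):

```latex
\begin{align*}
\hat{W} &= \arg\max_{W} P(W \mid X) \\
        &= \arg\max_{W} \frac{p(X \mid W)\, P(W)}{p(X)} \\
        &= \arg\max_{W} p(X \mid W)\, P(W)
\end{align*}
```

The denominator p(X) does not depend on W, so dropping it leaves the maximising word sequence unchanged.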

Language modelling

- 109M word corpus from South African newspapers, collected 2000 to 2005: The Financial Mail, Business Day, The Sunday Times, The Times, Sunday World, The Sowetan, The Herald, The Algoa Sun and The Daily Dispatch
- SRILM toolkit used to train trigram language models on the above text as well as on the transcriptions of the acoustic training set (185k words)
- Also considered interpolation of the two language models

Language model                      Perplexity
Trained on 109M newspaper corpus    162.9
Trained on acoustic training set    328.9
Interpolation of the above two      139.9
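As a rough illustration of what interpolating two language models and measuring perplexity involve, here is a minimal Python sketch. It uses toy unigram distributions rather than the trigram models trained with SRILM, and the probabilities, the weight `lam` and the function names are all hypothetical; none of them come from the slides.

```python
import math

def interpolate(p_a, p_b, lam):
    """Linear interpolation: lam * p_a(w) + (1 - lam) * p_b(w) for every word."""
    vocab = set(p_a) | set(p_b)
    return {w: lam * p_a.get(w, 0.0) + (1.0 - lam) * p_b.get(w, 0.0) for w in vocab}

def perplexity(model, words):
    """Perplexity is exp of the negative mean log-probability per word."""
    log_prob = sum(math.log(model[w]) for w in words)
    return math.exp(-log_prob / len(words))

# Toy unigram distributions (hypothetical numbers, not from the slides).
newspaper = {"the": 0.5, "rand": 0.3, "news": 0.1, "minister": 0.1}
transcripts = {"the": 0.4, "rand": 0.1, "news": 0.4, "minister": 0.1}
held_out = ["the", "news", "the", "minister", "rand", "news"]

# Equal weighting is an arbitrary choice for the sketch.
mixed = interpolate(newspaper, transcripts, lam=0.5)

for name, model in [("newspaper", newspaper),
                    ("transcripts", transcripts),
                    ("interpolated", mixed)]:
    print(f"{name}: perplexity {perplexity(model, held_out):.2f}")
```

In practice the interpolation weight would be tuned on held-out text rather than fixed at 0.5.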

Pronunciation dictionary

- Pronunciation dictionaries developed by a phonetic expert
- Reflect typical EE pronunciation
- Phone set: 45 ARPABET phones
- Training pronunciation dictionary: 15k words
- Recognition pronunciation dictionary: 60k words
- Average number of pronunciations per word: 1.25
- Out-of-vocabulary rate on test set: 1.02%
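The two dictionary statistics above can be computed along the following lines. This is a sketch under assumptions: the lexicon entries, the test tokens and the function names are made up for illustration, and the out-of-vocabulary rate is taken over running tokens, which the slide does not specify.

```python
def average_pronunciations(lexicon):
    """Mean number of pronunciation variants per dictionary word."""
    return sum(len(prons) for prons in lexicon.values()) / len(lexicon)

def oov_rate(lexicon, tokens):
    """Fraction of running tokens that have no entry in the lexicon."""
    missing = sum(1 for t in tokens if t not in lexicon)
    return missing / len(tokens)

# Hypothetical lexicon: word -> list of phone strings (illustrative only).
lexicon = {
    "news": ["n uw z", "n y uw z"],
    "the": ["dh ah", "dh iy"],
    "rand": ["r ae n d"],
    "today": ["t ah d ey"],
}

# Hypothetical test transcript; "braai" is not in the lexicon.
test_tokens = ["the", "rand", "news", "today", "the", "braai", "news", "the"]

print(f"pronunciations per word: {average_pronunciations(lexicon):.2f}")
print(f"OOV rate: {100 * oov_rate(lexicon, test_tokens):.2f}%")
```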

Acoustic modelling

Used HTK to train cross-word triphone HMMs.

[Flow diagram: initial triphone HMMs -> single-pass retraining -> MFCC HMMs (28.9%) and MF-PLP HMMs (27.7%) -> single-pass retraining -> four normalisation variants: per-segment CMN; per-bulletin CMN; per-segment CMN with per-bulletin CVN; per-bulletin CMN with per-bulletin CVN. Values shown at this level, as extracted: 25.1%, 24.6%, 26.9%, 26.4%]
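CMN and CVN stand for cepstral mean normalisation and cepstral variance normalisation. The sketch below shows one way to combine per-segment CMN with per-bulletin CVN in plain Python. It is not the HTK configuration used for the system: the function names and toy feature values are hypothetical, and it assumes the bulletin-level standard deviation is estimated from the raw frames of all segments in the bulletin.

```python
import math

def column_means(frames):
    """Mean of each feature dimension over a list of frames."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def column_stds(frames):
    """Population standard deviation of each feature dimension."""
    means = column_means(frames)
    n = len(frames)
    return [math.sqrt(sum((x - m) ** 2 for x in col) / n)
            for col, m in zip(zip(*frames), means)]

def normalise_bulletin(segments):
    """Per-segment mean removal, then scaling by per-bulletin std.

    segments: list of segments, each a list of frames (lists of floats).
    """
    all_frames = [frame for seg in segments for frame in seg]
    # Per-bulletin CVN statistics; floor avoids division by zero.
    stds = [max(s, 1e-8) for s in column_stds(all_frames)]
    normalised = []
    for seg in segments:
        means = column_means(seg)  # per-segment CMN statistics
        normalised.append([[(x - m) / s for x, m, s in zip(frame, means, stds)]
                           for frame in seg])
    return normalised

# Toy bulletin: two segments of 2-dimensional features (hypothetical values).
bulletin = [
    [[1.0, 10.0], [3.0, 14.0]],
    [[5.0, 20.0], [7.0, 24.0], [9.0, 28.0]],
]
normalised = normalise_bulletin(bulletin)
for seg in normalised:
    print([[round(x, 3) for x in frame] for frame in seg])
```

After normalisation every segment has zero mean in each dimension, while the scale is shared across the whole bulletin.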

Experimental results

Final system:
- Acoustic model set: 2624 states
- Features: mel-frequency perceptual linear prediction (MF-PLP)
- Normalisation: per-segment CMN, per-bulletin CVN
