Using Functional Load for Optimizing DPGMM based Zero Resource - PowerPoint PPT Presentation

Using Functional Load for Optimizing DPGMM based Zero Resource Sub-word Unit Discovery
Bin Wu 1, Sakriani Sakti 1,2, Jinsong Zhang 3 and Satoshi Nakamura 1,2
{wu.bin.vq9,ssakti,s-nakamura}@is.naist.jp, jinsong.zhang@blcu.edu.cn
1. Nara Institute of Science and Technology, Japan
2. RIKEN, Center for Advanced Intelligence Project AIP, Japan
3. Beijing Language and Culture University, China
2018/12/10

Background

Research Question
• How can phoneme-like units be found in zero-resource speech?
[Figure: utterance line-girl1-ohayou1 labelled with the phone sequence o h a y o o sil]

Why important
• Problem: zero-resource phoneme-like unit discovery
• Why is the problem important?
  • State-of-the-art DNNs need labels (phonemes, …)
  • Manual labelling needs money and effort
  • It also needs knowledge of the labels (phonological system, …)
  • Zero-resource technology helps to create these labels (phonemes, …)

Previous methods
• Unsupervised sub-word unit discovery in Zerospeech
  • Pre-trained labels + DNN
    • Spoken term detection + autoencoder [Badino, 2014; Kamper, 2015; Pitt, 2015]
    • Spoken term detection + ABNet [Synnaeve, 2014; Thiolliere, 2015]
  • Unsupervised clustering
    • Variational autoencoders [Ondel, 2016; Ebber, 2017]
    • Dirichlet Process Gaussian Mixture Model (DPGMM) clustering [Lee, 2012; Chen, 2015]
    • DPGMM + ASR feature transformations [Heck, 2016]
    • DPGMM + ASR alignment [Heck, 2017]
• DPGMM clustering achieved top results in the Zerospeech Challenges 2015 and 2017

Problem

Human cognitive process of phonemes
• Goal: audio -> phoneme-like units (e.g. o h a y o o sil)
• How do humans find the phonemes?
  • Top-down (contextual): knowledge interpretation using phone sequences, words, grammar and semantics
  • Bottom-up (acoustical): acoustic-to-category process, which is what DPGMM models
[Figure: human cognitive process of speech; the phone sequence o h a y (o o sil) shown with the DPGMM label sequence 1 2 3 4 1 1 5]

Problem 1: DPGMM is too sensitive to acoustics

Problems of DPGMM clustering
• Problem 1: DPGMM is too sensitive to acoustics
  • High-frequency acoustics produce many small DPGMM clusters (example: f, high frequency)
  • Rapid formant changes produce many small DPGMM clusters (example: i, rapid formant change)
  • The number of clusters exceeds the number of phonemes in typical languages
[Figure: DPGMM clustering results on the TIMIT training corpus; rows: DPGMM clusters, true phonemes, true words]

Problem 2: DPGMM is weak in contextual modelling

Contextual modelling
• Context is important
• Example: school /s k1 u:l/ and kite /k2 ait/
  • K1 and K2 are acoustically different
  • However, K1 always follows s, while K2 always follows a word boundary
  • K1 and K2 occur in completely different contexts
  • They belong to the same phoneme
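The school/kite observation can be checked mechanically. The following is a minimal sketch, assuming each word is given as a list of unit labels and that only the left neighbour counts as context; the function names and the `#` word-boundary symbol are illustrative choices, not taken from the slides.

```python
from collections import defaultdict

def left_contexts(words):
    """Map each unit to the set of units that precede it ('#' = word boundary)."""
    contexts = defaultdict(set)
    for units in words:
        previous = "#"  # assumed symbol for the word boundary
        for unit in units:
            contexts[unit].add(previous)
            previous = unit
    return contexts

def in_complementary_distribution(contexts, a, b):
    """True if units a and b never share a left context."""
    return contexts[a].isdisjoint(contexts[b])

# Toy transcriptions of "school" and "kite" from the slide.
words = [["s", "k1", "u:", "l"], ["k2", "ai", "t"]]
ctx = left_contexts(words)
print(in_complementary_distribution(ctx, "k1", "k2"))  # True
```

Here k1 only ever follows s and k2 only ever follows the word boundary, so the two labels never compete in the same position.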

Problems of DPGMM clustering
• Problem 2: DPGMM is weak in contextual modelling
  • Acoustically different sub-word units are always treated as different labels by DPGMM
  • This happens even when they occur in completely different contexts and belong to the same phoneme
• Example:
  • pack: /æ1/ after p; and: /æ2/ preceded by a word boundary
  • /æ1/ and /æ2/ are acoustically different but in complementary distribution
  • /æ1/ and /æ2/ belong to the same phoneme /æ/
[Figure: DPGMM clustering results on the TIMIT training corpus; rows: DPGMM clusters, true phonemes, true words]

Contextual modelling
• Context is important
• Assume B and 13 are two different phonemes that are acoustically similar
• Sometimes B appears between A and C; sometimes 13 appears between 12 and 14
• We can distinguish B from 13 by their specific contexts: A, C versus 12, 14

Problems of DPGMM clustering
• Problem 3: DPGMM is weak in contextual modelling
  • Context can help distinguish acoustically similar phonemes
• Example:
  • shed: /ʃ/ and fields: /s/
  • /ʃ/ and /s/ are acoustically similar
  • Only /s/ can follow /d/: fields cannot end in /d/ + /ʃ/
[Figure: DPGMM clustering results on the TIMIT training corpus; rows: DPGMM clusters, true phonemes, true words]

Problems of DPGMM
• Humans use context to distinguish phonemes
  • Acoustically different units in completely different contexts tend to be the same phoneme
  • Context also helps distinguish acoustically similar phonemes
• Problems of DPGMM
  • Weak in context modeling (top-down)
  • Sensitive to acoustics (bottom-up)

Proposal

Proposal
• But how do we deal with the contextual effects?
• Statement:
  • If two units can be easily distinguished by their context, the contrast between them is not important in communication (i.e. its functional load (FL) is small)
  • Equivalently, the contrast conveys little information in communication
  • In the extreme case, if two units occur in completely different contexts, FL = 0: the contrast conveys no information

Computation of functional load
• Functional load measures the importance of a contrast
• It is the information loss incurred by ignoring the contrast (Hockett, 1955)
• Functional load of the contrast between a label pair x and y:
  $$FL(x, y) = \frac{H(L) - H(L_{xy})}{H(L)}$$
  where H(L) is the entropy of the language L and L_{xy} is L with the contrast between x and y ignored
• E.g. in English, K1 (school /s k1 u:l/) and K2 (kite /k2 ait/) occur in completely different contexts
• Mathematically, FL(k1, k2) = 0
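A small worked example may help make the definition concrete: functional load as the relative entropy drop, (H(L) - H(L_xy)) / H(L), when labels x and y are no longer distinguished. The slides do not say how H(L) is estimated, so this sketch assumes a word-level estimate (entropy of the distribution over word types); the toy corpus and all function names are hypothetical.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (bits) of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def merge_labels(word_counts, x, y):
    """Rewrite every y as x; words that become identical pool their counts."""
    merged = Counter()
    for word, count in word_counts.items():
        merged[tuple(x if u == y else u for u in word)] += count
    return merged

def functional_load(word_counts, x, y):
    """Relative entropy loss when the x/y contrast is ignored."""
    h = entropy(word_counts)
    if h == 0:
        return 0.0  # a single word type carries no information to lose
    return (h - entropy(merge_labels(word_counts, x, y))) / h

# Hypothetical toy corpus: school, kite, bite (word -> count).
corpus = Counter({
    ("s", "k1", "u:", "l"): 2,
    ("k2", "ai", "t"): 1,
    ("b", "ai", "t"): 1,
})
print(functional_load(corpus, "k1", "k2"))  # 0.0
print(functional_load(corpus, "k2", "b"))   # about 0.33
```

Merging k1 and k2 leaves every word distinct, so no information is lost, whereas merging k2 and b collapses kite and bite into one word type and the entropy drops from 1.5 to 1.0 bits.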

System configuration
• Proposal: greedy merging based on the least functional load criterion
• Iteratively merge the DPGMM label pair with the lowest functional load, and enhance our features by ASR
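The merging loop can be sketched as follows. This is only an illustration under stated assumptions: functional load is computed as (H(L) - H(L_xy)) / H(L) with a word-level entropy estimate, the stopping rule is a target number of labels (`target_num_labels` is a hypothetical parameter), ties are broken by label order, and the ASR-based feature enhancement step is left out entirely.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(counts):
    """Shannon entropy (bits) of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def merge_labels(word_counts, x, y):
    """Rewrite every y as x; words that become identical pool their counts."""
    merged = Counter()
    for word, count in word_counts.items():
        merged[tuple(x if u == y else u for u in word)] += count
    return merged

def functional_load(word_counts, x, y):
    """Relative entropy loss when the x/y contrast is ignored."""
    h = entropy(word_counts)
    if h == 0:
        return 0.0
    return (h - entropy(merge_labels(word_counts, x, y))) / h

def greedy_merge(word_counts, target_num_labels):
    """Repeatedly merge the label pair with the lowest functional load."""
    counts = Counter(word_counts)
    merges = []
    while True:
        labels = sorted({u for word in counts for u in word})
        if len(labels) <= target_num_labels:
            break
        # Lowest functional load first; ties fall back to label order.
        x, y = min(combinations(labels, 2),
                   key=lambda pair: functional_load(counts, *pair))
        counts = merge_labels(counts, x, y)
        merges.append((x, y))
    return counts, merges

# Hypothetical toy corpus with 8 distinct labels.
corpus = Counter({
    ("s", "k1", "u:", "l"): 2,
    ("k2", "ai", "t"): 1,
    ("b", "ai", "t"): 1,
})
merged, merges = greedy_merge(corpus, target_num_labels=7)
print(merges, entropy(merged))
```

On a corpus this small many pairs tie at zero functional load, so the specific pair chosen is not meaningful; the point is that each step removes one label while losing as little information as possible.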

Experiment & Result

Experiment and result
• Xitsonga corpus
  • An excerpt of the NCHLT corpus of South African read speech (length: 2 h 29 min)
  • With the official segmentation of the Interspeech Zero Resource Speech Challenge 2015

Conclusion
• DPGMM is weak in context modeling and sensitive to acoustics
• We enhance the contextual modeling of DPGMM labels with the minimum functional load criterion
• Results show that we can obtain posteriorgrams of much lower dimension with similar ABX error

Thank you for listening
