TTS and Data Selection: Improving Systems for Low-Resource Languages - PowerPoint PPT Presentation

TTS and Data Selection: Improving Systems for Low-Resource Languages Chevy Levitan, DREU 2015

outline I. Project II. Approach III. Methods IV. Status V. Future

I. PROJECT synthesize natural, intelligible voices for low-resource languages using data selection

motivation ▷ bridge the gap ▷ allow for cross-language communication

why data selection?

HRLs vs. LRLs
HRLs: prepared data ★ abundance of training material ★ high-quality speech systems
LRLs: found data ★ limited training material ★ low-quality speech systems

A. filter out unwanted data from training set B. supplement limited LRL data with choice data from similar HRL

II. APPROACH preparing the experiment

corpus ▷ Boston Radio News Corpus ▷ pre-processed ▷ English

data selection process ▷ extract features ▷ sort values ▷ create subsets ▷ synthesize data
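The selection steps on this slide (extract features, sort values, create subsets) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the `Utterance` record, the `mean_f0` feature, the sample values, and the `keep_fraction` parameter are all hypothetical, and the slides do not say which features or cut-offs were used.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    utt_id: str
    mean_f0: float  # hypothetical per-utterance acoustic feature (Hz)


def select_subset(utterances, feature, keep_fraction=0.5, descending=False):
    """Sort utterances by one feature value and keep the leading slice.

    `feature` names an attribute on each utterance; `keep_fraction` is the
    share of the sorted list to keep (an illustrative knob, not from the slides).
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    ranked = sorted(utterances, key=lambda u: getattr(u, feature), reverse=descending)
    keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:keep]


# Made-up feature values, purely for illustration.
corpus = [
    Utterance("utt_a", 210.0),
    Utterance("utt_b", 180.0),
    Utterance("utt_c", 250.0),
    Utterance("utt_d", 195.0),
]

# Keep the half of the corpus with the lowest feature values.
subset = select_subset(corpus, "mean_f0", keep_fraction=0.5)
print([u.utt_id for u in subset])
```

Each subset produced this way would then be passed to the voice-building step, so that voices trained on different subsets can be compared against a voice trained on the complete dataset.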

evaluate. compare/contrast voices

example VOICE 1 VOICE 2

solution 1. subset data 2. complete dataset

III. METHODS testing our hypothesis

standards ★ follow standard procedures for evaluating TTS voices ★ successful voice = intelligible + natural ★ use crowdsourcing for unbiased results

mechanical turk
Intelligibility: ➔ transcribe nonsense sentences ➔ accurate transcription = intelligible voice
Naturalness: ➔ use Likert scale to rate voices from very unnatural to very natural ➔ identify the voices that are categorized as natural+
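One way the two measures on this slide might be scored is sketched below. The slides do not name a scoring method, so everything here is an assumption: word error rate (WER) is used as a common stand-in for "accurate transcription", the Likert scale is assumed to have five points, and "natural+" is read as a rating of 4 or higher. The sentences and ratings are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Standard dynamic-programming Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def summarize_likert(ratings, natural_threshold=4):
    """Return (mean rating, share of ratings at or above the threshold).

    Assumes a 1-5 scale where 1 = very unnatural and 5 = very natural.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    mean = sum(ratings) / len(ratings)
    share = sum(r >= natural_threshold for r in ratings) / len(ratings)
    return mean, share


# Invented nonsense sentence and a worker transcription with one wrong word.
wer = word_error_rate("the green dog sang a table", "the green dog sang the table")
mean_rating, natural_share = summarize_likert([5, 4, 2, 3, 4])
print(wer, mean_rating, natural_share)
```

A lower WER would suggest a more intelligible voice, and a higher share of ratings at or above the threshold would suggest a more natural one. Averaging over many workers and sentences would be needed before drawing any conclusion.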

IV. STATUS our current state

intelligibility HIT ✓ create subsets ✓ synthesize voices with this data ✓ design and implement HIT ✓ publish on MTurk site ✓ workers complete HITs ✓ accept/reject work

naturalness HIT ✓ create subsets ✓ synthesize voices with this data ✓ design and implement HIT - publish on MTurk site - workers complete HITs - accept/reject work

V. FUTURE further exploration of this research

evaluation: analyze Mechanical Turk responses
low-resource: implement data selection for LRLs
text: apply similar methods to automatically select text data

Thanks! Any questions?
