DATA COLLECTION & PREPARATION FOR SPEECH SYSTEMS Chevy Levitan - PowerPoint PPT Presentation

Sep 26, 2023

DATA COLLECTION & PREPARATION FOR SPEECH SYSTEMS Chevy Levitan Mentor: Erica Cooper Director: Dr. Julia Hirschberg
OBJECTIVE Gather and process data for global speech technologies.
PROJECTS I. ENGLISH -> TTS II. LOW-RESOURCE LANGUAGES -> KEYWORD SEARCHING ○ Background ○ Methods ○ Status ○ Future work
TTS >> BACKGROUND ○ About
Method | Description | Pros | Cons
Concatenative | form words by stringing together small units of speech | natural sounding, easy to implement | expensive, rigid, large databases
HMM-based | generate waveforms from HMMs | context-dependent, flexible, smaller databases, robust | sounds synthetic
TTS >> BACKGROUND ○ Applications ■ assistive technology - blind - speech impaired ■ phones - caller id - driving settings
TTS >> BACKGROUND ○ Process
TTS >> BACKGROUND Boston Radio Corpus: ○ Designed for TTS ○ 7 speakers ○ 7+ hours of clean audio ○ Transcriptions
TTS >> METHODS Paragraph -> Sentence: ○ Training segments should be sentence-length rather than paragraph-length ○ Split both the text and the audio ○ Each sentence is identified by its speaker and a number (ex: f1a_0001.txt)
TTS >> METHODS Paragraph -> Sentence: ○ Text a. find each period ('.') in the paragraph b. apply a list of rules for abbreviations c. send each sentence to its own .txt file ○ Audio a. find each period ('.') in the .txt file b. look up the timing of the following word in the .wrd file c. trim the audio with sox (ex: sox src dest trim start dur)
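The paragraph-to-sentence steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the project's actual script: the abbreviation list, the function names, and the assumption that the .wrd timings have already been parsed into one start time per word (aligned one-to-one with the words in the text) are all hypothetical. The sox call is only built as an argument list, not executed.

```python
# Hypothetical abbreviation rules; the slides mention such a list
# but do not enumerate it.
ABBREVIATIONS = {"mr.", "mrs.", "ms.", "dr.", "st."}


def split_sentences(paragraph):
    """Split on '.' unless the word ending in '.' is a known abbreviation."""
    sentences, current = [], []
    for token in paragraph.split():
        current.append(token)
        if token.endswith(".") and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without a final period
        sentences.append(" ".join(current))
    return sentences


def sentence_id(speaker, index):
    """Name a sentence by speaker and number, e.g. f1a_0001."""
    return f"{speaker}_{index:04d}"


def sentence_spans(sentences, word_starts, total_duration):
    """Return (start, duration) in seconds for each sentence.

    word_starts is assumed to hold the start time of every word, in order,
    as read from the .wrd file. A sentence ends where the following word
    starts, or at the end of the audio for the last sentence.
    """
    spans, idx = [], 0
    for sentence in sentences:
        start = word_starts[idx]
        idx += len(sentence.split())
        end = word_starts[idx] if idx < len(word_starts) else total_duration
        spans.append((start, end - start))
    return spans


def sox_trim_command(src, dest, start, duration):
    """Build the argument list for: sox src dest trim start dur."""
    return ["sox", src, dest, "trim", f"{start:.2f}", f"{duration:.2f}"]


paragraph = "Dr. Smith arrived. He spoke to Mr. Jones."
sentences = split_sentences(paragraph)
spans = sentence_spans(sentences, [0.0, 0.4, 0.9, 1.5, 1.8, 2.2, 2.4, 2.7], 3.5)
commands = [
    sox_trim_command("f1a.wav", sentence_id("f1a", i) + ".wav", start, dur)
    for i, (start, dur) in enumerate(spans, start=1)
]
```

Each command list could then be passed to subprocess.run on a machine where sox is installed.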
TTS >> METHODS HTS-Speaker Adaptive Demo: ❏ Install demo ❏ Configure with default parameters ❏ Configure with our data
TTS >> STATUS HTS-Speaker Adaptive Demo: ✓ Install demo ✓ Configure with default parameters → Configure with our data
KS >> BACKGROUND Low-resource Languages: ○ Languages that have limited tools at their disposal ○ English is high-resource: TTS, ASR… ○ Need data to build resources
KS >> BACKGROUND ○ Where can we find lots of audio and text data for low-resource languages? ○ Internet → Free → Accessible → Global
KS >> BACKGROUND PROBLEM: photos, logos, animations, advertisements...
KS >> BACKGROUND SOLUTION: BEAUTIFUL SOUP.
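The idea of keeping a page's text while dropping images, scripts, and styling can be sketched as below. To stay dependency-free, this sketch swaps in Python's standard-library html.parser instead of Beautiful Soup; with Beautiful Soup installed, BeautifulSoup(html, "html.parser").get_text() serves a similar purpose. The class name, function name, and sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the page's text content as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up page standing in for a scraped blog post.
sample = (
    "<html><head><style>p {color: red}</style></head><body>"
    '<img src="logo.png"><p>First post</p>'
    "<script>var ad = 1;</script><p>Second post</p></body></html>"
)
text = extract_text(sample)
```

Tags such as img carry no text of their own, so they disappear without special handling; only script and style need an explicit skip.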
KS >> METHODS ❏ Select language ❏ Find useful websites ❏ Scrape
KS >> METHODS ✓ Language Telugu ✓ Blogs 1. http://mahojas.blogspot.com/ 2. http://yaramana.blogspot.com/ 3. http://ishtapadi.blogspot.com/ ✓ Scrape
KS >> METHODS EXAMPLE : http://mahojas.blogspot.com/ text sample:
KS >> STATUS ○ Languages: Telugu, Lithuanian ○ Scraped ~500 web pages ○ Word count: > 100,000
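A word count over scraped pages can be computed along these lines. This is a sketch that assumes whitespace tokenization, which is a simplification and not necessarily how the count on the slide was produced; the function name and sample pages are hypothetical.

```python
from collections import Counter


def word_counts(pages):
    """Count whitespace-separated words across a list of page texts."""
    counts = Counter()
    for text in pages:
        counts.update(text.split())
    return counts


# Made-up page texts standing in for scraped content.
pages = ["one two two", "three two"]
counts = word_counts(pages)
total_words = sum(counts.values())   # running word total
vocabulary_size = len(counts)        # number of distinct words
```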
FUTURE WORK ○ Data selection ○ Audio scraping ○ Scrape other languages → Tok Pisin → Cebuano → Kurmanji Kurdish → Kazakh ○ Build synthesizer for low-resource languages
THANK YOU!
