Latvian Text-to-Speech Synthesizer - Mārcis Pinnis, Ilze Auziņa

Dec 19, 2022

Latvian Text-to-Speech Synthesizer
Mārcis Pinnis (Marcis.Pinnis@lumii.lv), Ilze Auziņa (Ilze.Auzina@lumii.lv)
Approach and Features
• AILAB IMCS UL Text-to-Speech system T2S V1
  – Concatenative text-to-speech system
  – The system features:
    • Variable-length speech fragment concatenation
      – diphones
      – full words
      – common phrases
      – multiple sound combination fragments
    • Punctuation and silence fragment length control
    • Rule-based text transcription (to obtain the phonetic representation of a text)
    • Audio fragment concatenation with interpolation at the concatenation points to smooth the signal
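The smoothing at concatenation points can be illustrated with a linear crossfade. This is only a minimal sketch under stated assumptions: the slides do not say which interpolation T2S V1 uses, and the function name `crossfade_concat` and the `overlap` parameter are hypothetical.

```python
def crossfade_concat(a, b, overlap):
    """Join two lists of audio samples, blending the end of `a` into the start of `b`.

    Assumption: a linear crossfade stands in for the interpolation step;
    the interpolation actually used by T2S V1 is not given in the slides.
    """
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        # Nothing to blend: plain concatenation.
        return list(a) + list(b)
    head = list(a[:len(a) - overlap])   # part of `a` left untouched
    tail = list(b[overlap:])            # part of `b` left untouched
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)     # weight of `b` rises across the overlap
        blended.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return head + blended + tail


if __name__ == "__main__":
    left = [1.0, 1.0, 1.0, 1.0]
    right = [0.0, 0.0, 0.0, 0.0]
    print(crossfade_concat(left, right, overlap=3))
```

The joined signal is shorter than the two fragments combined by `overlap` samples, because the overlapping region is mixed rather than appended.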
T2S V1 - Domain Oriented System
• The flexible speech fragment length allows domain orientation to achieve better synthesis results
  – T2S V1 is domain oriented for weather forecasts
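One way to picture how variable fragment length supports domain orientation is a greedy longest-match over the recorded inventory: long domain phrases are used whole where available, and anything missing falls back to diphones. The sketch below is an assumption for illustration, not the selection algorithm of T2S V1; `select_fragments`, the `max_len` parameter, and the sample inventory are hypothetical.

```python
def select_fragments(words, inventory, max_len=4):
    """Plan synthesis by preferring the longest recorded phrase at each position.

    `inventory` is a set of recorded words and phrases (space-joined).
    Assumption: greedy longest-match is a stand-in for the real selection logic.
    """
    plan = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in inventory:
                plan.append(("recorded", candidate))
                i += n
                break
        else:
            # No recorded word or phrase starts here: synthesize from diphones.
            plan.append(("diphones", words[i]))
            i += 1
    return plan


if __name__ == "__main__":
    # Made-up weather-forecast inventory, for illustration only.
    inventory = {"rīt gaidāms", "gaidāms", "lietus"}
    print(select_fragments("rīt gaidāms stiprs lietus".split(), inventory))
```

With a domain-specific inventory, most of a typical sentence is covered by whole recorded phrases, which leaves fewer concatenation points to smooth.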
Issues in Development
• Several issues arose in the development of the T2S V1 speech synthesis system:
  – Orthographic ambiguities in characters
    • “e” - /e/ “egle”, /{/ “ezers”
    • “ē” - /e:/ “ēvele”, /{:/ “ēka”
    • “o” - /uo/ “ola”, /o/ “omlete”, /o:/ “oda”
  – Sound segment alignment isn’t always smooth
  – Synthesized speech is too neutral: prosody is not modeled
  – The system’s current speed is not suitable for on-the-fly applications
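The orthographic ambiguity can be made concrete by enumerating every transcription a purely letter-based rule set would have to choose between. This is a rough sketch: the three ambiguous letters and their phonemes come from the list above, while passing every other letter through unchanged is a simplifying assumption, and `candidate_transcriptions` is a hypothetical helper.

```python
from itertools import product

# Phoneme options per ambiguous letter, taken from the slide's examples.
AMBIGUOUS = {
    "e": ["e", "{"],
    "ē": ["e:", "{:"],
    "o": ["uo", "o", "o:"],
}


def candidate_transcriptions(word):
    """Return every transcription reachable from the ambiguous letters alone.

    Assumption: all other letters map to themselves, which is not true of
    full Latvian transcription and is used here only to isolate the problem.
    """
    options = [AMBIGUOUS.get(ch, [ch]) for ch in word.lower()]
    return [" ".join(choice) for choice in product(*options)]


if __name__ == "__main__":
    for word in ("ola", "koks", "ezers"):
        print(word, candidate_transcriptions(word))
```

The number of candidates multiplies with each ambiguous letter, so a word with two occurrences of “e” already has four possible readings before any context is considered.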
Unsolvable Issues
• Latvian orthography uses “e” and “o” for more than one phoneme each, so the right pronunciation cannot always be determined from spelling alone.
  – “ēdu” – is it present or past?
  – “koks” – is it a microorganism or a tree?
  – “deva” – is it a noun or a verb?
• Such issues can be solved only if the context is large enough to guess the right form. If the context is not present (consider the sentence “Es ēdu pusdienas.”) or is not wide enough, prediction is theoretically impossible.
Demonstration
The Perspective of Further Research
• The system may be improved in three ways:
  – By introducing better NLP solutions:
    • Context-dependent abbreviation analysis
    • Context-dependent numeric transformation analysis
    • Context-dependent morphological analysis
    • Sentence- and word-level prosody analysis
    • A phonetic dictionary, necessary to minimize the impact of wrong rule application
  – By introducing better low-level synthesis:
    • Usage of PSOLA and RELP approaches for prosody control
    • Alternative: switch to HMM-based unit selection speech synthesis (for instance, HTS)
    • Algorithm optimization (solves the speed issue)
  – By introducing a higher-quality speech corpus:
    • Better target domain vocabulary coverage
    • Better speech fragment alignment
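The phonetic dictionary idea can be sketched as a lookup that runs before the transcription rules, so that listed words never reach a rule that might misfire. Everything here is an assumption for illustration: `LEXICON`, `DEFAULT_RULES`, and `transcribe` are hypothetical, the two entries reuse the /uo/ and /o:/ examples from the slides with the remaining letters passed through unchanged, and the single default rule is not the system's actual rule set.

```python
# Hypothetical exception dictionary: word -> space-separated phonemes.
LEXICON = {
    "ola": "uo l a",
    "oda": "o: d a",
}

# Hypothetical default letter rule, applied only when the word is not listed.
DEFAULT_RULES = {"o": "uo"}


def transcribe(word, lexicon=LEXICON, rules=DEFAULT_RULES):
    """Look the word up first; fall back to letter rules only if it is missing."""
    key = word.lower()
    if key in lexicon:
        return lexicon[key]
    # Letters without a rule are passed through unchanged (simplification).
    return " ".join(rules.get(ch, ch) for ch in key)


if __name__ == "__main__":
    print(transcribe("oda"))              # found in the dictionary
    print(transcribe("oda", lexicon={}))  # same word forced through the rules
```

Comparing the two calls shows the intended effect: the dictionary entry overrides the default rule for a word where that rule would pick the wrong phoneme.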
THANK YOU ;o)
