Analysis of speech
Dr. Anil Kumar Vuppala, IIIT Hyderabad

Oct 11, 2023

Analysis of speech
Representing speech signal on a digital computer
• Sampling and quantization
Representing information present in speech
• Extraction of parameters
Method of analysis is application dependent
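The two steps named above, sampling and quantization, can be sketched as follows. This is a minimal illustration, not the presenter's code: the 100 Hz test tone, the 8 kHz sampling rate, the 16-bit word length, and the function names are all assumptions chosen for the example.

```python
import math

def sample_sine(freq_hz, fs_hz, n_samples):
    """Sample a unit-amplitude sine of freq_hz at sampling rate fs_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]

def quantize(x, n_bits):
    """Uniformly quantize values in [-1, 1] to signed integer codes of n_bits."""
    max_level = 2 ** (n_bits - 1) - 1          # e.g. 32767 for 16 bits
    codes = []
    for v in x:
        v = max(-1.0, min(1.0, v))             # clip to the representable range
        codes.append(int(round(v * max_level)))
    return codes

# Assumed example values: one period of a 100 Hz tone at 8 kHz, 16-bit codes.
samples = sample_sine(freq_hz=100.0, fs_hz=8000.0, n_samples=80)
codes = quantize(samples, n_bits=16)
```

Fewer bits give a coarser grid of levels and therefore a larger quantization error per sample.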
Types of analysis based on segment duration
• Segmental (10 – 50 ms): short-time spectrum, formants, pitch
• Subsegmental (1 – 5 ms): excitation source characteristics, glottal closure
• Suprasegmental (> 100 ms): prosodic features such as intonation, duration, energy contour
Preprocessing: Preemphasis
• Primarily used for emphasizing high-frequency components with respect to low-frequency components
• High-pass filtering removes envelope
$y(n) = s(n) - a\,s(n-1)$
$H(z) = \frac{Y(z)}{S(z)} = 1 - a z^{-1}$
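The preemphasis difference equation and its frequency response can be sketched as below. The default a = 0.97 is a commonly used value and an assumption here (the slide does not give one); the function names and the convention s(-1) = 0 are also illustrative choices.

```python
import cmath
import math

def preemphasize(s, a=0.97):
    """Apply y(n) = s(n) - a * s(n-1), assuming s(-1) = 0."""
    y = []
    prev = 0.0
    for sample in s:
        y.append(sample - a * prev)
        prev = sample
    return y

def preemphasis_gain(a, omega):
    """Magnitude of H(z) = 1 - a z^-1 on the unit circle, omega in rad/sample."""
    return abs(1 - a * cmath.exp(-1j * omega))

# A constant (zero-frequency) input is strongly attenuated after the first sample.
flat = preemphasize([1.0, 1.0, 1.0], a=0.5)
low_gain = preemphasis_gain(0.97, 0.0)        # gain at DC is 1 - a
high_gain = preemphasis_gain(0.97, math.pi)   # gain at half the sampling rate is 1 + a
```

The gain rises from 1 - a at zero frequency to 1 + a at half the sampling rate, which is the high-frequency emphasis the slide describes.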
Short-time Analysis
Speech signal is quasi-stationary
Block processing or short-time analysis
Issues: window shape and size
Methods
- Short-time spectrum analysis
- Filter bank analysis
- Spectrographic analysis
- Linear prediction analysis
- Cepstral analysis
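Block processing can be sketched as splitting the signal into overlapping frames and applying a window to each. The Hamming window, the 25 ms frame (200 samples at 8 kHz), and the 10 ms hop (80 samples) are assumptions for illustration; the slide only says that window shape and size are the issues to settle.

```python
import math

def hamming(n_len):
    """Hamming window of n_len points (symmetric form)."""
    if n_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1))
            for n in range(n_len)]

def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(signal, frame_len, hop):
    """Energy of each windowed frame, one value per frame."""
    w = hamming(frame_len)
    return [sum((x * wi) ** 2 for x, wi in zip(frame, w))
            for frame in frames(signal, frame_len, hop)]

# Assumed example: 1000 samples at 8 kHz, 25 ms frames with a 10 ms hop.
signal = [math.sin(2 * math.pi * 200.0 * n / 8000.0) for n in range(1000)]
energy = short_time_energy(signal, frame_len=200, hop=80)
```

Any of the listed methods (spectrum, linear prediction, cepstrum) can then be applied per frame in place of the energy computation.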
Filter bank analysis: Nonlinear frequency scales
• Human ear is frequency selective
• Higher resolution at low frequencies, lower resolution at high frequencies
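One widely used nonlinear frequency scale is the mel scale. The slide does not name a specific scale, so taking mel as the example is an assumption, as are the function names and the 0 to 4000 Hz range. The sketch places filter centers uniformly on the mel axis and maps them back to Hz.

```python
import math

def hz_to_mel(f_hz):
    """Common mel-scale formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_low_hz, f_high_hz):
    """Center frequencies (Hz) of n_filters spaced uniformly on the mel axis."""
    m_low, m_high = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + step * (k + 1)) for k in range(n_filters)]

# Assumed example: 10 filters between 0 and 4000 Hz.
centers = mel_center_frequencies(10, 0.0, 4000.0)
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

The gaps between neighboring centers grow with frequency, so the bank is dense at low frequencies and sparse at high frequencies.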
Spectrographic Analysis: Narrowband and Wideband
Linear prediction analysis
LP residual gives an estimate of the excitation source
Normalized LP error (residual-to-signal energy ratio) is useful in the analysis of different sounds and in voiced/nonvoiced (V/NV) detection
Peaks in the Hilbert envelope of the residual signal correspond to the glottal closure instants (GCIs)
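A minimal sketch of LP analysis, assuming the autocorrelation method with the Levinson-Durbin recursion (the slide does not say which formulation is used). The function names and the synthetic one-pole test signal are illustrative; the Hilbert-envelope step is not covered here.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[k] = sum_n x[n] * x[n-k] for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for the inverse filter A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.

    Returns (a, err) where err is the final prediction error energy.
    Assumes r[0] > 0 (a signal with nonzero energy).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def lp_residual(x, a):
    """Residual e(n) = sum_k a[k] * x(n-k), with samples before n = 0 taken as 0."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]

# Synthetic test signal: impulse response of 1 / (1 - 0.9 z^-1).
x = [0.9 ** n for n in range(200)]
r = autocorr(x, 2)
a, err = levinson_durbin(r, 2)
residual = lp_residual(x, a)
normalized_error = err / r[0]               # residual-to-signal energy ratio
```

For this signal the residual collapses back to the single exciting impulse, which is the sense in which the residual estimates the excitation source.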
Spectral Envelope via LP analysis
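Given LP coefficients, the spectral envelope can be sketched as the magnitude response of the all-pole model, gain / |A(e^{jω})|. The coefficient convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p, the function name, and the example pole location are assumptions for illustration.

```python
import cmath
import math

def lp_spectral_envelope(a, gain=1.0, n_points=256):
    """Evaluate gain / |A(e^{jw})| at n_points frequencies from 0 to pi.

    a is the inverse-filter coefficient list [1, a1, ..., ap]; n_points >= 2.
    """
    envelope = []
    for i in range(n_points):
        omega = math.pi * i / (n_points - 1)
        a_val = sum(ak * cmath.exp(-1j * omega * k) for k, ak in enumerate(a))
        envelope.append(gain / abs(a_val))
    return envelope

# Assumed example: one resonance from a pole pair at radius 0.95, angle pi/4.
radius, theta = 0.95, math.pi / 4
a_resonant = [1.0, -2.0 * radius * math.cos(theta), radius * radius]
env = lp_spectral_envelope(a_resonant, n_points=181)   # 1 degree per point
peak_index = max(range(len(env)), key=lambda i: env[i])
```

The envelope peaks close to the pole angle, which is how formant locations show up in an LP spectrum.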
Cepstral Analysis
• Cepstrum is computed as the IDFT of the log-magnitude spectrum
• Helps separate system and source information
• Provides a compact representation of the spectral envelope
• Can be evaluated from the short-time (DFT) spectrum or the LP spectrum
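The definition above (IDFT of the log-magnitude spectrum) can be sketched with a direct DFT. The naive O(N²) transform, the small floor that guards against log(0), and the function names are choices made for this example; a real system would use an FFT.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT (O(N^2)); adequate for short illustrative frames."""
    n_len = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_len)
                for n in range(n_len))
            for k in range(n_len)]

def real_cepstrum(x, floor=1e-12):
    """Real cepstrum: inverse DFT of log |X(k)|. floor avoids log(0)."""
    n_len = len(x)
    log_mag = [math.log(max(abs(v), floor)) for v in dft(x)]
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * n / n_len)
                for k in range(n_len)).real / n_len
            for n in range(n_len)]

# Test frame: x(n) = delta(n) + 0.5 * delta(n-1), zero-padded to 64 samples.
frame = [1.0, 0.5] + [0.0] * 62
cep = real_cepstrum(frame)
```

For this frame the series expansion of log(1 + 0.5 z^-1) gives cep[1] ≈ 0.25 and cep[2] ≈ -0.0625. In a speech frame, the first few coefficients describe the spectral envelope.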
Thank you
