Information Transmission Chapter 3, text and speech OVE EDFORS

Information Transmission Chapter 3, text and speech OVE EDFORS ELECTRICAL AND INFORMATION TECHNOLOGY

Learning outcomes
Understand some of the most important concepts regarding
● information and its representation (bits, bandwidth, SNR),
● how to perform decibel calculations,
● what text is and how it can be coded,
● signal frequency content/components and spectrum,
● voice generation and properties,
● audio quality measures, and
● basics of (digital) audio/music recording.

Where are we in the BIG PICTURE?

Some concepts
• Bits
  – Small pieces of information
  – One bit is the information in a 2-valued variable
• Bandwidth
  – The range of frequencies a signal occupies, as seen in its Fourier transform
  – (Also used for the number of bits/s from a source)
• Signal-to-noise ratio (SNR)
  – Average signal power / average noise power

Decibel - dB
• Convenient when comparing values with a really small difference or a really large one
• If A and B are power values: $10 \log_{10}(A/B)$ dB
• Or if A and B are amplitude values: $20 \log_{10}(A/B)$ dB
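The two decibel rules can be tried out with a small sketch. It assumes the usual definitions, 10·log10(A/B) for power values and 20·log10(A/B) for amplitude values; the function names are made up for this illustration.

```python
import math


def power_ratio_db(a, b):
    """Ratio A/B of two power values, expressed in dB."""
    return 10 * math.log10(a / b)


def amplitude_ratio_db(a, b):
    """Ratio A/B of two amplitude values, expressed in dB."""
    return 20 * math.log10(a / b)


def db_to_power_ratio(db):
    """Inverse of power_ratio_db: convert dB back to a plain power ratio."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    # Doubling the power adds roughly 3 dB.
    print(round(power_ratio_db(2, 1), 2))
    # A tenfold amplitude increase is a hundredfold power increase.
    print(round(amplitude_ratio_db(10, 1), 2), round(power_ratio_db(100, 1), 2))
    # A 40 dB signal-to-noise ratio as a plain power ratio.
    print(round(db_to_power_ratio(40)))
```

The factor 20 for amplitudes follows from power being proportional to amplitude squared, so both rules give the same dB value for the same physical change.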

What is text?
Definition: A collection of letters (numbers, symbols, …) to form words (math figures, software, crypto-text, …).
Symbols come from a set called the alphabet.
Do we have any standard alphabets?

ASCII: American Standard Code for Information Interchange
[Figure from textbook: ASCII table]

A different type of ASCII table
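As a small illustration of how text is coded with ASCII, the sketch below maps each character to its code number and a 7-bit pattern. The helper name `to_ascii_bits` is hypothetical; it relies only on Python's built-in `ord`, which agrees with ASCII for code values 0 to 127.

```python
def to_ascii_bits(text):
    """Return (character, decimal code, 7-bit pattern) for each character."""
    rows = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            # Outside the 7-bit ASCII alphabet (e.g. 'å' or '€').
            raise ValueError(f"{ch!r} is not an ASCII character")
        rows.append((ch, code, format(code, "07b")))
    return rows


if __name__ == "__main__":
    for ch, code, bits in to_ascii_bits("Text"):
        print(ch, code, bits)
```

Each character costs 7 bits here; in practice ASCII text is usually stored with one 8-bit byte per character.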

Frequency and bandwidth

Frequency
Sinusoidal signals: [Figure: a sinusoid plotted against time, with one cycle (period) marked]
Frequency = number of cycles per second [Hertz]
Example: The AC power in your home has a frequency of 50 Hertz. This also means that the cycle time is 1/50 s = 20 ms.

Adding sinusoids [1]
Add a 25 Hz sinusoid and a 50 Hz sinusoid. What frequency does the sum have? The sum is no longer a pure sinusoid. It contains TWO frequencies.

Adding sinusoids [2]
Can we build "any" signal by adding sinusoids? Yes!
Add sinusoids at 50 Hz, 100 Hz, 150 Hz, 200 Hz, 250 Hz, ... After an infinite number of sinusoids we get a sawtooth signal!
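The build-up of a sawtooth from sinusoids can be sketched numerically. The slide does not state the amplitudes, so this example assumes the standard Fourier series of a sawtooth that ramps from -1 to 1 over each period: harmonic n has amplitude 2/(πn) and alternating sign. The 50 Hz fundamental is taken from the slide; the function names are illustrative.

```python
import math


def sawtooth_partial_sum(t, f0=50.0, n_terms=5):
    """Sum of the first n_terms harmonics (f0, 2*f0, ...) of a sawtooth."""
    return (2 / math.pi) * sum(
        (-1) ** (n + 1) * math.sin(2 * math.pi * n * f0 * t) / n
        for n in range(1, n_terms + 1)
    )


def ideal_sawtooth(t, f0=50.0):
    """Reference sawtooth: ramps from -1 to 1 each period, zero at t = 0."""
    phase = (t * f0 + 0.5) % 1.0
    return 2 * phase - 1


if __name__ == "__main__":
    t = 0.002  # 2 ms into the 20 ms period
    for n_terms in (1, 5, 50, 2000):
        approx = sawtooth_partial_sum(t, n_terms=n_terms)
        print(n_terms, round(approx, 4), "target", round(ideal_sawtooth(t), 4))
```

With five sinusoids (50 to 250 Hz) the sum already follows the ramp roughly; adding more terms brings it closer everywhere except right at the jump.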

Spectrum

Spectrum [1]
If we can build any signal by adding sinusoids ... can we view the frequency content of a signal in some way?
[Figure: the amplitude spectrum of the "sawtooth signal"; amplitude versus frequency, frequency axis from 0 to 250 Hz]
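One way to view the frequency content numerically is a discrete Fourier transform (DFT) of one period of the signal. The sketch below is an illustration, not the method used for the slide: it assumes a 50 Hz sawtooth ramping from -1 to 1, sampled at 8000 samples/s, and uses a plain (slow) DFT so that no external library is needed.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate):
    """Return (frequency in Hz, amplitude) pairs from a direct DFT."""
    n = len(samples)
    spectrum = []
    for m in range(n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * m * k / n)
            for k, x in enumerate(samples)
        )
        # Non-DC components appear at +f and -f, hence the factor 2.
        scale = 1 if m == 0 else 2
        spectrum.append((m * sample_rate / n, scale * abs(coeff) / n))
    return spectrum


if __name__ == "__main__":
    sample_rate = 8000                 # samples per second (assumed)
    f0 = 50                            # sawtooth fundamental in Hz
    n = sample_rate // f0              # samples in one period
    saw = [2 * k / n - 1 for k in range(n)]
    for freq, amp in amplitude_spectrum(saw, sample_rate)[1:6]:
        print(f"{freq:6.1f} Hz  amplitude {amp:.3f}")
```

The printed amplitudes fall off roughly as 1/n for the n-th multiple of 50 Hz, which is the decaying bar pattern an amplitude spectrum of a sawtooth shows.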

The vocal tract
• The vocal cords produce the tone; the rest forms the sound
• Voiced sounds/unvoiced sounds
• 5-10 sounds/s in speech
[Figure labels: soft palate, epiglottis, esophagus, vocal cords, trachea]

Voiced/unvoiced sounds

Frequency content of speech
• Main energy in 100-800 Hz (speaker recognition)
• 800 Hz-4 kHz (intelligibility range)
• Less than 1% above 4 kHz
[Figure labels: fundamental, harmonics]

Demo: Audio analyzer

Standard phone line
• 40 dB signal-to-noise ratio (SNR) desired
• 4 kHz bandwidth
• Uses uncompressed PCM, as opposed to cell phones, where there is speech coding

3 bit PCM
• $2^3 = 8$ regions (bins)
• A deviation means an error – noise
• $\mathrm{SNR} = 6b - C_0$ dB, with $b$ bits per sample
• If $C_0 = 7.3$ ... how many bits do you need?
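The question on this slide can be worked through with a few lines of code. The sketch assumes the rule of thumb SNR = 6b - C0 dB with C0 = 7.3 from the slide, and takes the 40 dB target from the standard phone line requirement; the function names are illustrative.

```python
import math


def pcm_snr_db(bits, c0=7.3):
    """Approximate SNR in dB of b-bit PCM, using SNR = 6*b - C0."""
    return 6 * bits - c0


def bits_for_snr(target_db, c0=7.3):
    """Smallest whole number of bits b such that 6*b - C0 >= target_db."""
    return math.ceil((target_db + c0) / 6)


if __name__ == "__main__":
    print("levels with 3 bits:", 2 ** 3)
    print("SNR with 3 bits:", round(pcm_snr_db(3), 1), "dB")
    b = bits_for_snr(40)
    print("bits needed for 40 dB:", b)
    print("SNR with that many bits:", round(pcm_snr_db(b), 1), "dB")
```

Solving 6b - 7.3 ≥ 40 gives b ≥ 7.9, so 8 bits per sample are needed, which matches the 8 bit/sample used for telephone-quality PCM in the summary.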

Reconstruction error

How often do you have to sample?
You need this simple version of the Sampling Theorem to solve Chapter 3 problems. We will go through it in more detail later.
A continuous-time signal $x(t)$ whose frequency components are all below some largest frequency $f$ Hz is completely characterized by samples of the signal taken $T_s$ seconds apart, $x(kT_s)$, as long as the sampling frequency $f_s = 1/T_s > 2f$.
In "plain" English: If you sample a signal at more than TWICE the largest frequency present in the signal, you can completely reconstruct the entire signal from those samples.
Example: A speech signal with frequency components below $f = 4$ kHz needs to be sampled at $f_s = 8$ kHz, i.e. every $T_s = 1/8000$ second (125 µs).
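What goes wrong when the condition f_s > 2f is violated can be shown with a small experiment. The frequencies below are chosen for illustration only: at 8000 samples/s, a 5 kHz cosine (too high for this sampling rate) produces exactly the same samples as a 3 kHz cosine, so the two cannot be told apart afterwards.

```python
import math


def satisfies_sampling_theorem(f_max_hz, fs_hz):
    """True if the sampling frequency exceeds twice the largest frequency."""
    return fs_hz > 2 * f_max_hz


def sample_cosine(freq_hz, fs_hz, n_samples):
    """Samples x(k*Ts) of cos(2*pi*f*t), taken Ts = 1/fs seconds apart."""
    return [math.cos(2 * math.pi * freq_hz * k / fs_hz) for k in range(n_samples)]


if __name__ == "__main__":
    fs = 8000
    print("sampling interval:", 1 / fs, "s")
    print("3 kHz ok at 8 kHz?", satisfies_sampling_theorem(3000, fs))
    print("5 kHz ok at 8 kHz?", satisfies_sampling_theorem(5000, fs))

    low = sample_cosine(3000, fs, 16)
    high = sample_cosine(5000, fs, 16)
    worst = max(abs(a - b) for a, b in zip(low, high))
    print("largest difference between the two sample sequences:", worst)
```

This is why a phone line limits speech to below 4 kHz before sampling at 8 kHz: anything above half the sampling frequency would masquerade as a lower frequency.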

Music
• Highly dynamic: 30-50 dB power variations
• Fundamental tone + overtones, 20-20 000 Hz
  – Hearing is most sensitive in the range 100-4000 Hz
  – No sense of direction below 100 Hz

Music recording on a CD
2 channels × 44.1 k samples/s × 16 bits/sample result in a bit stream of about 1.4 Mbit/s
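The CD bit rate is a plain product of three numbers, which the following sketch spells out. The function name is illustrative; the telephone figures (1 channel, 8 kHz, 8 bits) are included for comparison.

```python
def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Bit rate in bit/s of uncompressed PCM audio."""
    return channels * sample_rate_hz * bits_per_sample


if __name__ == "__main__":
    cd = pcm_bit_rate(channels=2, sample_rate_hz=44_100, bits_per_sample=16)
    phone = pcm_bit_rate(channels=1, sample_rate_hz=8_000, bits_per_sample=8)
    print("CD audio:  ", cd, "bit/s =", round(cd / 1e6, 2), "Mbit/s")
    print("Telephone: ", phone, "bit/s =", phone // 1000, "kbit/s")
    print("Bits in one minute of CD audio:", cd * 60)
```

The exact CD figure is 1 411 200 bit/s, which the slide rounds to 1.4 Mbit/s.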

How many bits are there?

SUMMARY
● Signal quality – dB measure
  – Power ratio in dB: $10 \log_{10}(A/B)$
  – Amplitude ratio in dB: $20 \log_{10}(A/B)$
● Text:
  – Sequence of letters (symbols from an alphabet) forming words
  – Several coding standards, e.g. ASCII
● Sinusoidal signals
  – Have frequency (period time) and amplitude
  – Can be added to form signals of other shapes
  – Amount of each sinusoid used (amplitude) called the spectrum
● Voice
  – Voice signals/speech created by vocal cords producing the tone
  – … and rest of the voice apparatus forming the spectrum
  – Voiced and unvoiced sounds
  – Most information contained below 4 kHz
  – 40 dB SNR PCM coding: 8 kHz sampling × 8 bit/sample = 64 kbit/s
● Music
  – Different instruments playing the same tone differ in their overtones
  – Frequency span: from 20 Hz to 20 kHz
  – CD quality PCM (stereo): 44.1 kHz sampling × 2 channels × 16 bit/sample = 1.4 Mbit/s
  – Error correcting codes used to protect against errors when reading from CD
