Speech and Language CS 188: Artificial Intelligence Speech

Speech and Language CS 188: Artificial Intelligence § Speech technologies § Automatic speech recognition (ASR) § Text-to-speech synthesis (TTS) § Dialog systems § Language processing technologies Lecture 18: Speech § Machine translation Pieter Abbeel --- UC Berkeley § Information extraction Many slides over this course adapted from Dan Klein, Stuart Russell, § Web search, question answering Andrew Moore § Text classification, spam filtering, etc … Digitizing Speech Speech in an Hour § Speech input is an acoustic wave form s p ee ch l a b “ l ” to “ a ” transition: Graphs from Simon Arnfield ’ s web tutorial on speech, 3 4 Sheffield: http://www.psyc.leeds.ac.uk/research/cogn/speech/tutorial/ Spectral Analysis Part of [ae] from “ lab ” § Frequency gives pitch; amplitude gives volume § sampling at ~8 kHz phone, ~16 kHz mic (kHz=1000 cycles/sec) s p ee ch l a b amplitude § Complex wave repeating nine times § Plus smaller wave that repeats 4x for every large cycle § Large wave: freq of 250 Hz (9 times in .036 seconds) § Fourier transform of wave displayed as a spectrogram § Small wave roughly 4 times this, or roughly 1000 Hz § darkness indicates energy at each frequency frequency [ demo ] 5 6 1

Resonances of the vocal tract
§ The human vocal tract as an open tube
[Figure: tube with a closed end and an open end, length 17.5 cm]
§ Air in a tube of a given length will tend to vibrate at the resonance frequency of the tube.
§ Constraint: pressure differential should be maximal at the (closed) glottal end and minimal at the (open) lip end.
Figure from W. Barry Speech Science slides

[ demo ]
From Mark Liberman's website

Vowel [i] sung at successively higher pitches
[Figure: spectrograms (frequency axis) at pitches F#2, A2, C3, F#3, A3, C4 (middle C), A4]
Figures from Ratree Wayland

Acoustic Feature Sequence
§ Time slices are translated into acoustic feature vectors (~39 real numbers per slice)
… e_12 e_13 e_14 e_15 e_16 …
§ These are the observations; now we need the hidden states X

State Space
HMMs for Speech
§ P(E|X) encodes which acoustic vectors are appropriate for each phoneme (each kind of sound)
§ P(X|X') encodes how sounds can be strung together
§ We will have one state for each sound in each word
§ From some state x, can only:
  § Stay in the same state (e.g. speaking slowly)
  § Move to the next position in the word
  § At the end of the word, move to the start of the next word
§ We build a little state graph for each word and chain them together to form our state space X
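The state-space construction described in the bullets above can be sketched as a small graph builder. This is a minimal illustration under stated assumptions: the two-word lexicon and its sound segmentation ("s p ee ch", "l a b") are borrowed from the waveform labels earlier in the lecture, the function name `build_state_graph` is hypothetical, and the sketch records only which transitions are allowed, not their probabilities.

```python
def build_state_graph(lexicon):
    """Map each state (word, position) to the list of states it may move to.

    `lexicon` maps a word to its sequence of sounds. One state is created
    for each sound in each word, following the three rules on the slide.
    """
    successors = {}
    for word, sounds in lexicon.items():
        last = len(sounds) - 1
        for position in range(len(sounds)):
            state = (word, position)
            allowed = [state]                          # stay in the same state
            if position < last:
                allowed.append((word, position + 1))   # next position in the word
            else:
                # end of the word: move to the start of any next word
                allowed.extend((next_word, 0) for next_word in lexicon)
            successors[state] = allowed
    return successors


# Hypothetical two-word lexicon, for illustration only.
LEXICON = {
    "speech": ["s", "p", "ee", "ch"],
    "lab": ["l", "a", "b"],
}

graph = build_state_graph(LEXICON)
for state, allowed in graph.items():
    print(state, "->", allowed)
```

Chaining the per-word graphs this way gives 7 states for this lexicon; in a full recognizer each allowed transition would also carry a probability P(X|X').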

Transitions with Bigrams
Training Counts
198015222 the first
194623024 the same
168504105 the following
158562063 the world
…
14112454 the door
-----------------
23135851162 the *
Figure from Huang et al page 618

Decoding
§ While there are some practical issues, finding the words given the acoustics is an HMM inference problem
§ We want to know which state sequence x_{1:T} is most likely given the evidence e_{1:T}:
  x*_{1:T} = argmax_{x_{1:T}} P(x_{1:T} | e_{1:T})
§ From the sequence x, we can simply read off the words
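Both slides above can be made concrete with a short sketch. The first part turns the training counts into a bigram probability using the usual relative-frequency estimate, count("the w") / count("the *"), which is an assumption here since the excerpt shows only the counts. The second part finds the most likely state sequence with the Viterbi dynamic program, the standard algorithm for this HMM inference problem, although the excerpt itself does not name it. The toy model (three states for the sounds of "lab", symbolic observations "L", "A", "B" standing in for acoustic feature vectors, and all probabilities) is invented for illustration.

```python
import math

# Counts copied from the "Training Counts" table.
THE_COUNTS = {
    "first": 198015222,
    "same": 194623024,
    "following": 168504105,
    "world": 158562063,
    "door": 14112454,
}
THE_TOTAL = 23135851162  # count of "the *"


def bigram_probability(pair_count, context_count):
    """Relative-frequency estimate of P(next word | previous word)."""
    return pair_count / context_count


def safe_log(p):
    """Natural log that maps probability zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, start, trans, emit):
    """Most likely state sequence given the observations.

    start[s], trans[prev][s] and emit[s][obs] are probabilities; missing
    entries count as zero. Scores are accumulated in log space.
    """
    def lp(table, a, b):
        return safe_log(table.get(a, {}).get(b, 0.0))

    best = {s: safe_log(start.get(s, 0.0)) + lp(emit, s, observations[0])
            for s in states}
    backpointers = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] + lp(trans, p, s))
            scores[s] = best[prev] + lp(trans, prev, s) + lp(emit, s, obs)
            pointers[s] = prev
        best = scores
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


p_first = bigram_probability(THE_COUNTS["first"], THE_TOTAL)
p_door = bigram_probability(THE_COUNTS["door"], THE_TOTAL)

# Toy left-to-right model for the word "lab" (all numbers are made up).
STATES = ["l", "a", "b"]
START = {"l": 1.0}
TRANS = {"l": {"l": 0.5, "a": 0.5}, "a": {"a": 0.5, "b": 0.5}, "b": {"b": 1.0}}
EMIT = {s: {o: (0.8 if o == s.upper() else 0.1) for o in "LAB"} for s in STATES}

decoded = viterbi(["L", "L", "A", "B", "B"], STATES, START, TRANS, EMIT)
print(p_first, p_door, decoded)
```

In a recognizer built on the state space described earlier, the bigram estimates would weight the transitions from the end of one word to the start of the next, and the words would be read off the decoded state sequence.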

