DTW and Search
Hsin-min Wang
References
Books
1. X. Huang, A. Acero, and H. Hon, Spoken Language Processing, Chapters 12-13, Prentice Hall, 2001
2. Chin-Hui Lee, Frank K. Soong, and Kuldip K. Paliwal, Automatic Speech and Speaker Recognition, Chapters 13, 16-18, Kluwer Academic Publishers, 1996
3. John R. Deller, Jr., John G. Proakis, and John H. L. Hansen, Discrete-Time Processing of Speech Signals, Chapters 11-12, IEEE Press, 2000
4. L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Chapter 7, Prentice Hall, 1993
5. Frederick Jelinek, Statistical Methods for Speech Recognition, Chapters 5-6, MIT Press, 1999
6. N. Nilsson, Principles of Artificial Intelligence, 1982
Papers
1. Hermann Ney, "Progress in Dynamic Programming Search for LVCSR," Proceedings of the IEEE, August 2000
2. Patrick Kenny et al., "A*-Admissible Heuristics for Rapid Lexical Access," IEEE Trans. on SAP, 1993
3. Stefan Ortmanns and Hermann Ney, "A Word Graph Algorithm for Large Vocabulary Continuous Speech Recognition," Computer Speech and Language, 11, 43-72, 1997
4. Jean-Luc Gauvain and Lori Lamel, "Large-Vocabulary Continuous Speech Recognition: Advances and Applications," Proceedings of the IEEE, August 2000
Search Algorithms for Speech Recognition
• Template-based: without statistical modeling/training
  – Directly compare/align the test and reference feature vector sequences (which may have different lengths) to derive the overall distortion between them
  – Dynamic Time Warping (DTW): warps speech templates in the time dimension to alleviate the distortion
• Model-based: HMMs are widely used
  – Concatenate the subword models according to the pronunciations of the words in a lexicon
  – The HMM states can be expanded to form the state-search space (HMM state transition network) for the search
  – Apply appropriate search strategies
Dynamic Time Warping (DTW)
[Figure: Fig. 3 from Eamonn J. Keogh & Michael J. Pazzani]
DTW (cont.)
DTW (cont.)
[Figure: DTW trellis; a warping path starts at (1,1), passes through points such as (2,2) and (2,3), and ends at (n,m)]
DTW (cont.)
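To make the warping concrete, here is a minimal DTW sketch in Python (an illustration, not code from the slides). It assumes the reference and test templates are NumPy arrays of frame feature vectors, uses Euclidean frame distance, and adopts one common set of local path constraints; the lecture's figures may use different moves or slope weights.

```python
import numpy as np

def dtw(ref, test):
    """Dynamic time warping between two feature-vector sequences.

    ref, test: arrays of shape (n, d) and (m, d).
    Returns the accumulated distortion of the best warping path, using
    the common recurrence
        D(i, j) = d(i, j) + min(D(i-1, j), D(i-1, j-1), D(i, j-1)).
    """
    n, m = len(ref), len(test)
    # Local distance: Euclidean distance between frame i of ref and frame j of test
    d = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)

    D = np.full((n, m), np.inf)
    D[0, 0] = d[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                D[i - 1, j] if i > 0 else np.inf,                # vertical move
                D[i - 1, j - 1] if i > 0 and j > 0 else np.inf,  # diagonal move
                D[i, j - 1] if j > 0 else np.inf,                # horizontal move
            )
            D[i, j] = d[i, j] + best_prev
    return D[n - 1, m - 1]

# Isolated-word recognition: pick the template with minimum distortion, e.g.
#   templates = {"yes": yes_feats, "no": no_feats}   # hypothetical data
#   word = min(templates, key=lambda w: dtw(templates[w], test_feats))
```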
Advantages of DTW
• Speech recognition based on DTW is simple to implement and fairly effective for small-vocabulary, isolated-word speech recognition
• Dynamic programming (DP) can temporally align patterns to account for differences in speaking rates across speakers as well as across repetitions of the word by the same speaker
Weaknesses of DTW
• Creating the reference templates from data is non-trivial; it is typically accomplished by pairwise warping of training instances
• Alternatively, all observed instances can be stored as templates, but matching against all of them is extremely slow
• As a result, the HMM is a much better alternative for spoken language processing
(Also refer to page 383 of the textbook.)
Continuous Speech Recognition
• Continuous speech recognition is both a pattern recognition and a search problem
  – In speech recognition, making a search decision is also referred to as decoding
    • Find the sequence of words whose corresponding acoustic and language models best match the input signal
    • The search complexity is highly correlated with the size of the search space, which is determined by the constraints imposed by the language models
• Speech recognition search is usually done with Viterbi or A* stack decoders
  – The relative merits of the two search algorithms were quite controversial in the 1980s
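As a rough illustration of the Viterbi decoder mentioned above (a sketch, not the lecture's algorithm): it assumes the HMM state network has already been compiled into log-domain matrices log_A (transitions), log_B (per-frame observation likelihoods), and log_pi (initial states), and it omits practical refinements such as beam pruning.

```python
import numpy as np

def viterbi(log_A, log_B, log_pi):
    """Time-synchronous Viterbi decoding over an HMM state network.

    log_A:  (S, S) log transition probabilities
    log_B:  (T, S) log observation likelihoods for each frame
    log_pi: (S,)   log initial-state probabilities
    Returns the best state sequence and its log score.
    """
    T, S = log_B.shape
    delta = log_pi + log_B[0]           # best score ending in each state at t = 0
    back = np.zeros((T, S), dtype=int)  # backpointers for path recovery
    for t in range(1, T):
        cand = delta[:, None] + log_A   # cand[i, j]: best path into j via i
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + log_B[t]
    # Trace back the best path from the best final state
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return path, float(delta.max())
```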
Model-based Speech Recognition
• A search process to uncover the word sequence W = w_1, w_2, ..., w_m with the maximum posterior probability P(W|X):

  Ŵ = argmax_W P(W|X) = argmax_W [ P(W) P(X|W) / P(X) ] = argmax_W P(W) P(X|W)

  where W = w_1, w_2, ..., w_i, ..., w_m and each w_i ∈ V = {v_1, v_2, ..., v_N}
  – P(X|W): acoustic model probability
  – P(W): language model probability
• N-gram language modeling:
  – Unigram: P(w_1 w_2 ... w_k) ≈ P(w_1) P(w_2) ... P(w_k), where P(w_j) = C(w_j) / Σ_i C(w_i)
  – Bigram: P(w_1 w_2 ... w_k) ≈ P(w_1) P(w_2|w_1) ... P(w_k|w_{k-1}), where P(w_j|w_{j-1}) = C(w_{j-1} w_j) / C(w_{j-1})
  – Trigram: P(w_1 w_2 ... w_k) ≈ P(w_1) P(w_2|w_1) P(w_3|w_1 w_2) ... P(w_k|w_{k-2} w_{k-1}), where P(w_j|w_{j-2} w_{j-1}) = C(w_{j-2} w_{j-1} w_j) / C(w_{j-2} w_{j-1})
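A small sketch of the count-based estimates above, using the bigram case (illustrative only; real systems smooth the counts to handle unseen bigrams):

```python
from collections import Counter

def train_bigram(corpus):
    """Maximum-likelihood bigram estimates from counts:
    P(w_j | w_{j-1}) = C(w_{j-1} w_j) / C(w_{j-1})."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        words = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(words)
        bigrams.update(zip(words[:-1], words[1:]))
    # Return a conditional-probability function over the trained counts
    return lambda w, prev: bigrams[(prev, w)] / unigrams[prev] if unigrams[prev] else 0.0

# p = train_bigram(["how are you", "how do you do"])
# p("are", "how")  # C(how are) / C(how) = 1/2
```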
Block Diagram of Model-based Speech Recognition
[Figure: block diagram, including an "Information-based Case Grammar" component]
Basic Search Algorithms
What Is "Search"?
• The idea of search implies moving around, examining things, and making decisions about whether the sought object has yet been found
• Two classical problems in AI
  – Traveling salesman's problem: find a shortest-distance tour, starting at one of many cities, visiting each city exactly once, and returning to the starting city
  – N-queens problem: place N queens on an N x N chessboard in such a way that no queen can capture any other queen, i.e., there is no more than one queen in any given row, column, or diagonal
A Simple City-traveling Problem
A Simple City-traveling Problem (cont.)
The General Graph Search Algorithm
• OPEN list: stores the nodes waiting for expansion
• CLOSED list: stores the already expanded nodes
• If both steps 6(a) and 6(b) are omitted, graph search degenerates to tree search
• Steps 6(a) and 6(b) perform the bookkeeping (merging) process for nodes reached by more than one path
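A compact sketch of this loop (hypothetical names, not code from the text; the full Algorithm 12.1 also re-points parents when a cheaper path to an already-seen node is found, which is simplified away here):

```python
def graph_search(start, expand, is_goal):
    """General graph-search loop with OPEN/CLOSED bookkeeping.

    expand(node) yields successor nodes; is_goal(node) tests for a solution.
    Dropping the CLOSED/parent checks turns this into tree search, which may
    revisit the same node many times.
    """
    open_list = [start]     # nodes waiting for expansion
    closed = set()          # already expanded nodes
    parent = {start: None}  # for recovering the path (and merging duplicates)
    while open_list:
        node = open_list.pop(0)  # the ordering policy defines the strategy
        if is_goal(node):
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        for succ in expand(node):
            if succ not in closed and succ not in parent:  # merge duplicates
                parent[succ] = node
                open_list.append(succ)
    return None  # no solution
```

The pop policy is the whole story: popping the oldest node gives breadth-first behavior, popping the newest gives depth-first, and sorting OPEN by a heuristic gives the best-first searches discussed later.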
Blind Graph Search Algorithms
• If the aim of the search problem is to find an acceptable path instead of the best path, blind search is often used
• Blind search treats every node in the OPEN list the same and blindly decides the order of expansion without using any domain knowledge
• Blind search does not expand nodes randomly; it follows some systematic way to explore the search graph
  – Depth-first search
  – Breadth-first search
Depth-First Search
• Depth-first search picks an arbitrary alternative at every node visited
• The search sticks with this partial path and works forward from it
• Other alternatives at the same level are ignored completely
• The deepest nodes are expanded first, and nodes of equal depth are ordered arbitrarily
Depth-First Search (cont.)
• Depth-first search generates only one successor at a time
  – Graph search generates all successors at a time
• When depth-first search reaches a dead end, it backs up to the last decision point and proceeds with another alternative
  – Depth-first search can be dangerous because it might pursue a hopeless path that is actually an infinite dead end
  – A depth bound can be placed to constrain the nodes to be explored
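A minimal depth-first sketch with the depth bound discussed above; expand and is_goal are illustrative callbacks, not names from the text.

```python
def depth_first_search(start, expand, is_goal, depth_bound=50):
    """Depth-first search with a depth bound to avoid infinite dead ends.

    New successors are pushed onto a stack, so the deepest nodes are
    always expanded first.
    """
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()           # LIFO: deepest node first
        if is_goal(node):
            return path
        if len(path) <= depth_bound:       # depth bound on explored paths
            for succ in expand(node):
                if succ not in path:       # avoid cycles along this path
                    stack.append((succ, path + [succ]))
    return None
```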
Breadth-First Search
• Breadth-first search examines all the nodes on one level before considering any of the nodes on the next level (depth)
• Breadth-first search is guaranteed to find a solution if one exists
  – It might not find a shortest-distance path, but it is guaranteed to find the one with the fewest cities visited (a minimum-length path)
• It may be inefficient when all solutions leading to the goal node are at approximately the same depth
Breadth-First Search (cont.)
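For comparison, a minimal breadth-first sketch under the same illustrative interface:

```python
from collections import deque

def breadth_first_search(start, expand, is_goal):
    """Breadth-first search: expand all nodes at one depth before the next.

    Guaranteed to find a minimum-length path (fewest nodes visited) if a
    solution exists, though not necessarily a shortest-distance one.
    """
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()       # FIFO: shallowest node first
        if is_goal(node):
            return path
        for succ in expand(node):
            if succ not in visited:
                visited.add(succ)
                queue.append((succ, path + [succ]))
    return None
```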
Heuristic Graph Search
• Blind search finds only one arbitrary solution instead of the optimal solution
  – To find the optimal solution with depth-first or breadth-first search, the search needs to continue rather than stop when the first solution is discovered
    • After the search reaches all solutions, we can compare them and select the best
  – This exhaustive approach is known as British Museum search, or brute-force search
• Heuristic search takes advantage of heuristic information (domain-specific knowledge) during the search
  – Use the heuristic function to re-order the OPEN list in Step 7 of Algorithm 12.1
  – Some heuristics can reduce search effort without sacrificing optimality, while others can greatly reduce search effort but provide only sub-optimal solutions
  – Examples: best-first (or A*) search and beam search
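Beam search is only named above, so here is a minimal level-synchronous sketch (not from the slides): expand, score, is_goal, and beam_width are illustrative placeholders, and real decoders usually prune states inside a time-synchronous Viterbi pass rather than whole paths as done here.

```python
def beam_search(start, expand, score, is_goal, beam_width=3):
    """Beam search: a breadth-first variant that keeps only the best
    beam_width partial paths at each level. Fast but possibly sub-optimal,
    since the path to the true optimum may be pruned early.

    score(path): heuristic goodness of a partial path (higher is better).
    """
    beam = [[start]]
    while beam:
        # Expand every path in the current beam by one step
        candidates = [path + [succ] for path in beam for succ in expand(path[-1])]
        for path in candidates:
            if is_goal(path[-1]):
                return path
        # Prune: keep only the beam_width most promising partial paths
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return None
```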
Best-First Search
• Best-first search expands the most promising node first, since it offers the best hope of leading to the best path; hence the name best-first search
• A search algorithm is called admissible if it is guaranteed to find the optimal solution
  – Admissible best-first search is called A* search
• Evaluation function: f(N) = g(N) + h(N)
  – Special case: with h(N) = 0 and g(N) = depth of node N, best-first search reduces to breadth-first search
A* Search
• History of A* search in AI
  – The most widely studied best-first search (Hart, Nilsson, and Raphael, 1968)
  – Developed for additive cost measures (the cost of a path = the sum of the costs of its arcs)
• Properties
  – A* search can sequentially generate multiple recognition candidates
  – A* search needs a heuristic function that satisfies the admissibility condition
• Admissibility
  – The property that a search algorithm is guaranteed to find an optimal solution, if one exists
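A minimal A* sketch under the same illustrative interface as the earlier searches; expand yields (successor, arc cost) pairs, and h must satisfy the admissibility condition above.

```python
import heapq
from itertools import count

def a_star(start, expand, h, is_goal):
    """A* search: always expand the OPEN node with the smallest
    f(N) = g(N) + h(N), where g is the cost so far and h estimates the
    remaining cost. If h never overestimates (admissibility), the first
    goal popped from OPEN is optimal.
    """
    tie = count()  # tiebreaker so the heap never compares nodes directly
    open_heap = [(h(start), next(tie), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_heap:
        f, _, g, node, path = heapq.heappop(open_heap)
        if is_goal(node):
            return path, g
        for succ, cost in expand(node):
            g2 = g + cost
            if g2 < best_g.get(succ, float("inf")):  # found a cheaper path
                best_g[succ] = g2
                heapq.heappush(
                    open_heap, (g2 + h(succ), next(tie), g2, succ, path + [succ])
                )
    return None, float("inf")
```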
A* Search – 1st example
A* Search – 1st example (cont.)
[Figure: A* search tree annotated with f = g + h values such as 2+10.3, 3+8.5, (3+4)+10.3, (3+3)+5.7, (6+3)+2.8, (9+5)+7, and 9+3 along paths from S to the goal G through nodes A, C, and E]