Generative Process
• Step 1 - Generate the number of phones (n_i) that each letter l_i maps to
  - 𝜚_{l_i} ~ Dir(η), a distribution over 0, 1, or 2 phones for letter l_i
  - n_i is drawn from 𝜚_{l_i}, one letter at a time
  - Example, "red sox": r → 1, e → 1, d → 1, _ → 0, s → 1, o → 1, x → 2
[Figure: letters l_i of "red sox" with counts n_i, 𝜚_{l_i} over {0, 1, 2}, π_{l_i} over phone ids 1 ... K, speech frames x_t, HMMs θ_1 θ_2 θ_3 ... θ_K]
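Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a symmetric Dirichlet with a single scalar eta, one 𝜚 per distinct letter, and the hypothetical helper names sample_dirichlet and sample_phone_counts.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one sample from Dir(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def sample_phone_counts(letters, eta, rng):
    """Sample n_i in {0, 1, 2} for every letter from rho_l ~ Dir(eta)."""
    rho = {}      # one 3-dim categorical distribution per distinct letter
    counts = []   # n_i for each position i
    for letter in letters:
        if letter not in rho:
            # Assumption: symmetric prior, the same eta for 0, 1 and 2 phones.
            rho[letter] = sample_dirichlet([eta] * 3, rng)
        n_i = rng.choices([0, 1, 2], weights=rho[letter])[0]
        counts.append(n_i)
    return rho, counts

rng = random.Random(0)
rho, counts = sample_phone_counts("red_sox", eta=1.0, rng=rng)
print(counts)  # seven values, each 0, 1 or 2
```

The slide's example (1 1 1 0 1 1 2) is one possible draw; a random run will generally give different counts.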
• Step 2 - Generate the phone label (c_{i,p}) for every phone that a letter maps to, 1 ≤ p ≤ n_i
  - π_{l_i} ~ Dir(γ), a distribution over phone ids 1 ... K for letter l_i
  - c_{i,p} is drawn from π_{l_i}
  - Example, "red sox": r → 3, e → 1, d → 17, s → 2, o → 19, x → 56 2
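Step 2 can be sketched the same way. The code below is an illustrative assumption, not the authors' code: it uses a symmetric Dir(gamma), shares one π per letter across all positions p (the full model indexes π by letter, n and p), and picks K = 60 only because the slide's example uses a label as large as 56.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one sample from Dir(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def sample_phone_labels(letters, counts, K, gamma, rng):
    """Sample c_{i,p} in 1..K for p = 1..n_i from pi_l ~ Dir(gamma)."""
    phone_ids = list(range(1, K + 1))
    pi = {}       # one K-dim categorical distribution per letter that emits phones
    labels = []   # labels[i] is the list of c_{i,p} for letter i
    for letter, n_i in zip(letters, counts):
        if n_i > 0 and letter not in pi:
            pi[letter] = sample_dirichlet([gamma] * K, rng)
        c_i = [rng.choices(phone_ids, weights=pi[letter])[0] for _ in range(n_i)]
        labels.append(c_i)
    return pi, labels

rng = random.Random(0)
counts = [1, 1, 1, 0, 1, 1, 2]   # n_i from the slide's "red sox" example
pi, labels = sample_phone_labels("red_sox", counts, K=60, gamma=1.0, rng=rng)
print(labels)  # one list of phone ids per letter; empty for "_"
```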
• Step 3 - Generate speech (x_t)
  - Each phone label c_{i,p} selects one of the K HMMs θ_1 ... θ_K, and that HMM generates the speech frames x_t for the phone
  - Example, "red sox": n_i = 1 1 1 0 1 1 2, c_{i,p} = 3 1 17 2 19 56 2
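Step 3 can be sketched as follows. The slides do not give the HMM topology or emission model, so everything specific here is an assumption made for illustration: a 3-state left-to-right HMM, 1-D Gaussian emissions, a fixed stay probability, and the hypothetical helpers make_hmm, emit_phone and generate_speech.

```python
import random

def make_hmm(k, n_states=3):
    """Hypothetical theta_k: left-to-right HMM with 1-D Gaussian emissions."""
    return {
        "means": [float(k) + 0.1 * s for s in range(n_states)],
        "std": 0.05,
        "stay_prob": 0.7,   # probability of staying in the current state
    }

def emit_phone(hmm, rng):
    """Emit frames from one HMM until it leaves its last state."""
    frames = []
    state = 0
    while state < len(hmm["means"]):
        frames.append(rng.gauss(hmm["means"][state], hmm["std"]))
        if rng.random() >= hmm["stay_prob"]:
            state += 1      # move on to the next state
    return frames

def generate_speech(phone_labels, hmms, rng):
    """Concatenate the frames x_t emitted by the HMM of each phone label."""
    frames, durations = [], []
    for c in phone_labels:
        phone_frames = emit_phone(hmms[c], rng)
        durations.append(len(phone_frames))   # phone duration d
        frames.extend(phone_frames)
    return frames, durations

rng = random.Random(0)
phone_labels = [3, 1, 17, 2, 19, 56, 2]       # c_{i,p} from the slide's example
hmms = {k: make_hmm(k) for k in set(phone_labels)}
frames, durations = generate_speech(phone_labels, hmms, rng)
print(durations)  # one duration per phone, each at least 3 frames
```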
Context-dependent L2S Rules
• Take context into account for learning L2S mapping rules
  - More specific rules
  - Natural back-off mechanism through hierarchy
• Example, the "o" in "red sox": c_i ~ π_o without context, c_i ~ π_sox with the context "sox"
  - 𝜚_sox ~ DP(γ, 𝜚_o)
• Hierarchy: β ~ Dir(𝛿), π_o ~ Dir(λβ), π_sox ~ Dir(απ_o)
• View π_o as the prior of π_sox
  - If sox appears frequently: π_sox ≈ empirical distribution
  - If sox is rarely observed: π_sox ≈ π_o
[Figure: π_o and π_sox as distributions over phone ids 1 ... K, with HMMs θ_1 θ_2 θ_3 θ_4 ...]
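The back-off behaviour follows from Dirichlet-categorical conjugacy: with π_sox ~ Dir(απ_o) and m_k observed uses of phone k in the context "sox" (N in total), the posterior mean of π_sox,k is (m_k + α π_o,k) / (N + α). The sketch below evaluates that formula; the 3-phone inventory, the values of π_o and α, and the counts are made-up numbers for illustration only.

```python
def posterior_mean(counts, parent, alpha):
    """Posterior mean of pi_context under the prior Dir(alpha * parent)."""
    total = sum(counts)
    return [(m + alpha * p) / (total + alpha) for m, p in zip(counts, parent)]

pi_o = [0.7, 0.2, 0.1]   # hypothetical context-independent distribution
alpha = 10.0             # hypothetical concentration parameter

# Rarely observed context: the estimate stays close to pi_o.
rare = posterior_mean([0, 1, 0], pi_o, alpha)
# Frequently observed context: the estimate follows the empirical counts.
frequent = posterior_mean([0, 1000, 0], pi_o, alpha)

print([round(v, 3) for v in rare])      # approximately [0.636, 0.273, 0.091]
print([round(v, 3) for v in frequent])  # approximately [0.007, 0.992, 0.001]
```

A larger α keeps π_sox tied to π_o for longer; a smaller α lets a handful of observations override it.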
Graphical Model
[Figure: plate diagram with nodes η, γ, β, λ, α, 𝜚_l, π_{l,n,p} (single-grapheme and three-grapheme forms), l_i, n_i, c_{i,p}, x_t, θ_k, θ_0; plates G × {n,p} with 1 ≤ n ≤ 2 and 1 ≤ p ≤ n, G × G, G × G × G, K, t = 1 ... d_i, p = 1 ... n_i, i = 1 ... L]
• G: the set of graphemes
• l (three-grapheme form): sequence of three graphemes
• l: observed graphemes
• x: observation speech
• d: phone duration
• c: phone id
• n: number of phones a grapheme maps to
• L: total number of graphemes
• K: total number of HMMs
• 𝜚_l: 3-dim categorical distribution
• θ_k: an HMM
• θ_0: HMM prior
• π_{l,n,p} (both forms), β: K-dim categorical distributions
• 𝛿, λ, α: concentration parameters
Inference
• The unknowns in the graphical model are grouped into latent model parameters and regular latent variables
• Procedure
  - 20,000 iterations (a later build of the slide shows 10,000 iterations)
  - Sample from prior
  - Sample given a ...
  - Sample given a ...
  - Block-sampling
[Figure: the plate diagram from the Graphical Model slide, annotated with the sampling steps]
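The slides name the sampling steps without giving the conditional distributions. As one hedged illustration of resampling a latent model parameter given the current latent variables, the sketch below uses standard Dirichlet-categorical conjugacy to redraw each 𝜚_l from Dir(η + counts of n_i = 0, 1, 2 for letter l). The symmetric scalar eta and the helper names are assumptions; this is a generic Gibbs-style update, not necessarily the authors' exact step.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one sample from Dir(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def resample_rho(letters, counts, eta, rng):
    """Redraw rho_l for every letter from its Dirichlet posterior."""
    tallies = {}   # tallies[l][n] = how often letter l currently maps to n phones
    for letter, n_i in zip(letters, counts):
        tallies.setdefault(letter, [0, 0, 0])[n_i] += 1
    return {
        letter: sample_dirichlet([eta + m for m in tally], rng)
        for letter, tally in tallies.items()
    }

rng = random.Random(0)
# Current assignment of n_i, taken from the slide's "red sox" example.
rho = resample_rho("red_sox", [1, 1, 1, 0, 1, 1, 2], eta=1.0, rng=rng)

# With many consistent observations the draw concentrates on the observed value.
rho_many = resample_rho("x" * 500, [2] * 500, eta=1.0, rng=rng)
print(rho_many["x"])  # most of the mass on n = 2
```

In a full sampler this update would alternate with resampling n_i, c_{i,p} and the HMM parameters for the number of iterations listed on the slide.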
Induce Lexicon and Acoustic Model
• n_i and c_i define word pronunciations and phone transcriptions
  - Letters l_i: r e d _ s o x
  - n_i: 1 1 1 0 1 1 2
  - c_i: 3 1 17 2 19 56 2
  - Induced lexicon: red : 3 1 17, sox : 2 19 56 2
[Figure: "red sox" with l_i, n_i, c_i, speech frames x_t, and HMMs θ_1 θ_2 θ_3 ... θ_K]
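Reading the lexicon off the sampled variables is a simple grouping operation, sketched below with the slide's own numbers. The function name induce_lexicon and the use of "_" as the word-boundary symbol in a single string are assumptions made for this illustration.

```python
def induce_lexicon(text, counts, labels):
    """Group per-letter phone labels into one pronunciation per word.

    text   : grapheme string, with "_" marking a word boundary
    counts : n_i for every grapheme, including "_"
    labels : flat list of phone ids c_i, in order
    """
    lexicon = {}
    word, pron = [], []
    pos = 0
    for letter, n_i in zip(text, counts):
        phones = labels[pos:pos + n_i]   # the n_i labels owned by this letter
        pos += n_i
        if letter == "_":
            # Word boundary: store the finished word (boundary phones are ignored).
            if word:
                lexicon["".join(word)] = pron
            word, pron = [], []
        else:
            word.append(letter)
            pron.extend(phones)
    if word:
        lexicon["".join(word)] = pron
    return lexicon

lexicon = induce_lexicon("red_sox", [1, 1, 1, 0, 1, 1, 2], [3, 1, 17, 2, 19, 56, 2])
print(lexicon)  # {'red': [3, 1, 17], 'sox': [2, 19, 56, 2]}
```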