Identifiability and Unmixing of Latent Parse Trees
Daniel Hsu, Sham Kakade, Percy Liang, NIPS 2012
Jan Gasthaus, Tea talk, January 8th, 2013
1 / 15
Parsing 2 / 15
Big Picture
Generative parsing models define joint distributions $P_\theta(x, z)$ over sentences $x$ and their structure $z$.
Can we identify $\theta$ given only sentences (but not their structure, i.e. without supervision)?
The paper has two parts:
1. Identifiability of several models (PCFGs are not identifiable!)
2. Parameter recovery: unmixing (for restricted PCFGs)
3 / 15
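The question on this slide can be written out a little more formally. The following is a sketch in standard notation; the parameter space $\Theta$ and the explicit marginalization are assumptions added here for illustration, not taken from the slide.

```latex
% Only sentences x are observed, so the data can inform us
% only about the marginal distribution over x:
\[
  P_\theta(x) \;=\; \sum_{z} P_\theta(x, z).
\]
% Informally, \theta_0 \in \Theta is identifiable from sentences alone if
\[
  P_\theta(x) = P_{\theta_0}(x) \ \text{ for all } x
  \quad\Longrightarrow\quad
  \theta = \theta_0 ,
\]
% typically understood up to trivial symmetries such as
% relabeling the hidden states.
```

If two distinct parameter settings induce the same distribution over sentences, no amount of unlabeled data can tell them apart, which is why identifiability has to be settled before parameter recovery.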
Big Picture Example sentences: "the lady sang", "Gatsby likes Bayesians" 4 / 15
Big Picture 5 / 15
PCFG model 6 / 15
Dependency Grammars 7 / 15
Identifiability 8 / 15
Identifiability
$S_\Theta(\theta_0)$ is defined by the moment constraints $h_{\theta_0}(\theta) = \mu(\theta) - \mu(\theta_0) = 0$.
Rows of the Jacobian of $h_{\theta_0}$ are directions of constraint violation.
9 / 15
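The Jacobian view suggests a simple numerical check: if the Jacobian of the moment map at $\theta_0$ has full column rank, no direction in parameter space leaves all moments unchanged to first order, so $\theta_0$ is locally identifiable from those moments. The sketch below illustrates this on a toy model that is not from the talk: a two-component Bernoulli mixture with parameters `(pi, p, q)`, whose observable moments are assumed to be $\pi p^k + (1 - \pi) q^k$ for $k = 1, \dots, n$. The function names, the finite-difference step, and the rank tolerance are all illustrative choices.

```python
def moments(theta, num_flips):
    """Toy stand-in for mu(theta): moments of a two-component Bernoulli mixture.

    theta = (pi, p, q); the k-th moment is pi * p**k + (1 - pi) * q**k.
    This model is an assumption made for illustration only.
    """
    pi, p, q = theta
    return [pi * p**k + (1 - pi) * q**k for k in range(1, num_flips + 1)]


def jacobian(f, theta, eps=1e-6):
    """Central-difference Jacobian: rows index moments, columns index parameters."""
    cols = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += eps
        down[j] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(f(up), f(down))])
    # cols[j][i] holds d mu_i / d theta_j; transpose so rows are moments.
    return [[cols[j][i] for j in range(len(theta))] for i in range(len(cols[0]))]


def rank(mat, tol=1e-6):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in mat]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            for k in range(c, n_cols):
                m[i][k] -= factor * m[r][k]
        r += 1
    return r


theta0 = (0.3, 0.2, 0.7)  # arbitrary generic parameter point
rank_two = rank(jacobian(lambda t: moments(t, 2), theta0))
rank_three = rank(jacobian(lambda t: moments(t, 3), theta0))
# At p == q the two components coincide and the rank collapses.
rank_degenerate = rank(jacobian(lambda t: moments(t, 3), (0.3, 0.5, 0.5)))

print("2 moments, 3 parameters: rank", rank_two)
print("3 moments, 3 parameters: rank", rank_three)
print("3 moments at p == q:     rank", rank_degenerate)
```

With only two moments the rank is below the number of parameters, so some direction changes $\theta$ without violating any constraint. With three moments the rank matches the parameter count at a generic point. A full-rank Jacobian certifies local identifiability only; global symmetries such as swapping the two components remain.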
Identifiability 10 / 15
Unmixing 11 / 15
Unmixing 12 / 15
Unmixing 13 / 15
Results 14 / 15
Conclusions 15 / 15