Identifiability and Unmixing of Latent Parse Trees


  1. Identifiability and Unmixing of Latent Parse Trees. Daniel Hsu, Sham Kakade, Percy Liang. NIPS 2012. Jan Gasthaus, tea talk, January 8th, 2013.

  2. Parsing

  6. Big Picture: Generative parsing models define joint distributions $P_\theta(x, z)$ over sentences $x$ and their structure $z$. Can we identify $\theta$ given only sentences (but not their structure, i.e. without supervision)? The paper has two parts: (1) Identifiability of several models (PCFGs are not identifiable!); (2) Parameter recovery: unmixing (for restricted PCFGs).
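
The question on this slide can be written out as below. This is a sketch in standard latent-variable notation rather than something taken from the slides: the explicit sum over $z$, the second parameter $\theta'$, and the "up to relabeling" caveat are assumptions added here for illustration.

```latex
% Distribution over sentences alone, with the latent structure z summed out
P_\theta(x) = \sum_{z} P_\theta(x, z)

% Identifiability from sentences only (up to relabeling of latent states):
% two parameter settings that induce the same sentence distribution must coincide
P_\theta(x) = P_{\theta'}(x) \ \text{for all } x
\quad \Longrightarrow \quad \theta = \theta'
```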

  7. Big Picture: "the lady sang", "Gatsby likes Bayesians"

  8. Big Picture

  10. PCFG model

  11. Dependency Grammars

  12. Identifiability

  13. Identifiability: $S_\Theta(\theta_0)$ is defined by the moment constraints $h_{\theta_0}(\theta) = \mu(\theta) - \mu(\theta_0) = 0$. Rows of the Jacobian of $h_{\theta_0}$ are directions of constraint violation.
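
The Jacobian statement can be illustrated numerically. The sketch below is not the model or code from the talk: it assumes a hypothetical toy mixture with parameters `theta = (pi, p0, p1)`, where a hidden binary state `z` is 1 with probability `pi` and each observed binary token is 1 with probability `p_z`. The function names `moments`, `numerical_jacobian`, and `is_locally_identifiable` are made up for this example. Since $\mu(\theta_0)$ is constant, the Jacobian of $h_{\theta_0}$ equals the Jacobian of $\mu$; if it has full column rank at $\theta_0$, every small move away from $\theta_0$ violates some moment constraint, so $\theta_0$ is locally the only solution.

```python
import numpy as np


def moments(theta, order):
    """Moment vector mu(theta) of the assumed toy mixture.

    The j-th entry is P(x_1 = ... = x_j = 1) = (1 - pi) * p0**j + pi * p1**j.
    """
    pi, p0, p1 = theta
    return np.array([(1 - pi) * p0**j + pi * p1**j for j in range(1, order + 1)])


def numerical_jacobian(f, theta, eps=1e-6):
    """Central finite-difference Jacobian of f at theta, shape (outputs, params)."""
    theta = np.asarray(theta, dtype=float)
    columns = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        columns.append((f(theta + step) - f(theta - step)) / (2 * eps))
    return np.stack(columns, axis=1)


def is_locally_identifiable(order, theta0, tol=1e-6):
    """Return (full_column_rank, rank) for the Jacobian of mu at theta0."""
    jac = numerical_jacobian(lambda t: moments(t, order), theta0)
    rank = int(np.linalg.matrix_rank(jac, tol=tol))
    return rank == len(theta0), rank


if __name__ == "__main__":
    theta0 = (0.3, 0.2, 0.7)  # arbitrary illustrative parameter point
    for order in (1, 2, 3):
        ok, rank = is_locally_identifiable(order, theta0)
        print(f"moments up to order {order}: rank {rank}, locally identifiable: {ok}")
```

With fewer moments than parameters the Jacobian cannot have full column rank, so some direction in parameter space leaves all observed moments unchanged. The same check also fails at degenerate points such as `p0 == p1`, where the two mixture components coincide.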

  15. Identifiability

  16. Unmixing

  17. Unmixing

  18. Unmixing

  19. Results

  20. Conclusions
