An Empirical Comparison of Unsupervised Constituency Parsing Methods - PowerPoint PPT Presentation

An Empirical Comparison of Unsupervised Constituency Parsing Methods
Jun Li, Yifan Cao, Jiong Cai, Yong Jiang, Kewei Tu
{lijun2, caoyf, caijiong, tukw}@shanghaitech.edu.cn, yongjiang.jy@alibaba-inc.com

Background
● Goal: To learn a constituency parser without parse tree annotations
● Trends: This task has received a lot of attention recently (2019)
  ○ an increasing number of accepted papers: NAACL*2, ACL*5, EMNLP*3
  ○ with high quality: ICLR 2019 best paper (Shen et al., 2019)
● Problems: No unified experimental standard has been adopted
  ○ making the results across papers incomparable
● Our contributions:
  ○ Propose a standardized experimental setup
  ○ Conduct systematic experiments on
    ■ PRPN (Shen et al., 2018)
    ■ URNNG (Kim et al., 2019b)
    ■ DIORA (Drozdov et al., 2019)
    ■ CCM (Klein and Manning, 2002)
    ■ CCL (Seginer, 2007)

Experimental setup
● Language
  ○ Different languages have different syntactic properties
  ○ Japanese (mostly left branching)
  ○ English (mostly right branching)
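The branching contrast can be made concrete with a minimal Python sketch that lists the constituent spans of a purely left-branching and a purely right-branching binary tree over n tokens. The function names and the span convention (0-indexed start, exclusive end) are illustrative assumptions and do not come from the slides.

```python
def right_branching_spans(n):
    """Spans (start, end) of a fully right-branching binary tree over n tokens."""
    # Every multi-token constituent ends at the last token:
    # (0, n), (1, n), ..., (n - 2, n).
    return [(i, n) for i in range(n - 1)]


def left_branching_spans(n):
    """Spans (start, end) of a fully left-branching binary tree over n tokens."""
    # Every multi-token constituent starts at the first token:
    # (0, 2), (0, 3), ..., (0, n).
    return [(0, j) for j in range(2, n + 1)]


if __name__ == "__main__":
    print(right_branching_spans(4))  # [(0, 4), (1, 4), (2, 4)]
    print(left_branching_spans(4))   # [(0, 2), (0, 3), (0, 4)]
```

The two trees share only the whole-sentence span, so a parser biased toward one branching direction can look strong on one language and weak on the other.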

Experimental setup
● Language: Use KTB (Japanese) and PTB (English) for training and evaluation
● Dataset pre-processing: Train on sentences of length <= 10/40; split into train/dev/test
● Punctuation post-processing: Attach to root or least common ancestor
● Evaluation: Report Micro/Macro/Evalb F1
● …
● More details can be found in our paper
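The slide names Micro and Macro F1 without defining them. The sketch below assumes the common reading: micro F1 pools span counts over the whole corpus, while macro F1 averages per-sentence F1 scores. All function names are hypothetical, spans are unlabeled (start, end) pairs, and no filtering of trivial spans is applied, although papers differ on that point. Evalb is an external scoring tool and is not reproduced here.

```python
def span_counts(pred, gold):
    """Return (matched, predicted, gold) span counts for one sentence."""
    pred, gold = set(pred), set(gold)
    return len(pred & gold), len(pred), len(gold)


def f1(matched, n_pred, n_gold):
    """Unlabeled F1 from counts; defined as 0.0 when either side is empty."""
    if n_pred == 0 or n_gold == 0:
        return 0.0
    precision = matched / n_pred
    recall = matched / n_gold
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_f1(pairs):
    """Corpus-level F1: sum the counts over all sentences, then score once."""
    totals = [0, 0, 0]
    for pred, gold in pairs:
        for k, value in enumerate(span_counts(pred, gold)):
            totals[k] += value
    return f1(*totals)


def macro_f1(pairs):
    """Sentence-level F1: score each sentence, then take the mean."""
    scores = [f1(*span_counts(pred, gold)) for pred, gold in pairs]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Two toy sentences as (predicted spans, gold spans).
    pairs = [
        ([(0, 4), (1, 4), (2, 4)], [(0, 4), (0, 2), (2, 4)]),
        ([(0, 2)], [(0, 2)]),
    ]
    print(micro_f1(pairs), macro_f1(pairs))
```

On the toy data the two averages differ (0.75 versus about 0.83) because the short, perfectly parsed sentence counts as much as the long one under macro averaging. This is one reason results reported under different conventions are hard to compare.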

Experimental results (English)

Experimental results (Japanese)

Conclusion
● We propose a standardized experimental setup for unsupervised constituency parsing
● We empirically compare five methods and find that recent models do not show a clear advantage over decade-old models

Thank you!
