Dependency Parser for Bengali-English Code-Mixed Data enhanced with - PowerPoint PPT Presentation

Dependency Parser for Bengali-English Code-Mixed Data enhanced with a Synthetic Treebank
Urmi Ghosh, Dipti Misra Sharma and Simran Khanuja
LTRC, IIIT-H, India

Code-Mixing
● mixing of various linguistic units
● from two (or more) languages
● within a sentence

Example: kobe theke #BOSS2 er shooting start hobe
kobe/bn (“When”) theke/bn (“from”) #BOSS2/univ er/bn (“of”) shooting/en start/en hobe/bn (“will be”)

Bengali-English CM
● Language Identification (Das and Gambäck, 2014)
● POS tagging (Jamatia et al., 2015)
● Dependency parser (Bhat, 2018) - Hindi-English!

Bengali
● the second most widely spoken language in India after Hindi (Bhatia, 1982)
● the official and national language of Bangladesh
● 261 million speakers (Ethnologue, 2018)

Similarities with Hi-EN
● dirty hands ke use se bache: Hindi (SOV) + English (SVO)
● dirty hands era use ediye chalun: Bengali + English

Data Preparation and Annotation
● 500 Bengali-English tweets from Twitter
● code-mixing ratio of 30:70 (%) (E_s = embedded, M_s = matrix)
● Universal Dependency Annotations
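The exact formula behind the code-mixing ratio did not survive on the slide. As a rough illustration only, the sketch below assumes the ratio is a per-language share of tokens, ignoring language-independent ("univ") tokens. The function name `language_shares` and this definition are assumptions, not the authors' metric. It is applied to the tagged example sentence from the Code-Mixing slide.

```python
from collections import Counter

def language_shares(tags, ignore=("univ",)):
    """Return the percentage of tokens per language tag.

    Assumption: tokens tagged with a label in `ignore` (e.g. hashtags)
    do not count toward either language.
    """
    counts = Counter(tag for tag in tags if tag not in ignore)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: 100.0 * n / total for lang, n in counts.items()}

# Tags for: kobe theke #BOSS2 er shooting start hobe
tags = ["bn", "bn", "univ", "bn", "en", "en", "bn"]
shares = language_shares(tags)

# Under this assumed definition, the language with the larger share
# is treated as the matrix language and the other as embedded.
matrix = max(shares, key=shares.get)
embedded = min(shares, key=shares.get)
print(f"{embedded}:{matrix} = {shares[embedded]:.0f}:{shares[matrix]:.0f}")
```

For this single sentence the Bengali tokens dominate, so Bengali would be the matrix language under the assumed definition.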

Code-Mixing Data Synthesis

Code-Mixing Process

ENGLISH
(NP Your self-confidence) (ADVP also) (VP increases (PP with (NP teeth)))

BENGALI
(NP daanter “teeth” jonyo “for”) (NP aapnaar “your”) (NP aatmaviswas “self-confidence” o “also”) (VP baadhe “increases”)

Chunk Harmonizer
1. Separate the coordinating conjunction
2. Combine the adverbs of degree with preceding NP
3. Convert PP to NP, separate from VP
4. Split NP at genitives

HARMONIZED ENGLISH
(NP Your) (NP self-confidence also) (VP increases) (NP with teeth)

Rule-based Chunk Replacement
● Closed Class Constraint (Sridhar and Sridhar, 1980; Joshi, 1982)
● Replace Bengali NP and JJP with English
● Retain Bengali postpositions

BENGALI-ENGLISH CM
(NP teeth er “of” jonyo “for”) (NP aapnaar “your”) (NP self-confidence also) (VP baadhe “increases”)
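The replacement rules can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the chunk dictionaries, the `CLOSED_CLASS` word list, the pre-computed English alignment, and the segmentation of "daanter" into "daant" + "er" are all assumptions made for the example.

```python
# Hypothetical closed-class list; such words are never replaced.
CLOSED_CLASS = {"aapnaar"}
# Only these Bengali chunk types are replaced with English.
REPLACEABLE = {"NP", "JJP"}

def code_mix(chunks):
    """Replace eligible Bengali chunks with their aligned English content,
    keeping the Bengali postpositions in place."""
    mixed = []
    for chunk in chunks:
        english = chunk.get("english")
        replaceable = (
            chunk["label"] in REPLACEABLE
            and english is not None
            and not any(tok in CLOSED_CLASS for tok in chunk["content"])
        )
        words = english if replaceable else chunk["content"]
        mixed.append((chunk["label"], list(words) + list(chunk["postpositions"])))
    return mixed

# Bengali chunks with assumed alignments to the harmonized English chunks.
bengali = [
    {"label": "NP", "content": ["daant"], "postpositions": ["er", "jonyo"],
     "english": ["teeth"]},
    {"label": "NP", "content": ["aapnaar"], "postpositions": [],
     "english": ["your"]},
    {"label": "NP", "content": ["aatmaviswas", "o"], "postpositions": [],
     "english": ["self-confidence", "also"]},
    {"label": "VP", "content": ["baadhe"], "postpositions": [],
     "english": ["increases"]},
]

result = code_mix(bengali)
for label, tokens in result:
    print(f"({label} {' '.join(tokens)})")
```

With these inputs the output chunks match the Bengali-English CM sentence on the slide: the possessive and the verb stay in Bengali, while the two open-class noun phrases are swapped for English.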

Synthetic Bengali-English Treebank
dirty/en hands/en era/bn use/en ediye/bn chalun/bn

Neural-Stack based Dependency Parser
● Bhat et al. (2018) for Hindi-English
● transition-based parser (Kiperwasser and Goldberg, 2016)
● Joint learning of POS and Parsing (Zhang and Weiss, 2016; Chen et al., 2016)
● enhanced by neural stacks to incorporate monolingual syntactic knowledge with the CM model
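To make "transition-based" concrete, here is a minimal sketch of the classic arc-standard transition system. It is a generic textbook illustration and not necessarily the transition system used in the parser above; the neural model's job is to predict the transition sequence, which is simply given by hand here. The function name and the toy sentence are assumptions.

```python
def apply_transitions(n_tokens, transitions):
    """Run arc-standard transitions over tokens 1..n_tokens (0 is ROOT).

    Returns a dict mapping each dependent index to its head index.
    """
    stack = [0]
    buffer = list(range(1, n_tokens + 1))
    heads = {}
    for action in transitions:
        if action == "SHIFT":
            # Move the next input token onto the stack.
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":
            # Second-from-top becomes a dependent of the top of the stack.
            if len(stack) < 2 or stack[-2] == 0:
                raise ValueError("LEFT-ARC not allowed in this configuration")
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif action == "RIGHT-ARC":
            # Top of the stack becomes a dependent of the second-from-top.
            if len(stack) < 2:
                raise ValueError("RIGHT-ARC not allowed in this configuration")
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown transition: {action}")
    return heads

# Toy sentence (not from the slides): 1=She 2=eats 3=fish
heads = apply_transitions(
    3, ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
)
print(heads)
```

Each token receives exactly one head, so a complete transition sequence yields a dependency tree rooted at index 0.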

Experiments and Results

System                                   POS    UAS    LAS
Bilingual + Gold BE                      79.39  62.78  49.38
Trilingual + Gold (BE+HE)                87.43  74.42  60.04
(Trilingual + Syn BE) + Gold (BE+HE)     89.63  76.24  61.41

Bilingual + Gold BE
● Small CM Training Data Size (140)
● Utilizes English (12k), Bengali Treebank (9k)
● Not enough CM grammar

Trilingual + Gold (BE+HE)
● + Utilizes existing BE (140), HE (1448) CM data
● + Utilizes English (12k), Bengali Treebank (9k), Hindi Treebank (11k)

(Trilingual + Syn BE) + Gold (BE+HE)
● + Utilizes Syn-BE (3643)
● + Utilizes existing BE (140), HE (1448) CM data
● + Utilizes English (12k), Bengali Treebank (9k), Hindi Treebank (11k)
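UAS and LAS are the standard attachment scores: UAS is the percentage of tokens whose predicted head is correct, and LAS additionally requires the dependency label to be correct. A minimal sketch of the computation follows; the function name and the toy gold and predicted analyses are made up for illustration and are not taken from the evaluation data.

```python
def attachment_scores(gold, predicted):
    """Compute (UAS, LAS) in percent.

    Each analysis is a list of (head_index, label) pairs, one per token.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    n = len(gold)
    # UAS: head matches. LAS: head and label both match.
    head_correct = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    both_correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * head_correct / n, 100.0 * both_correct / n

# Toy four-token sentence (illustrative values only).
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "obl")]

uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")
```

Because a labeled match requires a correct head, LAS can never exceed UAS, which is consistent with every row of the results table.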

Conclusion

Limitations
1. Error propagation, since the annotation is automatic
2. Not all cases of code-mixing are covered

Contribution
1. State-of-the-art POS tagger + Dependency Parser for Bengali-English CM (POS 89.63, UAS 76.24, LAS 61.41)
2. 500 Bengali-English UD annotated tweets
3. Synthetic-BE Data to help in other NLP CM systems

Thank You!
