Cross-Lingual Part-of-Speech Tagging through Ambiguous Learning


  1. 1/27 Cross-Lingual Part-of-Speech Tagging through Ambiguous Learning. Guillaume Wisniewski, Nicolas Pécheux, Souhir Gahbiche-Braham, François Yvon. Université Paris-Sud & LIMSI-CNRS. October 28, 2014

  2. 2/27 Context ▶ Supervised Machine Learning techniques have established new performance standards for many NLP tasks ▶ Success crucially depends on the availability of annotated in-domain data ▶ Not so common situation (e.g. under-resourced languages) ▶ What can we do then?

  3. 3/27 Context ▶ Unsupervised learning ▶ Crawl data (e.g. Wiktionary)

  4. 4/27 Context ▶ Cross-lingual transfer (weakly supervised learning) [Figure: the resource-rich language sentence "Making a Market for Scientific Research" aligned with the less-resourced language sentence "Un marché pour la recherche scientifique"; tokens carry POS tags such as VERB, DET, NOUN and ADP]

  6. 5/27 State of the art ▶ In most cases this only results in partially annotated data ▶ Alternative ML techniques need to be designed ▶ Partially observed CRF [Täckström et al., 2013] ▶ Posterior regularization [Ganchev and Das, 2013] ▶ Expectation maximization [Wang and Manning, 2014]

  7. 6/27 Contributions 1. We cast this problem in the framework of ambiguous learning [Bordes et al., 2010, Cour et al., 2011] 2. We present a novel method to learn from ambiguous supervision data 3. We show significant improvements over prior state of the art 4. We conduct a detailed analysis that allows us to identify the limits of transfer-based methods and their evaluation

  8. 7/27 Part I Projecting Labels across Aligned Corpora

  9. 8/27 Hypothesis ▶ Strong assumption: syntactic categories in the source language can be directly related to the ones in the target language ▶ In this work we focus on POS tagging ▶ Universal tagset [Petrov et al., 2012]: { Noun, Verb, Adj, Adv, Pron, Det, Adp, Num, Conj, Prt, ‘.’, X } ▶ All annotations are mapped to this universal tagset
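
The mapping step on this slide can be sketched in Python as below. This is only an illustration: the table is a small subset of a Penn Treebank to universal tagset mapping, and the function names and the fallback to `X` for unknown tags are assumptions, not something the slides specify.

```python
# The twelve universal tags listed on the slide.
UNIVERSAL_TAGS = {
    "NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
    "ADP", "NUM", "CONJ", "PRT", ".", "X",
}

# Illustrative subset of a Penn Treebank -> universal mapping
# (assumption: a real mapping table covers every source tag).
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN",
    "VB": "VERB", "VBD": "VERB",
    "JJ": "ADJ", "RB": "ADV",
    "PRP": "PRON", "DT": "DET",
    "IN": "ADP", "CD": "NUM",
    "CC": "CONJ", "RP": "PRT",
    ".": ".",
}


def to_universal(tag):
    """Map a fine-grained tag to a universal tag; unknown tags fall back to X (assumption)."""
    return PTB_TO_UNIVERSAL.get(tag, "X")


def map_sentence(tagged_words):
    """Convert a list of (word, fine-grained tag) pairs to (word, universal tag) pairs."""
    return [(word, to_universal(tag)) for word, tag in tagged_words]
```

For instance, `map_sentence([("walked", "VBD")])` yields the pair `("walked", "VERB")`.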

  10. 9/27 Type and token constraints [Täckström et al., 2013] 1. type constraints from a dictionary ... 2. token constraints projected through alignment links ... Transfer-based methods only deliver partial and noisy supervision ▶ Heuristic filtering rules [Yarowsky et al., 2001] ▶ Graph-based projection [Das and Petrov, 2011] ▶ Combine with monolingual information

  11. 10/27 Type constraints ▶ From tag dictionaries automatically extracted from Wiktionary ▶ Built from the projected labels across the aligned corpora ⇒ We use the intersection of the two above [Figure: tag dictionary entries for "market", "marché" and "walked", with candidate tags NOUN and VERB]
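
A minimal Python sketch of how the two dictionaries might be intersected is given below. The slide only says that the intersection is used, so several details here are assumptions: words are lowercased, only words present in both dictionaries receive a constraint, and empty intersections are dropped. The function names and the toy entries are hypothetical.

```python
from collections import defaultdict


def projected_dictionary(projected_tokens):
    """Collect, for each word type, every tag projected onto one of its tokens."""
    tags_by_word = defaultdict(set)
    for word, tag in projected_tokens:
        tags_by_word[word.lower()].add(tag)
    return dict(tags_by_word)


def type_constraints(wiktionary, projected):
    """Keep, for each word found in both dictionaries, the tags they agree on.

    Assumption: words missing from either dictionary, and words whose
    intersection is empty, are left unconstrained (absent from the result).
    """
    constraints = {}
    for word in wiktionary.keys() & projected.keys():
        common = wiktionary[word] & projected[word]
        if common:
            constraints[word] = common
    return constraints


# Toy data (hypothetical): a Wiktionary-style dictionary and projected labels.
wiktionary = {"marché": {"NOUN", "VERB"}, "walked": {"VERB"}}
projected = projected_dictionary(
    [("marché", "NOUN"), ("Marché", "NOUN"), ("la", "DET")]
)
constraints = type_constraints(wiktionary, projected)
```

With this toy data only "marché" appears in both dictionaries, so it is the only constrained word and its single licensed tag is NOUN.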

  14. 11/27 Token constraints 1. Use the type constraints 2. Use the alignment links from the parallel corpora 3. Tag the source side (resource-rich) 4. Project labels if licensed by type constraints [Figure: "Making a Market for Scientific Research" aligned with "Un marché pour la recherche scientifique"; each target token shows its candidate tags (PRON, NOUN, DET, ADJ, VERB, ADP), narrowed by the projected source tags]
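
The four steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the source tags, the alignment links and the type dictionary are made-up toy values, a word absent from the dictionary is treated as unconstrained, and when several links reach the same target token the first licensed projection wins.

```python
UNIVERSAL_TAGS = {
    "NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
    "ADP", "NUM", "CONJ", "PRT", ".", "X",
}


def token_constraints(target_words, source_tags, alignment, type_dict):
    """Return one set of allowed tags per target token.

    alignment is a list of (source index, target index) links.
    """
    # Step 1: start from the type constraints (all tags if the word is unknown).
    allowed = [set(type_dict.get(w.lower(), UNIVERSAL_TAGS)) for w in target_words]
    # Steps 2-4: follow each alignment link and project the source tag,
    # but only if the type constraints license it.
    for src_i, tgt_j in alignment:
        tag = source_tags[src_i]
        if tag in allowed[tgt_j]:
            allowed[tgt_j] = {tag}
    return allowed


# Toy example (hypothetical tags, links and dictionary entries).
source_tags = ["VERB", "DET", "NOUN", "ADP", "ADJ", "NOUN"]  # Making a Market for Scientific Research
target_words = ["Un", "marché", "pour", "la", "recherche", "scientifique"]
alignment = [(1, 0), (2, 1), (3, 2), (5, 4), (4, 5)]  # "la" is left unaligned
type_dict = {
    "un": {"DET", "PRON", "NOUN"},
    "marché": {"NOUN", "VERB"},
    "la": {"DET", "PRON", "NOUN"},
    "scientifique": {"ADJ", "NOUN"},
}
result = token_constraints(target_words, source_tags, alignment, type_dict)
```

In this toy run every aligned token ends up with a single tag, while the unaligned "la" keeps its three dictionary candidates. That remaining ambiguity is the partial supervision the talk refers to.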

  18. 12/27 Part II Modeling Sequences under Ambiguous Supervision
