Creating Large-Scale Multilingual Cognate Tables
Winston Wu and David Yarowsky
Center for Language and Speech Processing
Johns Hopkins University

http://educationviews.org/wp-content/uploads/2013/06/world-bread-cognates-panis.jpg

Cognates and Cognate Chains

Data • PanLex and Wiktionary

Cognate Table Construction • Initial cluster with unweighted edit distance • Alignment to get lexical translation probabilities • Cluster with weighted distance function
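The first step, an initial clustering with unweighted edit distance, could be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides do not say which clustering algorithm or threshold was used, so the single-link grouping, the length normalization, and the 0.45 threshold are all assumptions. The word list is the Turkic example from the slides.

```python
def edit_distance(a, b):
    """Unweighted Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance divided by the longer word's length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def single_link_clusters(entries, threshold):
    """Group (lang, word) pairs; items joined by a chain of close pairs share a cluster."""
    parent = list(range(len(entries)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(entries)):
        for j in range(i + 1, len(entries)):
            if normalized_distance(entries[i][1], entries[j][1]) <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i, entry in enumerate(entries):
        clusters.setdefault(find(i), []).append(entry)
    return list(clusters.values())


# Translations of "table" taken from the slides.
WORDS = [("tuk", "stol"), ("uig", "ustel"), ("azj", "stol"),
         ("tur", "tablo"), ("uzn", "stol"), ("tat", "ostal"),
         ("tuk", "tablisa"), ("tat", "tablis"), ("uzn", "tablista")]

for cluster in single_link_clusters(WORDS, threshold=0.45):
    print(cluster)
```

With this assumed threshold the stol-like and tablo-like forms fall into separate groups; a stricter threshold would split them further.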

Clustering • tuk: stol, uig: ustel, azj: stol, tur: tablo, uzn: stol, tat: ostal, tuk: tablisa, tat: tablis, uzn: tablista

Bitext from Clusters

eng    azj   tat     tuk      tur    uig    uzn
table  stol  -       stol     -      -      stol
table  -     ostal   -        -      ustel  -
table  -     -       -        tablo  -      -
table  -     tablis  tablisa  -      -      tablista
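One way to turn a cluster into character-level bitext for an aligner is to pair up the words from different languages and write each word as space-separated characters. The sketch below assumes that every cross-language pair within a cluster is used; the slides do not state how pairs were chosen, and `cluster_to_bitext` is a hypothetical helper name.

```python
from itertools import combinations


def cluster_to_bitext(cluster):
    """Return (src_lang, tgt_lang, src_chars, tgt_chars) for each cross-language pair.

    `cluster` is a list of (lang, word) tuples. Words are split into
    space-separated characters so that a word aligner treats each
    character as a token.
    """
    pairs = []
    for (lang_a, word_a), (lang_b, word_b) in combinations(cluster, 2):
        if lang_a == lang_b:
            continue  # skip same-language pairs (assumption)
        pairs.append((lang_a, lang_b, " ".join(word_a), " ".join(word_b)))
    return pairs


for row in cluster_to_bitext([("tat", "ostal"), ("uig", "ustel")]):
    print(row)
```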

Alignment

UIG: ü s t e l
TAT: o s t o l

t -> t 0.600    l -> l 0.747    h -> h    0.529
t -> d 0.098    l -> r 0.048    h -> u    0.150
t -> c 0.061    l -> n 0.024    h -> NULL 0.140
t -> r 0.057    l -> t 0.019    h -> l    0.048
t -> p 0.019    l -> o 0.018    h -> a    0.032
t -> s 0.017    l -> d 0.016    h -> j    0.019
t -> l 0.017    l -> c 0.015    h -> o    0.017
t -> n 0.015    l -> a 0.015    h -> k    0.015
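Character translation probabilities like these can feed a weighted edit distance in which likely substitutions are cheap. The sketch below assumes a substitution cost of 1 - p and a fixed insertion/deletion cost of 1.0; neither choice comes from the slides, which do not give the cost formula. The probabilities are a few entries copied from the table above.

```python
def weighted_edit_distance(src, tgt, sub_prob, indel_cost=1.0):
    """Edit distance where substituting s -> t costs 1 - P(s -> t).

    `sub_prob` maps (src_char, tgt_char) to a probability. Pairs missing
    from the table fall back to cost 0 for identical characters and cost 1
    for different ones (an assumption made for this sketch).
    """
    prev = [j * indel_cost for j in range(len(tgt) + 1)]
    for i, cs in enumerate(src, start=1):
        curr = [i * indel_cost]
        for j, ct in enumerate(tgt, start=1):
            p = sub_prob.get((cs, ct), 1.0 if cs == ct else 0.0)
            curr.append(min(prev[j - 1] + (1.0 - p),    # substitution or match
                            prev[j] + indel_cost,       # deletion
                            curr[j - 1] + indel_cost))  # insertion
        prev = curr
    return prev[-1]


# A few entries copied from the slide's probability table.
SUB_PROB = {
    ("t", "t"): 0.600, ("t", "d"): 0.098,
    ("l", "l"): 0.747, ("l", "r"): 0.048,
}

print(weighted_edit_distance("tl", "tl", SUB_PROB))
print(weighted_edit_distance("t", "d", SUB_PROB))
```

Note that under this cost an identical pair such as t -> t still costs 0.4, because the table assigns it only 0.600 probability. A negative log probability would be another reasonable cost.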

Clustering Distance Function • Language-pair-specific edit distance • Intra-family edit distance • Same backtranslation • Same POS • Same MeaningID
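The listed features could be combined into a single distance in many ways. The sketch below uses a weighted sum, which is an assumption: the slides name the features but not how they are combined or weighted. `Entry`, `combined_distance`, the feature names, and the weights are all hypothetical, and the two edit-distance components are passed in as functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    lang: str
    word: str
    backtranslation: str
    pos: str
    meaning_id: str


def combined_distance(a, b, pair_dist, family_dist, weights):
    """Weighted sum of the five features (hypothetical combination).

    `pair_dist` and `family_dist` are callables returning the
    language-pair-specific and intra-family edit distances for two entries.
    Each categorical feature contributes 1.0 when the entries disagree.
    """
    features = {
        "pair_edit": pair_dist(a, b),
        "family_edit": family_dist(a, b),
        "diff_backtranslation": float(a.backtranslation != b.backtranslation),
        "diff_pos": float(a.pos != b.pos),
        "diff_meaning_id": float(a.meaning_id != b.meaning_id),
    }
    return sum(weights[name] * value for name, value in features.items())


# Illustrative call with constant stand-in distances and equal weights.
a = Entry("uig", "ustel", "table", "noun", "m1")
b = Entry("tat", "ostal", "table", "noun", "m2")
weights = {name: 0.2 for name in ("pair_edit", "family_edit",
                                  "diff_backtranslation", "diff_pos",
                                  "diff_meaning_id")}
print(combined_distance(a, b, lambda x, y: 0.4, lambda x, y: 0.5, weights))
```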

Cognate Tables

Experiments • Hold out words • Use machine translation (MT) to predict the held-out words • Single language pair and system combination • Evaluate on 1-best accuracy, 10-best accuracy, and mean reciprocal rank (MRR)
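The three evaluation measures can be computed from ranked prediction lists as in the sketch below. The data layout (a dict from held-out item to ranked candidates) and the function name `evaluate` are assumptions for illustration; a missing gold word contributes a reciprocal rank of 0.

```python
def evaluate(predictions, gold, k=10):
    """Compute 1-best accuracy, k-best accuracy, and mean reciprocal rank.

    `predictions` maps each held-out item to a ranked list of candidate
    words (best first); `gold` maps each item to its correct word.
    """
    if not gold:
        raise ValueError("gold must contain at least one item")
    top1 = topk = 0
    reciprocal_rank_sum = 0.0
    for item, truth in gold.items():
        ranked = predictions.get(item, [])
        if ranked[:1] == [truth]:
            top1 += 1
        if truth in ranked[:k]:
            topk += 1
        if truth in ranked:
            reciprocal_rank_sum += 1.0 / (ranked.index(truth) + 1)
    n = len(gold)
    return {"1-best": top1 / n, "k-best": topk / n, "mrr": reciprocal_rank_sum / n}


# Toy example: one item correct at rank 1, the other at rank 2.
gold = {"a": "x", "b": "y"}
predictions = {"a": ["x", "z"], "b": ["z", "y"]}
print(evaluate(predictions, gold))
```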

Results: Romance

Results: Turkic

Conclusion • Cluster-alignment-cluster process for multilingual cognate table construction • Experiments • 1-best exact match accuracy is hard! • Close languages tend to do better • Data size matters • Code and data at github.com/wswu/coglust
