Automatic construction of distributional thesaurus (for multiple languages)
Zheng ZHANG, 1st-year PhD student, ILES, LIMSI
28/03/2017

What is a distributional thesaurus?
• For a given input, a distributional thesaurus identifies semantically similar words based on the assumption that they share a similar distribution.
• Distributional assumption: in practice, two words are considered similar if their occurrences share similar contexts.
Ref. Vincent Claveau, Ewa Kijak. Distributional Thesauri for Information Retrieval and vice versa.

Why do we need it?
• It is useful for alleviating data sparseness in many NLP applications.
• It is useful for completing lexical resources.
Ref. Enrique Henestroza Anguiano, Pascal Denis. FreDist: Automatic construction of distributional thesauri for French.

Contexts
• These contexts are typically co-occurring words in a limited window around the considered words, or syntactically linked words.
Ref. http://nlp.stanford.edu:8080/corenlp/process
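
The distributional assumption with window-based contexts can be illustrated with a minimal sketch. It assumes raw co-occurrence counts as context vectors and cosine similarity as the comparison measure; neither choice is stated on the slides, and the toy corpus and the function names are made up for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, top_n=3):
    """Rank the other words by similarity of their contexts to those of `word`."""
    target = vectors.get(word, Counter())
    scores = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Toy corpus (illustrative only).
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]
vectors = context_vectors(corpus)
print(most_similar("cat", vectors))
```

A real thesaurus would normally weight the counts (for example with PMI) and use a much larger corpus; syntactic contexts would replace the window loop with dependency relations from a parser.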

A new context: Graph-of-words
• A graph whose vertices represent unique terms of the document and whose edges represent co-occurrences between the terms within a fixed-size sliding window.
• “This is an example about how to generate a graph.” (window size=4)
Ref. Rousseau F., Vazirgiannis M. (2015) Main Core Retention on Graph-of-Words for Single-Document Keyword Extraction.
https://safetyapp.shinyapps.io/GoWvis/
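
As a rough sketch of the construction described above, the example sentence can be turned into an undirected, unweighted graph-of-words. The sketch assumes that a window of size 4 links each term to the 3 terms that follow it, and that no preprocessing (stop-word removal, stemming) is applied; the slides do not specify either point.

```python
from collections import defaultdict


def graph_of_words(tokens, window=4):
    """Build an undirected graph-of-words as a dict: term -> set of neighbouring terms."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        graph[term]  # make sure every term is a vertex, even without edges
        # A window of size `window` spans the term itself plus the next window-1 terms.
        for other in tokens[i + 1 : i + window]:
            if other != term:  # no self-loops for repeated words
                graph[term].add(other)
                graph[other].add(term)
    return dict(graph)


tokens = "this is an example about how to generate a graph".split()
gow = graph_of_words(tokens, window=4)
print(sorted(gow["this"]))  # ['an', 'example', 'is']
```

A directed variant (to keep word order) would only add the edge from `term` to `other`, and a weighted variant would count repeated co-occurrences instead of storing sets.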

Graph attributes: K-core
• A subgraph H_k = (Vʹ, Eʹ), induced by the subset of vertices Vʹ ⊆ V (and a fortiori by the subset of edges Eʹ ⊆ E), is called a k-core or a core of order k iff ∀ v ∈ Vʹ, deg_{H_k}(v) ≥ k and H_k is the maximal subgraph with this property, i.e. it cannot be augmented without losing this property.
• In other words, the k-core of a graph corresponds to the maximal connected subgraph whose vertices are at least of degree k within the subgraph.
Ref. Rousseau F., Vazirgiannis M. (2015) Main Core Retention on Graph-of-Words for Single-Document Keyword Extraction.
Text Mining – an introduction, Michalis Vazirgiannis, 2017 Data Science Winter School, Beijing, China
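
The definition above can be made concrete with the standard peeling procedure: repeatedly delete vertices whose degree is below k until none remain. The sketch below is illustrative and not taken from the slides; it assumes an undirected graph without self-loops, stored as a dict of neighbour sets in which every vertex appears as a key, and it does not enforce connectivity of the result.

```python
def k_core(graph, k):
    """Return the vertices of the k-core of an undirected graph (dict: vertex -> neighbour set)."""
    adj = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    queue = [v for v, nbrs in adj.items() if len(nbrs) < k]
    removed = set()
    while queue:
        v = queue.pop()
        if v in removed:
            continue
        removed.add(v)
        # Deleting v lowers the degree of its remaining neighbours.
        for u in adj[v]:
            adj[u].discard(v)
            if u not in removed and len(adj[u]) < k:
                queue.append(u)
        adj[v].clear()
    return set(adj) - removed


def main_core(graph):
    """Return (k, vertices) for the non-empty core of highest order."""
    k, core = 0, set(graph)
    while True:
        candidate = k_core(graph, k + 1)
        if not candidate:
            return k, core
        k, core = k + 1, candidate


# Toy graph: a triangle a-b-c with a pendant vertex d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
k, core = main_core(graph)
print(k, sorted(core))  # 2 ['a', 'b', 'c']
```

Recomputing each core from scratch is quadratic in the worst case; a linear-time core decomposition exists, but the simple version keeps the link to the definition visible.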

Why might graph-of-words be a good choice?
• Graph-of-words:
  • Takes into account word co-occurrence and, optionally, word order (compared with bag-of-words).
• K-core:
  • Within one core, all neighborhoods contribute equally to the subgraph (compared with centrality, which is used in PageRank & HITS).
  • K-cores are adaptive.
  • It has been shown that the main core performs well in information retrieval.
Ref. Rousseau F., Vazirgiannis M. (2015) Main Core Retention on Graph-of-Words for Single-Document Keyword Extraction.

Difficulty: optimization for Big data
• Texts: multiprocessing
  • Encode text with local ids
  • Merge local id-word dictionaries to get a universal id-word dictionary
  • Transfer locally encoded text
• “MapReduce-like” multiprocessing to prepare edge files
  • “This is an example about how to generate a graph.” (window size=2)
  • Edges of window size n = edges of distance 2 + … + edges of distance n
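
The steps on this slide can be sketched as follows, under several assumptions that the slides do not confirm: each text chunk is encoded independently (shown sequentially here; the per-chunk calls are the part that could be handed to worker processes, for instance with `multiprocessing.Pool.map`), "transfer" is read as re-encoding local ids into universal ids, and "distance d" is read as two tokens that are d - 1 positions apart, so that distance 2 means adjacent tokens. All function names are hypothetical.

```python
from collections import Counter


def encode_locally(tokens):
    """Encode one chunk with ids local to that chunk; return (ids, local id -> word dict)."""
    word_to_id, encoded = {}, []
    for word in tokens:
        encoded.append(word_to_id.setdefault(word, len(word_to_id)))
    return encoded, {i: w for w, i in word_to_id.items()}


def merge_dictionaries(local_dicts):
    """Merge local dictionaries into one word -> universal id dict plus per-chunk id tables."""
    universal, tables = {}, []
    for local in local_dicts:
        tables.append({lid: universal.setdefault(w, len(universal)) for lid, w in local.items()})
    return universal, tables


def edges_at_distance(encoded, distance):
    """Count (source, target) id pairs that are `distance - 1` positions apart."""
    return Counter(zip(encoded, encoded[distance - 1:]))


def edges_for_window(encoded, window):
    """Edges of window size n = edges of distance 2 + ... + edges of distance n."""
    total = Counter()
    for distance in range(2, window + 1):
        total += edges_at_distance(encoded, distance)
    return total


# Toy chunks (illustrative only).
chunks = ["this is an example".split(), "an example about a graph".split()]
encoded_chunks, local_dicts = zip(*(encode_locally(chunk) for chunk in chunks))
universal, tables = merge_dictionaries(local_dicts)
# Re-encode every chunk with universal ids before counting edges.
universal_chunks = [[table[i] for i in ids] for ids, table in zip(encoded_chunks, tables)]
edge_counts = sum((edges_for_window(ids, window=3) for ids in universal_chunks), Counter())
print(edge_counts.most_common(3))
```

Storing edges per distance means that a graph for a larger window can reuse the edge files already computed for smaller windows, which is presumably the point of the decomposition on the slide.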

Multiple languages (ideas)
• Use a small dictionary to generate a mixed text
• Find common graph patterns across multiple languages
Ref. Stephan Gouws, Anders Søgaard. Simple task-specific bilingual word embeddings.
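
One possible reading of the first idea is to replace some words of a monolingual text with their translations, so that words of both languages end up sharing contexts. The sketch below is only a guess at what the bullet means, not the author's method; the replacement probability, the toy English-French dictionary, and the function name are all assumptions.

```python
import random


def mix_text(tokens, bilingual_dict, replace_prob=0.5, rng=None):
    """Replace each token that has a dictionary entry by a random translation,
    with probability `replace_prob`; other tokens are kept unchanged."""
    rng = rng or random.Random(0)  # fixed seed by default, for reproducibility
    mixed = []
    for token in tokens:
        translations = bilingual_dict.get(token)
        if translations and rng.random() < replace_prob:
            mixed.append(rng.choice(translations))
        else:
            mixed.append(token)
    return mixed


# Toy dictionary (illustrative only).
en_fr = {"cat": ["chat"], "dog": ["chien"], "milk": ["lait"]}
print(mix_text("the cat drinks milk".split(), en_fr, replace_prob=0.5))
```

The mixed text could then be fed to the same graph-of-words pipeline as a monolingual corpus.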

Future work
• word2vec: GoW model architecture
• Using graph-of-words for other tasks (e.g. identifying parallel sentences in comparable corpora, BUCC2017 shared task)
• From distributional thesaurus to semantic classes

Thank you
