Chinese Hypernym-Hyponym Extraction from User Generated Categories
Chengyu Wang, Xiaofeng He
School of Computer Science and Software Engineering, East China Normal University, Shanghai, China
Outline
• Introduction
• Background and Related Work
• Proposed Approach
• Experiments
• Conclusion
Chinese Is-A Relation Extraction
• Chinese is-a relation extraction
  – Chinese is-a relations are essential for constructing large-scale Chinese taxonomies and knowledge graphs.
  – Such relations are difficult to extract due to the flexibility of Chinese language expression.
• User generated categories
  – User generated categories are valuable knowledge sources, providing fine-grained candidate hypernyms of entities.
  – The semantic relations between an entity and its categories are not explicit.
Baidu Baike: one of the largest online encyclopedias in China, with 13M+ entries
• Example entry: Barack Obama
• Categories: Political figure, Foreign country, Leader, Person
The Task
• Distinguishing is-a and not-is-a relations between Chinese words/phrases
• Example (figure): Barack Obama, with categories Political figure (is-a), Leader (is-a), Person (is-a), and Foreign country (not-is-a)
Outline
• Introduction
• Background and Related Work
• Proposed Approach
• Experiments
• Conclusion
Background
• Taxonomy: a hierarchical type system for knowledge graphs, consisting of is-a relations among classes and entities
• Example (figure): classes such as Person and Country, subclasses such as Political Leader, Scientist, and Developed Country, with entities attached beneath them
Describing the Task
• Learning is-a relations for taxonomy expansion
• (Figure) A learning algorithm takes an existing taxonomy (Person, Country, Political Leader, Scientist, Developed Country) and expands it with new entities and is-a links
• Key challenge: identifying is-a relations from user generated categories
Modeling the Task
• Taxonomy
  – Directed acyclic graph $G = (E, R)$ ($E$: entities/classes, $R$: is-a relations)
• User generated categories
  – Collection of entities $E^*$
  – Set of user generated categories $Cat(e)$ for each $e \in E^*$
• Goal
  – Predict whether there is an is-a relation between $e$ and $c$, where $e \in E^*$ and $c \in Cat(e)$, based on the taxonomy $G$
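To make the setup concrete, here is a minimal Python sketch of the data model; the class and variable names are illustrative, not taken from the authors' code:

```python
# A minimal sketch of the task setup (illustrative names only).
from collections import defaultdict

class Taxonomy:
    """Taxonomy G = (E, R): nodes are entities/classes, edges are is-a relations."""
    def __init__(self):
        self.hypernyms = defaultdict(set)  # node -> set of direct hypernyms

    def add_isa(self, hyponym, hypernym):
        self.hypernyms[hyponym].add(hypernym)

# User generated categories: Cat(e) for each entity e in E*.
categories = {
    "Barack Obama": ["Political figure", "Foreign country", "Leader", "Person"],
}

# Goal: for every pair (e, c) with c in Cat(e), predict is-a vs. not-is-a.
candidate_pairs = [(e, c) for e, cats in categories.items() for c in cats]
print(candidate_pairs)
```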
Previous Approaches
• Pattern matching based approaches
  – Handcrafted patterns: high accuracy, low coverage
    • Hearst patterns: NP₁ such as NP₂
  – Automatically generated patterns: higher coverage, lower accuracy
  – Not suitable for Chinese, with its flexible expression
• Thesauri and encyclopedia based approaches
  – Taxonomy construction based on existing knowledge sources
    • YAGO: Wikipedia + WordNet
  – More precise, but limited in scope by the underlying sources
  – Chinese is relatively low-resourced: no Chinese version of WordNet or Freebase is available
Previous Approaches
• Text inference based approaches
  – Infer relations using distributional similarity measures
    • Assumption: a hyponym can appear in only some of the contexts of its hypernym, while a hypernym can appear in all contexts of its hyponyms
  – Not suitable for Chinese, with its flexible and sparse contexts
• Word embedding based approaches
  – Represent words as dense, low-dimensional vectors
  – Learn semantic projection models from hyponyms to hypernyms
  – State-of-the-art approach for Chinese is-a relation extraction (ACL'14)
(Figures taken from Mikolov et al., 2013)
Learning from Previous Work
• Lessons learned from the state of the art
  – Use word embeddings to represent words
  – Learn relations between hyponyms and hypernyms in the embedding space
• Basic approaches
  – Vector offsets
  – Linear projection (see the sketch below)
(Figures taken from Mikolov et al., 2013)
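As a concrete illustration of the linear projection baseline, here is a minimal sketch that fits a matrix M and offset b so that M v(x) + b approximates v(y) over known (hyponym x, hypernym y) pairs; it assumes X and Y are (n, d) NumPy arrays of word vectors:

```python
# Minimal linear projection baseline: learn M and b so that M v(x) + b ≈ v(y)
# over known (hyponym, hypernym) training pairs, via ordinary least squares.
import numpy as np

def fit_projection(X, Y):
    # Append a bias column so the offset b is learned jointly with M.
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)  # least-squares fit, W is (d+1, d)
    return W[:-1].T, W[-1]                       # M with shape (d, d), b with shape (d,)

def predict_hypernym_vector(M, b, x):
    return M @ x + b                             # projected hypernym vector for x
```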
Observations
• Word vector offsets between Chinese is-a pairs
  – Multiple linguistic regularities may exist among is-a pairs:
    • Different levels of hypernyms
    • Different types of is-a relations (instanceOf vs. subClassOf)
    • Different domains
Outline
• Introduction
• Background and Related Work
• Proposed Approach
• Experiments
• Conclusion
General Framework
• Initial stage
  – Train piecewise linear projection models based on the Chinese taxonomy
• Iterative learning stage
  – Extract new is-a relations and adjust model parameters with an incremental learning approach
  – Use Chinese hypernym/hyponym patterns to prevent "semantic drift" in each iteration
Initial Model Training
• Linear projection model
  – Projection model: $M\,\vec{v}(x_i) + \vec{b} = \vec{v}(y_i)$
    ($M$: projection matrix, $\vec{v}(\cdot)$: word vector, $\vec{b}$: offset vector)
• Piecewise linear projection model (a training sketch follows below)
  – Partition a collection of is-a relations $R^+ \subset R^*$ into $K$ clusters $(C_1, \dots, C_k, \dots, C_K)$ by their word vector offsets
  – All pairs in cluster $C_k$ share the projection matrix $M_k$ and offset vector $\vec{b}_k$
  – Optimization function:
    $$J(M_k, \vec{b}_k; C_k) = \frac{1}{|C_k|} \sum_{(x_i, y_i) \in C_k} \left\| M_k\,\vec{v}(x_i) + \vec{b}_k - \vec{v}(y_i) \right\|^2$$
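A sketch of the initial training stage, assuming pairs are clustered by their offset vectors as above; scikit-learn's KMeans stands in here for whichever clustering method the authors used:

```python
# Piecewise linear projection sketch: cluster training pairs by offset vectors,
# then fit one (M_k, b_k) per cluster by least squares, minimizing J(M_k, b_k; C_k).
import numpy as np
from sklearn.cluster import KMeans

def fit_piecewise(X, Y, K):
    offsets = X - Y                                # offset vectors of is-a pairs
    labels = KMeans(n_clusters=K, n_init=10).fit_predict(offsets)
    models, centroids = {}, {}
    for k in range(K):
        Xk, Yk = X[labels == k], Y[labels == k]
        X1 = np.hstack([Xk, np.ones((len(Xk), 1))])
        W, *_ = np.linalg.lstsq(X1, Yk, rcond=None)
        models[k] = (W[:-1].T, W[-1])              # (M_k, b_k) for cluster C_k
        centroids[k] = offsets[labels == k].mean(axis=0)
    return models, centroids
```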
Iterative Learning (1)
• Initialization
  – Word pairs: positive is-a set $R^+$, unlabeled set $U$
  – Model parameters: $M_k$ and $\vec{b}_k$ for each cluster
• Iterative process ($t = 1, \dots, T$); a code skeleton of steps 1–4 follows below
  1. Sample $\delta|U|$ word pairs from $U$, denoted $U^{(t)}$.
  2. Use the current model to predict the relation of each pair; denote the word pairs predicted "positive" as $U_P^{(t)}$.
  3. Use the pattern-based relation selection method to select a high-confidence subset $U_S^{(t)}$ from $U_P^{(t)}$.
  4. Remove $U_S^{(t)}$ from $U$ and add it to $R^+$.
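A skeleton of one iteration, steps 1–4; the helpers `predict_isa` and `select_by_patterns` are hypothetical stand-ins for the prediction and selection routines sketched on the neighboring slides:

```python
# Skeleton of one learning iteration (steps 1-4); helper functions are
# hypothetical stand-ins, not the authors' actual routines.
import random

def one_iteration(U, R_pos, models, delta, predict_isa, select_by_patterns):
    U_t = random.sample(sorted(U), int(delta * len(U)))        # 1. sample δ|U| pairs
    U_P = [pair for pair in U_t if predict_isa(models, pair)]  # 2. predicted positives
    U_S = select_by_patterns(U_P)                              # 3. high-confidence subset
    U.difference_update(U_S)                                   # 4. move U_S from U into R+
    R_pos.update(U_S)
    return U_S
```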
Iterative Learning (2)
• Iterative process ($t = 1, \dots, T$), continued
  5. Update cluster centroids incrementally based on $U_S^{(t)}$ (see the sketch below):
     $$\vec{c}_k^{(t+1)} = \vec{c}_k^{(t)} + \lambda^{(t)} \cdot \frac{1}{|U_k^{(t)}|} \sum_{(x_i, y_i) \in U_k^{(t)}} \left( \vec{v}(x_i) - \vec{v}(y_i) - \vec{c}_k^{(t)} \right)$$
     ($\vec{c}_k^{(t+1)}$: new centroid; $\vec{c}_k^{(t)}$: old centroid; $\lambda^{(t)}$: learning rate of the centroid shift; $U_k^{(t)}$: newly selected pairs assigned to cluster $k$; the summand is each pair's offset distance from the centroid)
  6. Update model parameters based on the new cluster assignments:
     $$J\left(M_k^{(t)}, \vec{b}_k^{(t)}; C_k^{(t)}\right) = \frac{1}{|C_k^{(t)}|} \sum_{(x_i, y_i) \in C_k^{(t)}} \left\| M_k^{(t)}\,\vec{v}(x_i) + \vec{b}_k^{(t)} - \vec{v}(y_i) \right\|^2$$
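A sketch of the incremental centroid update in step 5, following the decoded formula above: each centroid is shifted toward the mean offset of the newly selected pairs assigned to it, scaled by the learning rate:

```python
# Incremental centroid update (step 5), per the formula on this slide.
import numpy as np

def update_centroid(c_k, new_pairs_k, lam):
    """c_k: current centroid, shape (d,); new_pairs_k: list of (v_x, v_y) vector
    pairs newly assigned to cluster k; lam: learning rate λ^(t)."""
    if not new_pairs_k:
        return c_k
    shift = np.mean([vx - vy - c_k for vx, vy in new_pairs_k], axis=0)
    return c_k + lam * shift
```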
Iterative Learning (3)
• Model prediction
  – The predictions of the final piecewise linear projection models
  – The transitive closure of existing is-a relations (see the sketch below)
• Discussion
  – Combination of semantic and lexical extraction of is-a relations
    • Semantic level: word embedding based projection models
    • Lexical level: pattern-based relation selection
  – Incremental learning
    • Update of cluster centroids
    • Update of model parameters
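An illustrative fixpoint computation of the transitive closure step (quadratic per pass; fine as a sketch, not tuned for web-scale data):

```python
# Close the extracted is-a set under transitivity:
# x is-a y and y is-a z implies x is-a z.
def transitive_closure(isa_pairs):
    closure = set(isa_pairs)
    changed = True
    while changed:
        changed = False
        for x, y in list(closure):
            for y2, z in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure

# Example: ("Obama", "Political Leader") and ("Political Leader", "Person")
# also yield ("Obama", "Person").
```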
Pattern-based Relation Selection (1)
• Two observations, based on example Chinese hypernym/hyponym patterns (a matching sketch follows below)
  – Positive evidence
    • Is-A patterns
    • Such-As patterns
    • Hypothesis: $x_i$ (or $x_j$) is-a $y$
  – Negative evidence
    • Such-As patterns
    • Co-Hyponym patterns
    • Hypothesis: $x_i$ not-is-a $x_j$, and $x_j$ not-is-a $x_i$
• Examples of Chinese hypernym/hyponym patterns:
  – Is-A (between $x_i$/$x_j$ and $y$): $x_i$ 是一个 $y$ ("$x_i$ is a kind of $y$")
  – Such-As (between $x_i$/$x_j$ and $y$): $y$，例如 $x_i$、$x_j$ ("$y$, such as $x_i$ and $x_j$")
  – Co-Hyponym (between $x_i$ and $x_j$): $x_i$、$x_j$ 等 ("$x_i$, $x_j$ and others")
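A rough sketch of how such patterns can be matched in raw text, using simplified regular expressions that cover only the slide's example surface forms:

```python
# Simplified pattern matchers for the three pattern categories above.
import re

ISA_PAT = re.compile(r"(\w+)是一[个种](\w+)")           # "x is a (kind of) y"
SUCHAS_PAT = re.compile(r"(\w+)[,，]例如(\w+)、(\w+)")   # "y, such as x1, x2"
COHYP_PAT = re.compile(r"(\w+)、(\w+)等")                # "x1, x2 and others"

def match_evidence(sentence):
    pos, neg = [], []
    for x, y in ISA_PAT.findall(sentence):
        pos.append((x, y))                 # hypothesis: x is-a y
    for y, x1, x2 in SUCHAS_PAT.findall(sentence):
        pos += [(x1, y), (x2, y)]          # hypothesis: x1/x2 is-a y
    for x1, x2 in COHYP_PAT.findall(sentence):
        neg.append((x1, x2))               # hypothesis: x1 not-is-a x2 (co-hyponyms)
    return pos, neg
```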
Pattern-based Relation Selection (2)
• Positive and negative evidence scores ($d^{(t)}$: prediction distance under the current model; $n_1$, $n_2$: corpus counts of positive and negative pattern matches; $\gamma$: smoothing)
  – Positive score, combining the confidence of the model prediction with statistics of "positive" patterns:
    $$PS(x_i, y_i) = \alpha \left( 1 - \frac{d^{(t)}(x_i, y_i)}{\max_{(x,y) \in U_P^{(t)}} d^{(t)}(x, y)} \right) + (1 - \alpha) \cdot \frac{n_1(x_i, y_i) + \gamma}{\max_{(x,y) \in U_P^{(t)}} n_1(x, y) + \gamma}$$
  – Negative score:
    $$NS(x_i, y_i) = \log \frac{n_2(x_i, y_i) + \gamma}{(n_2(x_i) + \gamma) \cdot (n_2(y_i) + \gamma)}$$
• Relation selection via optimization (a scoring sketch follows below)
  – Target: select $m$ word pairs from $U_P^{(t)}$ to generate $U_S^{(t)}$:
    $$\max \sum_{(x_i, y_i) \in U_S^{(t)}} PS(x_i, y_i) \quad \text{s.t.} \quad NS(x_i, y_i) < \theta,\; U_S^{(t)} \subseteq U_P^{(t)},\; |U_S^{(t)}| = m$$
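A sketch of the two scores and a simple greedy approximation of the selection step (the authors' actual selection algorithm appears on the next slide; α, γ, θ, and m are the weight, smoothing, threshold, and size parameters):

```python
# Evidence scores and a greedy approximation of the constrained selection.
import math

def positive_score(pair, d, n1, alpha, gamma):
    d_max = max(d.values())                       # normalizers over U_P^(t)
    n1_max = max(n1.values(), default=0)
    return (alpha * (1 - d[pair] / d_max)
            + (1 - alpha) * (n1.get(pair, 0) + gamma) / (n1_max + gamma))

def negative_score(pair, n2_pair, n2_word, gamma):
    x, y = pair
    return math.log((n2_pair.get(pair, 0) + gamma)
                    / ((n2_word.get(x, 0) + gamma) * (n2_word.get(y, 0) + gamma)))

def select_pairs(U_P, ps, ns, theta, m):
    """ps, ns: precomputed score dicts; keep the m admissible pairs with
    the highest positive scores (greedy stand-in for the optimization)."""
    admissible = [p for p in U_P if ns[p] < theta]            # NS constraint
    return sorted(admissible, key=lambda p: ps[p], reverse=True)[:m]
```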
Pattern-based Relation Selection (3)
• Relation selection algorithm (figure)
Outline
• Introduction
• Background and Related Work
• Proposed Approach
• Experiments
• Conclusion
Experimental Data
• Text corpus
  – Text contents from Baidu Baike, 1.088B words
  – Train 100-dimensional word vectors using the Skip-gram model (a hedged example follows below)
• Is-a relation sets
  – Training: a subset of is-a relations derived from a Chinese taxonomy
  – Unlabeled: entities and categories from Baidu Baike
  – Testing: a publicly available labeled dataset (ACL'14)
(Table of unlabeled set statistics)
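A hedged example of training 100-dimensional skip-gram vectors with gensim; this stands in for whatever word2vec implementation the authors used, and the toy corpus below stands in for the segmented Baidu Baike text:

```python
# Train skip-gram (sg=1) vectors; `sentences` is an iterable of
# word-segmented Chinese sentences (a toy corpus here).
from gensim.models import Word2Vec

sentences = [
    ["奥巴马", "是", "一", "个", "政治", "人物"],
    ["奥巴马", "是", "领导人"],
]
model = Word2Vec(sentences, vector_size=100, sg=1, window=5, min_count=1, workers=4)
vec = model.wv["奥巴马"]   # 100-dimensional vector for an entity
```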
Model Performance
• With pattern-based relation selection
  – Performance increases at first and then becomes relatively stable.
  – A few false positive pairs are still inevitably selected by our approach.
• Without pattern-based relation selection
  – Performance drops quickly, despite improvement in the first few iterations.
Comparative Study
• Comparing with the state of the art (figure): pattern-based, dictionary-based, distributional similarity-based, and word embedding-based methods