Exploratory Neural Relation Classification for Domain Knowledge Acquisition
Yan Fan, Chengyu Wang, Xiaofeng He
School of Computer Science and Software Engineering, East China Normal University, Shanghai, China
Outline
• Introduction
• Related Work
• Proposed Approach
• Experiments
• Conclusion
Relation Extraction
• Relation extraction
  – Structures information from the Web by annotating plain text with entities and their relations
  – E.g., "Inception is directed by Christopher Nolan." (entity 1: Inception; relation: directed by; entity 2: Christopher Nolan)
• Relation classification
  – Formulates relation extraction as a classification problem
  – E.g., (Inception, Christopher Nolan) should be classified as the relation "directed by", instead of "played by".
Domain Knowledge Acquisition
• Knowledge graph
  – Relation extraction is a key technique in constructing knowledge graphs.
• Challenges for domain knowledge graphs
  – Long-tail domain entities: most domain entities follow a long-tail distribution, leading to the context-sparsity problem for pattern-based methods.
  – Incomplete predefined relations: since predefined relations are limited, unlabeled entity pairs may be wrongly forced into existing relation labels.
Dynamic Structured Neural Network for Exploratory Relation Classification
• Goals
  1. Classify entity pairs into a finite set of pre-defined relations
  2. Discover new relations and their instances from plain text with high confidence
• Method
  – Context sparsity problem: a distributional embedding layer is introduced to encode corpus-level semantic features of domain entities.
  – Limited label assignment: a clustering method is proposed to generate new relations from unlabeled data that cannot be classified into any existing relation.
Relation Classification Approaches
• Traditional approaches
  – Feature-based: applies textual analysis
    • N-grams, POS tagging, NER, dependency parsing
  – Kernel-based: similarity metrics in a higher-dimensional space
    • Kernel functions are applied to strings, word sequences, and parse trees
  – Both require empirical feature engineering or well-designed kernel functions
• Deep learning models
  – Distributional representations: word embeddings
  – Neural network models:
    • CNN: extracts features from local information
    • RNN: captures long-term dependencies over the sequence
  – Features are extracted automatically
Relation Discovery Approaches
• Open relation extraction
  – Automatically discovers relations from large-scale corpora with limited seed instances or patterns, without predefined types
  – Representative systems: TextRunner, ReVerb, OLLIE
  – Inapplicable to domain knowledge due to the data sparsity problem
• Clustering-based approaches
  – Predefined K: standard K-Means
  – Automatically learned K: non-parametric Bayesian models
    • Chinese restaurant process (CRP), distance-dependent CRP (ddCRP)
Task Definition
• Notations
  – Labeled entity pair set $P_L = \{(e_1, e_2)\}$ with relation labels $Y_L$
  – Unlabeled entity pair set $P_U = \{(e_1, e_2)\}$
• Exploratory relation classification (ERC)
  – Trains a model to predict the relations of entity pairs in $P_U$ over $m + k$ output labels, where $m$ denotes the number of pre-defined relations in $Y_L$, and $k$ is the number of newly discovered relations.
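As a toy illustration of the $m + k$ output space (relation names besides "directed by" and "played by" from the earlier example are hypothetical):

```python
# Toy ERC label space: m pre-defined relations plus k discovered ones.
predefined = ["directed_by", "played_by", "written_by", "produced_by"]  # m = 4
discovered = ["new_relation_1", "new_relation_2"]                       # k = 2 (hypothetical)
label_space = predefined + discovered  # the model predicts over m + k = 6 labels
print(len(label_space))  # 6
```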
General Framework (figure)
Base Neural Network Training
• Syntactic contexts via LSTM
  – Nodes on the root-augmented dependency path (RADP)
    • E.g., [Inception, directed, Christopher Nolan]
  – Node representation: {word embedding, POS tag, dependency relation, relational direction}
    • E.g., {Inception, nnp, nsubjpass, <-}
• Lexical contexts via CNN
  – Word embeddings in a sliding window of n-grams around the entities
• Semantic contexts
  – Word embeddings of the two tagged entities
Base Neural Network Architecture (figure)
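A minimal sketch of how the three context channels above could be combined, assuming PyTorch. Layer sizes are illustrative, and the RADP node features (POS, dependency relation, direction) and the distributional embedding layer are simplified to word embeddings only; this is not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class BaseRelationNet(nn.Module):
    """Sketch: LSTM over the dependency path (syntactic), CNN over
    lexical n-grams (lexical), plus embeddings of the two tagged
    entities (semantic), concatenated and classified."""
    def __init__(self, vocab_size, emb_dim=100, hidden=100, n_classes=4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Syntactic channel: LSTM over RADP node representations
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        # Lexical channel: CNN over sliding n-gram windows
        self.conv = nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1)
        # Classifier over the concatenated channels
        self.fc = nn.Linear(hidden + hidden + 2 * emb_dim, n_classes)

    def forward(self, radp_ids, ngram_ids, ent_ids):
        # radp_ids: (B, Lp) path tokens; ngram_ids: (B, Ln) context
        # tokens; ent_ids: (B, 2) the two tagged entities
        _, (h, _) = self.lstm(self.emb(radp_ids))                # syntactic
        lex = self.conv(self.emb(ngram_ids).transpose(1, 2))     # lexical
        lex = lex.max(dim=2).values                              # max-pool
        sem = self.emb(ent_ids).flatten(1)                       # semantic
        return self.fc(torch.cat([h[-1], lex, sem], dim=1))      # relation logits
```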
Chinese Restaurant Process (CRP)
• Goal
  – Groups customers into random tables where they sit
• Distribution over table assignment
  – $n_k$: number of customers sitting at table $k$
  – $z_i$: index of the table where the $i$-th customer sits
  – $z_{-i}$: table indices for all customers except the $i$-th
  – $\alpha$: scaling parameter for a new table
  – $K$: number of occupied tables
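For reference, the table-assignment distribution implied by these definitions (the standard CRP form, with $n$ the total number of customers):

$$\Pr(z_i = k \mid z_{-i}, \alpha) =
\begin{cases}
\dfrac{n_k}{n - 1 + \alpha} & k \le K \quad \text{(existing table)} \\[6pt]
\dfrac{\alpha}{n - 1 + \alpha} & k = K + 1 \quad \text{(new table)}
\end{cases}$$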
Similarity-Sensitive Chinese Restaurant Process (ssCRP)
• Idea
  – Exploits similarities between customers
  – Turns the problem into customer assignment (each customer chooses another customer to sit with, rather than a table)
• Distribution over customer assignment
  – $s_{ij}$: similarity score between the $i$-th and $j$-th customers
  – $f(\cdot)$: similarity function that magnifies input differences
  – $\lambda$: parameter balancing the weight of table size
  – Hyperparameter set (collecting $\alpha$, $f(\cdot)$, $\lambda$, etc.)
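A sketch of the customer-assignment prior in the style of the distance-dependent CRP (Blei and Frazier, 2011) that ssCRP builds on; the exact form in the paper may differ, and the table-size reweighting via $\lambda$ is indicated only qualitatively:

$$\Pr(c_i = j \mid S, \alpha) \propto
\begin{cases}
f(s_{ij}) & j \neq i \quad \text{(sit with customer } j\text{)} \\
\alpha & j = i \quad \text{(open a new table)}
\end{cases}$$

Linking customer $i$ to customer $j$ merges $i$ into $j$'s table; in ssCRP, the weight of the candidate's table size additionally enters through $\lambda$.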
Illustration of ssCRP (figure)
Relation Prediction
• Idea
  – Populates the small clusters generated via ssCRP
  – Enriches existing relations with more instances
• Prediction criteria
  – Distribution over $m + k$ relations for an entity pair $(e_1, e_2)$: $\Pr(r_1 \mid e_1, e_2), \dots, \Pr(r_{m+k} \mid e_1, e_2)$
  – "Max-secondMax" value for the "near uniform" criterion:
$$\mathrm{conf}(e_1, e_2) = \frac{\max\{\Pr(r_1 \mid e_1, e_2), \dots, \Pr(r_{m+k} \mid e_1, e_2)\}}{\mathrm{secondMax}\{\Pr(r_1 \mid e_1, e_2), \dots, \Pr(r_{m+k} \mid e_1, e_2)\}}$$
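A minimal sketch of the "Max-secondMax" confidence, assuming a probability vector over the $m + k$ relations; the function name is hypothetical:

```python
import numpy as np

def max_second_max_conf(probs):
    """Ratio of the largest to the second-largest relation probability.
    A value near 1 means the top of the distribution is 'near uniform',
    i.e. the prediction is unconfident."""
    top2 = np.sort(probs)[-2:]   # [secondMax, max]
    return top2[1] / top2[0]

# Confident vs. near-uniform predictions
print(max_second_max_conf(np.array([0.7, 0.1, 0.1, 0.1])))    # 7.0
print(max_second_max_conf(np.array([0.3, 0.28, 0.22, 0.2])))  # ~1.07
```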
Experimental Data
• Text corpus
  – Text content from 37,746 pages in the entertainment domain of Chinese Wikipedia
• Statistics
  – Training, validation, and testing: 3,480 instances over 4 predefined relations, from (Fan et al., 2017)
  – Unlabeled: 3,161 entity pairs that co-occur within sentences
Evaluation of Relation Classification
• Comparative study
  – We compare our method with CNN-based and RNN-based models, and experiment with different feature sets to verify their significance.
Evaluation of Relation Discovery
• Pairwise experiment
  – We manually construct a testing set $T$ by sampling pairs of instances $(I_a, I_b)$ from the unlabeled data, where $I = (e_1, e_2)$.
  – $\mathrm{Precision} = \dfrac{|\{(I_a, I_b) \in T \mid g_{a,b} = 1 \wedge g'_{a,b} = 1\}|}{|\{(I_a, I_b) \in T \mid g'_{a,b} = 1\}|}$
  – $\mathrm{Recall} = \dfrac{|\{(I_a, I_b) \in T \mid g_{a,b} = 1 \wedge g'_{a,b} = 1\}|}{|\{(I_a, I_b) \in T \mid g_{a,b} = 1\}|}$
  – $g_{a,b} \in \{1, 0\}$ is the ground truth (1 iff the two instances share a relation), and $g'_{a,b} \in \{1, 0\}$ is the clustering result.
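A minimal sketch of the pairwise metrics above, assuming gold and predicted cluster labels per instance; names are hypothetical:

```python
from itertools import combinations

def pairwise_prf(gold, pred):
    """Pairwise precision/recall over all instance pairs: a pair is
    positive when both instances carry the same label."""
    tp = fp = fn = 0
    for a, b in combinations(range(len(gold)), 2):
        same_gold = gold[a] == gold[b]   # g_{a,b} = 1
        same_pred = pred[a] == pred[b]   # g'_{a,b} = 1
        if same_gold and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_gold:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

print(pairwise_prf([0, 0, 1, 1], [0, 0, 1, 2]))  # (1.0, 0.5)
```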
Evaluation of Relation Discovery
• Newly discovered relations
  – 6 new relations are generated, covering 96.4% of the unlabeled data
• Top-$K$ precision
  – We heuristically choose $K = 0.4$ because precision drops relatively faster once $K$ exceeds this value.
Conclusion
• Exploratory relation classification
  – Problem: assign both pre-defined and newly discovered relation labels to unlabeled entity pairs
  – Iterative process:
    • an integrated base neural network for relation classification
    • a similarity-based clustering algorithm, ssCRP, to generate new relations
    • a constrained relation prediction process to populate new relations
  – Experiments on the Chinese Wikipedia entertainment domain: the base neural network achieves a 0.92 F1-score, and 6 new relations are generated with a 0.75 F1-score.
Thanks!