Inferring Relationships from a Corpus, Axel Larsson & Erik Gärtner

Jan 28, 2023

Inferring Relationships from a Corpus Axel Larsson & Erik Gärtner
Goals
● Extract named entities from a corpus
● Identify relationships between the persons
● Infer more relationships based on extracted data
Building the Graph
Named Entities Extraction
● NER: locate and classify elements in texts into predefined categories: names of persons, locations, organisations, etc.
● Stanford CoreNLP NER tags annotator
  ○ Uses a trained model to detect names, places and organisations
● We filter for only person names
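The person-name filter can be sketched as follows. This is a minimal illustration in Python, not the authors' Scala code: the `tagged` list is hypothetical sample data standing in for tagger output, and the tag names assume the PERSON / LOCATION / ORGANIZATION / O labels used by CoreNLP's three-class NER models.

```python
from itertools import groupby

# Hypothetical tagger output: (token, NER tag) pairs.
# Tag names assume CoreNLP's three-class label set; "O" means "no entity".
tagged = [
    ("Sherlock", "PERSON"), ("Holmes", "PERSON"), ("left", "O"),
    ("London", "LOCATION"), ("with", "O"), ("Watson", "PERSON"), (".", "O"),
]

def person_names(tagged_tokens):
    """Join runs of consecutive PERSON tokens into full names, drop the rest."""
    names = []
    for tag, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag == "PERSON":
            names.append(" ".join(token for token, _ in group))
    return names

print(person_names(tagged))  # ['Sherlock Holmes', 'Watson']
```

Grouping consecutive tokens keeps multi-word names together, so "Sherlock Holmes" becomes one graph node rather than two.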
Resolving co-references
● Mentions not using the primary name, such as:
  ○ he
  ○ the president
● Very slow process
Detecting Relationships
● OpenIE triples (subject, relation, object)
● (Eric, is the son of, Anders)
● Stanford OpenIE
Filtering OpenIE Triples
● Wikidata - a free knowledge base
● List of properties on Person:s of type Relationship
  ○ father
  ○ mother
  ○ brother
  ○ sister
  ○ spouse
  ○ partner
  ○ child
  ○ stepfather
  ○ stepmother
  ○ relative
  ○ godparent

[
  {
    "title": "brother",
    "id": "P7",
    "description": "male sibling",
    "english": ["bro", "brother"],
    "swedish": ["broder", "bror", "brorsa"]
  },
  {
    "title": "father",
    "id": "P22",
    "description": "the male parent",
    "english": ["father", "dad", "daddy"],
    "swedish": ["far", "fader"]
  },
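One way the property list could drive the filtering is sketched below in Python. The slides do not say how relation phrases were matched against the list, so the word-overlap test is an assumption, as are the function names and the sample triples (only Eric and Anders appear in the deck). The two `RELATIONS` entries are copied from the JSON excerpt.

```python
import re

# The two entries visible in the JSON excerpt (English synonyms only).
RELATIONS = [
    {"title": "brother", "id": "P7", "english": ["bro", "brother"]},
    {"title": "father", "id": "P22", "english": ["father", "dad", "daddy"]},
]

def match_relation(relation_phrase):
    """Return the property title whose synonym occurs as a word in the phrase."""
    words = set(re.findall(r"[a-z]+", relation_phrase.lower()))
    for entry in RELATIONS:
        if words & set(entry["english"]):
            return entry["title"]
    return None

def filter_triples(triples):
    """Keep only triples whose relation maps to a known family property."""
    kept = []
    for subject, relation, obj in triples:
        title = match_relation(relation)
        if title is not None:
            kept.append((subject, title, obj))
    return kept

# Hypothetical OpenIE-style output.
triples = [
    ("Anders", "is the father of", "Eric"),
    ("Eric", "walked to", "the station"),
    ("Eric", "is brother of", "Karl"),
]
print(filter_triples(triples))
# [('Anders', 'father', 'Eric'), ('Eric', 'brother', 'Karl')]
```

Matching whole words rather than substrings avoids false hits such as "bro" inside "broke".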
Inferring Relationships
● Rule-based engine
● Iterates on the graph
● Adds new inferred relationships such as:
  ○ "Abel is the son of Adam"
  ○ Son -> Father
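A rule engine of this kind can be sketched as a fixpoint loop over a set of edges. The Python below is an illustration under assumptions, not the authors' engine: an edge `(X, r, Y)` is read as "X is the r of Y", the son-to-father rule is the one named on the slide (and, like the slide, it assumes the parent is male), while the shared-parent brother rule and the name Cain are added purely for the example.

```python
def infer(edges):
    """Apply the rules repeatedly until no new edge appears (a fixpoint)."""
    graph = set(edges)
    while True:
        new = set()
        sons = [(child, parent) for child, rel, parent in graph if rel == "son"]
        for child, parent in sons:
            # Rule from the slide: Son -> Father (assumes a male parent).
            new.add((parent, "father", child))
            # Illustrative extra rule: two sons of one parent are brothers.
            for other, other_parent in sons:
                if other_parent == parent and other != child:
                    new.add((child, "brother", other))
        if new <= graph:  # nothing new was inferred, so stop iterating
            return graph
        graph |= new

edges = {("Abel", "son", "Adam"), ("Cain", "son", "Adam")}
for edge in sorted(infer(edges)):
    print(edge)
```

Iterating to a fixpoint matters once rules can feed each other: an edge inferred in one pass may satisfy the premise of another rule in the next pass.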
Experimental Setup
● Sherlock - The Boscombe Valley Mystery
  ○ ~9600 words
  ○ ~10 family relationships
  ○ ~3 minutes to extract relationships
  ○ Manually annotated for scores
  ○ Training + testing
● CoreNLP
  ○ Open source
  ○ Cutting edge
● Scala
The Graph
(idiot, marry, her)
"This fellow is madly, insanely, in love with her, but some two years ago, when he was only a lad, and before he really knew her, for she had been away five years at a boarding-school, what does the idiot do but get into the clutches of a barmaid in Bristol and marry her at a registry office?"
Results and Evaluation
● Managed to extract relationships from a novel
● Promising but further work needed
● Evaluation scores
  ○ True positives: 2
  ○ False positives: 3
  ○ False negatives: 7
  ○ Recall: 0.22
  ○ Precision: 0.40
  ○ F-score: 0.29
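The reported scores follow directly from the three counts. The short Python check below assumes "F-score" means the balanced F1 measure (the harmonic mean of precision and recall), which is consistent with the figures on the slide; the function name is illustrative.

```python
def scores(tp, fp, fn):
    """Precision, recall and balanced F1 from raw counts (assumes tp > 0)."""
    precision = tp / (tp + fp)  # share of extracted relations that are correct
    recall = tp / (tp + fn)     # share of annotated relations that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

precision, recall, f1 = scores(tp=2, fp=3, fn=7)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.40 recall=0.22 f1=0.29
```

With only 9 annotated relationships (2 found, 7 missed), each additional correct extraction moves recall by about 0.11, so these scores are coarse.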
Future Work
● Improve relationship extraction; more important than NER.
● Add more languages
● Improve rules engine
Questions?
