équipe MELODI

Distributional Semantics
The unsupervised modeling of meaning on a large scale

Tim Van de Cruys
IRIT, Toulouse
Tuesday 17 November 2015
Distributional similarity

• The induction of meaning from text is based on the distributional hypothesis [Harris 1954]
• Take a word and its contexts:
  • tasty sooluceps
  • sweet sooluceps
  • stale sooluceps
  • freshly baked sooluceps
  ⇒ food
• By looking at a word's context, one can infer its meaning
Matrix

• captures co-occurrence frequencies of two entities
• counts accumulate as more and more text is processed; after a large corpus:

                second-hand   fast   tasty    red
  truck             293        393       0    104
  car               370        487       0    392
  strawberry          0          2     437   1035
  raspberry           0          1     592    728
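A minimal sketch of how such a count matrix can be collected. The toy corpus and the context rule (the word immediately preceding a target noun) are illustrative assumptions, not the extraction setup of the talk:

```python
from collections import Counter

corpus = [
    "he bought a tasty strawberry",
    "a fast car passed a fast truck",
    "she picked a red raspberry",
]

targets = {"truck", "car", "strawberry", "raspberry"}
counts = Counter()

for sentence in corpus:
    tokens = sentence.split()
    for i, token in enumerate(tokens):
        if token in targets and i > 0:
            counts[(tokens[i - 1], token)] += 1  # (context word, target) pair

print(counts[("fast", "car")])  # 1: 'fast' occurred once next to 'car'
```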
Vector space model

[Figure: car, raspberry, and strawberry plotted as vectors in a space whose axes are context features such as red and fast]
Word-context matrix

                context1   context2   context3   context4
  word1
  word2
  word3
  word4

• Different notions of context, illustrated on the same sentence (see the sketch below):
  • window around the word (e.g., 2 words):
    He drove [ his second-hand car a couple ] of miles down the road .
  • window = the full sentence:
    [ He drove his second-hand car a couple of miles down the road . ]
  • dependency-based features (extracted from parse trees):
    He drove his second-hand car a couple of miles down the road .
    e.g., car is the object (obj) of drove and is modified (mod) by second-hand
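A hedged sketch of the two window-based notions of context, using the tokenised example sentence from the slide:

```python
tokens = "He drove his second-hand car a couple of miles down the road .".split()

def window_contexts(tokens, target, size):
    """Return the words within `size` positions of each occurrence of target."""
    contexts = []
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - size), min(len(tokens), i + size + 1)
            contexts.extend(tokens[lo:i] + tokens[i + 1:hi])
    return contexts

print(window_contexts(tokens, "car", 2))            # ['his', 'second-hand', 'a', 'couple']
print(window_contexts(tokens, "car", len(tokens)))  # the whole sentence as context
```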
Different kinds of semantic similarity

• 'tight', synonym-like similarity: (near-)synonymous or (co-)hyponymous words
• loosely related, topical similarity: looser relationships, such as association and meronymy

Example
• doctor: nurse, GP, physician, practitioner, midwife, dentist, surgeon
• doctor: medication, disease, surgery, hospital, patient, clinic, nurse, treatment, illness
Relation context – similarity

• Different contexts lead to different kinds of similarity
• Syntax-based contexts and small windows ↔ large windows and whole documents
• The former models induce tight, synonymous similarity
• The latter models induce topical relatedness
Computing similarity

• Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
• blackberry, blackcurrant, blueberry, raspberry, redcurrant, strawberry
• anthropologist, biologist, economist, linguist, mathematician, psychologist, physicist, sociologist, statistician
• drought, earthquake, famine, flood, flooding, storm, tsunami
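Such neighbour lists come from ranking all words by vector similarity. A minimal sketch using cosine similarity on the toy matrix from the earlier slide (real models rank over vectors with many thousands of context features):

```python
import numpy as np

words = ["truck", "car", "strawberry", "raspberry"]
M = np.array([[293, 393,   0,  104],   # rows: words
              [370, 487,   0,  392],   # columns: second-hand, fast, tasty, red
              [  0,   2, 437, 1035],
              [  0,   1, 592,  728]], dtype=float)

normed = M / np.linalg.norm(M, axis=1, keepdims=True)
sims = normed @ normed.T                # full cosine-similarity matrix

i = words.index("raspberry")
neighbours = sorted(zip(words, sims[i]), key=lambda p: -p[1])
print(neighbours)  # strawberry comes out as the nearest neighbour
```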
…on a large scale

• Frequency matrices are extracted from very large corpora
  • large collections of newspapers, Wikipedia, documents crawled from the web, …
  • > 100 billion words
• Large demands with regard to computing power and memory
• Matrices are very sparse → use of algorithms and storage formats that take advantage of the sparseness (see the sketch below)
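A sketch of why sparse formats matter: a 100,000 × 100,000 word-context matrix would need roughly 80 GB as a dense float64 array, while a sparse format stores only the non-zero co-occurrence counts. The indices below are illustrative:

```python
import numpy as np
from scipy.sparse import csr_matrix

rows = np.array([0, 0, 1, 2])          # word indices of non-zero cells
cols = np.array([1, 3, 1, 2])          # context indices of non-zero cells
vals = np.array([393., 104., 487., 437.])

M = csr_matrix((vals, (rows, cols)), shape=(100_000, 100_000))
print(M.data.nbytes)                   # only the non-zeros are stored
```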
…on a large scale

• Take advantage of parallel computations
• Many algorithms can be implemented within a map-reduce framework:
  • syntactic parsing
  • collection of frequency matrices
  • matrix transformations
• Make use of IRIT's high performance computing cluster OSIRIM (10 nodes, 640 cores in total)
• Huge speedup
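A minimal map-reduce-style sketch using Python's multiprocessing; the pipeline in the talk runs on the OSIRIM cluster, but the shape of the computation is the same: map over corpus shards, then reduce by summing partial counts.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool

def count_pairs(shard):
    """Map step: co-occurrence counts for one shard of the corpus."""
    counts = Counter()
    for sentence in shard:
        tokens = sentence.split()
        for i, tok in enumerate(tokens[:-1]):
            counts[(tok, tokens[i + 1])] += 1
    return counts

if __name__ == "__main__":
    shards = [["a fast car", "a red raspberry"], ["a fast truck"]]
    with Pool(2) as pool:
        partial = pool.map(count_pairs, shards)    # map
    total = reduce(lambda a, b: a + b, partial)    # reduce: Counters sum up
    print(total.most_common(3))
```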
Dimensionality reduction

Two reasons for performing dimensionality reduction:
• Intractable computations
  • when the number of elements and the number of features is too large, similarity computations may become intractable
  • reduction of the number of features makes computation tractable again
• Generalization capacity
  • the dimensionality reduction is able to describe the data better, or is able to capture intrinsic semantic features
  • dimensionality reduction is able to improve the results (counters data sparseness and noise)
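One common way to implement such a reduction is truncated SVD, shown below as a sketch with a random stand-in matrix; the talk itself uses non-negative matrix factorization, presented next:

```python
import numpy as np

M = np.random.rand(1000, 5000)          # stand-in word-context matrix
U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = 100
reduced = U[:, :r] * s[:r]              # words described by r latent features
print(reduced.shape)                    # (1000, 100)
```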
Non-negative matrix factorization

• Given a non-negative matrix V, find non-negative matrix factors W and H such that:

  V (n × m) ≈ W (n × r) H (r × m)    (1)

• Choosing r ≪ n, m reduces the data
• Constraint on the factorization: all values in the three matrices need to be non-negative (≥ 0)
• The constraint brings about a parts-based representation: only additive, no subtractive relations are allowed
• Particularly useful for finding topical, thematic information
Graphical representation

[Figure: the word-context matrix V (nouns × context words) is factorized into W (nouns × k) times H (k × context words)]
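A hedged sketch of NMF using Lee and Seung's multiplicative updates for the Euclidean objective; the talk does not specify which update rule its implementation uses:

```python
import numpy as np

def nmf(V, r, iterations=200, eps=1e-9):
    """Factorize non-negative V (n × m) into W (n × r) and H (r × m)."""
    n, m = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((n, r))
    H = rng.random((r, m))
    for _ in range(iterations):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H; stays non-negative
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W; stays non-negative
    return W, H

V = np.random.default_rng(1).random((40, 30))  # toy non-negative matrix
W, H = nmf(V, r=5)
print(np.linalg.norm(V - W @ H))               # reconstruction error
```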
Example dimensions

• dim 9:  via, logiciel, connexion, internet, html, fichiers, windows, serveur, messagerie, téléchargement  (internet)
• dim 12: pomme, saumon, canard, poire, fumé, veau, desserts, agneau, miel, boeuf  (food)
• dim 21: universitaires, scolarité, enseignant, étudiants, étudiant, formateurs, professeurs, cursus, enseignants, pédagogique  (education)
• dim 24: tumeurs, lésions, cardiaque, métabolisme, artérielle, infection, respiratoire, respiratoires, maladies, nerveux  (medical)
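Such example dimensions can be read off the factorization: each row of H scores every context word for one latent dimension, so the highest-scoring words characterise the dimension's topic. A sketch with a toy H and vocabulary:

```python
import numpy as np

def top_words(H, vocabulary, dim, k=10):
    order = np.argsort(H[dim])[::-1][:k]   # indices of the k largest scores
    return [vocabulary[i] for i in order]

vocabulary = ["internet", "html", "pomme", "miel"]
H = np.array([[0.9, 0.8, 0.0, 0.1],
              [0.0, 0.1, 0.7, 0.6]])
print(top_words(H, vocabulary, dim=0, k=2))  # ['internet', 'html']
```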
Word meaning in context

• Standard word space models are good at capturing general, 'global' word meaning
  ↔ words have different senses
  ↔ the meaning of individual word instances differs significantly
  (1) Jack is listening to a record
  (2) Jill updated the record
• Context is the determining factor for the construction of individual word meaning
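One simple way to operationalise this idea, shown as an illustrative assumption rather than the specific model presented in the talk: re-weight the target word's global latent vector by the latent profile of the words observed in its context.

```python
import numpy as np

def contextualize(target_vec, context_vecs):
    """Re-weight a word's latent dimensions by its observed context."""
    context_profile = np.mean(context_vecs, axis=0)
    weighted = target_vec * context_profile   # promote dims active in context
    return weighted / (np.linalg.norm(weighted) or 1.0)

# toy latent vectors over 3 dimensions (say: music, database, sports)
record = np.array([0.7, 0.6, 0.1])
listening = np.array([0.9, 0.05, 0.05])
updated = np.array([0.1, 0.8, 0.1])

print(contextualize(record, [listening]))  # music dimension dominates
print(contextualize(record, [updated]))    # database dimension dominates
```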