Con-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec


  1. Con-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec
     Tanay Kumar Saha¹, Shafiq Joty², Mohammad Al Hasan¹
     ¹ Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
     ² Nanyang Technological University, Singapore
     September 22, 2017

  2. Outline
     1 Introduction and Motivation
     2 Con-S2V Model
     3 Experimental Settings
     4 Experimental Results
     5 Conclusion

  3. Outline
     1 Introduction and Motivation
       Introduction
       Related Work
     2 Con-S2V Model
       Modeling Content
       Modeling Distributional Similarity
       Modeling Proximity
       Training Con-S2V
     3 Experimental Settings
       Evaluation Tasks
       Metrics for Evaluation
       Baseline Models for Evaluation
       Optimal Parameter Settings
     4 Experimental Results
       Classification and Clustering Performance
       Summarization Performance
     5 Conclusion

  4. Sen2Vec (Model for Representation of Sentences)
     ◮ Learns distributed representations of sentences from unlabeled data
     ◮ v₁: I eat rice → [0.2 0.3 0.4]
     ◮ φ : V → ℝᵈ
     ◮ For many text processing tasks that involve classification, clustering, or ranking of sentences, a vector representation of sentences is a prerequisite
     ◮ Distributed representations have been shown to perform better than bag-of-words (BOW) vector representations
     ◮ Proposed by Mikolov et al.
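Sen2Vec corresponds to the distributed bag-of-words (DBOW) variant of Paragraph Vector. As a concrete illustration, here is a minimal sketch using gensim's Doc2Vec, where dm=0 selects DBOW; the toy corpus, tags, and hyperparameters are made up for the example and are not the authors' setup.

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus: each sentence gets a unique ID, used as its special token.
corpus = [
    TaggedDocument(words=["i", "eat", "rice"], tags=["v1"]),
    TaggedDocument(words=["rice", "is", "tasty"], tags=["v2"]),
]

# dm=0 selects DBOW: the sentence vector is trained to predict
# words sampled from its own sentence.
model = Doc2Vec(corpus, dm=0, vector_size=50, window=5,
                min_count=1, epochs=50)

print(model.dv["v1"])  # learned vector for sentence v1 (gensim >= 4)
```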

  5. Con-S2V (Our Model)
     ◮ A novel approach to learn distributed representations of sentences from unlabeled data by jointly modeling both the content and the context of a sentence
     ◮ v₁: I have an NEC multisync 3D monitor for sale
     ◮ v₂: Looks new
     ◮ v₃: Great condition
     ◮ In contrast to existing work, we consider context sentences as atomic linguistic units
     ◮ We consider two types of context: discourse and similarity; however, our model can accommodate any arbitrary type of context
     ◮ Our evaluation on classification, clustering, and summarization tasks across multiple datasets shows impressive results: our model outperforms the best existing models by up to 7.7 F1-score in classification, 15.1 V-score in clustering, and 3.2 ROUGE-1 score in summarization
     ◮ Built on top of Sen2Vec

  6. Context Types of a Sentence
     ◮ Discourse context of a sentence
       ◮ Formed by the previous and the following sentences in the text
       ◮ Adjacent sentences in a text are logically connected by certain coherence relations (e.g., elaboration, contrast) to express the meaning
       ◮ Example: Lactose is a milk sugar. The enzyme lactase breaks it down. Here, the second sentence is an elaboration of the first
     ◮ Similarity context of a sentence
       ◮ Based on more direct measures of similarity
       ◮ Considers relations between all possible sentences in a document, and possibly across multiple documents (both context types are sketched below)
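To make the two context types concrete, here is a minimal sketch, assuming sentences are indexed in document order and that some sentence vectors are already available for measuring similarity; the random vectors and the 0.5 threshold are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def discourse_context(i, n_sentences):
    """Discourse context: the previous and the following sentence."""
    return [j for j in (i - 1, i + 1) if 0 <= j < n_sentences]

def similarity_context(i, vectors, threshold=0.5):
    """Similarity context: every sentence (within or across documents)
    whose cosine similarity with sentence i clears the threshold."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit[i]
    return [j for j in range(len(vectors)) if j != i and sims[j] >= threshold]

# Toy usage with random vectors standing in for sentence representations.
vecs = np.random.default_rng(0).normal(size=(5, 16))
print(discourse_context(2, 5))   # [1, 3]
print(similarity_context(2, vecs))
```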

  7. Related Work
     ◮ Sen2Vec
       ◮ Uses the sentence ID as a special token and learns the representation of the sentence by predicting all the words in the sentence
       ◮ For example, for a sentence v₁: I eat rice, it learns a representation for v₁ by learning to predict each of the words I, eat, and rice correctly
       ◮ Shown to perform better than tf-idf
     ◮ W2V-avg
       ◮ Uses word vector averaging (see the sketch after this slide)
       ◮ A tough-to-beat baseline for most downstream tasks
     ◮ SDAE
       ◮ Employs an encoder-decoder framework, similar to neural machine translation (NMT), to de-noise an original sentence (target) from its corrupted version (source)
       ◮ SAE is similar in spirit to SDAE but does not corrupt the source
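Since W2V-avg is simply word vector averaging, a minimal sketch follows, assuming a dict mapping tokens to pre-trained word vectors; the dict, dimensionality, and zero-vector fallback are illustrative assumptions.

```python
import numpy as np

def w2v_avg(tokens, word_vectors, dim=300):
    """Average the pre-trained vectors of the in-vocabulary words;
    fall back to a zero vector if no token is known."""
    known = [word_vectors[w] for w in tokens if w in word_vectors]
    return np.mean(known, axis=0) if known else np.zeros(dim)
```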

  8. Related Work
     ◮ C-Phrase
       ◮ C-PHRASE is an extension of CBOW (the Continuous Bag-of-Words model)
       ◮ The context of a word is extracted from a syntactic parse of the sentence
       ◮ The syntax tree for the sentence A sad dog is howling in the park is: (S (NP A sad dog) (VP is (VP howling (PP in (NP the park)))))
       ◮ C-PHRASE optimizes context prediction for dog, sad dog, a sad dog, a sad dog is howling, etc., but not, for example, for howling in, as these two words do not form a syntactic constituent by themselves (see the sketch below)
       ◮ Uses word vector addition for representing sentences
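The constituent-based contexts can be enumerated directly from the parse tree on the slide. A small sketch with nltk's Tree, collecting the word spans that form constituents; this illustrates the idea only and is not the original C-PHRASE code.

```python
from nltk import Tree  # pip install nltk

tree = Tree.fromstring(
    "(S (NP A sad dog) (VP is (VP howling (PP in (NP the park)))))")

# Every subtree of the parse is a syntactic constituent; its leaves
# give a word span C-PHRASE would optimize context prediction for.
constituents = {" ".join(st.leaves()) for st in tree.subtrees()}
print(constituents)
# Contains 'A sad dog', 'in the park', 'the park', ... but never
# 'howling in', since those two words are not a constituent.
```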

  9. Related Work
     ◮ Skip-Thought (context sensitive)
       ◮ Uses the NMT framework to predict adjacent sentences (target) given a sentence (source)
     ◮ FastSent (context sensitive)
       ◮ An additive model to learn sentence representations from word vectors
       ◮ It predicts the words of its adjacent sentences in addition to its own words

  10. Con-S2V
      ◮ A novel model to learn distributed representations of sentences by considering the content as well as the context of a sentence
      ◮ It treats the context sentences as atomic units
      ◮ Efficient to train compared to compositional methods like encoder-decoder models (e.g., SDAE, Skip-Thought) that compose a sentence vector from word vectors

  11. Outline
      1 Introduction and Motivation
        Introduction
        Related Work
      2 Con-S2V Model
        Modeling Content
        Modeling Distributional Similarity
        Modeling Proximity
        Training Con-S2V
      3 Experimental Settings
        Evaluation Tasks
        Metrics for Evaluation
        Baseline Models for Evaluation
        Optimal Parameter Settings
      4 Experimental Results
        Classification and Clustering Performance
        Summarization Performance
      5 Conclusion

  12. Con-S2V Model
      ◮ The model for learning the vector representation of a sentence comprises three components
      ◮ The first component models the content by asking the sentence vector to predict its constituent words (modeling content)
      ◮ The second component models the distributional hypothesis of a context: sentences occurring in similar contexts should have similar representations (modeling context)
      ◮ The third component models the proximity hypothesis of a context, which suggests that sentences that are proximal should have similar representations (modeling context)

  13. Con-S2V Model
      Figure: Two instances (see (b) and (c)) of our model for learning the representation of sentence v₂ within a context of two other sentences, v₁ and v₃ (see (a)). Directed and undirected edges indicate prediction loss and regularization loss, respectively, and dashed edges indicate that the node being predicted is randomly sampled. Example sentences: v₁: I have an NEC multisync 3D monitor for sale; v₂: Great Condition; v₃: Looks New. (Collected from 20news-bydate-train/misc.forsale/74732. The central topic is "forsale".)

  14. Con-S2V Model
      ◮ We minimize the following loss function for learning representations of sentences:

        J(\phi) = \sum_{v_i \in V} \sum_{v \in \langle v_i \rangle_t^l} \left[ L_c(v_i, v) + L_g(v_i, v_j) + L_r(v_i, N(v_i)) \right], \quad j \sim U(1, C_i)    (1)

      ◮ L_c: modeling content (first component)
      ◮ L_g: modeling context with the distributional hypothesis (second component); the distributional hypothesis conveys that sentences occurring in similar contexts should have similar representations
      ◮ L_r: modeling context with the proximity hypothesis (third component), which suggests that sentences that are proximal should have similar representations

  15. Modeling Content
      ◮ Our approach for modeling the content of a sentence is similar to the distributed bag-of-words (DBOW) model of Sen2Vec
      ◮ Given an input sentence v_i, we first map it to a unique vector φ(v_i) by looking up the corresponding vector in the sentence embedding matrix φ
      ◮ We then use φ(v_i) to predict each word v sampled from a window of words in v_i. Formally, the loss for modeling content using negative sampling is:

        L_c(v_i, v) = -\log \sigma\left(w_v^T \phi(v_i)\right) - \sum_{s=1}^{S} \mathbb{E}_{v^s \sim \psi_c} \log \sigma\left(-w_{v^s}^T \phi(v_i)\right)    (2)
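As a concrete illustration of Eq. (2), a minimal numpy sketch of the negative-sampling loss; the vectors are random stand-ins, and all names (content_loss, w_target, w_negatives) are made up for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def content_loss(phi_vi, w_target, w_negatives):
    """Negative-sampling estimate of L_c in Eq. (2).
    phi_vi      : (d,)   sentence vector phi(v_i)
    w_target    : (d,)   output vector w_v of the observed word
    w_negatives : (S, d) output vectors of S noise words ~ psi_c"""
    positive = -np.log(sigmoid(w_target @ phi_vi))
    negative = -np.log(sigmoid(-(w_negatives @ phi_vi))).sum()
    return positive + negative

# Toy check with random vectors (d = 8 dimensions, S = 3 noise samples).
rng = np.random.default_rng(0)
print(content_loss(rng.normal(size=8), rng.normal(size=8),
                   rng.normal(size=(3, 8))))
```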

  16. Modeling Distributional Similarity
      ◮ Our sentence-level distributional hypothesis is that if two sentences share many neighbors in the graph, their representations should be similar
      ◮ We formulate this in our model by asking the sentence vector to predict its neighboring nodes
      ◮ Formally, the loss for predicting a neighboring node v_j ∈ N(v_i) using the sentence vector φ(v_i) is:

        L_g(v_i, v_j) = -\log \sigma\left(w_j^T \phi(v_i)\right) - \sum_{s=1}^{S} \mathbb{E}_{j^s \sim \psi_g} \log \sigma\left(-w_{j^s}^T \phi(v_i)\right)    (3)
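Eq. (3) has exactly the same negative-sampling shape as Eq. (2); only the prediction target changes, from a word to a neighboring sentence, with noise samples drawn from ψ_g. Reusing the sketch above (names again illustrative):

```python
def neighbor_loss(phi_vi, w_neighbor, w_noise):
    """L_g of Eq. (3): same form as content_loss, but the target is the
    output vector of a neighboring sentence and the negatives are
    sentences drawn from the noise distribution psi_g."""
    return content_loss(phi_vi, w_neighbor, w_noise)
```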
