A coherence model based on syntactic patterns
by Annie Louis and Ani Nenkova
M.Sc. Seminar: Discourse Coherence Theories and Modeling
Nikolina Koleva
Saarland University, Department of Computational Linguistics
June 10, 2013
Nikolina Koleva (CoLi Saarland) Syntactic Approach to Modeling Coherence June 10, 2013 1 / 32
Overview
1 Motivation
2 Coherence models based on syntax
  • Evidence for syntactic coherence
  • Representing syntax
  • Local co-occurrence model
  • Global model
3 Evaluation
  • Prediction on reports
  • Prediction on academic articles
4 Conclusion
Motivation
Factors contributing to coherence
1 attentional structure: items under discussion
  → entity approaches
2 organization of discourse segments
  → content approaches
3 intentional structure: purpose of the discourse
  → not much work
Motivation
Every discourse has a purpose
• explaining a concept
• narrating an event
• critiquing an idea
• ...
→ each sentence in a text has a communicative goal
Motivation
Example
1 An aqueduct is a water supply or navigable channel constructed to convey water.
2 In modern engineering, the term is used for any system of pipes, canals, tunnels, and other structures used for this purpose.

1 Cytokine receptors are receptors that bind cytokines.
2 In recent years, the cytokine receptors have come to demand more attention because their deficiency has now been directly linked to certain debilitating immunodeficiency states.

→ unique syntactic structure of definitions, questions, etc.
→ syntax as a proxy for the communicative goal
Coherence models based on syntax
Underlying assumptions:
1 Sentences with similar syntax are likely to have the same communicative goal.
2 Regularities in intentional structure will be manifested in syntactic regularities between adjacent sentences.
→ supported by recent related work
Coherence models based on syntax
Evidence for syntactic coherence
Pilot study to validate assumption 2
• Material: gold-standard parse trees from the Penn Treebank
• Unit of analysis: a pair of adjacent sentences (S1, S2)
Steps:
1 Enumerate all productions and remove those with frequency < 25, leaving 197 unique productions.
2 For every ordered pair (p1, p2), where p1 and p2 are productions, compute the counts c(p1, p2), c(p1, ¬p2), c(¬p1, p2) and c(¬p1, ¬p2).
  • c(p1, p2): number of sentence pairs where p1 ∈ S1 and p2 ∈ S2
3 Perform a chi-square test to
  • test the significance of the count c(p1, p2)
  • check whether the occurrences of p1 and p2 are independent
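Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `contingency` and `chi_square_2x2`, the toy sentence pairs, and the choice of a plain Pearson chi-square without continuity correction are all assumptions made for the example. Each sentence is represented as a set of production strings.

```python
import math

def contingency(pairs, p1, p2):
    """Return c(p1,p2), c(p1,not p2), c(not p1,p2), c(not p1,not p2)
    over adjacent sentence pairs (S1, S2), each a set of productions."""
    a = b = c = d = 0
    for s1, s2 in pairs:
        in_s1, in_s2 = p1 in s1, p2 in s2
        if in_s1 and in_s2:
            a += 1
        elif in_s1:
            b += 1
        elif in_s2:
            c += 1
        else:
            d += 1
    return a, b, c, d

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for a 2x2 table; no continuity correction (an assumption)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so there is no evidence of association.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 df, via the normal distribution.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical toy data: four adjacent sentence pairs.
toy_pairs = [
    ({"VP -> VBD SBAR"}, {"VP -> VBD SBAR"}),
    ({"VP -> VBD SBAR"}, {"NP -> DT NN"}),
    ({"NP -> DT NN"}, {"VP -> VBD SBAR"}),
    ({"NP -> DT NN"}, {"NP -> DT NN"}),
]
table = contingency(toy_pairs, "VP -> VBD SBAR", "VP -> VBD SBAR")
chi2, p_value = chi_square_2x2(*table)
```

A small p-value for a pair (p1, p2) indicates that p2 appears in S2 dependent on p1 appearing in S1; whether the pair is preferred or avoided follows from comparing c(p1, p2) with its expected count.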
Coherence models based on syntax
Evidence for syntactic coherence
Outcome of the study
• small fraction of repetitions (5%)
p1: VP → VBD SBAR
p2: VP → VBD SBAR
S1: Documents filed with the Securities and Exchange Commission on the pending spinoff [[disclosed]VBD [that Cray Research Inc. will withdraw the almost $100 million in financing it is providing the new firm if Mr. Cray leaves or if the product-design project, he heads, is scrapped]SBAR]VP.
S2: The documents also [[said]VBD [that although the 64-year-old Mr. Cray has been working on the project for more than six years, the Cray-3 machine is at least another year away from a fully operational prototype]SBAR]VP.
Coherence models based on syntax
Evidence for syntactic coherence
Outcome of the study
• finance domain-specific
p1: NP → NP NP-ADV
p2: QP → CD CD
S1: The two concerns said they entered into a definitive merger agreement under which Ratners will begin a tender offer for all of Weisfield’s common shares for [$57.50 each]NP.
S2: Also on the takeover front, Jaguar’s ADRs rose 1/4 to 13 7/8 on turnover of [4.4 million]QP.
Coherence models based on syntax
Evidence for syntactic coherence
Outcome of the study
• neither repetitions nor domain dependent
p1: VP → VB VP
p2: NP-SBJ → NNP NNP
S1: "The refund pool may not [be held hostage through another round of appeals]VP," Judge Curry said.
S2: [Commonwealth Edison]NP-SBJ said it is already appealing the underlying commission order and is considering appealing Judge Curry’s order.
• S1 presents a hypothesis or speculation
• S2 introduces an entity (PERS, ORG) that gives an explanation of or opinion on the statement
• intentional structure: SPECULATE, ENDORSE
Coherence models based on syntax
Evidence for syntactic coherence
Outcome of the study
• neither repetitions nor domain dependent
p1: NP-LOC → NNP
p2: S-TPC-1 → NP-SBJ VP
S1: "It has to be considered as an additional risk for the investor," said Gary P. Smaby of Smaby Group Inc., [Minneapolis]NP-LOC.
S2: ["Cray Computer will be a concept stock,"]S-TPC-1 he said.
• S1 introduces a location name associated with an entity
• S2 contains a quote from that entity
• intentional structure: INTRODUCE X, STATEMENT BY X
Coherence models based on syntax
Representing syntax
1 productions
  • a sentence is represented as a set of grammatical productions (LHS → RHS)
  • the RHS can be very long and thus rather specific
  • information is available only about nodes of the same constituent
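To make the production representation concrete, the sketch below reads a Penn Treebank-style bracketed parse and collects its productions as a set. The helper names `parse_tree` and `productions` are hypothetical, the example tree is hand-written for illustration rather than taken from the treebank, and skipping lexical productions (POS tag → word) is an assumption of this sketch.

```python
def parse_tree(text):
    """Parse a bracketed tree into nested (label, children) tuples.
    Words (leaves) are kept as plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            label = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(read())
            pos += 1  # consume the closing bracket
            return (label, children)
        word = tokens[pos]
        pos += 1
        return word

    return read()

def productions(tree):
    """Return the set of non-lexical productions 'LHS -> RHS' in a tree."""
    label, children = tree
    subtrees = [child for child in children if isinstance(child, tuple)]
    if not subtrees:
        # Preterminal node (POS tag over a word): skipped by assumption.
        return set()
    found = {label + " -> " + " ".join(child[0] for child in subtrees)}
    for child in subtrees:
        found |= productions(child)
    return found

# Hand-written illustrative parse, not a gold-standard treebank tree.
example = ("(S (NP-SBJ (NNP Commonwealth) (NNP Edison)) "
           "(VP (VBD said) (SBAR (S (NP-SBJ (PRP it)) (VP (VBZ appeals))))))")
example_productions = productions(parse_tree(example))
```

Because the result is a set, a production that occurs twice in a sentence (here `S -> NP-SBJ VP`) is recorded once, which matches the slide's view of a sentence as a set of productions.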