Competence and Performance Grammar in Incremental Parsing
Vincenzo Lombardo, Dipartimento di Informatica, Università di Torino
Alessandro Mazzei, Dipartimento di Informatica, Università di Torino
Patrick Sturt, Department of Psychology, University of Glasgow
Idea
Defining a grammatical formalism that explicitly takes into account the incrementality of natural language: the input w_1 ... w_n is extended one word at a time (w_{n+1}, w_{n+2}, ...).
Motivations
● Psycholinguistics: experimental data on the connectivity of partial syntactic structures:
– connection of the words before the verb in Japanese [Kamide-et-al03]
– incremental syntactic interpretation in English [Sturt-Lombardo04]
● Theoretical syntax: deep relation between constituency and incrementality [Phillips03]
● Practical motivations:
– language modeling [Roark01]
– interpretation of sentence prefixes [Milward95]
Outline
● Definition of strong connectivity
● Dynamic version of LTAG: DVTAG
● Left association
● Empirical tests on left association
● Limitations and open issues
Strong connectivity
People incorporate each word into a single, totally connected syntactic structure before any further words follow [Stabler94].
In this discussion: syntactic structure = constituency tree.
Overview of LTAG
[Elementary trees: initial trees NP → N → Bill and NP → N → Sue; an initial tree for 'pleases' (S → NP↓ VP, VP → V NP↓); and an auxiliary tree for 'often' (VP → ADV VP*). Substitution and adjoining combine them into the derived tree for 'Sue often pleases Bill'.]
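To make the two combination operations concrete, here is a minimal sketch of substitution and adjoining on tuple-encoded trees. The encoding (a `!` suffix for substitution nodes, `*` for foot nodes) and all function names are our own convention, not part of any LTAG implementation.

```python
# A tree is (label, children); leaves have an empty child list.
# 'NP!' marks a substitution node, 'VP*' a foot node (sketch encoding).

def substitute(tree, arg):
    """Replace the leftmost substitution node matching the root
    category of `arg` (e.g. 'NP!') with `arg` itself."""
    label, children = tree
    out, used = [], False
    for c in children:
        if not used and c == (arg[0] + '!', []):
            out.append(arg)
            used = True
        elif not used:
            c2 = substitute(c, arg)
            used = c2 != c           # a deeper node was filled
            out.append(c2)
        else:
            out.append(c)
    return (label, out)

def splice_foot(aux, subtree):
    """Put `subtree` back at the foot node of auxiliary tree `aux`."""
    label, children = aux
    foot = (subtree[0] + '*', [])
    return (label, [subtree if c == foot else splice_foot(c, subtree)
                    for c in children])

def adjoin(tree, aux):
    """Adjoin `aux` at a node matching its root category: the node's
    subtree moves down to the foot of `aux`.  (In this example there
    is exactly one matching node.)"""
    label, children = tree
    if label == aux[0]:
        return splice_foot(aux, tree)
    return (label, [adjoin(c, aux) for c in children])

def yield_of(tree):
    label, children = tree
    return [label] if not children else \
        [w for c in children for w in yield_of(c)]

# The elementary trees from the slide:
sue     = ('NP', [('N', [('Sue', [])])])
bill    = ('NP', [('N', [('Bill', [])])])
often   = ('VP', [('ADV', [('often', [])]), ('VP*', [])])
pleases = ('S', [('NP!', []),
                 ('VP', [('V', [('pleases', [])]), ('NP!', [])])])

t = substitute(pleases, sue)   # subject substitution
t = adjoin(t, often)           # 'often' adjoins at VP
t = substitute(t, bill)        # object substitution
print(' '.join(yield_of(t)))   # → Sue often pleases Bill
```

Note that this is the standard, non-incremental order of combination; the DVTAG slides below constrain the same operations to apply word by word, left to right.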
Dynamic Syntax with TAG
● State S_i: a single connected tree (the left context) spanning the words w_1 ... w_i, with head variables (_q, _k, ...) on predicted nodes
● Transition S_i → S_{i+1}: the next word w_{i+1} combines its elementary tree with the left context
DVTAG lexicon
● Left-anchor and lexical projection
● Predicted nodes
● Each non-terminal node is augmented with a head variable
[Schematic elementary tree: root B(_k) dominating the left-anchor projection A(w_i) over w_i, a predicted node C(_k), and substitution nodes D↓(_k) and E↓(_j).]
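One way to represent such an elementary tree in code — the class and method names are our own, not the authors' — is to pair every node with a head that is either a concrete word or a still-unbound variable for predicted nodes:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """A DVTAG tree node: category plus a head variable.

    The head is either a concrete word (e.g. along the left-anchor's
    projection) or an unbound variable name like '_k' on nodes that
    are merely predicted.  Substitution nodes carry subst=True.
    """
    cat: str
    head: str                  # a word, or a variable such as '_k'
    subst: bool = False        # True for D↓-style substitution nodes
    children: List['Node'] = field(default_factory=list)

    def bind(self, var: str, word: str) -> None:
        """Instantiate head variable `var` to `word` throughout the
        tree -- what a later shift step does."""
        if self.head == var:
            self.head = word
        for c in self.children:
            c.bind(var, word)

# The schematic tree from the slide: B(_k) over A(w_i) and C(_k),
# with substitution nodes D↓(_k) and E↓(_j); w_i is the left-anchor.
tree = Node('B', '_k', children=[
    Node('A', 'w_i', children=[Node('w_i', 'w_i')]),
    Node('C', '_k', children=[Node('D', '_k', subst=True),
                              Node('E', '_j', subst=True)]),
])
tree.bind('_k', 'pleases')     # a later shift could bind _k this way
```

Sharing one variable (here `_k`) across several nodes captures the fact that a single shift instantiates the whole projection at once.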
Constraints on DVTAG derivation
● Accessibility: only some nodes of the left context are accessible to derivation operations
● Fringe: the accessible nodes lie on the fringe of the left context
[Schematic tree as on the previous slide, with the fringe highlighted; in the final variant an additional tree E(w_q) over w_q has been attached.]
DVTAG example: Sue often pleases Bill

Step 1 ("Sue"): the left context is initialized with the left-associated elementary tree for 'Sue':
S(_i) → NP(Sue) VP(_i), with NP(Sue) → N → Sue and VP(_i) → V(_i) NP↓(_j);
the predicted head _i ranges over the possible verbs {eats, likes, pleases, ...}.

Step 2 ("often", adjoining from the left): the auxiliary tree VP(_k) → ADV(often) VP*(_k) adjoins at VP(_i), yielding
S(_i) → NP(Sue) VP(_i), VP(_i) → ADV(often) VP(_i), VP(_i) → V(_i) NP↓(_j).

Step 3 ("pleases", shift): the word instantiates the head variable _i throughout the left context:
S(pleases) → NP(Sue) VP(pleases), VP(pleases) → ADV(often) VP(pleases), VP(pleases) → V(pleases) NP↓(_j).

Step 4 ("Bill", substitution): NP(Bill) → N → Bill substitutes at NP↓(_j), completing the tree.
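The last two steps can be replayed on a bracketed-string rendering of the left context; the string encoding is purely an illustrative sketch of how shift and substitution transform the states (lexical anchor leaves are abbreviated).

```python
# Left contexts after "Sue" and after "Sue often" (sketch encoding).
left_contexts = [
    "[S(_i) [NP(Sue) [N Sue]] [VP(_i) [V(_i)] [NP!(_j)]]]",
    "[S(_i) [NP(Sue) [N Sue]] "
    "[VP(_i) [ADV(often) often] [VP(_i) [V(_i)] [NP!(_j)]]]]",
]

def shift(context, var, word):
    """A shift instantiates a head variable everywhere in the left
    context, e.g. S(_i) -> S(pleases)."""
    return context.replace("(%s)" % var, "(%s)" % word)

def substitute(context, var, subtree):
    """Substitution replaces a predicted node such as NP!(_j) by a
    complete subtree (plain string replacement; the node names are
    unique in this example)."""
    return context.replace("[NP!(%s)]" % var, subtree)

after_pleases = shift(left_contexts[-1], "_i", "pleases")
after_bill = substitute(after_pleases, "_j", "[NP(Bill) [N Bill]]")
print(after_bill)
```

Because every intermediate string is one bracketed tree spanning exactly the words read so far, the sketch also illustrates the strong-connectivity property the formalism is built around.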
Left association
● Constraints on the introduction of predicted nodes
● Produces a DVTAG lexicon from an LTAG lexicon:
(1) Step 1: iteration of off-line Substitution and Adjoining on the left side of the LTAG trees
(2) Step 2: template trees
Left association, Step 1: iteration of off-line Substitution and Adjoining on the left side of the LTAG trees.
Example: the auxiliary tree for 'very' (ADVP(very) with foot AP*(_k)) adjoins into the auxiliary tree for 'nice' (AP(nice) with foot N'*(_j)), which in turn adjoins into the initial tree for 'cats' (NP(cats) → N'(cats) → N(cats) → cats), producing a single left-associated tree for 'very nice cats' rooted in NP(cats).
Left association, Step 2: template trees. The lexical heads introduced by the off-line operations are abstracted into head variables: NP(cats)/N(cats) become NP(_i)/N(_i) with _i ∈ {cats, dogs, ...}, and AP(nice)/ADJ(nice) become AP(_j)/ADJ(_j) with _j ∈ {nice, small, ...}; the left-anchor 'very' stays lexicalized.
Empirical tests on left association
● Left association on a wide-coverage LTAG:
– closure on left association
– termination condition: no repetition of the same root category along the chain (X_i ≠ X_j for the roots X_1 ... X_n)
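A naive version of this closure can be sketched as follows. The grammar is drastically simplified to a map saying which root category each template can left-associate into (the map below is invented for illustration; the real tests use a wide-coverage LTAG), and the termination test — no category repeated along the chain — follows the X_i ≠ X_j condition above.

```python
# Toy, invented map: which root category each template's tree can
# left-associate into.
links = {'ADVP': ['AP'], 'AP': ['NP'], 'NP': ['S'], 'S': []}

def close_left_association(seeds, links):
    """Breadth-first closure on chains of root categories: each chain
    is extended by one left association at a time and dropped when it
    would repeat a category (the X_i != X_j termination condition)."""
    closed, frontier = set(), {(s,) for s in seeds}
    while frontier:
        closed |= frontier
        frontier = {chain + (cat,)
                    for chain in frontier
                    for cat in links[chain[-1]]
                    if cat not in chain} - closed
    return closed

templates = close_left_association(list(links), links)
longest = max(templates, key=len)
print(len(templates), longest)   # 10 chains; the longest uses 3 left associations
```

On the toy map the closure yields 10 chains, the longest with 3 left associations; without the repetition check a cyclic map (e.g. NP into S and S into NP) would make the closure loop forever, which is why the condition is needed.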
Test 1: XTAG sample (English), 628 templates
● Result: 176,190 templates
– the maximum number of left associations is 7
– the maximum number of template occurrences (62,970) is reached with 4 left associations
● In previous experiments on the Penn Treebank, the maximum depth was 4 [Lombardo-Sturt02b].
Test 2: Italian treebank grammar, 988 templates
● Treebank filter: only left-associated templates attested in the treebank
● Result: 706,866 templates; the maximum number of left associations is 3.
Conclusions
● DVTAG gives an account of strong connectivity
● Left association produces a wide-coverage DVTAG
● Preliminary, naive empirical tests
Limitations and open issues
● Producing a DVTAG with left association is computationally intensive.
● How does lexicon size affect parsing complexity in DVTAG?
● Do we need underspecification techniques in a realistic setting (e.g. [Roark01])?
Thank you.
References
[Kamide-et-al03] Y. Kamide, G. T. M. Altmann, and S. L. Haywood. 2003. The time-course of prediction in incremental sentence processing: evidence from anticipatory eye movements. Journal of Memory and Language, 49.
[Lombardo-Sturt02] V. Lombardo and P. Sturt. 2002. Towards a dynamic version of TAG. In Proceedings of TAG+6.
[Lombardo-Sturt02b] V. Lombardo and P. Sturt. 2002. Incrementality and lexicalism: a treebank study. In The Lexical Basis of Sentence Processing.
[Milward95] D. Milward. 1995. Incremental interpretation of categorial grammar. In Proceedings of EACL 1995.
[Phillips03] C. Phillips. 2003. Linear order and constituency. Linguistic Inquiry, 34.
[Stabler94] E. P. Stabler. 1994. The finite connectivity of linguistic structure. In Perspectives on Sentence Processing.
[Sturt-Lombardo04] P. Sturt and V. Lombardo. 2004. The time-course of processing of coordinate sentences. Poster presented at the 17th annual CUNY Sentence Processing Conference.
[Roark01] B. Roark. 2001. Probabilistic top-down parsing and language modeling. Computational Linguistics, 27(1).
Lexicalized Tree Adjoining Grammars
● Extended domain of locality
● Factoring of recursion through the adjoining operation
● Lexicalization