Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), pp. 340-345, Copenhagen, August 1996. [See the cited TR, Eisner (1996), for the much-improved final results and experimental details. Algorithmic details are in subsequent papers.]

Three New Probabilistic Models for Dependency Parsing: An Exploration [*]

Jason M. Eisner
CIS Department, University of Pennsylvania
200 S. 33rd St., Philadelphia, PA 19104-6389, USA
jeisner@linc.cis.upenn.edu

Abstract

After presenting a novel O(n^3) parsing algorithm for dependency grammar, we develop three contrasting ways to stochasticize it. We propose (a) a lexical affinity model where words struggle to modify each other, (b) a sense tagging model where words fluctuate randomly in their selectional preferences, and (c) a generative model where the speaker fleshes out each word's syntactic and conceptual structure without regard to the implications for the hearer. We also give preliminary empirical results from evaluating the three models' parsing performance on annotated Wall Street Journal training text (derived from the Penn Treebank). In these results, the generative model performs significantly better than the others, and does about equally well at assigning part-of-speech tags.

[Figure 1, panel (a): the sentence "The man in the corner taught his dachshund to play golf EOS" with the tags DT NN IN DT NN VBD PRP$ NN TO VB NN and dependency links drawn between words. Panel (b): the same words arranged as a lexical tree, by depth: taught; man, play; The, in, dachshund, to, golf; corner, his; the.]

Figure 1: (a) A bare-bones dependency parse. Each word points to a single parent, the word it modifies; the head of the sentence points to the EOS (end-of-sentence) mark. Crossing links and cycles are not allowed. (b) Constituent structure and subcategorization may be highlighted by displaying the same dependencies as a lexical tree.

1 Introduction

In recent years, the statistical parsing community has begun to reach out for syntactic formalisms that recognize the individuality of words. Link grammars (Sleator and Temperley, 1991) and lexicalized tree-adjoining grammars (Schabes, 1992) have now received stochastic treatments. Other researchers, not wishing to abandon context-free grammar (CFG) but disillusioned with its lexical blind spot, have tried to re-parameterize stochastic CFG in context-sensitive ways (Black et al., 1992) or have augmented the formalism with lexical headwords (Magerman, 1995; Collins, 1996).

In this paper, we present a flexible probabilistic parser that simultaneously assigns both part-of-speech tags and a bare-bones dependency structure (illustrated in Figure 1). The choice of a simple syntactic structure is deliberate: we would like to ask some basic questions about where lexical relationships appear and how best to exploit them. It is useful to look into these basic questions before trying to fine-tune the performance of systems whose behavior is harder to understand. [1]

The main contribution of the work is to propose three distinct, lexicalist hypotheses about the probability space underlying sentence structure. We illustrate how each hypothesis is expressed in a dependency framework, and how each can be used to guide our parser toward its favored solution. Finally, we point to experimental results that compare the three hypotheses' parsing performance on sentences from the Wall Street Journal. The parser is trained on an annotated corpus; no hand-written grammar is required.

2 Probabilistic Dependencies

It cannot be emphasized too strongly that a grammatical representation (dependency parses, tag sequences, phrase-structure trees) does not entail any particular probability model. In principle, one could model the distribution of dependency parses

[*] This material is based upon work supported under a National Science Foundation Graduate Fellowship, and has benefited greatly from discussions with Mike Collins, Dan Melamed, Mitch Marcus and Adwait Ratnaparkhi.

[1] Our novel parsing algorithm also rescues dependency from certain criticisms: "Dependency grammars ... are not lexical, and (as far as we know) lack a parsing algorithm of efficiency comparable to link grammars." (Lafferty et al., 1992, p. 3)
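The well-formedness conditions stated in the Figure 1 caption (a single parent per word, exactly one sentence head pointing to EOS, no cycles, no crossing links) can be checked mechanically. The following Python sketch is illustrative only and is not the paper's parsing algorithm. The function name `is_well_formed`, the parent-index encoding, and the particular parent assignments for the example sentence are assumptions made here for illustration.

```python
from typing import List


def is_well_formed(parents: List[int]) -> bool:
    """Check the bare-bones dependency constraints from the Figure 1 caption.

    parents[i] is the index of word i's parent (hypothetical encoding);
    the index len(parents) stands for the EOS mark.
    """
    n = len(parents)
    eos = n

    # Each parent must be another word or EOS; no word modifies itself.
    if any(p < 0 or p > eos or p == i for i, p in enumerate(parents)):
        return False

    # Exactly one word, the head of the sentence, points to EOS.
    if sum(1 for p in parents if p == eos) != 1:
        return False

    # No cycles: following parent links from any word must reach EOS.
    for start in range(n):
        seen = set()
        node = start
        while node != eos:
            if node in seen:
                return False
            seen.add(node)
            node = parents[node]

    # No crossing links: links (a, b) and (c, d) cross when a < c < b < d.
    links = [(min(i, p), max(i, p)) for i, p in enumerate(parents)]
    for a, b in links:
        for c, d in links:
            if a < c < b < d:
                return False
    return True


# The sentence from Figure 1(a). The parent indices below are an assumed
# reading chosen for illustration, not copied from the paper's drawing.
words = ["The", "man", "in", "the", "corner", "taught",
         "his", "dachshund", "to", "play", "golf"]
parents = [1, 5, 1, 4, 2, 11, 7, 9, 9, 5, 9]  # 11 == EOS

figure_parse_ok = is_well_formed(parents)
crossing_ok = is_well_formed([2, 3, 3, 4])  # links 0-2 and 1-3 cross
cycle_ok = is_well_formed([1, 0, 3])        # words 0 and 1 point at each other
```

Storing one parent index per word enforces the single-parent condition by construction, so only the remaining three conditions need explicit checks.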