Proceedings of the 34th Annual Meeting of the ACL, Santa Cruz, June 1996, pp. 79-86.

Efficient Normal-Form Parsing for Combinatory Categorial Grammar*

Jason Eisner
Dept. of Computer and Information Science
University of Pennsylvania
200 S. 33rd St., Philadelphia, PA 19104-6389, USA
jeisner@linc.cis.upenn.edu

Abstract

Under categorial grammars that have powerful rules like composition, a simple n-word sentence can have exponentially many parses. Generating all parses is inefficient and obscures whatever true semantic ambiguities are in the input. This paper addresses the problem for a fairly general form of Combinatory Categorial Grammar, by means of an efficient, correct, and easy to implement normal-form parsing technique. The parser is proved to find exactly one parse in each semantic equivalence class of allowable parses; that is, spurious ambiguity (as carefully defined) is shown to be both safely and completely eliminated.

1 Introduction

Combinatory Categorial Grammar (Steedman, 1990), like other "flexible" categorial grammars, suffers from spurious ambiguity (Wittenburg, 1986). The non-standard constituents that are so crucial to CCG's analyses in (1), and in its account of intonational focus (Prevost & Steedman, 1994), remain available even in simpler sentences. This renders (2) syntactically ambiguous.

(1) a. Coordination: [[John likes]_S/NP, and [Mary pretends to like]_S/NP], the big galoot in the corner.
    b. Extraction: Everybody at this party [whom [John likes]_S/NP] is a big galoot.

(2) a. John [likes Mary]_S\NP.
    b. [John likes]_S/NP Mary.

The practical problem of "extra" parses in (2) becomes exponentially worse for longer strings, which can have up to a Catalan number of parses. An exhaustive parser serves up 252 CCG parses of (3), which must be sifted through, at considerable cost, in order to identify the two distinct meanings for further processing.¹

(3)  the    galoot  in         the    corner
     NP/N   N       (N\N)/NP   NP/N   N

     that           I          said       Mary
     (N\N)/(S/NP)   S/(S\NP)   (S\NP)/S   S/(S\NP)

     pretends            to                       like
     (S\NP)/(S_inf\NP)   (S_inf\NP)/(S_stem\NP)   (S_stem\NP)/NP

This paper presents a simple and flexible CCG parsing technique that prevents any such explosion of redundant CCG derivations. In particular, it is proved in §4.2 that the method constructs exactly one syntactic structure per semantic reading: e.g., just two parses for (3). All other parses are suppressed by simple normal-form constraints that are enforced throughout the parsing process. This approach works because CCG's spurious ambiguities arise (as is shown) in only a small set of circumstances. Although similar work has been attempted in the past, with varying degrees of success (Karttunen, 1986; Wittenburg, 1986; Pareschi & Steedman, 1987; Bouma, 1989; Hepple & Morrill, 1989; König, 1989; Vijay-Shanker & Weir, 1990; Hepple, 1990; Moortgat, 1990; Hendriks, 1993; Niv, 1994), this appears to be the first full normal-form result for a categorial formalism having more than context-free power.

2 Definitions and Related Work

CCG may be regarded as a generalization of context-free grammar (CFG), one where a grammar has infinitely many nonterminals and phrase-structure rules. In addition to the familiar atomic nonterminal categories (typically S for sentences, N for

* This material is based upon work supported under a National Science Foundation Graduate Fellowship. I have been grateful for the advice of Aravind Joshi, Nobo Komagata, Seth Kulick, Michael Niv, Mark Steedman, and three anonymous reviewers.

¹ Namely, Mary pretends to like the galoot in 168 parses and the corner in 84. One might try a statistical approach to ambiguity resolution, discarding the low-probability parses, but it is unclear how to model and train any probabilities when no single parse can be taken as the standard of correctness.
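
The growth in spurious parses described in the introduction can be checked on small inputs with an exhaustive chart counter. The following is only a minimal sketch, not the paper's parser: it assumes a rule set restricted to forward application, backward application, and forward composition, and the tuple encoding of categories and the names fwd, bwd, combine, count_parses, and chain are hypothetical choices made for this illustration. The type-raised category S/(S\NP) for "John" is likewise an assumption, made by analogy with the lexical categories of "I" and "Mary" in (3).

```python
from functools import lru_cache
from math import comb

# A category is an atom (a str) or a triple (result, slash, argument),
# e.g. ("S", "\\", "NP") stands for S\NP. This encoding is illustrative only.
def fwd(res, arg):
    return (res, "/", arg)

def bwd(res, arg):
    return (res, "\\", arg)

def combine(left, right):
    """Return every category obtainable from two adjacent constituents."""
    out = []
    if isinstance(left, tuple) and left[1] == "/":
        if left[2] == right:
            out.append(left[0])                # forward application: X/Y  Y  =>  X
        if isinstance(right, tuple) and right[1] == "/" and left[2] == right[0]:
            out.append(fwd(left[0], right[2])) # forward composition: X/Y  Y/Z  =>  X/Z
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        out.append(right[0])                   # backward application: Y  X\Y  =>  X
    return out

def count_parses(cats, goal):
    """Count the derivation trees that assign `goal` to the whole string."""
    cats = tuple(cats)

    @lru_cache(maxsize=None)
    def chart(i, j):
        # Maps each category spanning words i..j-1 to its number of derivations.
        if j - i == 1:
            return {cats[i]: 1}
        counts = {}
        for k in range(i + 1, j):
            for lc, ln in chart(i, k).items():
                for rc, rn in chart(k, j).items():
                    for cat in combine(lc, rc):
                        counts[cat] = counts.get(cat, 0) + ln * rn
        return counts

    return chart(0, len(cats)).get(goal, 0)

def chain(n):
    """n composable words: X0/X1, X1/X2, ..., X(n-2)/X(n-1), X(n-1)."""
    return [fwd(f"X{i}", f"X{i + 1}") for i in range(n - 1)] + [f"X{n - 1}"]

# Example (2): "John likes Mary" with an (assumed) type-raised subject.
NP, S = "NP", "S"
VP = bwd(S, NP)        # S\NP
john = fwd(S, VP)      # S/(S\NP)
likes = fwd(VP, NP)    # (S\NP)/NP
print(count_parses([john, likes, NP], S))  # 2, the bracketings (2a) and (2b)
print(count_parses(chain(4), "X0"))        # 5, the Catalan number C_3
```

In the chain case every binary bracketing succeeds, so the count for n words is the Catalan number C_(n-1), which is the worst case the text refers to. A normal-form parser of the kind the paper proposes would aim to keep one derivation per semantic reading rather than all of them.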