General Context-Free Grammar Parsing

Application of grammar rewrite rules

• A phrase structure grammar is also known as a context-free grammar (CFG)
• An example grammar and lexicon (a runnable sketch of this grammar follows below):

    S  → NP VP         DT  → the
    NP → DT NNS        NNS → children | students | mountains
    NP → DT NN         NN  → cake
    NP → NNS           VBD → slept | ate | saw
    NP → NP PP         IN  → in | of
    VP → VBD
    VP → VBD NP
    VP → VBD NP PP
    PP → IN NP

• Applying the rewrite rules derives sentences, e.g.:

    S → NP VP → DT NNS VBD → The children slept
    S → NP VP → DT NNS VBD NP → DT NNS VBD DT NN → The children ate the cake

Phrase structure is recursive

• Categories can contain themselves: with NP → NP PP, the grammar above already
  licenses arbitrarily deep nesting, as in the parse of "the students ate the
  cake of the children in the mountains":

    (S (NP (DT the) (NNS students))
       (VP (VBD ate)
           (NP (NP (DT the) (NN cake))
               (PP (IN of)
                   (NP (NP (DT the) (NNS children))
                       (PP (IN in)
                           (NP (DT the) (NNS mountains))))))))

Why we need recursive phrase structure

• Kupiec (1992): sometimes an HMM tagger goes awry. In "The velocity of the
  seismic waves rises to ...", the tagger labels waves as a verb, since a plural
  noun directly before a singular verb is locally unlikely
• The parse makes the real dependency clear: the singular subject head is
  velocity, and [of the seismic waves] is a PP nested inside the subject NP:

    [NP.sg The velocity [PP of [NP.pl the seismic waves]]] [VP.sg rises to ...]

• So we use at least context-free grammars, in general
• Language models face similar problems: The captain of the ship yelled out.

Constituency

• Phrase structure organizes words into nested constituents
• How do we know what is a constituent? (Not that linguists don't argue about
  some cases.)
• Distribution: a constituent behaves as a unit that appears in different places:
  ◮ John talked [to the children] [about drugs].
  ◮ John talked [about drugs] [to the children].
  ◮ *John talked drugs to the children about.
• Substitution/expansion/pro-forms:
  ◮ I sat [on the box / right on top of the box / there].
• Further tests: coordination, no intrusion, fragments, semantics, ...

Why we need phrase structure (2)

• Syntax gives important clues in information extraction tasks and some cases
  of named entity recognition:

    We have recently demonstrated that stimulation of [CELLTYPE human T and
    natural killer cells] with [PROTEIN IL-12] induces tyrosine phosphorylation
    of the [PROTEIN Janus family tyrosine kinase] [PROTEIN JAK2] and
    [PROTEIN Tyk2].

• Things that are the object of phosphorylate are likely proteins
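As a concrete aid, here is a minimal sketch of the toy grammar above as a
Python data structure, with a function that applies the rewrite rules top-down
to generate sentences. The names (RULES, expand) and the use of random rule
choice are illustrative assumptions, not from the slides.

```python
import random

# The toy grammar above. Keys are categories; each value lists the
# possible right-hand sides (lexical rules rewrite to single words).
RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["DT", "NNS"], ["DT", "NN"], ["NNS"], ["NP", "PP"]],
    "VP":  [["VBD"], ["VBD", "NP"], ["VBD", "NP", "PP"]],
    "PP":  [["IN", "NP"]],
    "DT":  [["the"]],
    "NNS": [["children"], ["students"], ["mountains"]],
    "NN":  [["cake"]],
    "VBD": [["slept"], ["ate"], ["saw"]],
    "IN":  [["in"], ["of"]],
}

def expand(symbol):
    """Rewrite a symbol top-down until only terminal words remain."""
    if symbol not in RULES:              # a terminal word: nothing to rewrite
        return [symbol]
    rhs = random.choice(RULES[symbol])   # pick one rewrite rule at random
    return [word for sym in rhs for word in expand(sym)]

# The recursion through NP -> NP PP is chosen rarely enough that
# generation terminates in practice.
random.seed(4)
for _ in range(3):
    print(" ".join(expand("S")))         # e.g. "the children ate the cake"
```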
Natural language grammars are ambiguous

The same string can often be rewritten in more than one way. For "The children
ate the cake with a spoon", the PP [with a spoon] can attach to the noun or to
the verb (a mechanical demonstration follows below):

• Prepositional phrase attaching to the noun:

    (S (NP (DT The) (NNS children))
       (VP (VBD ate)
           (NP (NP (DT the) (NN cake))
               (PP (IN with) (NP (DT a) (NN spoon))))))

• Prepositional phrase attaching to the verb:

    (S (NP (DT The) (NNS children))
       (VP (VP (VBD ate) (NP (DT the) (NN cake)))
           (PP (IN with) (NP (DT a) (NN spoon)))))

Penn Treebank sentences: an example

    ( (S (NP-SBJ (DT The) (NN move))
         (VP (VBD followed)
             (NP (NP (DT a) (NN round))
                 (PP (IN of)
                     (NP (NP (JJ similar) (NNS increases))
                         (PP (IN by)
                             (NP (JJ other) (NNS lenders)))
                         (PP (IN against)
                             (NP (NNP Arizona) (JJ real) (NN estate) (NNS loans))))))
             (, ,)
             (S-ADV (NP-SBJ (-NONE- *))
                    (VP (VBG reflecting)
                        (NP (NP (DT a) (VBG continuing) (NN decline))
                            (PP-LOC (IN in)
                                    (NP (DT that) (NN market)))))))
         (. .)))

Attachment ambiguities in a real sentence

The board approved [its acquisition] [by Royal Trustco Ltd.] [of Toronto]
[for $27 a share] [at its monthly meeting].

What is parsing?

• We want to run the grammar backwards to find the structures
• Parsing can be viewed as a search problem
• Parsing is a hidden data problem
• We search through the legal rewritings of the grammar
• We want to examine all structures for a string of words (for the moment)
• We can do this bottom-up or top-down
• This distinction is independent of depth-first/breadth-first search – we can
  do either both ways
• Doing this we build a search tree, which is different from the parse tree

Ambiguity

• Programming language parsers resolve local ambiguities with lookahead
• Natural languages have global ambiguities:
  ◮ I saw that gasoline can explode
  ◮ What is the size of the embedded NP? (Just [gasoline], with can a modal
    verb, or [gasoline can], a container?)
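The two attachments can be exhibited mechanically. The sketch below assumes
NLTK is installed (pip install nltk) and uses a trimmed version of the toy
grammar; the chart parser returns both trees for the ambiguous sentence.

```python
import nltk

# A trimmed grammar: NP -> NP PP gives noun attachment, VP -> VP PP
# gives verb attachment, so the sentence below gets exactly two parses.
grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> DT NNS | DT NN | NP PP
    VP  -> VBD NP | VP PP
    PP  -> IN NP
    DT  -> 'The' | 'the' | 'a'
    NNS -> 'children'
    NN  -> 'cake' | 'spoon'
    VBD -> 'ate'
    IN  -> 'with'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("The children ate the cake with a spoon".split()):
    print(tree)   # one tree per attachment site
```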
State space search

• States: partial rewritings of the grammar (sequences of words and categories)
• Operators: applications of grammar rewrite rules
• Start state: the start symbol (top-down) or the input string (bottom-up)
• Goal test: the state matches the input string (top-down) or is just the start
  symbol (bottom-up)
• Algorithm (a runnable version follows at the end of this section):

    stack = { startState }
    solutions = {}
    loop
        if stack is empty, return solutions
        state = remove-front(stack)
        if goal(state), push(state, solutions)
        stack = pushAll(expand(state, operators), stack)
    end

Human parsing

• Humans often do ambiguity maintenance:
  ◮ Have the police ... eaten their supper?
  ◮                ... come in and look around.
  ◮                ... taken out and shot.
• But humans also commit early and are "garden pathed":
  ◮ The man who hunts ducks out on weekends.
  ◮ The cotton shirts are made from grows in Mississippi.
  ◮ The horse raced past the barn fell.

Another phrase structure grammar

    S  → NP VP        N → cats
    VP → V NP         N → claws
    VP → V NP PP      N → people
    NP → NP PP        N → scratch
    NP → N            V → scratch
    NP → e            P → with
    NP → N N
    PP → P NP

(By linguistic convention, S is the start symbol, but in the PTB, we use the
unlabeled node at the top, which can rewrite various ways.)

A top-down derivation of cats scratch people with claws:

    S
    NP VP
    NP PP VP                3 choices
    NP PP PP VP             oops! (left recursion: this branch never terminates)
    N VP
    cats VP
    cats V NP               2 choices
    cats scratch NP
    cats scratch N          3 choices – showing 2nd
    cats scratch people     oops! (input remains; backtrack)
    cats scratch NP PP
    cats scratch N PP       3 choices – showing 2nd
    . . .
    cats scratch people with claws

Phrase structure (CF) grammars

G = ⟨T, N, S, R⟩
• T is a set of terminals
• N is a set of nonterminals
  ◮ For NLP, we usually distinguish out a set P ⊂ N of preterminals, which
    always rewrite as terminals
• S is the start symbol (one of the nonterminals)
• R is a set of rules/productions of the form X → γ, where X is a nonterminal
  and γ is a sequence of terminals and nonterminals (possibly empty)
• A grammar G generates a language L

Recognizers and parsers

• A recognizer is a program which, for a given grammar and a given sentence,
  returns yes if the sentence is accepted by the grammar (i.e., the sentence is
  in the language) and no otherwise
• A parser, in addition to doing the work of a recognizer, also returns the set
  of parse trees for the string
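Here is a runnable sketch of the agenda algorithm above, used as a recognizer
that searches top-down through rewritings of the "cats scratch" grammar. The
visited set and the length cap are assumptions added here: without them, the
left-recursive rule NP → NP PP and the empty rule NP → e would make the bare
algorithm loop forever (exactly the problems discussed next). A queue gives
breadth-first search; pushing onto the front instead would give depth-first,
the orthogonal choice the slides mention.

```python
from collections import deque

CATS_GRAMMAR = {
    "S":  [["NP", "VP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "NP": [["NP", "PP"], ["N"], [], ["N", "N"]],   # [] is the empty rule NP -> e
    "PP": [["P", "NP"]],
    "N":  [["cats"], ["claws"], ["people"], ["scratch"]],
    "V":  [["scratch"]],
    "P":  [["with"]],
}

def recognize(words, start="S", slack=4):
    agenda = deque([(start,)])               # stack = { startState }
    visited = {(start,)}                     # assumption: prune revisited states
    while agenda:                            # loop ...
        state = agenda.popleft()             # state = remove-front(stack)
        if list(state) == words:             # goal: state is exactly the input
            return True
        for i, sym in enumerate(state):      # find the leftmost category
            if sym in CATS_GRAMMAR:
                if list(state[:i]) != words[:i]:
                    break                    # leading words don't match: dead end
                for rhs in CATS_GRAMMAR[sym]:        # expand(state, operators)
                    new = state[:i] + tuple(rhs) + state[i + 1:]
                    # assumption: cap state length to tame NP -> NP PP
                    if len(new) <= len(words) + slack and new not in visited:
                        visited.add(new)
                        agenda.append(new)
                break
    return False

print(recognize("cats scratch people with claws".split()))  # True
print(recognize("with cats".split()))                       # False: no V
```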
Top-down parsing

• Top-down parsing is goal directed
• A top-down parser starts with a list of constituents to be built. It rewrites
  the goals in the goal list by matching one against the LHS of a grammar rule
  and expanding it with the RHS, attempting to match the sentence to be derived
• If a goal can be rewritten in several ways, then there is a choice of which
  rule to apply (search problem)
• Can use depth-first or breadth-first search, and goal ordering

Soundness and completeness

• A parser is sound if every parse it returns is valid/correct
• A parser terminates if it is guaranteed to not go off into an infinite loop
• A parser is complete if for any given grammar and sentence it is sound,
  produces every valid parse for that sentence, and terminates
• (For many purposes, we settle for sound but incomplete parsers: e.g.,
  probabilistic parsers that return a k-best list)

Bottom-up parsing

• Bottom-up parsing is data directed
• The initial goal list of a bottom-up parser is the string to be parsed. If a
  sequence in the goal list matches the RHS of a rule, then this sequence may
  be replaced by the LHS of the rule
• Parsing is finished when the goal list contains just the start category
• If the RHS of several rules match the goal list, then there is a choice of
  which rule to apply (search problem)
• Can use depth-first or breadth-first search, and goal ordering
• The standard presentation is as shift-reduce parsing (a sketch follows below)

Problems with top-down parsing

• Left recursive rules loop forever (see the final sketch below)
• A top-down parser will do badly if there are many different rules for the
  same LHS. Consider if there are 600 rules for S, 599 of which start with NP,
  but one of which starts with V, and the sentence starts with V
• Useless work: expands things that are possible top-down but not there
• Top-down parsers do well if there is useful grammar-driven control: search
  is directed by the grammar
• Top-down is hopeless for rewriting parts of speech (preterminals) with words
  (terminals). In practice that is always done bottom-up, as lexical lookup
• Repeated work: anywhere there is common substructure

Problems with bottom-up parsing

• Unable to deal with empty categories: termination problem, unless rewriting
  empties as constituents is somehow restricted (but then it's generally
  incomplete)
• Useless work: builds things that are locally possible, but globally impossible
• Inefficient when there is great lexical ambiguity (grammar-driven control
  might help here)
• Conversely, it is data-directed: it attempts to parse the words that are there
• Repeated work: anywhere there is common substructure

Principles for success: what one needs to do

• If you are going to do parsing-as-search with a grammar as is:
  ◮ Left recursive structures must be found, not predicted
  ◮ Empty categories must be predicted, not found
• Doing these things doesn't fix the repeated work problem
• Both TD (LL) and BU (LR) parsers can (and frequently do) do work exponential
  in the sentence length on NLP problems
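A minimal sketch of bottom-up recognition as a search over shift and reduce
moves, on the "cats scratch" grammar with the empty NP rule dropped (plain
bottom-up parsing cannot predict empty categories, as noted above). All names
are illustrative.

```python
SR_GRAMMAR = {
    "S":  [["NP", "VP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "NP": [["NP", "PP"], ["N"], ["N", "N"]],    # NP -> e deliberately omitted
    "PP": [["P", "NP"]],
    "N":  [["cats"], ["claws"], ["people"], ["scratch"]],
    "V":  [["scratch"]],
    "P":  [["with"]],
}

def reductions(stack):
    """All ways to replace a stack suffix matching some RHS by its LHS."""
    out = []
    for lhs, rhss in SR_GRAMMAR.items():
        for rhs in rhss:
            if list(stack[-len(rhs):]) == rhs:
                out.append(stack[:-len(rhs)] + (lhs,))
    return out

def sr_recognize(words, start="S"):
    agenda = [((), tuple(words))]        # depth-first search over move choices
    seen = set()
    while agenda:
        stack, rest = agenda.pop()
        if (stack, rest) in seen:
            continue                     # assumption: prune repeated states
        seen.add((stack, rest))
        if stack == (start,) and not rest:
            return True                  # goal list is just the start category
        for reduced in reductions(stack):        # reduce: RHS match -> LHS
            agenda.append((reduced, rest))
        if rest:                                 # shift the next word
            agenda.append((stack + (rest[0],), rest[1:]))
    return False

print(sr_recognize("cats scratch people with claws".split()))  # True
print(sr_recognize("scratch with cats".split()))               # False
```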
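And a tiny illustration of the left-recursion problem: a naive recursive-descent
recognizer predicts NP → NP PP before consuming any input, so it recurses until
the interpreter's stack limit rather than until the input is exhausted. Grammar
and helper names are hypothetical.

```python
import sys
sys.setrecursionlimit(200)   # fail fast instead of grinding through frames

LR_GRAMMAR = {"NP": [["NP", "PP"], ["N"]],
              "PP": [["P", "NP"]],
              "N":  [["cake"]],
              "P":  [["of"]]}

def descend(sym, words, i):
    """Naive recursive descent: input positions reachable after sym at i."""
    if sym not in LR_GRAMMAR:                      # terminal word
        return {i + 1} if i < len(words) and words[i] == sym else set()
    ends = set()
    for rhs in LR_GRAMMAR[sym]:
        positions = {i}
        for part in rhs:                           # predict each RHS symbol
            positions = {j for k in positions for j in descend(part, words, k)}
        ends |= positions
    return ends

try:
    descend("NP", "cake of cake".split(), 0)
except RecursionError:
    print("NP -> NP PP: predicted itself forever without consuming a word")
```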