
Natural Language Parsing Technology Foundations of Language Science - PowerPoint PPT Presentation

Natural Language Parsing Technology. Foundations of Language Science and Technology (WS 2014/2015). Bernd Kiefer, Language Technology Lab, DFKI GmbH; Department of Computational Linguistics, Saarland University. November 2014.


  1. Natural Language Parsing Technology. Foundations of Language Science and Technology (WS 2014/2015). Bernd Kiefer, Language Technology Lab, DFKI GmbH; Department of Computational Linguistics, Saarland University. November 2014.

  2. Outline
     - Overview
     - Basic Parsing Algorithms
     - Parsing Strategies
     - CYK Algorithm
     - Earley's Algorithm
     - Parsing with Probabilistic Context-Free Grammar
     - PCFG
     - Inside-Outside Algorithm
     - Recent Advances in Parsing Technology


  4. Language & Grammar
     - Language
       - Structural
       - Productive
       - Ambiguous, yet efficient in human-human communication
     - Grammar
       - Generalization of regularities in language structures
       - Morphology & syntax, often complemented by phonetics, phonology, semantics, and pragmatics

  5. Ambiguity
     - Human languages are ambiguous on almost every layer
     - Grammar frameworks are designed to represent necessary ambiguities and eliminate unnecessary ones
     - Parsing models are responsible for retrieving valid analyses according to the grammar

  6. Syntactic Parser as NLP Component
     [Figure: pipeline diagram with the components PoS Tagging, Chunking, Morph. Analysis, NER, Syntactic Parsing, Semantic Analysis, ...]

  7. Trees (or not)
     [Figure: three analyses of "Sue gave Paul an old penny": a phrase-structure tree (S over NP and VP; VP over V, NP, NP; the last NP over Det, A, N), an attribute-value matrix for "gave" (PHON|ORTH, SYNSEM|LOC with CAT, HEAD verb, VAL with SUBJ and COMPS, and CONT|RELS give_rel with ARG1, ARG2, ARG3), and a dependency graph with the relations SBJ, IOBJ, DOBJ, DET, ADJ]

  8. Chomsky Hierarchy
     - Type 0 (unrestricted rewriting systems): $\alpha \to \beta$, with $\alpha, \beta \in (V_N \cup V_T)^*$
     - Type 1 (context-sensitive grammars): $\beta A \omega \to \beta \gamma \omega$, with $A \in V_N$ and $\beta, \gamma, \omega \in (V_N \cup V_T)^*$
     - Type 2 (context-free grammars): $A \to \beta$, with $A \in V_N$ and $\beta \in (V_N \cup V_T)^*$
     - Type 3 (regular grammars): $A \to xB \lor A \to x$, with $A, B \in V_N$ and $x \in V_T$

  9. Context-Free Grammar
     A CFG is a quadruple $\langle V_T, V_N, P, S \rangle$:
     - $V_T$: terminal symbols
     - $V_N$: non-terminal symbols
     - $P$: context-free productions $A \to \beta$, with $A \in V_N$ and $\beta \in (V_N \cup V_T)^*$
     - $S$: start symbol

  10. Context-Free Phrase Structure Grammar
      - S → NP VP
      - NP → Det N
      - N → Adj N
      - VP → V
      - VP → V NP
      - VP → Adv VP
      - N → dog | cat
      - Det → the | a
      - V → chases | sleeps
      - Adj → gray | lazy
      - Adv → fiercely
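
The quadruple definition and the toy grammar above can be written down as a small data structure. The following is only a sketch: the names `CFG`, `parse_rules`, and `make_grammar` are invented for illustration, and the terminals are simply taken to be every symbol that never appears on a left-hand side.

```python
from typing import NamedTuple


class CFG(NamedTuple):
    terminals: frozenset     # V_T
    nonterminals: frozenset  # V_N
    productions: tuple       # P, as (lhs, rhs) pairs with rhs a tuple of symbols
    start: str               # S


RULES = """
S -> NP VP
NP -> Det N
N -> Adj N
VP -> V
VP -> V NP
VP -> Adv VP
N -> dog | cat
Det -> the | a
V -> chases | sleeps
Adj -> gray | lazy
Adv -> fiercely
"""


def parse_rules(text):
    """Turn lines such as 'N -> dog | cat' into (lhs, rhs) pairs."""
    productions = []
    for line in text.strip().splitlines():
        lhs, alternatives = line.split("->")
        for alternative in alternatives.split("|"):
            productions.append((lhs.strip(), tuple(alternative.split())))
    return tuple(productions)


def make_grammar(text, start="S"):
    """Build the quadruple; symbols that are never a left-hand side count as terminals."""
    productions = parse_rules(text)
    nonterminals = frozenset(lhs for lhs, _ in productions)
    symbols = {symbol for _, rhs in productions for symbol in rhs}
    return CFG(frozenset(symbols - nonterminals), nonterminals, productions, start)


grammar = make_grammar(RULES)
print(sorted(grammar.nonterminals))
print(sorted(grammar.terminals))
```

Storing each right-hand side as a tuple keeps productions hashable, which is convenient when they are later used as dictionary keys or set members.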

  11. CFG Derivation
      - If $\beta = \gamma A \delta$, $\omega = \gamma \alpha \delta$ and $A \to \alpha \in P$, then $\omega$ follows $\beta$: $\beta \Rightarrow \omega$
      - If $\beta_1, \beta_2, \dots, \beta_m$ is a sequence of strings where for all $i$ ($1 \le i \le m-1$), $\beta_i \Rightarrow \beta_{i+1}$, then $\beta_1, \beta_2, \dots, \beta_m$ is a derivation from $\beta_1$ to $\beta_m$
      - "Derivable" relation (transitive, reflexive): $\beta_1 \overset{*}{\Rightarrow} \beta_m$
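
As a worked illustration of the derivation relation, the sketch below rewrites one non-terminal per step and records every sentential form. The helper names `derive_step` and `leftmost_derivation` are hypothetical, the production table is a subset of the toy grammar from the previous slide, and always rewriting the leftmost non-terminal is just one convenient choice.

```python
# A subset of the toy grammar, keyed by left-hand side (illustrative only).
PRODUCTIONS = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V",)],
    "Det": [("the",)],
    "N": [("dog",)],
    "V": [("sleeps",)],
}


def derive_step(form, position, rhs):
    """One derivation step: rewrite the non-terminal at `position` with `rhs`."""
    lhs = form[position]
    if rhs not in PRODUCTIONS.get(lhs, []):
        raise ValueError(f"{lhs} -> {' '.join(rhs)} is not a production")
    return form[:position] + rhs + form[position + 1:]


def leftmost_derivation(start, choices):
    """Apply each right-hand side in `choices` to the leftmost non-terminal."""
    form = (start,)
    steps = [form]
    for rhs in choices:
        position = next(i for i, symbol in enumerate(form) if symbol in PRODUCTIONS)
        form = derive_step(form, position, rhs)
        steps.append(form)
    return steps


steps = leftmost_derivation(
    "S",
    [("NP", "VP"), ("Det", "N"), ("the",), ("dog",), ("V",), ("sleeps",)],
)
for form in steps:
    print(" ".join(form))
```

The printed sequence starts at `S` and ends at `the dog sleeps`; each consecutive pair of lines is one application of the "follows" relation.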


  13. Parsing Strategies
      - Top-down: start from the start symbol and expand the tree with grammar rules (e.g. replace the LHS symbol with the RHS sequence of a CFG production)
      - Bottom-up: start from the input sequence and apply grammar rules to build trees upwards (e.g. reduce an RHS sequence to its LHS symbol)

  14. Top-Down Parsing
      - Goal-directed search
      - Wastes time on trees that do not match the input sentence
      - A pure top-down (left-first) approach cannot parse left-recursive grammars
      - Example rules: (1) S → NP VP, (2) NP → NP PP, (3) ...
      [Figure: tree in which S expands to NP VP and the left NP expands to NP PP again and again, without end]
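
Whether a grammar will trap a pure top-down, left-first parser can be checked before parsing. The sketch below finds non-terminals that can reach themselves through leftmost symbols. The function name `left_recursive_symbols` is hypothetical, the rules other than S → NP VP and NP → NP PP are added only to make the example complete, and the check assumes that no production has an empty right-hand side.

```python
def left_recursive_symbols(productions):
    """Return the non-terminals A that can derive a string starting with A.

    Assumption: no production has an empty right-hand side, so only the
    first symbol of each right-hand side needs to be followed.
    """
    # left_corner[A] = symbols that can appear first in a right-hand side of A
    left_corner = {}
    for lhs, rhs in productions:
        left_corner.setdefault(lhs, set()).add(rhs[0])

    recursive = set()
    for start in left_corner:
        seen = set()
        stack = list(left_corner[start])
        while stack:
            symbol = stack.pop()
            if symbol == start:
                recursive.add(start)
                break
            if symbol not in seen:
                seen.add(symbol)
                stack.extend(left_corner.get(symbol, ()))
    return recursive


rules = [
    ("S", ("NP", "VP")),
    ("NP", ("NP", "PP")),   # the left-recursive rule from the slide
    ("NP", ("Det", "N")),   # assumed extra rules, for illustration only
    ("PP", ("P", "NP")),
    ("VP", ("V", "NP")),
]
print(left_recursive_symbols(rules))
```

The search also follows indirect chains such as A → B ..., B → A ..., which block a left-first top-down parser in the same way as direct left recursion.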

  15. Bottom-Up Parsing
      - Uses the input to guide the search (data-driven)
      - Wastes time on trees that don't result in S
      - Recursive unary rules still create an infinite parse forest for a finite-length sentence
      - Example rules: (1) A → B | a, (2) B → A, (3) ...
      [Figure: unbounded unary chain ... B over A over B over A over the terminal a]
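
The effect of the recursive unary rules can be made concrete by building trees over the single word `a` bottom-up and capping the number of reduction rounds. This is a minimal sketch with a hypothetical helper `unary_chains`: every extra round yields one more tree, so without the cap the forest would never be complete.

```python
def unary_chains(word, unary_rules, max_depth):
    """Yield bracketed trees over `word` built with at most `max_depth` unary reductions."""
    frontier = [(word, word)]  # pairs of (root symbol, bracketed tree)
    for _ in range(max_depth):
        next_frontier = []
        for root, tree in frontier:
            for lhs, rhs in unary_rules:
                if rhs == root:  # reduce the current root to the rule's left-hand side
                    next_frontier.append((lhs, f"({lhs} {tree})"))
        for _, tree in next_frontier:
            yield tree
        frontier = next_frontier


# The rules from the slide: A -> B | a and B -> A
UNARY_RULES = [("A", "B"), ("A", "a"), ("B", "A")]

for tree in unary_chains("a", UNARY_RULES, max_depth=4):
    print(tree)
```

Practical parsers avoid this blow-up by storing each category only once per span, as a chart does, instead of enumerating every chain.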

  16. Problems
      - Left-recursion: NP → NP PP
      - Ambiguity
      - Repeated parsing of subtrees

  17. Dynamic Programming (DP)
      - Divisibility: the optimal solution of a subproblem is part of the optimal solution of the whole problem
      - Memoization: solve small problems only once and remember the answers
      - Example: calculating Fibonacci numbers, $F_n = F_{n-1} + F_{n-2}$ ($F_0 = 0$, $F_1 = 1$)
      - Example: Pascal's triangle (binomial coefficients), $\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}$
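
Both examples translate directly into memoized recursive functions. The sketch below uses `functools.lru_cache` from the Python standard library as the memo table; this is one possible way to do it, and a bottom-up table would work equally well.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """F_n = F_(n-1) + F_(n-2), with F_0 = 0 and F_1 = 1."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


@lru_cache(maxsize=None)
def binom(n, k):
    """Pascal's rule: C(n, k) = C(n-1, k-1) + C(n-1, k)."""
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1
    return binom(n - 1, k - 1) + binom(n - 1, k)


print(fib(30))
print([binom(5, k) for k in range(6)])
```

Without the cache, `fib(n)` recomputes the same subproblems exponentially often; with it, each value is computed once. This is the same saving a chart parser gets from storing each analyzed substring.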

  18. CYK Algorithm
      - Cocke-Younger-Kasami, also known as the CKY algorithm
      - Essentially a bottom-up chart parsing algorithm using dynamic programming
      - The CFG is in Chomsky Normal Form (CNF):
        - $A \to B\,C$
        - $A \to a$
        - $S \to \epsilon$
        - with $A, B, C \in V_N$, $a \in V_T$, and $B, C \neq S$
      - Fill in a two-dimensional array: $C[i][j]$ contains all the possible syntactic interpretations of the substring $w_{i+1} \dots w_j$
      - Complexity: $O(n^3)$ in the sentence length $n$, for a fixed grammar

  19. CYK Algorithm

          for all i, j with 0 ≤ i < j ≤ n do
              C[i][j] ← ∅
          end for
          for all i with 1 ≤ i ≤ n and all A → w_i ∈ P do
              C[i−1][i] ← {A} ∪ C[i−1][i]
          end for
          for s = 2 … n do
              for all A → B C ∈ P and all i, k with 0 ≤ i < k < i + s ≤ n do
                  if B ∈ C[i][k] ∧ C ∈ C[k][i+s] then
                      C[i][i+s] ← {A} ∪ C[i][i+s]
                  end if
              end for
          end for
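
The pseudocode maps closely onto Python. The following is a minimal recognizer sketch, not a reference implementation: the function names `cyk` and `recognizes` and the tiny CNF grammar are assumptions made for illustration, the chart is a dictionary keyed by `(i, j)` instead of a two-dimensional array, and the empty sentence (the S → ε case) is simply rejected.

```python
from collections import defaultdict


def cyk(words, lexical, binary):
    """Fill the chart: chart[i, j] holds every category spanning words i+1 .. j."""
    n = len(words)
    chart = defaultdict(set)
    # Lexical rules A -> w_i fill the cells of width 1.
    for i, word in enumerate(words, start=1):
        chart[i - 1, i] |= lexical.get(word, set())
    # Binary rules A -> B C fill wider spans from narrower ones.
    for s in range(2, n + 1):              # span width
        for i in range(0, n - s + 1):      # span start, so that i + s <= n
            for k in range(i + 1, i + s):  # split point
                for lhs, (b, c) in binary:
                    if b in chart[i, k] and c in chart[k, i + s]:
                        chart[i, i + s].add(lhs)
    return chart


def recognizes(words, lexical, binary, start="S"):
    """True if the start symbol spans the whole input (empty input is rejected)."""
    return len(words) > 0 and start in cyk(words, lexical, binary)[0, len(words)]


# Hypothetical CNF grammar, loosely based on the toy grammar of slide 10.
LEXICAL = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chases": {"V"}}
BINARY = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V", "NP"))]

print(recognizes("the dog chases the cat".split(), LEXICAL, BINARY))
print(recognizes("the dog chases".split(), LEXICAL, BINARY))
```

The three nested loops over span width, start position, and split point are where the cubic dependence on sentence length comes from; the innermost loop over rules adds a factor for grammar size.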

  20. CYK Chart Example
      Grammar:
      - S → NP VP | N VP | N V | NP V
      - VP → V NP | V N | VP PP
      - NP → D N | NP PP | N PP
      - PP → P NP | P N
      - N → john, girl, car
      - V → saw, walks
      - P → in
      - D → the, a

      Input with string positions: 0 john 1 saw 2 the 3 girl 4 in 5 a 6 car 7

  21. CYK Chart Example (lexical entries): C[0][1] = {N}, C[1][2] = {V}, C[2][3] = {D}, C[3][4] = {N}, C[4][5] = {P}, C[5][6] = {D}, C[6][7] = {N}

  22. CYK Chart Example (spans of length 2): C[0][2] = {S}, C[2][4] = {NP}, C[5][7] = {NP}

  23. CYK Chart Example (spans of length 3): C[1][4] = {VP}, C[4][7] = {PP}

  24. CYK Chart Example (spans of length 4): C[0][4] = {S}, C[3][7] = {NP}

  25. CYK Chart Example (spans of length 5): C[2][7] = {NP}

  26. CYK Chart Example (spans of length 6): C[1][7] = {VP}

  27. CYK Chart Example (span of length 7): C[0][7] = {S}
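
The chart for this example can be reproduced mechanically. The sketch below encodes the grammar from the slides and, as an extension that is not on the slides, stores in each cell the number of distinct derivations per category. The names `cyk_counts`, `BINARY`, and `LEXICON` are hypothetical, and words missing from the lexicon raise a `KeyError`.

```python
from collections import defaultdict

# Binary rules (lhs, left child, right child) and lexicon from the example slides.
BINARY = [
    ("S", "NP", "VP"), ("S", "N", "VP"), ("S", "N", "V"), ("S", "NP", "V"),
    ("VP", "V", "NP"), ("VP", "V", "N"), ("VP", "VP", "PP"),
    ("NP", "D", "N"), ("NP", "NP", "PP"), ("NP", "N", "PP"),
    ("PP", "P", "NP"), ("PP", "P", "N"),
]
LEXICON = {"john": "N", "girl": "N", "car": "N", "saw": "V", "walks": "V",
           "in": "P", "the": "D", "a": "D"}


def cyk_counts(words):
    """chart[i, j][A] = number of distinct derivations of words i+1 .. j from A."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        chart[i, i + 1][LEXICON[word]] = 1
    for s in range(2, n + 1):          # span width
        for i in range(n - s + 1):     # span start
            j = i + s
            for k in range(i + 1, j):  # split point
                for lhs, b, c in BINARY:
                    left = chart[i, k].get(b, 0)
                    right = chart[k, j].get(c, 0)
                    if left and right:
                        chart[i, j][lhs] += left * right
    return chart


sentence = "john saw the girl in a car".split()
chart = cyk_counts(sentence)
for (i, j), cell in sorted(chart.items(), key=lambda item: (item[0][1] - item[0][0], item[0])):
    if cell and j - i > 1:
        print(f"C[{i}][{j}] =", dict(cell))
```

Under this grammar the whole-sentence cell `C[0][7]` contains S with two derivations, which correspond to attaching the PP "in a car" either to the noun phrase "the girl" or to the verb phrase "saw the girl".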
