Assignment 2: Parsing
PCFG and CKY with Coarse-to-Fine Pruning (C2FP)
Chan Young Park
Background: PCFG Recap
Example grammar with rule probabilities:
S → NP VP 1.0
NP → N 0.5
NP → NP PP 0.25
NP → DT NN 0.25
VP → ADV V NP 1.0
PP → P NP 1.0
N → I 0.33
N → students 0.33
N → telescope 0.33
ADV → recently 1.0
V → saw 1.0
P → with 1.0
DT → a 1.0
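The rule probabilities above are relative-frequency (MLE) estimates: each rule's count divided by the total count of its left-hand side. A minimal sketch in Python (the assignment itself is in Java; the function name and data layout here are illustrative, not the provided classes):

```python
from collections import Counter

def estimate_pcfg(rule_counts):
    """Relative-frequency (MLE) estimate: P(A -> beta) = count(A -> beta) / count(A)."""
    lhs_totals = Counter()
    for (lhs, rhs), c in rule_counts.items():
        lhs_totals[lhs] += c
    return {(lhs, rhs): c / lhs_totals[lhs] for (lhs, rhs), c in rule_counts.items()}

# Toy counts: suppose NP was expanded 4 times in total in the treebank.
counts = {("NP", ("N",)): 2, ("NP", ("NP", "PP")): 1, ("NP", ("DT", "NN")): 1}
probs = estimate_pcfg(counts)
print(probs[("NP", ("N",))])        # 0.5
print(probs[("NP", ("NP", "PP"))])  # 0.25
```

Note that the probabilities of all expansions of the same symbol sum to 1, as in the grammar above.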
Main Implementation
1. Main entry point: PCFGParserTester (+ baseline: BaselineParser)
2. Two classes you need to implement:
   - GenerativeParserFactory
   - CoarseToFineParserFactory (optional)
3. Methods you need to implement:
   - getParser(List<Tree<String>> trainTrees)
   - getBestParse(List<String> sentence)
Two Methods to Implement
1. getParser(List<Tree<String>> trainTrees)
   ○ Use the given classes to build a parser: Grammar, SimpleLexicon, UnaryClosure, and annotateTrees(trainTrees)
2. getBestParse(List<String> sentence)
   ○ CKY algorithm (buildChart) + backtracking (getBestTree)
The Assignment: Parsing
In summary, what do you need to implement?
● [Pre-processing] Tree annotation
● [Main stuff] CKY algorithm that can handle unaries
● [Post-processing] Extracting the best tree from the backpointers
● [Extra credit] Coarse-to-fine pruning
1. Tree Annotation
Binarization + Markovization + Parent Annotation
Binarization + Markovization + Parent Annotation
● Tree binarization
● Horizontal Markovization
● Parent annotation
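Right-branching binarization with horizontal Markovization can be sketched as follows. This is a hypothetical Python illustration, not the assignment's TreeAnnotations code; trees are (label, children) tuples, and the @NP->_DT intermediate-label format is one common convention, not necessarily the one the provided code uses:

```python
def binarize(node, h=2):
    """Right-binarize a tree, keeping at most h already-generated sibling
    labels in each intermediate symbol (horizontal Markovization of order h).
    Assumes nodes with more than two children have non-terminal children."""
    if isinstance(node, str):          # a word at a leaf
        return node
    label, kids = node
    kids = [binarize(k, h) for k in kids]
    while len(kids) > 2:
        # Merge the last two children under an intermediate symbol whose
        # name records (up to h of) the sibling labels to their left.
        left_labels = [k[0] for k in kids[:-2]][-h:]
        inter = "@%s->_%s" % (label, "_".join(left_labels))
        kids = kids[:-2] + [(inter, kids[-2:])]
    return (label, kids)

tree = ("NP", [("DT", ["a"]), ("JJ", ["big"]), ("NN", ["dog"])])
print(binarize(tree))
# ('NP', [('DT', ['a']), ('@NP->_DT', [('JJ', ['big']), ('NN', ['dog'])])])
```

Lowering h collapses intermediate symbols that differ only in distant siblings, which shrinks the grammar at the cost of conditioning on less context.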
TreeAnnotations
● edu.berkeley.nlp.assignments.parsing.TreeAnnotations
● Choose which annotation order you want to apply first
2. CKY Algorithm
Filling in the Unary & Binary Charts
Slide credit: lecture slides from the Stanford Coursera course "Probabilistic Parsing"
CKY Main Algorithm
● Standard CKY works with binary rules only
● One possible way to handle unary rules: use two charts
  ○ Binary chart: stores the scores of non-terminals after applying binary rules
  ○ Unary chart: stores the scores of non-terminals after applying unary rules
  ○ Alternate between filling the unary and binary charts
Possible Ways to Fill the Charts
[Main Stuff] CKY Algorithm: two loop orders over chart cells (min, max)
● Method 1: based on span length, then min
● Method 2: based on max, then min
CKY Main Algorithm
<fill in possible pre-terminals and unary closures of those>
for each max from 2 to n
  for each min from max-2 down to 0
    for each non-terminal C
      for each binary rule C -> C1 C2
        for each mid from min+1 to max-1
          if unary_chart[min][mid][C1] and unary_chart[mid][max][C2] then
            binary_chart[min][max][C] = score(min, mid, max, C, C1, C2)
  <fill in unary_chart based on binary_chart, but with unary rules>
● We can improve the speed by avoiding testing unnecessary rules
CKY Main Algorithm
<fill in possible pre-terminals and unary closures of those>
for each max from 2 to n
  for each min from max-2 down to 0
    for each mid from min+1 to max-1
      for each non-terminal C1 present at [min][mid]
        for each binary rule C -> C1 C2
          if unary_chart[min][mid][C1] and unary_chart[mid][max][C2] then
            binary_chart[min][max][C] = score(min, mid, max, C, C1, C2)
  <fill in unary_chart based on binary_chart, but with unary rules>
● You can experiment with other ways to prune the rules to be tested
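The pseudocode above can be sketched as runnable Python. The assignment itself uses Java and the provided Grammar/Lexicon classes; the rule representation and toy grammar below are assumptions made for illustration. unary_rules plays the role of the unary closure, with rule probabilities already multiplied along each chain:

```python
def cky(words, lexicon, binary_rules, unary_rules):
    """Viterbi CKY over half-open spans [min, max), alternating a binary
    chart (binc) and a unary chart (unic), as in the pseudocode above."""
    n = len(words)
    binc = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    unic = [[{} for _ in range(n + 1)] for _ in range(n + 1)]

    def apply_unaries(mn, mx):
        cell, out = binc[mn][mx], unic[mn][mx]
        out.update(cell)  # reflexive closure: every A also counts as A => A
        for p, c, pr in unary_rules:
            if c in cell and pr * cell[c] > out.get(p, 0.0):
                out[p] = pr * cell[c]

    # Base case: pre-terminals from the lexicon, then unary closures of those.
    for i, w in enumerate(words):
        for tag, pr in lexicon[w]:
            binc[i][i + 1][tag] = max(binc[i][i + 1].get(tag, 0.0), pr)
        apply_unaries(i, i + 1)

    # Main loops: binary rules fill binc, then the unary closure fills unic.
    for mx in range(2, n + 1):
        for mn in range(mx - 2, -1, -1):
            cell = binc[mn][mx]
            for mid in range(mn + 1, mx):
                left, right = unic[mn][mid], unic[mid][mx]
                for p, c1, c2, pr in binary_rules:
                    if c1 in left and c2 in right:
                        s = pr * left[c1] * right[c2]
                        if s > cell.get(p, 0.0):
                            cell[p] = s
            apply_unaries(mn, mx)
    return unic[0][n]

# Tiny binarized toy grammar (hypothetical, not the recap grammar):
lexicon = {"I": [("N", 0.33)], "saw": [("V", 1.0)], "students": [("N", 0.33)]}
binary_rules = [("S", "NP", "VP", 1.0), ("VP", "V", "NP", 1.0)]
unary_rules = [("NP", "N", 0.5)]
scores = cky("I saw students".split(), lexicon, binary_rules, unary_rules)
print(scores["S"])  # 0.33*0.5 * 1.0 * 0.33*0.5 * 1.0 * 1.0 ≈ 0.027225
```

A real implementation would also record backpointers in each cell and iterate only over rules whose children are present, as the pruned loop order above suggests.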
The Grammar: Binary Rules
CKY Implementation Details
● Utilize these functions well:
  ○ Grammar.binaryRulesBy{LeftChild,RightChild,Parent}
  ○ UnaryClosure.getClosedUnaryRulesBy{Child,Parent}
  ○ Lexicon.scoreTagging
● How do you store the possible non-terminals per cell?
  ○ A fixed list of all non-terminals? How many are there? Is it small enough?
  ○ A dynamic-size array of non-terminals? Is it fast enough?
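One possible compromise between the two storage options, sketched in Python (hypothetical, not part of the provided code): a dense score array indexed by non-terminal id, plus a per-cell list of the ids actually present, so lookups are O(1) but iteration skips empty entries:

```python
class Cell:
    """One chart cell: dense scores indexed by non-terminal id, plus a list
    of the ids that are actually present."""
    def __init__(self, num_symbols):
        self.score = [0.0] * num_symbols   # 0.0 means "not derivable"
        self.active = []                   # ids with score > 0, insertion order
    def update(self, sym, s):
        """Record score s for symbol sym if it improves the cell; return
        whether anything changed (useful for backpointer bookkeeping)."""
        if s > self.score[sym]:
            if self.score[sym] == 0.0:
                self.active.append(sym)
            self.score[sym] = s
            return True
        return False

cell = Cell(4)
cell.update(2, 0.5)
cell.update(2, 0.25)   # no improvement, ignored
cell.update(0, 0.1)
print(cell.active)     # [2, 0]
print(cell.score[2])   # 0.5
```

In Java the same idea maps to a double[] per cell (primitives, not boxed Objects), which also answers the "chart of Objects or primitives?" tip later.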
3. Extracting the Best Tree
Extracting the Tree: Following Backpointers
● What information do you need to store?
  ○ It should uniquely identify the previous step of the bottom-up CKY process
● What happens if no valid parse can be found?
  ○ For the purpose of this assignment, you still need to return a Tree object
● Recursive or iterative method?
● Don't forget to debinarize the resulting tree: TreeAnnotations.unAnnotateTree()
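A sketch of backpointer extraction in Python (illustrative only; the assignment's buildTree works over its own chart and Tree classes). Each backpointer records just enough to uniquely identify the step that produced the chart entry, which is exactly the requirement above:

```python
def build_tree(back, mn, mx, sym):
    """Recursively follow backpointers into a nested (label, children) tree.
    back[(mn, mx, sym)] is one of:
      ("terminal", word), ("unary", child), ("binary", mid, left, right).
    Unary backpointers are stored only when a real unary rule fired, so the
    recursion cannot loop on A => A."""
    entry = back[(mn, mx, sym)]
    if entry[0] == "terminal":
        return (sym, [entry[1]])
    if entry[0] == "unary":
        return (sym, [build_tree(back, mn, mx, entry[1])])
    _, mid, left, right = entry
    return (sym, [build_tree(back, mn, mid, left),
                  build_tree(back, mid, mx, right)])

# Backpointers for "I saw" under a toy grammar (hypothetical entries):
back = {
    (0, 2, "S"): ("binary", 1, "NP", "VP"),
    (0, 1, "NP"): ("unary", "N"),
    (0, 1, "N"): ("terminal", "I"),
    (1, 2, "VP"): ("unary", "V"),
    (1, 2, "V"): ("terminal", "saw"),
}
print(build_tree(back, 0, 2, "S"))
# ('S', [('NP', [('N', ['I'])]), ('VP', [('V', ['saw'])])])
```

Deep recursion can overflow the stack on very long sentences, which is one reason to consider the iterative variant the slide asks about.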
buildTree()
4. Coarse-to-Fine Pruning
Extra Credit: Coarse-to-Fine Pruning
● Coarse-to-fine pruning: a more advanced pruning method
● Idea: prune non-terminals that are not plausible as part of the full tree
● "Not plausible" is defined as having a small enough posterior probability
● This is where the inside-outside algorithm comes into play
[Figure: pruning illustrated on a START → … → STOP lattice over the labels PER/ORG/LOC/O]
Extra Credit: Coarse-to-Fine Pruning
● Calculate inside scores
● Calculate outside scores
● Calculate posterior probabilities (then prune if a posterior is below a certain threshold)
● A detailed explanation of these equations is available in the additional note we provide
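Given inside and outside scores, the pruning step itself is simple: keep a symbol over a span only if its posterior, inside times outside normalized by the sentence probability, clears a threshold. A Python sketch with made-up numbers (the score representation, map layout, and threshold are all assumptions; see the provided note for the actual equations):

```python
def prune_mask(inside, outside, sentence_prob, threshold=1e-4):
    """Keep symbol A over span (i, j) iff its posterior
        P(A spans i..j | sentence) = inside(i, j, A) * outside(i, j, A) / P(sentence)
    is at least `threshold`. inside/outside map (i, j, A) -> probability."""
    keep = set()
    for key, ins in inside.items():
        post = ins * outside.get(key, 0.0) / sentence_prob
        if post >= threshold:
            keep.add(key)
    return keep

# Toy numbers (hypothetical): sentence probability 0.02 under the coarse grammar.
inside = {(0, 2, "NP"): 0.1, (0, 2, "VP"): 0.004}
outside = {(0, 2, "NP"): 0.15, (0, 2, "VP"): 0.001}
mask = prune_mask(inside, outside, sentence_prob=0.02, threshold=1e-3)
print(mask)  # {(0, 2, 'NP')}: VP's posterior 0.0002 falls below the threshold
```

The fine (fully annotated) pass then skips any cell entry whose coarse projection is not in the mask.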
Some Tips
● Print the rules after your Markovization (binary rules, unary rules, expanded unary chains, label set)
● As always, keep a small set of sentences and trees that you can process manually, and test on them
● Getting a very high* F1 or a very fast* decoding time might earn extra points
  *Thresholds will be decided later
Some Tips
● Use UnaryClosure.getClosedUnaryRulesBy instead of Grammar.getUnaryRulesBy (+ UnaryClosure.getPath to unroll the closed unary rules)
● Loop order? {max, min, grammar}?
● How can you reduce the number of grammar accesses?
● How can you reduce the size of the grammar (without reducing the Markovization order)?
● Chart of Objects or of primitives?
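Unrolling a closed unary rule back into its chain of intermediate nodes, as the getPath tip enables, can be sketched as follows. This is a Python illustration; the path format ["S", "VP", "VB"] is an assumption about what getPath returns, and trees are (label, children) tuples:

```python
def unroll_unary_path(path, bottom_tree):
    """Expand a closed unary rule A => C back into its full chain.
    path lists the labels along the chain from top to bottom, e.g.
    ["S", "VP", "VB"]; bottom_tree is the already-built subtree for the
    last label. Wrap it in the remaining labels, innermost first."""
    tree = bottom_tree
    for label in reversed(path[:-1]):
        tree = (label, [tree])
    return tree

print(unroll_unary_path(["S", "VP", "VB"], ("VB", ["run"])))
# ('S', [('VP', [('VB', ['run'])])])
```

Without this step, the extracted tree would contain collapsed unary edges that do not correspond to any rule in the original grammar.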
Some Questions to Explore
● Right-branching vs. left-branching (vs. other?) during binarization?
● What is the use of parent annotation? Of horizontal Markovization? Any examples showing their benefits/drawbacks?
● Which grammar should be used as the coarse grammar in coarse-to-fine pruning?
Read the References!
● Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. https://nlp.stanford.edu/~manning/papers/unlexicalized-parsing.pdf