
Basic Parsing Algorithms: Top-Down & Bottom-Up Parsing



  1. Basic Parsing Algorithms: Top-Down & Bottom-Up Parsing. Kurt Eberle, k.eberle@lingenio.de. (Many slides, parts of slides, and materials are taken from Helmut Schmid's parsing course, WS14 Tübingen, among other sources.) 24 February 2020

  2. Overview: Top-Down Parser, Bottom-Up Parser

  3. Overview: Top-Down Parser, Bottom-Up Parser

  4. Classification of Parsing Methods
     - top-down vs. bottom-up
     - derivation-oriented vs. table-driven vs. chart-based

  5. Top-Down Parser
     Idea: Systematically enumerate all left-most derivations until the input string has been derived.
     Left-most derivation: the left-most non-terminal is expanded in each step.
     Input: a n v a n

  6. Top-Down Parser: Example
     Input: a n v a n
     Grammar:
       S → NP VP
       NP → a n
       VP → v NP NP
       VP → v NP
       VP → v
     Left-most derivation:
       S ⇒ NP VP
         ⇒ a n VP
         ⇒ a n v NP NP
         ⇒ a n v a n NP
         ⇒ a n v a n a n (go 3 steps back)
         ⇒ a n v NP
         ⇒ a n v a n
     Other name: recursive descent parser

  7. Top-Down Parser
     - Non-deterministic regarding
       - the choice of the non-terminal
       - the choice of the rule
     - Convention for non-terminal selection: left-most derivation
     - Backtracking in order to try different grammar rules
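The backtracking behaviour described on slides 5 to 7 can be sketched as a small recursive descent recognizer. This is a minimal sketch, not the course's implementation: the names `GRAMMAR`, `parse_symbol`, `parse_sequence` and `recognize` are hypothetical, and the sketch assumes a grammar without left recursion. Python generators stand in for backtracking: each alternative rule is tried in turn, and a failed alternative simply yields nothing.

```python
from typing import Iterator, List, Sequence

# Grammar from the example slide; any key of GRAMMAR is a non-terminal,
# every other symbol is treated as a terminal.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["a", "n"]],
    "VP": [["v", "NP", "NP"], ["v", "NP"], ["v"]],
}


def parse_symbol(symbol: str, tokens: List[str], pos: int) -> Iterator[int]:
    """Yield every input position at which `symbol` can end when it starts at `pos`."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input token.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    # Non-terminal: try each rule in order (this is where backtracking happens).
    for rhs in GRAMMAR[symbol]:
        yield from parse_sequence(rhs, tokens, pos)


def parse_sequence(symbols: Sequence[str], tokens: List[str], pos: int) -> Iterator[int]:
    """Yield every end position for the symbol sequence, expanding left to right."""
    if not symbols:
        yield pos
        return
    for mid in parse_symbol(symbols[0], tokens, pos):
        yield from parse_sequence(symbols[1:], tokens, mid)


def recognize(sentence: str) -> bool:
    """Return True if the whitespace-separated sentence is derivable from S."""
    tokens = sentence.split()
    return any(end == len(tokens) for end in parse_symbol("S", tokens, 0))


if __name__ == "__main__":
    for s in ["a n v a n", "a n v a n a n", "a n v a"]:
        print(s, "->", recognize(s))
```

For the input `a n v a n`, the rule VP → v NP NP is tried first and fails at the second NP, after which the generator falls through to VP → v NP, mirroring the "go 3 steps back" step of the example.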

  8. Formal Characterization TD
     - Configuration: a pair (α, r) such that α ∈ (V ∪ Σ)*, r ∈ Σ*; pair = <sentential form, remaining string to be recognized>
     - Start configuration: (S, w)
     - Configuration transitions:
       - (aα, aw) ↦ (α, w) ("consumption" of an expected terminal symbol)
       - (Aβ, r) ↦ (αβ, r) with A → α ∈ P (expansion, left-most derivation step)
     - End configuration: (ε, ε) (complete parse found)

  9. Configuration Transitions TD
     (S, a n v a n a n)
     (NP VP, a n v a n a n)
     (a n VP, a n v a n a n)
     (n VP, n v a n a n)
     (VP, v a n a n)
     (v NP NP, v a n a n)
     (NP NP, a n a n)
     (a n NP, a n a n)
     (n NP, n a n)
     (NP, a n)
     (a n, a n)
     (n, n)
     (ε, ε)
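The top-down transition system (consumption and expansion steps over configurations) can be simulated directly. The following is a minimal sketch under stated assumptions: `td_parse` and `GRAMMAR` are hypothetical names, rules are tried in the order listed, and the search is a plain depth-first search with backtracking that returns only the successful sequence of configurations.

```python
from typing import List, Optional, Tuple

Config = Tuple[Tuple[str, ...], Tuple[str, ...]]  # (sentential form, remaining input)

GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("a", "n")],
    "VP": [("v", "NP", "NP"), ("v", "NP"), ("v",)],
}


def td_parse(tokens: List[str]) -> Optional[List[Config]]:
    """Return the configurations from (S, w) to (ε, ε), or None if w is not derivable."""

    def search(alpha, rest, path):
        path = path + [(alpha, rest)]  # new list, so failed branches leave no trace
        if not alpha and not rest:
            return path  # end configuration (ε, ε)
        if not alpha:
            return None  # sentential form used up, but input remains
        first, tail = alpha[0], alpha[1:]
        if first in GRAMMAR:
            # Expansion: (Aβ, r) ↦ (αβ, r) for each rule A → α.
            for rhs in GRAMMAR[first]:
                result = search(rhs + tail, rest, path)
                if result is not None:
                    return result
            return None
        # Consumption: (aα, aw) ↦ (α, w).
        if rest and rest[0] == first:
            return search(tail, rest[1:], path)
        return None

    return search(("S",), tuple(tokens), [])


if __name__ == "__main__":
    for alpha, rest in td_parse("a n v a n a n".split()):
        print("(", " ".join(alpha) or "ε", ",", " ".join(rest) or "ε", ")")
```

With the rule order used here, the input `a n v a n a n` succeeds on the first VP rule, so the returned path contains the same 13 configurations as the slide. For `a n v a n`, the dead end through VP → v NP NP is explored and discarded before VP → v NP succeeds.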

  10. Problems of the TD Parser
      - Left-recursive non-terminals: A ⇒⁺ Aβ; danger of an infinite loop
      - Rule selection: "blind" expansion
      - Inefficiency of backtracking: partial analyses are repeated, causing an exponential runtime
      - Advantage: easy to implement
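Since a left-recursive non-terminal (A ⇒⁺ Aβ) can send a top-down parser into an infinite loop, it can help to check a grammar before parsing. The sketch below is an illustration only: `left_recursive` is a hypothetical helper, and it assumes a grammar without ε-productions, so that only the first symbol of each right-hand side needs to be followed. The second and third grammars are made-up examples, not taken from the slides.

```python
from typing import Dict, List, Set


def left_recursive(grammar: Dict[str, List[List[str]]]) -> Set[str]:
    """Return all non-terminals A with A =>+ A β (assumes no ε-productions)."""
    # Direct "left corner" relation: A -> B γ with B a non-terminal.
    first_nt = {
        lhs: {rhs[0] for rhs in rules if rhs and rhs[0] in grammar}
        for lhs, rules in grammar.items()
    }
    result = set()
    for start in grammar:
        # Follow left corners transitively and see whether we come back to `start`.
        seen, stack = set(), list(first_nt[start])
        while stack:
            nt = stack.pop()
            if nt in seen:
                continue
            seen.add(nt)
            stack.extend(first_nt[nt])
        if start in seen:
            result.add(start)
    return result


SLIDE_GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["a", "n"]],
    "VP": [["v", "NP", "NP"], ["v", "NP"], ["v"]],
}

# Hypothetical grammars for illustration.
DIRECT = {"NP": [["NP", "PP"], ["a", "n"]], "PP": [["p", "NP"]]}
INDIRECT = {"A": [["B", "x"]], "B": [["A", "y"], ["z"]]}

if __name__ == "__main__":
    print(left_recursive(SLIDE_GRAMMAR))
    print(left_recursive(DIRECT))
    print(left_recursive(INDIRECT))
```

The grammar from the slides has no left-recursive non-terminal, which is why the plain backtracking parser terminates on it.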

  11. Growth Curves
      [Figure: two plots of x/10, x**2/100, x**3/1000, x**6/1000000 and 2**x/1024 for x from 0 to 50; the y-axis runs from 0 to 50 in the first plot and from 0 to 500000 in the second.]
      2**50 / 1024 seconds ≈ 35000 years
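The figure of roughly 35000 years can be checked with a few lines of arithmetic. This sketch evaluates the five plotted functions at x = 50 and converts the exponential one from seconds to years; the year length of 365.25 days is an assumption made here, and `growth_values` is a hypothetical helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: one year = 365.25 days


def growth_values(x: int) -> dict:
    """Evaluate the growth functions shown in the plots at the point x."""
    return {
        "x/10": x / 10,
        "x**2/100": x**2 / 100,
        "x**3/1000": x**3 / 1000,
        "x**6/1000000": x**6 / 1_000_000,
        "2**x/1024": 2**x / 1024,
    }


values = growth_values(50)
years = values["2**x/1024"] / SECONDS_PER_YEAR

if __name__ == "__main__":
    for name, value in values.items():
        print(f"{name:>14} at x=50: {value:,.0f}")
    print(f"2**50/1024 seconds is about {years:,.0f} years")
```

At x = 50 the polynomial curves are still at most in the tens of thousands, while 2**50/1024 = 2**40 is on the order of 10**12, which is what makes exponential backtracking impractical for longer inputs.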

  12. Overview: Top-Down Parser, Bottom-Up Parser

  13. Bottom-Up Parser
      Idea: Backward application of grammar rules (reductions) produces an inverted right-most derivation.
      Input: a n v a n a n
      Grammar:
        S → NP VP
        NP → a n
        VP → v NP NP
        VP → v NP
        VP → v
      Left-most reduction:
        a n v a n a n
          ⇐ NP v a n a n
          ⇐ NP v NP a n
          ⇐ NP v NP NP
          ⇐ NP VP
          ⇐ S

  14. Tree
      S (5)
        NP (1)
          a
          n
        VP (4)
          v
          NP (2)
            a
            n
          NP (3)
            a
            n

  15. Formal Characterization BU
      - Configuration: a pair (α, r) with α ∈ (V ∪ Σ)*, r ∈ Σ*; pair = <sentential form, remaining string to be recognized>
      - Start configuration: (ε, w)
      - Configuration transitions:
        - (α, ar) ↦ (αa, r) (shift action)
        - (βα, r) ↦ (βA, r) with A → α ∈ P (reduce action)
      - End configuration: (S, ε)
      ⇒ "Shift-Reduce" parser

  16. Configuration Transitions BU
      (ε, a n v a n a n)
      (a, n v a n a n)
      (a n, v a n a n)
      (NP, v a n a n)
      (NP v, a n a n)
      (NP v a, n a n)
      (NP v a n, a n)
      (NP v NP, a n)
      (NP v NP a, n)
      (NP v NP a n, ε)
      (NP v NP NP, ε)
      (NP VP, ε)
      (S, ε)
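The shift and reduce transitions can likewise be simulated with a backtracking search. This is a minimal sketch rather than a definitive implementation: `bu_parse` and `RULES` are hypothetical names, the strategy "try every applicable reduction first, then shift" is an assumption made here, and the grammar is assumed to contain neither ε-productions nor rule cycles, so the search terminates.

```python
from typing import List, Optional, Tuple

Config = Tuple[Tuple[str, ...], Tuple[str, ...]]  # (stack, remaining input)

RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("a", "n")),
    ("VP", ("v", "NP", "NP")),
    ("VP", ("v", "NP")),
    ("VP", ("v",)),
]


def bu_parse(tokens: List[str]) -> Optional[List[Config]]:
    """Return the configurations from (ε, w) to (S, ε), or None if there is no parse."""

    def search(stack, rest, path):
        path = path + [(stack, rest)]  # new list, so failed branches leave no trace
        if stack == ("S",) and not rest:
            return path  # end configuration (S, ε)
        # Reduce: (βα, r) ↦ (βA, r) for every rule A → α matching the top of the stack.
        for lhs, rhs in RULES:
            n = len(rhs)
            if n <= len(stack) and stack[-n:] == rhs:
                result = search(stack[:-n] + (lhs,), rest, path)
                if result is not None:
                    return result
        # Shift: (α, ar) ↦ (αa, r).
        if rest:
            return search(stack + (rest[0],), rest[1:], path)
        return None

    return search((), tuple(tokens), [])


if __name__ == "__main__":
    for stack, rest in bu_parse("a n v a n a n".split()):
        print("(", " ".join(stack) or "ε", ",", " ".join(rest) or "ε", ")")
```

On `a n v a n a n` the search first reduces `v` to VP too early (via VP → v), runs into dead ends, and backtracks before it finds the path listed on the slide. That wasted work is a small instance of the backtracking inefficiency the slides attribute to this parser.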

  17. Problems BU
      - Rule cycles and ε-productions may result in infinite loops.
      - Rule selection: "blind" shift
      - Inefficiency of backtracking
