Representing context-free grammars in Prolog

Implementing context-free grammars
Detmar Meurers: Intro to Computational Linguistics I
OSU, LING 684.01

Representing context-free grammars in Prolog
• Towards a basic setup:
  – What needs to be represented?
  – On the relationship between context-free rules and logical implications
  – A first Prolog encoding
• Encoding the string coverage of a node: From lists to difference lists
• Adding syntactic sugar: Definite clause grammars (DCGs)
• Representing simple English grammars as DCGs

What needs to be represented?
We need representations (data types) for:
− terminals, i.e., words
− syntactic rules
− linguistic properties of terminals and their propagation in rules:
  − syntactic category
  − other properties
    − string covered (“phonology”)
    − case, agreement, . . .
− analysis trees, i.e., syntactic structures

On the relationship between context-free rules and logical implications
• Take the following context-free rewrite rule:
    S → NP VP
• Nonterminals in such a rule can be understood as predicates holding of the lists of terminals dominated by the nonterminal.
• A context-free rule then corresponds to a logical implication:
    ∀X ∀Y ∀Z NP(X) ∧ VP(Y) ∧ append(X, Y, Z) ⇒ S(Z)
• Context-free rules can thus directly be encoded as logic programs.

Components of a direct Prolog encoding
• terminals: unit clauses (facts)
• syntactic rules: non-unit clauses (rules)
• linguistic properties:
  – syntactic category: predicate name
  – other properties: predicate’s arguments, distinguished by position
    ∗ in general: compound terms
    ∗ for strings: list representation
  – analysis trees: compound term as predicate argument

A small example grammar
G = (N, Σ, S, P)
N = {S, NP, VP, V_i, V_t, V_s}
Σ = {a, clown, Mary, laughs, loves, thinks}
S = S
P = {
    S → NP VP
    VP → V_i
    VP → V_t NP
    VP → V_s S
    NP → Det N
    NP → PN
    PN → Mary
    Det → a
    N → clown
    V_i → laughs
    V_t → loves
    V_s → thinks
}
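The implication for S → NP VP can be written down almost literally as a single Prolog clause. The following is a minimal sketch, not taken from the slides: the two facts for np/1 and vp/1 are placeholders assumed here so that the clause can be run on its own.

```prolog
:- use_module(library(lists)).   % provides append/3

% S -> NP VP, read as the implication
%   NP(X) and VP(Y) and append(X, Y, Z)  =>  S(Z)
s(Z) :- np(X), vp(Y), append(X, Y, Z).

% Placeholder facts (assumed for this sketch): the word lists
% that each category covers.
np([mary]).
vp([laughs]).

% Example queries:
%   ?- s([mary, laughs]).   % succeeds
%   ?- s([laughs, mary]).   % fails
%   ?- s(Z).                % Z = [mary, laughs]
```

Each nonterminal becomes a one-place predicate over word lists, and the universally quantified variables of the implication become ordinary Prolog variables.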

An encoding in Prolog
dcg/append encoding1.pl

    s(S) :- np(NP), vp(VP), append(NP,VP,S).

    vp(VP) :- vi(VP).
    vp(VP) :- vt(VT), np(NP), append(VT,NP,VP).
    vp(VP) :- vs(VS), s(S), append(VS,S,VP).

    np(NP) :- pn(NP).
    np(NP) :- det(Det), n(N), append(Det,N,NP).

    pn([mary]).    n([clown]).    det([a]).
    vi([laughs]).  vt([loves]).   vs([thinks]).

A modified encoding
dcg/append encoding2.pl

    s(S) :- append(NP,VP,S), np(NP), vp(VP).

    vp(VP) :- vi(VP).
    vp(VP) :- append(VT,NP,VP), vt(VT), np(NP).
    vp(VP) :- append(VS,S,VP), vs(VS), s(S).

    np(NP) :- pn(NP).
    np(NP) :- append(Det,N,NP), det(Det), n(N).

    pn([mary]).    n([clown]).    det([a]).
    vi([laughs]).  vt([loves]).   vs([thinks]).

Difference list encoding
dcg/diff list encoding.pl

    s(X0,Xn) :- np(X0,X1), vp(X1,Xn).

    vp(X0,Xn) :- vi(X0,Xn).
    vp(X0,Xn) :- vt(X0,X1), np(X1,Xn).
    vp(X0,Xn) :- vs(X0,X1), s(X1,Xn).

    np(X0,Xn) :- pn(X0,Xn).
    np(X0,Xn) :- det(X0,X1), n(X1,Xn).

    pn([mary|X],X).    n([clown|X],X).   det([a|X],X).
    vi([laughs|X],X).  vt([loves|X],X).  vs([thinks|X],X).

Basic DCG notation for encoding CFGs
A DCG rule has the form “LHS --> RHS.” with
• LHS: a Prolog atom encoding a non-terminal, and
• RHS: a comma separated sequence of
  – Prolog atoms encoding non-terminals
  – Prolog lists encoding terminals

When a DCG rule is read in by Prolog, it is expanded by adding the difference list arguments to each predicate.

(Some Prologs also use a special predicate 'C'/3 to encode the coverage of terminals, defined as 'C'([Head|Tail],Head,Tail). )

Examples for some cfg rules in DCG notation
• S → NP VP
    s --> np, vp.
• S → NP thinks S
    s --> np, [thinks], s.
• S → NP picks up NP
    s --> np, [picks, up], np.
• S → NP picks NP up
    s --> np, [picks], np, [up].
• NP → ε
    np --> [].

An example grammar in definite clause notation
dcg/dcg encoding.pl

    s --> np, vp.

    np --> pn.
    np --> det, n.

    vp --> vi.
    vp --> vt, np.
    vp --> vs, s.

    pn --> [mary].    n --> [clown].   det --> [a].
    vi --> [laughs].  vt --> [loves].  vs --> [thinks].
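To try out the DCG version of the example grammar, one would typically go through the built-in phrase/2 and phrase/3 rather than calling the expanded predicates directly. The sketch below repeats the grammar so that it is self-contained. The queries in the comments are illustrations added here, and the direct call to s/2 assumes a Prolog system (such as SWI-Prolog) that expands s//0 into s/2 with the two difference list arguments last.

```prolog
% The example grammar in DCG notation.
s --> np, vp.

np --> pn.
np --> det, n.

vp --> vi.
vp --> vt, np.
vp --> vs, s.

pn --> [mary].
n --> [clown].
det --> [a].
vi --> [laughs].
vt --> [loves].
vs --> [thinks].

% Example queries:
%   ?- phrase(s, [mary, loves, a, clown]).        % succeeds
%   ?- phrase(s, [clown, a, laughs]).             % fails
%
% phrase/3 also returns the words left over after the phrase:
%   ?- phrase(np, [a, clown, laughs], Rest).      % Rest = [laughs]
%
% The expanded predicate takes the difference list directly:
%   ?- s([mary, thinks, a, clown, laughs], []).   % succeeds
```

The last query makes the difference list reading explicit: the sentence is the difference between the full word list and the empty remainder.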

The example expanded by Prolog

    ?- listing.

    s(A, B) :-
        np(A, C),
        vp(C, B).

    np(A, B) :-
        pn(A, B).
    np(A, B) :-
        det(A, C),
        n(C, B).

    vp(A, B) :-
        vi(A, B).
    vp(A, B) :-
        vt(A, C),
        np(C, B).
    vp(A, B) :-
        vs(A, C),
        s(C, B).

    pn([mary|A], A).
    n([clown|A], A).
    det([a|A], A).
    vi([laughs|A], A).
    vt([loves|A], A).
    vs([thinks|A], A).

More complex terms in DCGs
Non-terminals can be any Prolog term, e.g.:

    s --> np(Per,Num),
          vp(Per,Num).

This is translated by Prolog to

    s(A, B) :-
        np(C, D, A, E),
        vp(C, D, E, B).

Restriction:
• The LHS has to be a non-variable, single term (plus possibly a sequence of terminals).

Using compound terms to store an analysis tree
dcg/dcg tree.pl

    s(s_node(NP,VP)) --> np(NP), vp(VP).

    np(np_node(PN)) --> pn(PN).
    np(np_node(Det,N)) --> det(Det), n(N).

    vp(vp_node(VI)) --> vi(VI).
    vp(vp_node(VT,NP)) --> vt(VT), np(NP).
    vp(vp_node(VS,S)) --> vs(VS), s(S).

    pn(mary_node) --> [mary].
    n(clown_node) --> [clown].
    det(a_node) --> [a].
    vi(laugh_node) --> [laughs].
    vt(love_node) --> [loves].
    vs(think_node) --> [thinks].

Adding more linguistic properties
dcg/dcg linguistic.pl

    s --> np(Per,Num), vp(Per,Num).

    vp(Per,Num) --> vi(Per,Num).
    vp(Per,Num) --> vt(Per,Num), np(_,_).
    vp(Per,Num) --> vs(Per,Num), s.

    np(3,sg) --> pn.
    np(3,Num) --> det(Num), n(Num).

    pn --> [mary].
    det(sg) --> [a].        det(_) --> [the].
    n(sg) --> [clown].      n(pl) --> [clowns].
    vi(3,sg) --> [laughs].  vi(_,pl) --> [laugh].
    vt(3,sg) --> [loves].   vt(_,pl) --> [love].
    vs(3,sg) --> [thinks].  vs(_,pl) --> [think].

Additional notation for the RHS of DCGs (I)
The RHS can include
• disjunctions expressed by the “ ; ” operator, e.g.:

    vp --> vintr;
           vtrans, np.

• groupings expressed using parentheses “ ( ) ”, e.g.:

    vp --> v, (pp_of; pp_at).

Additional notation for the RHS of DCGs (II)
The RHS can include
• extra conditions expressed as Prolog relation calls inside “ { } ”:

    s --> np(Case), vp, {check_case(Case)}.

    s --> {write('rule 1'),nl},
          np, {write('after np'),nl},
          vp, {write('after vp'),nl}.

• the cut “ ! ” (can occur without enclosing “ {} ”).
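The disjunction, grouping, and “ { } ” notations can be combined in one small grammar. The following is a sketch under stated assumptions: the lexical entries (she, her, the two prepositional phrases) and the definition of check_case/1 are hypothetical and chosen only to make the rule shapes from the slides runnable.

```prolog
% An extra condition in { } is an ordinary Prolog goal; it does not
% consume any words from the input.
s --> np(Case), vp, { check_case(Case) }.

% Hypothetical lexicon with case marking (assumed for this sketch).
np(nom) --> [she].
np(acc) --> [her].

% Grouping with a disjunction: the verb is followed by one of two PPs.
vp --> v, ( pp_of ; pp_at ).

v --> [thinks].
v --> [laughs].

pp_of --> [of, mary].
pp_at --> [at, mary].

% Assumed definition: only nominative subjects are accepted.
check_case(nom).

% Example queries:
%   ?- phrase(s, [she, thinks, of, mary]).   % succeeds
%   ?- phrase(s, [she, laughs, at, mary]).   % succeeds
%   ?- phrase(s, [her, laughs, at, mary]).   % fails in check_case/1
```

Here the case check runs only after np and vp have been parsed. Moving the braces directly after np(Case) would reject a wrong subject earlier, which is a design choice rather than a requirement of the notation.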

Additional notation for the RHS of DCGs: Meta-variables
On the RHS, variables can be used for non-terminals and terminals, i.e. as meta-variables. E.g.:

    verb([up]) --> [pick].

    vp --> verb(Particle),   % pick
           np,               % the ball
           Particle.         % up

Note: The value of the variable has to be known at the time Prolog attempts to prove the subgoal represented by the variable.
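The meta-variable example can be made runnable by adding a rule for np. The sketch below assumes a single hypothetical noun phrase, “the ball”, taken from the comments in the example above. By the time the subgoal Particle is reached, verb(Particle) has already bound it to the terminal list [up].

```prolog
% The particle is stored in the lexical entry of the verb.
verb([up]) --> [pick].

% Hypothetical noun phrase (assumed for this sketch).
np --> [the, ball].

% Particle is a meta-variable: it is bound by verb(Particle) before
% it is used as a subgoal at the end of the rule.
vp --> verb(Particle),   % pick
       np,               % the ball
       Particle.         % up

% Example queries:
%   ?- phrase(vp, [pick, the, ball, up]).   % succeeds
%   ?- phrase(vp, [pick, up, the, ball]).   % fails
```

If the meta-variable came before the subgoal that binds it, the variable would still be unbound when Prolog tries to prove it, and most systems would be expected to raise an instantiation error at that point.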
