Machine Translation – Classical and Statistical Approaches
Session 2: Syntactic Transfer

Jonas Kuhn
Universität des Saarlandes, Saarbrücken
The University of Texas at Austin
jonask@coli.uni-sb.de

DGfS/CL Fall School 2005, Ruhr-Universität Bochum, September 19-30, 2005


Session 2: Syntactic Transfer
• Syntactic Transfer
  • Steps: Analysis, Transfer, Generation
  • How are the various types of divergence dealt with?
• For lab exercise: Quick Prolog Intro/Recap
  • Basic Prolog Terminology and Syntax
  • Lists and Definite Clause Grammars (DCGs)


Syntactic Transfer
[Diagram: String, Syntactic Structure, Syntactic Structure, String]
• Source-language syntactic analysis: construct SL analysis tree
• Transfer: Tree-to-tree transformations applied recursively to SL tree: construct TL tree
• (No syntactic generation required in TL)
• Morphological generation


Syntactic Transfer: Resources
• Translation process is governed by three sets of rules
  • Standard grammar specification for source language analysis (e.g., context-free grammars)
  • Transfer "grammar": Transformation rules
    • Include translation variables (e.g., tv(X) in Trujillo's tree notation)
    • Set of transformation rules will be applied recursively to each occurrence of translation variables
    • Recursive, non-deterministic top-down process
  • Standard grammar specification for target language generation
    • Consolidation: Applying TL grammar constraints to the TL structure to enforce grammaticality (and fill in underspecified values)


Transfer grammar
    [NP tv(X) tv(Y)]              ⇔  [NP tv(X) tv(Y)]
    [N1 [Adj tv(A)] [N tv(B)]]    ⇔  [N1 [N tv(B)] [Adj tv(A)]]
    [Det a]                       ⇔  [Det una]
    [Adj delicious]               ⇔  [Adj deliciosa]
    [N soup]                      ⇔  [N sopa]


Example: Tree-to-tree transformation
English grammar
    NP  → Det N1
    N1  → Adj N
    Det → a
    Adj → delicious
    N   → soup
Spanish grammar
    NP  → Det N1
    N1  → N Adj
    Det → una
    Adj → deliciosa
    N   → sopa
Trees
    [NP [Det a] [N1 [Adj delicious] [N soup]]]
    [NP [Det una] [N1 [N sopa] [Adj deliciosa]]]


Transformations: Prolog notation
• Actual Prolog code by Trujillo (slightly different structural analysis than in the textbook)
• We will come back to the details of this notation…

    [np|_]/_ dtrs [ DetE, N1E ] <==>
    [np|_]/_ dtrs [ DetS, N1S ] :-
        DetE <==> DetS,
        N1E <==> N1S.

    [n1|_]/_ dtrs [ [ap|_]/_ dtrs [ AdjvE ], [n1|_]/_ dtrs [ NE ]] <==>
    [n1|_]/_ dtrs [ [n1|_]/_ dtrs [ NS ], [ap|_]/_ dtrs [ AdjvS ]] :-
        AdjvE <==> AdjvS,
        NE <==> NS.

    [n|_]/soup <==> [n|_]/sopa.
    [adjv|_]/delicious <==> [adjv|_]/deliciosa.
    [det|_]/a <==> [det|_]/una.


Divergences in syntactic transfer
• Thematic divergence
  • En: You like her
  • Sp: Ella te gusta
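
The tree-to-tree transformation for "a delicious soup" can also be sketched in plain Prolog, without Trujillo's dtrs and <==> operators. This is a minimal illustration under stated assumptions, not his implementation: the tree encoding (np/2, n1/2, det/1, adj/1, n/1) and the predicate names transfer/2 and yield/2 are hypothetical. The recursive calls play the role of the translation variables tv(...).

```prolog
% Simplified tree-to-tree transfer (illustrative; NOT Trujillo's notation).
% Trees are plain terms: np(Det,N1), n1(Left,Right), det(W), adj(W), n(W).

% Structural rules: each recursive call corresponds to one tv(...) variable.
transfer(np(DetE, N1E), np(DetS, N1S)) :-
    transfer(DetE, DetS),
    transfer(N1E, N1S).
transfer(n1(AdjE, NE), n1(NS, AdjS)) :-    % En: Adj N  <=>  Sp: N Adj
    transfer(AdjE, AdjS),
    transfer(NE, NS).

% Lexical rules.
transfer(det(a), det(una)).
transfer(adj(delicious), adj(deliciosa)).
transfer(n(soup), n(sopa)).

% yield(+Tree, -Words): read the words off the leaves, left to right.
yield(Tree, [Word]) :-
    Tree =.. [_, Word],
    atom(Word).
yield(Tree, Words) :-
    Tree =.. [_, Left, Right],
    yield(Left, LeftWords),
    yield(Right, RightWords),
    append(LeftWords, RightWords, Words).
```

Example query:

    ?- transfer(np(det(a), n1(adj(delicious), n(soup))), T), yield(T, Ws).
    T = np(det(una), n1(n(sopa), adj(deliciosa))),
    Ws = [una, sopa, deliciosa]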


Divergences in syntactic transfer
• Head switching
  • En: The baby just ate
  • Sp: El bebé acaba de comer
• Structural
  • En: Luisa entered the house
  • Sp: Luisa entró a la casa
• Categorial
  • En: a little bread
  • Sp: un poco de pan
• Lexical gaps (conflational divergence)
  • En: Camillo got up early
  • Sp: Camillo madrugó


Divergences in syntactic transfer
• Lexicalization (lexical divergence)
  • En: Susan swam across the channel
  • Sp: Susan cruzó el canal nadando
• Collocational
  • En: Jan made a decision
  • Sp: Jan tomó/*hizo una decisión
• Idiomatic
  • En: Socrates kicked the bucket
  • Sp: Socrates estiró la pata


Quick Prolog Intro/Recap
• Compare: Blackburn, Bos & Striegnitz: Learn Prolog Now! [www.coli.uni-sb.de/~kris/learn-prolog-now/]
• Freely available (open-source) Prolog system: SWI-Prolog
  • Developed since 1987 at the University of Amsterdam, The Netherlands
  • http://www.swi-prolog.org/
  • Available for MS-Windows, Mac, and Linux
• Logic programming, i.e., a Prolog program is (mostly) not a sequence of commands, but a set of facts and rules used to prove or refute new facts


Interpreter and knowledge base
• How we communicate with the system
  • Knowledge base (file we can edit)
        woman(mia).
        playsAirGuitar(jody).
  • Interpreter (shell in which we can type queries)
        ?-
        ?- woman(mia).
        Yes
• In order to use a knowledge base, we have to load or consult one
        ?- ['my-knowledge-base-file.pl'].
  • With SWI running under MS Windows, the File menu can be used
• To quit the interpreter at the end of your session type
        ?- halt.


Terminology
• Knowledge base:
  • Facts
  • Rules: inference rules to derive new facts from given facts
        listensToMusic(yolanda) :- happy(yolanda).
    Read: "If Yolanda is happy, then she listens to music."
  • Facts and rules define predicates
    Examples: happy, listensToMusic
• Interpreter:
  • Query: clause for which we ask: is there a proof from the knowledge base?


Prolog rules
• A predicate definition may consist of several clauses
  • Disjunctive interpretation
        playsAirGuitar(butch) :-
            happy(butch).
        playsAirGuitar(butch) :-
            listensToMusic(butch).
• Each clause ends in a period
• The condition part (right-hand side) of a rule may contain several terms
  • Conjunctive interpretation
        playsAirGuitar(vincent) :-
            listensToMusic(vincent),
            happy(vincent).
• The consequence part (left-hand side) may only contain one term


Variables
• Capitalized identifiers are interpreted as variables (undergoing unification)
        woman(mia).
        woman(jody).
        woman(yolanda).
        loves(vincent,mia).
        loves(marcellus,mia).
        loves(pumpkin,honey_bunny).
        loves(honey_bunny,pumpkin).

        ?- woman(X).
        X=mia ;
        X=jody
  • Hitting semicolon tells Prolog to find alternative solutions
  • Backtracking
        jealous(X,Y) :-
            loves(X,Z), loves(Y,Z).

        ?- jealous(marcellus,W).
        W=vincent
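
The knowledge base from the Variables slide can be saved to a file and consulted as is. The sketch below repeats it and adds one predicate, jealous_of_other/2, which is a hypothetical addition not found in the slides. It is there because backtracking into jealous(marcellus,W) yields a second answer, W=marcellus, since Marcellus loves the same person as himself.

```prolog
% Facts from the slides.
woman(mia).
woman(jody).
woman(yolanda).

loves(vincent, mia).
loves(marcellus, mia).
loves(pumpkin, honey_bunny).
loves(honey_bunny, pumpkin).

% Rule from the slides: X is jealous of Y if both love the same Z.
jealous(X, Y) :-
    loves(X, Z),
    loves(Y, Z).

% Hypothetical variant (not in the slides): rule out X being jealous of X.
jealous_of_other(X, Y) :-
    jealous(X, Y),
    X \== Y.
```

Collecting all solutions with the built-in findall/3 shows what repeated semicolons would enumerate:

    ?- findall(W, jealous(marcellus, W), Ws).
    Ws = [vincent, marcellus]

    ?- findall(W, jealous_of_other(marcellus, W), Ws).
    Ws = [vincent]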


Variables
• The match predicate "=" can be used to state that two things are the same
        jealous(X,Y) :-
            loves(X,U), loves(Y,V), U=V.
• Normally, variables are simply re-used in predicate definitions in order to express that two argument positions have to be the same
• When a variable is used just once, this is often due to a typo
  • Prolog will issue a warning for variables used only once in a clause
  • To suppress the warning, a leading underscore can be used
        in_love(X) :-
            loves(X,_Someone).


Variables
• Special variable: _ (the "anonymous" variable)
  • Can match any arbitrary value, even if used several times in the same clause!
        in_love(X) :-
            loves(X,_).


Prolog survival guide
• Clauses (facts/rules/queries) end in a period
• Uppercase identifiers are variables; functors/atoms have to start with a lowercase letter!
• Prolog variables are logical variables tied to a particular value within the scope of a clause (unlike variables in other programming languages, where values of variables can be changed)
• Don't forget consulting your knowledge base (and re-consulting after making changes)
• To exit the Prolog interpreter type
        ?- halt.
  (and don't forget the period!)


Prolog lists
• Important data structure for linguistic tasks
• List elements can be enumerated within brackets
        [fred, ann, pete]
• Special case: the empty list: []
• For flexible access to list elements, Prolog has a built-in operator for decomposing lists into head and tail: the "|" operator
        ?- [ X | Y ] = [fred, ann, pete].
        X = fred
        Y = [ann, pete]
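
The "|" operator and the anonymous variable are often used together: the parts of a list that do not matter are matched with _. A minimal sketch follows; the predicate names first/2 and second/2 are made up for this illustration.

```prolog
% first(+List, -Head): Head is the first element; the tail is ignored.
first([Head|_], Head).

% second(+List, -X): X is the second element; more than one element
% may be listed in front of the "|".
second([_, X|_], X).
```

Example queries:

    ?- first([fred, ann, pete], X).
    X = fred

    ?- second([fred, ann, pete], X).
    X = ann

    ?- [X|Y] = [fred].
    X = fred
    Y = []

The empty list has no head and tail, so a query such as ?- [X|Y] = []. fails.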


Prolog lists
• Lists are typically manipulated in recursive predicates
        trans(eins,one).
        trans(zwei,two).
        trans(drei,three).

        trans_list([],[]).
        trans_list([H|T],[H1|T1]) :-
            trans(H,H1),
            trans_list(T,T1).
• Example application:
        ?- trans_list([zwei,eins,drei],X).
        X = [two,one,three]


Built-in list predicates
• Some important, generic list predicates are predefined in most Prolog versions ("built-in")
• member/2
  • member(X,L) is true if and only if X is an element of the list L
  • Examples: member(b,[a,b,c]), member([2,3],[1,[2,3]])
• append/3
  • append(L1,L2,L3) is true if and only if L3 is the concatenation of lists L1 and L2
  • Examples: append([a],[b,c],[a,b,c]), append([],[1,2],[1,2])
• reverse/2
  • reverse(L1,L2) is true if and only if L1 is the reversed version of list L2
  • Examples: reverse([a,b,c],[c,b,a])
• length/2
  • length(L,N) is true if and only if the integer N is the length (number of elements) of list L
  • Examples: length([a,b,c],3), length([],0)


Definite Clause Grammars (DCGs)
• Simple built-in grammar formalism
• Rewrite rules for (augmented) context-free grammars
        s --> np, vp.
        np --> det, n.
        vp --> v, np.
        vp --> v.
        det --> [the].
        n --> [dog].
        v --> [barks].


Definite Clause Grammars (DCGs)
• Internally, the rewrite rule notation is compiled out as follows (using a "difference list notation" for phrase coverage):
        s(X,Z) :- np(X,Y), vp(Y,Z).
        np(X,Z) :- det(X,Y), n(Y,Z).
        vp(X,Z) :- v(X,Y), np(Y,Z).
        vp(X,Z) :- v(X,Z).
        det([the|T],T).
        n([dog|T],T).
        v([barks|T],T).
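
A DCG is normally queried through the built-in phrase/2, which supplies the two difference-list arguments shown above. The sketch below repeats the grammar from the slides and adds an augmented variant in which each nonterminal has one extra argument that builds the analysis tree. The augmented rules and the tree terms (s/2, np/2, vp/1, vp/2, ...) are assumptions made for this illustration; they are not part of the slides.

```prolog
% Plain DCG from the slides.
s   --> np, vp.
np  --> det, n.
vp  --> v, np.
vp  --> v.
det --> [the].
n   --> [dog].
v   --> [barks].

% Augmented variant (illustrative, not from the slides): the extra
% argument of each nonterminal collects the syntactic analysis tree.
s(s(NP, VP))   --> np(NP), vp(VP).
np(np(Det, N)) --> det(Det), n(N).
vp(vp(V, NP))  --> v(V), np(NP).
vp(vp(V))      --> v(V).
det(det(the))  --> [the].
n(n(dog))      --> [dog].
v(v(barks))    --> [barks].
```

Example queries:

    ?- phrase(s, [the, dog, barks]).
    Yes

    ?- phrase(s(Tree), [the, dog, barks]).
    Tree = s(np(det(the), n(dog)), vp(v(barks)))

The first query is equivalent to calling the compiled predicate directly as ?- s([the, dog, barks], []). A tree built this way is the kind of SL analysis that a set of transfer rules could then take as input.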