
Machine Translation – Classical and Statistical Approaches, Session 4: Interlingua-based MT



  1. Machine Translation – Classical and Statistical Approaches
     Session 4: Interlingua-based MT
     Jonas Kuhn
     Universität des Saarlandes, Saarbrücken
     The University of Texas at Austin
     jonask@coli.uni-sb.de
     DGfS/CL Fall School 2005, Ruhr-Universität Bochum, September 19-30, 2005

     Session 4: Interlingua-based MT
     - Dorr (1992, 1994): UNITRAN system
     - Classification of divergences
     - Lexical Conceptual Structure
     - Translation mappings between syntactic structure and LCS representations
     - Language-specific exceptions to translation mappings

     UNITRAN
     - Translation between Spanish, English and German (bidirectionally)

     Translation divergences
     (1) Thematic divergence:
         E: I like Mary ⇔ S: María me gusta a mí 'Mary pleases me'
     (2) Promotional divergence:
         E: John usually goes home ⇔ S: Juan suele ir a casa 'John tends to go home'
     (3) Demotional divergence:
         E: I like eating ⇔ G: Ich esse gern 'I eat likingly'
     (4) Structural divergence:
         E: John entered the house ⇔ S: Juan entró en la casa 'John entered in the house'

  2. Translation divergences
     (5) Conflational divergence:
         E: I stabbed John ⇔ S: Yo le di puñaladas a Juan 'I gave knife-wounds to John'
     (6) Categorial divergence:
         E: I am hungry ⇔ G: Ich habe Hunger 'I have hunger'
     (7) Lexical divergence:
         E: John broke into the room ⇔ S: Juan forzó la entrada al cuarto 'John forced (the) entry to the room'

     Lexical Conceptual Structure
     - Following Jackendoff (1983, 1990)
     - Example:
       - English: Bill went into the house
       - Spanish: Bill entró a la casa.
       - LCS: GO(BILL,TO(IN(HOUSE)))

     LCS – Definitions
     Definition 1 (Dorr 1994)
     A lexical conceptual structure (LCS) is a modified version of the representation proposed by Jackendoff (1983, 1990) that conforms to the following structural form:
     [structural form shown as a figure on the slide; not reproduced here]
     This corresponds to the tree-like representation shown in Figure 2, in which
     (1) X' is the logical head;
     (2) W' is the logical subject;
     (3) Z'1 ... Z'n are the logical arguments; and
     (4) Q'1 ... Q'n are the logical modifiers.
     In addition, T(φ) is the logical type (Event, State, Path, Position, etc.) corresponding to the primitive φ (CAUSE, LET, GO, STAY, BE, etc.). Primitives are further categorized into fields (e.g., Possessional, Identificational, Temporal, Locational, etc.).
     [Figure 2: tree-like representation; not reproduced here]

     LCS – Definitions
     Example 1
     - John went happily to school
     - [Event GO_Loc ([Thing JOHN], [Path TO_Loc ([Position AT_Loc ([Thing JOHN], [Location SCHOOL])])], [Manner HAPPILY])]
     - Logical head: GO_Loc
     - Logical subject: [Thing JOHN]
     - Logical argument: [Path TO_Loc (...)]
     - Logical modifier: [Manner HAPPILY]
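The structure in Definition 1 and Example 1 above can be written down as a small recursive data type. The following is a minimal sketch only: the class name `LCS`, its attribute names, and the `render` method are hypothetical choices made for illustration, not part of UNITRAN or of Dorr's notation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LCS:
    """Illustrative LCS node: logical type, primitive (logical head), optional field."""
    ltype: str                         # logical type T(phi): Event, State, Path, ...
    primitive: str                     # primitive phi: GO, TO, AT, or a constant like JOHN
    fld: Optional[str] = None          # field: Loc, Temp, Poss, ...
    subject: Optional["LCS"] = None    # logical subject W'
    args: List["LCS"] = field(default_factory=list)   # logical arguments Z'1..Z'n
    mods: List["LCS"] = field(default_factory=list)   # logical modifiers Q'1..Q'n

    def render(self) -> str:
        """Print the node in the bracket notation used on the slides."""
        head = self.primitive + (f"_{self.fld}" if self.fld else "")
        children = ([self.subject] if self.subject else []) + self.args + self.mods
        if not children:
            return f"[{self.ltype} {head}]"
        inner = ", ".join(child.render() for child in children)
        return f"[{self.ltype} {head} ({inner})]"


# Example 1: "John went happily to school"
john = LCS("Thing", "JOHN")
at_school = LCS("Position", "AT", "Loc", subject=john, args=[LCS("Location", "SCHOOL")])
to_school = LCS("Path", "TO", "Loc", args=[at_school])
EXAMPLE_1 = LCS("Event", "GO", "Loc", subject=john, args=[to_school],
                mods=[LCS("Manner", "HAPPILY")])

print(EXAMPLE_1.render())
```

Keeping subject, arguments, and modifiers in separate attributes mirrors the four positions named in Definition 1, which is what the translation mappings later refer to.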

  3. LCS – Definitions
     - Types and primitives:
       [table shown as a figure on the slide; not reproduced here]

     LCS – Definitions
     - Primitives must adhere to constraints on argument structure
       - Spatial dimension
       - Causal dimension
       [tables shown as figures on the slide; not reproduced here]

     LCS – Definitions
     - Field dimension (specialization of a primitive stating under which domain it is interpreted, e.g., GO_Loc vs. GO_Temp)
       [table shown as a figure on the slide; not reproduced here]
     Footnote 14: Technically the second argument for each of these fields is a Path or a Position. For the purposes of the current description the column under "Argument 2" refers to the lowest leaf node embedded inside of the second argument.

     LCS – Definitions
     - LCS representation in the lexicon and as the interlingua representation
     Definition 2 (Dorr 1994)
     A RLCS (i.e., a root LCS) is an uninstantiated LCS that is associated with a word definition in the lexicon (i.e., a LCS with unfilled variable positions).
     Definition 3 (Dorr 1994)
     A CLCS (i.e., a composed LCS) is an instantiated LCS that is the result of combining two or more RLCSs by means of unification (roughly). This is the interlingua, or language-independent, form that serves as the pivot between the source and target languages.

  4. LCS – Definitions
     Examples of RLCSs and CLCSs:
     - RLCS associated with the word go:
       [Event GO_Loc ([Thing X], [Path TO_Loc ([Position AT_Loc ([Thing X], [Location Z])])])]
     - CLCS: composition of the RLCSs for go, John, school, and happily leads to the LCS seen previously (using a concept of "unification")

     Composition of LCSs
     - The notion of "unification" differs from standard unification
       - Not directly invertible
       - More "relaxed" notion (for words associated with special parameters like :INT, :EXT, :PROMOTE etc.)

     Composition of LCSs
     - Composition is based on a syntactic parse (following the GB framework, i.e., Government-and-Binding theory)
     Definition 4 (Dorr 1994)
     A syntactic phrase is a maximal projection that conforms to the following structural form:
     [structural form shown as a figure on the slide; labels: Syntactic Adjuncts, External Argument, Syntactic Head, Internal Arguments, Syntactic Adjuncts]

     Composition of LCSs
     - Example: John went happily to school
       [parse tree shown as a figure on the slide; labels: External Argument, Syntactic Head, Syntactic Adjunct, Internal Argument]
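To make the step from RLCS to CLCS concrete, here is a minimal sketch that fills the open positions X and Z of the RLCS for go and then attaches a modifier. It assumes plain, type-checked variable substitution, which is simpler than the "relaxed" unification the slides describe; the tuple encoding, the `?` prefix for variables, and the function names are all hypothetical.

```python
# An LCS node is a tuple: (logical type, primitive, *children).
# A primitive that starts with "?" is an unfilled variable position (assumed encoding).
GO_RLCS = ("Event", "GO_Loc",
           ("Thing", "?X"),
           ("Path", "TO_Loc",
            ("Position", "AT_Loc", ("Thing", "?X"), ("Location", "?Z"))))


def instantiate(node, bindings):
    """Replace variables with their fillers, checking that the logical types agree."""
    ltype, prim, *children = node
    if prim.startswith("?"):
        filler = bindings.get(prim[1:])
        if filler is None:
            return node                      # leave the position open
        if filler[0] != ltype:
            raise TypeError(f"{filler[0]} cannot fill a {ltype} position")
        return filler
    return (ltype, prim, *[instantiate(child, bindings) for child in children])


def add_modifier(node, modifier):
    """Attach a logical modifier as the last child of the node."""
    return node + (modifier,)


def has_open_variables(node):
    """True if any position in the LCS is still an unfilled variable."""
    _, prim, *children = node
    return prim.startswith("?") or any(has_open_variables(c) for c in children)


# Compose the entries for go, John, school, and happily.
CLCS = add_modifier(
    instantiate(GO_RLCS, {"X": ("Thing", "JOHN"), "Z": ("Location", "SCHOOL")}),
    ("Manner", "HAPPILY"),
)
print(CLCS)
```

Both occurrences of X receive the same filler, which is why JOHN appears twice in the composed structure.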

  5. The translation mappings
     - Generalized linking routine (GLR)
     - Canonical syntactic realization (CSR)

     The translation mappings
     - Generalized linking routine (GLR)
     - Simplified schema:
       X': Logical Head      ⇔  X: Syntactic Head
       W': Logical Subject   ⇔  W: External Argument
       Z': Logical Argument  ⇔  Z: Internal Argument
       Q': Logical Modifier  ⇔  Q: Syntactic Adjunct

     The translation mappings
     - Canonical syntactic realization (CSR)
     - Example
       [shown as a figure on the slide; not reproduced here]
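The simplified GLR schema above is a one-to-one correspondence between LCS positions and syntactic positions, so it can be read in both directions. The sketch below encodes it as a lookup table; the dictionary keys, the helper `relabel`, and the use of surface words as fillers are illustrative assumptions.

```python
# Default GLR correspondences from the simplified schema (LCS position -> syntactic position).
GLR = {
    "logical_head": "syntactic_head",
    "logical_subject": "external_argument",
    "logical_argument": "internal_argument",
    "logical_modifier": "syntactic_adjunct",
}
# The same table read in the other direction (syntactic position -> LCS position).
GLR_INVERSE = {syntactic: logical for logical, syntactic in GLR.items()}


def relabel(fillers, table):
    """Move each filler to the position that `table` assigns to its current position."""
    return {table[position]: filler for position, filler in fillers.items()}


# "John went happily to school", with words standing in for the LCS constituents.
lcs_side = {
    "logical_head": "went",
    "logical_subject": "John",
    "logical_argument": "to school",
    "logical_modifier": "happily",
}
syntax_side = relabel(lcs_side, GLR)
print(syntax_side)
```

Applying `GLR` and then `GLR_INVERSE` returns the original assignment, which is the sense in which the default mapping is usable for both analysis and generation.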

  6. Addressing the Divergence The Divergence Problem Problem � Parameters for encoding language-specific � There can be (language-specific) exceptions information to the GLR and/or the CSR � GLR, CSR: language independent � Translation divergences occur when such � Parameters: language-specific information about lexical items exceptions occur in one language, but not in � Seven parameters: the other � :INT � :EXT � :PROMOTE � Formal classification of lexical-semantic � :DEMOTE divergences � * � :CAT � :CONFLATED Jonas Kuhn: MT 21 Jonas Kuhn: MT 22 Thematic Divergence Thematic Divergence E: I like Mary �� S: Maria me gusta a mi 'Mary pleases me' � Arises only where there is a logical subject � Encoded with the :INT and :EXT parameters Jonas Kuhn: MT 23 Jonas Kuhn: MT 24

  7. Thematic Divergence
     [figure not reproduced here; caption: the translation mapping for English relies on GLR defaults]

     Parameter markings
     - Parameter markers such as :INT and :EXT show up only in the RLCS (for lexicon entries)
     - The CLCS does not include such markers; it is a language-independent representation

     Promotional Divergence
     - E: John usually goes home ⇔ S: Juan suele ir a casa 'John tends to go home'
       [figure not reproduced here; labels: Logical Modifier, Logical Head, Logical Argument]

  8. Promotional Divergence
     [figure not reproduced here]

     Demotional Divergence
     E: I like eating ⇔ G: Ich esse gern 'I eat likingly'
     - :DEMOTE parameter:
       - logical head and logical argument swap places

  9. Divergence Types
     - The difference between promotional and demotional divergences
       - In promotional divergences (e.g., soler-usually), the verb (soler) triggers the head switching, no matter what event is substituted as its argument
       - In demotional divergences (e.g., like-gern), the adverbial satellite (gern) is the trigger

     Structural Divergence
     E: John entered the house ⇔ S: Juan entró en la casa 'John entered in the house'
     - In structural divergence it is not the positions in the GLR mapping that are altered, but the nature of the relation between the different positions

     Conflational Divergence
     E: I stabbed John ⇔ S: Yo le di puñaladas a Juan 'I gave knife-wounds to John'
     [figure not reproduced here; label: Logical Argument, suppressed in English]
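The contrast between the two head-switching cases above comes down to which word carries the parameter. The sketch below only decides which element ends up as the syntactic head; it assumes :PROMOTE sits on the word realizing the logical modifier (soler) and :DEMOTE on the word realizing the logical head (gern). The function name and the dictionaries are hypothetical.

```python
def choose_syntactic_head(fillers, params):
    """Pick the word that surfaces as syntactic head.

    fillers: word realizing each logical position.
    params:  word -> parameter from that word's (hypothetical) lexicon entry.
    """
    head = fillers["logical_head"]
    modifier = fillers.get("logical_modifier")
    if modifier is not None and params.get(modifier) == ":PROMOTE":
        return modifier                      # promotional: the modifier takes over as head
    if params.get(head) == ":DEMOTE":
        return fillers["logical_argument"]   # demotional: the argument takes over as head
    return head                              # GLR default


# Promotional: "John usually goes home" vs. "Juan suele ir a casa"
english_go = choose_syntactic_head(
    {"logical_head": "go", "logical_argument": "home", "logical_modifier": "usually"}, {})
spanish_soler = choose_syntactic_head(
    {"logical_head": "ir", "logical_argument": "a casa", "logical_modifier": "soler"},
    {"soler": ":PROMOTE"})

# Demotional: "I like eating" vs. "Ich esse gern"
english_like = choose_syntactic_head(
    {"logical_head": "like", "logical_argument": "eating"}, {})
german_gern = choose_syntactic_head(
    {"logical_head": "gern", "logical_argument": "essen"}, {"gern": ":DEMOTE"})

print(english_go, spanish_soler, english_like, german_gern)
```

Where the displaced element is realized (for example as an internal argument or an adjunct) is deliberately left out of this sketch.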

  10. Conflational Divergence
      [figure not reproduced here; label: not realized syntactically]

      Divergence Types
      (1) Thematic divergence
      (2) Promotional divergence
      (3) Demotional divergence
      (4) Structural divergence
      (5) Conflational divergence
      (6) Categorial divergence
      (7) Lexical divergence
      - (1)-(5): default operation of the GLR is changed
      - (6): default operation of the CSR is changed

      Categorial Divergence
      E: I am hungry ⇔ G: Ich habe Hunger 'I have hunger'
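For the conflational case, where a logical argument is "not realized syntactically", a minimal sketch is to let a verb's lexicon entry list the concepts it absorbs and skip those when realizing arguments. The participant list, the lexicon layout, and the function `overt_arguments` are hypothetical simplifications of the stab/dar example, and the :CONFLATED marking is reduced here to a set of concept names.

```python
# Hypothetical, flattened participants of the shared representation for
# "I stabbed John" / "Yo le di puñaladas a Juan".
PARTICIPANTS = ["I", "KNIFE-WOUND", "JOHN"]

# Hypothetical lexicon: concepts each verb marks as :CONFLATED (absorbed into the verb).
CONFLATED = {
    "stab": {"KNIFE-WOUND"},   # English: the knife-wound is part of the verb meaning
    "dar": set(),              # Spanish: nothing absorbed, "puñaladas" is spelled out
}


def overt_arguments(verb, participants):
    """Return the participants that still need their own syntactic realization."""
    return [p for p in participants if p not in CONFLATED[verb]]


english_args = overt_arguments("stab", PARTICIPANTS)
spanish_args = overt_arguments("dar", PARTICIPANTS)
print(english_args)
print(spanish_args)
```

The shared representation is identical for both languages; only the lexicon entries differ, which is the general design the parameter approach relies on.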

  11. Categorial Divergence
      [figure not reproduced here]

      Lexical Divergence
      - Arises only in the context of other divergence types
      - The choice of lexical items in any language relies on the realization and composition properties of those items
      - Since the various other divergences alter these properties, lexical divergence is viewed as a side effect of other divergences
      - No specific override markers used

      Lexical Divergence
      - E: John broke into the room ⇔ S: Juan forzó la entrada al cuarto 'John forced (the) entry to the room'
      - "break into" subsumes two concepts
      - Conflational divergence forces the occurrence of a lexical divergence
