Chunk-based Verb Reordering in VSO Sentences for Arabic-English SMT

  1. Chunk-based Verb Reordering in VSO Sentences for Arabic-English SMT
 Arianna Bisazza, Marcello Federico
 FBK-irst, Trento, Italy
 WMT 2010, Uppsala, 15-16 July 2010

  2. Introduction
 ● English word order: Subject-Verb-Object
 ● Arabic: both SVO and VSO
 ● Common errors in phrase-based SMT outputs:
 − wrong order of syntactic constituents
 − verbless sentences
 WMT 2010, Uppsala A. Bisazza, M. Federico 2

  3. Outline
 ● Reordering patterns in Arabic-English
 ● Chunk-based verb reordering: technique and analysis
 ● Impact of VSO sentences on translation quality
 ● Chunk-based reordering lattices

  5. Reordering patterns in Arabic-English
 VSO sentence: Arabic verb anticipated wrt English

  6. Reordering patterns in Arabic-English
 VSO sentence: Arabic verb anticipated wrt English
 Several local, one long reordering involving the verb
 Typical phrase-based SMT outputs:
 *The Moroccan monarch King Mohamed VI __ his support to…
 *He renewed the Moroccan monarch King Mohamed VI his support to…

  7. Previous work
 (Habash '07; Crego&Habash '08; Elming&Habash '09)
 • preprocess source data to approximate target word order
 • address all reorderings
 • deterministic reordering => 1 most probable permutation
 • non-deterministic => word reordering lattices
 Our work:
 • only one class of reorderings
 • mixed approach: deterministic for train, lattices for test

  8. Reordering patterns in Arabic-English
 Working hypothesis: uneven distribution of reordering phenomena

  9. Reordering patterns in Arabic-English
 Working hypothesis: uneven distribution of reordering phenomena
 Many local
 − adjectival modifiers following their noun
 − head-initial genitive constructions (idafa)
 Example => [example figure]
 Few global
 − Verb-Subject-Object sentences

  10. Reordering patterns in Arabic-English
 Working hypothesis: uneven distribution of reordering phenomena
 Many local
 − adjectives follow nouns
 − head-initial genitive constructions (idafa)
 Example => [example figure]
 Few global
 − Verb-Subject-Object sentences

  12. Reordering patterns in Arabic-English
 VSO sentences:
 moving verb after subject simplifies reordering
 Other (local) reorderings:
 handled inside phrases or through distortion

  13. Outline
 ● Reordering patterns in Arabic-English
 ● Chunk-based verb reordering: technique and analysis
 ● Impact of VSO sentences on translation quality
 ● Chunk-based reordering lattices

  14. Chunk-based verb reordering
 – Simplifying assumptions:
 1) verb reordering only between shallow syntax chunks;
 2) no overlap between consecutive verb movements

  15. Chunk-based verb reordering
 – Simplifying assumptions:
 1) verb reordering only between shallow syntax chunks;
 2) no overlap between consecutive verb movements
 – Possible movements:
 move verb chunk…

  16. Chunk-based verb reordering
 – Simplifying assumptions:
 1) verb reordering only between shallow syntax chunks;
 2) no overlap between consecutive verb movements
 – Possible movements:
 move verb chunk…
 ...or verb chunk + next chunk (e.g. adverbials)
 by up to X chunks to the right
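
 The possible movements listed on this slide can be sketched in code. The following is only an illustration under assumptions: the function name `candidate_movements`, the representation of a sentence as a plain list of chunk labels, and the parameter `max_len` (standing in for the slide's X) are hypothetical and not taken from the authors' implementation.

```python
from typing import List


def candidate_movements(chunks: List[str], verb_idx: int, max_len: int,
                        with_next: bool = False) -> List[List[str]]:
    """Enumerate chunk orders obtained by moving the verb chunk to the right.

    chunks    -- sentence as a list of chunk labels (hypothetical representation)
    verb_idx  -- position of the verb chunk
    max_len   -- maximum movement length in chunks (the slide's X)
    with_next -- if True, move the verb chunk together with the chunk after it
    """
    block_end = verb_idx + (2 if with_next else 1)
    if block_end > len(chunks):
        return []  # no following chunk to attach to the verb
    block = chunks[verb_idx:block_end]   # the chunk(s) being moved
    before = chunks[:verb_idx]           # left context, untouched
    after = chunks[block_end:]           # chunks the block may jump over
    results = []
    for shift in range(1, min(max_len, len(after)) + 1):
        results.append(before + after[:shift] + block + after[shift:])
    return results


if __name__ == "__main__":
    # Toy VSO sentence: verb, two subject chunks, object.
    for order in candidate_movements(["V", "S1", "S2", "O"], 0, 2):
        print(order)
```

 Moving only whole chunks keeps the number of candidates per verb at no more than 2·X (X for the verb chunk alone, X for the verb chunk plus the next chunk), which is what makes exhaustive enumeration cheap.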

  17. Chunk-based verb reordering
 Best movement:
 minimizes distortion wrt English translation
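
 A minimal sketch of selecting the best movement follows. The slides do not define the distortion measure, so this example assumes a simple proxy: each source chunk is represented by the (first, last) target positions of its aligned English words, and distortion is the sum of the jumps between consecutive chunks when read in source order. The names `distortion` and `best_movement` are hypothetical.

```python
from typing import List, Tuple

# (first, last) English positions aligned to one source chunk (assumed input)
Span = Tuple[int, int]


def distortion(spans: List[Span]) -> int:
    """Sum of target-side jumps when chunks are visited in source order."""
    total = 0
    prev_end = -1
    for start, end in spans:
        total += abs(start - prev_end - 1)  # 0 when chunks are monotone and adjacent
        prev_end = end
    return total


def best_movement(spans: List[Span], verb_idx: int, max_len: int) -> Tuple[int, int]:
    """Return (shift, cost) of the verb movement with the lowest distortion.

    shift 0 means the verb stays in place; ties keep the shortest movement.
    """
    best = (0, distortion(spans))
    verb = spans[verb_idx]
    rest = spans[:verb_idx] + spans[verb_idx + 1:]
    max_shift = min(max_len, len(spans) - 1 - verb_idx)
    for shift in range(1, max_shift + 1):
        pos = verb_idx + shift
        candidate = rest[:pos] + [verb] + rest[pos:]
        cost = distortion(candidate)
        if cost < best[1]:
            best = (shift, cost)
    return best


if __name__ == "__main__":
    # Toy VSO sentence: source order V S1 S2 O, English order S1 S2 V O.
    spans = [(4, 4), (0, 1), (2, 3), (5, 6)]
    print(best_movement(spans, verb_idx=0, max_len=6))  # (2, 0)
```

 In the toy example the verb moves two chunks to the right, past both subject chunks, which makes the chunk sequence monotone with respect to the English side.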

  18. Chunk-based verb reordering: corpus analysis
 Distribution by movement length
 − Intersection of GIZA++ alignments
 − Manual alignments

  19. Chunk-based verb reordering: corpus analysis
 Distribution by movement length
 => Good coverage (≥ 99.5%) with max movement length 6
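
 The coverage figure on this slide is a cumulative share of the movement-length distribution. The sketch below shows how such a threshold could be computed from a histogram of movement lengths. The function name `min_length_for_coverage` is hypothetical, and the counts in the example are toy numbers, not the corpus statistics behind the slide.

```python
from typing import Dict


def min_length_for_coverage(counts: Dict[int, int], target: float) -> int:
    """Smallest max movement length whose cumulative share reaches `target`.

    counts -- histogram: movement length (in chunks) -> number of verb movements
    target -- required coverage as a fraction, e.g. 0.995
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty histogram")
    covered = 0
    for length in sorted(counts):
        covered += counts[length]
        if covered / total >= target:
            return length
    raise ValueError("target coverage above 100%")


if __name__ == "__main__":
    # Toy histogram for illustration only (NOT the data from the slides).
    toy_counts = {1: 50, 2: 30, 3: 15, 4: 5}
    print(min_length_for_coverage(toy_counts, 0.90))  # 3
```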

  20. Outline
 ● Reordering patterns in Arabic-English
 ● Chunk-based verb reordering: technique and analysis
 ● Impact of VSO sentences on translation quality
 ● Chunk-based reordering lattices

  21. Impact of VSO sentences on MT quality
 • Baseline: Moses, 30M words newswire from NIST09

  22. Impact of VSO sentences on MT quality
 • Baseline: Moses, 30M words newswire from NIST09
 • Shallow syntax chunking: AMIRA (Diab et al. 2004)
 • Verb-reorder training and devset, re-train whole system

  23. Impact of VSO sentences on MT quality
 • Baseline: Moses, 30M words newswire from NIST09
 • Shallow syntax chunking: AMIRA (Diab et al. 2004)
 • Verb-reorder training and devset, re-train whole system
 • Verb-reorder test aligned with reference (oracle)
 • Tested with different Distortion Limits (DL) from 2 to 10 and wide beam search

  24. Impact of VSO sentences on MT quality
 %BLEU scores on Eval08-NW (MERT on Dev06-NW):

  25. Impact of VSO sentences on MT quality
 %BLEU scores on Eval08-NW (MERT on Dev06-NW):
 Verb reordering of training data only => positive effect (9% more phrases extracted)

  26. Impact of VSO sentences on MT quality
 %BLEU scores on Eval08-NW (MERT on Dev06-NW):
 Verb reordering of training and test => further gain (+1.2 with 1/3 of sentences modified)
 Verb reordering of training data only => positive effect (9% more phrases extracted)

  27. Impact of VSO sentences on MT quality
 %BLEU scores on Eval08-NW (MERT on Dev06-NW):
 Verb reordering of training and test => further gain (+1.2 with 1/3 of sentences modified)
 Relaxing the DL to high values doesn’t help
 Verb reordering of training data only => positive effect (9% more phrases extracted)

  28. Impact of VSO sentences on MT quality
 To sum up:
 • VSO sentences negatively affect phrase-based SMT
 • Specific models needed to handle verb reordering of test
