The Importance of Visual Context Clues in Multimedia Translation
Christopher G. Harris, The University of Iowa, USA
Tao Xu, Tongji University, China
CLEF 2011, Amsterdam

Background
• Video websites (e.g. YouTube) have proliferated – an estimated 1 billion views daily in 2010
• Quick translations are essential to reaching a wider global audience
• Professional translators: expensive and slow
• Machine translation tools: cheap… but accurate?
• What about crowdsourcing?
Research objectives
• Is it sufficient to work from a written transcript… or are the visual context clues found in video beneficial for translation?
• Since crowdsourcing involves humans (who can take advantage of video context), how effective is crowdsourcing relative to the MT tools available today?
• Does our success depend on the genre of multimedia we choose to translate?

Meteor evaluation tool
Score CS and MT results against a gold standard (PT) using translations from Mandarin Chinese into:
• Spanish
• Russian
• English
Meteor can match in 3 different ways (sketched below):
• Exact match
• Stemmed match
• Synonym match
(powerful and flexible)
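The following is a minimal sketch of how an exact → stemmed → synonym matching cascade can be realized; it is not the actual Meteor implementation. It assumes Python with NLTK and its WordNet data installed, and every function name here is invented for illustration.

```python
# Sketch of Meteor-style unigram matching in three passes (exact, stemmed,
# synonym). Requires: pip install nltk; nltk.download('wordnet').
from nltk.stem.porter import PorterStemmer
from nltk.corpus import wordnet

stemmer = PorterStemmer()

def synonyms(word):
    """Crude synonym set: all WordNet lemma names for the word."""
    return {lemma.name().lower()
            for syn in wordnet.synsets(word)
            for lemma in syn.lemmas()}

def count_matches(hyp_tokens, ref_tokens):
    """Greedily align hypothesis tokens to unused reference tokens."""
    ref_free = list(ref_tokens)   # reference tokens not yet matched
    matches = 0
    for h in hyp_tokens:
        # Pass 1: exact match
        if h in ref_free:
            ref_free.remove(h)
            matches += 1
            continue
        # Pass 2: stemmed match
        stems = [stemmer.stem(r) for r in ref_free]
        if stemmer.stem(h) in stems:
            ref_free.pop(stems.index(stemmer.stem(h)))
            matches += 1
            continue
        # Pass 3: WordNet synonym match
        for i, r in enumerate(ref_free):
            if r in synonyms(h) or h in synonyms(r):
                ref_free.pop(i)
                matches += 1
                break
    return matches

print(count_matches("the cat sat".split(), "a cat sat quietly".split()))  # 2 exact matches
```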
Meteor parameters

F_mean  = (P · R) / (α · P + (1 − α) · R)
Penalty = γ · (chunks / matches)^β
Score   = F_mean · (1 − Penalty)

Parameter                                                    English   Russian   Spanish
α  (tradeoff between precision & recall)                      0.95      0.85      0.90
β  (functional relation between fragmentation and penalty)    0.50      0.60      0.50
γ  (maximum penalty)                                           0.45      0.70      0.55

(A small scoring sketch follows the setup diagram below.)

Experimental setup (1)
[Diagram: the video and its written transcript are given to a professional translator, the crowd, and several machine translation (MT) tools.]
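Returning to the Meteor parameters above, here is a minimal sketch of how the final score is assembled from precision, recall, and fragmentation; it is illustrative only, not the official Meteor code, and the input values in the example call are made up.

```python
# Meteor-style score from precision P, recall R, and fragmentation
# (chunks / matched unigrams), using the per-language (alpha, beta, gamma)
# values from the parameter table above.
PARAMS = {
    "English": (0.95, 0.50, 0.45),
    "Russian": (0.85, 0.60, 0.70),
    "Spanish": (0.90, 0.50, 0.55),
}

def meteor_score(precision, recall, chunks, matches, language):
    alpha, beta, gamma = PARAMS[language]
    if matches == 0 or (precision + recall) == 0:
        return 0.0
    f_mean = (precision * recall) / (alpha * precision + (1 - alpha) * recall)
    penalty = gamma * (chunks / matches) ** beta
    return f_mean * (1 - penalty)

# Example: P = 0.8, R = 0.9, 4 chunks over 10 matched unigrams, scored for English
print(round(meteor_score(0.8, 0.9, 4, 10, "English"), 3))
```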
Experimental setup (2)
[Diagram: the professional, crowd, and machine translations feed into Meteor scoring.]

The bigger picture
We want to:
• Compare 3 genres (AN, TS, MV)
• Use 3 videos from each genre
• Compare 3 languages (EN, ES, RU)
• Test with and without video usage
• 3 × 3 × 3 × 2 × 2 = 108 runs (enumerated in the sketch below)
For each run:
• 5 MT tools
• Minimum of 2 CS translations (average of 3.8 translations per transcript)
[Diagram: Video → Transcript → {PT, The Crowd, MT} → Meteor Score]
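A small sketch of how the 108 runs enumerate. The factor names are taken from the slide, but the assumption that the final ×2 is the translation source (CS vs. MT) is mine for illustration.

```python
# Enumerate the experimental grid: 3 x 3 x 3 x 2 x 2 = 108 runs.
from itertools import product

genres = ["AN", "TS", "MV"]           # animation, talk show, music video
videos_per_genre = [1, 2, 3]          # 3 clips per genre
languages = ["EN", "ES", "RU"]        # target languages
video_context = [True, False]         # translate with or without the video
source = ["CS", "MT"]                 # assumed: crowdsourced vs. machine translation

runs = list(product(genres, videos_per_genre, languages, video_context, source))
print(len(runs))  # 108
```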
Tools and platforms used

MT tools used:
• Google Translate
• Babelfish
• Bing
• Worldlingo
• Lingoes*

CS platforms used:
• oDesk
• eLance
• Taskcn*
• Zhubajie*
• epWeike*
(No Mechanical Turk)

* = Chinese-based

Animation video clips examined
3 animated clips
• Plenty of exaggerated expressions
• Use of imagery
Music video clips examined
3 music videos
• Figurative, poetic language
• Lots of imagery

Talk show video clips examined
3 talk show clips
• Fast-paced dialog
• Lots of sarcasm and idiomatic expressions
Meteor scoring
[Chart: Meteor scores for the runs; y-axis from 0.00 to 1.00.]

Features evaluated
• Four features studied
• Use a cube to represent three of them:
  – Language
  – Genre
  – Type of translation method
• MM vs. transcript evaluated separately
[Cube diagram labels: Multimedia Genre (AN, MV, TS); Translation Type (CS, MT, PT)]
Representing our results
[Cube diagram with axes Multimedia Genre (AN, MV, TS) and Translation Type (CS, MT).]

Heat-map showing Meteor score
[Heat-map of Meteor scores across the cube.]
Where did visual context help most?
[Heat-map highlighting the cells where visual context clues helped most.]

Inter-annotator agreement
• Inter-annotator agreement (Cohen's Kappa) between crowdsourced and professional translations, grouped by genre.
• MM considers visual context clues, whereas WT only considers the written transcripts.

           Media source used
Genre      MM      WT
  TS       0.69    0.61
  AN       0.71    0.67
  MV       0.65    0.57
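For readers unfamiliar with Cohen's Kappa, the following is a minimal sketch of computing it with scikit-learn; the per-segment labels are hypothetical, made up purely to show the call, and are not the study's data.

```python
# Cohen's Kappa between two sets of categorical judgments.
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-segment quality labels derived from the crowdsourced
# and professional translations of the same transcript.
crowd_labels        = ["good", "good", "fair", "poor", "good", "fair"]
professional_labels = ["good", "fair", "fair", "poor", "good", "good"]

print(round(cohen_kappa_score(crowd_labels, professional_labels), 2))
```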
Additional validation
To validate the translations at a high level, we had human translators provide simple preference judgments on each feature:
• Crowdsourced translations generated from written transcripts compared with crowdsourced translations generated from multimedia
• Machine translations compared with crowdsourced translations
• Professional translations compared with crowdsourced translations

Crowdsourcing vs. professional translations
Professional translations
• Priced on a per-word basis at 6–12 cents/word
• Average cost of US$49.65 per translation
• Took an average of 4.7 business days to complete
Crowdsourcing translations
• Average cost of US$2.15 per translation, roughly 1/23rd of the cost of a professional translation
• Took an average of 40 hours (1.6 days) to complete
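(Sanity check on the cost ratio: US$49.65 / US$2.15 ≈ 23.1, which is where the roughly 1/23rd figure comes from.)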
Conclusion
• Able to observe and quantify the advantage of using video context clues over working from the written transcript alone
• Some genres gain more from video context clues than others
• Some languages gain more from video context clues than others
• Crowdsourced translations appear to be a cost-effective way to obtain translations quickly

Thank you
Questions?