MT@EC
Final Multilingual Web workshop, Luxembourg, 15-16 March 2012
Spyridon Pilos, Language applications, DG Translation of the European Commission (DGT)

A new machine translation service for the European Commission
• Necessary
• Based on data-driven MT technology
• Making best use of Commission language resources
• Making best use of internal linguistic expertise (1700 translators for 23 languages)
• Open and flexible
• Ensuring technological independence
• Being built by DG Translation
• Started: summer 2010
• Deploy: summer 2013
MLWeb workshop MT@EC 2 DGT

MT@EC service architecture
[Diagram: three MT action lines]
1. Data: language resources built around Euramis and more…; MT data (language resources specific for each MT engine); DATA MODELLING; DATA HUB
2. Engines: MT engines by language, subject… and more…; ENGINES HUB
3. Service: users and services; DISPATCHER managing MT requests; customised interfaces and more…; USER FEEDBACK; SOA

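The dispatcher's role of routing MT requests to engines organised by language can be sketched as follows. This is only an illustration under stated assumptions, not the MT@EC implementation: the `Dispatcher` class, the registry keyed by language pair, and the stand-in engine function are all hypothetical names introduced here.

```python
from typing import Callable, Dict, Tuple

# Hypothetical: an engine is any function that maps source text to target text.
Engine = Callable[[str], str]


class Dispatcher:
    """Routes each MT request to the engine registered for its language pair."""

    def __init__(self) -> None:
        self._engines: Dict[Tuple[str, str], Engine] = {}

    def register(self, source: str, target: str, engine: Engine) -> None:
        """Make an engine available for one (source, target) language pair."""
        self._engines[(source, target)] = engine

    def translate(self, text: str, source: str, target: str) -> str:
        """Forward the request to the matching engine, or fail loudly."""
        engine = self._engines.get((source, target))
        if engine is None:
            raise LookupError(f"no engine for {source}->{target}")
        return engine(text)


# Stand-in engine for illustration only; a real engine would call an MT system.
def fake_en_fr(text: str) -> str:
    return f"[fr] {text}"


dispatcher = Dispatcher()
dispatcher.register("en", "fr", fake_en_fr)
result = dispatcher.translate("Hello", "en", "fr")
```

Keeping the registry separate from the engines means a new language pair can be added without touching the code that serves users.
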
Multilingual web = Multilingual content (MLW = MLC)
• MLC = C available in L1, L2, L3, …, Ln
• How it is done:
  • Author produces C in one language
  • Translators transfer the C to other languages
  • Publisher gets C in all languages and publishes it

Multilingual web = Multilingual content (MLW = MLC)
[Diagram: Author: L1; Translate: L2 and L3, L4, L5, Ln; Publish: L1, L2, L3, …, Ln.]

Multilingual web = Multilingual content (MLW = MLC)
[Diagram: Publish: L1, L2, L3, L4, L…, Ln; Author: L1; Translate: L2 and L3, L4, Ln.]

Multilingual web = Multilingual content (MLW = MLC)
• MLC = C available in L1, L2, L3, …, Ln
• How it could be done:
  • Publisher prepared for C in all languages (“placeholders”)
  • Author produces C in one language “ready for publication”
  • Translators produce C in other languages “ready for publication”

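The "placeholders" idea above can be sketched as a small data structure: the publisher creates one empty slot per language up front, and the author and translators each fill their own slot. This is a hypothetical model for illustration; the `ContentItem` class and its methods are not taken from any Commission system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContentItem:
    """One piece of content C with a placeholder per language."""

    languages: List[str]
    versions: Dict[str, Optional[str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The publisher prepares an empty placeholder for every language.
        for lang in self.languages:
            self.versions.setdefault(lang, None)

    def fill(self, lang: str, text: str) -> None:
        """Deliver a 'ready for publication' version into its placeholder."""
        if lang not in self.versions:
            raise KeyError(f"no placeholder for {lang}")
        self.versions[lang] = text

    def missing(self) -> List[str]:
        """Languages whose placeholder is still empty."""
        return [lang for lang in self.languages if self.versions[lang] is None]


item = ContentItem(languages=["en", "fr", "de"])
item.fill("en", "Welcome")    # the author delivers the original
item.fill("fr", "Bienvenue")  # a translator delivers one version
still_missing = item.missing()
```

Because every language has a slot from the start, the publisher can see at any time which versions are still outstanding.
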
Language applications: example MT
[Diagram: Web and MT, with languages L1, L2, L3, L4, L…, Ln.]

Getting multilingual data from the web
• europa.eu has lots of MLC
• We tried with web site 1: not that difficult
• BUT
• … when going to web site 2, we need to analyse the way that site manages linguistic versions
• …
• … web site n has yet another way of managing linguistic versions
• … and why not, since there is no standard to follow?
[Diagram: Web, with languages L1, L2, L3, L4, L…, Ln.]

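The cost of having no standard can be illustrated with a sketch in which every site needs its own rule for finding the language of a page. The host names and the two URL conventions below are invented for illustration; they do not describe how any particular europa.eu site manages its linguistic versions.

```python
import re
from typing import Callable, Dict, Optional
from urllib.parse import parse_qs, urlparse


def lang_from_path_suffix(url: str) -> Optional[str]:
    # Hypothetical convention 1: the language is a file-name suffix,
    # e.g. .../index_en.htm
    match = re.search(r"_([a-z]{2})\.html?$", urlparse(url).path)
    return match.group(1) if match else None


def lang_from_query(url: str) -> Optional[str]:
    # Hypothetical convention 2: the language is a query parameter,
    # e.g. ...?lang=fr
    values = parse_qs(urlparse(url).query).get("lang")
    return values[0] if values else None


# One hand-written rule per site: this table grows with every new site.
EXTRACTORS: Dict[str, Callable[[str], Optional[str]]] = {
    "site1.example": lang_from_path_suffix,
    "site2.example": lang_from_query,
}


def detect_language(url: str) -> Optional[str]:
    """Return the language code of a page, or None if the site is unknown."""
    extractor = EXTRACTORS.get(urlparse(url).netloc)
    return extractor(url) if extractor else None
```

With a shared convention, the per-site table would collapse into a single rule, which is the kind of order the presentation argues for.
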
Giving multilingual data to the web
• DGT-TM
  • Translation memories from the Official Journal of the EU
  • uses TMX
  • yearly updates (to come): “do not change, so that there is continuity”
• DGT-Acquis
  • New resource (to come)
  • Parallel text from the Official Journal of the EU
  • First version: “do not change too much compared to JRC-Acquis”
  • establish “our” standard way of working (which we make public)
  • yearly updates (to come): “do not change, so that there is continuity”

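Since DGT-TM uses TMX, a consumer can read the aligned segments with a standard XML parser. The sketch below uses Python's standard library and a tiny invented sample; the sample is not actual DGT-TM data, and the real files may differ in encoding, language codes and packaging.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample for illustration only, not taken from DGT-TM.
SAMPLE_TMX = """<?xml version="1.0"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Official Journal</seg></tuv>
      <tuv xml:lang="fr"><seg>Journal officiel</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_units(tmx_text: str) -> List[Dict[str, str]]:
    """Return one dict per translation unit, mapping language to segment."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                unit[tuv.get(XML_LANG, "")] = "".join(seg.itertext())
        units.append(unit)
    return units


units = read_units(SAMPLE_TMX)
```

A stable format from one yearly release to the next means a reader like this keeps working, which is the continuity the slide asks for.
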
Conclusion
• Need some order for multilingual content on the web
  • “Standard” data models?
  • “Standard” structure for data “storage” or “publication”?
  • Standard?
• MW consortium (and MW-LT) are meant to:
  • propose a feasible approach
  • demonstrate the benefit for all
• Commission DG for Translation: founding member of “Language interoperability portfolio” (Linport)

Thank you for your attention