Interactive alignment of Parallel Texts – a cross browser experience (standards in practice)
Gavin Brelstaff (gjb@crs4.it), CRS4, 09010 Pula (CA), Sardinia, Italy
Francesca Chessa, University of Sassari, Italy
Multilingual Web Workshop, Pisa, April 2011
MLW Pisa 2011, G. Brelstaff & F. Chessa

Introduction
Alignment of parallel texts; multi-lingual; minority languages; poetry
Dante’s was a minority language.
But why?

“Think global, act local”
“Think local, act global”
Genius loci: the creative spirits of place, geolocated.
Minority language: a seed-bed for poetic expression, beyond mere communication.
Whenever we lose a language the “genetic basis” for such expression diminishes, globally.
“He was the cat that walked by himself and all places were alike to him.” (Kipling)

Echo Chamber: minority language, island language (song, verse, prose)

Echo Chamber: in poet’s head

Echo Chamber: inside the head (ear, tongue, thought)

Echo Chamber: inside the head (ear, tongue, thought, eye)

Language Barrier

Language Barrier
Cultural context A | Cultural context B (cf. R. Jakobson)
[Figure: diffusion between the two cultural contexts, across the language barrier]

Language Barrier as “cellular membrane”
Minority language | Global language
[Figure: diffusion and osmosis across the membrane]
Assist in avoiding dilution, shrivelling, bursting.

Language Barrier

Translator
Parallel text alignment ↔ to communicate semantics
• standards-based markup
• web delivery, cross-browser
• non-verbal interactivity
• beyond GoogleTranslate
Midway / Nel mezzo

Parallel text alignment web interface
Beyond GoogleTranslate:
• SMT (statistical machine translation) is not going to translate poetry well any time soon.
• We allow the translator to clarify by alignment.
• Point-&-click interface to modify standard markup
• Colour-code: formal & dynamic equivalence [Nida-Taber]
• Demo

Demo (a desktop browser: IE 8-9, Firefox 3-4, Opera 11, Chrome, Safari)

Demo: selection by click

Demo: selection & alignment

Standards in practice: Pros & Cons
[Diagram: the technology stack]
Layers: Presentation, Content, Structure, Semantics
Standards used: XHTML, CSS, XML, TEI-p5, Unicode, XMLSchema (not RDF, not XSL)
Browser side: w3cRange, jQuery, Javascript DOM
Archiving: REST/ajax, http put, eXist XML db, XQL

Cons: #1
We can’t interact directly with Semantics.
Browsers only bind events to XHTML elements (why not XML?).
Incurs two degrees of messy indirection.

Cons: #2
w3cRange is not “road worthy”.
We resort to Click to Select.
Selection within words still lacking.

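A minimal sketch of what click-to-select at word granularity might look like, assuming each word is treated as one selectable unit. The function names `tokenize` and `wordAt` are hypothetical and not taken from the demo; the actual interface binds clicks to XHTML elements, which is not shown here.

```javascript
// Split a line of verse into word and separator tokens with character offsets.
// \p{L} and \p{M} match Unicode letters and combining marks, so accented words stay whole;
// apostrophes are kept inside words (e.g. "l'altre").
function tokenize(text) {
  const pattern = /[\p{L}\p{M}'’]+|[^\p{L}\p{M}'’]+/gu;
  const tokens = [];
  for (const match of text.matchAll(pattern)) {
    tokens.push({
      text: match[0],
      start: match.index,
      end: match.index + match[0].length,
      isWord: /\p{L}/u.test(match[0]), // a run with no letter is a separator
    });
  }
  return tokens;
}

// Return the word token containing the clicked character offset, or null
// when the click lands on whitespace or punctuation.
function wordAt(text, offset) {
  const hit = tokenize(text).find(
    (t) => t.isWord && offset >= t.start && offset < t.end
  );
  return hit || null;
}

console.log(wordAt("Nel mezzo del cammin", 5).text); // prints "mezzo"
```

Because the whole word is always returned, this sketch shares the limitation named on the slide: a selection that starts or ends inside a word is not possible.
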
Cons: #3
TEI-p5 must be subsetted to avoid overlapping markup.
We prioritise alignment tags over {verse-line, paragraph} hierarchy.

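The overlap problem can be illustrated with character offsets: two spans conflict when they overlap without one containing the other, which XML nesting cannot express. The sketch below is an illustration only; `spansCross` and `findCrossings` are hypothetical names, and the offset-based representation is an assumption rather than the authors' data model.

```javascript
// Two spans "cross" if they overlap but neither fully contains the other.
// Spans are half-open character ranges: { start, end }.
function spansCross(a, b) {
  const overlap = a.start < b.end && b.start < a.end;
  const aInsideB = a.start >= b.start && a.end <= b.end;
  const bInsideA = b.start >= a.start && b.end <= a.end;
  return overlap && !aInsideB && !bInsideA;
}

// List every (alignment, line) pair that could not be nested in one XML tree.
function findCrossings(alignmentSpans, lineSpans) {
  const crossings = [];
  for (const alignment of alignmentSpans) {
    for (const line of lineSpans) {
      if (spansCross(alignment, line)) {
        crossings.push({ alignment, line });
      }
    }
  }
  return crossings;
}

// An alignment that runs from the end of one verse line into the next
// crosses both line boundaries.
const lines = [{ start: 0, end: 20 }, { start: 20, end: 40 }];
console.log(findCrossings([{ start: 15, end: 30 }], lines).length); // prints 2
```

When a crossing is found, something has to give; the slide's choice is to keep the alignment tags and relax the verse-line and paragraph hierarchy.
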
Pros: #1
Unicode in XML attributes permits our novel alignment scheme:
The verbatim source text is simply assigned as an attribute of an enclosing tag in the translated text.

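A rough sketch of the scheme described above: a fragment of the translation is wrapped in a tag whose attribute carries the matching source text verbatim. The element name `seg` and the attribute name `src` are placeholders chosen for illustration; the slides do not say which TEI-p5 element or attribute the authors use.

```javascript
// Escape characters that are not allowed raw inside a double-quoted XML attribute.
function escapeXmlAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escape characters that are not allowed raw in XML text content.
function escapeXmlText(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap a translated fragment in a tag carrying the verbatim source fragment.
// "seg" and "src" are hypothetical names, not the authors' markup.
function alignSegment(translated, source) {
  return `<seg src="${escapeXmlAttribute(source)}">${escapeXmlText(translated)}</seg>`;
}

console.log(alignSegment("Midway", "Nel mezzo"));
// prints <seg src="Nel mezzo">Midway</seg>
```

Since XML attribute values may hold any Unicode text, the source fragment needs no transliteration, only escaping. One caveat worth keeping in mind: XML parsers normalize literal line breaks inside attribute values to spaces, so a source fragment spanning several verse lines would lose its line breaks unless they are written as character references.
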
Pros: #2
CSS selection mechanism as embraced in jQuery helps tame the complexity of cross-browser DOM programming.

Pros: #3
RESTful archiving is a reality due to:
• Ajax in the browser
• HTTP PUT on the wire, &
• eXist XML db on the server

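The three ingredients above could fit together roughly as follows. This sketch only builds the request and does not send it. The base URL is an assumption (the REST endpoint of a default local eXist install), and the collection `poems`, the resource `canto1.xml` and the function name `buildPutRequest` are hypothetical.

```javascript
// Build the URL and options for storing an XML document via HTTP PUT.
// Nothing is sent here; the result can be handed to an Ajax call.
function buildPutRequest(baseUrl, collection, resource, xml) {
  // Encode each path segment separately so "/" keeps its meaning as a separator.
  const path = [collection, resource]
    .map((part) => part.split("/").map(encodeURIComponent).join("/"))
    .join("/");
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/${path}`,
    options: {
      method: "PUT",
      headers: { "Content-Type": "application/xml" },
      body: xml,
    },
  };
}

// Assumed endpoint and hypothetical collection/resource names.
const request = buildPutRequest(
  "http://localhost:8080/exist/rest/db",
  "poems",
  "canto1.xml",
  '<seg src="Nel mezzo">Midway</seg>'
);
console.log(request.url);
// prints http://localhost:8080/exist/rest/db/poems/canto1.xml
```

In a current browser the result could be passed to `fetch(request.url, request.options)`; with jQuery, as used in the deck, the equivalent would be a `$.ajax` call with `type: "PUT"`. Authentication and error handling are left out of this sketch.
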
Conclusion: Standards in practice
Cons
• Can’t bind to XML
• w3cRange not ready
• Must subset TEI-p5
Pros
• Unicode in attributes
• CSS & jQuery v. DOM
• RESTful reality

Browser issues
• Opera: no transparent cursor in text
• Firefox: synchronous scroll down bug
• IE: onselectstart issue
• Google Chrome: run from disk fix
• Safari/Chrome/IE Form Enctype: validation

That’s all folks
“L'Amor che move il sole e l'altre stelle” (“The Love that moves the sun and the other stars”)
Gavin Brelstaff (gjb@crs4.it), CRS4, 09010 Pula (CA), Sardinia, Italy
Francesca Chessa, University of Sassari, Italy