This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 2019 Linguistic Institute Course 353: Digital Methods in Language Documentation. Days 3&4: ELAN, Lesson 1. Andrea Berez-Kroeker (U Hawaiʻi), Christopher Cox (Carleton U). Our class Google Drive folder: bit.ly/DigLangDocLSA2019 Please download and unzip the ELAN1-Lesson 1 folder.
Welcome! We ♥ teaching ELAN! Chris and Andrea have been “Partners in ELAN” since the first CoLang in 2008. andrea.berez@hawaii.edu christopher.cox@carleton.ca
This workshop: Hands-on For beginners Simple → complex Time for practice
In lesson 1 we will... Get to know each other and ELAN Learn about time-aligned transcripts Learn about tiers Practice working in Simple-ELAN and ELAN
Let’s get started.
What does ELAN do? ELAN makes time-aligned transcripts. What’s that?
Time-aligned transcripts Match text (“annotations”) with sections of an audio or video recording (“media”)
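One way to picture a time-aligned transcript is as a list of annotations, each tied to a start and end time in the media. The Python sketch below is only an illustration of that idea, not ELAN's internal format. The names `Annotation` and `annotation_at` and all of the times are made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annotation:
    start_ms: int  # where the annotation begins in the media (milliseconds)
    end_ms: int    # where the annotation ends in the media (milliseconds)
    text: str      # the text matched to that section of the media


def annotation_at(annotations: List[Annotation], time_ms: int) -> Optional[Annotation]:
    """Return the first annotation whose time span covers time_ms, or None."""
    for ann in annotations:
        if ann.start_ms <= time_ms < ann.end_ms:
            return ann
    return None


# Hypothetical sample data: the times are invented for illustration.
transcript = [
    Annotation(0, 1800, "How do people make this judgment?"),
    Annotation(2100, 4000, "What's going on in the brain?"),
]

hit = annotation_at(transcript, 2500)
print(hit.text if hit else "(no annotation at this time)")
```

Asking for a time that falls between the two annotations (for example 1900 ms) returns `None`: there is media there, but no text matched to it.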
Time-aligned transcripts are everywhere!
ELAN makes time-aligned transcripts for language documentation.
ELAN is great for language documentation! Developed by linguists. Unicode compliant. Free, open source, non-proprietary. XML, so very portable.
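Because ELAN files are XML, other tools can read them. The sketch below uses Python's standard XML parser on a tiny hand-written fragment in the style of an ELAN .eaf file. The fragment is heavily simplified and made up for this example (a real file has a header, linguistic types, and more), and the function name `read_annotations` is hypothetical. Treat it as an illustration of portability, not as a complete reader.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample in the style of an .eaf file (assumption:
# real files contain more elements and attributes than shown here).
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_annotations(eaf_text):
    """Return (tier name, start ms, end ms, text) for each time-aligned annotation."""
    root = ET.fromstring(eaf_text)
    # Map each time slot ID to its time in milliseconds.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            rows.append((
                tier.get("TIER_ID"),
                slots[ann.get("TIME_SLOT_REF1")],
                slots[ann.get("TIME_SLOT_REF2")],
                ann.findtext("ANNOTATION_VALUE", default=""),
            ))
    return rows


for row in read_annotations(SAMPLE_EAF):
    print(row)
```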
Let’s take a tour.
What ELAN is not: An audio/video editor. Can’t edit media. Need other software. If you do, you must change your ELAN file to match the new timeline.
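Why does editing the media break the transcript? Every annotation stores times on the old timeline. The sketch below shows the arithmetic for one simple case, assuming you trimmed a known number of milliseconds off the start of the recording. The function name `shift_annotations` and the sample times are hypothetical, and this only works on plain Python tuples, not on an ELAN file.

```python
def shift_annotations(annotations, cut_ms):
    """Move (start_ms, end_ms, text) tuples earlier by cut_ms.

    Annotations that lie entirely inside the removed part are dropped.
    Annotations that straddle the cut are clipped to start at 0.
    """
    shifted = []
    for start, end, text in annotations:
        if end <= cut_ms:
            continue  # this stretch of media no longer exists
        shifted.append((max(start - cut_ms, 0), end - cut_ms, text))
    return shifted


# Invented example: 2000 ms were trimmed from the start of the media.
old = [(0, 1000, "a"), (1500, 3000, "b"), (4000, 5000, "c")]
new = shift_annotations(old, 2000)
print(new)
```

Cuts in the middle of a recording, or changes in speed, need more care than this, which is one reason to finish editing the media before you start transcribing.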
What ELAN is not: A text editor. Outputs in plain text. Add embellishments (underlining, bolding, color) ...in a text editor.
ELAN has a learning curve! You have to know what you want.
Simple transcripts are simple... ...but complex transcripts are complex! 1. What are your goals for your transcript? 2. What are ELAN’s built-in concepts? 3. How do ELAN’s concepts map onto your goals?
Let’s look at some common transcript goals.
Example 1: One language, one speaker ((Morality, provided by RadioLab, http://www.wnyc.org/shows/radiolab)) Josh: “How do people make this judgment? Forget whether or not these judgments are right or wrong, just, what’s going on in the brain that makes people distinguish so naturally and intuitively between these two cases? Which, from an actuarial point of view, are very very very similar if not identical...”
Example 2: One language, multiple speakers 4 speakers Laughter/sniffs/coughs Measured pauses
ALVIN: yeah, I haven't -- like admittedly, I haven't, (SNIFF) (0.3) It's funny, I haven't watched those in years, you know,
PETER: Yeah.
ALVIN: so I've thought, (0.7) I've thought, it might be fun to see them again. But, (0.4) um,
PETER: Or not so fun.
ALVIN: (COUGH) Yeah, @@@may-,
PETER: @@@@@@@@@
ALVIN: Maybe it's best just to remember the pleasant childhood memories.
ALLISON: @@The way it was,
LEA: @ Yeah.
((Excerpt: Television))
Example 3: Two languages, one speaker A subject language (the language being spoken) A translation language (a language of wider communication) With free translations (translations of sentences, not single words)
((Glacier Water, Dena’ina. Speaker: Andrew Balluta))
Qizhjeh Vena
Qizhjeh Vena veq'atl'a ghini tustes ghu łi yan nlan ha t'ent'a Dzeł Ken teh.
‘Up at the head of Lake Clark, up in that valley, there are passes in the Alaska Range in both directions.’
Yi ghini idghalzex ch'u k'etnu gguya q'andazdlen ha t'ix łi ta'a nlan ha.
‘When the glaciers start melting, all the water flows into the river.’
Ghuh q'andazdlen ch'u Chuqutenghehtnu dahkadilax ha yehdi ven edilax.
‘And then it forms into a lakelark.’
Łi ta'a ghini, yi edilax ch'uq'u Qizhjeh Vena ku'u edilax.
‘That glacier water, it forms Qizhjeh Vena [Lake Clark].’
Yi ghini edilax ch'u Nundaltin Vena kiq'u edilax.
‘And then it flows into Nundaltin Vena [Nondalton Lake].’
Example 4: Two languages, richer linguistic info A subject language + a translation language Free translations Word-level translations Morphemes and glosses Intonation units...
((Pear Story, Kannada. Speaker: Keshava Subramanya))
KAN Sentence: Ii kathe obba, haṇṇu maaliya bagge, mattu, obba huḍugana bagge ide.
Intonation Unit: Ii kathe obba,
Morphemes: ii kathe obba
Gloss: this story one
Intonation Unit: haṇṇu maaliya bagge,
Morphemes: haṇṇu maali -ya bagge
Gloss: fruit gardener -SRC about
Intonation Unit: mattu,
Morphemes: mattu
Gloss: and
Intonation Unit: obba huḍugana bagge ide.
Morphemes: obba huḍuga -na bagge ide
Gloss: someone.M boy -GEN about be.3SM
Free Translation: ‘This story is about a fruit farmer and a boy.’
Those are just some of the possibilities. ELAN can handle many others! What other features might you want?
Some important concepts in ELAN (They get easier with practice, I promise!)
Tiers Everything that is the same “kind” of information is part of the same tier. All of Lea’s utterances. All the English free translations of the Dena’ina sentences. All the glosses of the Kannada words.
Thinking about tiers: Puppet show video You are filming a 2-person puppet show designed to teach Finnish speakers sentences in the North Saami language. The two puppets are Heide and Hegon.
Thinking about tiers: Puppet show video How many tiers do you need? What are they?
Thinking about tiers: Puppet show video 1. Heide’s utterances in North Saami. 2. Translations of Heide’s sentences into Finnish. 3. Hegon’s utterances in North Saami. 4. Translations of Hegon’s sentences into Finnish.
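The four tiers follow a pattern: one tier for every combination of speaker and kind of information. The short Python sketch below spells out that pattern. The function name `plan_tiers` and the tier labels are made up for this example; ELAN does not require any particular naming scheme.

```python
from itertools import product


def plan_tiers(speakers, kinds):
    """List one tier name for each (speaker, kind of information) pair."""
    return [f"{speaker}: {kind}" for speaker, kind in product(speakers, kinds)]


# Illustrative labels for the puppet show; choose names that suit your project.
speakers = ["Heide", "Hegon"]
kinds = ["utterances (North Saami)", "translations (Finnish)"]

for name in plan_tiers(speakers, kinds):
    print(name)
```

Adding a third puppet, or a third kind of information such as word-level glosses, grows the list in the same way.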
Thinking about tiers Printed transcripts have a fixed page width.
Tier 1... Tier 1... Tier 1...
Tier 2... Tier 2... Tier 2...
Tier 3... Tier 3... Tier 3...
Tier 1... Tier 1... Tier 1...
Tier 2... Tier 2... Tier 2...
Tier 3... Tier 3... Tier 3...
Forced split of the contents of a single tier across many lines :-(
Thinking about tiers But ELAN is time based! :-) Tiers are continuous and contiguous to the media timeline: Tier 1 Tier 2 Tier 3
Thinking about tiers Multiple continuous tiers can capture speaker overlap:
(Time)
Alvin: [···SPEECH···] [·SPEECH·] [·····SPEECH·····]
Peter: [··SPEECH··] [···SPEECH···]
Lea: [·····SPEECH·····]
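Because each speaker has a separate tier on the same timeline, overlap is simply two annotations whose time spans intersect. Here is a minimal Python sketch of that check. The function name `overlaps` is hypothetical, and the times are invented for illustration; they are not measured from the Television excerpt.

```python
def overlaps(tier_a, tier_b):
    """Find stretches of time where annotations on two tiers overlap.

    Each tier is a list of (start_ms, end_ms, text) tuples.
    Returns (overlap_start, overlap_end, text_a, text_b) tuples.
    """
    found = []
    for a_start, a_end, a_text in tier_a:
        for b_start, b_end, b_text in tier_b:
            start = max(a_start, b_start)  # later of the two starts
            end = min(a_end, b_end)        # earlier of the two ends
            if start < end:                # a non-empty shared span
                found.append((start, end, a_text, b_text))
    return found


# Invented times, for illustration only.
alvin = [
    (0, 2000, "so I've thought,"),
    (2700, 5000, "it might be fun to see them again."),
]
peter = [(4500, 6000, "Or not so fun.")]

print(overlaps(alvin, peter))
```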
Top-level tiers Every ELAN transcript will have at least one top-level tier. It contains the contents of the recording. Better: It contains a direct representation of the contents of the recording, and is linked to the time-line of the recording.
Top-level tiers The words spoken, in orthography (spelling). The words spoken, in IPA. Sign language. (And gestures) Background noises/activity. Comments (from the analyst). Any or all of these can be top-level tiers in ELAN! They are direct representations of the recording contents & they are linked directly to the timeline of the recording.
Let’s do Exercise 1 (Have you installed ELAN?) https://tla.mpi.nl/tools/tla-tools/elan/download/
Exercise 1: 1 language, 1 speaker Goal: Make our first ELAN transcript using an audio recording of an English monologue.
Step 1: Download the exercise files In our Google Drive, find the folder called ‘exercise1_language_opinion’. Download this entire folder onto your computer. You will put your ELAN files for this exercise inside this folder.
Step 1: Create a new file 1. Open the ELAN software. Go to “File>New”
Step 1: Create a new file 2. In the pop-up, click “Add Media File” and navigate to the language_opinion.wav file. Click “Open.”
Step 1: Create a new file Your file should be listed in the text box. Click “OK.”
Step 1: Create a new file 3. Save your file. Go to “File>Save.”
Step 1: Create a new file Call your file “language_opinion” and save it in the ‘exercise_1_language_opinion’ folder on your computer. Click “Save.”
Your file is now saved Listen to this file to get familiar with it. You can listen to the recording using the controls in the middle of the screen.
Your new file is saved. Now we can start transcribing.
Step 2: Rename the transcription tier Use Tier>Change Tier Attributes to rename the Default tier
Step 2: Rename the transcription tier In the box that appears, you can add some information: -Tier Name (give it a good name) -Participant (who is the speaker/signer?) -Annotator (that’s you!) Leave the rest as-is. Click “Change”.
Step 2: Rename the transcription tier The tier name has changed. If you hover over the tier label, you will see all the information you added.
Step 3: Listen and transcribe When you are ready, use your mouse to highlight a portion of the waveform. Use the play selection button to listen to it.
Step 3: Listen and transcribe Double-click in the pink area under your selection to open up an annotation box. Type your transcription in the box and press Enter.
Step 3: Listen and transcribe Your transcription is now committed. Even if your text is longer than the width of the box, don’t worry, it’s still there. Hover to see it.