Using forced alignment for segmental analysis
A Review
Erin Olson, Michael Wagner, Meghan Clayards
McGill University
Computational Field Workshop
McGill University, Montréal
28 May 2013

Table of Contents
1 Introduction to the Prosodylab Aligner
2 Assessing the Aligner
    Background: the experiments
    Assessing alignment results
    Assessing alignment accuracy
3 Tools for endangered languages

What is the Prosodylab Aligner?
The Prosodylab Aligner (Gorman et al. 2011) is a tool for performing forced alignment on audio data.
Some details:
- Python codebase
- Compatible with UNIX-based systems (so far)
- Based on the Hidden Markov Model Toolkit (HTK)
It takes these files...
- .lab files (transcripts of the audio files)
- .wav files (the audio files themselves)
... and gives back .TextGrid files (readable in Praat (Boersma & Weenink 2013))
- Both words and segments are aligned
- No previously aligned data is necessary, just a transcript

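Since the Aligner expects each .wav file to come with a .lab transcript, it can help to check the pairing before running it. The following is a minimal sketch, not part of the Prosodylab Aligner itself; the function name `find_unpaired` is hypothetical, and the sketch assumes that paired files share a base name within one directory.

```python
from pathlib import Path


def find_unpaired(directory):
    """Return (wav_without_lab, lab_without_wav) as sorted lists of base names.

    Assumption: a recording and its transcript share a base name,
    e.g. item01.wav and item01.lab, and sit in the same directory.
    """
    root = Path(directory)
    wav_stems = {p.stem for p in root.glob("*.wav")}
    lab_stems = {p.stem for p in root.glob("*.lab")}
    return sorted(wav_stems - lab_stems), sorted(lab_stems - wav_stems)
```

Both returned lists being empty means every audio file has a transcript and vice versa.
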
Aligner demo
Demo

Training the Aligner
The Aligner can also be trained on different data. This data can come from:
- a single speaker
- a single dialect
- a new language
Training the Aligner requires:
- At least two hours of transcribed training data
- A phonetic dictionary, such as the CMU Pronouncing Dictionary or Lexique

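Because training and alignment both depend on a phonetic dictionary, every word in the transcripts needs an entry. The sketch below lists transcript words that have none. It assumes a CMU-style plain-text dictionary (one `WORD PHONE PHONE ...` entry per line, `;;;` comment lines, alternate pronunciations marked as `WORD(1)`); the function names `parse_dictionary` and `missing_words` are hypothetical and not part of the Aligner.

```python
import re


def parse_dictionary(lines):
    """Map each word to a list of pronunciations (each a list of phone labels)."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):  # skip blanks and comments
            continue
        word, *phones = line.split()
        word = re.sub(r"\(\d+\)$", "", word)  # WORD(1) -> WORD
        entries.setdefault(word, []).append(phones)
    return entries


def missing_words(transcript, dictionary):
    """Return the sorted, de-duplicated transcript words with no dictionary entry."""
    words = transcript.upper().split()
    return sorted({w for w in words if w not in dictionary})


# Toy dictionary lines for illustration only.
toy = parse_dictionary(["COT  K AA1 T", "COD  K AA1 D", "SAY  S EY1"])
print(missing_words("please say cot again", toy))
```

For a new language, the same check applies to whatever dictionary is built for it; only the entry format would change.
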
Training demo
Demo

Goals for this talk
- We’ve seen how good word alignment can be, but what about segmental alignment?
- How can we make this tool as useful as possible for field linguists in its present state?

The studies
Goal: Hayes (2007) claims that vowel phonemes are “realized as extra short when a voiceless consonant follows” in English. Is this really the case?
- Two experiments were performed, comparing vowel length before voiced and voiceless obstruents to vowel length before sonorants
    - One with fricatives (F): fuss, fuzz, fun
    - One with stops (S): cot, cod, con
More details on experimental design:
- All words were monosyllabic and spoken in a carrier phrase, “Please say ___ again”
- Experiment F had 6 (near) minimal triplets comparing [s] and [z] with [m] or [n]; 19 participants
- Experiment S had 30 minimal triplets comparing stops with [m], [n], [ŋ], [l]; 27 participants
- Participants only saw one word of each minimal triplet

Human annotation
Two research assistants aligned the vowel of interest and the following consonant for both experiments
- For experiment S, stop consonants were split into closure and burst components

Results: Human annotation
[Figure] Results of human annotation for experiment F and experiment S. Error bars represent 90% confidence intervals. All differences are significant, as found by a linear mixed model regression.

Results: Human annotation
All three conditions are significantly different from one another in both experiments
- For experiment F, the Sonorant and Voiceless conditions were closest (|t| = 3.628)
- For experiment S, the Sonorant and Voiced conditions were closest (|t| = 4.254)

Alignment
The training set:
- Around four hours of training data
- Previously collected through other Prosodylab experiments
Two alignments performed:
- One using the CMU Pronouncing Dictionary (Alignment 1, or A1)
- One using a modified version of the Pronouncing Dictionary, where stops are separated into closures and bursts (Alignment 2, or A2)

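The modification behind A2 can be pictured as rewriting each dictionary pronunciation so that every stop becomes two units. The sketch below is an illustration only: the slides do not say how the closure and burst units were labelled, so the `_CL` and `_B` suffixes, the `STOPS` set, and the function name `split_stops` are all assumptions.

```python
# ARPAbet stop labels as used in the CMU Pronouncing Dictionary.
STOPS = {"P", "B", "T", "D", "K", "G"}


def split_stops(phones):
    """Replace each stop with a closure unit followed by a burst unit.

    The "_CL" / "_B" suffixes are hypothetical labels chosen for this
    sketch; they are not taken from the original modified dictionary.
    """
    result = []
    for phone in phones:
        if phone in STOPS:
            result.extend([phone + "_CL", phone + "_B"])
        else:
            result.append(phone)
    return result


print(split_stops(["K", "AA1", "T"]))
# prints ['K_CL', 'K_B', 'AA1', 'T_CL', 'T_B']
```

With a dictionary rewritten this way, the aligner places separate boundaries for closure and burst, which matches how the research assistants segmented the stops in experiment S.
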
Results: Alignment
[Figure] Results of aligned annotation for both experiments. Error bars represent 90% confidence intervals. All differences between conditions are significant, as found by a linear mixed model regression.

Results: Alignment
All three conditions are still significantly different from one another, in both experiments and both alignments.
- For experiment F, the Sonorant and Voiceless conditions were closest to one another (|t| = 2.611 for A1 and |t| = 2.876 for A2), just as in the hand-annotated data
- For experiment S, the Sonorant and Voiced conditions were closest to one another (|t| = 3.192 for A1 and |t| = 2.147 for A2), just as in the hand-annotated data
Take-home message: the alignments give the same qualitative result as the hand-annotated data

Assessment: Duration
Are the measures of vowel duration significantly different from the human-annotated durations?
[Figure] Results from all annotations, grouped by condition and annotation. Error bars represent 90% confidence intervals. Asterisks indicate significant difference from hand annotation, as measured by a mixed model linear regression.

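As a first descriptive look before any significance testing, the aligned and hand-annotated vowel intervals can be compared token by token. The sketch below computes the mean signed duration difference; it is much simpler than the mixed model regression reported on the slide and does not replace it. The function names and the `(start, end)` interval representation in seconds are assumptions made for illustration.

```python
def duration(interval):
    """Duration in seconds of a (start, end) interval."""
    start, end = interval
    return end - start


def mean_duration_difference(hand, aligned):
    """Mean of (aligned duration - hand duration) over matched tokens.

    A negative value means the aligner's vowels are shorter on average.
    Both lists must hold the same tokens in the same order.
    """
    if not hand or len(hand) != len(aligned):
        raise ValueError("need two non-empty lists of equal length")
    diffs = [duration(a) - duration(h) for h, a in zip(hand, aligned)]
    return sum(diffs) / len(diffs)
```

In practice the intervals would be read from the hand-made and the Aligner's .TextGrid files, matched by item and speaker.
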