
F0 of Adolescent Speakers: First Results for the German Ph@ttSessionz Database

Chr. Draxler, F. Schiel, T. Ellbogen. BAS Bavarian Archive of Speech Signals, University of Munich, Germany


  1. F0 of Adolescent Speakers: First Results for the German Ph@ttSessionz Database. Chr. Draxler, F. Schiel, T. Ellbogen. BAS Bavarian Archive of Speech Signals, University of Munich, Germany

  2. Introduction • previous f0 studies of adolescents: small numbers of speakers; limited and artificial speech material, e.g. sustained vowels • no speech data available: forensic databases; not available for German

  3. Ph@ttSessionz: Goals • 1000 speakers • 50% male, 50% female (±5%) • 13-19 years • good dialect coverage • recorded via Internet in secondary schools • 22.05 kHz, 16 bit linear PCM, stereo

  4. Session Contents

     | item | # | item | # |
     |---|---|---|---|
     | isolated digit | 10 | date | 3 |
     | numbers 11-100 | 19 | time | 3 |
     | PC command phrases | 12 | directory assistance | 9 |
     | telephone numbers | 13 | spelling | 10 |
     | mobile phone keys | 3 | phonetically rich | 30 |
     | credit card | 3 | spontaneous | 5 |
     | PIN | 3 | narrative | 2 |

  5. Session Contents • SpeechDat and RVG-I compatible

  6. Speaker Data • date of birth, sex, weight, height • dialect region (federal state at age 6) • mother tongue of speaker and family • smoking habits, dental braces, piercings

  7. F0 Analysis • pre-release version of the database • 762 speakers • ~ 49% f, 51% m • good age distribution • biased dialect region distribution • 90829 utterances

  8. F0 Calculation • Praat built-in algorithm • frequency 75-400 Hz • max candidates 15 • silence/voicing threshold 0.03/0.45 • octave/jump/voiced cost 0.01/0.35/0.14 • f0 mean, min, max (in Hz and mel)
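
The last bullet of slide 8 (f0 mean, min, max in Hz and mel) can be illustrated with a minimal sketch. It assumes the pitch tracker has already produced a frame-wise f0 contour in Hz with unvoiced frames encoded as 0, and it assumes the common mel formula 2595 · log10(1 + f/700); the slides name neither the contour format nor the mel formula, and `summarize_f0` is a hypothetical helper, not part of Praat or the database tools.

```python
import math
from statistics import mean


def hz_to_mel(f_hz):
    """Convert Hz to mel. Assumed formula; the slides do not name one."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def summarize_f0(contour_hz):
    """Return mean, min and max f0 of the voiced frames in Hz and in mel.

    Assumption: unvoiced frames are encoded as 0 and are skipped.
    Returns None if the contour has no voiced frame.
    """
    voiced_hz = [f for f in contour_hz if f > 0]
    if not voiced_hz:
        return None
    # Convert frame by frame, so the mel mean is a mean of mel values
    # and not the mel value of the Hz mean.
    voiced_mel = [hz_to_mel(f) for f in voiced_hz]
    return {
        "hz": {"mean": mean(voiced_hz), "min": min(voiced_hz), "max": max(voiced_hz)},
        "mel": {"mean": mean(voiced_mel), "min": min(voiced_mel), "max": max(voiced_mel)},
    }


if __name__ == "__main__":
    # Toy contour for illustration only, not data from the database.
    print(summarize_f0([0.0, 100.0, 200.0, 0.0, 300.0]))
```

Whether the mel mean should be computed per frame or from the Hz mean is a design choice the slides leave open; the sketch converts per frame.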

  9. F0 mean vs. Age [Figure: mean f0 (axis 0 to 250) against age 13 to 19, separate series for m and f]

  10. F0 vs. BMI [Figure, two panels: mean f0 vs. BMI (female) and mean f0 vs. BMI (male); y-axis Hz, 0 to 350; x-axis BMI, 0 to 40]
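
The BMI on slide 10 is presumably derived from the weight and height recorded as speaker data. A minimal sketch of the standard formula (weight in kg divided by squared height in m) follows; the units used here are an assumption, since the slides do not state how weight and height are stored, and `bmi` is a hypothetical helper.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by squared height in m.

    Assumption: weight is given in kg and height in cm.
    """
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


if __name__ == "__main__":
    # Illustrative values only, not taken from the database.
    print(round(bmi(60.0, 170.0), 2))
```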

  11. F0 Data [Figure: f0 single digit f; f0 min, f0 max and f0 mean against age 13 to 19; axis 0 to 400]

  12. F0 Data [Figure, two panels: f0 single digit f and f0 single digit m; f0 min, f0 max and f0 mean against age 13 to 19; axis 0 to 400]

  13. F0 Data [Figure, four panels: f0 single digit f, f0 single digit m, f0 spelling geographical name m, f0 spelling geographical name f; f0 min, f0 max and f0 mean against age 13 to 19; axis 0 to 400]

  14. F0 Range • F0 abs = F0 max - F0 min • F0 rel = F0 max / F0 min • scale: absolute Hz scale; perception-based mel scale
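
The two range measures on slide 14 can be sketched as below, on both scales. The Hz-to-mel conversion 2595 · log10(1 + f/700) is an assumption (the slides do not say which mel formula was used), and `f0_ranges` is a hypothetical helper name.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel. Assumed formula; the slides do not name one."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def f0_ranges(f0_min_hz, f0_max_hz):
    """Absolute (max - min) and relative (max / min) f0 range in Hz and mel."""
    if f0_min_hz <= 0 or f0_max_hz < f0_min_hz:
        raise ValueError("expected 0 < f0_min_hz <= f0_max_hz")
    mel_min = hz_to_mel(f0_min_hz)
    mel_max = hz_to_mel(f0_max_hz)
    return {
        "abs_hz": f0_max_hz - f0_min_hz,
        "rel_hz": f0_max_hz / f0_min_hz,
        "abs_mel": mel_max - mel_min,
        "rel_mel": mel_max / mel_min,
    }


if __name__ == "__main__":
    # Illustrative values only, not taken from the database.
    print(f0_ranges(100.0, 200.0))
```

Under the assumed formula the mel scale compresses higher frequencies, so for the same pair of f0 values the relative range in mel comes out smaller than the relative range in Hz.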

  15. [Figure: F0 rel (mel), axis 0 to 3.5, by utterance class: digit, n. geographical, number, n. company, n. person, command, time, PIN code, date, sentence, telephone, sp. geographical, sp. arbitrary, mobile keys, sp. person, credit card, short text, long production]

  16. Outlook • use final release of the database • 864 speakers • refine analysis • re-compute F0 for phrases

  17. Summary • Ph@ttSessionz database • largest database for adolescent speakers • technology development and research • statistically reliable voice data for German • F0 variation dependent on utterance class

  18. Summary • Ph@ttSessionz database • largest database for adolescent speakers • technology development and research • statistically reliable voice data for German • F0 variation dependent on utterance class?
