Speaking under cover: The impact of face-concealing garments on the acoustics of fricatives.
Natalie Fecher
Language & Linguistic Science, University of York, York, UK
PhD supervisors: Dominic Watt, David van Leeuwen
IAFPA 2011, Vienna, Austria, 27th July 2011

Outline
• Background of the Project
• The ‘Face Cover’ Corpus
• Acoustic Fricative Study
• Conclusions and Outlook

Background

Forensic Speech Science
PhD Project on Multimodal Speech and Speaker Recognition
Audio-Visual Speech Processing

Joint processing and transformation of acoustic and facial information under qualitatively variable input:
Acoustic noise
• microphone type/placement
• acoustic environment
• channel characteristics
• complexity of the scenario
• ► face coverings
Visual noise
• lighting, occlusion, perspective
• image background
• resolution/compression
• appearance change
• ► face coverings
A / V / AV speech/speaker recognition by human perceptual or automatic system: identify, or verify claimed, content/identity
(on the basis of Aleksic & Katsaggelos, 2006)

Previous Research
Llamas/Harrison/Donnelly/Watt (2009)
• set of common confusions during bimodal presentation of AV stimuli
• sound transmission loss characteristics (TL) of 3 fabrics
Watt/Llamas/Harrison (2010)
• sound quality judgement of speech filtered with TL spectra
Zhang/Tan (2008)
• test of an ASR system with 10 types of voice disguise
• ‘masking’ amongst the 3 guises with the lowest similarity rate
Coniam (2005)
• impact of surgical masks in oral exams during SARS outbreak

‘Face Cover’ Corpus

Where?
High-quality audio/video recordings in a professional TV studio at the University of York.

Who?
10 British English speakers. Control for demographic, educational and language background (details in Fecher, 2011a/b).

Disguise?
Selection criteria:
• forensic relevance
• facial parts covered
• mask material
No voice disguise per se.

Disguise? (figure)

What?
Phonetically controlled stimuli
syllable structure: /C1VC2/ (existing English words excluded)
vowel: [ɑː] as in <father>
consonants: /p, t, k, b, d, g, f, s, ʃ, θ, v, z, ʒ, ð, m, n, ŋ, h/
syllable position: initial, final
carrier phrase: He said /stimulus/.
phonotactic rules: no /ŋ/ initial, no /h/ final
presentation: IPA, randomised
► 576 stimuli per speaker

How?
VIDEO: half-profile camera, 3 m
AUDIO: headband

How? (figure)

Fricative Study

Method
(Figure: FFT spectra of /s, ʃ, f, θ/, amplitude A [dB] against frequency f [Hz])
• /s ʃ f θ/ × 2 tokens × 2 syllable positions × 6 speakers × 8 disguise conditions
• less standardised analysis procedures for obstruents (see e.g. Haley et al., 2010; Maniwa et al., 2009; Jongman et al., 2000; Flipsen et al., 1999; Shadle & Mair, 1996; Tabain & Watson, 1996)
• no bandpass filter, no pre-emphasis (48kHz/16bit/PCM)
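The factorial design listed on this slide implies a token count that can be checked directly. A minimal sketch in Python: the guise labels are the ones that appear in the skewness/kurtosis plot later in the deck and are assumed here to be the 8 disguise conditions; the speaker IDs and the name `design` are placeholders, not the study's own identifiers.

```python
from itertools import product

fricatives = ["s", "ʃ", "f", "θ"]
tokens = [1, 2]                            # 2 tokens per item
positions = ["initial", "final"]           # syllable position
speakers = [f"S{i}" for i in range(1, 7)]  # placeholder IDs for the 6 speakers
# Labels as they appear in the skewness/kurtosis plot; assumed to be the 8 conditions.
guises = ["BAL", "CON", "HEL", "HOO", "NIQ", "RUB", "SUR", "TAP"]

# One tuple per analysed fricative token in the fully crossed design.
design = list(product(fricatives, tokens, positions, speakers, guises))
print(len(design))  # 4 * 2 * 2 * 6 * 8 = 768
```

The count assumes a fully crossed design with no missing or excluded tokens, which the slide does not state explicitly.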

Variables
• intensity
• peak
• spectral moments: CoG, variance, skewness, kurtosis
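The peak and the four spectral moments can be sketched as follows. This is an illustrative Python version, not the analysis script used in the study: it treats the power spectrum as a probability distribution over frequency, and it assumes the common convention of reporting excess kurtosis (raw kurtosis minus 3). The function names are hypothetical.

```python
import math

def peak_frequency(freqs, power):
    """Frequency of the spectral bin with the highest power."""
    return max(zip(power, freqs))[1]

def spectral_moments(freqs, power):
    """Return (centre of gravity, variance, skewness, excess kurtosis).

    The power values are normalised to sum to 1 and used as weights.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [p / total for p in power]
    cog = sum(f * w for f, w in zip(freqs, weights))
    var = sum((f - cog) ** 2 * w for f, w in zip(freqs, weights))
    if var == 0:
        raise ValueError("all energy is in a single frequency bin")
    sd = math.sqrt(var)
    skew = sum((f - cog) ** 3 * w for f, w in zip(freqs, weights)) / sd ** 3
    # Assumption: excess kurtosis, so a normal-shaped spectrum gives 0.
    kurt = sum((f - cog) ** 4 * w for f, w in zip(freqs, weights)) / sd ** 4 - 3.0
    return cog, var, skew, kurt

# Toy symmetric spectrum (made-up values, not data from the corpus).
toy_freqs = [1000.0, 2000.0, 3000.0]
toy_power = [1.0, 2.0, 1.0]
cog, var, skew, kurt = spectral_moments(toy_freqs, toy_power)
print(cog, var, skew, kurt)
```

A spectrum whose energy is concentrated at low frequencies with a long tail towards high frequencies gives positive skewness, and a sharply peaked spectrum gives high kurtosis.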

Intensity (figure: /ʃ, s, θ, f/)

Peak frequency (figure: /s, θ, f, ʃ/)

Centre of gravity (figure: /s, θ, f, ʃ/)

Skewness * kurtosis
(Figure: scatter plot of kurtosis (dimensionless) against skewness (dimensionless) for /s, ʃ, f, θ/, with one point per guise: BAL, CON, HEL, HOO, NIQ, RUB, SUR, TAP. Fits shown in the plot: r²=.68, p<.05; r²=.10, p=.44; r²=.67, p<.05; r²=.90, p<.001.)

Summary
• sound energy absorption dependent on mask material
• but: additional intensity variation due to the speakers’ individual compensation strategies
• overall stronger effects for the spectrally diffuse and low-energy non-sibilants /f, θ/ than for the sibilants /s, ʃ/
• more prone to energy absorption in higher frequency bands
• lower centre of gravity for most coverings
• highly variable peak frequencies
• positive correlation for skewness*kurtosis; for both measures same ranking of guises by size of effect (NIQ least, HEL most)

Conclusions

Speech production
Misarticulation: physiological and somatosensory effects, e.g. lip/nose contact, restricted jaw movement, skin stretching (Fuchs et al., 2010; Haley et al., 2010; Iskarous et al., 2008; Maniwa et al., 2009)
Articulatory compensation: e.g. increased vocal effort (Coniam, 2005; Sluijter et al., 1997), which may increase when auditory self-monitoring is impaired

Speech acoustics
Interdependence of physiological and physical events in the vocal tract
Acoustic damping effects: mask materials are assumed to act like a low-pass filter which attenuates energy in higher frequency bands (Watt et al., 2010; Llamas et al., 2009; Coniam, 2005)
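The low-pass assumption predicts the lower centre of gravity reported in the summary. A minimal Python sketch of that reasoning, under loudly stated assumptions: the first-order filter shape and the 4 kHz cutoff are hypothetical stand-ins, not measured transmission loss spectra, and the input is an idealised flat spectrum up to the 24 kHz Nyquist frequency of 48 kHz audio.

```python
def centre_of_gravity(freqs, power):
    """Power-weighted mean frequency of a spectrum."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)

def lowpass_power_gain(freq_hz, cutoff_hz):
    """Power response |H(f)|^2 of a first-order low-pass filter."""
    return 1.0 / (1.0 + (freq_hz / cutoff_hz) ** 2)

freqs = [100.0 * k for k in range(241)]  # 0 to 24000 Hz in 100 Hz steps
flat = [1.0] * len(freqs)                # idealised noise-like spectrum
cutoff_hz = 4000.0                       # hypothetical cutoff, for illustration only

# Attenuate each bin as the assumed mask "filter" would.
damped = [p * lowpass_power_gain(f, cutoff_hz) for f, p in zip(freqs, flat)]

cog_open = centre_of_gravity(freqs, flat)
cog_masked = centre_of_gravity(freqs, damped)
print(cog_open, cog_masked)  # the damped spectrum has the lower centre of gravity
```

Because the filter removes proportionally more energy at high frequencies, the weighted mean frequency shifts downwards, whatever the exact cutoff.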

Speech perception
Upcoming research: investigating speech intelligibility when the (visual) speech signal is impaired, i.e. when the mapping between acoustically distinct signals and perceptually consistent categories may be constrained due to
a) acoustic transmission loss caused by the mask material,
b) the auditory consequences of impaired speech production and acoustics,
c) impoverished visual (facial) speech cues.

References

Aleksic, P.S., Katsaggelos, A.K. 2006. Audio-visual biometrics. Proc. IEEE 94(11), 2025-44.
Coniam, D. 2005. The impact of wearing a face mask in a high-stakes oral examination: An exploratory post-SARS study in Hong Kong. Language Assessment Quarterly 2, 235-261.
Fecher, N. 2011a. Spectral properties of fricatives: a forensic approach. Proc. of the 4th ISCA Tutorial and Research Workshop on Experimental Linguistics, May 25-27, Paris, France, 71-74.
Fecher, N., Watt, D. 2011b. Speaking under cover: The effect of face-concealing garments on spectral properties of fricatives. Proc. of the 17th International Congress of Phonetic Sciences, Hong Kong, August 2011 (accepted).
Flipsen, P.Jr., Shriberg, L., Weismer, G., Karlsson, H., McSweeny, J. 1999. Acoustic characteristics of /s/ in adolescents. JSLHR 42, 663-677.
Fuchs, S., Weirich, M., Kroos, C., Fecher, N., Pape, D., Koppetsch, S. 2010. Time for a shave? Does facial hair interfere with visual speech intelligibility? In: Fuchs, S., Hoole, P., Mooshammer, C., Zygis, M. (eds.). Between the regular and the particular in speech and language. Frankfurt/M.: Peter Lang, 247-264.
Haley, K.L., Seelinger, E., Mandulak, K.C., Zajac, D.J. 2010. Evaluating the spectral distinction between sibilant fricatives through a speaker-centered approach. Journal of Phonetics 38(4), 548-554.
Iskarous, K., Shadle, C., Proctor, M. 2008. Evidence for the dynamic nature of fricative production: American English /s/. Proc. of the 8th Int. Seminar on Speech Production, Strasbourg, France, 405-408.
Jongman, A., Wayland, R., Wong, S. 2000. Acoustic characteristics of English fricatives. JASA 108(3), 1252-63.
Llamas, C., Harrison, P., Donnelly, D., Watt, D. 2009. Effects of different types of face coverings on speech acoustics and intelligibility. York Papers in Linguistics (Series 2) 9, 80-104.
Maniwa, K., Jongman, A., Wade, T. 2009. Acoustic characteristics of clearly spoken English fricatives. JASA 125(6), 3962-73.
Shadle, C., Mair, S.J. 1996. Quantifying spectral characteristics of fricatives. Proc. of Interspeech, Philadelphia, 1521-24.
Sluijter, A.M.C., van Heuven, V.J., Pacilly, J.J.A. 1997. Spectral balance as a cue in the perception of linguistic stress. JASA 101(1), 503-513.
Tabain, M., Watson, C. 1996. Classification of fricatives. Proc. 6th Aust. Int. Conf. Speech Sci. Technol., Adelaide, 623-628.
Watt, D., Llamas, C., Harrison, P. 2010. Differences in perceived sound quality between speech recordings filtered using transmission loss spectra of selected fabrics. Talk given at the IAFPA Conference 2010, Trier, Germany.
Zhang, C., Tan, T. 2008. Voice disguise and automatic speaker recognition. Forensic Science International 175(2-3), 118-122.
