Telephone Based Automatic Voice Pathology Assessment
Rosalyn Moran 1, R. B. Reilly 1, P. D. Lacy 2
1 Department of Electronic and Electrical Engineering, University College Dublin, Ireland
2 Royal Victoria Eye and Ear Hospital (RVEEH), Dublin, Ireland
AAAI Fall Symposium, Dialogue Systems for Health Communication, Washington DC, Oct 2004.
D.S.P. Research Group, University College Dublin

To Follow
• Why a system for voice disease?
• The Anatomy of Voice
• Generic Voice Classifiers
• The next obvious step
• System Design
• Results
• Refinement

Why is this area important?
• Voice disorders are relatively common in the general population
  - 5% suffering abnormalities requiring medical intervention
  - Cancerous tumors of the vocal folds account for 40% of all head and neck carcinomas
  - At risk professionals: logistically difficult to monitor
• Currently, accurate diagnosis requires visualisation of the larynx
  - Videostroboscopy is the current gold standard: costly, time consuming, often subjective and labour intensive

Anatomy of the Vocal Folds: Healthy Waves
• Vagus nerve activates fold closure
• Air pressure from the lungs forces the folds apart
• For short "voiced" phonation (long vowels) the folds move periodically: the mucosal wave

Vocal Fold Pathologies, Unhealthy Waves
• Structural (growths)
• Neurological (loss of effective nerve action)
• Lack of constancy
• Escaping air due to incomplete fold closure

Generic voice pathology classifier
Acquisition → Feature Extraction → Classifier → Normal / Abnormal
• Acquisition: high quality speech, 25kHz, sound proof chamber; sustained phonation of vowel sound /a/
• Feature extraction: measures of vocalisation constancy (pitch, amplitude, noise)
• Classifier: various (HMMs, ANNs, LDA)
• Language independent!
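
Constancy of pitch and amplitude is often quantified with cycle-to-cycle perturbation measures, commonly called jitter (period) and shimmer (peak amplitude). The sketch below shows one widely used "local" definition: the mean absolute difference between consecutive cycles, expressed as a percentage of the mean. It is an illustration only. The function names and the sample cycle values are hypothetical, and the slides do not say which perturbation definitions were actually used.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, as a % of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_value = sum(values) / len(values)
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / mean_value


def jitter_percent(periods):
    """Local jitter: perturbation of successive pitch periods (seconds)."""
    return local_perturbation(periods)


def shimmer_percent(amplitudes):
    """Local shimmer: perturbation of successive cycle peak amplitudes."""
    return local_perturbation(amplitudes)


# Hypothetical cycle measurements for a sustained /a/ (not taken from the slides).
steady_periods = [0.0080] * 5
unsteady_periods = [0.0080, 0.0084, 0.0078, 0.0083, 0.0079]

print(jitter_percent(steady_periods))    # perfectly periodic voice gives zero jitter
print(jitter_percent(unsteady_periods))  # irregular periods give a larger value
```

A perfectly periodic signal scores zero on both measures, so larger values indicate the lack of constancy listed among the unhealthy-wave characteristics.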

Where have we come to date?
• Successful automatic classifiers: Acquisition → Feature Extraction → Classifier → Normal / Abnormal
• Classification rate in excess of 90% for separating normal from pathological voice
  - Godino-Llorente, P. Gomez-Vilda, IEEE Transactions on Biomedical Engineering
  - C. Maguire, P. de Chazal, R. Reilly, P.D. Lacy, World Congress on Medical Physics and Biomedical Engineering
Motivation: Can we make this more useful? Could you use a telephone …

Under Investigation
Acquisition: IVR → Feature Extraction → Classifier → Normal / Abnormal
• New method of acquisition employing IVRs to allow transfer of data across telephone networks and the internet
  - Remote
  - Secure
  - Identifiable
System Infrastructure: VoiceXML

An intelligent dialogue system
• DSP: incorporation of digital signal processing algorithms
• Comms: transmission characteristics

VoiceXML Acquisition
• VoiceXML scripts held on a web server.
• Transferred to VoiceXML Gateway Voxpilot for TTS and speaker recognition.
• Dial up applications using any telephone.

Database
• Disordered Voice Database Model 4337, Massachusetts Eye and Ear Infirmary
• 631 valid patient samples of sustained phonation of /a/
  - Wide variety of pathologies condensed to normal / abnormal in this study
  - Prelabelled by panel of experts
  - Recorded in soundproof environment using a high quality microphone

Corrupting The Corpus
• To identify causes of information loss
• Imitate telephone conditions by progressively degrading the quality of the database.
• Examine feature accuracies at each stage

Creating 5 Test Corpora
1. Begin: 631 High Quality Speech Files @ 10kHz
2. Degrade: Resample to 8kHz
3. Degrade: Bandpass filtered from 100Hz-3.2kHz
4. Degrade: Add Noise
5. Transmit: Original Database

Transmission Channels: Analog and Digital Long Distance Links
1. VoiceXML calling application: plays 30 speech files
2. VoiceXML application: "answer" and save transmitted speech files (Corpus 5)
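
The "Add Noise" degradation step can be sketched as follows, assuming additive white Gaussian noise scaled to a target signal-to-noise ratio. The 30 dB figure is the one quoted on the noise-corrupted results chart. The noise type, the function names and the 200 Hz test tone are assumptions made for illustration, not details given in the slides. The resampling and bandpass steps would normally rely on a signal-processing library and are not shown.

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def add_noise_at_snr(samples, snr_db, seed=0):
    """Return samples plus white Gaussian noise scaled to the requested SNR (dB)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in samples]
    # SNR(dB) = 10 * log10(P_signal / P_noise), so P_noise = P_signal / 10^(SNR/10).
    target_noise_power = signal_power(samples) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / signal_power(noise))
    return [s + scale * n for s, n in zip(samples, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered by comparing a noisy signal with its clean original."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))


# Hypothetical stand-in for a speech file: one second of a 200 Hz tone at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]
noisy = add_noise_at_snr(tone, snr_db=30.0)
print(round(measured_snr_db(tone, noisy), 3))
```

Scaling the noise by its realised power, rather than its nominal variance, makes each degraded file hit the target SNR rather than only approximating it on average.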

Features Features Features!
Acquisition → Feature Extraction → Classifier → Normal / Abnormal
Of medical relevance, in conjunction with our medical consultants:
- Pitch Perturbation Features, Jitter (12)
- Amplitude Perturbation Features, Shimmer (12)
- Energy Measures, Harmonic to Noise Ratio HNR (11)

Classifier / Performance Estimation
Acquisition → Feature Extraction → Classifier → Normal / Abnormal
• Classifier
  - Linear Discriminant Analysis
• Performance Estimation
  - Normal recordings duplicated to balance classes
  - 10 runs of 10 fold cross-validation: independent training and testing sets
  - Specificity, sensitivity, predictivities, accuracy
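
The performance estimation scheme above can be sketched as follows: repeated shuffled k-fold index splits, plus the listed metrics computed from confusion counts. This is a minimal illustration under stated assumptions. Treating "abnormal" as the positive class (label 1) is an assumption, all function names are hypothetical, and neither the LDA classifier nor the class-balancing step is shown.

```python
import random


def confusion_counts(y_true, y_pred):
    """Count outcomes, treating 1 (abnormal) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def performance(y_true, y_pred):
    """Metrics from confusion counts; assumes every denominator is non-zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictivity": tp / (tp + fp),
        "negative_predictivity": tn / (tn + fn),
        "accuracy": (tp + tn) / len(y_true),
    }


def repeated_kfold(n_samples, k=10, runs=10, seed=0):
    """Yield (train_idx, test_idx) for `runs` independently shuffled k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(runs):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield train_idx, test_idx


# Hypothetical labels and predictions (1 = abnormal, 0 = normal).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
print(performance(y_true, y_pred))
```

Each of the 10 runs reshuffles the data before splitting, so the reported figures average over 100 train/test partitions in which no test index appears in its own training set.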

Classification Performance

Composite Feature Breakdown
[Figure: four bar charts of classification accuracy (%) for the Jitter, Shimmer and HNR feature groups. Panels: Clean 10kHz; Bandlimited (Sampling Rate 8kHz); Filtered 100Hz-3200Hz; Noise Corrupted (30dB SNR). Bar labels, first two panels: 77.97, 79.79, 77.1, 77.8, 68.93, 66.09. Bar labels, last two panels: 75.63, 75.7, 74.85, 66, 63.66, 57.16.]

Composite Feature Breakdown: Telephone Corpus
[Figure: bar chart of classification accuracy (%) for the Jitter, Shimmer and HNR feature groups. Bar labels: 73.03, 64.7, 57.85.]
• Shimmer group proving most robust.
• HNR accuracies fall significantly.

Moving On
• Classification rate of 74% for separating normal from pathological voice, over the telephone.
• Further refinement: homogeneous data sets
  - Physical
  - Neuromuscular
  - Mixed

Physical Pathology
• Occurs when the function of the larynx has been affected by a physical change in the anatomy of the larynx. For example, an arytenoid granuloma.

Neuromuscular Pathology
• Occurs when the nerves that control the movement of the muscles in the larynx have been altered in some way. An instance of this is vocal fold paralysis.

Telephone Based Results: Improved Accuracy
Accuracy by pathology group:
- Neuromuscular: 87%
- Physical: 78%
- Mixed: 61%

Wider Impact for Healthcare
Opportunities for providing related health care information by voice applications.
• Voice assessment
• Speech training
• Improving literacy

Thank you
How is your voice?
rosalyn.moran@ee.ucd.ie