Spoken Language Biomarkers for Detecting Cognitive Impairment

Spoken Language Biomarkers for Detecting Cognitive Impairment. Tuka Alhanai. Advisor: James Glass. Spoken Language Systems Group, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology. 3rd May 2018. tuka@mit.edu


  1. Title slide: Spoken Language Biomarkers for Detecting Cognitive Impairment. Tuka Alhanai (tuka@mit.edu, talhanai, talhanai.com). Advisor: James Glass. Spoken Language Systems Group, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology. 3rd May 2018.

  3. Objective: Automatically detect cognitive conditions using spoken language.

  4. Cognitive impairment
      Definition: a decline in mental abilities that is severe enough to interfere with daily life.
      • Alzheimer’s
      • Vascular Dementia
      • Lewy Body Dementia

  5. Cognitive impairment
      Definition: a decline in mental abilities that is severe enough to interfere with daily life.
      • 2nd to spinal cord injuries in terms of its debilitating effects. [WHO (2003)]
      • $200B in expenditure in USA. [Alzheimer’s Association (2015)] Equivalent value as: [figure not preserved]

  6. Why detect it?

  7. [Figure: cognitive function and pathological load across the stages Normal, MCI, and Dementia. Nestor et al. 2004]

  8. Plan
      • Hospital in the home: 35% hosp. visits. 4% mortality rates.
      • 50% suffer from depression.
      • $80K a year

  9. Lifestyle
      • Delay onset by 7 months
      • Delay onset by 4 years
      • Delay onset by 2 months
      • 3 times a week, 45% lower risk
      • Fish meal a week, 70% lower risk

  10. Prevention (Vascular, Lewy Body, Parkinson’s, Alzheimer’s)
      • SIRT3 protein protects brain cells against degeneration.
      • AD alone less damaging than mixed pathologies.
      • Non-steroidal anti-inflammatory drugs lower risk.

  12. Data: Audio recordings of neuropsychological exams at the Framingham Heart Study.

  14. Framingham Heart Study
      • since 1948
      • 15,000+ subjects
      • recording audio since 2006
      • neuropsychological exams

  15. Outcome
      • recall details
      • describe scene
      • recall verbal pair associates

  16. Outcome (reviewed)
      • severity
      • onset
      • cause

  17. Study: 92 subjects (21 impaired)

  18. Data statistics
      • N Subjects: 92
      • N Impaired: 21 (22.8%)
      • Age: 68 years (+/- 17)
      • Gender: 47 male (51 female)
      • Duration: 65 minutes (+/- 18)
      • Vocabulary Size: 527 words (+/- 181)
      • Transcript Size: 2,496 words (+/- 1,508)

  19. Outcome of interest
      • Binary cognitive impairment
      • According to dementia review panel assessment
      • Pathology: 14 Alzheimer’s, 5 Vascular Dementia
      • Severity: 10 < mild, 6 mild, 5 moderate

  20. Assessment
      • AUC: Area Under the Receiver Operating Characteristic (ROC) Curve
      • TPR: True Positive Rate
      • FPR: False Positive Rate
      • HL-test: Hosmer-Lemeshow Test for statistical calibration
      • LOOCV: Leave-one-out cross-validation
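
The ranking metrics on this slide can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code: the function names (`auc`, `tpr_at_fpr`, `loocv_splits`) and the toy labels and scores are invented for illustration. AUC is computed here through its rank interpretation (the fraction of impaired/unimpaired pairs that the score orders correctly), and TPR is reported at the best threshold whose FPR does not exceed 10%.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


def tpr_at_fpr(labels, scores, max_fpr=0.10):
    """Highest true positive rate among thresholds with FPR <= max_fpr."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best


def loocv_splits(n):
    """Yield (train_indices, held_out_index) pairs for leave-one-out CV."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


# Toy data, invented purely for illustration (1 = impaired).
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05]
print("AUC:", auc(labels, scores))
print("TPR @ FPR 10%:", tpr_at_fpr(labels, scores))
```

In a leave-one-out setup, each subject's score would come from a model trained on the other subjects, and the metrics would then be computed once over all held-out scores.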

  21. Modeling

  23. Inside the box
      Models:
      • Support vector machine (SVM)
      • Discriminant analysis
      • Decision tree
      • K-nearest neighbor
      • Logistic regression (interpretable, best performing)

  24. Baseline model
      • Output: binary cognitive impairment
      • Model: logistic regression
      • Features: age, education (high school, some college, college), employment (part-time, never, retired, volunteer, unemployed, other, disability), gender (female)
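
A demographic baseline of this kind can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the omitted reference level of each categorical feature (written as `"reference"` below, and `"male"` for gender), the centring of age, the plain gradient-descent fit, and the toy subjects are all assumptions made so the example is self-contained.

```python
import math

# Levels read off the slide's coefficient labels. The first entry of each list
# is the omitted reference level, which is an assumption in this sketch.
LEVELS = {
    "education": ["reference", "high school", "some college", "college"],
    "employment": ["reference", "part-time", "never", "retired", "volunteer",
                   "unemployed", "other", "disability"],
    "gender": ["male", "female"],
}


def encode(subject):
    """Intercept, centred age in decades, then one-hot indicators per category."""
    row = [1.0, (subject["age"] - 70.0) / 10.0]
    for name, levels in LEVELS.items():
        row += [1.0 if subject[name] == level else 0.0 for level in levels[1:]]
    return row


def predict(weights, row):
    """Logistic function of the linear score."""
    z = sum(w * x for w, x in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, X, y):
    total = 0.0
    for row, label in zip(X, y):
        p = predict(weights, row)
        total -= math.log(p) if label == 1 else math.log(1.0 - p)
    return total / len(X)


def fit(X, y, lr=0.1, steps=2000):
    """Batch gradient descent on the mean log-loss, starting from zero weights."""
    weights = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for row, label in zip(X, y):
            err = predict(weights, row) - label
            for j, x in enumerate(row):
                grad[j] += err * x
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


# Toy subjects, invented purely for illustration (1 = impaired).
subjects = [
    {"age": 62, "education": "college", "employment": "reference", "gender": "female"},
    {"age": 65, "education": "high school", "employment": "part-time", "gender": "male"},
    {"age": 68, "education": "some college", "employment": "retired", "gender": "female"},
    {"age": 72, "education": "college", "employment": "retired", "gender": "male"},
    {"age": 75, "education": "high school", "employment": "retired", "gender": "male"},
    {"age": 79, "education": "some college", "employment": "volunteer", "gender": "female"},
    {"age": 83, "education": "high school", "employment": "disability", "gender": "male"},
    {"age": 86, "education": "reference", "employment": "unemployed", "gender": "female"},
]
y = [0, 0, 0, 0, 1, 0, 1, 1]
X = [encode(s) for s in subjects]
weights = fit(X, y)
print("training log-loss:", log_loss(weights, X, y))
```

Each fitted weight plays the role of one bar in a coefficient plot: a positive weight raises the predicted probability of impairment relative to the reference level.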

  29. [Figure: baseline model coefficients for Age (age), Education (high school, some college, college), Employment (part-time, never, retired, volunteer, unemployed, other, disability), and Gender (female); six coefficients are starred, five with *** and one with *. Modeled on N = 6,258, evaluated on the 92 subjects.]
      More likely with:
      • Increasing age
      • Less education
      • Less employment
      • Male

  30. Audio pre-processing

  31. Extracting features

  33. Modeling

  34. Features (inputs)
      • Pitch
      • Jitter
      • Spectral Energy
      • Shimmer
      • RMS Energy
      • Segment Duration
      • Speaking Rate
      • Question Mark
      • # Words
      • Lexical Overlap
      • Language Perplexity
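
Several of the text-side inputs on this slide can be computed directly from a time-aligned transcript. The sketch below is hypothetical: the `(start_sec, end_sec, text)` segment format, the tokenizer, and the definition of lexical overlap as the mean Jaccard similarity between consecutive segments are assumptions, not the definitions used in the study. Language perplexity is left out because it requires a trained language model, and the acoustic features (pitch, jitter, shimmer, energy) require signal processing on the audio itself.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,?!;:\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def text_features(segments):
    """segments: list of (start_sec, end_sec, text) tuples for one speaker."""
    tokens = [tokenize(text) for _, _, text in segments]
    durations = [end - start for start, end, _ in segments]
    n_words = sum(len(t) for t in tokens)

    # Assumed definition: Jaccard overlap between consecutive segments.
    overlaps = []
    for prev, cur in zip(tokens, tokens[1:]):
        a, b = set(prev), set(cur)
        if a | b:
            overlaps.append(len(a & b) / len(a | b))

    return {
        "n_words": n_words,
        "vocabulary_size": len({w for t in tokens for w in t}),
        "mean_segment_duration": sum(durations) / len(durations),
        "speaking_rate": n_words / sum(durations),  # words per second of speech
        "question_marks": sum(text.count("?") for _, _, text in segments),
        "lexical_overlap": sum(overlaps) / len(overlaps) if overlaps else 0.0,
    }


# Toy transcript, invented purely for illustration.
segments = [
    (0.0, 2.0, "Can you repeat the story?"),
    (2.5, 6.5, "The woman lost the purse in the store."),
    (7.0, 9.0, "The purse, I think."),
]
features = text_features(segments)
print(features)
```

Each recording then contributes one such feature dictionary, which can be flattened into a row of model inputs.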

  39. [Figure: model coefficients for the Spectral Energy, Prosody, and Text feature groups.]
      Highlighted patterns:
      • Monotonous voice
      • Hesitation
      • Limited response

  40. Results

      | Features | AUC | TPR @ FPR 10% | HL-test |
      | --- | --- | --- | --- |
      | Text | 0.69 | 0.14 | > 0.05 |
      | Demographic | 0.79 | 0.38 | < 0.05 |
      | Audio | 0.90 | 0.71 | > 0.05 |
      | Text + Audio | 0.92 | 0.76 | > 0.05 |

      • Text + Audio best performing (better than demographic)
      • Text + Audio also has best recall rate
      • Best performing model is well-calibrated
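
The HL-test column is read here as a p-value, which is the usual convention but an assumption about this slide: a value above 0.05 means the test finds no evidence of miscalibration. A minimal sketch of the Hosmer-Lemeshow statistic follows. The function name, the equal-size grouping by sorted probability, and the toy data are illustrative choices, and the closed-form chi-square tail used for the p-value only holds when `groups - 2` is even.

```python
import math


def hosmer_lemeshow(labels, probs, groups=10):
    """Return (statistic, p_value) comparing observed and expected event counts.

    Subjects are sorted by predicted probability and cut into `groups` bins.
    Assumes every bin is non-empty and its expected count is strictly
    between 0 and the bin size.
    """
    dof = groups - 2
    if dof < 2 or dof % 2:
        raise ValueError("this sketch needs an even number of groups, at least 4")

    order = sorted(range(len(probs)), key=lambda i: probs[i])
    stat = 0.0
    for g in range(groups):
        idx = order[g * len(order) // groups:(g + 1) * len(order) // groups]
        n = len(idx)
        expected = sum(probs[i] for i in idx)
        observed = sum(labels[i] for i in idx)
        # Variance term n * p_bar * (1 - p_bar), with p_bar = expected / n.
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / n))

    # Chi-square survival function for even degrees of freedom (dof = 2m):
    # exp(-x/2) * sum_{k < m} (x/2)^k / k!
    half = stat / 2.0
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k)
                                    for k in range(dof // 2))
    return stat, p_value


# Toy predictions, invented purely for illustration.
probs = [0.25] * 4 + [0.5] * 8 + [0.75] * 4
calibrated = [1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0]
miscalibrated = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0]
print(hosmer_lemeshow(calibrated, probs, groups=4))
print(hosmer_lemeshow(miscalibrated, probs, groups=4))
```

In the toy data, the first label set has exactly as many events per bin as the probabilities predict, while the second swaps the event rates of the lowest and highest bins.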

  41. Conclusion
      • A method to quantify speech patterns to model cognitive impairment.
      • Utilize findings without formally deploying the model.
      • Don’t necessarily need to know exam structure.

  42. Future Work
      • 5,000+ subjects
      • 7,000+ audio recordings

  43. Details in Publication
      “Spoken Language Biomarkers for Detecting Cognitive Impairment,” T. Alhanai, R. Au, and J. Glass, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), December 2017.
      [Paper]: https://groups.csail.mit.edu/sls/publications/2017/ASRU17_alhanai.pdf
      [Source Code]: https://github.com/talhanai/asru2017-method.git
