Algorithms for Dysfluency Detection in Symbolic Sequences using Suffix Arrays


1. Algorithms for Dysfluency Detection in Symbolic Sequences using Suffix Arrays
J. Pálfy 1,2, J. Pospíchal 1
1 Slovak University of Technology, Faculty of Informatics and Information Technologies, Bratislava, Slovakia
2 Slovak Academy of Sciences, Institute of Informatics, Bratislava, Slovakia
Text, Speech and Dialogue, September 3, 2013

2. Overview
◮ Introduction to Dysfluencies
◮ Motivation in Dysfluent Speech Recognition
◮ Common Approach & Problem with “Complex” Dysfluencies
◮ Methodology
◮ Results
◮ Conclusion

3. Introduction to Dysfluencies
◮ Dysfluencies are disruptions or breaks in the smooth flow of speech. (Shipley & McAfee, 1998)
◮ Unlike read speech, spontaneous speech contains high rates of disfluencies. (Shriberg, 1994)

4. Understanding Different Types of Speech Disfluencies
“Normal” disfluencies:
◮ Hesitations (pauses)
◮ Interjections (um, uh, er)
◮ Revisions (“I want- I need that”)
◮ Repetitions of phrases (“I want- I want that”)
◮ Repetitions of multisyllabic whole words (“mommy- mommy- mommy let’s go.”)
◮ Repetitions of monosyllabic whole words (“I-I-I want to go.”)
“Stuttered” disfluencies:
◮ Repetitions of sounds or syllables (“li-li-like this”)
◮ Prolongations (“llllllike this”)
◮ Blocks (“l---ike this”)
Reactions along the continuum from “normal” to “stuttered”:
◮ Disfluencies occur more frequently
◮ Tension or struggle increases
◮ Duration (length) of disfluencies increases
◮ Tension during “normal” disfluencies
NOTE: “Normal” disfluencies can be used to avoid or postpone stuttering (e.g., “I um, you know, uh, I want to um, g-g-g-o with you.”)
From Yaruss & Reardon (2006), Young Children Who Stutter: Information and Support for Parents. New York: National Stuttering Association (NSA).

5. Motivation in Dysfluent Speech Recognition
Dysfluent speech recognition:
◮ Speech Language Pathology (SLP) - automatic & objective evaluation, e.g. an analysis tool
◮ Automatic Speech Recognition (ASR) - improve the accuracy, e.g. a dysfluency module

6. Problem with Dysfluencies
◮ ASR systems are built on the statistical distribution of atomic parts of speech
◮ the sparse regularity of dysfluencies makes it hard to design ASR models (like Hidden Markov Models, HMM) for them
◮ ASR complexity: every transition between states that can occur during dysfluent events would have to be defined

7. Conventions
In our work we used the following convention:
◮ “simple” dysfluencies - e.g. part-word/syllable repetitions (R1), prolongations (P); already studied in many works
  e.g. P: rrrun; R1: re re research
◮ “complex” dysfluencies - a chaotic mixture of dysfluent events (e.g. repetition of a phrase, prolongation combined with hesitation & repetition); frequent in stutterers’ speech
  e.g. I do my, I do my work; j j j jer j j jer ja just

8. Common Approach & Problem with “Complex” Dysfluencies
Common approach:
◮ fix a window (e.g. 200 - 800 ms)
◮ build a dysfluency recognition system (e.g. Artificial Neural Networks, Support Vector Machines)
◮ recognize the “simple” dysfluent events in the fixed interval
Problem:
◮ dysfluencies frequently do not fit the fixed window, but are dynamically distributed throughout much longer 2 - 4 s intervals
◮ how to choose the right window size for “complex” dysfluencies?

9. Our Methodology
◮ our solution: combine & apply methods from other fields of science
◮ Speech Language Pathology - knowledge of dysfluencies
◮ Data Mining - mining time series, Symbolic Aggregate Approximation (SAX)
◮ Bioinformatics - sequence (DNA) analysis, Suffix Arrays
[Figure: speech waveform → feature vector → Alg. 1-2 → sequence analysis, linking SLP, data mining, and bioinformatics]

10. Methodology: Corpus
◮ University College London Archive of Stuttered Speech (UCLASS)
◮ Howell, Huckvale, 2004: ~500 recordings, 16 - 44.1 kHz, 2 - 15 min playing time, age 8 - 47 years, male/female
◮ Howell, Davis, Bartrip, 2009: 12 selected recordings, working set from UCLASS
◮ we annotated & used a subset of this working set, 22.05 kHz, 19:32 min playing time

11. Methodology: Feature Extraction (PAA, SAX)
◮ speech, 22.05 kHz
◮ short-time energy: X = x_1, …, x_N (1)
◮ Piecewise Aggregate Approximation (PAA): X̄ = x̄_1, …, x̄_n, where
  x̄_i = (n/N) Σ_{j=(N/n)(i−1)+1}^{(N/n)i} x_j (2)
◮ breakpoints: B = β_1, …, β_{a−1} (3)
◮ Symbolic Aggregate Approximation (SAX): Ŵ = ŵ_1, …, ŵ_m (4)
◮ mapping X̄ → Ŵ: ŵ_i = a_j iff β_{j−1} < x̄_i ≤ β_j (5)
[Figure: speech → short-time energy → PAA → SAX; lexical content of the example: “c can c c can”]
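A minimal sketch of this symbolization step, assuming a SAX alphabet of size 4 with the standard Gaussian breakpoints and that n divides N; the function names are illustrative, not the authors' code:

```python
# Standard SAX breakpoints for alphabet size 4 (equiprobable regions of N(0,1)).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]

def paa(x, n):
    """Piecewise Aggregate Approximation: average N samples down to n frame means.
    Assumes n divides len(x) for simplicity."""
    w = len(x) // n
    return [sum(x[i * w:(i + 1) * w]) / w for i in range(n)]

def sax(x, n, alphabet="abcd"):
    """Map a numeric series to a SAX word of length n: z-normalize, PAA,
    then pick the symbol of the region beta_{j-1} < x_bar <= beta_j."""
    mu = sum(x) / len(x)
    sd = (sum((v - mu) ** 2 for v in x) / len(x)) ** 0.5 or 1.0  # guard flat input
    bars = paa([(v - mu) / sd for v in x], n)
    # Count how many breakpoints each PAA mean exceeds -> alphabet index j.
    return "".join(alphabet[sum(bar > b for b in BREAKPOINTS)] for bar in bars)

print(sax([0, 0, 0, 0, 10, 10, 10, 10], 2))  # → "ad"
```

The low first half maps to the lowest symbol and the high second half to the highest, which is the behavior Eq. (5) describes.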

12. Methodology: Data Structure, Suffix Arrays
◮ large sequence C = c_0 c_1 … c_{N−1}
◮ suffix of C: C_i = c_i c_{i+1} … c_{N−1}
◮ lexicographically sorted array Pos
◮ Pos[k] is the k-th smallest suffix in the set C_0, C_1, …, C_{N−1}
◮ assuming Pos is given, then C_{Pos[0]} < C_{Pos[1]} < … < C_{Pos[N−1]}, where ‘<’ denotes the lexicographic order
Example, C = processing$ (positions 1 … 11):
   i   Pos[i]   C[Pos[i] … n]
   1   11       $
   2    4       cessing$
   3    5       essing$
   4   10       g$
   5    8       ing$
   6    9       ng$
   7    3       ocessing$
   8    1       processing$
   9    2       rocessing$
  10    7       sing$
  11    6       ssing$
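The slide's table can be reproduced with a naive construction (1-based positions to match the table; a production implementation would use an O(N log N) algorithm instead of sorting full suffixes):

```python
def suffix_array(c):
    """Build Pos for sequence c: Pos[k] is the 1-based start position of the
    k-th lexicographically smallest suffix. Naive O(N^2 log N) sketch."""
    return sorted(range(1, len(c) + 1), key=lambda i: c[i - 1:])

C = "processing$"
pos = suffix_array(C)
print(pos)  # → [11, 4, 5, 10, 8, 9, 3, 1, 2, 7, 6], the table above
```

The sentinel `$` sorts before every letter, so the empty-ish suffix `$` comes first, exactly as in row i = 1 of the table.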

13. Methodology: Our Derived Functions
◮ prolongations are characterized by a minimal difference between neighboring frames
◮ functions from video segmentation were adapted for speech
x = x_1, …, x_N,  y = y_1, …, y_N (6)
D(x, y) = (1/N) Σ_{i=1}^{N} |x_i − y_i| (7)
D_b(x) = Σ_{i=1}^{b} D(x_i, x_{i+l}) (8)
D_h(H_x) = Σ_{i=1}^{h} D(H_x(i), H_x(i+l)) (9)
[Figure: speech waveform, wideband spectrogram, and prolongation detection functions D, D_b, D_h, D_g over time; lexical content: “personal s:eedee player”]
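In plain Python, Eqs. (7)-(8) amount to the following (0-based frame indexing here, and passing b and l as explicit parameters is an assumption; the slide fixes them implicitly). A prolongation shows up as D values near zero between neighboring frames:

```python
def D(x, y):
    """Eq. (7): mean absolute difference between two equal-length frames."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y)) / len(x)

def Db(frames, b, l):
    """Eq. (8): sum of D over b frame pairs separated by a lag of l frames.
    `frames` is a list of feature vectors (e.g. short-time energy frames)."""
    return sum(D(frames[i], frames[i + l]) for i in range(b))

# Identical neighboring frames (a prolongation-like stretch) give D == 0.
print(D([1, 2, 3], [1, 2, 3]))  # → 0.0
```
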

14. Methodology: Our Developed Algorithms
◮ Alg. 1 - for speech pattern searching
◮ Alg. 2 - for searching repeated patterns (repetitions) in speech
◮ P is a short sequence, C is a long sequence, s is a shift, l is the length of C
1: while i < n do                    ⊲ Begin Alg. 2: in the i-th window, the 1st block is set to P, the remaining blocks are put to C
2:   Compute Pos for P.              ⊲ Pos is a suffix array
3:   With Pos construct Tab for P.   ⊲ Tab is a look-up table
4:   while s < l do                  ⊲ Begin Alg. 1
5:     Use Tab to query C in P.
6:     Save pattern’s position and pattern’s length.
7:   end while                       ⊲ End Alg. 1
8: end while                         ⊲ End Alg. 2
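The look-up table Tab and the exact windowing are not fully recoverable from the slide, so the sketch below only illustrates the inner loop: binary search over P's suffix array in place of Tab, shifting s along C and saving (position, length) of chunks of C found in P. The names `occurrences`, `alg1`, `min_len`, and the greedy growing rule are assumptions:

```python
from bisect import bisect_left, bisect_right

def occurrences(p, pos, q):
    """Start positions of pattern q inside p, by binary search on p's sorted
    suffixes (the suffix-array query behind line 5 of the pseudocode)."""
    suffixes = [p[i:] for i in pos]            # materialized here; O(n^2), sketch only
    lo = bisect_left(suffixes, q)
    hi = bisect_right(suffixes, q + "\uffff")  # upper bound: everything prefixed by q
    return sorted(pos[lo:hi])

def alg1(p, c, min_len=2):
    """Shift s along the long sequence C and record (position, length) of
    chunks of C that occur in the short sequence P."""
    pos = sorted(range(len(p)), key=lambda i: p[i:])  # Pos, suffix array of P
    hits, s = [], 0
    while s < len(c):
        length = 0
        # Greedily grow the longest substring of C starting at s found in P.
        while s + length < len(c) and occurrences(p, pos, c[s:s + length + 1]):
            length += 1
        if length >= min_len:
            hits.append((s, length))
        s += max(length, 1)
    return hits

print(alg1("abc", "xxabcx"))  # → [(2, 3)]: "abc" found at shift 2, length 3
```

This inverts the usual database setting noted later in the deck: the index is built over the short query-side sequence P, and the long sequence C is streamed against it.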

15. Methodology: Our Features for “Complex” Dysfluencies
For every 5 s long interval, 3 features of 100 ms blocks were computed:
◮ patterns’ average redundancy
◮ patterns’ relative frequency
◮ patterns’ redundancies sum
[Figure: iterative output of the algorithms - overlapping windows of numbered blocks (1-10), evaluated by columns and by rows]
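The slide names the three features but does not define them, so the formulas below are assumptions sketched for illustration only: `counts` maps each repeated pattern found in a 5 s interval to its number of occurrences among the interval's 100 ms blocks, and redundancy is taken as the number of extra copies beyond the first.

```python
def pattern_features(counts, n_blocks):
    """Hedged sketch of the three per-interval features; the exact
    definitions in the paper may differ."""
    redundancies = [c - 1 for c in counts.values() if c > 1]   # extra copies
    red_sum = sum(redundancies)                                # redundancies sum
    avg_red = red_sum / len(redundancies) if redundancies else 0.0
    rel_freq = sum(counts.values()) / n_blocks if n_blocks else 0.0
    return avg_red, rel_freq, red_sum

# A "c can c c can"-style interval: one pattern seen 3x, one 2x, one 1x.
print(pattern_features({"ab": 3, "cd": 2, "ef": 1}, 10))  # → (1.5, 0.6, 3)
```
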

16. Methodology: Main Steps in Running Algorithms
◮ Alg. 1-2 based on SAX - Symrep
◮ in relational DBs, a short query is executed over a large set of data
◮ Alg. 1 - opposite to relational DBs: query a long sequence C in a short sequence P
◮ Alg. 2 - capability to adapt to an unknown repeated speech pattern length
◮ DTW based on MFCC - Specrep

17. Results: Statistical Analysis
Process of classifier design:
◮ measurement of data class separability - correlation
◮ study of data characteristics - Mann-Whitney U-test
Compared features:
◮ Specrep - DTW on the basis of MFCC features
◮ Symrep - our developed algorithms on the basis of SAX
◮ r - correlation coefficients
◮ h - accepted hypotheses (p-values < 0.05 level)

18. Results: Objective Assessment
◮ SVMs to perform objective assessment of MFCC, Specrep, Symrep
◮ training (80 %) & testing (20 %) sets
◮ we trained individual SVMs with a sigmoidal kernel function

19. Conclusion
◮ derived functions for prolongation detection
◮ developed algorithms Alg. 1-2 for detection of “complex” dysfluencies
◮ newly designed features - statistically analyzed
◮ objective assessment of the new features & MFCC by SVM, 47.4 %
◮ symbolic sequences are competitive with the spectral domain

20. Bibliography 1/2
◮ Camastra, F. and Vinciarelli, A., Machine Learning for Audio, Image and Video Analysis: Theory and Applications. Springer-Verlag London Limited, 2008.
◮ Hamel, L., Knowledge Discovery with Support Vector Machines. John Wiley & Sons, Inc., Hoboken, NJ, USA, July 2009.
◮ Howell, P., Davis, S., Bartrip, J., The UCLASS archive of stuttered speech. Journal of Speech, Language, and Hearing Research, 52, pp. 556-569, 2009.
◮ Keogh, E., Chakrabarti, K., Pazzani, M., and Mehrotra, S., Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases. Knowledge and Information Systems 3, pp. 263-286, 2001.
