FINDING YOUR VOICE IN THE REGULATORY AGE NIGEL CANNINGS CTO

FINDING YOUR VOICE IN THE REGULATORY AGE NIGEL CANNINGS CTO nigel.cannings@intelligentvoice.com @intelligentvox

2017! 2016? 2015 THE YEAR OF VOICE
As almost 50% of all corporate data will have a voice component within 5 years, either as audio or video, all companies, but particularly banks and insurance companies, need to get a handle not just on where this data is being held, but on what is being said in it, and also who is saying it.
Amazon Alexa, SIRI(?)
Banks face: LIBOR, FX Scandal, Multi-Billion $ fines

AUDIENCE PARTICIPATION

HOW OFTEN DO YOU USE A VOICE ASSISTANT? Of the people with a smartphone, how many use their integrated voice assistant (e.g. Siri, Cortana)? [Bar chart: Never, Monthly, Weekly, Daily; percentage axis from 0% to 90%] Results taken from a survey on 5th October 2017 of 1500 people across Europe

HOW OFTEN DO YOU USE A VOICE ASSISTANT? Of the people with an Alexa home assistant, how often do they use it? [Bar chart: Monthly, Weekly, Daily; percentage axis from 0% to 50%] Results taken from a survey on 5th October 2017 of 1500 people across Europe

IT’S A DOUBLE WHAMMY GDPR MiFID II Where? Who? What?

CLOUD SECURITY Where is your voice stored? Your voice could be used for any number of the following:
- Use (edit) your voice recordings to impersonate you
- Learn about you → your identity, gender, nationality (accent), emotional state...
- Track you from uploads / communications of voice recordings
WHERE

ENCRYPTED SPEECH PROCESSING Powered by GPU AES Encryption (Public key) Powered by machine learning
Privacy preserving encrypted phonetic search of speech data. C Glackin, G Chollet, N Dugan, N Cannings, J Wall, S Tahir, IG Ray. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
A New Secure and Lightweight Searchable Encryption Scheme over Encrypted Cloud Data. S Tahir, S Ruj, Y Rahulamathavan, M Rajarajan, C Glackin. IEEE Transactions on Emerging Topics in Computing, 2017.
WHERE
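
To give a feel for what searching phonetic data without exposing it can involve, here is a minimal sketch of a keyed-token index. It is not the scheme from the papers cited on this slide: HMAC-SHA256 tokens over phoneme trigrams are swapped in purely for illustration, and the names `trapdoor`, `build_index`, `search`, the call IDs and the phone strings are all hypothetical.

```python
import hashlib
import hmac


def trapdoor(key, phones):
    """Keyed token for a phoneme n-gram (illustrative, not the papers' scheme)."""
    message = " ".join(phones).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def ngrams(phones, n):
    """All contiguous n-grams of a phoneme sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def build_index(key, recordings, n=3):
    """Map each keyed n-gram token to the IDs of recordings containing it."""
    index = {}
    for rec_id, phones in recordings.items():
        for gram in ngrams(phones, n):
            index.setdefault(trapdoor(key, gram), set()).add(rec_id)
    return index


def search(index, key, query_phones, n=3):
    """Return IDs of recordings that contain every n-gram of the query."""
    grams = ngrams(query_phones, n)
    if not grams:
        return set()
    hits = None
    for gram in grams:
        ids = index.get(trapdoor(key, gram), set())
        hits = set(ids) if hits is None else hits & ids
    return hits


# Hypothetical data: phoneme strings as a recogniser might emit them.
key = b"demo-key-held-by-the-data-owner"
recordings = {
    "call1": "hh ah l ow w er l d".split(),
    "call2": "g uh d b ay".split(),
}
index = build_index(key, recordings)
matches = search(index, key, "l ow w er".split())
```

The party holding `index` sees only hex tokens, and only someone with `key` can form a query. This sketch matches recordings that contain all query trigrams, not necessarily in order, and it leaks which tokens repeat and which recordings match; a real scheme has to address both.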

DEEP SPIKING NEURAL NETWORKS FOR SPEECH ENHANCEMENT Recurrent lateral inhibitory spiking networks for speech enhancement J Wall, C Glackin, N Cannings, G Chollet, N Dugan International Joint Conference on Neural Networks (IJCNN), pp. 1023-1028, 2016. TECHNICAL

CONVOLUTIONAL NEURAL NETWORKS FOR ACOUSTIC MODELLING
- TIMIT Speech Corpus
- 1.4M spectrograms for the training set
- Sliding window used for timing
- 4 to 5 phones in each 0.256 second window
- 61 Phoneme Classes
- Beaten the current NTIMIT. State of the art!
TECHNICAL
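
The sliding window mentioned on this slide can be sketched as follows. TIMIT audio is sampled at 16 kHz, so a 0.256 second window covers 4096 samples. The hop of 0.128 seconds and the function name `sliding_windows` are assumptions made for illustration; the slide does not state the step size that was used.

```python
def sliding_windows(samples, sample_rate=16000, window_s=0.256, hop_s=0.128):
    """Split a sample sequence into overlapping fixed-length windows.

    window_s comes from the slide; hop_s is an assumed step size.
    Trailing samples that do not fill a whole window are dropped.
    """
    win = round(window_s * sample_rate)
    hop = round(hop_s * sample_rate)
    last_start = len(samples) - win
    return [samples[start:start + win] for start in range(0, last_start + 1, hop)]


# One second of placeholder audio (sample indices stand in for amplitudes).
one_second = list(range(16000))
windows = sliding_windows(one_second)
```

Each window would then be turned into one spectrogram image for the network to classify.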

HOW FAST? [Bar chart: processing speed in Times Real Time, axis from 0 to 120] WHAT

UNDERSTANDING 100x Realtime using P5000 WHAT
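
As a quick worked example of what a real-time factor means, the sketch below converts it into processing time and daily throughput. The function names are hypothetical, and the figures assume the speed-up stays constant regardless of workload.

```python
def processing_seconds(audio_seconds, realtime_factor):
    """Time needed to process audio at a given multiple of real time."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


def audio_hours_per_day(realtime_factor):
    """Hours of audio one device could process in 24 hours at that factor."""
    return 24 * realtime_factor


one_hour_call = processing_seconds(3600, 100)  # seconds to process one hour of audio
daily_capacity = audio_hours_per_day(100)      # hours of audio per day
```

At 100x real time, an hour of audio takes 36 seconds, which is nominally 2400 hours of audio per day.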

TELEFONICA/O2 But this is just the beginning: voice data is generated not only inside the organisation, but also externally, for example as YouTube content. One area commonly forgotten is mobile telephony. MiFID II now places a strong requirement on recording not just calls made from a regulated organisation's premises, but mobile calls as well. Intelligent Voice are working with Telefonica/O2 to capture, index and analyse mobile phone calls, and introduce them as part of a compliance and monitoring workflow for MiFID II. WHAT

CREDIBILITY WHAT IS WRONG WITH THESE STATEMENTS? “Woke up at 7:30. Had a shower. Made breakfast and read the newspaper. At 8:30, drove to work.” “We should have done a better job.” “That’s their way of doing things.” “You’d better ask them.” Alleged robbery victim: “The man asked for my money.” “He told me not to look at him. He said he would shoot me if I screamed.” WHAT

CREDIBILITY INDICATORS
- Pronouns: Omission, improper use, higher rates of third person plural pronouns
- Complexity: Parameters such as number of letters/syllables per word, higher word count, higher rate of pauses
- Speaking verbs: Strong tone (told, demanded, telling), soft tone (said, asked, stated, saying), tone changes
- Tempo: Slow tempo (indicator of cognitive load), fast tempo (indicator of arousal and negative effects)
- Pitch: Higher pitch/lower voice quality at specific times are indications of fraud-related utterances
- Specific Words: Explainers (so, since, therefore, because…)
These are just a few of the indicators of suspicious language WHAT
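
The text-based indicators above can be sketched as simple counts over a transcript. This is a rough illustration only: the word lists are taken from the slide (the pronoun set is filled in from standard English), the function name `credibility_indicators` is hypothetical, and plain word matching will over-count ambiguous words such as "so". Tempo and pitch need the audio and are not covered.

```python
import re

THIRD_PERSON_PLURAL = {"they", "them", "their", "theirs", "themselves"}
STRONG_VERBS = {"told", "demanded", "telling"}
SOFT_VERBS = {"said", "asked", "stated", "saying"}
EXPLAINERS = {"so", "since", "therefore", "because"}


def credibility_indicators(text):
    """Count a few lexical indicators in a transcript (illustrative only)."""
    normalised = text.lower().replace("\u2019", "'")  # treat curly apostrophes as straight
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", normalised)
    # Sequence of speaking-verb tones, in the order they occur.
    tones = [
        "strong" if w in STRONG_VERBS else "soft"
        for w in words
        if w in STRONG_VERBS or w in SOFT_VERBS
    ]
    letters = sum(ch.isalpha() for w in words for ch in w)
    return {
        "word_count": len(words),
        "avg_letters_per_word": letters / len(words) if words else 0.0,
        "third_person_plural": sum(w in THIRD_PERSON_PLURAL for w in words),
        "strong_verbs": tones.count("strong"),
        "soft_verbs": tones.count("soft"),
        "tone_changes": sum(a != b for a, b in zip(tones, tones[1:])),
        "explainers": sum(w in EXPLAINERS for w in words),
    }


victim = credibility_indicators(
    "He told me not to look at him. He said he would shoot me if I screamed."
)
deflection = credibility_indicators("You'd better ask them.")
```

On the robbery statement from the previous slide this reports one strong verb ("told"), one soft verb ("said") and one tone change between them.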

CREDIBILITY NETWORK
INTERVIEWER: What happened next?
CALLER: He told me not to look at him. He said he would shoot me if…
Diagram labels: Voice Activity Detection, GPU-accelerated RNN-based Speech to Text, i-vector diarization, Embedding, LSTM, LSTM
Tagged transcript: "… He told me not to look at him . He said …" (Strong tone followed by Weak tone)
- Inspired by recurrent networks for named entity recognition and part of speech tagging
- We can use bi-directional recurrent networks to attach credibility tags to the speech transcription
- Bi-directionality is important for context
- Network can tag explainers, changes in tone, pronouns etc.
WHAT

SPEAKER IDENTIFICATION Dialect identification via images and DIGITS NIST evaluation of 500 hours and 20 dialects SOX MATLAB PYTHON RASTA 12 RASTA WHO

NIST EVALUATION Preliminary Results [Bar chart, axis from 0 to 100, one bar per dialect: English-, Portuguese-Brazilian, Spanish-, Spanish-European, Chinese-Min_Dong, Arabic, Chinese-Cantonese, Arabic-Egyptian, English-British, Spanish-Caribbean, Slavic-Russian, Arabic-Maghrebi, Chinese-Mandarin, Arabic-Iraqi, English-American, French-West_African, Chinese-Wu, Slavic-Polish, French-Haitian, Arabic-Levantine] WHO

CELEBRITY SOUND A LIKE https://celebsoundalike.com/ Tweet your results to @intelligentvox WHO

CONCLUSION THANK YOU nigel.cannings@intelligentvoice.com @intelligentvox
