
Beyond the Equal Error Rate: About the inter-relationship between algorithm and application

ISCA Archive


  1. Beyond the Equal Error Rate: About the inter-relationship between algorithm and application. Renana Peres, Comverse Technology.

  2. Market needs: effective authentication tools for remote services. Services: direct banking, home shopping, calling cards, e-commerce, mobile commerce, service centers. Existing authentication methods: RF signatures, PIN codes, profiles, questions, smart cards. Labels on these methods: not friendly, unsafe, expensive.

  3. Effective authentication: the barrier in the expansion of remote commerce services. Fraud: image; service (customer satisfaction); economic (profitability, expenses). Figures shown: Telecom 10-30 B$/Y; Service centers 0.5B$; Calling Cards, '97, US 1B$; Visa, '96, US; AT&T, '94, US 2B$.

  4. Operational scenarios: free speech and vocal password applications. Free speech (text independent) applications: call centers, cellular roamers, calling cards, Voice / IP. Vocal password (text dependent) applications: credit cards, IVR interactions, physical access, e-commerce. In both scenarios, speaker verification receives the speech and a claimed id., verifies, and returns accept/reject.

  5. Voice based verification: authentication solution for any remote services. Friendly: combines with transaction flow; no passwords; use of natural speech. Saves costs: fraud prevention; reduce bureaucracies; increase service volumes; shortens call duration. Safe: personal, biometric verification.

  6. Typical architecture: integrated into the service provider infrastructures. Components: management, calling application, audio system, coordinator, storage system, processing units.

  7. Audio issues. Transfer volumes: call center: 100 - 3000 agents = 100 concurrent calls; telecom: 10 - 30 trunks = 300-900 concurrent calls. Audio sources: free speech, IVR, vocal password. Research challenges: speaker separation and segmentation; segmentation with unknown no. of speakers; non-speech and silence vox; non-password speech vox. (Architecture diagram repeated: management, calling application, audio system, coordinator, storage system, processing units.)

  8. Storage: internal vs. external storage architecture. Internal storage: disk chase together with audio system, coordinator, and processing units. External storage: storage separate from two sets of audio system, coordinator, and processing units.

  9. Storage: large, dynamic storage, containing voice & data. Storage issues: large storage volumes (1 minute audio = 0.5 Mbyte, PCM); storage of audio objects; backup, redundancy; voice signature (VS) maintenance. Storage operations: create new VS; add audio to VS; remove session from VS; remove VS; add audio to world model; store speaker model; modify claimed id.; get VS data statistics. Stored items (diagram): verification audio, verification results, claimed identities, world models & data, voice signatures, speaker models; accessed by the coordinator, processing units, and audio system.
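The "1 minute audio = 0.5 Mbyte (PCM)" figure on this slide is consistent with single-channel telephone audio sampled at 8 kHz with 8 bits per sample. The slide does not state the sampling format, so those two parameters are assumptions here, as are the subscriber count and enrolment length in the usage example. A minimal sketch of the arithmetic:

```python
def pcm_bytes(seconds, sample_rate_hz=8000, bytes_per_sample=1):
    """Size in bytes of uncompressed single-channel PCM audio.

    The 8 kHz / 1 byte defaults are assumed telephony values,
    not figures taken from the slides.
    """
    return int(seconds * sample_rate_hz * bytes_per_sample)


def enrolment_storage_bytes(subscribers, minutes_per_signature):
    """Raw audio kept for voice signatures (hypothetical sizing helper)."""
    return subscribers * pcm_bytes(60 * minutes_per_signature)


one_minute = pcm_bytes(60)
print(one_minute)  # 480000 bytes, i.e. roughly 0.5 Mbyte

# Illustrative only: 1000 subscribers with 3 minutes of enrolment audio each.
print(enrolment_storage_bytes(1000, 3))  # 1440000000 bytes
```

Under these assumptions the raw audio alone grows by roughly 1.4 Gbyte per thousand subscribers, which is one way to read the slide's interest in compact speaker models and re-training without audio.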

  10. Storage: voice signature maintenance. Research challenges: time evolution of VS; VS update policy; identification of faulty sessions; re-training without audio; compact speaker models. Timeline (diagram): call1, call2, call3 (from different sources, e.g. cellular, car) arrive over time; each is added to the VS and followed by training. Audio sessions are added to VS; VS is re-trained; then verify.

  11. Recognition phases: calibration, enrolment, verification. Calibration: initial parameter settings, creating world models. Enrolment: VS data accumulation, creating speaker model. Verification: match an incoming call against a claimed identity. Timeline (diagram): calibration (add to cal, train), then enrolment of subscribers (add to VS, train), then verification (verify).

  12. Calibration: initial parameter setting. Calibration data: world models, tuning data, other params. Characteristics: large amount of audio; heavy computation; no source labeling. Research challenges: calibration with mixed source data (unsupervised clustering?); time evolution of world models; calibration for text-dependent applications (no impostor repetitions, no language info).

  13. Enrolment: voice signature for each subscriber. Free speech: first 2-3 calls; add call to VS; off-line operation; train; more audio?; VS ready for verification. Vocal password: enrolment session; repeat password; train; problem in VS?; VS ready for verification. Research challenges: minimum user involvement; signature robustness; mixed source signature; mixed source corpora; measurement for VS quality. During enrolment, alternative authentication methods are used.

  14. Verification: the most frequent mission. Verification API (diagram): inputs are CLI, DTMF, speech recognition, and the claimed id.; verify returns accept / reject, or another trial required. Research challenges: multi trial verification; share info between trials.
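One way to picture "multi trial verification" with information shared between trials is a sequential decision that sums per-trial scores and asks for another trial while the total stays between two thresholds. The slide does not describe a method, so this is only an illustrative sketch: the function name, the use of log-likelihood-ratio scores, the thresholds, the trial limit, and the reject-on-exhaustion fallback are all assumptions.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    ANOTHER_TRIAL = "another trial required"


def multi_trial_decision(trial_scores, accept_at=2.0, reject_at=-2.0, max_trials=3):
    """Combine per-trial scores (assumed log-likelihood ratios) by summation.

    Returns (decision, number_of_trials_used). Thresholds and the trial
    limit are hypothetical values chosen for illustration.
    """
    total = 0.0
    for n, score in enumerate(trial_scores, start=1):
        total += score  # evidence from earlier trials is carried forward
        if total >= accept_at:
            return Decision.ACCEPT, n
        if total <= reject_at:
            return Decision.REJECT, n
        if n >= max_trials:
            # Assumed fallback: undecided after the last allowed trial means reject.
            return Decision.REJECT, n
    return Decision.ANOTHER_TRIAL, len(trial_scores)


print(multi_trial_decision([0.5]))         # undecided, ask for another trial
print(multi_trial_decision([1.2, 1.1]))    # accepted on the second trial
print(multi_trial_decision([-1.0, -1.5]))  # rejected on the second trial
```

Summing scores is the simplest form of sharing; a deployed system might instead pool the audio from all trials and rescore, which this sketch does not attempt.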

  15. Result update policies for free speech applications: fixed intervals; upon request; confidence level. Timeline (diagram): call start, transaction 1, transaction 2, call end.

  16. Decision policy: algorithmic results + application cost function. The threshold sets the balance between FR (false rejection) and FA (false acceptance); the policy can be security oriented or service oriented.
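The idea of combining algorithmic results with an application cost function can be sketched as picking the threshold that minimizes a weighted sum of the FR and FA rates measured on labeled scores. This is an assumed formulation, not one given on the slide: the function names, the toy scores, and the cost weights are hypothetical, and target/impostor priors are left out for brevity.

```python
def error_rates(target_scores, impostor_scores, threshold):
    """Return (FR, FA) when scores >= threshold are accepted."""
    fr = sum(s < threshold for s in target_scores) / len(target_scores)
    fa = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return fr, fa


def pick_threshold(target_scores, impostor_scores, cost_fr, cost_fa):
    """Threshold minimizing cost_fr * FR + cost_fa * FA over observed scores."""
    candidates = sorted(set(target_scores) | set(impostor_scores))
    candidates.append(candidates[-1] + 1.0)  # extra candidate that rejects everyone

    def expected_cost(t):
        fr, fa = error_rates(target_scores, impostor_scores, t)
        return cost_fr * fr + cost_fa * fa

    return min(candidates, key=expected_cost)


# Toy scores, invented for illustration only.
targets = [2.0, 1.5, 1.0, 0.2]
impostors = [-1.0, -0.5, 0.5, 1.2]

print(error_rates(targets, impostors, 1.0))                      # (0.25, 0.25)
print(pick_threshold(targets, impostors, cost_fr=1, cost_fa=10))  # 1.5
print(pick_threshold(targets, impostors, cost_fr=10, cost_fa=1))  # 0.2
```

On this toy data the equal error point sits at a threshold of 1.0, while a security oriented cost (false acceptance ten times as expensive) moves the threshold up to 1.5 and a service oriented cost moves it down to 0.2. The same algorithm therefore yields different operating points depending on the application.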

  17. Decision and scoring. Research challenges: effective scoring; likelihood ratio -> FA / FR; posterior probability. Diagram: inter speaker and intra speaker likelihood ratio distributions, with a decision threshold separating the FR and FA regions.
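The link between a likelihood ratio and a posterior probability follows from Bayes' rule: the posterior odds equal the likelihood ratio times the prior odds. The sketch below applies that identity to a log-likelihood ratio. The function name and the prior values are assumptions for illustration, and it presumes the score is a calibrated log-likelihood ratio, which raw verification scores often are not.

```python
import math


def posterior_from_llr(llr, prior_target=0.5):
    """Posterior probability that the claimed speaker is the true speaker.

    llr is log(p(x | speaker model) / p(x | world model)); prior_target is
    an assumed prior probability that a claim is genuine.
    """
    prior_log_odds = math.log(prior_target / (1.0 - prior_target))
    # Posterior log-odds = log-likelihood ratio + prior log-odds.
    return 1.0 / (1.0 + math.exp(-(llr + prior_log_odds)))


print(posterior_from_llr(0.0))                             # 0.5
print(round(posterior_from_llr(math.log(9.0)), 3))         # 0.9
print(round(posterior_from_llr(math.log(9.0), 0.1), 3))    # 0.5
```

The last line shows why the prior matters: a likelihood ratio of 9 gives a posterior of 0.9 when genuine and impostor claims are equally likely, but only 0.5 when one claim in ten is assumed genuine.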

  18. Management: tools for system monitoring and maintenance. Speaker recognition information: no. of trained voice signatures; data collection status; rejection cases; performance measurements; feedback. General information: system status; mission status; loads.

  19. Summary: new research challenges. Algorithms; applications; service; user behavior; telephony; transfer volumes.

  20. Summary of research challenges. Scoring: effective scoring; likelihood ratio -> FA / FR. Storage: re-training without audio; identification of faulty sessions; VS update policy; time evolution of VS; compact speaker models. Enrolment: minimum user involvement; signature robustness; mixed source signature; time evolution of signatures; measurement for VS quality. Corpora: mixed source; cellular; free password. Calibration: calibration with mixed source data (unsupervised clustering?); time evolution of world models; calibration for text-dependent applications (no impostor repetitions, no language info). Audio: speaker separation and segmentation; segmentation with unknown no. of speakers; non-speech and silence vox; non-password speech vox. Verification: multi trial verification; share info between trials.
