Interchangeable Modalities W3C Workshop on MultiModal Interaction 22-23 July 2013, New York
Background: iSpeech is a Text-to-Speech & Speech Recognition Company • Enterprise: fast-growing list of Mobile, Auto, Home, & Publishing customers • Developer: 25,000+ developers (15,000+ devs in 12 months, 2x growth) • Consumer: 30+ million app downloads
Credibility: Developer Ecosystem • > 25K developers registered • > 2 billion API calls serviced • > 99.9% uptime • Segments: Mobile Devs, Mobile OEM/OS, Auto, Home, Publishing
Speech: New Frontiers • Growth Following New Use Cases • Breakdown of Developers by Segment & Activity (charts: Developers, API Usage): 1. Entertainment 2. Mobile/Nav 3. eLearning 4. Telephony
Challenges of Speech Technology • Many technologists have never ‘experienced’ working directly with speech technologies • Uncharted Technology Waters: Audio DSP, NLP, Domain/Grammar/Lexicon, Multimodal UI • Speech mirrors humans; more like ‘wetware’ than ‘software’? • Life-cycle of continuous adaptations and QA
Consideration: Speech Technology Value Chain • ASR → NLP → TTS
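The value chain can be pictured as three stages run in sequence. The sketch below is illustrative only: the stage functions (fake_asr, fake_nlp, fake_tts), the Turn record, and run_chain are hypothetical stand-ins invented for this example, not iSpeech APIs or any real engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    """Result of one pass through the chain (hypothetical record type)."""
    transcript: str
    intent: str
    reply_audio: bytes


def fake_asr(audio: bytes) -> str:
    """Stand-in recognizer: pretends the audio bytes are UTF-8 text."""
    return audio.decode("utf-8")


def fake_nlp(transcript: str) -> str:
    """Stand-in understanding step: keyword lookup instead of real NLP."""
    return "read_messages" if "read" in transcript.lower() else "unknown"


def fake_tts(text: str) -> bytes:
    """Stand-in synthesizer: returns the text as bytes instead of audio."""
    return text.encode("utf-8")


def run_chain(
    audio: bytes,
    asr: Callable[[bytes], str] = fake_asr,
    nlp: Callable[[str], str] = fake_nlp,
    tts: Callable[[str], bytes] = fake_tts,
) -> Turn:
    """Run ASR, then NLP, then TTS, passing each stage's output to the next."""
    transcript = asr(audio)
    intent = nlp(transcript)
    reply = f"Intent: {intent}"
    return Turn(transcript, intent, tts(reply))


if __name__ == "__main__":
    print(run_chain(b"Read my messages"))
```

Because each stage is passed in as a plain callable, a cloud service or an embedded engine could be swapped in for any one stage without touching the other two.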
Standards & HTML5 • 25,000 developers × 10 ways to package web services (APIs and SDKs), and that’s just cloud; dozens more embedded engines to account for • HTML5 adoption: audio playback of TTS is OK; audio recording (ASR) is not widely used • Example: impact of mature standards on the use of speech technology: VoiceXML, SSML, SRGS, MRCP
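As an illustration of what a mature standard buys a developer, here is a minimal sketch that builds an SSML 1.0 document with Python's standard library. The helper name build_ssml, its parameters, and the sample sentences are assumptions made for this example; the element names (speak, s, break) and the namespace come from the W3C SSML specification. Whether a given TTS engine accepts this markup is not covered by the slides.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"   # SSML 1.0 namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"   # namespace of xml:lang


def build_ssml(sentences, pause_ms=300, lang="en-US"):
    """Return an SSML string with one <s> per sentence and a <break> between them."""
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    for i, text in enumerate(sentences):
        if i > 0:
            # Insert a pause before every sentence except the first.
            ET.SubElement(speak, f"{{{SSML_NS}}}break", {"time": f"{pause_ms}ms"})
        sentence = ET.SubElement(speak, f"{{{SSML_NS}}}s")
        sentence.text = text  # ElementTree escapes &, <, > on output
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml(["You have one new message.", "Say 'read' to hear it."]))
```

The same string could in principle be handed to any SSML-aware synthesizer, which is the portability argument the slide makes for standards over per-vendor packaging.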
Talkz™ Case Study • Talkz - successor to Drivesafe.ly • Interchangeable Multimodal App • Available today through iTunes
Multi-Modal’s Silver Lining: Universal Accessibility • 1998: Section 508 • 2000: Windows Narrator • 2005: VoiceOver (Mac OS X) • 2006: YouTube Captions • 2009: 1st Accessible Smartphone • 2013: Global Public Inclusive Infrastructure (GPII) • “The gap between usability and accessibility is narrowing and with it the digital divide between disabled and non-disabled people.” - Robin Christopherson, AbilityNet
Conclusions for Multi-Modal Developers • Plan to Partner & Partner to Plan • Multimodal is a CENTRAL UI pillar, not an afterthought
Craig Campbell Chief Evangelist ccampbell@iSpeech.org www.iSpeech.org/developers