Effective Open Source Speech Recognition in Your Application #kde-speech Peter Grasch peter@grasch.net
The Basics
● Decoder
● Speech model
  – Acoustic model: sounds
  – Language model: vocabulary, grammar
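As a rough illustration of how the decoder uses both halves of the speech model, the sketch below scores a fixed list of candidate sentences and returns the one with the best combined acoustic and language-model score. Every name and number here is invented for illustration; real decoders such as PocketSphinx or Julius search over phoneme-level models instead of scoring a handful of whole sentences.

```python
import math

# Hypothetical acoustic scores: log-likelihood of the audio given each sentence.
ACOUSTIC_LOGP = {
    "open file": -12.0,
    "open pile": -11.5,   # sounds slightly closer, but is not a valid sentence
    "oh pen file": -14.0,
}

# Hypothetical language model: log-probability of each allowed sentence.
# Sentences that the vocabulary and grammar do not cover are simply absent.
LANGUAGE_LOGP = {
    "open file": math.log(0.5),
    "oh pen file": math.log(0.001),
}


def decode(acoustic, language):
    """Return the sentence with the highest combined log score, or None."""
    best, best_score = None, -math.inf
    for sentence, acoustic_score in acoustic.items():
        language_score = language.get(sentence)
        if language_score is None:
            continue  # the language model rules this sentence out
        score = acoustic_score + language_score
        if score > best_score:
            best, best_score = sentence, score
    return best


if __name__ == "__main__":
    print(decode(ACOUSTIC_LOGP, LANGUAGE_LOGP))
```

In this toy setup the language model overrides the acoustically closer "open pile", which is the reason a restricted vocabulary and grammar help recognition accuracy.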
Open Source Speech Recognition
              Decoder   Trainer   UI
CMU SPHINX       ✓         ✓            (PocketSphinx, SphinxTrain)
Julius           ✓
KALDI            ✓         ✓
Simon            ✓         ✓      ✓
Standard Architecture Commands Simond Simon Your application ? Acoustic model Language model
Standard Architecture Commands Simond Simon Scenario Scenario Your application Scenario Acoustic model Language model
Headless Architecture Commands Simond Simon Your application Acoustic model Language model
Embedded Architecture Commands Simond Simon Your application Acoustic model Language model Decoder
Standard Architecture Commands Simond Simon Scenario Scenario Your application Scenario Acoustic model Language model
Writing your Scenario
● Lay out the commands you want to support
● Create:
  – Vocabulary
  – Grammar
  – Commands
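The three parts of a scenario can be pictured as plain data. The sketch below is not Simon's scenario format or API; it is a minimal Python model, with hypothetical names and illustrative phonetic transcriptions, of how a vocabulary (words with a category and a pronunciation), a grammar (allowed category sequences), and commands (sentences mapped to actions) relate to each other.

```python
# Vocabulary: word -> (category, phonetic transcription).
# The transcriptions are illustrative, not taken from a real dictionary.
VOCABULARY = {
    "open": ("VERB", "ow p ah n"),
    "close": ("VERB", "k l ow z"),
    "file": ("NOUN", "f ay l"),
    "window": ("NOUN", "w ih n d ow"),
}

# Grammar: the category sequences the recognizer should accept.
GRAMMAR = [("VERB", "NOUN")]

# Commands: recognized sentence -> hypothetical action name in your application.
COMMANDS = {
    "open file": "file_open",
    "close window": "window_close",
}


def is_valid(words):
    """Check that every word is known and the category sequence is allowed."""
    try:
        categories = tuple(VOCABULARY[word][0] for word in words)
    except KeyError:
        return False  # at least one word is not in the vocabulary
    return categories in GRAMMAR


def dispatch(sentence):
    """Return the action for a recognized sentence, or None if there is none."""
    words = sentence.lower().split()
    if not is_valid(words):
        return None
    return COMMANDS.get(" ".join(words))


if __name__ == "__main__":
    for spoken in ("open file", "file open", "close file"):
        print(spoken, "->", dispatch(spoken))
```

Note that "close file" passes the grammar but has no command attached, so nothing is triggered. Laying out the supported commands first, as the slide suggests, keeps the vocabulary and grammar no larger than they need to be.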
Writing your Scenario Demonstration
Tighter Integration: Write a Custom Command Plug-In
● Full, programmatic control of the scenario
● Meta information of recognition results:
  – Phonetic transcriptions
  – Confidence scores*
  – Alternative results*
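To show why this meta information is useful, the sketch below models a recognition result with a phonetic transcription and a confidence score, and picks from a list of alternatives only when the best one is confident enough. The `RecognitionResult` type, the `pick_result` helper, and the 0.7 threshold are assumptions made for this example and are not part of Simon's plug-in interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecognitionResult:
    """Hypothetical container for one recognition alternative."""
    sentence: str
    phonemes: str       # phonetic transcription of the sentence
    confidence: float   # assumed to lie in the range 0.0 to 1.0


def pick_result(results: List[RecognitionResult],
                threshold: float = 0.7) -> Optional[RecognitionResult]:
    """Return the most confident alternative, or None if it is below threshold."""
    if not results:
        return None
    best = max(results, key=lambda result: result.confidence)
    return best if best.confidence >= threshold else None


if __name__ == "__main__":
    alternatives = [
        RecognitionResult("open file", "ow p ah n f ay l", 0.82),
        RecognitionResult("open pile", "ow p ah n p ay l", 0.41),
    ]
    chosen = pick_result(alternatives)
    print(chosen.sentence if chosen else "rejected")
```

Rejecting low-confidence results instead of acting on them is a common design choice for voice control, since a wrongly triggered command is usually worse than asking the user to repeat.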
Tighter Integration: Write a Custom Command Plug-In Demonstration
Q & A #kde-speech Peter Grasch peter@grasch.net
Thank you for your attention