Slides: http://wolfpaulus.com/slides
Code: git clone https://github.com/wolfpaulus/bots.git
Voice-Enabling Chatbots
© 2012-2013 Wolf Paulus - http://wolfpaulus.com
Star Trek
Graphical User Interfaces - Mac System 1 (1984), Windows 95 (1995)
Voice User Interfaces - Ford Sync, Siri, and Cora
Speed and Accuracy of Speech Recognition
Wearable Computing - Google Project Glass, Pebble Watch, Apple iWatch (concept)
Artist on Android
Artist on Android, Voice User Interface flattens navigation and configuration hierarchies
Speech Recognition Client - Record, Encode, Compress, Send, Receive Transcription and Confidence
Adaptive Multi-Rate Narrowband (AMR-NB) Speech Codec
8 kHz sampling rate and 12 kbit/s encoding rate
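To illustrate why the client encodes and compresses before sending, here is a rough back-of-the-envelope sketch. It assumes 16-bit mono PCM as the recording format and a 5-second utterance; both are assumptions, not figures from the slides. The 8 kHz sampling rate and the 12 kbit/s encoding rate are the values named above. The class and method names are made up for this example.

```java
// Rough payload estimate: raw PCM versus a constant-bit-rate speech codec.
final class AudioPayload {

    /** Bytes of uncompressed mono PCM audio for the given duration. */
    static long pcmBytes(int sampleRateHz, int bitsPerSample, double seconds) {
        return Math.round(sampleRateHz * (bitsPerSample / 8.0) * seconds);
    }

    /** Bytes of encoded audio at a constant bit rate, ignoring container overhead. */
    static long encodedBytes(int bitRateBps, double seconds) {
        return Math.round(bitRateBps / 8.0 * seconds);
    }

    public static void main(String[] args) {
        double seconds = 5.0;                      // assumed utterance length
        long raw = pcmBytes(8000, 16, seconds);    // 16-bit samples are an assumption
        long enc = encodedBytes(12_000, seconds);  // 12 kbit/s as stated on the slide
        System.out.println("raw PCM bytes: " + raw);
        System.out.println("encoded bytes: " + enc);
        System.out.println("ratio: " + (double) raw / enc);
    }
}
```

Under these assumptions the upload shrinks from 80,000 bytes to 7,500 bytes, roughly a tenfold reduction, which matters on a mobile network.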
Horsemen of Speech Recognition
2001 Space Odyssey
Speech Synthesis
Use a pre-installed Text-To-Speech Engine
Package and ship a distinct Synthesizer and Voice with the mobile application
Use a web service to synthesize text into speech audio (VAAS)
Voice Matters
Speech Synthesis on Android
Echo Bot
AIML Bot
Creating a simple Voice-Enabled Android App
Cora
Speech Recognition
private void startVoiceRecognitionActivity() {
    final Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);

    // Specify the calling package to identify your application.
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName());

    // Display a hint to the user about what to say.
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, getResources().getString(R.string.speakPROMPT));

    // Give a hint to the recognizer about what the user is going to say.
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);

    // Specify how many results you want to receive. The results are sorted
    // so that the first result is the one with the highest confidence.
    intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);

    // Optionally request a specific recognition language.
    //intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, new Locale("es").getLanguage());

    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);
}
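The recognizer hands back transcriptions and, optionally, confidence scores. On Android these would typically be read in onActivityResult from RecognizerIntent.EXTRA_RESULTS and RecognizerIntent.EXTRA_CONFIDENCE_SCORES. The following plain-Java sketch shows one possible way to decide whether to accept the top result. The class name, the method name, and the threshold are assumptions for illustration, not code from the talk.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: accept the top transcription only if it is confident enough.
final class BestTranscription {

    /**
     * Returns the first (highest-ranked) transcription if its confidence meets
     * the threshold, or null so the caller can re-prompt the user.
     */
    static String pick(List<String> results, float[] confidences, float minConfidence) {
        if (results == null || results.isEmpty()) {
            return null;
        }
        // Confidence scores are optional; accept the top result when they are missing.
        if (confidences == null || confidences.length == 0) {
            return results.get(0);
        }
        return confidences[0] >= minConfidence ? results.get(0) : null;
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("stock quote for acme");
        // 0.5f is an arbitrary, hypothetical threshold.
        System.out.println(pick(results, new float[] {0.9f}, 0.5f));
        System.out.println(pick(results, new float[] {0.2f}, 0.5f));
    }
}
```

Returning null for a low-confidence result lets the app ask the user to repeat the request instead of acting on a likely misrecognition.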
Speech Synthesis
private android.speech.tts.TextToSpeech mTts;
..
// Instantiate TextToSpeech with the current Context and an OnInitListener
mTts = new TextToSpeech(this, this);
..
// OnInitListener callback; it must be public to implement the interface.
public void onInit(final int status) {
    if (status == TextToSpeech.SUCCESS && mTts != null) {
        // Start listening again as soon as the bot has finished speaking.
        mTts.setOnUtteranceCompletedListener(new TextToSpeech.OnUtteranceCompletedListener() {
            public void onUtteranceCompleted(final String s) {
                startVoiceRecognitionActivity();
            }
        });
    }
}

private void say(final String s) {
    // The utterance ID is required for the completion listener to be called.
    final HashMap<String, String> map = new HashMap<String, String>(1);
    map.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, UTTERANCE_ID);
    mTts.speak(s, TextToSpeech.QUEUE_FLUSH, map);
    mTV_TTS.setText(s);
}
Capture Speech Input
Convert Speech into Text
Synthesize Voice (Message)
Speak Message
access Web Service
perform on Device
Echo Bot
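The Echo Bot steps can be sketched in plain Java as a single turn: whatever text the recognizer produced is spoken back unchanged. The recognizer and the speaker are stubbed with functional interfaces here. On Android they would correspond to the speech recognition activity and the TextToSpeech engine. All names in this block are hypothetical.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Sketch of one Echo Bot turn with stubbed recognition and synthesis.
final class EchoBotSketch {

    /** Runs one turn and returns the text that was spoken (empty if nothing was heard). */
    static String runTurn(Supplier<String> recognizer, Consumer<String> speaker) {
        String heard = recognizer.get();                   // capture + speech-to-text (stubbed)
        String reply = heard == null ? "" : heard.trim();  // echo: the reply equals the input
        if (!reply.isEmpty()) {
            speaker.accept(reply);                         // synthesize + speak (stubbed)
        }
        return reply;
    }

    public static void main(String[] args) {
        // Stand-ins for the real recognizer and speech synthesizer.
        runTurn(() -> "hello world", text -> System.out.println("SAY: " + text));
    }
}
```

Keeping recognition and synthesis behind small interfaces makes it easy to swap an on-device engine for a web service without touching the bot logic.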
Capture Speech Input
Convert Speech into Text
Execute Command
Synthesize Voice (Message)
Speak Message
access Web Service
perform on Device
“stock quote for ...” Stock Quote Bot
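A minimal sketch of the "Execute Command" step for this bot might look as follows. It assumes the command is recognized by a simple prefix match on the transcription, and the quote lookup is passed in as a function so that no real web service is implied. The class, the method names, and the fallback sentence are hypothetical.

```java
import java.util.function.Function;

// Sketch: recognize the "stock quote for ..." command and delegate the lookup.
final class StockQuoteBotSketch {

    private static final String PREFIX = "stock quote for ";

    /** Returns the text after the command prefix, or null if the utterance is not this command. */
    static String extractQuery(String utterance) {
        if (utterance == null) {
            return null;
        }
        String text = utterance.trim();
        // Case-insensitive prefix check.
        if (!text.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            return null;
        }
        String query = text.substring(PREFIX.length()).trim();
        return query.isEmpty() ? null : query;
    }

    /** Builds the message to synthesize; quoteService stands in for a web service call. */
    static String respond(String utterance, Function<String, String> quoteService) {
        String query = extractQuery(utterance);
        if (query == null) {
            return "Sorry, I did not understand that.";
        }
        return quoteService.apply(query);
    }

    public static void main(String[] args) {
        // The lambda is a placeholder; a real app would call a quote service here.
        System.out.println(respond("Stock quote for Example Corp",
                q -> "Looking up a quote for " + q));
    }
}
```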
Capture Speech Input
Convert Speech into Text
Create Text Response
Message or Command?
Cmd: Execute Command
Msg: Synthesize Voice (Message)
Speak Message
access Web Service
perform on Device
AIML Bot
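The "Message or Command?" decision can be sketched as a small router. This block assumes, purely for illustration, that the bot marks a command by prefixing its text response with "CMD:". That marker is a made-up convention, not part of AIML and not taken from the slides. Plain messages are spoken as they are, and a command is executed first so that its result becomes the message to speak.

```java
import java.util.function.Function;

// Sketch: route a bot's text response either to speech or to command execution.
final class ResponseRouter {

    // Hypothetical marker; any convention agreed on with the bot would do.
    static final String CMD_MARKER = "CMD:";

    static boolean isCommand(String response) {
        return response != null && response.startsWith(CMD_MARKER);
    }

    /** Returns the text that should be synthesized and spoken. */
    static String route(String response, Function<String, String> commandExecutor) {
        if (isCommand(response)) {
            String command = response.substring(CMD_MARKER.length()).trim();
            return commandExecutor.apply(command);  // the command result becomes the message
        }
        return response;                            // plain message: speak it unchanged
    }

    public static void main(String[] args) {
        Function<String, String> executor = cmd -> "Executed: " + cmd;  // placeholder executor
        System.out.println(route("Hello, how are you?", executor));
        System.out.println(route("CMD: stock quote for Example Corp", executor));
    }
}
```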
Voice-Enabled Web Bots
• Recognition
<script>function processspeech() {document.form.submit();}</script>
<input type="TEXT" autocomplete="off" name="input" speech="speech" x-webkit-speech="x-webkit-speech" onspeechchange="processspeech();" onwebkitspeechchange="processspeech();" />
• Synthesis
<audio autoplay="true"><source type="audio/mpeg" src="http://goo.gl/r9Mhm" ></audio>
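On the server side, a web bot could generate the audio element for each response. The sketch below assumes a text-to-speech web service that accepts the text as a URL query parameter and returns MP3 audio. The endpoint https://tts.example.com/speak?text= is a placeholder, not a real service, and the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build an autoplaying <audio> element that points at a TTS web service.
final class AudioTagBuilder {

    /** Escapes the characters that matter inside a double-quoted HTML attribute. */
    static String escapeAttribute(String value) {
        return value.replace("&", "&amp;").replace("\"", "&quot;").replace("<", "&lt;");
    }

    /** ttsBaseUrl must end with the query parameter that takes the text, e.g. "...?text=". */
    static String audioTag(String ttsBaseUrl, String text) {
        String encoded;
        try {
            encoded = URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        String src = escapeAttribute(ttsBaseUrl + encoded);
        return "<audio autoplay=\"true\"><source type=\"audio/mpeg\" src=\"" + src + "\"></audio>";
    }

    public static void main(String[] args) {
        // Placeholder endpoint; substitute the service the bot actually uses.
        System.out.println(audioTag("https://tts.example.com/speak?text=", "hello world"));
    }
}
```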
Voice User Interfaces - Ford Sync, Siri, and Cora
Thanks