Speech Technology Center
Global provider in the rapidly growing voice-enabled sector
Agenda
Speech Technology Center: Background, Experience, Team
STC solutions for Mobile Platforms: VoicePIN, VitalVoice Mobile, VoiceCom, Denoiser Mobile, Vocal Search application
Large-scale speech solutions: VoiceNavigator, VitalVoice, VoiceKey Service
Automated media monitoring solution: Jingle Tracker
Voice-based identification solutions: VoiceNet and Trawl series
Company Overview
Speech Technology Center
Executive summary
Company
STC develops cutting-edge technologies and products for the recording, analysis, recognition and synthesis of human speech.
STC has 250 employees and 20 years of experience in voice-based technology development. The company has built one of the strongest R&D teams in the industry to stay ahead of the competition and respond quickly to market demands.
Focus
Speech technologies are rapidly growing as technology development enables new applications. Key market segments:
Biometric solutions (voice verification and databases)
Speech recognition (speech to text) and synthesis (text to speech)
Professional (high quality) recording and analysis
Growth
Strong growth potential
Well-established products (purple line) provide high growth potential
New technologies (yellow line) drastically improve growth projections
From science to products
Complete business cycle
The STC business model is built around the concept of a complete business cycle:
Strong R&D
Product design
Production (through outsourcing)
Marketing and sales
Diverse product line
One of the leading R&D teams in the voice sector worldwide: over 100 technical specialists, scientists and software developers (including 25 PhDs)
Strong support and client relationship teams
R&D Cluster
STC R&D facility, Saint-Petersburg
STC built one of the strongest R&D teams in the industry
Global customer base in more than 60 countries
STC Mobile Applications
Voice Biometrics Mobile Applications
STC product range
Advantages
Speech is a natural means of communication
Authentication of a person without direct contact
No special equipment is required
Voice is a key that cannot be “lost” like a regular PIN or password
Fraud protection (biometric patterns cannot be stolen or forged)
Low deployment costs and fast return on investment
Voice Biometrics Mobile Applications
VoicePIN is a unique solution for user verification on mobile devices
VoicePIN protects the data on the mobile phone with a voice password
VoicePIN controls access to the mobile phone as a whole or to individual folders and services
Features
Language independent solution
High accuracy of verification
Works reliably in noisy environments (street, restaurant, etc.)
The password phrase to pronounce is shown on the screen
The algorithm adapts to the microphone characteristics of the device
Fallback to a regular password when voice verification is not possible (very noisy environment, a health problem affecting the user's speech)
Voice Biometrics Mobile Applications: VoicePIN
Voice Biometrics Mobile Applications: VoicePIN
VoicePIN
Technical characteristics
User verification takes 3-5 seconds
Password phrase is 3-5 seconds long
Registration process takes 1-2 minutes
Noise robust
Multiplatform application
VoiceCom Mobile is a command recognition solution for mobile devices. With VoiceCom, users can easily control their mobile phone and navigate through functions and applications using the most natural user interface: their voice.
Features
Easy hands-free operation
Fast and accurate recognition of pronounced commands
Quick access to any mobile application or function, including hidden ones
Noise robustness ensures reliable operation in any environment (street, car, crowded place)
Takes only a few minutes to set up
One-click launch
Language independence
Multiplatform application
Voice Biometrics Mobile Applications: VoiceCom Mobile
STC Mobile Applications
Noise Cancellation Mobile Application
Speech Technology Center product range
Denoiser Mobile is a noise cancellation solution that can reduce a wide range of noises in any audio file stored on a smartphone.
Applications
Recording phone conversations
Playing back conversations
Enhancing the quality of recorded conversations and other audio files
Noise cancellation mobile application
Noise cancellation mobile application
Denoiser Mobile
Features
Recording phone conversations and saving them to audio files
Built-in audio player
Automatic noise cancellation processing of recorded conversations
Option to switch sound enhancement on and off during playback, so the original and processed recordings can be compared
Noise suppression in any audio file saved on the mobile device
Choice of preset noise cancelling algorithms based on the type of recorded sound (music, speech)
Advanced settings with a choice of 10 presets based on the required level of sound enhancement
Manual filter settings for expert noise suppression
Platform-independent algorithms
STC Mobile Applications
Speech Recognition and Synthesis Mobile Application for Russian Language
Speech Technology Center product range
[Diagram labels: Mobile phone; Voice inquiry; Internet search engine; Information search and data communication]
Vocal Search Application for Mobile phones
An application that enables information search on the Internet by voice command
Vocal Search Application for Mobile phones
[Architecture diagram. Components: Mobile phone user; GSM/GPRS; Voice Navigator; MRCP server (media server); ASR (automatic speech recognition); TTS (text to speech); Search on the Map; Internet search engine; STC site; *.vxml speech recognition grammar; DB]
[Flow labels: Voice inquiry; Recognition of voice inquiry; Building the dictionary for speech recognition based on Internet search engine rating]
Vocal Search Application for Mobile phones
Voice search and management of information
Voice search for information on the Internet
Reading out dynamically changing information
GPRS navigation
Device messages
Synthesized reading of email messages, documents, e-books, SMS, web site contents, etc.
Mobile phone user
ASR and TTS use cases
Vocal Search Application for Mobile phones
Advantages
Highly scalable solution, supporting an unlimited number of recognition and synthesis requests.
The ASR and TTS modules account for the specifics of the Russian language; they were developed at Speech Technology Center by native-speaking specialists.
Fast adaptation of ASR and TTS algorithms to new requirements: adding new words to the synthesis and recognition dictionaries, adding new reading rules for certain words, etc.
Minimal processing time per request (synthesis and recognition), so the system can be used in real-time and online solutions.
VitalVoice Mobile
A speech synthesis application that reads aloud any text document or message in Russian on a mobile device. Synthesized speech can be saved as an MP3 or WAV file.
Minimum requirements:
Windows 5.0
Processor: 400 MHz
RAM: 64 MB
500 MB of flash memory (for one voice)
Speech Synthesis Mobile Application: VitalVoice Mobile
Large-scale Speech Solutions: Russian Language ASR and TTS
STC enterprise solutions
Russian language ASR and TTS: VoiceNavigator
[Diagram: VoiceNavigator deployment. Components: PSTN; PBX; IVR; VoiceNavigator; MRCP client; MRCP server; STC speech platform; ASR (Automatic Speech Recognition); TTS (Text to Speech); SIV (Speaker Identification and Verification)]
Russian language ASR and TTS: VoiceNavigator
Benefits of speech synthesis and recognition
Increasing call center efficiency
Lowering the stress load on first-level agents
Lowering the cost per minute of service
Increasing customer satisfaction
Building services unavailable with regular DTMF
Providing clients with dynamically changing information
Serving clients who do not have touch-tone phones
Natural communication between the user and the system
Image of the company as an innovative solution provider
Benefits of speech synthesis and recognition in IVR
Russian language ASR and TTS: VoiceNavigator
ASR and TTS: why are they convenient?
Receiving information about credits and deposits
Receiving information about rates and prices
Searching for addresses of ATMs and bank branches
Many ways to pronounce the name of the same object or location
Dynamically changing information
Building menus with multiple choice options
Automation of company operations
Call routing by employee name
Russian language ASR and TTS: VoiceNavigator
Synthesis was developed by native-speaking linguists
On-demand voice creation from the speaker’s phonograms
Unique linguistic processing (correct reading of abbreviations, abridgements, numbers, special characters, harmony of tenses, homonymy)
Why choose Russian speech synthesis by STC
Large-scale Speech Solutions: Russian Language Synthesis
STC enterprise solutions
Speech synthesis VitalVoice
Speech synthesis is the process of creating a voice signal from written text.
Speech Synthesis Methods:
Allophonic (based on phoneme combinations)
Unit Selection (based on selecting speech fragments from a database)
STC approach
Allophonic + Unit Selection methods = natural-sounding speech
Speech Technology Center's strong R&D team has developed a new speech synthesis method that combines the advantages and avoids the shortcomings of both methods, and has created VitalVoice, a unique Russian-language solution.
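To make the Unit Selection idea concrete, here is a minimal sketch of the generic textbook technique, not STC's proprietary method. It picks one recorded fragment per position so that the sum of a target cost (how well a fragment matches the wanted pitch) and a join cost (how smoothly neighbouring fragments connect) is minimal, using dynamic programming. The `Unit` fields, the pitch-only cost functions and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A hypothetical recorded speech fragment, described only by its pitch."""
    name: str
    start_pitch: float
    end_pitch: float


def target_cost(unit, wanted_pitch):
    # Distance between the fragment's mean pitch and the pitch we want here.
    return abs((unit.start_pitch + unit.end_pitch) / 2 - wanted_pitch)


def join_cost(left, right):
    # Pitch jump at the boundary between two consecutive fragments.
    return abs(left.end_pitch - right.start_pitch)


def select_units(candidates, wanted_pitches):
    """Return (best fragment sequence, total cost) via dynamic programming.

    candidates[i] is the list of fragments available for position i.
    """
    # best[i][j] = (cheapest cost of a path ending in candidates[i][j], backpointer)
    best = [[(target_cost(u, wanted_pitches[0]), None) for u in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for unit in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, unit), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(unit, wanted_pitches[i]), back))
        best.append(row)

    # Start from the cheapest final fragment and follow backpointers.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total


candidates = [
    [Unit("a1", 100, 110), Unit("a2", 100, 150)],
    [Unit("b1", 110, 120), Unit("b2", 150, 160)],
]
path, total = select_units(candidates, wanted_pitches=[105, 115])
print([u.name for u in path], total)
```

A production system would score many more features (duration, spectral shape, phonetic context) and search a far larger database; the cost structure is the part this sketch is meant to show.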
Speech synthesis VitalVoice
Speech Synthesis Technology
Speech formatting is done on 9 levels of text analysis, including significant parameters such as pauses, syntagms, the speaker's speech specifics, etc.
This analysis and segmentation of the text makes it possible to compose speech from correctly chosen and structured fragments
Speech synthesis VitalVoice
Advantages
Natural-sounding speech produced from any text
Accounts for the phonetic, morphological and grammatical specifics of the Russian language
Natural intonation cloning technology
Correct placement of word stress
Correct pronunciation of abbreviations, abridgements, numbers and special characters
Easy to implement and use
Support of standard data exchange protocols and markup languages (MRCP, SAPI, SSML)
8 different voices
On-demand creation of a unique voice
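As an illustration of the SSML support listed above, the sketch below builds a small SSML document with Python's standard library. It uses two elements from the W3C SSML specification: `sub`, which supplies the spoken form of an abbreviation, and `break`, which inserts a pause. The function name and sample text are hypothetical, and which SSML elements VitalVoice actually accepts is not stated in this deck.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # W3C SSML namespace


def build_ssml(before, abbreviation, spoken_form, after, lang="ru-RU"):
    """Return an SSML string: text, an expanded abbreviation, a pause, more text."""
    speak = ET.Element(
        "speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang}
    )
    speak.text = before
    # <sub alias="..."> tells the synthesizer how to read the abbreviation.
    sub = ET.SubElement(speak, "sub", {"alias": spoken_form})
    sub.text = abbreviation
    sub.tail = " "
    # <break time="..."/> requests a pause of the given length.
    pause = ET.SubElement(speak, "break", {"time": "300ms"})
    pause.tail = after
    return ET.tostring(speak, encoding="unicode")


print(build_ssml("Welcome to ", "STC", "Speech Technology Center", "How can we help?"))
```

The resulting string would typically be passed to a synthesizer as the body of a synthesis request, for example over MRCP.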
Large-scale Speech Solutions: Voice Biometrics Solution
STC enterprise solutions
Voice Biometrics Solution: VoiceKey Service
VoiceKey
VoiceKey is a biometric engine for verification and identification of a user based on the user's unique voice characteristics. It is a proprietary technology designed, developed and patented by Speech Technology Center.
Features
Easy registration process
Fast verification: 4-5 seconds to identify a person
Low error rate: EER (equal error rate) is about 3.5%
More than 20 unique voice characteristics are extracted and analyzed
Language and accent independent
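The error figure quoted in the feature list is read here as an equal error rate (EER), which is an assumption about the slide's abbreviation. EER is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The sketch below estimates it from two lists of verification scores. The function name and the sample scores are invented for illustration and are not STC data.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate (EER, threshold) from verification scores.

    A trial is accepted when its score is >= threshold. The function scans
    every observed score as a candidate threshold and returns the point
    where false acceptance and false rejection rates are closest.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap between FAR and FRR, EER estimate, threshold)
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # False rejection: genuine speakers scoring below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # False acceptance: impostors scoring at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Invented scores: higher means "more likely the claimed speaker".
genuine = [0.9, 0.8, 0.7, 0.6]
impostor = [0.1, 0.2, 0.3, 0.65]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

An EER of about 3.5% would mean that, at the balanced threshold, roughly 35 in 1,000 impostor attempts are accepted and roughly 35 in 1,000 genuine attempts are rejected.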
Voice Biometrics Solution: VoiceKey Service
VoiceKey Service (VKS) is a software module based on the VoiceKey engine, intended for biometric verification and identification of users by their voices.
VKS is specifically designed to control user access to corporate materials, personal financial, confidential and valuable information, and virtual resources and services using a voice password phrase.
Voice Biometrics Solution: VoiceKey Service
Accurate and reliable verification/identification
Easy, intuitive use and management; no end-user training required
Fast enrolment and verification/identification processes
No need to remember account numbers, card numbers or passwords: your voice is the key
Language independence
Noise robust. Uses STC’s proprietary algorithm.
Fraud security (biometric patterns cannot be stolen)
Compliance with international standards and data exchange formats
Highly scalable and flexible
Easy to implement
Low deployment costs and fast return on investment
VoiceKey Service advantages
Automated media monitoring solution: Jingle Tracker
STC enterprise solutions
Purpose
Automatic monitoring of streamed media content (TV and radio broadcasts)
Automatic monitoring of regional broadcasting against the central channel schedule
Monitoring of media channels to reveal unapproved use of musical content or other copyrighted material
Search for audio samples in media archives (TV, radio, Internet)
Advantages
Fully automatic search
High reliability
Service scalability: works with a large number of samples and channels as well as single objects
High productivity
Capability for simultaneous search of up to 15,000 messages
Automated media monitoring solution: Jingle Tracker
Automated voice-based identification solutions: VoiceNet and Trawl series
STC expert solutions
Voice-based identification solutions
Trawl Lab: Express identification of speakers under investigation
Trawl X: Express (real-time) search of audio files containing the speech of a target speaker
VoiceNet: Express investigation of speech recordings and voice database management system
VoiceNet SDK: SDK providing automated biometric express investigation of speech recordings
VoiceNet ID: Large-scale speaker identification and voice database management solution
STC voice-based identification solutions benefits
Easy to use (high level of automation)
Rapid search (up to 1500 phonograms per second)
Language and accent independent
High reliability
Voice-based identification solutions
Thank you!
WWW.SPEECHPRO.COM
4 Krasutskogo st., Saint Petersburg, RUSSIA
tel.: 812 325-8848
fax: 812 327-9297