Speech Technology Part I : Automatic Speech Recognition Rajesh M. Hegde [email protected] Associate Professor Dept. of EE Indian Institute of Technology Kanpur Several pictures used in this presentation have been collected from various sources available on the web and have been acknowledged in the slides.

Speech Technology Part I : Automatic Speech Recognition

Page 1: Speech Technology Part I : Automatic Speech Recognition

Speech Technology Part I : Automatic Speech Recognition

Rajesh M. Hegde [email protected]

Associate Professor Dept. of EE

Indian Institute of Technology Kanpur

Several pictures used in this presentation have been collected from various sources available on the web and have been acknowledged in the slides.

Page 2: Speech Technology Part I : Automatic Speech Recognition

Topics Covered

• What is automatic speech recognition (ASR)?

• What are the challenges in implementing ASR systems on a mobile phone?

• How can speech technology be used for developing applications on a mobile phone?

Page 3: Speech Technology Part I : Automatic Speech Recognition

Broad Objectives of Speech Recognition for Machines

Speech to Text (ASR)

Source : Reynolds et al.

Page 4: Speech Technology Part I : Automatic Speech Recognition

Speech Recognition for Mobile Phones

• Speech recognition converts a speech signal, acquired by a mobile phone, to a sequence of words.

• The recognition output can be used in command and control, email, search, and communication.

• This output can also be used in dialog management and natural language understanding.

• What you can do with it : Dictation, Call routing, Directory assistance, Travel planning, and Logistics.
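
The conversion from a speech signal to a sequence of words is commonly posed as a statistical decoding problem. A sketch under that assumption follows; the symbols X (the acoustic observations) and W (a candidate word sequence) are introduced here for illustration and do not appear on the slides.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid X) \;=\; \arg\max_{W} P(X \mid W)\, P(W)
```

In this formulation P(X | W) comes from the acoustic model and P(W) from the language model, and the maximization over W is the search problem.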

Page 5: Speech Technology Part I : Automatic Speech Recognition

Overview of the Automatic Speech Recognition (ASR) Technology

Open Source Tools : HTK and CMU Sphinx

Source : Google Image Search

Page 6: Speech Technology Part I : Automatic Speech Recognition

Popular Commercial Applications : Siri, Google Voice

Source : Google, Apple

Page 7: Speech Technology Part I : Automatic Speech Recognition

Client and Server Based Speech Recognition on the Mobile Phone

Source : Pearce et al., ETSI

Speech Recognition at the Client Mobile Phone

Server based Speech Recognition on the Mobile Phone

Page 8: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phone

• Memory Crunching

• Computational Complexity

• Power Requirement

Page 9: Speech Technology Part I : Automatic Speech Recognition

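ASR Issues on Mobile Phones : Search Complexity

[Figure: the utterance “Their Car” written as the phone sequence DH EH R [word] K AA R; the search begins with the first phone, scored as P(“DH”).]

Source : Slides from Krishna et al., U Michigan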

Page 10: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph containing the single phone node DH, for the input phone sequence DH EH R [word] K AA R.]

Source : Slides from Krishna et al., U Michigan

Page 11: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph containing the single phone node DH, for the input phone sequence DH EH R [word] K AA R.]

Source : Slides from Krishna et al., U Michigan

Page 12: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph for the input phone sequence DH EH R [word] K AA R. Phone nodes: DH, EH, R, AH, AX, IH, IY, [word]. Word hypotheses: “The”, “Ear”, “Their”.]

Source : Slides from Krishna et al., U Michigan

Page 13: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph for the input phone sequence DH EH R [word] K AA R. Phone nodes: DH, EH, R, AH, AX, IH, IY, [word]. Word hypotheses: “The”, “Ear”, “Their”.]

Source : Slides from Krishna et al., U Michigan

Page 14: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph for the input phone sequence DH EH R [word] K AA R. Phone nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, [word], P, AE, T. Word hypotheses: “Their”, “The”, “Ear”, “Car”, “Cat”, “Cap”.]

Source : Slides from Krishna et al., U Michigan

Page 15: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph for the input phone sequence DH EH R [word] K AA R. Phone nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, P, AE, T, NH, F, S, N, L, EH, OY.]

Source : Slides from Krishna et al., U Michigan

Page 16: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph for the input phone sequence DH EH R [word] K AA R. Phone nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, P, AE, T, NH, F, S, N, L, EH, OY, TH, SH, T, IY, OW, G.]

Source : Slides from Krishna et al., U Michigan

Page 17: Speech Technology Part I : Automatic Speech Recognition

ASR Issues on Mobile Phones : Search Complexity

[Figure: search graph for the input phone sequence DH EH R [word] K AA R. Phone nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, P, AE, T, NH, F, S, N, L, EH, OY, TH, SH, T, IY, OW, G, DK, CH, ER, IY, IH, F, K, DUH, OW, IH, Z, OW, JH, V, ZH, AX, G, SH, GH.]

Source : Slides from Krishna et al., U Michigan
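
The search-complexity slides show the decoder's graph growing as phones are matched against words. The sketch below is a minimal illustration of that idea, not the method used in the cited slides: it stores a small pronunciation lexicon as a phone prefix tree and enumerates the word sequences that cover a given phone string. The lexicon entries, the function names, and the assumption that the phone string is already known exactly are all hypothetical; a real decoder works with uncertain acoustic scores instead.

```python
# Hypothetical pronunciation lexicon, loosely based on the words on the slides.
LEXICON = {
    "the": [["DH", "AH"], ["DH", "AX"]],
    "their": [["DH", "EH", "R"]],
    "ear": [["IY", "R"]],
    "car": [["K", "AA", "R"]],
    "cat": [["K", "AE", "T"]],
    "cap": [["K", "AE", "P"]],
}


def build_prefix_tree(lexicon):
    """Build a phone prefix tree; words with common initial phones share nodes."""
    root = {"children": {}, "words": []}
    for word, pronunciations in lexicon.items():
        for pronunciation in pronunciations:
            node = root
            for phone in pronunciation:
                node = node["children"].setdefault(
                    phone, {"children": {}, "words": []}
                )
            node["words"].append(word)
    return root


def match_words(tree, phones, start):
    """Return (word, end_index) for each word whose phones begin at `start`."""
    matches = []
    node = tree
    i = start
    while i < len(phones) and phones[i] in node["children"]:
        node = node["children"][phones[i]]
        i += 1
        for word in node["words"]:
            matches.append((word, i))
    return matches


def segmentations(tree, phones, start=0):
    """Enumerate every word sequence that exactly covers the phone string."""
    if start == len(phones):
        return [[]]
    results = []
    for word, end in match_words(tree, phones, start):
        for rest in segmentations(tree, phones, end):
            results.append([word] + rest)
    return results


def count_nodes(node):
    """Count tree nodes, including the root."""
    return 1 + sum(count_nodes(child) for child in node["children"].values())


tree = build_prefix_tree(LEXICON)
phones = ["DH", "EH", "R", "K", "AA", "R"]
hypotheses = segmentations(tree, phones)
print(hypotheses)
print(count_nodes(tree))
```

Sharing prefixes keeps the tree smaller than a flat list of pronunciations, but every extra word or pronunciation variant still adds branches the search has to consider.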

Page 18: Speech Technology Part I : Automatic Speech Recognition

SEARCH – Computing Requirements on the Mobile Phone

1. Search

  • Search takes roughly 50% of the total speech recognition time
  • Even more for large vocabulary recognition
  • Considerably less for small vocabulary tasks

2. Solutions

  • Network optimization
  • Efficient search techniques
  • Pruning methods
      i) Look-ahead based strategy
      ii) Pruning threshold dependent on the grammar
  • Multi-pass methods
      i) A fast first pass to produce a short list of candidates or a lattice, followed by second-pass rescoring with larger acoustic and language models

Source : Rose et al.
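
Pruning and multi-pass decoding, as listed on this slide, can be sketched together in a few lines. The example below is a toy illustration under stated assumptions, not the method from the cited source. The first pass extends hypotheses step by step and drops those whose score falls too far below the best one (beam pruning). The second pass rescores the surviving short list with a stand-in for a larger language model. All scores, the beam width, the bigram table, and the function names are hypothetical.

```python
def beam_prune(hyps, beam_width):
    """Keep hypotheses whose log score is within beam_width of the best."""
    best = max(score for _, score in hyps)
    return [(words, score) for words, score in hyps if score >= best - beam_width]


def first_pass(step_candidates, beam_width):
    """Extend partial hypotheses one word at a time, pruning after each step."""
    hyps = [((), 0.0)]
    for candidates in step_candidates:
        extended = [
            (words + (word,), score + word_score)
            for words, score in hyps
            for word, word_score in candidates
        ]
        hyps = beam_prune(extended, beam_width)
    return hyps


def second_pass(short_list, lm_logprob, lm_weight=1.0):
    """Rescore the short list with a larger model and return the best entry."""
    rescored = [
        (words, score + lm_weight * lm_logprob(words)) for words, score in short_list
    ]
    return max(rescored, key=lambda hyp: hyp[1])


# Hypothetical first-pass log scores for two word positions.
STEP_CANDIDATES = [
    [("the", -1.0), ("their", -1.2), ("ear", -4.0)],
    [("car", -0.5), ("cat", -0.9), ("cap", -3.5)],
]

# Hypothetical bigram log probabilities standing in for a larger language model.
BIGRAM_LOGPROB = {
    ("their", "car"): -0.2,
    ("the", "car"): -0.7,
    ("the", "cat"): -1.0,
    ("their", "cat"): -1.5,
}


def toy_lm(words):
    """Look up the toy bigram score; unseen pairs get a low default."""
    return BIGRAM_LOGPROB.get(tuple(words), -5.0)


short_list = first_pass(STEP_CANDIDATES, beam_width=2.0)
best_words, best_score = second_pass(short_list, toy_lm)
print(len(short_list), best_words)
```

With these toy numbers the beam keeps 4 of the 9 possible two-word hypotheses, and the second pass changes the top choice from "the car" to "their car".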

Page 19: Speech Technology Part I : Automatic Speech Recognition

Speech Recognition Based Access of Agrocommodity Prices in Hindi for Uttar Pradesh

Sponsored by DeitY, Govt. of India

Page 20: Speech Technology Part I : Automatic Speech Recognition

Questions?

[email protected]

URL : http://202.3.77.107/mips/