
The Neurophysiology of Speech


An introduction to the biology and neurophysiology of human speech. The target audience is researchers and engineers working on speech recognition technology.


Page 1: The Neurophysiology of Speech

Neurophysiology of Speech

T.S. Yo

Page 2: The Neurophysiology of Speech

References

Carlson, N. R. (1998). Audition, the body senses, and the chemical senses. In Physiology of Behavior, 6th ed., pp. 185-223.

Carlson, N. R. (1998). Human communication. In Physiology of Behavior, 6th ed., pp. 477-508.

Bookheimer, S. (2002). Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci., pp. 151-188.

Friederici, A. D. and Alter, K. (2004). Lateralization of auditory language functions: A dynamic dual pathway model. Brain and Language, 89, 267–276.

Page 3: The Neurophysiology of Speech

Outline

● Auditory apparatus
● MFCC
● Lesion studies
● Neuroimaging
● Dynamic dual pathway model
● Can we design ASR systems by mimicking organic systems?

Page 4: The Neurophysiology of Speech

Auditory system

Figure labels: pinna (auricle), tympanic membrane (eardrum), malleus, incus, stapes, Eustachian tube, cochlea, vestibule.

Page 5: The Neurophysiology of Speech

Cochlea

Page 6: The Neurophysiology of Speech

Cochlea (2)

Page 7: The Neurophysiology of Speech

Auditory Pathway

Page 8: The Neurophysiology of Speech

Detecting Acoustic Features

● Pitch
  – High frequencies: place coding
  – Low frequencies: rate coding
● Loudness
  – Firing rate of cochlear nerve fibers
● Timbre
  – Waveform decomposition
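The "waveform decomposition" behind timbre can be illustrated with a Fourier analysis of a complex tone. The sketch below is only an analogy for what the auditory system does, not a model of it; the sample rate, the 200 Hz fundamental, and the harmonic amplitudes are all made-up values chosen so that each component falls exactly on one frequency bin.

```python
import numpy as np

sample_rate = 8000                      # Hz (assumed for illustration)
t = np.arange(sample_rate) / sample_rate  # one second of samples

# Hypothetical complex tone: a 200 Hz fundamental plus two weaker harmonics.
components = {200: 1.0, 400: 0.5, 600: 0.25}   # frequency (Hz) -> amplitude
tone = sum(a * np.sin(2 * np.pi * f * t) for f, a in components.items())

# Decompose the waveform into its sinusoidal components.
spectrum = np.abs(np.fft.rfft(tone)) * 2 / len(tone)   # scale to amplitudes
freqs = np.fft.rfftfreq(len(tone), d=1 / sample_rate)

# Keep the three strongest components and report them in frequency order.
strongest = np.argsort(spectrum)[-3:]
recovered = {int(round(freqs[i])): round(float(spectrum[i]), 2)
             for i in sorted(strongest)}
print(recovered)   # {200: 1.0, 400: 0.5, 600: 0.25}
```

Two tones with the same fundamental but different harmonic amplitudes would have the same pitch and a different timbre; in this sketch that difference shows up only in the values of `recovered`.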

Page 9: The Neurophysiology of Speech

Localization with Neural Circuits

Page 10: The Neurophysiology of Speech

Localization with Neural Circuits (2)

Page 11: The Neurophysiology of Speech

Vestibular System

Page 12: The Neurophysiology of Speech

MFCC

● Mel Frequency Cepstral Coefficients (MFCCs)
  – Take the Fourier transform of a (windowed) signal.
  – Map the power spectrum obtained above onto the mel scale, using triangular overlapping windows, and take the log of each mel-band energy.
  – Take the Discrete Cosine Transform of the list of mel log-energies, as if it were a signal.
  – The MFCCs are the amplitudes of the resulting spectrum.
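The steps above can be sketched for a single frame with NumPy alone. This is a minimal illustration under stated assumptions, not a reference implementation: the mel formula (2595 log10(1 + f/700)) is one common variant, the log is taken after the mel filterbank as is conventional, and the function names, the Hamming window, the filter count (26), and the coefficient count (13) are all choices made here for illustration.

```python
import numpy as np

def hz_to_mel(f_hz):
    # One common mel-scale formula; other variants exist.
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular overlapping windows, evenly spaced on the mel scale.
    n_bins = n_fft // 2 + 1
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0),
                                     n_filters + 2))
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, center, right = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii(x):
    # Unnormalized type-II Discrete Cosine Transform.
    n = len(x)
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * j + 1) / (2 * n)) @ x

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2               # Fourier transform
    mel_energies = mel_filterbank(n_filters, len(frame), sample_rate) @ power
    log_mel = np.log(mel_energies + 1e-10)                   # log of mel energies
    return dct_ii(log_mel)[:n_coeffs]                        # DCT -> MFCCs

# Usage: one 25 ms frame of a synthetic 440 Hz tone sampled at 16 kHz.
sr = 16000
frame = np.sin(2 * np.pi * 440 * np.arange(400) / sr)
coeffs = mfcc_frame(frame, sr)
print(coeffs.shape)   # (13,)
```

A real front end would slide this over overlapping frames and usually add pre-emphasis and delta features; those are left out to keep the correspondence with the four steps visible.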

Page 13: The Neurophysiology of Speech

From the ears to the brain

● Ear
  – Spectral signals.
  – Fourier-like frequency analysis done by the cochlea and neural circuits.
● Brain
  – Two pathways in the two hemispheres
  – Left: semantics and syntax
  – Right: prosody

Page 14: The Neurophysiology of Speech

Brain Mechanisms for Language

● From lesion studies to neuroimaging
● Localization of functions
● Lateralization
● Speech production and comprehension
● Prosody

Page 15: The Neurophysiology of Speech

Lesion Studies

● Aphasia
  – Difficulty in producing or comprehending speech caused by brain damage.
● Broca's aphasia
  – agrammatism
  – anomia
● Wernicke's aphasia
  – poor speech comprehension

Page 16: The Neurophysiology of Speech

Broca's Aphasia

● Agrammatism:
  – difficulty in understanding or using grammar.
● Anomia:
  – difficulty in finding the appropriate word to describe an object, action, or attribute.
● Apraxia of speech:
  – impairment in the ability to program the movements of the tongue, lips, and throat required to produce the proper sequence of speech sounds.

Page 17: The Neurophysiology of Speech

Broca's Aphasia Example

● "Yes ... Monday ... Dad, and Dad ... hospital, and ... Wednesday, Wednesday, nine o'clock and ... Thursday, ten o'clock ... doctors, two, two ... doctors and ... teeth, yah."

Page 18: The Neurophysiology of Speech

Broca's Aphasia

Page 19: The Neurophysiology of Speech

Wernicke's Aphasia

● Poor speech comprehension
● Fluent but meaningless speech
● Pure word deafness:
  – The ability to hear, to speak, and to read and write without being able to comprehend the meaning of speech.

Page 20: The Neurophysiology of Speech

Wernicke's Aphasia Example

● Examiner: What kind of work have you done?
● Patient: We, the kids, all of us, and I, we were working for a long time in the ... you know ... it's the kind of space, I mean place rear to the spedawn ...
● Examiner: Excuse me, but I wanted to know what work you have been doing.
● Patient: If you had said that, we had said that, poomer, near the fortunate, porpunate, tamppoo, all around the fourth of martz. Oh, I get all confused.

Page 21: The Neurophysiology of Speech

Wernicke's Aphasia

Page 22: The Neurophysiology of Speech

Neuroimaging Studies

● Neuroimaging
  – Functional magnetic resonance imaging (fMRI)
  – Positron emission tomography (PET)
● Subjects are asked to perform cognitive tasks while their brains are being scanned.

Page 23: The Neurophysiology of Speech

Neuroimaging

● fMRI
● PET

Page 24: The Neurophysiology of Speech

Normalizing Neuroimages

● Talairach coordinate space
  – Origin: anterior commissure
  – X: [-65, +65] mm
  – Y: [-90, +70] mm
  – Z: [-40, +65] mm
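As a small illustration of what a normalized coordinate space buys you, the ranges on this slide can be treated as a bounding box that any reported activation coordinate should fall inside. The sketch below assumes the values are millimetres from the anterior commissure and uses the usual Talairach axis convention; the function name and the test points are hypothetical.

```python
# Bounding ranges from the slide, assumed to be in millimetres from the
# anterior commissure. Usual Talairach axis convention:
#   x: left (-) to right (+)
#   y: posterior (-) to anterior (+)
#   z: inferior (-) to superior (+)
TALAIRACH_RANGES_MM = {"x": (-65, 65), "y": (-90, 70), "z": (-40, 65)}

def in_talairach_range(x, y, z):
    """Return True if (x, y, z) lies inside the slide's bounding ranges."""
    point = {"x": x, "y": y, "z": z}
    return all(lo <= point[axis] <= hi
               for axis, (lo, hi) in TALAIRACH_RANGES_MM.items())

# Hypothetical coordinates, chosen only to exercise the check.
print(in_talairach_range(0, 0, 0))      # True: the origin itself
print(in_talairach_range(-70, 10, 5))   # False: x is outside [-65, +65]
```

Because every subject's image is warped into the same box, a coordinate triple reported in one study can be compared directly with one from another study.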

Page 25: The Neurophysiology of Speech

Semantic Conditions

● Same
  – The lawyer questioned the witness.
  – The attorney questioned the witness.
● Different
  – The man was attacked by the doberman.
  – The man was attacked by the pitbull.

Page 26: The Neurophysiology of Speech

Syntactic Conditions

● Same
  – The policeman arrested the thief.
  – The thief was arrested by the policeman.
● Different
  – The teacher was outsmarted by the student.
  – The teacher outsmarted the student.

Page 27: The Neurophysiology of Speech

Summary by Bookheimer, 2002

● The role of the left inferior frontal lobe in semantic processing and dissociations from other frontal lobe language functions.

● The organization of categories of objects and concepts in the temporal lobe.

● The role of the right hemisphere in comprehending contextual and figurative meaning.

Page 28: The Neurophysiology of Speech

Overview by Ahrens, 2007

● Past
  – Functional localization (brain damage)
● Present
  – Narrower localization + discussion of overlap and integration (neuroimaging techniques)
● Future
  – Language as a brain function (integrate knowledge about timing, context, and individual differences)

Page 29: The Neurophysiology of Speech

The Three Myths

● Myth 1: Broca's area deals with syntax/production
  – Fact: Semantics and phonology cluster in different areas of the inferior frontal gyrus (IFG); syntax seems to be distributed throughout the IFG.
  – Fact: The IFG is activated during non-language tasks.
● Myth 2: Wernicke's area deals with semantics/comprehension
  – Fact: There are functional subdivisions for language in the posterior temporal area.

Page 30: The Neurophysiology of Speech

The Three Myths (2)

● Myth 3: The right hemisphere is not used when processing language
  – Fact: The right hemisphere is called upon for many integrative language processes.
    > Figurative language and metaphor
    > Linguistic context
    > Prosody

Page 31: The Neurophysiology of Speech

Summary of Neuroimaging Studies

Page 32: The Neurophysiology of Speech

Dynamic Dual Pathway Model

● Spoken language comprehension requires the coordination of different subprocesses in time.

● Segmental information:
  – phonemes, syntactic elements, and lexical-semantic elements.
● Suprasegmental information:
  – accentuation and intonational phrases, i.e., prosody.

Page 33: The Neurophysiology of Speech

Localization of Different Subsystems

● Segmental information:
  – syntactic and semantic information are primarily processed in a left-hemispheric temporo-frontal pathway, which includes separate circuits for syntactic and semantic information.
● Suprasegmental information:
  – sentence-level prosody is processed in a right-hemispheric temporo-frontal pathway.

Page 34: The Neurophysiology of Speech

Dynamic Interaction

● Corpus callosum

Page 35: The Neurophysiology of Speech

Can we design ASR systems by imitating the brain?

● An open question
  – Is it possible? Is it more effective?
● Complexity
  – Basic computation rate of a neuron: about 60 Hz
  – 10^8 inputs, 10^10 neurons in the brain, each with >8000 connections
● Training time
  – How long does it take a human being to learn to understand language?
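The complexity figures on this slide can be combined into a rough order-of-magnitude estimate. The sketch below uses only the slide's own numbers (10^10 neurons, more than 8000 connections each, about 60 Hz); counting one "synaptic event" per connection per firing cycle is an assumption made here for the sake of the arithmetic, not a claim from the slides.

```python
# Figures taken from the slide.
neurons = 10**10
connections_per_neuron = 8000   # the slide says ">8000", so this is a lower bound
firing_rate_hz = 60

# Assumption: each connection carries one event per firing cycle.
synapses = neurons * connections_per_neuron
events_per_second = synapses * firing_rate_hz

print(f"{synapses:.1e} connections, "
      f"{events_per_second:.1e} synaptic events per second")
# 8.0e+13 connections, 4.8e+15 synaptic events per second
```

Even as a crude lower bound, this suggests why imitating the brain connection by connection is a question of scale as much as of architecture.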

Page 36: The Neurophysiology of Speech

Some factors in human neural system