Beyond the segment Prosody and intonation
English: Speech is divided into phrases. Word stress is meaningful in English. Stressed syllables are aligned in a fairly regular rhythm, while unstressed syllables take very little time.
Every phrase has a focus. An extended flat or low-rising intonation at the end of a phrase can indicate that a speaker intends to continue to speak.
A falling intonation sounds more final.
Beyond the segment Prosodic factors (suprasegmentals)
Stress: emphasis on syllables in sentences
Rate: speed of articulation
Intonation: use of pitch to signify different meanings across sentences
Beyond the segment Stress effects
On meaning: “black bird” versus “blackbird”
Top-down effects on perception: better anticipation of upcoming segments when the syllable is stressed
Beyond the segment Rate effects
How fast you speak has an impact on the speech sounds
Faster talking: shorter vowels, shorter VOT (voice onset time)
Normalization: taking speed and speaker information into account
Rate normalization
Speaker normalization
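As a rough illustration of rate normalization, the sketch below judges a stop consonant by its VOT relative to the duration of the syllable rather than by its absolute VOT. The function name `classify_stop`, the ratio boundary of 0.12, and the durations are all hypothetical values chosen for illustration, not figures from the lecture.

```python
def classify_stop(vot_ms, syllable_ms, ratio_boundary=0.12):
    """Label a stop as voiced or voiceless from VOT relative to syllable duration.

    ratio_boundary is a made-up illustrative value, not an empirical estimate.
    """
    ratio = vot_ms / syllable_ms
    return "voiceless" if ratio > ratio_boundary else "voiced"

# The same absolute VOT (30 ms) is interpreted differently at different rates.
slow = classify_stop(vot_ms=30, syllable_ms=400)  # slow speech, long syllable
fast = classify_stop(vot_ms=30, syllable_ms=150)  # fast speech, short syllable
print(slow, fast)
```

The point of the design is only that the decision depends on a rate-scaled quantity, so a listener who takes speaking rate into account can treat the same acoustic value as different categories.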
Visual language
Why so much research using visual language?
We do use it
Easy to use in research
The parts: letters, words, eye movements (next lecture)
Visual perception of language
Same object category (‘e’) may have different shapes, sizes, and orientations
[Figure: the letter E shown in many different shapes, sizes, and orientations]
Perhaps the brain is able to represent these objects in a way that is “translationally invariant” and “size invariant”.
Invariance a problem in vision too?
Letter Recognition
How do we recognize a group of lines and curves as letters?
Two common explanations: template matching and feature detection
Template matching
Store in the brain a copy of what every possible input will look like.
Match the observed object to the proper image in memory.
Costly: think of all the possible fonts, handwriting styles, etc.
Normalization before matching
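A minimal sketch of template matching, assuming letters are stored as tiny binary bitmaps and that the best match is the stored template agreeing with the input on the most cells. The 5x5 bitmaps, the three-letter template store, and the function names are hypothetical and only meant to make the idea concrete.

```python
# Hypothetical 5x5 binary bitmaps; "1" = ink, "0" = blank.
TEMPLATES = {
    "E": ["11100", "10000", "11100", "10000", "11100"],
    "F": ["11100", "10000", "11100", "10000", "10000"],
    "L": ["10000", "10000", "10000", "10000", "11100"],
}

def overlap(a, b):
    """Count the cells on which two same-sized bitmaps agree."""
    return sum(ca == cb for row_a, row_b in zip(a, b) for ca, cb in zip(row_a, row_b))

def match_template(image, templates=TEMPLATES):
    """Return the label of the stored template that agrees with the image on the most cells."""
    return max(templates, key=lambda label: overlap(image, templates[label]))

exact_e = ["11100", "10000", "11100", "10000", "11100"]
# The same E, shifted two columns to the right.
shifted_e = ["00111", "00100", "00111", "00100", "00111"]

print(match_template(exact_e))
print(overlap(exact_e, TEMPLATES["E"]), overlap(shifted_e, TEMPLATES["E"]))
```

An input that coincides with the stored template matches perfectly, but the same shape in a different position agrees with that template on far fewer cells. That is why a pure template account needs either a separate template per view or a normalization step before matching.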
Problems with template matching
Massive numbers of templates are required (remember all those E’s?).
Predicts no transfer to novel views of the same object
Objects are often obstructed/occluded
Feature detection
Analysis-by-synthesis
1. Letter broken down into its constituent parts
2. List of parts compared to patterns in memory
3. Best matching pattern chosen
A fixed set of elementary properties is analyzed independently and in parallel across the visual field.
Possible examples: line orientations (e.g., +45 deg., -10 deg.), different sizes, curvature, free line endings, colors
Feature detection
Perceptual representation: 3 horizontal lines, 1 vertical line, 4 right angles
Memory representation:
E: 3 horizontal lines, 1 vertical line, 4 right angles
F: 2 horizontal lines, 1 vertical line, 3 right angles
A simple theory of Feature detection
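The comparison between a perceptual feature list and the feature lists stored in memory can be sketched as follows. The feature counts for E and F are the ones given on the slide. The mismatch measure (sum of absolute count differences) and the function names are assumptions made for this sketch.

```python
from collections import Counter

# Memory representations, using the feature counts listed for E and F.
MEMORY = {
    "E": Counter(horizontal=3, vertical=1, right_angle=4),
    "F": Counter(horizontal=2, vertical=1, right_angle=3),
}

def mismatch(observed, stored):
    """Sum of absolute differences in feature counts (an assumed distance measure)."""
    keys = set(observed) | set(stored)
    return sum(abs(observed[k] - stored[k]) for k in keys)

def recognize(observed, memory=MEMORY):
    """Return the stored letter whose feature list best matches the observed one."""
    return min(memory, key=lambda letter: mismatch(observed, memory[letter]))

percept = Counter(horizontal=3, vertical=1, right_angle=4)
print(recognize(percept))
```

Because the comparison is over lists of parts rather than whole images, the same stored entry can serve inputs of different sizes or positions, provided the feature analysis itself is unaffected by them.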
Evidence for Features:
The visual search task is straightforward: you are given some target to look for, and asked to simply decide, as quickly as possible, whether the target is present or absent in a set of objects.
For example, let’s try a few searches to give you a feel for this.
Search 1 - Is there an O present in the following displays?
Is an O present?
[Search displays: an O among T distractors, with small and large numbers of items, including a display with no O]
[Search displays: an O among Q distractors, with small and large numbers of items]
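One way to think about why these two searches might differ on a feature account is to compare the features of the target with those of the distractors. The feature inventories below are hypothetical, and the suggestion that more shared features makes the search harder is an assumption of this sketch rather than a result stated above.

```python
# Hypothetical feature inventories for the characters used in the displays.
FEATURES = {
    "O": {"closed_curve"},
    "T": {"horizontal_line", "vertical_line"},
    "Q": {"closed_curve", "oblique_line"},
}

def shared_features(target, distractor, features=FEATURES):
    """Features that the target and the distractor have in common."""
    return features[target] & features[distractor]

def target_only_features(target, distractor, features=FEATURES):
    """Features that the target has and the distractor lacks."""
    return features[target] - features[distractor]

for distractor in ("T", "Q"):
    print(distractor,
          sorted(shared_features("O", distractor)),
          sorted(target_only_features("O", distractor)))
```

Under these assumed inventories, the O has a feature that no T has, whereas every feature of the O is also present in a Q, so no single feature detector separates the target from the Q distractors.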
Interactive Activation Model (IAM)
McClelland and Rumelhart (1981)
Nodes:
• (visual) feature detectors
• (positional) letter detectors
• word detectors
• Inhibitory and excitatory connections between them.
Previous models posited a bottom-up flow of information (from features to letters to words).
The IAM also posits a top-down flow of information.
Inhibitory connections within levels
If the first letter of a word is “a”, it isn’t “b” or “c” or …
Inhibitory and excitatory connections between levels (bottom-up and top-down)
If the first letter is “a”, the word could be “apple” or “ant” or …, but not “book” or “church” or …
If there is growing evidence that the word is “apple”, that evidence confirms that the first letter is “a”, and not “b” …
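The interplay of bottom-up excitation, top-down excitation, and within-level inhibition can be sketched with a toy network. This is a heavily simplified illustration, not the McClelland and Rumelhart implementation: the three-word lexicon, the update rule, and the parameter values (`top_down`, `inhibition`, `steps`) are all assumptions, and the feature level is omitted.

```python
WORDS = ["ant", "art", "bat"]  # hypothetical three-word lexicon

def clamp(x):
    """Keep an activation inside [0, 1]."""
    return max(0.0, min(1.0, x))

def run_iam(evidence, words=WORDS, steps=5, top_down=0.5, inhibition=0.2):
    """Run a few update cycles of a toy letter/word network.

    evidence: one dict per letter position, mapping letter -> bottom-up support in [0, 1].
    Returns (letter activations per position, word activations).
    """
    letters = [dict(pos) for pos in evidence]
    word_act = {w: 0.0 for w in words}
    for _ in range(steps):
        # Bottom-up excitation: a word is supported by its letters in each position.
        raw = {w: sum(letters[i].get(ch, 0.0) for i, ch in enumerate(w)) / len(w)
               for w in words}
        total = sum(raw.values())
        # Within-level inhibition: each word is suppressed by its competitors.
        word_act = {w: clamp(raw[w] - inhibition * (total - raw[w])) for w in words}
        new_letters = []
        for i, pos in enumerate(evidence):
            # Top-down excitation: words feed activation back to their own letters.
            boosted = {ch: support + top_down * sum(a for w, a in word_act.items() if w[i] == ch)
                       for ch, support in pos.items()}
            pos_total = sum(boosted.values())
            # Within-level inhibition between competing letters in the same position.
            new_letters.append({ch: clamp(v - inhibition * (pos_total - v))
                                for ch, v in boosted.items()})
        letters = new_letters
    return letters, word_act

# The first letter is ambiguous between "a" and "b"; the other letters are clear.
evidence = [{"a": 0.5, "b": 0.5}, {"n": 0.9}, {"t": 0.9}]
letters, word_act = run_iam(evidence)
print(letters[0], max(word_act, key=word_act.get))
```

Although the bottom-up support for "a" and "b" is identical, the word level favors "ant", and its feedback pushes the first-position "a" above "b".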
Interactive Activation Model (IAM)
The Word-Superiority Effect (Reicher, 1969)
Mask (&&&&&) presented with alternatives (here U and A) above and below the target letter … participants must pick one as the letter they believe was presented in that position.
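Example trials (fixation +, then the stimulus, then a mask with the alternatives E and T):
Letter only: E, masked by &
Letter in word: PLANE, masked by &&&&&
Letter in nonword: KLANE, masked by &&&&&
Illustrative accuracy:
Letter only: say 60%
Letter in nonword: say 65%
Letter in word: say 80%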
Why is identification better when a letter is presented in a word?
IAM & the word superiority effect
We are processing at the word and letter levels simultaneously.
Letters in words benefit from bottom-up and top-down activation.
But letters alone receive only bottom-up activation.
Other Relevant Findings?
Bias towards “well-formed” stimuli
Misidentify words with uncommon spelling patterns (e.g., BOUT as BOAT)
Misidentify nonwords (e.g., SALID) as words that are like them (SALAD)
More difficulty identifying nonwords with irregular spelling patterns (e.g., ITPR) than those with regular spelling patterns (e.g., PIRT)
Sublexical units bigger than phonemes and graphemes? Onsets and rimes
onset: initial consonant or consonant cluster in a word or syllable
rime: following vowel and consonants
If words are broken at the onset-rime boundary, the resulting letter clusters are more easily recognized as belonging together than if broken at other points
example: FL OST ANK TR vs. FLA ST NK TRO
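The onset-rime boundary can be approximated on spelling alone by splitting a one-syllable string at its first vowel letter. This is a rough orthographic sketch: the function name is hypothetical, the vowel set is limited to a, e, i, o, u, and real onsets and rimes are defined over sounds rather than letters.

```python
VOWELS = set("aeiou")  # simplifying assumption: "y" is treated as a consonant

def split_onset_rime(syllable):
    """Split at the first vowel letter: onset = leading consonants, rime = the rest."""
    s = syllable.lower()
    for i, ch in enumerate(s):
        if ch in VOWELS:
            return s[:i], s[i:]
    return s, ""  # no vowel letter found: treat the whole string as onset

print(split_onset_rime("FLANK"))
print(split_onset_rime("TROST"))
```

Applied to strings built from the example clusters, the split yields the FL/ANK and TR/OST groupings rather than the FLA/NK and TRO/ST ones.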
Sublexical units
Adding a bigram level
By adding a frequency-sensitive bigram level, we can account for the findings of well-formedness along with the others.
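A frequency-sensitive bigram level can be sketched by counting letter pairs in a word list and scoring a string by how familiar its bigrams are. The ten-word corpus, the add-one log scoring, and the function names are assumptions for illustration. A real model would use frequency counts from a large lexicon.

```python
import math
from collections import Counter

# Hypothetical mini-corpus standing in for a lexicon with frequency counts.
CORPUS = ["pint", "dirt", "part", "pier", "shirt", "pit", "port", "pity", "girl", "art"]

def bigrams(word):
    """Return the adjacent letter pairs of a word."""
    w = word.lower()
    return [w[i:i + 2] for i in range(len(w) - 1)]

BIGRAM_COUNTS = Counter(bg for word in CORPUS for bg in bigrams(word))

def wellformedness(string, counts=BIGRAM_COUNTS):
    """Mean log(1 + count) over the string's bigrams; higher means more familiar."""
    bgs = bigrams(string)
    if not bgs:
        return 0.0
    return sum(math.log(1 + counts[bg]) for bg in bgs) / len(bgs)

print(wellformedness("PIRT") > wellformedness("ITPR"))
```

With these counts, the regular nonword PIRT is built from bigrams that occur in the corpus, while ITPR contains bigrams that never occur. Such a level would give PIRT more support than ITPR even though neither is a word.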
Summing up
Based on all of this, we are left with the claim that human word recognition is based on a feature-detector system that is biased to perceive common or recently occurring features.
Based on this model, we can make explicit predictions about situations where the system will do well, and others where it will make errors … thus the system can be further tested and refined.