Human Perception and Memory

Semester 2, 2009

1 Vision

Human Visual Perception

• Humans are visual creatures.

• While your eye is like a camera, there's no little guy inside your head watching T.V.

• Rather, your subjective perception of the visual world is created by your brain:

John Ross: “Vision is an hallucination triggered by sense inputs”.

Notes

For most people, vision is the dominant sense, at least in terms of gathering information and interacting with the world. This is particularly true of conventional user interfaces, although in part that's due to limitations of technology. We have pretty good, widely available technology, at least for creating flat 2D imagery (LCD display screens, etc.), and for stereo sound. But at the moment we have only limited, and mostly experimental, technology for smell and touch.

Human vision, like any other mode of perception, is far from simple. While the creation of your subjective perception is fundamentally a mysterious process, we know a lot about it, enough to provide practical guidance for designing GUIs.

The Human Eye

See Figure 1.

Notes

The fovea (the term I'll use) is also called the macula (as in Figure 1).

Figure 1: Cut-away view of the human eye. From http://en.wikipedia.org/wiki/Image:Human_eye_cross-sectional_view_grayscale.png.

The Human Eye Functioning

• focussing (cornea, lens)

• aperture control (iris)

• image capture (retina)

– variable resolution (fovea)

– photoreceptors (rods, cones)

– integration time (1/15 sec.)

– preprocessing

• pointing (saccades)

Focussing

• Cornea and lens form an image on retina.

• The lens is made out of special transparent crystalline protein.

• Lens and ciliary muscles adjust focus for different distances—accommodation.

• Focussing problems

• Age effects: presbyopia, yellowing, stiffening, “floaters”, . . .

Notes

• Just as a camera's lens forms an image of the outside scene on its film or CCD array, so the cornea and lens of the human eye working together form an image on the retina.

• The lens is made out of special transparent crystalline protein.

• Since the difference in refractive index between the lens and its surrounding fluids is fairly small, it can't bend light very much.

• Most of the bending of light to form the image is actually done by the cornea (the curved, transparent front surface of the eyeball), because of the substantial difference in refractive index between the cornea and the air.

This is why you can't see clearly under water: The cornea loses almost all of its focussing ability, because its refractive index is not much different from water's. Wearing a diving mask restores this cornea-air interface, so your eye can focus.

• The lens functions mainly to adjust the focussing already done by the cornea.

This is why people can still see after they've had the lens surgically removed because of cataracts (a clouding of the lens), though they'll need spectacles or a contact lens to compensate.

• And just as with a camera, for a particular setting, only objects at a certain distance will be perfectly in focus; therefore the focus setting generally needs to be changed depending on whether you're looking at nearby or far-away objects.

Commonly, because of the shape of your cornea, lens or eyeball, you may not be able to focus properly. Usually this can be corrected by wearing eyeglasses or contact lenses.

• Cameras adjust focus by changing the distance between the lens and the image; the eye instead does it by changing the shape of the lens.

The lens is suspended by the zonular ligaments. In the normal, resting state, the ciliary muscle is relaxed, the ligaments are taut, and they pull the lens into a thinner shape that bends light less. This is for focussing on far-away objects.

When the ciliary muscle contracts, it loosens the ligaments, allowing the lens to spring back into a fatter shape that bends light more. This is for focussing on close-up objects.

This process of focus adjustment is called accommodation.

• Age effects: The protein of your lenses has to last your entire life. It never gets replaced, and deteriorates with time. It loses its transparency and elasticity and becomes yellowish with age. Think about what effects this will have. In particular, the lens loses its ability to relax into the fat shape needed to focus on close-up objects. This phenomenon is called presbyopia, from Greek words meaning "elder seeing". (The name of the Christian denomination, Presbyterian, comes from one of the same Greek words, since their churches are governed by "elders".) This is why most people need reading glasses as they get older. Another thing that happens any time, but increasingly with age, is that little bits of stuff become detached and drift around inside the eyeball ("floaters"), casting a faint shadow on the retina. You can sometimes see these when you look up at a clear blue sky.
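The focussing story above can be put in rough numbers. Optometrists measure focusing power in diopters, the reciprocal of distance in meters; the extra power the eye must add to refocus from far away to a near object is about 1/d. A minimal sketch, assuming typical textbook near-point values (assumptions for illustration, not figures from these notes):

# Accommodation demand in diopters (D = 1/distance in meters).
# Near-point distances below are assumed textbook values, for
# illustration only: ~7 cm for a child, ~25 cm in middle age,
# ~1 m with advanced presbyopia.

def accommodation_demand(distance_m: float) -> float:
    # Extra lens power (diopters) needed to focus at distance_m,
    # relative to the relaxed, far-focused state.
    return 1.0 / distance_m

for label, near_point_m in [("child (~7 cm)", 0.07),
                            ("middle age (~25 cm)", 0.25),
                            ("presbyopia (~1 m)", 1.00)]:
    print(f"{label}: about {accommodation_demand(near_point_m):.1f} D of accommodation")

In these terms, presbyopia is the steady shrinking of the available accommodation range: once it drops below roughly 4 D, normal reading distance is out of reach, hence reading glasses.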

Aperture Control

• The iris is like the aperture control of a camera.

• Part of adjustment for varying light levels.

• Bigger aperture:

– Harder to focus.

– More effect of lens defects.

• So vision much harder under low-light conditions.

Notes

• The iris is the band of muscles in front of the lens. The opening in the iris is the pupil. It looks black because you're looking into the relatively dark interior of the eyeball. The iris is pigmented and patterned: the "color" of somebody's eyes is the color of his or her iris. Some security systems recognize people by the pattern in their iris, like a fingerprint.

• The iris controls the size of the pupil and hence the amount of light entering the eye. It's one of the ways the eye adjusts to varying light levels: In bright light, the pupil contracts to reduce the amount of light getting into the eye. In low-light conditions, the pupil expands (to about 7 mm across) to gather as much light as possible. The eye also adjusts to different light levels by chemically altering the responsiveness of the photoreceptors.

• Pupil adjustment is like the aperture control (f-ratio) of a camera.

• As with a camera, the aperture also affects depth of focus (aka depth of field). To see an object clearly, your eye has to accommodate to focus on that object. Objects nearer or further away will be out of focus, blurry. This focus adjustment is much more critical for a large aperture than a small one. (This is how you can get cheap "focus-free" cameras: They have a small enough aperture that everything over some reasonable range of distances is acceptably in focus.)

• Another issue is that with a large aperture (big pupil) the light is passing through more of your lens, and will be affected more by any optical defects in your lens.

• The practical upshot of these two effects is that vision is much more difficult under low-light conditions. In bright light, your pupil shuts down to a small aperture: this makes focussing easier and reduces the impact of lens defects. In low light, though, your pupil opens up to a large aperture: focussing becomes more difficult and lens defects show up more.
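To put rough numbers on the camera analogy: treating the eye as a simple camera with an effective focal length of about 17 mm (a common textbook approximation, assumed here rather than taken from these notes), the f-ratio follows directly from the pupil diameter.

# The eye as a camera: f-ratio = focal length / aperture diameter.
# The ~17 mm effective focal length is an assumed textbook figure.
EYE_FOCAL_LENGTH_MM = 17.0

def f_ratio(pupil_diameter_mm: float) -> float:
    return EYE_FOCAL_LENGTH_MM / pupil_diameter_mm

for condition, pupil_mm in [("bright light (~2 mm pupil)", 2.0),
                            ("low light (~7 mm pupil)", 7.0)]:
    print(f"{condition}: roughly f/{f_ratio(pupil_mm):.1f}")

So in bright light the eye is stopped down to around f/8, much like a "focus-free" camera, while in the dark it opens up to around f/2.4, with correspondingly shallower depth of focus and more visible lens defects.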

Retina

• Retina covers inside surface of back of eyeball.

• Like CCD sensor of a camera.

• Specialized light-sensitive nerve cells, photoreceptors.

• Rods and cones

• Rods—“night vision”, scotopic vision, dark adaptation, blue sensitive.

• Cones—normal light, photopic vision, color perception.

• Retina: neural image-processing front-end computer.

• Blind spot

Notes

• The photoreceptors contain special pigment molecules, which react to light, setting up a chain of chemical reactions in the cell causing it to send off nerve signals.

• The photoreceptors come in two varieties, rods and cones. (The names come from the shapes of the cells.)

• The rods are very sensitive to light, and operate only under low-light conditions. They are responsible for our "night vision" (known technically as scotopic vision, from Greek for "dark seeing").

Rod-based night vision is of importance only in some applications: for visual astronomers, police, emergency workers, the military, and the like, and also for ordinary citizens driving cars at night, particularly in the country with no street lights. In some of these situations, it can be of life-and-death importance. So it is useful to be aware of the following characteristics of rod-based vision:

– Rods are active only in low-light conditions and are inactivated by bright light. Worse than that, it takes some time in the dark for rods to become active, about 10 minutes—even longer, half an hour or more, for them to reach peak sensitivity. This is why, when you step out into a dark night from a well-lit room, you at first can't see anything. Only as your eyes "get used to the dark" will you start to see anything.

The reverse process, however, is not gradual. Even a brief exposure to bright light is enough to switch off your rods and disable your night vision.

– Rod-based night vision is adversely affected by such common drugs as alcohol, caffeine, nicotine, even at quite low doses.

– Rods give us no perception of color. (What little color you might see walking about at night comes from residual low-level response of your cones.)

– While rods are not connected to any color perception, they are in fact more sensitive to blue light and practically unresponsive to red light.

You can sometimes notice such color-based intensity shifts around dusk. When he was a toddler, my son had a jumper with repeated equal-width red, blue, and white stripes. One evening in the backyard, as it grew dark, I was puzzled that it now appeared to be a jumper with alternating thin dark stripes and thick light stripes. This was because my rod-based night vision was starting to kick in, and to the rods the single red stripe looked dark (because they didn't respond to red), while the blue and white stripes looked like a single, thick light stripe, since to my rods, the blue and white looked much the same: they responded well to the blue light reflected from the blue stripes and to the blue component of the light reflected from the white stripes.

Similarly, if you have a red sock and a blue sock that seem to be of the same shade (if different hues) in bright light, then by dim light, once you're adapted to the dark and have your night vision, the red sock will now appear much darker than the blue. Try it out some time. This is called the Purkinje Effect.

So people who rely on their night vision use a red light to read their maps, etc. (or a red-only computer display). The red light is bright enough to stimulate the cones for normal vision, but is effectively invisible to the rods, so they are unaffected, and stay active. (They "think" it's still dark, since they can't "see" the red light.) If such people had used an ordinary white light, or full-color display, then they would have lost their night vision for ten minutes or more.

– Many animals, particularly nocturnal animals, have only rod vision. So you can use a red light to observe them in the dark, and they won't even notice it.

• The cones are not so sensitive to light, and operate under normal lighting conditions. They are responsible for our everyday vision under daylight and normal artificial light (known technically as photopic vision, from Greek for "light seeing"). Of great importance is that cones provide us with color vision—the ability to perceive and discriminate colors. (You can remember this by the mnemonic: both "cone" and "color" start with "co".)

So it’s cone-based vision that’s important for most GUI applications.

• People with normal color vision have three kinds of cones in their eyes, nominally sensitive to blue, green, and red light, respectively. However, the wavelengths of light at which these cone types have their peak sensitivities do not correspond with those color names. So among vision scientists it is now the custom to refer to them as the S, M, and L cones, respectively for short, medium and long wavelength (of light). For example, while the L cones are responsible for our perception of red, their peak sensitivity actually occurs at a wavelength of light that would be perceived as a slightly greenish yellow.

Regardless of the terminology, human color perception is based, initially, on the responses of the three cone types to incoming light. This is the basis of color photography and color displays. Within limits, you can mix red, green, and blue light to create almost any perceivable color. You just have to adjust the relative mix of red, green and blue light so as to stimulate the S, M, and L cones in the same way as the desired color. (A small sketch at the end of these notes illustrates the idea.)

• If you don't have a full complement of working cone receptors, then you'll have color blindness (or at least some degree of color-vision deficiency), which affects about 8% of the male population (and about 0.4% of the female population).

• However, the retina is much more than just a sensor: It's better thought of as a neural image-processing front-end computer: The image received by the photoreceptors goes through a few stages of processing before being relayed to the brain via the optic nerve.

• By some accident of evolution, the vertebrate retina is "wired up" the "wrong way": The output connections are actually on the front of the retina, on the inside of the eyeball. All the output nerve fibers get from the inside to the outside at a single point where they join the optic nerve, which connects on to the brain. This point is the so-called "blind spot", because there are no photoreceptors there. But in normal situations we are totally unaware of this blind spot, because later visual processing "fills in" the gap.

Figure 2: Drawing of a small section of the human retina by Santiago Ramón y Cajal, from http://en.wikipedia.org/wiki/Image:Cajal_Retina.jpg

• Eyes of cephalopods (like squid and octopus) are remarkably similar to vertebrate eyes (like humans'). However, this is an instance of convergent evolution, since cephalopods and vertebrates have no common ancestor with any eyes to speak of. One difference is that the neural connections of cephalopod eyes are on the back of the retina, so they have no blind spot.
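Here is the small sketch of the tristimulus idea promised above: a display never reproduces the physical spectrum of a color; it only mixes its three primaries so that the S, M, and L cones respond as they would to the desired color. All the numbers below are invented for illustration; real colorimetry uses measured cone and display spectra.

# Tristimulus matching: choose R, G, B intensities so the S, M, L cone
# responses match those of a target color. All values are illustrative.
import numpy as np

# Rows: S, M, L cone responses to unit-intensity R, G, B primaries.
primaries = np.array([
    [0.02, 0.10, 0.95],   # S-cone response to R, G, B
    [0.30, 0.80, 0.15],   # M-cone response to R, G, B
    [0.85, 0.60, 0.05],   # L-cone response to R, G, B
])

target = np.array([0.40, 0.55, 0.60])   # desired (S, M, L) response

rgb = np.linalg.solve(primaries, target)
print("Mix (R, G, B) =", np.round(rgb, 3))
print("Check: cone responses =", np.round(primaries @ rgb, 3))
# If any component comes out negative or above 1, the target color is
# outside the display's gamut (the "within limits" caveat above).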

Retina Neurocomputer

See Figure 2, which shows the photoreceptors (rods and cones) at the top, and the various layers of nerve cells they connect to, which perform the initial neural preprocessing that's done before the visual signals are passed back to the brain through the optic nerves. Note that this diagram is oriented so that the front of the eye is at the bottom.

The Fovea versus Peripheral Vision

• Retina: variable-resolution sensor.

• Fovea: central small patch of high-resolution vision.

Cones only, no rods.

• Illusion of clarity.

• Peripheral vision.

Notes

• Ultimately, the human vision system has only limited processing resources available, so those resources need to be deployed where they're most useful.

• Normally, what you're looking directly at is of most importance, so the vision system devotes more resources to the center of your field of view, and less to the periphery.

• In the CCD array of, say, a digital camera, the pixels are distributed regularly and uniformly across the entire image.

• But the retina is a variable-resolution sensor: There's a much higher density of photoreceptors in a special, central area of the retina, called the fovea, and much lower density in the periphery.

• This means we really see things in full detail only in a quite small region close to the direction we're looking in. In visual terms, it's about the size of your thumbnail held out at arm's length (about one degree across; the short calculation after these notes checks this figure).

• Your impression that you see everything clearly is just an illusion: As we'll see later, your eyes are continually moving, looking in different directions. Whatever you're looking right at now you see in full detail, and to some extent what you've just looked at your visual system "remembers" in full detail.

Try this: Look directly ahead at something, and fix your gaze on it. You'll see it in full detail, courtesy of your fovea. Now, without shifting your gaze, pay attention to objects off to the side. This is tricky to do, because your normal reaction will be for your gaze to automatically follow your attention. Resist this. Shift your attention without shifting your gaze. You'll appreciate that objects off to the side are "blurry" and ill-defined, compared with how they appear when you look straight at them. Or you can fix your gaze on a word on the page in front of you, and without shifting your gaze, try to read a few words to the left and the right.

• It's not merely the density of photoreceptors that changes, but also their nature. There are fewer cones in the periphery, so our color perception is modified and our ability to discriminate colors is weaker.

I remember one long night drive I went on, looking ahead at the road (as one does in such situations). I was puzzled by an unusual orange indicator light I saw down on the dashboard. The more so, because every time I looked straight at it, there was no orange light there, just some normal red indicator. It took me a while to realise that it was in fact the same light. Because of different color perception between the fovea and periphery, the same light that appeared red when looked at directly appeared orange to me when seen off to the side.

Conversely, there are no rods in the fovea (all the space is used up with cones). This means that a faint light may paradoxically be invisible when you look straight at it, but appear "mysteriously" when you look slightly away from it. This is an old trick used by astronomers.

• I'll say a bit more about form and motion perception later, but we also have more motion detectors out in the periphery, out beyond where we have form (shape) detectors.

Try this: Look straight ahead, point your arm out in front of you, make a fist and stick your thumb out pointing upwards. Now, while keeping your gaze fixed ahead, gradually swing your arm horizontally to move your thumb off to the side. At first this will be just like the previous party trick: as your thumb moves off to the side, it will become less and less distinct. But at some stage (probably a little past 90 degrees from dead ahead), you'll just reach the point where you can no longer see your thumb at all. Now, wiggle your thumb. You'll be able to clearly see your thumb moving (or at least see something moving), even though you can't see it at all when it's still.
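As promised above, the "thumbnail at arm's length" figure is easy to check with the standard visual-angle formula: an object of size s at distance d subtends an angle of 2·atan(s / 2d). The thumbnail and arm measurements below are assumed typical values, not figures from these notes.

# Visual angle of an object of size s (m) viewed at distance d (m):
#   theta = 2 * atan(s / (2 * d))
import math

def visual_angle_deg(size_m: float, distance_m: float) -> float:
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

thumbnail_m = 0.015   # ~1.5 cm thumbnail width (assumed typical value)
arm_m = 0.60          # ~60 cm arm's length (assumed typical value)
print(f"Thumbnail at arm's length: {visual_angle_deg(thumbnail_m, arm_m):.1f} degrees")
# Prints about 1.4 degrees, in line with "about one degree" for the fovea.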

Integration Time and the Movies

• Integration time, about 1/15 second.

• Motion pictures, animations.

• Screen refresh. Trade-offs. Peripheral vision.

Notes

• Any sensor has to make its measurements over a finite interval of time, its integration time.

• The photoreceptors in your eye are no exception, and being made out of electrochemical wetware rather than electronic hardware, they are relatively slow.

• As always, it's not simple, but roughly speaking, the integration time of photoreceptors under normal conditions is about 1/15 second.

• In simplest terms, this means that if you have a light flashing on and off slower than 15 times a second, you'll be able to perceive the individual flashes. But if the light flashes on and off much faster than 15 times a second, then the flashes will all blend together to create a more or less continuously glowing light of brightness averaged between on and off.

• This is the whole basis of motion pictures: If you show a sequence of still images rapidly enough (24 frames a second for standard movies, 25 for PAL TV, very close to 30 for NTSC TV), then the successive images will blend together to create the illusion of continuous motion.

• In user interfaces, this is clearly important if you're doing animations. But even if you aren't explicitly doing animations, there's still the issue of the refresh rate of your computer screen, which is redrawn many times a second. Because of various effects, which I won't go into here (mainly the brightness and large size of the screen), "flicker", the fluctuation in brightness caused by the redrawing of the screen, can still be noticeable at refresh rates considerably above the nominal 15 Hz. Even low-end screens would normally refresh at around 50–60 Hz; higher-end at 80–90 Hz or more, and some would say these high refresh rates lead to better appearance.

Since one bottleneck is usually total hardware bandwidth, there's often a trade-off involved: You can get higher screen resolution (more dots per inch), but only at lower refresh rates. To go to a higher refresh rate, within the bandwidth limits of your hardware, you'd need to drop the screen resolution and perhaps the color resolution (number of bits of precision devoted to storing color information). Which would be best would depend on your intended usage: If you were doing something like desktop publishing, you'd probably go for the highest screen and color resolution, even if it meant a relatively low refresh rate (so long as it didn't get down to the level of objectionable flicker). If you were doing animations or computer gaming, then maybe the best choice would be to opt for higher refresh rates at the expense of screen and color resolution. (The sketch after these notes makes the trade-off concrete.)

• Integration time depends on various factors like brightness and position. In general, we're more sensitive to motion and flicker in our peripheral vision. For example, a screen which is adjusted to be flicker-free when you look at it directly may still exhibit a noticeable degree of flicker when you view it "out of the corner of your eye", as you may do when you look away to read a document on your desk. Same with TV.
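Here is the promised back-of-envelope calculation of raw video bandwidth (pixels × bits per pixel × refresh rate). The display modes are invented examples, and real display interfaces add overheads (blanking intervals, encoding), so treat the numbers as proportions only.

# Raw video bandwidth: width * height * bits-per-pixel * refresh rate.
# The display modes are invented examples; real interfaces add overheads.

def bandwidth_gbps(width: int, height: int, bpp: int, refresh_hz: int) -> float:
    return width * height * bpp * refresh_hz / 1e9

modes = [
    ("high res, 24-bit color, modest refresh",   1600, 1200, 24,  60),
    ("lower res, 24-bit color, faster refresh",  1024,  768, 24,  90),
    ("lower res, 16-bit color, fastest refresh", 1024,  768, 16, 120),
]
for name, w, h, bpp, hz in modes:
    print(f"{name}: {bandwidth_gbps(w, h, bpp, hz):.2f} Gbit/s")
# With a fixed bandwidth budget, raising the refresh rate forces a drop
# in spatial or color resolution, which is exactly the trade-off above.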

Saccades

• Eye muscles.

• Gaze jumps—saccades.

• Retinally stabilized images.

Notes

• Attached to the eyeball are various muscles that allow it to be turned in different directions.

• In normal circumstances, your eyes are continually moving, pointing at different parts of what you're looking at.

• Remember that the fovea, the only part where we see full detail, corresponds to only a very small part of our field of view. To a large extent, our illusion of seeing everything more or less clearly is achieved by stitching together lots of little, partial views.

• This gaze motion is usually not continuous and smooth; rather, the gaze moves in jumps, called saccades, from fixation point to fixation point. Most of this occurs below the level of conscious awareness, but it can be measured in psychology labs.

• It's a fairly complicated process, but essentially what happens is that from the unclear view in the periphery, your vision system picks the next interesting point to look at, and jumps to that point, then repeats, building up an overall perception in the process. Of course, for more specialized vision tasks like reading, the saccades follow a more stereotypical pattern, from word to word along a line (or more likely, from word group to word group) and then a jump back to the beginning of the line to start the next line.

• The effect of this is that, even if you're looking at something stationary, the pattern of light projected onto your retina is always changing.

• Why doesn't our perception jump about then? The reason is that our visual system "knows about" the eye movements, and compensates for them in constructing our perception.

• Moreover, any image that doesn't move about relative to the retina actually fades away fairly quickly, in a second or so. These are retinally stabilized images.

This doesn't happen in the real world: because the eyeball is continually moving, nothing "out there" could produce a stabilized image. In the original experiments, this was achieved by putting a marker on a special contact lens, tracking this marker to track the eye movement, and then moving a projected image on a screen to follow this eye movement, so that the image on the retina stayed the same.

Kids, you can try this at home. Do it in a reasonably dark room—it doesn't have to be perfectly dark. You'll need a small light. One of those little key-ring red LED lights works well. A small torch will probably do, but you'll probably need to put a cardboard disk over the lens with a small hole cut in it, so you get only a smallish beam of light coming out.

Hold the light close to the corner of your eye and jiggle it around a bit. Be careful doing this. Don't do it at a party where you're likely to get bumped in the arm and poke yourself in the eye. If you get the position just right, you'll see an amazing thing: this dendritic pattern, which is mainly the shadows cast onto your photoreceptors by blood vessels on the front of your retina.

The image is elusive: It is a retinally stabilized image. If you don't move the light, it'll fade after a second or so. Even though your eyeballs are still moving, as always, the blood vessels are so close to the photoreceptors that their shadows fall in almost exactly the same place on the retina. If you jiggle the light around a little, you can get the image to persist for longer. The reason is that as you move the light, different parts become lighter or darker, and that's enough of a change to stop the fading.

This demonstration also points to the reason for this apparently weird behavior of retinally stabilized images: In normal circumstances, if an image doesn't move when the eyeball moves, then it must come from something actually inside the eye (like shadows of retinal blood vessels).

The fading is actually the visual system's way of ignoring such things. It's part of the computation done in the retina.

Figure 3: Back view of the human brain, with the visual cortex shown as a color overlay. From http://en.wikipedia.org/wiki/Image:Brodmann_areas_17_18_19.png.

Higher-Level Processing

• Several “subsystems”

– form, color, motion, stereo, . . .

• higher-level perception

• pre-attentive and attentive vision

Notes

• After leaving each eye, the separate optic nerves meet under the brain at the optic chiasm (named from its shape, which looks like the Greek letter χ), where the nerve pathways cross over. Signals from the right side of the left eye cross over to the right side of the brain; similarly, signals from the left side of the right eye cross over to the left side of the brain.

Why this strange arrangement? Well, the human brain is arranged so that, for the most part, sensory and motor processing areas for the right side of the body are in the left hemisphere of the brain, and vice-versa. Because of the inverted projection onto your retina, things in the right side of your field of vision are imaged on the left side of both your eyes (and vice-versa). The crossover at the optic chiasma means that vision for the right side of your field of view (from both eyes) is processed in the same hemisphere (the left) as the motor control for the right side of your body. Similarly for the left side of your field of view (and body). This is a more efficient arrangement, because it means there is a more direct connection between visual processing and action on each side. It also means that the vision centers in each hemisphere get input from both eyes, which is important for stereo vision.

After the optic chiasma, the main visual pathways on each side pass through the LGN (Lateral Geniculate Nucleus) and then on to the visual cortex at the back of the brain. (There are, however, other visual pathways.) Visual cortex is shown in Figure 3. You can see it takes up quite a large fraction of the brain. In the figure, different colors indicate different areas of visual cortex, but we need not worry about those distinctions here.

• There is pretty strong evidence that there are separate subsystems responsible for processing different aspects of vision, such as form (shape), color, motion, and stereo perception. It's almost as if we had a number of distinct visual senses, all running off input from our eyes, but processed separately, and only later integrated into a unified conscious visual perception.

• On top of all this is higher-level visual perception, by which we recognize objects and people and what they're doing.

• An important distinction to be made is between pre-attentive and attentive vision. Most of the research on this was initially done by Anne Treisman.

Suppose you have a visual task, like finding an object in your field of view. Suppose that object can be distinguished by one feature alone, such as color or shape. The task might be "Find a red thing in this picture", or "Find a letter X in this picture" (of any color). Then we are able to perform this task very quickly, and in constant time—that is, the time taken does not depend on the number of other objects in the field of view, nor on whether the object sought is present or not. This is called pre-attentive vision, because it can happen before we pay attention to any one of the objects we see. It is suggested that it can be done by parallel processing in the brain.

However, if our task involves evaluating a conjunction of several features, like "Find a red letter X in this picture"—that is, something that is both red and an X, then the task takes noticeably longer. What's more, the time taken depends linearly on the number of objects seen, and on average takes twice as long if the sought object is not present. It seems that to evaluate a conjunction of features, our visual system has to fall back onto sequential processing, scanning through the visual field (even if unconsciously), using essentially linear search. This is called attentive vision, since we have to attend to each object individually.

Of course, for a simple conjunction, like color and shape, and a modest number of objects in the field of view, the time taken for attentive vision is quite short, but still longer than for pre-attentive vision. The time differences can be measured by reaction-time experiments. (A toy model of the two cases appears after these notes.)

It's a little bit more complicated than this. There are certain combinations of features for which we seem to have the neural circuitry to process them in parallel. For example, if you're shown a 3D stereo display showing objects at different depths away from you, then you can quickly perform a task like "Find a nearby red object", even though that seems to be a conjunction of two features, "nearness" and "redness".

But the distinction between pre-attentive and attentive vision is still an important one, with obvious implications for the design of user interfaces.
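Here is the toy model promised above of Treisman-style reaction times: pre-attentive ("pop-out") search is roughly constant in the number of objects, while attentive (conjunction) search is a linear scan, checking on average half the items when the target is present and all of them when it is absent. The timing constants are invented for illustration.

# Toy model of visual-search reaction times (constants are invented).
BASE_MS = 400.0      # assumed fixed cost: early vision + response
PER_ITEM_MS = 40.0   # assumed cost of attending to one item

def preattentive_rt(n_items: int) -> float:
    # Single-feature "pop-out" search: roughly independent of n_items.
    return BASE_MS

def attentive_rt(n_items: int, target_present: bool) -> float:
    # Conjunction search: linear scan; on average half the items are
    # checked when the target is present, all of them when it is absent.
    checked = n_items / 2 if target_present else n_items
    return BASE_MS + PER_ITEM_MS * checked

for n in (5, 20, 40):
    print(f"n={n:2d}: pop-out {preattentive_rt(n):.0f} ms, "
          f"conjunction present {attentive_rt(n, True):.0f} ms, "
          f"absent {attentive_rt(n, False):.0f} ms")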

Visual Illusions and Oddities: Muller-Lyer Illusion

Visual Illusions and Oddities: Kanizsa Figure

Notes

Figures 4 and 5 show two well-known "optical illusions"—although I prefer to call them "visual illusions" since they have more to do with human visual processing than with optics.

In the Muller-Lyer Illusion (of which Figure 4 is one of several variants), both lines are objectively of identical length on the page. You can measure them with a ruler. However, to most people, the line with the outward directed Vs (like arrow tails) appears longer than the line with inward directed Vs (like arrow heads).

Why do we have this apparently incorrect perception? There are a number of theories to explain it. The most interesting, mainly due to the vision scientist Richard Gregory, is this: The line with the arrow heads looks like the corner of a box or building seen from the outside. In this case, the line is interpreted as being closer, and therefore as appearing bigger than it really is. Our vision system therefore unconsciously perceives this line as shorter, as being an indicator of its true size in 3D space. Conversely, the line with arrow tails looks like the corner of a room seen from inside the room. In this case the line is interpreted as being further away, and therefore as appearing smaller than it really is, and our vision system unconsciously perceives this line as longer, again as being an indicator of its true size in 3D space.

Figure 4: Muller-Lyer Illusion.

Figure 5: Kanizsa Figure.

As a point of interest, the Muller-Lyer illusion is to some degree culturally dependent. The effect seems to be stronger for people who live in urbanized cultures, where presumably there are lots of rectilinear structures and pictures of such structures. Also, the effect apparently becomes weaker if you're exposed to the display a lot—presumably your visual system can learn in time to partly disregard the depth cues from the arrow heads and arrow tails. After all, the display is just a pattern of lines on a flat page (or display screen).

Figure 5 shows a Kanizsa Figure, illustrating the phenomenon of subjective contours. Most people see a white triangle in front of, and obscuring, other shapes in the figure, which seem to be three black discs and an outline triangle. The white triangle seems to be a slightly brighter white than the background. But "in reality" there is no white triangle—it is completely illusory. The figure is made up only of three V shapes and three "pacman" shapes (each a disc with a wedge cut out). These shapes are just artfully arranged and aligned so as to suggest the illusory triangle.

Most people can definitely, if faintly, see the straight edges between the whiter triangle and the white background—these make up the contour of the triangle, the subjective contour, since it doesn't exist objectively. The paper (or screen) inside the white triangle is exactly the same brightness as the background. Nonetheless the perception is very real. The "non-existent" subjective contours can be used to construct other illusions (such as variations on the Ponzo Illusion). Also, when subjects are asked to adjust the brightness of another display to match their perceived brightness of the illusory triangle, and then similarly to adjust that other display to match their perceived brightness of the background, they will consistently and objectively set that other display brighter when matching the illusory triangle.

Why do we see the illusory triangle? It's mostly our visual systems unconsciously making the best guess as to what's out there, given the visual inputs. In the real world (that is, outside psychology textbooks), what is the most likely explanation for that picture? One explanation is that there really is a triangle there—that best explains the aligned obscurations of the inferred disks and outline triangle. The other explanation is that the pacman shapes and V shapes have been exactly lined up in a very improbable way. Our visual system opts for the first explanation, as being most likely.

Of course, this isn't a conscious process of weighing up probabilities; rather it's the way our vision systems have been tuned by evolution to work best for us in the real world. It is what Helmholtz called "unconscious inference".

What's the point of studying such illusions? Well, for everyone, including HCI practitioners, they emphatically make the point that our perception is a construct—it's not just like a verbatim image made by a camera. Also, for vision scientists they provide clues about how the human vision system works. And for anyone creating visual displays, including HCI practitioners, they serve as a warning that a badly designed display might mislead our users and lead them to make wrong judgements and decisions.

Color Vision

• Light is electromagnetic radiation in the wavelength range of roughly 400 nm ("blue") to 700 nm ("red")

• Almost all light is a mixture of wavelengths, e.g., the rainbow spectrum of white light from the sun

• Tristimulus theory of color perception

Color Vision

• Three kinds of cones, nominally sensitive to "red", "green" and "blue" light

– “Tuning” is quite broad

– More accurately perhaps one short wavelength system and two long wavelength systems

• Tristimulus values: (R,G,B)—raw input

• Later processing into intensity and color-opponent channels R−G and B−Y (see the sketch below)

• (Intensity, hue, saturation)

• Color constancy
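A minimal sketch of the opponent recoding listed above: starting from raw (R, G, B) tristimulus values, form an intensity signal plus red-green and blue-yellow difference signals. The weights are illustrative only; real opponent channels are built from cone responses with standardized weightings.

# Opponent-color recoding from raw (R, G, B) values (weights illustrative).

def to_opponent(r: float, g: float, b: float) -> tuple:
    intensity = (r + g + b) / 3        # overall brightness
    red_green = r - g                  # R-G opponent channel
    blue_yellow = b - (r + g) / 2      # B-Y: blue vs. "yellow" (R+G mix)
    return intensity, red_green, blue_yellow

print(to_opponent(1.0, 0.0, 0.0))   # pure red: positive R-G, negative B-Y
print(to_opponent(1.0, 1.0, 0.0))   # yellow: R-G near 0, strongly negative B-Y
print(to_opponent(0.0, 0.0, 1.0))   # blue: strongly positive B-Y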

Cone Responses

[Figure: Relative responses of the human cone photoreceptors (approximate only). Horizontal axis: wavelength in nm, 400–700 (blue light to red light); curves labeled "blue", "green", and "red".]

Characteristics of Human Vision. . .

• Acuity and hyperacuity

• Form and motion perception does not ‘see’ color

– Color “washes” in drawings

– Encoding of luminance and chroma in color TV

– Isoluminance contours

– Chromatic aberration: color edges cannot be brought to sharp focus

. . . Characteristics of Human Vision

• Color vision characteristics

– Color perception depends on color context

– Blue alone tends to be perceived quite weakly

– Greatest color discrimination is in the green to yellow range

– Color blindness: red-green, blue-yellow, achromatic

– Possible variation even amongst people with “normal” color vision

– Implications for computer processing and display

2 Hearing

Human Auditory System

Human Hearing

• Vibrations: 20 Hz to 20,000 Hz; the upper limit decreases with age

• Sound localization: ITD (interaural time difference), IID (interaural intensity difference); see the sketch below
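The ITD cue can be sketched with the classical Woodworth spherical-head approximation, ITD(θ) ≈ (a/c)(θ + sin θ), where a is the head radius, c the speed of sound, and θ the azimuth of the source. The formula and head-radius figure are standard textbook values assumed here, not derived in these notes.

# Interaural time difference via the Woodworth spherical-head model:
#   ITD = (a / c) * (theta + sin(theta)), theta = source azimuth.
import math

HEAD_RADIUS_M = 0.0875    # assumed typical head radius (~8.75 cm)
SPEED_OF_SOUND = 343.0    # m/s in air at room temperature

def itd_seconds(azimuth_deg: float) -> float:
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(f"azimuth {az:2d} deg: ITD = {itd_seconds(az) * 1e6:.0f} microseconds")
# The maximum (source directly to one side) is only ~650 microseconds,
# yet the auditory system resolves differences much smaller than that.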

3 Memory

Complexities

• Human memory is very complex and little understood

• Be wary of any simplistic classifications

• Still, some useful knowledge for HCI

Divisions of Memory

• LTM, long-term memory

• STM, short-term memory

– (part of Working Memory, according to some)

• (probably also medium-term memory)

Long-term Memory

• effectively life-long memory

• effectively “infinite” capacity

• retrieval/accessibility

• probably mediated by growth of neural connections

• may take up to two years to form

Short-term Memory

• short-duration, task in hand, e.g. dialling a phone number

• probably mediated by neural activation patterns

• limited capacity, “7 plus or minus 2” items (Miller, 1956)

• chunking

– affected by recognition/experience
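Chunking is easy to illustrate: the same ten digits can be held as ten separate STM items, which strains the "7 plus or minus 2" limit, or as three familiar groups. A trivial sketch (the digit string and grouping pattern are invented examples):

# Chunking: the same digit string as raw items vs. familiar groups.
number = "0412345678"

raw_items = list(number)                            # 10 separate items
chunks = [number[0:4], number[4:7], number[7:10]]   # "0412", "345", "678"

print(len(raw_items), "items unchunked:", raw_items)
print(len(chunks), "items chunked:", chunks)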

Kinds of Memory

• Sensory memory

– sense impressions: sight, hearing, smell, taste, touch. . .

– related: motor memory

• Episodic memory

– events: what happened at lunchtime today

• Semantic memory

– facts: What is the capital of Maryland?

– (comment: We’re usually good at knowing what we don’t know.)

• Procedural memory

– “knowing how” versus “knowing that”

– like swimming, riding a bike, typing your password

Other Memory Phenomena

• primacy and recency

• closure

• psychological/emotional factors

– vividness, associations

– blocked/suppressed memories

– false memories, biases

Some Implications for HCI

• Keep well within STM limitations

– (Leave room for user’s real goals)

• Much HCI depends on sensory, motor, procedural LTM

– (Re-arrange a familiar GUI at your peril!)

• . . .
