8/3/2019 A Visual Keyboard
1/6
A VISUAL KEYBOARD
Replacement for the conventional keyboard...
ABSTRACT
Information and communication technologies can play a key role in helping people with special educational needs, considering both physical and cognitive disabilities. Eye-scanning cameras mounted on computers, replacing the keyboard or mouse, have become necessary tools for people without limbs or those affected by paralysis. The camera scans the image of the character, allowing users to "type" on a monitor as they look at the Visual Keyboard.
The paper describes an input device, based on eye-scanning techniques, that allows people with severe motor disabilities to use gaze for selecting specific areas on a computer screen. Compared to other existing solutions, the strong points of this approach are simplicity and novelty.
KEYWORDS: Vis-Key, ICT, Cornea, Sclera, Choroid, Retina, Digitizer, Pixel, Equalization, Contrast, Digital Image.
1. INTRODUCTION
Vis-Key aims at replacing the conventional hardware keyboard with a Visual Keyboard. It employs sophisticated scanning and pattern-matching algorithms to achieve this objective, exploiting the eye's natural ability to navigate and spot familiar patterns. Eye-typing research extends over twenty years; however, there is little research on the design issues. Recent research indicates that the type of feedback impacts typing speed, error rate, and the user's need to switch gaze between the visual keyboard and the monitor.
2. Description of Concepts
2.1 The Eye
Fig 2.1 Cross-Sectional View of the Human Eye
Fig 2.1 shows the horizontal cross section of the human eye. The eye is nearly a sphere with an average diameter of approximately 20 mm. Three membranes enclose the eye: the cornea and sclera outer cover, the choroid layer, and the retina. When the eye is properly focused, light from an object is imaged on the retina. Pattern vision is afforded by the distribution of discrete light receptors over the surface of the retina. There are two classes of receptors: cones and rods. The cones, typically present in the central portion of the retina called the fovea, are highly sensitive to color. The number of cones in the human eye ranges from 6 to 7 million. Cones can resolve fine detail because each is connected to its own nerve ending. Cone vision is also known as photopic or bright-light vision. The rods are far more numerous than the cones (75 to 150 million). Several rods are connected to a single nerve ending, which reduces the amount of detail discernible by these receptors. Rods give a general, overall picture of the view and are not much involved in color recognition. Rod vision is also known as scotopic or dim-light vision. As illustrated in Fig 2.1, the curvature of the anterior surface of the lens is greater than that of its posterior surface. The shape of the lens is controlled by the tension in the fibers of the ciliary body. To focus on distant objects, the controlling muscles cause the lens to be relatively flattened; similarly, to focus on nearer objects, the muscles allow the lens to become thicker. The distance between the center of the lens and the retina (the focal length) varies from about 17 mm to about 14 mm as the refractive power of the lens increases from its minimum to its maximum.
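As a quick check on the figures above, refractive power in diopters is the reciprocal of the focal length expressed in meters, so the quoted 17 mm to 14 mm range corresponds to roughly 59 to 71 diopters:

```latex
P = \frac{1}{f}, \qquad
P_{\min} = \frac{1}{0.017\,\mathrm{m}} \approx 58.8\,\mathrm{D}, \qquad
P_{\max} = \frac{1}{0.014\,\mathrm{m}} \approx 71.4\,\mathrm{D}
```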
3. The Vis-Key System
The main goal of our system is to provide users suffering from severe motor disabilities (who are therefore able to use neither the keyboard nor the mouse) with a system that allows them to use a personal computer.
Fig 3.1 System Hardware of the Vis-Key System
The Vis-Key system (Fig 3.1) comprises a high-resolution camera that constantly scans the eye in order to capture the character image formed on it. The camera gives a continuous streaming video as output. The idea is to capture individual frames at regular intervals (say, a fraction of a second). These frames are then compared with the base frames stored in the repository. If the probability of a successful match exceeds the threshold value, the corresponding character is displayed on the screen. The hardware requirements are simply a personal computer, the Vis-Key layout (chart) and a web cam connected to the USB port. The system design, which refers to the software level, relies on the construction, design and implementation of image processing algorithms applied to the captured images of the user.
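The thresholded comparison against the repository of base frames can be sketched as follows. This is a minimal illustration, assuming frames are already binarized and flattened to 0/1 vectors of equal length; `matchScore` and `bestMatch` are hypothetical names, not part of the actual Vis-Key code:

```cpp
#include <vector>
#include <cstddef>

// Fraction of pixels that agree between a captured frame and a stored
// base frame (both binary images flattened to 0/1 vectors).
double matchScore(const std::vector<int>& frame, const std::vector<int>& base) {
    std::size_t same = 0;
    for (std::size_t i = 0; i < frame.size(); ++i)
        if (frame[i] == base[i]) ++same;
    return static_cast<double>(same) / frame.size();
}

// Index of the best-matching base frame, or -1 if no score
// exceeds the acceptance threshold.
int bestMatch(const std::vector<int>& frame,
              const std::vector<std::vector<int>>& bases,
              double threshold) {
    int best = -1;
    double bestScore = threshold;
    for (std::size_t k = 0; k < bases.size(); ++k) {
        double s = matchScore(frame, bases[k]);
        if (s > bestScore) { bestScore = s; best = static_cast<int>(k); }
    }
    return best;
}
```

A returned index of -1 would mean "no character displayed", matching the paper's threshold rule.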
4. System Architecture
4.1. Calibration
The calibration procedure aims at initializing the system. The first algorithm, whose goal is to identify the face position, is applied only to the first image, and its result is reused when processing the successive images in order to speed up the process. This choice is acceptable since the user is supposed to make only minor movements. If the background is completely black (easy to obtain), the user's face appears as a white spot, and its borders can be found where the number of black pixels decreases. The camera is positioned below the PC monitor; if it were above, when the user looks at the bottom of the screen the iris would be partially covered by the eyelid, making the identification of the pupil very difficult. The user should not be distant from the camera, so that the image does not contain much besides his or her face. The algorithms that identify the face, the eye and the pupil are based on scanning the image to find the concentration of black pixels: the more complex the image, the slower the algorithm; in addition, the effective image resolution will be lower. The suggested distance is about 30 cm. The user's face should also be very well illuminated, and therefore two lamps were placed, one on each side of the computer screen. Since the identification algorithms work on black-and-white images, shadows should not be present on the user's face.
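The border search described above can be sketched as follows. This is a minimal illustration assuming the binarized image is a 0/1 matrix with 1 for white; `faceBorders` is a hypothetical helper, not the system's actual code:

```cpp
#include <vector>

// On a black background the face shows up as the white region;
// its borders are the extreme rows/columns still containing white pixels.
struct Box { int top, bottom, left, right; };

Box faceBorders(const std::vector<std::vector<int>>& img) { // 1 = white
    int h = static_cast<int>(img.size());
    int w = static_cast<int>(img[0].size());
    Box b{h, -1, w, -1};
    for (int r = 0; r < h; ++r)
        for (int c = 0; c < w; ++c)
            if (img[r][c] == 1) {
                if (r < b.top)    b.top = r;
                if (r > b.bottom) b.bottom = r;
                if (c < b.left)   b.left = c;
                if (c > b.right)  b.right = c;
            }
    return b;
}
```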
4.2. Image Acquisition
Camera image acquisition is implemented via the functions of the AviCap window class, part of the Video for Windows (VFW) API. The entire image of the problem domain is scanned every 1/30 of a second. The output of the camera is fed to an analog-to-digital converter (digitizer), which digitizes it. Individual frames can then be extracted from the motion picture for further analysis and processing.
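The digitizer stage mentioned above can be illustrated with a toy quantizer. This is an assumption-laden sketch, not the VFW API: it simply maps a normalized analog sample in [0, 1] to one of 256 discrete grey levels:

```cpp
#include <cmath>

// A digitizer maps an analog sample (here normalized to [0, 1])
// to one of 256 discrete grey levels; out-of-range inputs are clamped.
int quantize(double sample) {
    if (sample < 0.0) sample = 0.0;
    if (sample > 1.0) sample = 1.0;
    return static_cast<int>(std::lround(sample * 255.0));
}
```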
4.3. Filtering of the eye component
The chosen algorithms work on a binary (black-and-white) image and are based on extracting the concentration of black pixels. Three algorithms are applied to the first acquired image, while from the second image on only the third one is applied.
Fig 4.3.1 Algorithm 1 Face Positioning
The first algorithm, whose goal is to identify the face position, is applied only to the first image, and its result is reused when processing the successive images in order to speed up the process. This choice is acceptable since the user is supposed to make only minor movements. The face algorithm converts the image to black and white and zooms it to obtain an image that contains only the user's face. This is done by scanning the original image and identifying the top, bottom, left and right borders of the face (Fig 4.3.1). Starting from the resulting image, the second algorithm extracts the information about the position of the eyes (both left and right): the horizontal band with the highest concentration of black pixels is the one that contains the eyes. The algorithm uses this information to determine the top and bottom borders of the eye area (Fig 4.3.2), so that it can be extracted from the image. The new image is then analyzed to identify each eye: the algorithm finds the right and left borders and generates new images containing the left and right eyes independently.
The procedure described so far is applied only to the first image of the sequence; the data related to the right-eye position are stored in a buffer and used also for the following images. This is done to speed up the process, and is acceptable if the user makes only minor head movements. The third algorithm extracts the position of the center of the pupil from the right-eye image. The iris identification procedure uses the same approach as the previous algorithm: the left and right borders of the iris are identified, the iris being the band with the highest concentration of black pixels. The center of this region also represents the center of the pupil. The result of this phase is the coordinates of the center of the pupil for each image in the sequence.
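The third algorithm's idea can be sketched minimally as follows, assuming a binary eye image with 0 for black (the helper name `pupilCenter` is invented for illustration): the row and column with the highest black-pixel counts give an estimate of the pupil centre.

```cpp
#include <vector>
#include <utility>

// Estimate the pupil centre as the row and column with the highest
// count of black pixels (0 = black) in the binary eye image.
std::pair<int, int> pupilCenter(const std::vector<std::vector<int>>& eye) {
    int h = static_cast<int>(eye.size());
    int w = static_cast<int>(eye[0].size());
    std::vector<int> rowBlack(h, 0), colBlack(w, 0);
    for (int r = 0; r < h; ++r)
        for (int c = 0; c < w; ++c)
            if (eye[r][c] == 0) { ++rowBlack[r]; ++colBlack[c]; }
    int br = 0, bc = 0;
    for (int r = 1; r < h; ++r) if (rowBlack[r] > rowBlack[br]) br = r;
    for (int c = 1; c < w; ++c) if (colBlack[c] > colBlack[bc]) bc = c;
    return {br, bc}; // (row, column) of the estimated centre
}
```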
4.4. Preprocessing
The key function of preprocessing is to improve the image in ways that increase the chances of success of the subsequent processes. Here preprocessing deals with four important techniques:
- enhancing the contrast of the image;
- eliminating or minimizing the effect of noise on the image;
- isolating regions whose texture indicates they are likely to contain alphanumeric information;
- providing equalization for the image.
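The equalization step listed above can be sketched for an 8-bit greyscale image; this is a simplified illustration of standard histogram equalization, since the paper does not specify its own implementation:

```cpp
#include <vector>
#include <cmath>

// Histogram equalization for an 8-bit greyscale image: each grey level v
// is remapped through the cumulative distribution of the input levels,
// spreading the histogram over the full 0..255 range.
std::vector<int> equalize(const std::vector<int>& pixels) {
    const int L = 256;
    std::vector<int> hist(L, 0);
    for (int v : pixels) ++hist[v];
    std::vector<double> cdf(L, 0.0);
    double running = 0.0;
    for (int v = 0; v < L; ++v) {
        running += hist[v];
        cdf[v] = running / pixels.size();
    }
    std::vector<int> out;
    out.reserve(pixels.size());
    for (int v : pixels)
        out.push_back(static_cast<int>(std::lround(cdf[v] * (L - 1))));
    return out;
}
```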
4.5. Segmentation
Segmentation broadly denotes the partitioning of an input image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way towards a successful solution of the imaging problem. In terms of character recognition, the key role of segmentation is to extract individual characters from the problem domain. The output of the segmentation stage is raw pixel data, constituting either the boundary of a region or all points in the region itself. In either case, converting the data into a form suitable for computer processing is necessary. The first decision is whether the data should be represented as a boundary or as a complete region. Boundary representation is appropriate when the focus is on external shape characteristics such as corners and inflections. Regional representation is appropriate when the focus is on internal shape characteristics such as texture and skeletal shape. Description, also called feature selection, deals with extracting features that result in some quantitative information of interest, or features that are basic for differentiating one class of objects from another.
4.6. Recognition and Interpretation
Recognition is the process that assigns a label to an object based on the information provided by its descriptors; this is how the system cognitively recognizes characters using the knowledge base. Interpretation involves assigning a meaning to an ensemble of recognized objects. For example, to identify a character, say 'C', we need to associate the descriptors extracted for that character with the label 'C'.
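The label-assignment step can be sketched as a nearest-template search over descriptor vectors. This is an illustrative stand-in for whatever descriptors the knowledge base actually stores, using a city-block distance for simplicity:

```cpp
#include <vector>
#include <string>
#include <cstdlib>
#include <cstddef>

// A stored template: a label plus the descriptor vector that
// characterizes that class of objects.
struct Template { std::string label; std::vector<int> descriptor; };

// Assign the label of the template whose descriptor is closest
// (city-block distance) to the unknown object's descriptor.
std::string recognize(const std::vector<int>& descriptor,
                      const std::vector<Template>& knowledge) {
    std::string best;
    long bestDist = -1;
    for (const Template& t : knowledge) {
        long d = 0;
        for (std::size_t i = 0; i < descriptor.size(); ++i)
            d += std::abs(descriptor[i] - t.descriptor[i]);
        if (bestDist < 0 || d < bestDist) { bestDist = d; best = t.label; }
    }
    return best;
}
```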
4.7. Knowledge Base
Knowledge about a particular problem domain can be coded into an image processing system in the form of a knowledge database. The knowledge may be as simple as detailing the regions of an image where the information of interest is known to be located, thus limiting the search for that information. Or it can be quite complex, such as an image database in which all entries are high-resolution images. The key distinction of this knowledge base is that, in addition to guiding the operation of the various components, it facilitates feedback between the various modules of the system. As depicted in Fig 4.1, communication between processing modules is based on prior knowledge of what a result should be.
5. Unique Features
This model is a novel idea and the first of its kind in the making. It also opens a new dimension in how we perceive the world, and should prove to be a critical technological breakthrough, considering that there has not been sufficient research in this field of eye scanning. If implemented, it will be one of the awe-inspiring technologies to hit the market.
6. Design Constraints
Though this model is thought-provoking, we need to address the design constraints as well.
- R&D constraints severely hamper our cause for a full-fledged working model of the Vis-Key system.
- The need for a very high-resolution camera calls for a high initial investment.
- The accuracy and the processing capabilities of the algorithms depend heavily on the quality of the input.
Due to these design constraints, we chalk out a plan that encompasses modules 4, 5, 6 and 7 (Preprocessing, Segmentation and Representation, Recognition and Interpretation, and the Knowledge Base). The preferred software tools for implementing these algorithms are C++ and MATLAB 6.0.
7. Alternatives/Related References
The approaches to date have centered on eye-tracking theory, which lays more emphasis on using the eye as a cursor rather than as a data-input device. An eye-tracking device lets users select letters from a screen. Dasher, a prototype program, taps into the natural gaze of the eye and makes predictable words and phrases simpler to write, said David MacKay, project coordinator and physics professor at Cambridge University. Dasher calculates the probability of one letter coming after another, then presents the required letters as if contained on infinitely expanding bookshelves. Researchers say people will be able to write up to 25 words per minute with Dasher, compared to on-screen keyboards, which they say average about 15 words per minute.
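The letter-prediction idea attributed to Dasher above can be illustrated with a simple bigram model. This is not Dasher's actual code, which uses a far more sophisticated language model; it merely shows how P(next letter | previous letter) can be estimated from counts over a training text:

```cpp
#include <map>
#include <string>
#include <utility>
#include <cstddef>

// Estimate P(next | prev) from bigram counts over a training text:
// the number of times the pair (prev, next) occurs, divided by the
// number of times prev occurs as the first element of any pair.
double bigramProb(const std::string& text, char prev, char next) {
    std::map<std::pair<char, char>, int> pairCount;
    std::map<char, int> prevCount;
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        ++pairCount[{text[i], text[i + 1]}];
        ++prevCount[text[i]];
    }
    if (prevCount[prev] == 0) return 0.0;
    return static_cast<double>(pairCount[{prev, next}]) / prevCount[prev];
}
```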
Eye-tracking devices are still problematic. "They need re-calibrating each time you look away from the computer," says Willis. He controls Dasher using a trackball.
Bibliography:
- http://www.cs.uta.fi/~curly/publications/ECEM12-Majaranta.html
- www.inference.phy.cam.ac.uk/djw30/dasher/eye.html
- http://www.inference.phy.cam.ac.uk/dasher/
- http://www.cs.uta.fi/hci/gaze/eyetyping.php
- http://www.acm.org/sigcaph
- Ward, D. J. & MacKay, D. J. C. Fast hands-free writing by gaze direction. Nature, 418, 838 (2002).
- Daisheng Luo, Pattern Recognition and Image Processing, Horwood Series in Engineering Sciences.