
CENG 492 REVISED DESIGN REPORT

İbrahim Kıvanç ERTÜRK 1560192

Süleyman YILMAZ 1631215

Gizem ÖĞRÜNÇ 1631316

Aykut ŞİMŞEK 1631415

Sponsored by: INNOVA

!""

111 ...

!"#$%&%'(%)###

"

!"##$%&'($)*#$$+'+,+-.$/&0+('$1&231,$

" "!#$%!&'"

()*+,-./"012345""""""""""""" "!$6!'!#"

" "!$6!6!$"

" "!$6!7!#"

"

(89/:9;+<"=,>"

"

*+*+ +,##

-.)/012#3045#6754.74%#8%&9450:095#;1054#<0&)919=:#>05%&:#

%?"?7@@#

“Turkish Sign Recognition Using Microsoft Kinect”

27 February 2012


Table of Contents

1. Introduction
   1.1. Problem Definition
   1.2. Purpose
   1.3. Scope
   1.4. Overview
   1.5. Definitions, Acronyms and Abbreviations
   1.6. References
2. System Overview
3. Design Considerations
   3.1. Design Assumptions, Dependencies and Constraints
      3.1.1. Time Constraints
      3.1.2. Resource Constraints
      3.1.3. Performance Constraints
      3.1.4. Hardware Constraints
      3.1.5. Software Constraints
   3.2. Design Goals and Guidelines
4. Data Design
   4.1. Data Description
      4.1.1. Data Objects
      4.1.2. Relationships
      4.1.3. Complete Data Model
   4.2. Data Dictionary
5. System Architecture
   5.1. Architectural Design
   5.2. Description of Components
      5.2.1. Graphical User Interface (GUI) Module
         5.2.1.1. Processing Narrative
         5.2.1.2. Interface
         5.2.1.3. Detailed Processing
         5.2.1.4. Dynamic Behavior
      5.2.2. Database Module
         5.2.2.1. Processing Narrative
         5.2.2.2. Interface
         5.2.2.3. Detailed Processing
         5.2.2.4. Dynamic Processing
      5.2.3. Hand Sign Recognizer Module
         5.2.3.1. Processing Narrative
         5.2.3.2. Interface
         5.2.3.3. Detailed Processing
         5.2.3.4. Dynamic Processing
   5.3. Design Rationale
   5.4. Traceability of Requirements
      5.4.1. GUI Module and Use Case Relations
      5.4.2. Database Module and Use Case Relations
      5.4.3. Hand Sign Recognizer Module and Use Case Relations
6. User Interface Design
   6.1. Overview of User Interfaces
      6.1.1. Main Screen
      6.1.2. Interpreter Mode Screens
      6.1.3. Teaching Mode Screens
   6.2. Screen Images
      6.2.1. Main Menu Screen
      6.2.2. Teaching Mode Main Screen
      6.2.3. Interpreter Result Screen
      6.2.4. Educational Feedback Screen
   6.3. Screen Objects and Actions
7. Detailed Design
   7.1. Graphical User Interface Module
      7.1.1. Classification
      7.1.2. Definition
      7.1.3. Responsibilities
      7.1.4. Constraints
      7.1.5. Compositions
      7.1.6. Uses/Interactions
      7.1.7. Resources
      7.1.8. Processing
      7.1.9. Interface Exports
   7.2. Database Module
      7.2.1. Classification
      7.2.2. Definition
      7.2.3. Responsibilities
      7.2.4. Constraints
      7.2.5. Compositions
      7.2.6. Uses/Interactions
      7.2.7. Resources
      7.2.8. Processing
      7.2.9. Interface Exports
   7.3. Hand Sign Recognizer Module
      7.3.1. Classification
      7.3.2. Definition
      7.3.3. Responsibilities
      7.3.4. Constraints
      7.3.5. Compositions
      7.3.6. Uses/Interactions
      7.3.7. Resources
      7.3.8. Processing
      7.3.9. Interface Exports
8. Libraries & Tools
9. Time Planning
   9.1. Term I (September 2011 – January 2012)
   9.2. Term II (February – June 2012)
10. Conclusion


1. Introduction

1.1. Problem Definition

The aim of the PAPAĞAN project is to provide an interface, working through Microsoft Kinect, that can recognize and understand Turkish Sign Language (TSL). The implementation will be divided into two main functionalities. The first is the recognition of TSL movements using the information inferred from Kinect, and the translation of their meaning into text, audio and video; this will enable bidirectional communication between speech-disabled people and people who do not know TSL by creating a platform on which they can communicate. The second functionality of PAPAĞAN is an interactive education tool for teaching and practicing TSL.

1.2. Purpose

This document contains the detailed description of and information about the PAPAĞAN project. The Detailed Design Report aims to provide all necessary information regarding PAPAĞAN's development, so that it can serve as a guide for the further implementation of the project.

1.3. Scope

In this Detailed Design Report, the requirements needed to design PAPAĞAN are explained. The document shows how the software system will be structured to satisfy the requirements specified in the SRS document. Briefly, it covers the design considerations, data design, architectural design, user interface design and detailed design procedures for PAPAĞAN. The documentation is prepared under the guidance of the standards defined by IEEE [1].

1.4. Overview

This documentation shows how PAPAĞAN will be structured to satisfy the requirements specified in the SRS document. The structure of the document is summarized as follows:

• System Overview: A general description of PAPAĞAN, including its functionality and matters related to the overall system and its design.

• Design Considerations: Issues that need to be addressed or resolved before attempting to devise a complete design solution.

• Data Design: Explanation of how the information domain of PAPAĞAN is transformed into data structures.

• System Architecture: Description of the program architecture and detailed description of system components.

• User Interface Design: Functionality of the system from the user's perspective and introduction of detailed user interface descriptions.

• Detailed Design: The internal details of each design entity/component.

• Libraries & Tools: Presentation of the development environment required to develop PAPAĞAN.

• Time Planning: The planned schedule for development.

• Conclusion: One last word about the document and PAPAĞAN.

1.5. Definitions, Acronyms and Abbreviations

TSL: Turkish Sign Language

SL: Sign Language

SRS: Software Requirements Specifications

PAPAĞAN: Name of the product/project

GUI: Graphical User Interface

MSSQL DBMS: Microsoft SQL Server Database Management System

SDK: Software Development Kit

IDE: Integrated Development Environment.

INNOVA: Innova Future-Ready Solutions. PAPAĞAN’s sponsor company.

1.6. References

[1] IEEE Std 830-1998: IEEE Recommended Practice for Software Requirements Specifications

[2] T.C. Aile ve Sosyal Politikalar Bakanlığı Özürlü ve Yaşlı Hizmetleri Genel Müdürlüğü. (n.d.). Türkiye Özürlüler Araştırması temel göstergeleri. Retrieved from http://www.ozida.gov.tr/arastirma/oztemelgosterge.htm


[3] Christopher Grant. (20 December 2010). Kinect hacks: American sign language recognition. Retrieved from http://i.joystiq.com/2010/12/20/kinect-hacks-american-sign-language-recognition/

[4] Anonymous. (29 December 2011). Decoupling. Retrieved from http://en.wikipedia.org/wiki/Decoupling#Software_Development


2. System Overview

The main purpose of PAPAĞAN is to recognize TSL by using the information retrieved from the Microsoft Kinect device. Although similar projects exist for other sign languages, PAPAĞAN will be a pioneer for TSL and aims to provide more comprehensive coverage and functionality than its predecessors. PAPAĞAN's main difference from those previous projects will be its simple hand sign recognition algorithm, proposed by Think-e Wink-e and named the Divide and Collect approach for sign language recognition using Microsoft Kinect. The simplicity of this algorithm will ease the process and enable defining and recognizing a wider range of signs.

According to statistics, the percentage of handicapped people in Turkey is 11%. Furthermore, 36% of handicapped people are unable to read or write, whereas the figure is 13% among non-handicapped people [2]. For a handicapped person to communicate, either the handicapped party has to know how to write or both parties have to know sign language. PAPAĞAN aims to help those people both by translating their language into natural language and by spreading the use of TSL in Turkey with its educational tool.

Below is a general overview of PAPAĞAN and its components:

Figure 1 – System Overview in block diagram


3. Design Considerations

The system will be designed with an object-oriented programming paradigm. Each module in PAPAĞAN will be represented by a class, and those modules will be able to communicate with each other. Since the system is user-interactive, the design should also consider user preferences during the implementation of PAPAĞAN, so that users can use it easily.

3.1. Design Assumptions, Dependencies and Constraints

3.1.1. Time Constraints

PAPAĞAN has to run synchronously with the Kinect device and the user. The other modules should finish their jobs before Kinect sends new data. Since user motions will be the input to the system, it should be able to process incoming information at a rate of 10-15 frames per second, i.e., within roughly 66-100 ms per frame.

3.1.2. Resource Constraints

The project will use the Kinect device as the input source, so the Microsoft Kinect SDK must be installed for development. The use of this SDK also constrains the platform, forcing the use of Microsoft's operating system (Windows 7) and the Microsoft database management server, MSSQL.

3.1.3. Performance Constraints

When the system is active, interaction with the users will be the most frequently repeated event; thus, as stated above, the system should process 10-15 frames per second for proper recognition of the gestures.

3.1.4. Hardware Constraints

The most crucial hardware component of this project is the Kinect device, through which user interaction will be processed. The system needs a powerful processor in order to run complicated pattern recognition and stereo matching algorithms while processing 10-15 frames per second; most new-generation dual-core processors will do the work. The system's memory consumption should never exceed 20% of the total memory, and the minimum memory for the system to work properly is 2 GB.


3.1.5. Software Constraints

The system will be implemented in the C# programming language. The GUI will be implemented using Visual Studio 2010. The Database Module will interact with the MSSQL DBMS to provide the connection between the system and the database. As mentioned above, the operating system must be Windows 7 due to the Kinect SDK's own constraints.

3.2. Design Goals and Guidelines

Given PAPAĞAN's intended purpose and targeted audience, the system will mainly be evaluated by its accuracy, usability and coverage. As the development team, we set these properties as our priorities throughout the system design. They are explained below.

PAPAĞAN will be considered only as successful as its accuracy in interpreting TSL: the system cannot earn user confidence unless it reaches a reasonable accuracy level. It is clear that as the number of TSL moves covered by the system increases, the accuracy level will drop, as similar projects show [3]. For this reason, aiming for a 100% accuracy level would be too optimistic a prediction; but in order to earn user confidence, the targeted accuracy level should be at least 90%.

As the targeted audience includes people with various disabilities, the user interfaces and system functionalities should be consistent, clean, easy to use and easy to understand. Also, considering speech- and hearing-disabled users, audible interactions and materials should be kept to a minimum; at points where audible materials add value to the system (such as resulting interpretations), the information should be available in alternative formats as well.

Once PAPAĞAN reaches the desired levels of performance and accuracy, it can be considered complete, and the rest of the implementation will focus on enlarging PAPAĞAN's coverage of TSL. The initial design, however, will cover only the basic phrases of TSL.


4. Data Design

4.1. Data Description

This section describes the data objects that will be used and managed in PAPAĞAN. The Data Objects section defines the data objects and their attributes; the Complete Data Model section then shows the relations between the classes.

4.1.1. Data Objects

Motion: This class is used to describe the motions received through the Kinect device as inputs of the user. Its attributes, with their explanations, are shown below.

Motion Table:

| Field Name | Data Type | Description |
|---|---|---|
| startPosition | tuple | The starting coordinates of a body-part movement; holds three values, for the x, y and z coordinates |
| endPosition | tuple | The ending coordinates of a body-part movement; holds three values, for the x, y and z coordinates |
| feedback | float | Result of a motion in interpreter mode, used as a feedback message for the user; a value between 0.0 and 10.0 |
| isTrue | boolean | Result of a motion in interpreter mode, used as a feedback message for the system |

Table 1 – Motion Table
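As a minimal sketch, assuming C# (the project's implementation language), the Motion object above could be declared as follows; rendering the coordinate triples as Tuple<float, float, float> is an illustrative choice, not a fixed design decision.

```csharp
using System;

// Minimal sketch of the Motion data object in Table 1. The coordinate
// triples are rendered here as Tuple<float, float, float>.
public class Motion
{
    // Starting and ending coordinates (x, y, z) of a body-part movement.
    public Tuple<float, float, float> StartPosition { get; set; }
    public Tuple<float, float, float> EndPosition { get; set; }

    // Interpreter-mode score shown to the user, in the range 0.0 - 10.0.
    public float Feedback { get; set; }

    // Interpreter-mode result flag used as feedback for the system itself.
    public bool IsTrue { get; set; }
}
```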

Skeleton: This class is used to represent the basic skeleton form of the user's body. The attributes in this class are predefined by, and can be accessed through, the Microsoft Kinect SDK itself, and are shown below.

Skeleton Table:

| Field Name | Data Type | Description |
|---|---|---|
| head | Head | Represents the user's head; maps the head to the sensor so it can be tracked by Kinect |
| neck | Neck | Represents the user's neck; maps the neck to the sensor so it can be tracked by Kinect |
| torso | Torso | Represents the user's torso; maps the torso to the sensor so it can be tracked by Kinect |
| leftShoulder | Shoulder | Represents the user's left shoulder; maps it to the sensor so it can be tracked by Kinect |
| rightShoulder | Shoulder | Represents the user's right shoulder; maps it to the sensor so it can be tracked by Kinect |
| leftElbow | Elbow | Represents the user's left elbow; maps it to the sensor so it can be tracked by Kinect |
| rightElbow | Elbow | Represents the user's right elbow; maps it to the sensor so it can be tracked by Kinect |
| leftWrist | Wrist | Represents the user's left wrist; maps it to the sensor so it can be tracked by Kinect |
| rightWrist | Wrist | Represents the user's right wrist; maps it to the sensor so it can be tracked by Kinect |
| leftHand | Hand | Represents the user's left hand; maps it to the sensor so it can be tracked by Kinect |
| rightHand | Hand | Represents the user's right hand; maps it to the sensor so it can be tracked by Kinect |
| leftFingertip | FingerTip | Represents the user's left fingertip; maps it to the sensor so it can be tracked by Kinect |
| rightFingertip | FingerTip | Represents the user's right fingertip; maps it to the sensor so it can be tracked by Kinect |

Table 2 – Skeleton Table

Sign: This class holds the attributes representing TSL signs, which will be compared with incoming user input. In other words, it is used to do the main job of the system. Its attributes, with their explanations, are shown below.

Sign Table:

| Field Name | Data Type | Description |
|---|---|---|
| motion | Vector<Tuple> | Used to detect user movements and pass arguments to the classes and activities that need them |
| audio | AudioStream | Keeps the audio stream of the relevant sign |
| text | Text | Keeps the text form of the Turkish Sign Language word |
| video | VideoStream | Keeps the video stream of the Turkish Sign Language word; used in teaching mode |
| user | User | Keeps the information of the user, used to track body motion and hand signs and translate them into text |

Table 3 – Sign Table

User: This class is used to hold attributes with information about the user. Its attributes, with their explanations, are shown below.

User Table:

| Field Name | Data Type | Description |
|---|---|---|
| skeleton | Skeleton | An object of Skeleton; keeps the locations of the user's body joints and transmits this information to the Sign class |
| mode | integer | Keeps the mode in which the user is currently using the system |

Table 4 – User Table

4.1.2. Relationships

This part describes the relationships between the classes defined in the 4.1.1. Data Objects section.

Sign – Motion (has relationship): In the Sign class, the motion is described as an entity. It is used to translate a motion to a sign and then to text; this relationship is needed for translation. The relationship is shown below.

Figure 2 - Sign and Motion Objects Relationship


Sign – User (has relationship): In the Sign class, the user's information must be specified so that motions can be mapped to their relevant meanings. So, the Sign class uses the User class and its information. The relationship is shown below.

Figure 3- Sign and User Objects Relationship

User – Skeleton (has relationship): In the User class, the user has a skeleton to be recognized by Kinect. The relationship is shown below.

Figure 4 - User and Skeleton Objects Relationship

Ceng 492 Revised Design Report

14

4.1.3. Complete Data Model

The complete data model is shown below.

Figure 5 – Complete Data Model
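Since Figure 5 cannot be reproduced here, the sketch below restates the complete data model in C#. It only encodes the attributes and has-a relationships defined above; the placeholder Joint type stands in for the per-joint types of Table 2, and plain strings stand in for the AudioStream and VideoStream types.

```csharp
using System;
using System.Collections.Generic;

// Placeholder for the joint types of Table 2 (Head, Neck, Torso, ...),
// which in the real system come through the Microsoft Kinect SDK.
public class Joint
{
    public float X { get; set; }
    public float Y { get; set; }
    public float Z { get; set; }
}

// Skeleton: one field per tracked body joint (Table 2).
public class Skeleton
{
    public Joint Head { get; set; }
    public Joint Neck { get; set; }
    public Joint Torso { get; set; }
    public Joint LeftShoulder { get; set; }
    public Joint RightShoulder { get; set; }
    public Joint LeftElbow { get; set; }
    public Joint RightElbow { get; set; }
    public Joint LeftWrist { get; set; }
    public Joint RightWrist { get; set; }
    public Joint LeftHand { get; set; }
    public Joint RightHand { get; set; }
    public Joint LeftFingertip { get; set; }
    public Joint RightFingertip { get; set; }
}

// User has a Skeleton (Figure 4) and a mode flag (Table 4).
public class User
{
    public Skeleton Skeleton { get; set; }
    public int Mode { get; set; }
}

// Sign has a Motion sequence and a User (Figures 2 and 3), plus the
// text, audio and video forms of the TSL word (Table 3). The report's
// Vector<Tuple> is rendered as a list of coordinate tuples, and the
// stream types as illustrative string paths.
public class Sign
{
    public List<Tuple<float, float, float>> Motion { get; set; }
    public string Audio { get; set; }
    public string Text { get; set; }
    public string Video { get; set; }
    public User User { get; set; }
}
```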


4.2. Data Dictionary

| Name | Type | Refer to Section |
|---|---|---|
| startPosition | tuple | 4.1.1 |
| endPosition | tuple | 4.1.1 |
| angularVelocity | float | 4.1.1 |
| feedback | float | 4.1.1 |
| isTrue | boolean | 4.1.1 |
| head | Head | 4.1.1 |
| neck | Neck | 4.1.1 |
| torso | Torso | 4.1.1 |
| leftShoulder | Shoulder | 4.1.1 |
| rightShoulder | Shoulder | 4.1.1 |
| leftElbow | Elbow | 4.1.1 |
| rightElbow | Elbow | 4.1.1 |
| leftWrist | Wrist | 4.1.1 |
| rightWrist | Wrist | 4.1.1 |
| leftHand | Hand | 4.1.1 |
| rightHand | Hand | 4.1.1 |
| leftFingertip | FingerTip | 4.1.1 |
| rightFingertip | FingerTip | 4.1.1 |
| motion | Vector<Tuple> | 4.1.1 |
| audio | AudioStream | 4.1.1 |
| text | Text | 4.1.1 |
| video | VideoStream | 4.1.1 |
| user | User | 4.1.1 |
| skeleton | Skeleton | 4.1.1 |
| mode | integer | 4.1.1 |
| GUI Module | Module | 5.2.1 |
| Database Module | Module | 5.2.2 |
| Hand Sign Recognizer Module | Module | 5.2.3 |
| User | Class | 4.1.1 |
| Sign | Class | 4.1.1 |
| Skeleton | Class | 4.1.1 |
| Motion | Class | 4.1.1 |
| sendMovementData() | Action function | 5.2.3 |
| sendResulttoScreen() | Action function | 5.2.3 |
| sendFeedBacktoScreen() | Action function | 5.2.3 |
| sendFeedBack() | Action function | 5.2.2 |
| sendResult() | Action function | 5.2.2 |
| start() | Action function | 5.2.1 |
| end() | Action function | 5.2.1 |
| selectMode() | Action function | 5.2.1 |
| playVideo() | Action function | 5.2.1 |
| playAudio() | Action function | 5.2.1 |
| selectTeachAction() | Action function | 5.2.1 |
| retry() | Action function | 5.2.1 |
| returnSelectedObject() | Action function | 5.2.1 |
| startTrack() | Action function | 5.2.1 |
| endTrack() | Action function | 5.2.1 |
| returnAudio() | Action function | 5.2.1 |
| returnVideo() | Action function | 5.2.1 |
| returnText() | Action function | 5.2.1 |

Table 6 – Data Dictionary


5. System Architecture

5.1. Architectural Design

PAPAĞAN will include three main modules that interact with each other, the user and the Kinect device: the GUI Module, the Database Module and the Hand Sign Recognizer Module. The GUI Module is an intermediate module that provides the connection between the user and PAPAĞAN; it processes user requests and passes them between the user, the database and the Hand Sign Recognizer Module. The Database Module will hold the gestures that PAPAĞAN can handle, together with their videos, and will communicate with both the Hand Sign Recognizer Module and the GUI Module. The final module, the Hand Sign Recognizer Module, is the base module that connects the user to the system: it is connected to the Kinect device at one end and to the database at the other, receiving the data of a movement from the Kinect device and interacting with the database. How these modules are connected is presented in the following UML diagram.

Figure 6 – Architectural Design UML Diagram
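One hedged way to express this three-module decomposition in C# is sketched below. The method names are taken from the data dictionary in Section 4.2; the interface names themselves are illustrative.

```csharp
// Sketch of the three-module decomposition of Figure 6. Method names
// come from the data dictionary (Section 4.2); interface names are ours.
public interface IGuiModule
{
    void Start();               // program begins, user introduces the body
    void End();                 // program terminates
    void SelectMode(int mode);  // interpreter or teaching mode
    void StartTrack();          // begin receiving data from Kinect
    void EndTrack();            // stop receiving data from Kinect
}

public interface IHandSignRecognizerModule
{
    void SendMovementData();     // joint data arriving via the GUI/Kinect
    void SendResulttoScreen();   // interpreted sign passed back to the GUI
    void SendFeedBacktoScreen(); // teaching-mode feedback passed to the GUI
}

public interface IDatabaseModule
{
    void SendFeedBack(); // feedback for a teaching-mode exercise
    void SendResult();   // result of a teaching-mode test
}
```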


5.2. Description of Components

5.2.1. Graphical User Interface (GUI) Module

5.2.1.1. Processing Narrative

GUI Module is an intermediate module providing the connection between the user and PAPAĞAN. It processes user requests and passes them between the user, the database and the Hand Sign Recognizer Module. The GUI Module's responsibilities towards the other modules are receiving input from and providing output to the user, invoking the Hand Sign Recognizer Module for TSL sign input arriving through Kinect, and accessing the Database Module to retrieve information for the Educational Mode, where access to Kinect or the Hand Sign Recognizer Module is not required.

5.2.1.2. Interface

User inputs come to the GUI Module from the user via the GUI. These inputs indicate user preferences about the screen content. The module outputs the screen content according to those preferences and informs the user throughout the processes.

5.2.1.3. Detailed Processing

The user input will be received via clicks on interface buttons. Depending on the process, a click can activate the Kinect device and the Hand Sign Recognizer Module for an incoming TSL movement, navigate the user to a different interface within the module, or retrieve user-requested data from the Database Module.

5.2.1.4. Dynamic Behavior


Figure 7 – GUI Module Dynamic Behavior Model

Figure 8 – GUI Module Sequence Diagram

5.2.2. Database Module

5.2.2.1. Processing Narrative

Database module will hold the gestures as a sequence of sub-gestures, together with other relevant formats such as audio, video and text. This module will communicate with both the Hand Sign Recognizer Module and the GUI Module. We chose to use this module because speed is one of the main considerations for PAPAĞAN: in interpreter mode, where the given user motion is translated to the relevant TSL sign, the worst-case scenario is that the system compares all recorded gestures with the received user input. For this reason the database should be kept well ordered and as small as possible.
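The report does not fix a database schema, but a hedged sketch of how signs and their sub-gesture sequences might be laid out in MSSQL is shown below; every table and column name here is illustrative, not the project's final schema.

```csharp
// Hedged sketch of a possible MSSQL layout for the stored signs; all
// table and column names are illustrative, not the project's final schema.
public static class PapaganSchema
{
    public const string CreateTables = @"
        CREATE TABLE Sign (
            SignId    INT IDENTITY PRIMARY KEY,
            Text      NVARCHAR(100) NOT NULL,
            AudioPath NVARCHAR(260),   -- audio form of the sign
            VideoPath NVARCHAR(260)    -- teaching-mode video
        );
        CREATE TABLE Motion (
            MotionId INT IDENTITY PRIMARY KEY,
            SignId   INT NOT NULL REFERENCES Sign(SignId),
            Seq      INT NOT NULL,     -- position in the sub-gesture sequence
            StartX REAL, StartY REAL, StartZ REAL,
            EndX   REAL, EndY   REAL, EndZ   REAL
        );";
}
```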


5.2.2.2. Interface

Database module works collaboratively with both the GUI and Hand Sign Recognizer modules. In Teaching Mode, the GUI Module can request a specific gesture's video, audio, etc. by passing its id to the database: the user selects a gesture to learn or practice, and the GUI Module sends this input to the Database Module. The Hand Sign Recognizer Module passes the received user motion sequence to be mapped to the ones in the database.

5.2.2.3. Detailed Processing

Database module is accessed through the Hand Sign Recognizer Module or directly from the GUI Module. The requests can be made either with a sequence of sub-movements or with a specific id related to a TSL sign.

5.2.2.4. Dynamic Processing

Figure 9 – Database Module Dynamic Behavior Model


Figure 10 – Database Module Sequence Diagram

5.2.3. Hand Sign Recognizer Module

This module is the base module that connects the user to the system. It is connected to the Kinect device, the database and the GUI Module. It receives the data of a movement from the Kinect device and interacts with the database; depending on the incoming data, it performs database operations such as comparison and insertion, and returns a result or feedback to the user. The data received from the Kinect device consists of the coordinates of the joints of a body, frame by frame.

5.2.3.1. Processing Narrative

This module receives the coordinates of the body joints. After that, it interacts with the database: a comparison operation compares the movement just received with the movements stored in the database. According to the outcome of this comparison, it activates other modules of the system.

5.2.3.2. Interface

In this module, first the mode selection screen comes up with the choices of training mode and interpreter mode. Both selections create interaction with the database.

Training mode: In this mode, the screen lists movements that can be studied and practiced through PAPAĞAN. After this choice, depending on the chosen TSL sign, a teaching video or audio clip comes up on the screen to tutor the user.

Interpreter mode: In this mode, PAPAĞAN waits for the user to make a move to be interpreted, so there is just a waiting screen on the interface.

5.2.3.3. Detailed Processing

This module waits for an incoming user movement. The Kinect device transmits the movement coordinates to this module, and these coordinates are compared at runtime with the ones stored in the database. If there is a match, the module returns the meaning of the movement; otherwise, it reports a failure and requests a retry of the action.

5.2.3.4. Dynamic Processing

Figure 11 – Hand Sign Recognizer Module Dynamic Behavior Model


Figure 12 – Hand Sign Recognizer Module Sequence Diagram

5.3. Design Rationale

We have chosen a three-part module decomposition, following the decoupling principle [4]. As the principle requires, the system is divided into three modules that should not depend on each other. Of course these modules are highly related through the processes, but the flow passes from one module to another and they do not work simultaneously. Additional modules could have been introduced, but to keep the design simple and understandable, a division into three provides a good balance between decoupling the system for debugging and development, and simplifying it for design and understandability.


5.4. Traceability of Requirements

In this section, the relations between the use case operations and the descriptions of the components are shown.

5.4.1. GUI Module and Use Case Relations

Use case relations with the GUI Module are shown in the table below:

| Design Elements | Functional Operations | GUI Module Relation |
|---|---|---|
| Login Mode | Logging into PAPAĞAN | User logs into the system. |
| | Select Mode Operation | User selects a mode. |
| | Logout from PAPAĞAN | User logs out of the system. |
| Interpreter Mode | Returning to the Main Screen | User gives input by pressing the return button. |
| | Select Output Type | User chooses the output type from the GUI. |
| Teaching Mode | Returning to the Main Screen | User gives input by pressing the return button. |
| | Selecting a Gesture to Teach | User selects a gesture from the gesture list on the GUI. |
| | Play Video and Play Video Again | User presses the play button via the GUI. |
| | Practice | User takes a test him/herself using the GUI. |
| | Results and Get Feedback | GUI shows the result gathered from the Hand Sign Recognition Module. |

Table 7 – Use case relations for GUI Module

5.4.2. Database Module and Use Case Relations

Use case relations with the Database Module are shown in the table below:


| Design Elements | Functional Operations | Database Module Relation |
|---|---|---|
| Interpreter Mode | Translate a Phrase | Map user input to the relevant TSL sign from the database module. |
| Teaching Mode | Selecting a Gesture to Teach | The related video for the gesture is requested from the database module. |
| | Practice | The related videos and gestures are compared with the ones in the database. |

Table 8 – Use case relations for Database Module

5.4.3. Hand Sign Recognizer Module and Use Case Relations

Use case relations with the Hand Sign Recognizer Module are shown in the table below:

| Design Elements | Functional Operations | Hand Sign Recognizer Module Relation |
|---|---|---|
| Interpreter Mode | Translate a Phrase | Map user input to the relevant TSL sign from the database module. |
| Teaching Mode | Practice | Results are processed by the Hand Sign Recognition Module and the Translate a Phrase operation is applied to each answer. |

Table 9 – Use case relations for Hand Sign Recognizer Module


6. User Interface Design

6.1. Overview of User Interfaces

This section describes the functionality of the system from the user's perspective.

6.1.1. Main Screen

Initial Page: This screen is the one that first appears when the system starts. On this screen, the user is expected to identify himself to the system; this process is necessary for the detection of the user by the Kinect device. There is a button at the bottom of the screen to start the identification. After this button is pressed, the user should stand still for a while to be identified clearly. When the identification process is finished, the user jumps to the main screen automatically.

Main Screen: The main screen is the page where the user selects the mode in which he will use PAPAĞAN. On the screen there are two buttons: the first is to select Interpreter Mode, and the second is to select Education Mode. According to the mode the user chooses, he will be directed to the related page.


Figure 13 – Main Screen Interfaces Overview

6.1.2. Interpreter Mode Screens

Interpreter Mode Main Screen: This page actually has no buttons. Instead, there is just a text prompt directing the user to make a motion to be translated. Although the page has no buttons, it has the functionality to detect the start and end of the user's motion. After the motion is detected, the gesture is sent to the Database Module for evaluation, and when this evaluation is finished, the user is directed to the result page.

Interpreter Result Page: This screen contains the result of the user's motion. After the Database Module has evaluated the motion and sent the necessary information to the Hand Sign Recognizer Module, this screen prints the related result. If the user's motion corresponds to a word or a phrase in our database, the related word or phrase is shown on the screen; otherwise no result is shown.

In addition to the motion result, this screen includes three buttons. The first, the "tekrar dene" (retry) button, is for retrying: when the user makes a wrong motion or the motion has no meaning in our TSL database, the user is directed back to the Interpreter Mode Main Screen to retry his motion. The second, the "ana menüye dön" (return to main menu) button, returns to the main menu: when the user completes his motion and sees the corresponding result on the screen, he is directed to the main page by this button. The last one, the "çıkış" (exit) button, is for logging out of the system.

Figure 14 – Interpreter Mode Screens Interfaces Overview

6.1.3. Teaching Mode Screens

Teaching Mode Main Screen: This mode is for teaching TSL movements, so there are many buttons on the screen, each carrying the name of a TSL movement. The user can choose the word he wants to learn and is then directed to the Video Player Screen.

Video Player Screen: On this screen, as its name states, a video plays. The video teaches the user how the chosen word is made in TSL. After the video finishes, the user is expected to repeat the same motion. This is the core of our education mode: the user is evaluated according to how well he is capable of performing the exact motion in the video.

Education Feedback Screen: This screen appears after the user's motion has been evaluated by the Database Module, and it shows the evaluation result; the user thus gets feedback about his motion after watching the education video and trying to repeat the movement. In addition to the evaluation result, there are three buttons on the screen, with the same functionality as at the end of Interpreter Mode: the first, "tekrar dene", retries the movement taught by PAPAĞAN; the second, the "ana menüye dön" button, returns to the main screen; and finally, the "çıkış" button logs out of the system.

Figure 15 – Teaching Mode Screens Interfaces Overview


6.2. Screen Images

6.2.1. Main Menu Screen

On the Main Menu screen the user is requested to select the desired mode. If the user selects Interpretation Mode (Çevirme Modu), he/she will be directed to a screen for inputting sign language movements to the Kinect device. If the user selects Education Mode (Eğitim Modu), he/she will be directed to the Education Mode Main Screen to select a sign to learn and study.

Figure 16 – Main Menu Screen

6.2.2. Teaching Mode Main Screen

The Education Mode Main Screen is the screen where the user selects the desired sign to exercise.


Figure 17 – Teaching Mode Screen

6.2.3. Interpreter Result Screen

In Interpretation Mode, after the user inputs the sign movement to Kinect, the interpreter recognizes the sign and outputs it in text format on the Interpreter Result Screen. On this screen the user can hear the phrase in audio format or watch its video. Once the user is done, he/she can either navigate to the Main Screen or input a new sign by clicking the relevant buttons.


Figure 18 – Interpreter Result Screen

6.2.4. Educational Feedback Screen

In Education Mode, once the user watches the video of the desired movement and exercises the movement him/herself, the user is directed to the Educational Feedback Screen and can see how well he/she managed to do the sign. Depending on the user's performance, he/she can keep practicing the same move or navigate to the Main Screen.


Figure 19 – Educational Feedback Screen

6.3. Screen Objects and Actions

When PAPAĞAN starts, there is an identification screen with an "identify yourself" object. PAPAĞAN first needs to identify the user and his body joints before receiving any input. After identification comes the mode selection screen, with two objects, one for each mode; these objects' actions activate the corresponding modes in PAPAĞAN.

In teaching mode, TSL signs are taught to the user. Every object on the screen has the action of displaying the corresponding education video/audio. PAPAĞAN then waits for the user to repeat the displayed sign, so that the user can practice it. Once the user completes the practice, a feedback screen opens, which evaluates the user's movement. On this screen there are two objects: one returns to the main menu and the other requests another try.

In Interpreter mode, PAPAĞAN requests a user input movement to be interpreted as TSL. If the movement can be matched with the ones stored in the database, the translation of this movement is reflected on the screen in the desired format. Along with the result, the screen contains two objects: one to return to the main screen and the other to interpret another TSL sign.


7. Detailed Design

7.1. Graphical User Interface Module

7.1.1. Classification

Graphical User Interface Module is a sub-module of the system.

7.1.2. Definition

This module is designed to handle all user interface interactions. It communicates with the user, receives inputs, passes them to the related components and shows the results received from those components. Although the functionality of this component is simpler than that of the others, usability is a priority for PAPAĞAN, as indicated in our design goals; thus this component is just as crucial as the others. It is the front end of the product, so it should be handled smoothly, because any failure in it will draw the user's attention and be criticized first.

7.1.3. Responsibilities

As mentioned above, this module is responsible for user interactions; to make it clearer, all user requests and responses are handled through this component. Since PAPAĞAN's targeted user profile includes handicapped people, this component should be implemented with those users in mind: it must be clear, consistent and easy to use.

7.1.4. Constraints

Since the user will be in front of the Kinect while using the program, enabling mouse interaction is not a good idea; button triggering should be handled in a better way than mouse clicks. The system's solution to this problem is controlling the cursor via hand gestures: the basic gestures explained in the Processing section will be used to control cursor movement. Another constraint is that the user should first introduce his/her body, so that skeleton information is available before using the system.


7.1.5. Compositions

There are three sub-components in this module. The first handles the user commands: button actions are handled in this sub-component. The second handles database requests: in teaching mode, when a sign to be learned is selected, this component requests that Sign object directly from the database. The last sub-component handles the communication between the Hand Sign Recognizer Module and the GUI Module; it is the most important of the three, as it sends the tracked gesture to the hand sign recognizer component and receives the result of the gesture interpretation.

7.1.6. Uses/Interactions

The GUI mostly interacts with the user, handling the actions the user triggers and changing its state according to those actions and their results. In more detail, when an interpretation action is triggered, it receives the user movement and sends it to the hand sign recognizer; after the result returns from the recognizer, the module changes its state and outputs the result on the screen. This module also interacts with the other two components of the system, as explained in the Compositions section.

7.1.7. Resources

Visual Studio 2010 provides the resources needed for this component. The GUI will be implemented with Windows Forms, using .NET for the GUI implementation.

7.1.8. Processing

The Graphical User Interface component has a straightforward implementation of its interaction with the user: basically, it receives actions and sends them to the related component. While receiving a gesture from the user, it creates a Motion object, fills it with data received from the Kinect device, and then sends it to the Hand Sign Recognizer component for processing.
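A hedged sketch of that flow is given below, assuming the SkeletonFrameReady event of the Kinect for Windows SDK v1 and the Motion class sketched in Section 4.1.1; the gesture-completion logic and the recognizer hand-off are placeholders.

```csharp
using System;
using Microsoft.Kinect;

// Hedged sketch: fill a Motion object from Kinect skeleton data and hand
// it to the recognizer. Event and type names follow the Kinect for
// Windows SDK v1; the recognizer hook is a placeholder.
public class GuiKinectBridge
{
    private KinectSensor sensor;
    private Motion currentMotion = new Motion();

    public void StartTrack()
    {
        sensor = KinectSensor.KinectSensors[0];
        sensor.SkeletonStream.Enable();
        sensor.SkeletonFrameReady += OnSkeletonFrameReady;
        sensor.Start();
    }

    public void EndTrack()
    {
        sensor.Stop();
    }

    private void OnSkeletonFrameReady(object s, SkeletonFrameReadyEventArgs e)
    {
        using (SkeletonFrame frame = e.OpenSkeletonFrame())
        {
            if (frame == null) return;
            var data = new Microsoft.Kinect.Skeleton[frame.SkeletonArrayLength];
            frame.CopySkeletonDataTo(data);
            foreach (var sk in data)
            {
                if (sk.TrackingState != SkeletonTrackingState.Tracked) continue;
                // Track the right hand as an example; a full implementation
                // would record every joint listed in Table 2.
                SkeletonPoint hand = sk.Joints[JointType.HandRight].Position;
                currentMotion.EndPosition = Tuple.Create(hand.X, hand.Y, hand.Z);
                // When the gesture is complete, currentMotion would be sent
                // to the Hand Sign Recognizer component for processing.
            }
        }
    }
}
```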

7.1.9. Interface Exports

This component handles all user-accessible functionality in PAPAĞAN. In this section those functionalities are explained in detail; the related sequence diagram for the module is available in Section 5.2.1.4.

start(): The program begins and the user introduces his/her body.

end(): The program terminates.

selectMode( integer ): Called when the user selects a mode; it sets the current user mode.

selectTeachingAction( integer ): The user selects one of the teaching-mode gestures.

playVideo(): In teaching mode, the selected sign's video is requested from the database and played.

playAudio(): In teaching mode, the selected sign's audio is requested from the database and played.

returnMainScreen(): When this action is triggered, the program returns to the main screen.

retry(): Called when the current gesture is not recognized by the system. A call to this function indicates either the user's inability to input the TSL sign properly or PAPAĞAN's inability to process the input.

returnSelectedObject( object id ): In teaching mode, when a sign is requested from the database, this sends its id to the database and gets the Sign object.

startTrack(): Starts receiving data from the Kinect device.

endTrack(): Ends receiving data from the Kinect device.


7.2. Database Module

7.2.1. Classification

Database module is a sub-module of the whole system.

7.2.2. Definition

This module is the interface between PAPAĞAN and the database management system; the other modules use the database through it. Centralizing database communication eases the job, instead of accessing the database from each component separately. The module responds to requests either to fetch information from the database or to store data in it.

7.2.3. Responsibilities

The purpose of this module is to provide the other modules with communication with the database. It manages retrieving and storing information in the database according to the requests received from the other modules.

7.2.4. Constraints

This module is completely dependent on the other modules: it assumes that they always request the correct information to be stored or delivered. Put simply, this component only does what it is told to do. In order to communicate with it, the other modules must satisfy some constraints, such as connecting to the database with the correct name and password, or manipulating the data with a proper request. The stored data types are strings, integers and floats as usual, along with video and audio objects, which are stored in a different way that will be explained in the Processing section.

7.2.5. Compositions

There are two sub-components of this module, namely signs and motions. These sub-components have a one-to-one relation between them, which means that for every motion in the database there is a Sign object stored in the database.


7.2.6. Uses/Interactions

When data is requested from the database, class instances will be created for the related database tables and filled with the requested information, to be used for the requested execution. Similarly, other operations such as retrieving, updating, deleting and inserting data in the database will be handled in an object-oriented manner: the table related to the object will be updated automatically. The functionality beneath the user interface will be realized by means of this group of classes.

7.2.7. Resources

Database Module interacts with an MSSQL DBMS. We chose this database because we are developing in a Microsoft environment and are bound to Microsoft Visual Studio 2010, since we are using the Microsoft Kinect SDK. It also has its own advantages: it is not necessary to load ODBC or JDBC drivers to access the database; simply setting the connection string of the database connection object is enough. System.Data.SqlClient will be used for the database actions. Since the project is a single-user project without concurrent requests, there will not be any deadlocks or race conditions in the database operations.

7.2.8. Processing

The component performs its duties over the Motion and Sign objects explained before, with their attributes. The interpreter-mode algorithm for the database component is as follows: the motion is sent to the recognizer module, and the recognizer module then requests all the motions from the database. If it finds a match, it requests the sign related to the motion id and sends the related information to the interface module; if it cannot find a match, it sends a proper message to the interface module. When the recognizer requests the motions from the database, for each table element a Motion object is created, its fields are filled, and it is put into a container. As stated before, the database module will do only what it is told to do; however, it is responsible for throwing proper exceptions if something goes wrong during database transactions. For example, if the connection fails it raises a connection problem, and if there is a problem with a particular request it returns a message related to that request. This eases the development team's job of finding bugs and problems.
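A hedged sketch of that flow with System.Data.SqlClient, which Section 7.2.7 names as the data-access library, follows; the connection string and the table and column names are illustrative and match the schema sketch in Section 5.2.2.1.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

// Hedged sketch of the flow described above, using System.Data.SqlClient.
// Connection string, table and column names are illustrative.
public static class DatabaseModule
{
    private const string ConnStr =
        "Server=localhost;Database=Papagan;Trusted_Connection=True;";

    // Load every stored motion into a container for the recognizer,
    // raising a descriptive exception if the database cannot be reached.
    public static List<Motion> GetMotions()
    {
        var motions = new List<Motion>();
        try
        {
            using (var conn = new SqlConnection(ConnStr))
            using (var cmd = new SqlCommand(
                "SELECT StartX, StartY, StartZ, EndX, EndY, EndZ FROM Motion",
                conn))
            {
                conn.Open();
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        motions.Add(new Motion
                        {
                            StartPosition = Tuple.Create(reader.GetFloat(0),
                                                         reader.GetFloat(1),
                                                         reader.GetFloat(2)),
                            EndPosition = Tuple.Create(reader.GetFloat(3),
                                                       reader.GetFloat(4),
                                                       reader.GetFloat(5))
                        });
                    }
                }
            }
        }
        catch (SqlException ex)
        {
            // Surface a descriptive error so problems are easy to trace
            // during development, as described above.
            throw new InvalidOperationException("Database request failed.", ex);
        }
        return motions;
    }
}
```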


7.2.9. Interface Exports

The set of services provided by this module is limited. The user will not be able to send requests to the database directly: all data in the database is inserted when the program is installed, and the user is not allowed to modify the database in this version of the project. In fact, there will be no change in the database after the implementation is finished; but if we develop a new version and enlarge the set of signs that can be interpreted, the database will be modified by the developers. The functionality of each element of this module is explained below; the related sequence diagram can be found in the 5.2.2.4 Dynamic Processing section.

sendFeedBack(): Sends feedback related to a question answered in teaching mode.

sendResult(): Sends the result of the test taken in teaching mode.

7.3. Hand Sign Recognizer Module

7.3.1. Classification

Hand Sign Recognizer Module is a sub-module of the system.

7.3.2. Definition

Hand Sign Recognizer component basically deals with the recognition of incoming user gestures. It processes the given gesture and matches it with the ones stored in the database. This module is the core of our system, since the functionality of the system is available only if this module works properly.

7.3.3. Responsibilities

Hand Sign Recognizer is responsible for detecting gestures and recognizing them. This component is also responsible for creating the corresponding sequence of sub-actions for a recognized gesture and passing the resulting data to the Graphical User Interface Module.


7.3.4. Constraints

Hand Sign Recognizer component should be implemented such that it can handle the gesture recognition process in real time; providing real-time recognition is the most problematic issue the system has to deal with. This module's success is directly related to the quality of the data received from the Kinect device. The recognizer component should work properly independent of the user's distance from the device or the illumination of the environment, so that these constraints are not passed on to the user.

7.3.5. Compositions

The component does not have any sub-components; it works as a whole.

7.3.6. Uses/Interactions

Hand Sign Recognizer component affects the system behavior and indirectly changes the UI. Input to the GUI component is provided as a Sign object, so that the interface of the system changes state according to the current mode of the user. In detail, if the user is in Interpreter Mode when the Sign object is returned from this module, the GUI shows the mapped gesture from the database in various formats; if the user is in teaching mode, the GUI changes its state to show the results of the feedback test. More cases are explained in detail in the Processing section.

7.3.7. Resources

The resources needed by this component are the Kinect device and the gesture information stored in the database. Without these resources this component cannot work properly, because the module mainly maps the incoming user data from Kinect to the relevant match in the database. The Microsoft Kinect SDK will be used to retrieve the incoming user data.

7.3.8. Processing

Hand sign recognizer does the most important processing work of the entire project, and implementing it will require the largest effort. The recognizer works as follows: it receives a Motion object from the GUI module and processes it. The motion is sampled into small components of gesture particles. The recognizer then tries to match the motion received through the GUI with the ones in the database. If the matching succeeds, the Sign object is requested from the database with the motion object's id, and the Sign object requested from the database module is sent to the GUI module. If the match fails, a proper message is shown through the GUI module. Down-sampling the given gestures is the most important task in this project, while matching against the database is trivial.

7.3.9. Interface Exports

This component, as stated before, handles the most crucial services of PAPAĞAN. The related sequence diagram can be found in the 5.2.3.4 Dynamic Processing section. The functionality of each element is explained below:

getMotions(): A private function of this module; it gets all motion sequences from the database.

getSign( Motion id ): A private function of this module; its job is to get the Sign object with the same motion id.

sampleTheMotion( Motion ): Takes a motion as a parameter and breaks it into small gesture components, to compare it with the saved ones.
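A hedged sketch of how these exports might fit together is given below. The endpoint-distance matching rule and its tolerance are deliberate placeholder simplifications, not the Divide and Collect algorithm itself.

```csharp
using System;
using System.Collections.Generic;

// Sketch of the recognizer flow of Sections 7.3.8-7.3.9: sample the
// incoming motion, compare it with the stored ones, and fetch the
// matching Sign. The matching rule here is a deliberate simplification.
public class HandSignRecognizer
{
    private const float Tolerance = 0.05f; // metres; illustrative value

    private readonly Func<List<Motion>> getMotions; // getMotions()
    private readonly Func<int, Sign> getSign;       // getSign( Motion id )

    public HandSignRecognizer(Func<List<Motion>> getMotions,
                              Func<int, Sign> getSign)
    {
        this.getMotions = getMotions;
        this.getSign = getSign;
    }

    public Sign Recognize(Motion incoming)
    {
        Motion sampled = SampleTheMotion(incoming);
        List<Motion> stored = getMotions();
        for (int id = 0; id < stored.Count; id++)
        {
            if (Distance(sampled.StartPosition, stored[id].StartPosition) < Tolerance &&
                Distance(sampled.EndPosition, stored[id].EndPosition) < Tolerance)
            {
                return getSign(id); // match: fetch the related Sign object
            }
        }
        return null; // no match: the GUI shows a retry message
    }

    // Placeholder for sampleTheMotion( Motion ): a real implementation
    // breaks the motion into small gesture components before comparing.
    private static Motion SampleTheMotion(Motion m)
    {
        return m;
    }

    private static float Distance(Tuple<float, float, float> a,
                                  Tuple<float, float, float> b)
    {
        float dx = a.Item1 - b.Item1;
        float dy = a.Item2 - b.Item2;
        float dz = a.Item3 - b.Item3;
        return (float)Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }
}
```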


8. Libraries & Tools

We are using the Kinect device in our project. In order to get the required data from Kinect, we need to install the Microsoft Kinect SDK. This SDK can be installed only in a Microsoft environment, which restricts the operating system choice to Windows 7.

The system will be developed in the C# programming language. It will be coded in the Visual Studio 2010 IDE, which also contains the required tools for GUI development. MSSQL is chosen as the database management server for the project.

9. Time Planning

9.1. Term I (September 2011 – January 2012)

Figure 20 – Gantt Chart for Term I


9.2. Term II (February – June 2012)

Figure 21 – Gantt Chart for Term II

10. Conclusion

This report is the Detailed Design Report (DDR) of the PAPAĞAN project. The project's purpose is to understand TSL and, while doing so, to build a bridge between handicapped people and others; it also aims to increase the usage of TSL with its educational features. The report focuses on the aspects considered crucial to developing the system.