Ceng 492 Revised Design Report
1
CENG 492 REVISED DESIGN REPORT
İbrahim Kıvanç ERTÜRK 1560192
Süleyman YILMAZ 1631215
Gizem ÖĞRÜNÇ 1631316
Aykut ŞİMŞEK 1631415
Sponsored by:
“Turkish Sign Recognition Using Microsoft Kinect”
27 February 2012
Table of Contents
1. Introduction
 1.1. Problem Definition
 1.2. Purpose
 1.3. Scope
 1.4. Overview
 1.5. Definitions, Acronyms and Abbreviations
 1.6. References
2. System Overview
3. Design Considerations
 3.1. Design Assumptions, Dependencies and Constraints
  3.1.1. Time Constraints
  3.1.2. Resource Constraints
  3.1.3. Performance Constraints
  3.1.4. Hardware Constraints
  3.1.5. Software Constraints
 3.2. Design Goals and Guidelines
4. Data Design
 4.1. Data Description
  4.1.1. Data Objects
  4.1.2. Relationships
  4.1.3. Complete Data Model
 4.2. Data Dictionary
5. System Architecture
 5.1. Architectural Design
 5.2. Description of Components
  5.2.1. Graphical User Interface (GUI) Module
   5.2.1.1. Processing Narrative
   5.2.1.2. Interface
   5.2.1.3. Detailed Processing
   5.2.1.4. Dynamic Behavior
  5.2.2. Database Module
   5.2.2.1. Processing Narrative
   5.2.2.2. Interface
   5.2.2.3. Detailed Processing
   5.2.2.4. Dynamic Processing
  5.2.3. Hand Sign Recognizer Module
   5.2.3.1. Processing Narrative
   5.2.3.2. Interface
   5.2.3.3. Detailed Processing
   5.2.3.4. Dynamic Processing
 5.3. Design Rationale
 5.4. Traceability of Requirements
  5.4.1. GUI Module and Use Case Relations
  5.4.2. Database Module and Use Case Relations
  5.4.3. Hand Sign Recognizer Module and Use Case Relations
6. User Interface Design
 6.1. Overview of User Interfaces
  6.1.1. Main Screen
  6.1.2. Interpreter Mode Screens
  6.1.3. Teaching Mode Screens
 6.2. Screen Images
  6.2.1. Main Menu Screen
  6.2.2. Teaching Mode Main Screen
  6.2.3. Interpreter Result Screen
  6.2.4. Educational Feedback Screen
 6.3. Screen Objects and Actions
7. Detailed Design
 7.1. Graphical User Interface Module
  7.1.1. Classification
  7.1.2. Definition
  7.1.3. Responsibilities
  7.1.4. Constraints
  7.1.5. Compositions
  7.1.6. Uses/Interactions
  7.1.7. Resources
  7.1.8. Processing
  7.1.9. Interface Exports
 7.2. Database Module
  7.2.1. Classification
  7.2.2. Definition
  7.2.3. Responsibilities
  7.2.4. Constraints
  7.2.5. Compositions
  7.2.6. Uses/Interactions
  7.2.7. Resources
  7.2.8. Processing
  7.2.9. Interface Exports
 7.3. Hand Sign Recognizer Module
  7.3.1. Classification
  7.3.2. Definition
  7.3.3. Responsibilities
  7.3.4. Constraints
  7.3.5. Compositions
  7.3.6. Uses/Interactions
  7.3.7. Resources
  7.3.8. Processing
  7.3.9. Interface Exports
8. Libraries & Tools
9. Time Planning
 9.1. Term I (September 2011 – January 2012)
 9.2. Term II (February – June 2012)
10. Conclusion
1. Introduction
1.1. Problem Definition
The aim of the PAPAĞAN project is to provide an interface, working through Microsoft Kinect, that can recognize and understand Turkish Sign Language (TSL). The implementation will be divided into two main functionalities. The first is the recognition of TSL movements, using the information inferred from Kinect, and the translation of their meaning into text, audio and video. This will provide bidirectional communication between speech-disabled people and people who do not know TSL by creating a platform on which they can communicate. The second functionality of PAPAĞAN is an interactive education tool to teach and practice TSL.
1.2. Purpose
This document contains a detailed description of and information about the PAPAĞAN project. The Detailed Design Report aims to provide all the necessary information regarding PAPAĞAN's development, so that it will serve as a guide for further implementation of the project.
1.3. Scope
This Detailed Design Report explains the requirements needed to design PAPAĞAN. It shows how the software system will be structured to satisfy the requirements specified in the SRS document. Briefly, this document includes design considerations, data design, architectural design, user interface design and detailed design procedures for PAPAĞAN. The documentation is prepared following the standards defined by IEEE [1].
1.4. Overview
This documentation will show how PAPAĞAN will be structured to satisfy the requirements specified in the SRS document. The structure of the system is summarized as follows:
• System Overview: A general description of PAPAĞAN including its functionality
and matters related to the overall system and its design.
• Design Considerations: Issues that need to be addressed or resolved before attempting to devise a complete design solution.
• Data Design: Explanation of how the information domain of PAPAĞAN is
transformed into data structures.
• System Architecture: Description of the program architecture and detailed
description of system components.
• User Interface Design: Functionality of the system from the user’s perspective and
introduction of detailed user interfaces description.
• Detailed Design: The internal details of each design entity/component.
• Libraries & Tools: Presentation of the required development environment to develop
PAPAĞAN.
• Time Planning: Planned schedule for development.
• Conclusion: One last word about the document and PAPAĞAN.
1.5. Definitions, Acronyms and Abbreviations
TSL: Turkish Sign Language
SL: Sign Language
SRS: Software Requirements Specifications
PAPAĞAN: Name of the product/project
GUI: Graphical User Interface
MSSQL DBMS: Microsoft SQL Server Database Management System
SDK: Software Development Kit
IDE: Integrated Development Environment.
INNOVA: Innova Future-Ready Solutions. PAPAĞAN’s sponsor company.
1.6. References
[1] IEEE Std 830-1998: IEEE Recommended Practice for Software Requirements
Specifications
[2] T.C. Aile ve Sosyal Politikalar Bakanlığı Özürlü ve Yaşlı Hizmetleri Genel Müdürlüğü.
(n.d.). Türkiye Özürlüler araştırması temel göstergeleri. Retrieved from
http://www.ozida.gov.tr/arastirma/oztemelgosterge.htm
[3] Christopher Grant. (20 December 2010). Kinect hacks: American sign language recognition.
Retrieved from http://i.joystiq.com/2010/12/20/kinect-hacks-american-sign-language-recognition/
[4] Anonymous. (29 December 2011). Decoupling. Retrieved from:
http://en.wikipedia.org/wiki/Decoupling#Software_Development
2. System Overview
The main purpose of PAPAĞAN is to recognize TSL using the information retrieved from the Microsoft Kinect device. Although there are similar examples of such projects for other sign languages, PAPAĞAN will be a pioneer for TSL and aims to provide more comprehensive coverage and functionality than its predecessors. PAPAĞAN's main difference from those previous projects will be its simple hand sign recognition algorithm, proposed by Think-e Wink-e and named the Divide and Collect approach for sign language recognition using Microsoft Kinect. The simplicity of this algorithm will ease the process and enable defining and recognizing a wider range of signs.
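The report does not reproduce the algorithm itself, but the general idea of Divide and Collect can be illustrated as follows: a sign is first divided into sub-movements, which are then matched ("collected") against stored sub-gestures. The sketch below, in Python for brevity (the system itself will be written in C#), segments a hand trajectory at points where the hand briefly pauses; the segmentation criterion and the threshold are assumptions made for illustration, not details of the Think-e Wink-e proposal.

```python
def divide(trajectory, pause_threshold=0.05):
    """Split a hand trajectory into sub-movements at low-velocity points.

    trajectory: list of (x, y, z) hand positions, one per frame.
    Returns a list of sub-trajectories (the "divide" step); matching each
    sub-trajectory against stored sub-gestures would be the "collect" step.
    """
    segments, current = [], [trajectory[0]]
    for prev, cur in zip(trajectory, trajectory[1:]):
        # Euclidean distance between consecutive frames approximates speed.
        speed = sum((a - b) ** 2 for a, b in zip(prev, cur)) ** 0.5
        if speed < pause_threshold and len(current) > 1:
            segments.append(current)   # hand paused: close this sub-movement
            current = [cur]
        else:
            current.append(cur)
    segments.append(current)
    return segments
```

A trajectory that moves, pauses, then moves again would thus yield two sub-movements to look up in the gesture database.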
According to statistics, the percentage of handicapped people in Turkey is 11%. Furthermore, 36% of handicapped people are unable to read or write, whereas the figure is 13% among non-handicapped people [2]. For handicapped people to communicate, either the handicapped party has to know how to write or both parties have to know sign language. PAPAĞAN aims to help these people both by translating their language into natural language and by spreading the use of TSL in Turkey with our educational tool.
Below is a general overview of PAPAĞAN and its components:
Figure 1 – System Overview in block diagram
3. Design Considerations
The system will be designed with an object-oriented programming paradigm. Each module in PAPAĞAN will be represented by a class, and those modules will be able to communicate with each other. Since the system is user interactive, the design should also take user preferences into account during the implementation of PAPAĞAN, so that users can operate it easily.
3.1. Design Assumptions, Dependencies and Constraints
3.1.1. Time Constraints
PAPAĞAN has to run synchronously with the Kinect device and the user. The other modules should finish their jobs before Kinect sends new data. Since user motions will be the input to the system, it should be able to process incoming information at a speed of 10-15 frames per second.
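The 10-15 frames per second requirement translates directly into a per-frame time budget for the whole pipeline. The following Python snippet (illustrative only) makes the arithmetic explicit:

```python
# Per-frame processing budget implied by the 10-15 fps requirement.
for fps in (10, 15):
    budget_ms = 1000 / fps
    print(f"{fps} fps -> {budget_ms:.1f} ms per frame")
# At 15 fps every module in the pipeline must finish within about 66.7 ms,
# and at 10 fps within 100 ms, before Kinect delivers the next frame.
```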
3.1.2. Resource Constraints
The project will use the Kinect device as the input source. The Microsoft Kinect SDK must be installed for development. The use of the Microsoft SDK also constrains the platform, forcing the use of Microsoft's operating system (Windows 7) and the Microsoft database management server, MSSQL.
3.1.3. Performance Constraints
When the system is activated, interaction with the users will be the most frequently repeated event; thus, as stated before, the system should process 10-15 frames/sec for proper recognition of the gestures.
3.1.4. Hardware Constraints
The most crucial hardware component of this project is the Kinect device, through which the user interaction will be processed. The system needs a powerful processor in order to run complicated pattern recognition and stereo matching algorithms while processing 10-15 frames per second; most new-generation dual-core processors will do the work. The system's memory consumption should never exceed 20% of the total memory, and the minimum memory for the system to work properly is 2 GB.
3.1.5. Software Constraints
The system will be implemented in the C# programming language. The GUI will be implemented using Visual Studio 2010. The Database Module will interact with the MSSQL DBMS, which will be used to establish the connection between the system and the database. As mentioned above, the operating system must be Windows 7 due to the Kinect SDK's own constraints.
3.2. Design Goals and Guidelines
Given PAPAĞAN's intended purpose and target audience, the system will mainly be evaluated by its accuracy, usability and coverage. As the development team, we set these properties as our priorities throughout the system design. These priorities are explained below.
PAPAĞAN will be considered as successful as its accuracy in interpreting TSL. The system cannot earn user confidence unless it reaches a reasonable accuracy level. It is clear that as the number of TSL moves covered by the system increases, the accuracy level will drop, as we can see from similar projects [3]. For this reason, aiming for a 100% accuracy level would be too optimistic a prediction; however, in order to earn user confidence, the targeted accuracy level should be at least 90%.
As the target audience will include people with various disabilities, the user interfaces and system functionalities should be consistent, clean, easy to use and easy to understand. Also, considering speech- and hearing-disabled users, audible interactions and materials should be kept to a minimum. At points where audible materials add value to the system (such as resulting interpretations), the information should also be available in alternative formats.
Once PAPAĞAN reaches the desired level of performance and accuracy, it can be considered complete, and the rest of the implementation will focus on enlarging PAPAĞAN's coverage of TSL. For the initial design, however, the system will cover only the basic phrases in TSL.
4. Data Design
4.1. Data Description
This section describes the data objects that will be used and managed in PAPAĞAN. The Data Objects section defines the data objects and their attributes; the Complete Data Model section then shows the relations between the classes.
4.1.1. Data Objects
Motion: This class is used to describe the motions received through the Kinect device as inputs from the user. The attributes, with their explanations, are shown below.
Motion Table:
Field Name Data Type Description
startPosition tuple The starting coordinates of a body-part movement. It
holds three different values for x, y and z
coordinates
endPosition tuple The ending coordinates of a body-part movement. It
holds three different values for x, y and z
coordinates
feedback float Result of a motion in interpreter mode; used as a feedback message for the user. Value between 0.0 and 10.0
isTrue boolean Result of a motion in interpreter mode; used as a feedback message for the system
Table 1 – Motion Table
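The Motion object of Table 1 could be sketched as follows. The sketch is in Python for brevity (the actual implementation will be in C#), and the default values are assumptions:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Motion:
    """One body-part movement received from Kinect (see Table 1)."""
    startPosition: Tuple[float, float, float]  # x, y, z at movement start
    endPosition: Tuple[float, float, float]    # x, y, z at movement end
    feedback: float = 0.0                      # interpreter-mode score, 0.0-10.0
    isTrue: bool = False                       # whether the motion matched a sign

# Example: a movement of the right hand captured between two frames.
m = Motion(startPosition=(0.1, 0.5, 1.2), endPosition=(0.4, 0.9, 1.2))
```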
Skeleton: This class is used to represent the basic skeleton form of the user's body. The attributes in this class are predefined by, and can be accessed through, the Microsoft Kinect SDK itself; they are explained below.
Skeleton Table:
Field Name Data Type Description
head Head It represents the user’s head. It maps the head to the sensor
to be tracked by Kinect
neck Neck It represents the user's neck. It maps the neck to the sensor to be tracked by Kinect
torso Torso It represents the user's torso. It maps the torso to the sensor to be tracked by Kinect
leftShoulder Shoulder It represents the user’s shoulders. It maps the shoulder’s left
to the sensor to be tracked by Kinect
rightShoulder Shoulder It represents the user’s shoulders. It maps the shoulder’s
right to the sensor to be tracked by Kinect
leftElbow Elbow It represents the user’s elbows. It maps the elbow’s left to
the sensor to be tracked by Kinect
rightElbow Elbow It represents the user’s elbows. It maps the elbow’s right to
the sensor to be tracked by Kinect
leftWrist Wrist It represents the user’s wrists. It maps the wrist’s left to the
sensor to be tracked by Kinect
rightWrist Wrist It represents the user’s wrists. It maps the wrist’s right to
the sensor to be tracked by Kinect
leftHand Hand It represents the user’s hands. It maps the hand’s left to the
sensor to be tracked by Kinect
rightHand Hand It represents the user’s hands. It maps the hand’s right to the
sensor to be tracked by Kinect
leftFingertip FingerTip It represents the user’s fingertips. It maps the left of
fingertip to the sensor to be tracked by Kinect
rightFingertip FingerTip It represents the user’s fingertips. It maps the right of
fingertip to the sensor to be tracked by Kinect
Table 2 – Skeleton Table
Sign: This class holds the attributes representing TSL signs, which will be compared with incoming user input. In other words, it does the main job of the system. The attributes, with their explanations, are shown below.
Sign Table:
Field Name Data Type Description
motion Vector<Tuple> It is used to detect user movements and pass arguments to the classes and activities that need them
audio AudioStream It keeps the audio stream of relevant sign
text Text It keeps the text form of the word of Turkish sign language
video VideoStream It keeps the video stream of the word of Turkish sign language. It
is used in teaching mode
user User It keeps the information of the user, which is used to track body motion and hand signs and to translate them into text
Table 3 – Sign Table
User: This class is used to hold attributes with information about the user. The attributes, with their explanations, are shown below.
User Table:
Field Name Data Type Description
skeleton Skeleton An object of Skeleton. It keeps the locations of the user's body joints and transmits this information to the Sign class
mode integer It keeps track of which mode of the system the user is currently in
Table 4 – User Table
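The data objects of Tables 1-4 and their has-relationships (described in the following section) can be sketched together as below. The sketch is in Python rather than C#, the per-joint fields of Table 2 are collapsed into a single dictionary for brevity, and the mode encoding is an assumption:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Joint = Tuple[float, float, float]  # x, y, z coordinates of one joint

@dataclass
class Skeleton:
    # Table 2 lists one field per joint (head, neck, ...); a dict keyed by
    # joint name stands in for those fields here.
    joints: Dict[str, Joint] = field(default_factory=dict)

@dataclass
class User:
    skeleton: Skeleton  # a User has a Skeleton (Figure 4)
    mode: int = 0       # which mode the user is in; the encoding is assumed

@dataclass
class Sign:
    motion: List[Joint]  # a Sign has a Motion (Figure 2)
    text: str            # textual form of the TSL word
    user: User           # a Sign has a User (Figure 3)

u = User(Skeleton({"leftHand": (0.2, 1.0, 1.5)}))
s = Sign(motion=[(0.2, 1.0, 1.5)], text="merhaba", user=u)
```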
4.1.2. Relationships
This part describes the relationships between the classes defined in Section 4.1.1, Data Objects.
Sign – Motion (Has relationship): In the Sign class, the motion is described as an entity and is used to translate a motion into a sign and then into text. This relationship is needed for the translation. The relationship is shown below.
Figure 2 - Sign and Motion Objects Relationship
Sign – User (Has relationship): In the Sign class, the user's information must be specified so that motions can be mapped to their relevant meanings. Therefore, the Sign class uses the User class and its information. The relationship is shown below.
Figure 3- Sign and User Objects Relationship
User – Skeleton (Has relationship): In the User class, the user has a skeleton to be recognized by Kinect. The relationship is shown below.
Figure 4 - User and Skeleton Objects Relationship
4.1.3. Complete Data Model
The complete data model is shown below.
Figure 5 – Complete Data Model
4.2. Data Dictionary
Name Type Refer to Section
startPosition tuple 4.1.1.
endPosition tuple 4.1.1.
angularVelocity float 4.1.1.
feedback float 4.1.1.
isTrue boolean 4.1.1.
head Head 4.1.1.
neck Neck 4.1.1.
torso Torso 4.1.1.
leftShoulder Shoulder 4.1.1.
rightShoulder Shoulder 4.1.1.
leftElbow Elbow 4.1.1.
rightElbow Elbow 4.1.1.
leftWrist Wrist 4.1.1.
rightWrist Wrist 4.1.1.
leftHand Hand 4.1.1.
rightHand Hand 4.1.1.
leftFingertip FingerTip 4.1.1.
rightFingertip FingerTip 4.1.1.
motion Vector<Tuple> 4.1.1.
audio AudioStream 4.1.1.
text Text 4.1.1.
video VideoStream 4.1.1.
user User 4.1.1.
skeleton Skeleton 4.1.1.
mode integer 4.1.1.
GUI Module Module 5.2.1
Database Module Module 5.2.2.
Hand Sign Recognizer Module Module 5.2.3.
User Class 4.1.1.
Sign Class 4.1.1.
Skeleton Class 4.1.1.
Motion Class 4.1.1.
sendMovementData() Action function 5.2.3
sendResulttoScreen() Action function 5.2.3
sendFeedBacktoScreen() Action function 5.2.3
sendFeedBack() Action function 5.2.2
sendResult() Action function 5.2.2
start() Action function 5.2.1
end() Action function 5.2.1
selectMode() Action function 5.2.1
playVideo() Action function 5.2.1
playAudio() Action function 5.2.1
selectTeachAction() Action function 5.2.1
retry() Action function 5.2.1
returnSelectedObject() Action function 5.2.1
startTrack() Action function 5.2.1
endTrack() Action function 5.2.1
returnAudio() Action function 5.2.1
returnVideo() Action function 5.2.1
returnText() Action function 5.2.1
Table 6 - Data Dictionary
5. System Architecture
5.1. Architectural Design
PAPAĞAN will include three main modules that interact with each other, with the user and with the Kinect device. These modules are the GUI Module, the Database Module and the Hand Sign Recognizer Module. The GUI Module is an intermediate module that provides the connection between the user and PAPAĞAN; it processes user requests and passes them between the user, the database and the Hand Sign Recognizer Module. The Database Module will hold the gestures handled by PAPAĞAN and their videos, and will communicate with both the Hand Sign Recognizer Module and the GUI Module. The final module, the Hand Sign Recognizer Module, is the base module that connects the user to the system: it is connected to the Kinect device on one end and to the database on the other. It receives the data of a movement from the Kinect device and interacts with the database. How these modules are connected is presented in the following UML diagram.
Figure 6 – Architectural Design UML Diagram
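In code form, the connections in Figure 6 could be wired roughly as below. This is a hypothetical Python skeleton; the class and method names are assumptions used only to show which module holds a reference to which:

```python
class DatabaseModule:
    """Holds the gestures handled by PAPAĞAN (placeholder storage)."""
    def __init__(self):
        self.gestures = {}                    # gesture id -> stored data

    def lookup(self, movement):
        return self.gestures.get(movement)    # placeholder match

class HandSignRecognizerModule:
    def __init__(self, database):
        self.database = database              # Recognizer <-> Database link

    def on_kinect_frame(self, movement):      # Kinect -> Recognizer link
        return self.database.lookup(movement)

class GUIModule:
    def __init__(self, database, recognizer):
        self.database = database              # GUI <-> Database (teaching mode)
        self.recognizer = recognizer          # GUI <-> Recognizer (interpreter)

db = DatabaseModule()
gui = GUIModule(db, HandSignRecognizerModule(db))
```

Note that both the GUI and the recognizer share the same database instance, mirroring the two database connections in the diagram.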
5.2. Description of Components
5.2.1. Graphical User Interface (GUI) Module
5.2.1.1. Processing Narrative
The GUI Module is an intermediate module that provides the connection between the user and PAPAĞAN. The module processes user requests and passes them between the user, the database and the Hand Sign Recognizer Module. The GUI Module's responsibilities towards the other modules are receiving input from and providing output to the user, invoking the Hand Sign Recognizer Module for TSL sign input arriving through Kinect, and accessing the Database Module to retrieve information for the Educational Mode, where access to Kinect or the Hand Sign Recognizer Module is not required.
5.2.1.2. Interface
User inputs will come to the GUI Module from the user via the GUI. These inputs indicate user preferences about the screen content. The module will output the screen content according to the user's preferences and inform the user throughout the processes.
5.2.1.3. Detailed Processing
User input will be received via clicks on interface buttons. Depending on the process, a click can activate the Kinect device and the Hand Sign Recognizer Module for an incoming TSL movement, navigate the user to a different interface within the module itself, or retrieve user-requested data from the Database Module.
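Such click handling could be dispatched along the three paths described above roughly as follows; the button and handler names in this Python sketch are illustrative assumptions, not taken from the actual GUI design:

```python
# Hypothetical dispatch table mapping button clicks to the three kinds of
# action described above.
def on_click(button, app):
    handlers = {
        "start_interpreter": app.start_recognizer,    # activate Kinect + recognizer
        "open_teaching":     app.show_teaching_menu,  # navigate to another screen
        "play_sign_video":   app.fetch_from_database, # retrieve teaching-mode data
    }
    return handlers[button]()
```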
5.2.1.4. Dynamic Behavior
Figure 7 – GUI Module Dynamic Behavior Model
Figure 8 – GUI Module Sequence Diagram
5.2.2. Database Module
5.2.2.1. Processing Narrative
The Database Module will hold the gestures, as sequences of sub-gestures, together with other relevant formats such as audio, video and text. This module will communicate with both the Hand Sign Recognizer Module and the GUI Module. We chose this design because speed is one of the main considerations for PAPAĞAN: in interpreter mode, where a given user motion is translated into the relevant TSL sign, the system may, in the worst case, compare all the recorded gestures with the received user input. For this reason the database should be kept well ordered and as small as possible.
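The worst case described above can be made concrete with a small sketch: with no index over the gestures, lookup cost grows linearly with the number of stored gestures, which is why the database must stay small and well ordered. The distance function and threshold in this Python sketch are illustrative assumptions:

```python
# Worst-case lookup: compare the user's motion against every stored gesture.
# The L1 distance and the acceptance threshold are illustrative only.
def match_sign(user_motion, gesture_db, threshold=1.0):
    best_id, best_dist = None, float("inf")
    for sign_id, stored in gesture_db.items():        # full scan of the database
        dist = sum(abs(a - b) for a, b in zip(user_motion, stored))
        if dist < best_dist:
            best_id, best_dist = sign_id, dist
    # Reject matches that are too far from any stored gesture.
    return best_id if best_dist <= threshold else None
```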
5.2.2.2. Interface
The Database module works collaboratively with both the GUI and the Hand Sign Recognizer modules. In Teaching Mode, the GUI module can request a specific gesture's video, audio, etc. by passing its id to the database: the user selects a gesture to learn or practice, and the GUI module forwards that selection to the Database module. The Hand Sign Recognizer module passes the received user motion sequence to be matched against the ones in the database.
5.2.2.3. Detailed Processing
The Database module is accessed through the Hand Sign Recognizer module or directly from the GUI module. Requests can be sent either as a sequence of sub-movements or as a specific id of a TSL sign.
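The two request types just described can be sketched as follows. This is an illustrative Python sketch with hypothetical sign data; the real module fronts an MSSQL database and is implemented in C#.

```python
# Hypothetical sketch of the two lookups the Database module serves:
# by sign id (from the GUI) and by sub-movement sequence (from the
# recognizer). Sign names and sub-movements are invented examples.

class SignDatabase:
    def __init__(self):
        # id -> (sign name, sequence of sub-movements)
        self.signs = {
            1: ("merhaba", ("raise_hand", "wave")),
            2: ("tesekkur", ("hand_to_chin", "move_forward")),
        }

    def get_by_id(self, sign_id):
        """GUI-style request: fetch a sign record by its id."""
        return self.signs.get(sign_id)

    def find_by_sequence(self, sequence):
        """Recognizer-style request: match a sub-movement sequence."""
        for sign_id, (name, seq) in self.signs.items():
            if seq == tuple(sequence):
                return sign_id, name
        return None
```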
5.2.2.4. Dynamic Processing
Figure 9 – Database Module Dynamic Behavior Model
Figure 10 – Database Module Sequence Diagram
5.2.3. Hand Sign Recognizer Module
This module is the core module that connects the user to the system. It is connected to the Kinect device, the database, and the GUI Module. It receives movement data from the Kinect device and interacts with the database. Depending on the incoming data, it performs database operations such as comparison and insertion and returns a result or feedback to the user. The data received from the Kinect device consists of the coordinates of the body joints, frame by frame.
5.2.3.1. Processing Narrative
This module receives the coordinates of the body joints and then interacts with the database. A comparison operation matches the movement just received against the movements stored in the database. According to the outcome of this comparison, it activates other modules of the system.
5.2.3.2. Interface
First, the mode selection screen comes up, offering Training Mode and Interpreter Mode. Both selections lead to interaction with the database.
Training mode: In this mode, the screen lists movements that can be studied and practiced through PAPAĞAN. After the user chooses a TSL sign and a medium for the training (video or audio), a teaching video or audio clip is presented to tutor the user.

Interpreter mode: In this mode, PAPAĞAN waits for the user to make a move to be interpreted, so the interface shows only a waiting screen.
5.2.3.3. Detailed Processing
This module waits for an incoming user movement. The Kinect device transmits the movement's coordinates to the module, where they are compared at runtime with the coordinates stored in the database. If there is a match, the module returns the meaning of the movement; otherwise, it reports a failure and asks the user to retry the action.
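The runtime comparison step can be sketched as below. This is an illustrative Python sketch; the tolerance value, joint format, and gesture names are assumptions, and the actual recognizer is implemented in C# against Kinect skeleton data.

```python
# Minimal sketch of the comparison described above: incoming joint
# coordinates are compared frame by frame against each stored gesture,
# within a small tolerance. All values here are illustrative.

def frames_match(a, b, tol=0.1):
    """True if two frame sequences agree within tolerance, point by point."""
    return len(a) == len(b) and all(
        abs(x1 - x2) <= tol and abs(y1 - y2) <= tol
        for (x1, y1), (x2, y2) in zip(a, b)
    )

def recognize(motion, stored_gestures):
    """Return the meaning of the first matching gesture, or None on failure."""
    for meaning, reference in stored_gestures.items():
        if frames_match(motion, reference):
            return meaning
    return None  # caller asks the user to retry
```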
5.2.3.4. Dynamic Processing
Figure 11 – Hand Sign Recognizer Module Dynamic Behavior Model
Figure 12 – Hand Sign Recognizer Module Sequence Diagram
5.3. Design Rationale
We have chosen a three-part module decomposition, following the decoupling principle [4]. As the principle requires, the system is divided into three modules that do not depend on each other. The modules are, of course, highly related through the processes, but the flow passes from one module to another and they do not work simultaneously. Additional modules could have been introduced, but to keep the design simple and understandable, a division into three provides a good balance between decoupling the system for debugging and development and simplifying it for design and understandability.
5.4. Traceability of Requirements
This section shows the relations between use case operations and the component descriptions.
5.4.1. GUI Module and Use Case Relations
Use case relations with the GUI Module are shown in the table below:

Design Element | Functional Operation | GUI Module Relation
Login Mode | Logging into PAPAĞAN | User logs in to the system.
Login Mode | Select Mode Operation | User selects a mode.
Login Mode | Logout from PAPAĞAN | User logs out of the system.
Interpreter Mode | Returning to the Main Screen | User gives input by pressing the return button.
Interpreter Mode | Select Output Type | User chooses the output type from the GUI.
Teaching Mode | Returning to the Main Screen | User gives input by pressing the return button.
Teaching Mode | Selecting a Gesture to Teach | User selects a gesture from the gesture list on the GUI.
Teaching Mode | Play Video and Play Video Again | User presses the play button via the GUI.
Teaching Mode | Practice | User takes a test him/herself using the GUI.
Teaching Mode | Results and Get Feedback | GUI shows the result gathered from the Hand Sign Recognizer Module.

Table 7 – Use case relations for GUI Module
5.4.2. Database Module and Use Case Relations
Use case relations with the Database Module are shown in the table below:
Design Element | Functional Operation | Database Module Relation
Interpreter Mode | Translate a Phrase | User input is mapped to the relevant TSL sign by the Database module.
Teaching Mode | Selecting a Gesture to Teach | The related video for the gesture is requested from the Database module.
Teaching Mode | Practice | The related videos and gestures are compared against the ones in the database.

Table 8 – Use case relations for Database Module
5.4.3. Hand Sign Recognizer Module and Use Case Relations
Use case relations with the Hand Sign Recognizer Module are shown in the table below:

Design Element | Functional Operation | Hand Sign Recognizer Module Relation
Interpreter Mode | Translate a Phrase | User input is mapped to the relevant TSL sign from the Database module.
Teaching Mode | Practice | Results are processed by the Hand Sign Recognizer Module, and the Translate a Phrase operation is applied to each answer.

Table 9 – Use case relations for Hand Sign Recognizer Module
6. User Interface Design
6.1. Overview of User Interfaces This section describes the functionality of the system from the user’s perspective.
6.1.1. Main Screen
Initial Page: This screen is the one that first appears when the system starts. Here the user is expected to identify himself to the system; this process is necessary so that the Kinect device can detect the user. There is a button at the bottom of the screen to start the identification. After this button is pressed, the user should stand still for a while to be identified clearly. When the identification process is finished, the user is taken to the main screen automatically.

Main Screen: The main screen is the page where the user selects which mode to use PAPAĞAN in. It contains two buttons: the first selects Interpreter mode and the second selects Education mode. Depending on the chosen mode, the user is directed to the related page.
Figure 12 – Main Screen Interfaces Overview
6.1.2. Interpreter Mode Screens
Interpreter Mode Main Screen: This page has no buttons; instead, it shows only a text prompt directing the user to make a motion to be translated. Although the page has no buttons, it can detect the start and end of the user's motion. After the motion is detected, the gesture is sent to the Database module for evaluation, and when the evaluation is finished, the user is directed to the result page.
Interpreter Result Page: This screen contains the result of the user's motion. After the Database module has evaluated the motion and sent the necessary information to the Hand Sign Recognizer module, this screen prints the related result. If the user's motion corresponds to a word or a phrase in our database, that word or phrase is shown on the screen; otherwise no result is shown.

In addition to the motion result, this screen includes three buttons. The first, "tekrar dene", is for retrying: when the user makes a wrong motion or the motion has no meaning in our TSL database, this button directs the user back to the Interpreter Mode Main Screen to retry the motion. The second, "ana menüye dön", returns to the main menu: when the user has completed a motion and seen the corresponding result, this button directs him to the main page. The last one, "çıkış", logs out of the system.
Figure 13 – Interpreter Mode Screens Interfaces Overview
6.1.3. Teaching Mode Screens
Teaching Mode Main Screen: This mode teaches TSL movements, so the screen contains many buttons, each labeled with the name of a TSL movement. The user chooses which word to learn and is then directed to the Video Player Screen.

Video Player Screen: As its name states, this screen plays a video that teaches the user how the chosen word is signed in TSL. After the video finishes, the user is expected to repeat the same motion. This is the core of our Education mode: the user is evaluated according to how closely he can reproduce the exact motion in the video.
Education Feedback Screen: This screen appears after the user's motion has been evaluated by the Database module and shows the evaluation result; in other words, after watching the education video and trying to repeat the movement, the user gets feedback about his motion. In addition to the evaluation result, the screen has three buttons with the same functionality as at the end of Interpreter mode: "tekrar dene" retries the movement taught by PAPAĞAN, "ana menüye dön" returns to the main screen, and "çıkış" logs out of the system.
Figure 14 – Teaching Mode Screens Interfaces Overview
6.2. Screen Images
6.2.1. Main Menu Screen
On the Main Menu screen the user is asked to select the desired mode. If the user selects Interpretation Mode (Çevirme Modu), he/she is directed to a screen for inputting sign language movements to the Kinect device. If the user selects Education Mode (Eğitim Modu), he/she is directed to the Education Mode Main Screen to select a sign to learn and study.
Figure 16 – Main Menu Screen
6.2.2. Teaching Mode Main Screen
The Education Mode Main Screen is where the user selects the desired sign to exercise.
Figure 17 – Teaching Mode Screen
6.2.3. Interpreter Result Screen
In Interpretation mode, after the user inputs the sign movement to the Kinect, the interpreter recognizes the sign and outputs it in text format on the Interpreter Result Screen. On this screen the user can hear the phrase in audio format or watch the video for it. Once the user is done, he/she can either navigate to the Main Screen or input a new sign by clicking the relevant buttons.
Figure 18 – Interpreter Result Screen
6.2.4. Educational Feedback Screen
In Education mode, once the user has watched the video of the desired movement and exercised the movement him/herself, the user is directed to the Educational Feedback Screen and can see how well he/she managed to perform the sign. Depending on the user's performance, he/she can keep practicing the same move or navigate to the Main Screen.
Figure 19 – Educational Feedback Screen
6.3. Screen Objects and Actions
When PAPAĞAN starts, an identification screen appears with an "identify yourself" object. PAPAĞAN first needs to identify the user and his body joints before receiving any input. After identification comes the mode selection screen, which contains one object for each of the two modes; these objects' actions activate the corresponding modes in PAPAĞAN.

In Teaching mode, TSL signs are taught to the user. Every object on the screen has an action that displays the corresponding education video or audio. PAPAĞAN then waits for the user to repeat the displayed sign as practice. Once the user completes the practice, a feedback screen opens that evaluates the user's movement; on this screen there are two objects, one returning to the main menu and the other requesting another try.
In Interpreter mode, PAPAĞAN requests a user movement to be interpreted as a TSL sign. If the movement matches one stored in the database, its translation is shown on the screen in the desired format. Along with the result, the screen contains two objects: one returns to the main screen and the other starts interpreting another TSL sign.
7. Detailed Design
7.1. Graphical User Interface Module
7.1.1. Classification
The Graphical User Interface Module is a sub-module of the system.
7.1.2. Definition
This module is designed to handle all user interface interactions. It communicates with the user, receives inputs, passes them to the related components, and shows the results received from those components. Although this component's functionality is simpler than the others', usability is a priority for PAPAĞAN, as indicated in our design goals, so this component is just as crucial as the rest. It is the front of the product and should behave smoothly, because any failure here will draw the user's attention and be criticized first.
7.1.3. Responsibilities
As mentioned above, this module is responsible for user interactions; to be precise, all user requests and responses are handled through this component. Since PAPAĞAN's targeted user profile includes handicapped people, this component should be designed with those users in mind: it must be clear, consistent, and easy to use.
7.1.4. Constraints
Since the user will be in front of the Kinect while using the program, enabling mouse interaction is not a good idea; button triggering should be handled in a better way than mouse clicks. The system's solution to this problem is to control the cursor via hand gestures: the basic gestures explained in the Processing section will be used to control cursor movement. Another constraint is that the user must first introduce his/her body, so that the skeleton information needed to use the system can be obtained.
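The cursor-control idea can be sketched as a mapping from the tracked hand position to screen coordinates. This is an illustrative Python sketch under assumed conventions (normalized skeleton coordinates in [0, 1], a fixed screen resolution); the real implementation would use the Kinect skeleton stream in C#.

```python
# Sketch of hand-driven cursor control: a normalized hand position is
# clamped to the unit square and scaled to screen pixels. The resolution
# here is an assumption for illustration.

def hand_to_cursor(hand_x, hand_y, screen_w=1920, screen_h=1080):
    x = min(max(hand_x, 0.0), 1.0)  # clamp to [0, 1]
    y = min(max(hand_y, 0.0), 1.0)
    return int(x * (screen_w - 1)), int(y * (screen_h - 1))
```

In practice some smoothing of successive frames would also be needed to avoid cursor jitter.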
7.1.5. Compositions
This module has three sub-components. The first handles user commands and button actions. The second handles database requests: in Teaching mode, when a sign to be learned is selected, this sub-component requests that sign object directly from the database. The last sub-component, the most important of the three, handles the communication between the Hand Sign Recognizer module and the GUI Module: it sends the tracked gesture to the Hand Sign Recognizer component and receives the result of the gesture interpretation.
7.1.6. Uses/Interactions
The GUI mostly interacts with the user and handles the actions the user triggers, changing its state according to those actions and their results. In more detail, when an interpretation action is triggered, it receives the user movement and sends it to the Hand Sign Recognizer; after the result returns from the recognizer, the module changes its state and outputs the result on screen. This module also interacts with the other two components of the system, as explained in the Compositions section.
7.1.7. Resources
Visual Studio 2010 provides the resources needed for this component. The GUI will be implemented with Windows Forms, using .NET for the GUI implementation.
7.1.8. Processing
The Graphical User Interface component has a straightforward implementation for interacting with the user: it receives actions and forwards them to the related component. While receiving a gesture from the user, it creates a motion object, fills it with the data received from the Kinect device, and then sends it to the Hand Sign Recognizer component for processing.
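The motion object mentioned above can be sketched as a simple container of per-frame joint coordinates. This is an illustrative Python sketch; the field names and joint format are hypothetical, and the real object is a C# class populated from Kinect skeleton frames.

```python
# Sketch of the motion object the GUI fills from Kinect frames before
# handing it to the recognizer. Joint names and coordinates are invented.

class Motion:
    def __init__(self):
        self.frames = []  # one entry per Kinect frame

    def add_frame(self, joints):
        """joints: mapping of joint name -> (x, y, z) coordinates."""
        self.frames.append(dict(joints))

# Example: two frames of a tracked right hand.
motion = Motion()
motion.add_frame({"right_hand": (0.4, 0.6, 1.2)})
motion.add_frame({"right_hand": (0.5, 0.7, 1.2)})
```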
7.1.9. Interface Exports
This component handles all user-accessible functionality in PAPAĞAN. In this section those functions are explained in detail; the related sequence diagram for the module is available in section 5.2.1.4.
start(): The program begins and the user introduces his/her body.

end(): The program terminates.

selectMode(integer): Called when the user selects a mode; it sets the current user mode.

selectTeachingAction(integer): The user selects one of the Teaching mode gestures.

playVideo(): In Teaching mode, the selected sign's video is requested from the database and played.

playAudio(): In Teaching mode, the selected sign's audio is requested from the database and played.

returnMainScreen(): When this action is triggered, the program returns to the main screen.

retry(): Called when the current gesture is not recognized by the system. A call to this function indicates either the user's inability to input the TSL sign properly or PAPAĞAN's inability to process the input.

returnSelectedObject(object id): In Teaching mode, when a sign is requested from the database, its id is sent to the database and the sign object is returned.

startTrack(): Starts receiving data from the Kinect device.

endTrack(): Ends receiving data from the Kinect device.
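A minimal stateful stub of this exported interface might look as follows. This is an illustrative Python sketch of the state the functions above manipulate (mode ids and attribute names are assumptions); the real module is a C#/.NET class.

```python
# Stub of the GUI module's exported interface: mode selection and
# Kinect tracking state, as listed above. Ids are hypothetical.

class GUIModule:
    INTERPRETER, TEACHING = 0, 1  # assumed mode ids

    def __init__(self):
        self.mode = None
        self.tracking = False

    def select_mode(self, mode_id):
        self.mode = mode_id       # setMode per selectMode(integer)

    def start_track(self):
        self.tracking = True      # begin receiving Kinect data

    def end_track(self):
        self.tracking = False     # stop receiving Kinect data
```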
7.2. Database Module
7.2.1. Classification Database module is a sub-module of the whole system.
7.2.2. Definition
This module is the interface between PAPAĞAN and the database management system; the other modules use the database through it. Centralizing database communication in one module is easier than accessing the database from each component separately. The module responds to requests either by fetching information from the database or by storing data in it.
7.2.3. Responsibilities
The purpose of this module is to provide the other modules with communication to the database. It manages retrieving information from and storing information in the database according to the requests received from the other modules.
7.2.4. Constraints
This module is completely dependent on the other modules: it assumes that they always request the correct information to be stored or delivered. Put simply, this component only does what it is told. To communicate with it, the other components must satisfy some constraints, such as connecting to the database with the correct name and password and manipulating the data with properly formed requests. The stored data types are the usual strings, integers, and floats, plus video and audio objects, which are stored in a different way explained in the Processing section.
7.2.5. Compositions
This module has two sub-components, namely signs and motions. They have a one-to-one relation: for every motion in the database there is a sign object stored in the database.
7.2.6. Uses/Interactions
When data is requested from the database, class instances are created for the related database tables and filled with the requested information for use in the requested execution. Other operations on the database, such as retrieving, updating, deleting, and inserting data, are likewise handled in an object-oriented manner: the table related to an object is updated automatically. The functionality beneath the user interface is realized by means of this group of classes.
7.2.7. Resources
The Database Module interacts with an MSSQL DBMS. We chose this database because we are developing in a Microsoft environment: using the Microsoft Kinect SDK obliges us to use Microsoft Visual Studio 2010. It also has its own advantages; for example, it is not necessary to load ODBC or JDBC drivers to access the database, since simply setting the connection string on the database connection object is enough. System.Data.SqlClient will be used for the database actions. Since this is a single-user project without concurrent requests, there will be no deadlock situations or race conditions in the database operations.
7.2.8. Processing
The component performs its duties on the Motion and Sign objects explained earlier, together with their attributes. The Interpreter mode algorithm involving the Database component works as follows. The motion is sent to the Recognizer module, which then requests all motions from the database. If it finds a match, it requests the sign related to the matched motion id and sends the related information to the interface module; if it cannot find a match, it sends a proper message to the interface module. When the Recognizer requests the motions from the database, a motion object is created for each table element, its attributes are filled, and it is placed in a container. As stated before, the Database module does only what it is told to do; however, it is responsible for throwing proper exceptions if something goes wrong during database transactions. For example, if the connection fails it raises a connection error, and if there is a problem with a particular request it returns a message related to that request. This makes it easier for the development team to find bugs and problems.
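The interpreter-mode flow just described, including the exception behavior, can be sketched as follows. This is an illustrative Python sketch with a plain dictionary standing in for the MSSQL tables; names and the exception type are hypothetical.

```python
# Sketch of the interpreter-mode database flow: fetch all motions, look
# for a match, return the related sign or None, and raise a proper error
# when the "connection" (here, the dict) is unavailable.

class DatabaseError(Exception):
    pass

def translate(motion, db):
    try:
        motions = db["motions"]            # recognizer requests all motions
    except KeyError:
        raise DatabaseError("connection problem")
    for motion_id, stored in motions.items():
        if stored == motion:               # match found
            return db["signs"][motion_id]  # request related sign by motion id
    return None                            # "no match" message goes to the GUI
```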
7.2.9. Interface Exports
The set of services provided by this module is limited. The user will not be able to send requests to the database directly: all data in the database is inserted when the program is installed, and the user is not allowed to modify the database in this version of the project. In fact, the database will not change after the implementation is finished; but if we develop a new version and enlarge the set of signs that can be interpreted, the database will be modified by the developers. The functionality of each element of this module is explained below; the related sequence diagram can be found in section 5.2.2.4, Dynamic Processing.

sendFeedBack(): Sends feedback related to a question answered in Teaching mode.

sendResult(): Sends the result of the test taken in Teaching mode.
7.3. Hand Sign Recognizer Module
7.3.1. Classification
The Hand Sign Recognizer Module is a sub-module of the system.
7.3.2. Definition
The Hand Sign Recognizer component deals with the recognition of incoming user gestures: it processes a given gesture and matches it against the ones stored in the database. This module is the core of our system, since the system's functionality is available only if this module works properly.
7.3.3. Responsibilities
The Hand Sign Recognizer is responsible for detecting gestures and recognizing them. It is also responsible for creating the corresponding sequence of sub-actions for a recognized gesture and passing the resulting data to the Graphical User Interface Module.
7.3.4. Constraints
The Hand Sign Recognizer component must be able to handle the gesture recognition process in real time; providing real-time recognition is the most problematic issue the system has to deal with. The module's success is directly related to the quality of the data received from the Kinect device. The Recognizer component should work properly regardless of the user's distance from the device or the illumination of the environment, so that these constraints are not imposed on the user.
7.3.5. Compositions
The component does not have any sub-components; it works as a whole.
7.3.6. Uses/Interactions
The Hand Sign Recognizer component affects the system behavior and indirectly changes the UI. Its input to the GUI component is provided as a Sign object, so the system's interface changes state according to the user's current mode. In detail, if the user is in Interpreter Mode when a Sign object is returned from this module, the GUI shows the gesture mapped from the database in various formats; if the user is in Teaching mode, the GUI changes its state to show the results of the feedback test. More cases are explained in detail in the Processing section.
7.3.7. Resources
The resources needed by this component are the Kinect device and the gesture information stored in the database. Without these resources the component cannot work properly, since it mainly maps the incoming user data from the Kinect to the relevant match in the database. The Microsoft Kinect SDK will be used to retrieve the incoming user data.
7.3.8. Processing
The Hand Sign Recognizer does the most important processing work in the entire project, and implementing it will require the largest effort. The Recognizer works as follows. It receives a motion object from the GUI module and processes it: the motion is sampled into small gesture components, which are then matched against the motions in the database. If the matching succeeds, the Sign object is requested from the database by the matched motion's id and sent to the GUI module; if the match fails, a proper message is sent to the GUI module. Down-sampling the given gestures is the hardest part of this task, while matching them against the database is comparatively trivial.
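The down-sampling step described above can be sketched as reducing a raw motion of many frames to a fixed number of gesture components. This is an illustrative Python sketch; the number of components and the even-spacing strategy are assumptions, not the project's chosen algorithm.

```python
# Sketch of sampleTheMotion: reduce a raw frame sequence to a fixed
# number of evenly spaced components before database matching.

def sample_the_motion(frames, n_components=8):
    """Pick n_components evenly spaced frames from the raw motion."""
    if len(frames) <= n_components:
        return list(frames)  # already short enough
    step = len(frames) / n_components
    return [frames[int(i * step)] for i in range(n_components)]
```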
7.3.9. Interface Exports
This component, as stated before, handles the most crucial services of PAPAĞAN. The related sequence diagram can be found in section 5.2.3.4, Dynamic Processing. The functionality of each element is explained below:

getMotions(): A private function of this module; it gets all motion sequences from the database.

getSign(motion id): A private function of this module; its job is to get the Sign object with the given motion id.

sampleTheMotion(motion): Takes a motion as parameter and breaks it into small gesture components, to be compared with the saved ones.
8. Libraries & Tools
We are using the Kinect device in our project. To get the required data from the Kinect, we need to install the Microsoft Kinect SDK. Since this SDK can be installed only in a Microsoft environment, the operating system choice is restricted to Windows 7.

The system will be developed in the C# programming language. It will be coded in the Visual Studio 2010 IDE, which also contains the required tools for GUI development. MSSQL is chosen as the database management server for the project.
9. Time Planning
9.1. Term I (September 2011 – January 2012)
Figure 20 – Gantt Chart for Term I
9.2. Term II (February – June 2012)
Figure 21 – Gantt Chart for Term II
10. Conclusion
This report is the Detailed Design Report (DDR) document of the PAPAĞAN project. The project aims to recognize TSL and, while doing so, to build a bridge between handicapped people and others. It also aims to increase the usage of TSL through its educational features. The report focuses on the aspects considered crucial for developing the system.