MY PART TIME TUTOR SELECTION SYSTEM USING K-MEANS
ALGORITHM
NUR ZARITH AKILLA BINTI AMBOAKA
BACHELOR OF COMPUTER SCIENCE
(INTERNET COMPUTING) WITH HONOURS
UNIVERSITI SULTAN ZAINAL ABIDIN
2018
MY PART TIME TUTOR SELECTION SYSTEM USING K-MEANS
ALGORITHM
NUR ZARITH AKILLA BINTI AMBOAKA
Bachelor of Computer Science (Internet Computing)
Faculty of Informatics and Computing
Universiti Sultan Zainal Abidin, Terengganu, Malaysia
AUGUST 2018
DECLARATION
I hereby declare that this report is based on my original work except for quotations
and citations, which have been duly acknowledged. I also declare that it has not been
previously or concurrently submitted for any other degree at Universiti Sultan Zainal
Abidin or other institutions.
________________________________
Name : Nur Zarith Akilla Binti Amboaka
Date : ..................................................
CONFIRMATION
This is to confirm that this project entitled My Part Time Tutor Selection System
Using K-Means Algorithm was prepared and submitted by Nur Zarith Akilla Binti
Amboaka (Matric Number: BTCL15039761) and has been found satisfactory in terms of
scope, quality and presentation as partial fulfilment of the requirements for the
Bachelor of Computer Science (Internet Computing) with Honours at Universiti Sultan
Zainal Abidin. The research conducted and the writing of the report were carried out
under my supervision.
________________________________
Name : Dr Suhailan Dato’ Safei
Date : ..................................................
DEDICATION
In the name of Allah, the Most Gracious and the Most Merciful, all praise is for
Him alone. The documentation and the system for the subject CSB 35102, Projek Ilmiah
2018/2019, were finished on time. I would like to take this opportunity to give a
big thanks to my kind supervisor, Dr. Suhailan Bin Dato’ Safei, for the valuable ideas,
time, support, advice and guidance given throughout the research and development,
up to the completion of phase one of the project. Besides that, I also want to dedicate
my appreciation to my beloved family, who supported and motivated me while I finished
this project. I would also like to thank the friends who were willing to lend a
hand in finishing the project. Lastly, thank you to everyone who was directly or
indirectly involved in the process of making the system and documentation.
ABSTRACT
Nowadays, some students need extra pocket money to support their life at
university. One way to earn it is to work as a part-time tutor, either among
their friends at university or among school students outside the university.
Being a part-time tutor helps them build their self-esteem and gain experience
for their future career. However, some of them are hesitant to teach because
they do not know how to assess their own ability in a specific subject. Moreover,
they need to prove to their clients or students that they are capable of teaching
the subject. On the other side, the admin has difficulty choosing the right student
for each subject because there are so many applications from students who want to
be part-time tutors, and each student also needs to meet the subject requirement.
This project was built to classify students' ability to teach a subject based on
their achievement in the courses they take at university. A student applies for
the tutor job and fills in the subject requirement, then waits for the admin to
update the result. The admin manages the subject requirement groups and classifies
the students based on their ranking in that subject. This project is important for
convincing other students who need a tutor in a specific subject. To realize this
project, a clustering technique is applied using a centroid-based clustering
algorithm, K-Means. K-Means is often described as unsupervised learning, because
the data carry no prescribed labels and no class values denoting an a priori
grouping of the data instances are given.
ABSTRAK
Pada masa kini, terdapat sesetengah pelajar memerlukan wang tambahan untuk
menyara kehidupan mereka di universiti. Salah satu cara untuk mendapatkan wang
saku tambahan ialah dengan menjadi guru sambilan sama ada di kalangan sahabat
mereka di universiti atau di kalangan rakan sekelas mereka. Dengan menjadi guru
sambilan, adalah sangat baik bagi mereka untuk membina keyakinan diri dan juga
untuk mendapatkan pengalaman untuk kerjaya di masa hadapan. Walau
bagaimanapun, sesetengah daripada mereka masih keliru untuk mengajar sesuatu
subjek kerana mereka tidak tahu bagaimana menilai kebolehan mereka dalam
subjek tertentu. Lebih-lebih lagi, mereka perlu membuktikan kepada klien atau
pelajar bahawa mereka mampu mengajar mata pelajaran. Di sisi lain, terdapat
masalah untuk pentadbir memilih pelajar yang tepat untuk setiap mata pelajaran
kerana terdapat begitu banyak penyertaan dari pelajar yang ingin menjadi guru
sambilan. Tambahan pula, pelajar tersebut juga hendaklah memenuhi syarat untuk
jadi guru untuk subjek tertentu. Projek ini dibina untuk mengklasifikasikan
kebolehan mereka untuk mengajar mata pelajaran berdasarkan pencapaian mereka
dalam kursus yang mereka ambil di universiti. Pelajar akan memohon pekerjaan
tutor dan mengisi keperluan subjek. Selepas itu, mereka akan menunggu pentadbir
mengemas kini hasilnya, kerana pentadbir akan menguruskan kumpulan keperluan
subjek dan mengklasifikasikan pelajar berdasarkan ranking mereka dalam subjek
itu. Projek ini penting untuk meyakinkan pelajar lain yang memerlukan tutor
dalam subjek tertentu. Untuk merealisasikan projek ini, teknik clustering akan
digunakan menggunakan algoritma kluster berasaskan centroid, K-means. K-
means sering dipanggil pembelajaran tanpa pengawasan, kerana kami tidak
menetapkan label dalam data dan tidak ada nilai kelas yang menunjukkan
kumpulan priori dari contoh data yang diberikan.
CONTENTS
DECLARATION
CONFIRMATION
DEDICATION
ABSTRACT
ABSTRAK
CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1 INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Objectives
1.4 Scopes
1.4.1 Scope Admin
1.4.2 Scope Student
1.5 Limitation of Work
1.6 Expected Outcome
1.7 Report Structure
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Similar System
2.3 K-Means Clustering Algorithm
2.3.1 What is Clustering Technique
2.3.2 Introduction to K-Means Clustering
2.3.3 K-Means Clustering Algorithm
CHAPTER 3 METHODOLOGY
3.1 Introduction
3.2 Iterative Model
3.2.1 Requirement Phase
3.3 Analysis and System Design
3.3.1 Framework Design
3.3.2 System Design
3.3.3 Data Model
3.3.4 Technique
3.4 Summary
CHAPTER 4 IMPLEMENTATION PHASE
4.1 Introduction
4.2 Implementation of My Tutor System
4.3 Design Interface
4.4 Summary
CHAPTER 5 CONCLUSION
5.1 Introduction
5.2 Project Contribution
5.3 Result Discussion
5.4 Project Constraint and Limitations
5.5 Future Work
5.6 Summary
REFERENCES
LIST OF TABLES
TABLE TITLE
3.1 List of software
3.2 List of hardware
3.3 Admin data model
3.4 Student data model
3.5 Subject data model
3.6 Subject mark data model
3.7 Subject group data model
3.8 Student group data model
3.9 K-means data model
3.10 Academic data model
3.11 Define centroid example
3.12 Calculation of new k-means
4.1 Test Cases Success Admin Login
4.3 Test Cases Success Add Subject
4.4 Test Cases Success Update Subject
4.5 Test Cases Success Delete Subject
LIST OF FIGURES
FIGURE TITLE
2.1 Part Time Post
2.2 E-Rezeki website
2.3 Nearest cluster assignment formula
2.4 Centroids update formula
3.1 Iterative Model
3.2 System Framework
3.3 Context diagram
3.4 Data flow diagram level-0 (Admin)
3.5 Data flow diagram level-0 (Student)
3.6 Entity Relationship diagram
3.7 Subject mark example
3.8 Subject mark example graph
4.1 Main interface
4.2 Register page
4.3 Dashboard page
4.4 Profile page
4.5 Academic page
4.6 Subject page
4.7 History page
4.8 Report page
4.9 Profile page (Admin)
4.10 Manage tutor page
4.11 Manage subject page
4.12 Manage group page
4.13 Manage history page
4.14 Calculation page
4.15.1 K-means clustering table page
4.15.2 K-means clustering graph page
4.16 Admin report page
LIST OF ABBREVIATIONS / TERMS / SYMBOLS
CD Context Diagram
DFD Data Flow Diagram
ERD Entity Relationship Diagram
FYP Final year project
LIST OF APPENDICES
APPENDIX TITLE
A Appendix 1
CHAPTER I
INTRODUCTION
1.1 Background
My Part Time Tutor Selection System Using K-Means Algorithm is a web-based
application. The system helps students who want to be part-time tutors to teach
subjects that fit their skills. The problem is how to correctly classify tutors
among students according to a certain subject. For example, a student who wants to
be a tutor in the Data Structure subject must have good results in the basic
programming subject and the object-oriented programming subject. The system groups
the potential tutors who most nearly match the subject requirement. To realize the
system, the K-Means Clustering Algorithm is used. To apply for a tutor job, students
fill in their subject grades, and the distance from these grades to the cluster
centroids is calculated to determine whether they belong to the right tutor group.
1.2 Problem Statement
To find the best tutor, we have to assign students to a group that fits their skills
in a particular subject. The problem is how to correctly classify tutors among
students according to a certain subject.
1.3 Objectives
There are three main objectives in developing this system:
1.3.1 To analyze a group recommendation for the Tutor Selection
System.
1.3.2 To design the proposed Tutor Selection System based on
Student’s Academic Achievement using the K-Means technique.
1.3.3 To develop the Tutor Selection System based on
Student’s Academic Achievement using the K-Means technique.
1.4 Scope
There are two scopes in this system:
1.4.1 Scope Admin
1.4.1.1 Admin can log in to the system.
1.4.1.2 Admin can manage profiles, namely the part-time tutor
profiles.
1.4.1.3 Admin can create, update, and delete user profiles.
1.4.2 Scope Student
1.4.2.1 Students can register in the system.
1.4.2.2 Students can add, update and delete their details in the
system.
1.4.2.3 Students need to fill in the profile form and the educational form
in the system.
1.4.2.4 Students can view the recommended subjects to teach in the
system.
1.5 Limitation of Work
1.5.1 The subject marks are entered manually by the students. It is up
to the management to validate the data.
1.5.2 The system can only cluster the results and give
recommendations to the part-time tutors.
1.6 Expected Outcome
This system is expected to group part-time tutors based on similar course
achievement and assign them a suitable subject to teach that suits their
skill. Finally, students will be given a list of recommended subjects that are
suitable for the range of their group.
1.7 Report Structure
This report has five (5) chapters. Chapter 1 consists of the project
background, problem statement, objectives and system scope. Chapter 2 is the
literature review, which looks at previous systems and at the K-Means
clustering algorithm. Chapter 3 describes the methodology of the research,
which uses the iterative model, together with the system's framework and
design. Chapter 4 covers implementation, testing and results. Lastly,
Chapter 5 is the conclusion of the whole project.
Chapter 2
LITERATURE REVIEW
2.1 Introduction
This chapter reviews the literature on the technique used to develop
My Part Time Tutor Selection System, which is based on students' subject
achievement and uses the K-Means Clustering Algorithm.
2.2 Similar System
2.2.1 Manual System
My Part Time Tutor Selection System Using K-Means Algorithm is a project
built to help an organization choose the best tutors among students. The system
chooses a tutor based on the subjects that the student is good at; that is,
students are chosen based on their achievement in a particular subject, calculated
from their grade in that subject. This is because not all students are good at
every subject they take. Some of them have a high level of understanding and good
achievement in a particular subject. These are the students we want, so that they
can teach others who are not good at the subject. At present, the normal procedure
for selecting a tutor, lecturer or teacher is based on CGPA and an interview
session. This method does not fully guarantee that the selected tutor is good at
the job scope given. Selection based on achievement in a specific subject is
lacking.
2.2.2 Part Time Post
Figure 2.1 Part Time Post
Figure 2.1 above shows the Part Time Post system, which provides many part-time
jobs for users based on the requirements that have been set. This system is
very helpful for those who are looking for a part-time job, including work as a
tutor.
2.2.3 E-Rezeki
Figure 2.2 E-Rezeki website
Figure 2.2 above shows the E-Rezeki system, which integrates the Part
Time Post system so that tutors can easily find a job anywhere.
2.3 K-Means Clustering Algorithm
2.3.1 What is clustering technique
Clustering is a technique for finding similarity groups in data, called clusters.
It attempts to group individuals in a population together by similarity, but is not
driven by a specific purpose. Clustering is often described as unsupervised
learning, as the data carry no prescribed labels and no class values
denoting an a priori grouping of the data instances are given (Manu Jeevan, 2017).
K-Means clustering was proposed by J.B. MacQueen (Zhang Yufang, 2003).
2.3.2 Introduction to K-Means Clustering Algorithm
K-Means is a method of clustering observations into a specific number of
disjoint clusters. The ‘K’ refers to the number of clusters specified. Various
distance measures exist to determine which observation is to be assigned to
which cluster. The algorithm aims to minimize the distance between each
observation and the centroid of its cluster by iteratively reassigning
observations to clusters, and it terminates when the distance measure can no
longer be reduced.
2.3.3 K-Means Clustering Algorithm
K-Means defines a prototype in terms of a centroid, which is usually the mean
of a group of points and is typically applied to objects in a continuous n-
dimensional space. The K-Means clustering technique is simple and we begin
with a description of the basic algorithm.
2.3.3.1 Initial Centroids Selection
We first choose K initial centroids. A centroid (k) is a cluster centre
represented by the feature values of the group of nearby objects assigned to it.
It is also used as a reference point for assigning objects to a cluster
based on their nearest distance to the centroid. At the beginning of the
assignment process, a set of K initial centroids needs to be
predetermined so that the objects can be assigned accordingly. In basic K-Means,
these initial centroids are randomly selected among the objects.
2.3.3.2 Nearest Cluster Assignment
Each point is then assigned to the closest centroid, and each collection of points
assigned to a centroid forms a cluster. The clustering process begins by measuring
the distance from each object to each centroid (mk).
Figure 2.3 Nearest cluster assignment formula
where Sik is the set of objects in cluster k, k = 0 to K, and d is a feature. Each
object is assigned to the cluster whose centroid is closest to it. The distance
is measured using the Euclidean distance, the typical K-Means measure of
nearness.
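For reference, the textbook form of this assignment rule can be written as follows. This is a sketch under assumptions: the symbols x_i (an object with D features) and the index j over centroids are introduced here for illustration and may differ from the notation used in Figure 2.3.

```latex
\operatorname{dist}(x_i, m_k) = \sqrt{\sum_{d=1}^{D} \left( x_{id} - m_{kd} \right)^2},
\qquad
S_k = \left\{\, x_i : \operatorname{dist}(x_i, m_k) \le \operatorname{dist}(x_i, m_j) \ \text{for all } j \,\right\}
```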
2.3.3.3 Centroids Update
Then, the centroid of each cluster is updated based on the points assigned to the
cluster. Once the objects have been re-assigned, the centroid for each cluster
needs to be re-calculated. We repeat the assignment and update steps until no point
changes cluster, or equivalently, until the centroids remain the same.
Figure 2.4 Centroids update formula
where M is the total number of objects in cluster k, k = 0 to K and d = 0 to D. This
step ensures that all objects currently assigned to a cluster really belong
to that cluster (i.e. they are nearest to its newly calculated centroid) and are
farther from the other clusters. If an object turns out to be nearer to another
centroid, then it needs to be reassigned to that nearest cluster. Thus,
the whole cycle, from the assignment step (2.3.3.2) to the update step (2.3.3.3),
is repeated iteratively until there are no changes to the centroids in any cluster.
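As a hedged sketch, the standard centroid update can be written as below, with M the number of objects in cluster k as defined above. The symbols x_i and S_k, and the choice to index features from 1 to D, are assumptions made for illustration and may differ from Figure 2.4.

```latex
m_{kd} = \frac{1}{M} \sum_{x_i \in S_k} x_{id}, \qquad d = 1, \dots, D
```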
2.3.3.4 Basic K-Means Algorithm
Select K points as initial centroids.
repeat
    Form K clusters by assigning each point to its closest centroid.
    Recompute the centroid of each cluster.
until the centroids do not change.
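The basic algorithm above can be sketched in Python as follows. This is a minimal illustration, not the code used in the system (which was written in PHP); the function names `euclidean` and `k_means`, the `seed` and `max_iter` parameters, and the handling of empty clusters are assumptions made for this example.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two points of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, k, seed=0, max_iter=100):
    """Basic K-Means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Select K points as initial centroids (random sample of the data).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Form K clusters by assigning each point to its closest centroid.
        labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        # Recompute the centroid of each cluster as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dims = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members)
                          for d in range(dims))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Stop when the centroids do not change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

With two-feature inputs such as pairs of subject marks, `k_means(marks, 3)` would return three centroids and one cluster label per student.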
Chapter 3
METHODOLOGY
3.1 Introduction
This chapter discusses the methodology used to develop the
system from the beginning until the system is completed. The methodology
is very important in developing the system because it describes, step by
step, how the system is developed, and it also serves as a reference for
those who will later extend or study the system. In addition, a
methodology is a formalized approach to implementing the Software Development
Life Cycle (SDLC). Various models have been defined and designed for the
software development process. The SDLC model chosen to develop this
system is the Iterative Model. Details of every phase involved in the
development of this system are explained in this chapter.
3.2 Iterative Model
Figure 3.1 Iterative Model
In this model, the process starts from the requirements, which are iteratively
enhanced until the final software is implemented. The development
begins by specifying and implementing just part of the software, which can
then be reviewed in order to identify further requirements. This process is then
repeated, producing a new version of the software for each cycle of the model.
This model works in four phases: the requirement phase, design
phase, implementation phase and evaluation phase. This model was chosen
because it allows better testing at each iteration. In addition, this
model does not involve high complexity, and feedback is generated
quickly. However, this model requires planning at the technical level, and it is
not easy to understand.
3.2.1 Requirement Phase
In this phase, the requirements for the software are gathered and analyzed.
Iteration should eventually result in a requirements phase that produces a
complete and final specification of requirements.
3.2.1.1 Software Requirement
Software used to develop the My Part Time Tutor Selection System.
Table 3.1 List of Software
3.2.1.2 Hardware Requirement
Hardware used to develop the My Part Time Tutor Selection System.
| Hardware | Description |
|---|---|
| Laptop | HP 15-r236TX |
| | Processor: Intel® Core™ i3-4005U CPU @ 1.7 GHz |
| | RAM: 8.00 GB |
| | OS: Windows 10 |
| | GPU: NVIDIA GeForce FT 820M |
Table 3.2 List of Hardware
3.3 Analysis and Design Phase
In this phase, the software solution that meets the requirements is designed. The
system framework diagram, Context Diagram (CD), Data Flow Diagram
(DFD) and Entity Relationship Diagram (ERD) are built to clarify the
actual system.
3.3.1 Framework Design
Figure 3.2 System Framework
The figure above shows an overview of the system. Both admin and student
register and log in to the system. The admin updates the available tutor
subjects in the system, and students can view and apply for as many subjects as
they want. When applying for a subject, students enter the marks for the required
subjects, and the marks are processed using the K-Means technique in the system.
Once the calculation is done, the result is given to the admin for evaluation, and
the admin updates the result to inform the student whether the application was
successful.
3.3.2 System Design
3.3.2.1 Context Diagram
A system context diagram (CD) is a diagram that defines the boundary
between the system, or part of a system, and its environment, showing the
entities that interact with it. This diagram is a high-level view of a system.
Figure 3.3 Context Diagram
The figure above shows the overall flow of the whole system, which includes 2
entities: Student and Admin.
3.3.2.2 Data Flow Diagram
A data flow diagram (DFD) is a graphical representation of the “flow” of data
through an information system, modeling its process aspects. A DFD is often
used as a preliminary step to create an overview of the system without going
into great detail, which can later be elaborated.
3.3.2.2.1 Data Flow Diagram Level – 0
Figure 3.4 Data Flow Diagram Level-0 [Admin]
The figure above shows the DFD Level-0 for Admin, which includes 6 processes.
First, the admin registers in the system and goes directly to the admin site.
In the admin site, the admin updates the available subjects in the system and
views any applications from students. Finally, the admin makes a report of the
chosen students for each subject.
Figure 3.5 Data Flow Diagram Level-0 [Student]
The figure above shows the DFD Level-0 for Student, which includes 6 processes.
First, the student registers in the system and views their dashboard. Next, the
student is able to view the available subject list and insert their subject
marks. Then the student is able to see their subject history and waits for the
admin to update their report on the subject to teach.
3.3.2.3 Entity Relationship Diagram
Entity relationship diagram (ERD) is a graphical representation of entities and
their relationships to each other, typically used in computing in regard to the
organization of data within databases or information systems.
Figure 3.6 Entity Relationship Diagram
The figure above shows the ERD of the system, which includes 5 entities and 6
relations.
3.3.3 Data Model
A data model (or datamodel) is an abstract model that organizes elements
of data and standardizes how they relate to one another and to properties of
real-world entities.
3.3.3.1 Admin
| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | id | int(11) | Primary Key | |
| 2 | Username | varchar(255) | | |
| 3 | Password | varchar(255) | | |
| 4 | AdminPhoto | varchar(255) | | |
Table 3.3 Admin Data Model
Table above shows the details of admin data.
3.3.3.2 Student
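| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | TutorRegno | varchar(255) | Primary Key | |
| 2 | TutorPhoto | varchar(255) | | |
| 3 | TutorName | varchar(255) | | |
| 4 | TutorCgpa | decimal(10,2) | | |
| 5 | TutorPwd | varchar(255) | | |
| 6 | TutorRegdate | timestamp | | CURRENT_TIMESTAMP |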
Table 3.4 Student Data Model
Table above shows the details of student data.
3.3.3.3 Subject
| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | subcode | varchar(255) | Primary Key | |
| 2 | subname | varchar(255) | | |
| 3 | subcreate | timestamp | | CURRENT_TIMESTAMP |
Table 3.5 Subject Data Model
Table above shows the details of subject data.
3.3.3.4 Subject Enrollment
| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | id | int(255) | Primary Key | |
| 2 | subcode | varchar(255) | Foreign Key | Table subject |
| 3 | subgrade | decimal(10,2) | | |
| 4 | TutorRegno | varchar(500) | Foreign Key | Table Student |
| 5 | subenroll | timestamp | | CURRENT_TIMESTAMP |
Table 3.6 Subject Mark Data Model
Table above shows the details of subject mark data where the subcode is taken
from table subject and TutorRegno is taken from table student.
3.3.3.5 Subject Group
| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | group_id | varchar(255) | Primary Key | |
| 2 | groupname | varchar(255) | | |
| 3 | subA | varchar(255) | Foreign Key | Table Subject |
| 4 | subB | varchar(255) | Foreign Key | Table Subject |
Table 3.7 Subject Group Data Model
The table above shows the details of subject group data, where the admin updates
the two required subjects for each group, subA and subB.
3.3.3.6 Student Group
| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | tgId | int(255) | Primary Key | |
| 2 | TutorRegno | varchar(25) | Foreign Key | Table Student |
| 3 | group_id | varchar(255) | Foreign Key | Table subject Group |
Table 3.8 Student Group Data Model
The table above shows the details of student group data, that is, the final
students who are chosen for the subject group and required to teach that subject.
3.3.3.7 Kmeans
| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | kmeans_id | int(255) | Primary Key | |
| 2 | TutorRegno | varchar(255) | Foreign Key | Table Student |
| 3 | subA | float | Foreign Key | Table subject |
| 4 | subB | float | Foreign Key | Table subject |
| 5 | cluster | int(255) | | |
Table 3.9 Kmeans Data Model
The table above shows the details of kmeans data, where the marks of the required
subjects are calculated and each student is placed in a specific cluster.
3.3.3.8 Academic
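| # | Name | Type | Pk/Fk | Description |
|---|---|---|---|---|
| 1 | a_id | int(255) | Primary Key | |
| 2 | a_department | varchar(255) | | |
| 3 | a_course | varchar(255) | | |
| 4 | a_sem | varchar(255) | | |
| 5 | TutorCgpa | varchar(255) | | |
| 6 | TutorRegno | varchar(255) | | |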
Table 3.10 Academic Data Model
Table above shows the details of academic group data.
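To illustrate how the subject, subject mark and subject group data models could supply the two marks used for clustering, the sketch below builds simplified versions of those tables and selects each student's marks for subA and subB. It is only an illustration under assumptions: it uses Python's built-in `sqlite3` with an in-memory database rather than the MySQL database of the system, the table names `subject`, `subject_mark` and `subject_group` and the helper `marks_for_group` are hypothetical, the column types are simplified, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subject (subcode TEXT PRIMARY KEY, subname TEXT);
CREATE TABLE subject_mark (
    id INTEGER PRIMARY KEY,
    subcode TEXT REFERENCES subject(subcode),
    subgrade REAL,
    TutorRegno TEXT
);
CREATE TABLE subject_group (
    group_id TEXT PRIMARY KEY,
    groupname TEXT,
    subA TEXT REFERENCES subject(subcode),
    subB TEXT REFERENCES subject(subcode)
);
""")

# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO subject VALUES (?, ?)",
    [("S1", "Basic Programming"), ("S2", "Object-Oriented Programming")],
)
conn.execute(
    "INSERT INTO subject_group VALUES ('G1', 'Data Structure', 'S1', 'S2')"
)
conn.executemany(
    "INSERT INTO subject_mark (subcode, subgrade, TutorRegno) VALUES (?, ?, ?)",
    [("S1", 3.0, "T001"), ("S2", 4.0, "T001"),
     ("S1", 1.5, "T002"), ("S2", 2.0, "T002")],
)


def marks_for_group(conn, group_id):
    """Return {TutorRegno: (subA mark, subB mark)} for students with both marks."""
    rows = conn.execute(
        """
        SELECT a.TutorRegno, a.subgrade, b.subgrade
        FROM subject_group g
        JOIN subject_mark a ON a.subcode = g.subA
        JOIN subject_mark b ON b.subcode = g.subB
                           AND b.TutorRegno = a.TutorRegno
        WHERE g.group_id = ?
        ORDER BY a.TutorRegno
        """,
        (group_id,),
    ).fetchall()
    return {regno: (mark_a, mark_b) for regno, mark_a, mark_b in rows}


print(marks_for_group(conn, "G1"))
```

Each value in the returned dictionary is a two-feature point of the kind the K-Means technique in Section 3.3.4 operates on.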
3.3.4 Technique
3.3.4.1 K-Means Clustering
K-Means Clustering is one of the simplest unsupervised learning techniques that can
solve the clustering problem. The procedure follows a simple and easy way to
classify a given data set into a certain number of clusters (assume k clusters)
fixed a priori. In this project, we select two subject marks for each student based
on their subject achievement. Below is an example of the subject marks that
have been listed in the record.
Figure 3.7 Subject Mark
Figure 3.8 Subject Mark Graph
There are three main steps in calculating the K-Means Clustering:
3.3.4.1.1 Define k centroids, one for each cluster.
First, we assume an initial centroid for each cluster at random. In this
example, the initial centroid for cluster one is (1.0, 1.0) and the initial centroid
for cluster two is (3.0, 4.0). These initial centroids are used to calculate the
Euclidean distance from each object to each centroid, so that the nearest centroid
can be found.
Table 3.11 Define Centroid Example
These centroids should be placed carefully, because different locations
cause different results. So, it is better to place them as far away
from each other as possible.
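One simple way to place initial centroids far apart is a farthest-first heuristic, sketched below. This is a named alternative to the random selection described earlier, not the method used in the system; the function name `farthest_first_centroids` and the choice of the first record as the starting centroid are assumptions for this example. It expects k to be no larger than the number of distinct points.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def farthest_first_centroids(points, k):
    """Pick k initial centroids that are spread out (farthest-first heuristic)."""
    centroids = [points[0]]  # arbitrary starting choice: the first record
    while len(centroids) < k:
        # Choose the point whose nearest already-chosen centroid is farthest away.
        candidate = max(
            points,
            key=lambda p: min(euclidean(p, c) for c in centroids),
        )
        centroids.append(candidate)
    return centroids
```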
3.3.4.1.2 Take each point belonging to a given data set and associate it with
the nearest centroid.
The clustering process begins by measuring the distance from each object to each
centroid.
Calculation for Record 2, (1.5, 2.0):
Centroid of Cluster 1 = Record 1 (1.0, 1.0)
Centroid of Cluster 2 = Record 3 (3.0, 4.0)
Euclidean distance to Cluster 1: $\sqrt{(1.5 - 1.0)^2 + (2.0 - 1.0)^2} \approx 1.12$
Euclidean distance to Cluster 2: $\sqrt{(1.5 - 3.0)^2 + (2.0 - 4.0)^2} = 2.5$
The distance to cluster 1 is less than the distance to cluster 2, so Record 2 is
placed in cluster 1. Cluster 1 therefore contains records 1 and 2.
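The calculation for Record 2 can be checked with a few lines of Python. The values are taken from the worked example above; the variable and function names are chosen here for illustration only.

```python
import math

# Values from the worked example: Record 2 and the two initial centroids.
record_2 = (1.5, 2.0)
centroid_1 = (1.0, 1.0)
centroid_2 = (3.0, 4.0)


def euclidean(p, q):
    """Euclidean distance between two points of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


d1 = euclidean(record_2, centroid_1)
d2 = euclidean(record_2, centroid_2)
# Assign the record to the cluster with the smaller distance.
assigned = 1 if d1 < d2 else 2
print(round(d1, 2), round(d2, 2), assigned)  # prints: 1.12 2.5 1
```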
When no point is pending, the first step is done. At this point, k new centroids
need to be recalculated as the centers of the clusters resulting from the previous
step.
3.3.4.1.3 After these k new centroids are obtained, a new binding has to be done
between the same data points and the nearest new centroids.
This is the last step: once the objects have been re-assigned, the centroid
for each cluster needs to be re-calculated. After Record 2 has been assigned
to cluster 1, we need to calculate the new means.
| Cluster | 1 | 2 |
|---|---|---|
| Records | 1, 2 | 3 (no change) |
| Means | (1.25, 1.5) | (3.0, 4.0) |
Table 3.12 Calculation of New Means
New mean for Cluster 1 = $\left(\frac{1 + 1.5}{2}, \frac{2 + 1}{2}\right) = (1.25, 1.5)$
Thus, a loop is generated in which the k centroids
change their location step by step until no more changes occur. In the
simplest words, the loop ends when the centroids do not move any more.
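The new mean for cluster 1 can be reproduced with the short sketch below. It assumes, as the worked example indicates, that Record 1 is (1.0, 1.0) and Record 2 is (1.5, 2.0); the helper name `mean_point` is hypothetical.

```python
# Records currently in cluster 1, taken from the worked example.
cluster_1 = [(1.0, 1.0), (1.5, 2.0)]


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


new_centroid_1 = mean_point(cluster_1)
print(new_centroid_1)  # prints: (1.25, 1.5)
```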
3.3.4.2 Implementation of the K-Means Clustering Algorithm in My Tutor
3.3.4.2.1 Declaration and set alternative function.
3.3.4.2.2 Initialize Centroids
3.3.4.2.3 Assign Cluster
3.3.4.2.4 Update Centroids
3.4 Summary
In conclusion, choosing the right development methodology is very important
because it affects the whole development process. The right methodology
helps the project to be completed smoothly. In addition, the design
and framework are also important because they give a clear picture of the system,
so that it can be built smoothly with a good system flow.
Chapter 4
IMPLEMENTATION AND RESULT
4.1 Introduction
Implementation and testing are carried out to ensure that the system is developed
according to its main objectives and meets the user requirements.
This chapter presents the result of the My Part Time Tutor Selection System, also
called the My Tutor System, that has been developed.
4.2 Implementation of My Tutor System
Several languages and tools were used to develop the My Tutor System.
For the interface template, Bootstrap 3.0 and startbootstrap-agency-gh-pages
were used. On the server side, PHP (Hypertext Preprocessor) was
used as the programming language. PHP is widely used because it is an open-source,
general-purpose scripting language that can be embedded into
HTML and is well suited to Web development.
For validation, the system uses HTML5, PHP and JavaScript.
Validation is very important to reduce the chance of users making
mistakes when they key in their data. For example, users are required to fill in
every field of each input form. If a user skips a field, the form cannot be
submitted. Finally, an open-source database is used in this
system: MySQL, version 10.1.22-MariaDB. Apache version 2.4.25 is used to
run the local host server. Visual Studio Code was used for
writing the code.
4.3 Design Interface
The interface design is divided into two parts: the Admin pages and the user pages.
4.3.1 Main Interface
Figure 4.1 Main Interface
Figure 4.1 above shows the main login interface for both Admin and User.
4.3.2 Register Page
Figure 4.2 Register Page
Figure 4.2 above shows the register page for users. Users are required to fill in
their full name, ID number and password.
4.3.3 Dashboard Page
Figure 4.3 Dashboard Page
Figure 4.3 above shows the dashboard page for both Admin and User.
4.3.4 Profile Page
Figure 4.4 Profile Page
Figure 4.4 above shows the profile page for users. Users can view their name and
matric number. Users can also update their full name and profile picture.
4.3.5 Academic Page
Figure 4.5 Academic Page
Figure 4.5 above shows the academic page. Users can update their academic
details such as CGPA, faculty, course and semester.
4.3.6 Subject Page
Figure 4.6 Subject Page
Figure 4.6 above shows the subject page, where users need to choose a subject
and enter their subject mark.
4.3.7 History Page
Figure 4.7 History Page
Figure 4.7 above shows the enrollment history page for users. Users can view all of
the subjects that they have keyed in, and update their subject marks or delete a
subject.
4.3.8 Report Pages
Figure 4.8 Report Pages
Figure 4.8 shows the report pages for users. Users can view the full details of
their personal and academic information. At the bottom of the report, they can
view the recommended group for them to teach.
4.3.9 Profile Page (Admin)
Figure 4.9 Profile Page (Admin)
Figure 4.9 shows the profile page for admin, where the admin can view and
update personal information.
4.3.10 Manage Tutor Pages
Figure 4.10 Manage Tutor Pages
Figure 4.10 above shows the tutor pages for admin, where the admin can view all
of the tutors that have registered in the system and can also delete
tutors that are no longer active.
4.3.11 Manage Subject Pages
Figure 4.11 Manage Subject Pages
Figure 4.11 above shows the subject pages for admin to manage. In this section, the
admin can add a newly available subject, edit a subject name and delete
an unavailable subject.
4.3.12 Manage Group Page
Figure 4.12 Manage Group Page
Figure 4.12 above shows the group pages for admin to manage. In this section, the
admin needs to add two subjects that fit the group criteria. The admin can also
edit the subject criteria and delete the group.
4.3.13 Manage History Page
Figure 4.13 Manage History Page
Figure 4.13 above shows the history page for the admin. In this section
the admin can view all of the subjects that users have enrolled in and can delete
a user's history.
4.3.14 Calculation Page
Figure 4.14 Calculation Page
Figure 4.14 above shows the calculation page for the admin. In this
section the admin selects the group name and the number of clusters to start the
calculation.
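The calculation started from this page can be illustrated with a minimal K-Means sketch in Python. This is not the system's actual code: the function names `initial_centroids` and `kmeans`, the sample marks, and the way the starting centroids are picked (evenly spaced students after sorting by total mark) are all assumptions made for illustration. Each student is represented by two subject marks, matching the two criteria used by the system.

```python
import math


def initial_centroids(points, k):
    """Pick k starting centroids, evenly spaced after sorting by total mark.

    This deterministic choice is an assumption made for the sketch; random
    starting points are also common.
    """
    ordered = sorted(points, key=sum)
    n = len(ordered)
    return [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]


def kmeans(points, k, max_iter=100):
    """Cluster (mark_a, mark_b) pairs into k groups; return labels and centroids."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of students")
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each student joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop when neither the labels nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical marks for two subjects (Subject Mark A, Subject Mark B).
marks = [(90, 85), (88, 92), (85, 88), (60, 65), (62, 58),
         (58, 61), (30, 35), (28, 40), (35, 32)]
labels, centroids = kmeans(marks, k=3)
print(labels)  # [2, 2, 2, 1, 1, 1, 0, 0, 0]
```

In this sample the cluster labelled 2 holds the students with the highest marks in both subjects. The label numbers themselves carry no meaning; they only follow the order of the starting centroids.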
4.3.15 K-Means Page
Figure 4.15.1 Kmeans clustering table
Figure 4.15.1 above shows the clustering result for three clusters in a table. In
this section, the admin can add a chosen student to the group and
can delete a student from the cluster if the student is already assigned to
another group.
Figure 4.15.2 Kmeans clustering graph
Figure 4.15.2 above shows the clustering result for three clusters in a graph. In
this section, the admin can view the clusters visually, which makes it easier to
choose the right students by looking for the highest cluster at the top of the graph.
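The idea of looking for the highest cluster can also be expressed in code. The sketch below is an illustration only, not the system's implementation: it assumes that clusters are ranked by the mean of the two centroid marks, and the names `rank_clusters` and `students_in_cluster`, the student names and the centroid values are all hypothetical.

```python
def rank_clusters(centroids):
    """Return cluster indices ordered from highest to lowest mean centroid mark."""
    return sorted(
        range(len(centroids)),
        key=lambda c: sum(centroids[c]) / len(centroids[c]),
        reverse=True,
    )


def students_in_cluster(names, labels, cluster):
    """List the students whose label matches the given cluster index."""
    return [name for name, lab in zip(names, labels) if lab == cluster]


# Hypothetical result of a clustering run with three clusters.
centroids = [(31.0, 35.7), (60.0, 61.3), (87.7, 88.3)]
labels = [2, 2, 1, 0, 1]
names = ["Aina", "Badrul", "Chong", "Devi", "Ehsan"]

ranking = rank_clusters(centroids)
top_students = students_in_cluster(names, labels, ranking[0])
print(ranking)       # [2, 1, 0]
print(top_students)  # ['Aina', 'Badrul']
```

Ranking by the mean of both marks treats the two subjects as equally important. A different weighting would be needed if one subject matters more for a tutor group.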
4.3.16 Report
Figure 4.16 Report Page
Figure 4.16 above shows the report of the clustering result. The admin can view
the students who have been chosen for the recommended group.
4.4 Testing Analysis
After the development of the system is complete, the system is tested
using two software testing techniques, black box testing and white
box testing, in order to examine the functionality of the system.
4.4.1 Black Box Testing
The modules involved in this testing are:
I. Login
II. Create, retrieve, update and delete subject
4.4.2 White Box Testing
The modules involved in this testing are:
I. Generate Tutor’s Clustering Result
II. Generate K-Means Clustering Graph
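One possible white box check for the clustering result module is sketched below in Python. It is an assumption about how such a check could look, not a record of the tests actually run: the helper `clustering_problems` and the sample data are hypothetical. The check verifies two properties that a converged K-Means result should satisfy: every student sits in the cluster with the nearest centroid, and every centroid equals the mean of its members.

```python
import math


def clustering_problems(points, labels, centroids, tol=1e-9):
    """Return a list of messages describing violated K-Means properties."""
    if len(labels) != len(points):
        return ["expected one label per student"]
    problems = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        if not 0 <= lab < len(centroids):
            problems.append(f"student {i}: label {lab} is not a cluster")
            continue
        # The assigned centroid must be no farther than the nearest one.
        nearest = min(math.dist(p, c) for c in centroids)
        if math.dist(p, centroids[lab]) > nearest + tol:
            problems.append(f"student {i}: not in the nearest cluster")
    for idx, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == idx]
        if not members:
            continue  # nothing to compare for an empty cluster
        for axis in range(len(c)):
            mean = sum(m[axis] for m in members) / len(members)
            if abs(mean - c[axis]) > tol:
                problems.append(f"cluster {idx}: centroid is not the member mean")
                break
    return problems


# Hypothetical marks and a clustering result to be checked.
points = [(80, 90), (90, 80), (40, 50), (50, 40)]
centroids = [(85.0, 85.0), (45.0, 45.0)]
print(clustering_problems(points, [0, 0, 1, 1], centroids))  # []
```

An empty list means that the result passed both checks. A wrong assignment such as `[0, 1, 1, 1]` would be reported as a problem for the second student and for both centroids.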
4.5 Test Cases
A test case is a set of conditions or variables under which a tester determines
whether the system under test works correctly and satisfies its requirements. The
process of developing test cases may also help to find problems in the requirements
or design of an application. The test cases for several processes in the
My Tutor System are shown below.
4.5.1 Login
Step | Procedure | Expected Result | Pass/Fail
1. | Go to login page | Preview page loaded | Pass
2. | Enter the following details: Admin Id: Admin; Password: admin17 | Message “successfull login” | Pass
3. | Click “Login” button | | Pass
Table 4.1 Test Cases Success Admin Login
4.5.2 Admin
Step | Procedure | Expected Result | Pass/Fail
1. | Click “subject” page | Preview page loaded | Pass
2. | Click “add” button | | Pass
3. | Enter the following details: Subject Code: C001; Subject Name: English | | Pass
4. | Click “yes” button | Message “New data added” | Pass
Table 4.3 Test Case Add Subject
Step | Procedure | Expected Result | Pass/Fail
1. | Click “subject” page | Preview page loaded | Pass
2. | Click “update” button | | Pass
3. | Enter the following details: Subject Code: C002; Subject Name: English and communication | | Pass
4. | Click “yes” button | Message “Data Updated” | Pass
Table 4.4 Test Case Update Subject
Step | Procedure | Expected Result | Pass/Fail
1. | Click “subject” page | Preview page loaded | Pass
2. | Click “delete” button | Message “are you sure want to delete” | Pass
3. | Click “yes” button | One row deleted from table | Pass
Table 4.5 Test Case Delete Subject
4.6 Summary
In conclusion, this chapter briefly discussed the implementation of the
code, the interface design and the testing of the final system. All of the
test cases listed in this chapter passed, and the system worked as
planned.
Chapter 5
CONCLUSION
5.1 Introduction
This chapter presents the conclusion of this project. It covers a summary of
the whole project, the project contribution, the project
limitations and some suggestions for future work.
5.2 Project Contribution
My Tutor system has been developed for final year students in the Faculty of
Informatics and Computing at UniSZA. It has achieved the objectives and
scope of this project. Below is the list of the achievements of this project:
5.2.1 The system generates student achievement groups using K-Means clustering.
5.2.2 The system recommends to each student a tutor group that suits their skills.
5.2.3 The system benefits the company by placing good tutors in its
classrooms.
5.3 Result Discussion
Generally, this project has been carried out in line with the objectives
explained in Chapter 1. The project uses two criteria, namely the marks for two
subjects, to calculate the K-Means clustering result. In addition, it provides a
better way for students to know which group they are recommended to teach, and
helps the admin to assign the best students to a group.
5.4 Project Constraint and Limitations
A few problems and limitations occurred throughout the
development of this project. The problems and limitations in conducting this
study are:
5.4.1 The system is fixed to only two criteria, which are Subject
Mark A and Subject Mark B.
5.4.2 The number of clusters is limited to between 1 and 5.
5.4.3 The subject marks need to be entered manually by the student.
5.5 Future Work
There are some suggestions that can be made in order to make the system
more efficient in the future. The suggestions are:
5.5.1 Allow users to upload a resume to become a tutor.
5.5.2 Add more criteria and remove the limit of 1 to 5 on the number of
clusters.
5.5.3 Include the price per hour for tutoring lessons.
5.6 Summary
My Part Time Tutor Selection System is a system that focuses on
recommending a subject for the student to teach. Based on previous
studies and discussion with the supervisor, the approach
implemented in this project is the K-Means clustering technique. The system
helps the admin to select the best students for the tutored subject by looking
at the cluster ranking on the graph. Hopefully, this system will help
students to become tutors who suit the subject requirements.
REFERENCES
Ju, C., & Xu, C. (2013). A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Algorithm, 2013.
Kodinariya, T. M., & Makwana, P. R. (2013). Review on determining number of Cluster in K-Means Clustering. International Journal of Advance Research in Computer Science and Management Studies, 1(6), 2321–7782.
Li, C. S. (2011). Cluster center initialization method for K-means algorithm over data sets with two clusters. Procedia Engineering, 24, 324–328. https://doi.org/10.1016/j.proeng.2011.11.2650
Li, Y., & Wu, H. (2012). A Clustering Method Based on K-Means Algorithm. Physics Procedia, 25, 1104–1109. https://doi.org/10.1016/j.phpro.2012.03.206
Yadav, S., Bharadwaj, B., & Pal, S. (2012). Data mining applications: A comparative study for predicting student’s performance. International Journal of Innovative Technology & Creative Engineering, 1(12), 13–19. Retrieved from http://arxiv.org/abs/1202.4815
APPENDIX