MY PART TIME TUTOR SELECTION SYSTEM USING K-MEANS ALGORITHM NUR ZARITH AKILLA BINTI AMBOAKA BACHELOR OF COMPUTER SCIENCE (INTERNET COMPUTING) WITH HONOURS UNIVERSITI SULTAN ZAINAL ABIDIN 2018


  • MY PART TIME TUTOR SELECTION SYSTEM USING K-MEANS

    ALGORITHM

    NUR ZARITH AKILLA BINTI AMBOAKA

    BACHELOR OF COMPUTER SCIENCE

    (INTERNET COMPUTING) WITH HONOURS

    UNIVERSITI SULTAN ZAINAL ABIDIN

    2018

  • MY PART TIME TUTOR SELECTION SYSTEM USING K-MEANS

    ALGORITHM

    NUR ZARITH AKILLA BINTI AMBOAKA

    Bachelor of Computer Science (Internet Computing)

    Faculty of Informatics and Computing

    Universiti Sultan Zainal Abidin, Terengganu, Malaysia

    AUGUST 2018

  • i

    DECLARATION

    I hereby declare that this report is based on my original work except for quotations

    and citations, which have been duly acknowledged. I also declare that it has not been

    previously or concurrently submitted for any other degree at Universiti Sultan Zainal

    Abidin or other institutions.

    ________________________________

    Name : Nur Zarith Akilla Binti Amboaka

    Date : ..................................................

  • ii

    CONFIRMATION

    This is to confirm that this project entitled My Part Time Tutor Selection System

    Using K-Means Algorithm was prepared and submitted by Nur Zarith Akilla Binti

    Amboaka (Matric Number: BTCL15039761) and has been satisfactory in terms of

    scope, quality and presentation as partial fulfilment of the requirement for the

    Bachelor of Computer Science (Internet Computing) with honours in Universiti Sultan

    Zainal Abidin. The research conducted and the writing of the report was under my

    supervision.

    ________________________________

    Name : Dr Suhailan Dato’ Safei

    Date : ..................................................

  • iii

    DEDICATION

    In the name of Allah, the Most Gracious and the Most Merciful. All praise is due to
    Him that the documentation and the system for the subject CSB 35102, Projek Ilmiah
    2018/2019, were finished on time. I would like to take this opportunity to thank my
    kind supervisor, Dr. Suhailan Bin Dato’ Safei, for the valuable ideas, time, support,
    advice and guidance given throughout the research and development, up to the
    completion of phase one of the project. I also wish to express my appreciation to my
    beloved family, who supported and motivated me while I finished this project. I would
    also like to thank the friends who were willing to lend a hand in finishing the
    project. Lastly, thank you to everyone who was directly or indirectly involved in the
    making of the system and the documentation.

  • iv

    ABSTRACT

    Nowadays, some students need extra pocket money to support their life at university.
    One way to earn it is to work as a part-time tutor, either among their friends at
    university or for school students outside the university. Being a part-time tutor
    helps them build self-esteem and gain experience for their future career. However,
    some of them hesitate to teach because they do not know how to assess their own
    ability in a specific subject. Moreover, they need to prove to their clients or
    students that they are capable of teaching the subject. On the other side, the admin
    has difficulty choosing the right student for each subject, since there are many
    applications from students who want to be part-time tutors, and each student must
    also meet the subject requirement. This project was built to classify students'
    ability to teach a subject based on their achievement in the courses they take at
    university. A student applies for the tutor job and fills in the subject requirement.
    The student then waits for the admin to update the result, since the admin manages
    the subject requirement groups and classifies students based on their ranking in that
    subject. This project is important for convincing other students who need a tutor in
    a specific subject. To realize this project, a clustering technique is applied using
    a centroid-based clustering algorithm, K-means. K-means is often called an
    unsupervised learning method, as the data carry no prescribed labels and no class
    values denoting an a priori grouping of the data instances are given.

  • v

    ABSTRAK

    Nowadays, some students need extra money to support their life at university. One
    way to earn extra pocket money is to become a part-time teacher, either among their
    friends at university or among their classmates. Being a part-time teacher is very
    good for building their self-confidence and for gaining experience for a future
    career. However, some of them are still unsure about teaching a subject because they
    do not know how to assess their ability in a particular subject. Moreover, they need
    to prove to clients or students that they are able to teach the subject. On the other
    hand, it is a problem for the administrator to choose the right student for each
    subject because there are so many applications from students who want to become
    part-time teachers. In addition, the student must also meet the requirements to be a
    teacher for a particular subject. This project was built to classify their ability to
    teach a subject based on their achievement in the courses they take at university.
    Students apply for the tutor job and fill in the subject requirements. After that,
    they wait for the administrator to update the result, because the administrator
    manages the subject requirement groups and classifies students based on their ranking
    in that subject. This project is important for convincing other students who need a
    tutor in a particular subject. To realize this project, a clustering technique is
    used with a centroid-based clustering algorithm, K-means. K-means is often called
    unsupervised learning, because no labels are set in the data and there are no class
    values indicating an a priori grouping of the given data instances.

  • vi

    CONTENTS

    PAGE

    DECLARATION i
    CONFIRMATION ii
    DEDICATION iii
    ABSTRACT iv
    ABSTRAK v
    CONTENTS vi
    LIST OF TABLES viii
    LIST OF FIGURES ix
    LIST OF ABBREVIATIONS x

    CHAPTER I INTRODUCTION
    1.1 Background 1
    1.2 Problem statement 1
    1.3 Objectives 1
    1.4 Scopes 2
    1.4.1 Scope Admin
    1.4.2 Scope Student
    1.5 Limitation of Work 2
    1.6 Expected Outcome 2
    1.7 Report Structure 3

    CHAPTER 2 LITERATURE REVIEW
    2.1 Introduction 4
    2.2 Similar System 4
    2.3 K-Means Clustering Algorithm 6
    2.3.1 What is Clustering Technique
    2.3.2 Introduction to K-Means Clustering
    2.3.3 K-Means Clustering Algorithm

  • vii

    CHAPTER 3 METHODOLOGY
    3.1 Introduction 9
    3.2 Iterative Model 9
    3.2.1 Requirement Phase 10
    3.3 Analysis and System Design 11
    3.3.1 Framework Design 11
    3.3.2 System Design 12
    3.3.3 Data Model 15
    3.3.4 Technique 18
    3.4 Summary 23

    CHAPTER 4 IMPLEMENTATION PHASE
    4.1 Introduction 24
    4.2 Implementation of My Tutor system 24
    4.3 Design Interface 25
    4.4 Summary 33

    CHAPTER 5 CONCLUSION
    5.1 Introduction 34
    5.2 Project Contribution 34
    5.3 Result discussion 34
    5.4 Project constraint and limitations 34
    5.5 Future work 35
    5.6 Summary 35

    REFERENCES 38

  • viii

    LIST OF TABLES

    TABLE TITLE PAGE

    3.1 List of software 10
    3.2 List of hardware 11
    3.3 Admin data model 15
    3.4 Student data model 16
    3.5 Subject data model 16
    3.6 Subject mark data model 16
    3.7 Subject group data model 17
    3.8 Student group data model 17
    3.9 K-means data model 17
    3.10 Academic data model 18
    3.11 Define centroid example 19
    3.12 Calculation of new k-means 20
    4.1 Test Cases Success Admin Login 34
    4.3 Test Cases Success Add Subject 34
    4.4 Test Cases Success Update Subject 35
    4.5 Test Cases Success Delete Subject 35

  • ix

    LIST OF FIGURES

    Figure TITLE PAGE

    2.1 Part Time Post 5
    2.2 E-Rezeki website 5
    2.3 Nearest cluster assignment formula 7
    2.4 Centroids update formula 7
    3.1 Iterative Model 9
    3.2 System Framework 11
    3.3 Context diagram 12
    3.4 Data flow diagram level-0 (Admin) 13
    3.5 Data flow diagram level-0 (Student) 14
    3.6 Entity Relationship diagram 15
    3.7 Subject mark example 18
    3.8 Subject mark example graph 19
    4.1 Main interface 25
    4.2 Register page 25
    4.3 Dashboard page 26
    4.4 Profile page 26
    4.5 Academic page 27
    4.6 Subject page 27
    4.7 History page 28
    4.8 Report page 28
    4.9 Profile page (Admin) 29
    4.10 Manage tutor page 29
    4.11 Manage subject page 30
    4.12 Manage group page 30
    4.13 Manage history page 31
    4.14 Calculation page 31
    4.15.1 K-means clustering table page 32
    4.15.2 K-means clustering graph page 32
    4.16 Admin report page 33

  • x

    LIST OF ABBREVIATIONS / TERMS / SYMBOLS

    CD Context Diagram

    DFD Data Flow Diagram

    ERD Entity Relationship Diagram

    FYP Final year project

  • xi

    LIST OF APPENDICES

    APPENDIX TITLE PAGE

    A Appendix 1 39

  • 1

    CHAPTER I

    INTRODUCTION

    1.1 Background

    My Part Time Tutor Selection System Using K-Means Algorithm is a web-based
    application. The system helps students who want to work as part-time tutors to
    teach a subject that fits their skills. The problem is how to correctly classify
    tutors among students for a given subject. For example, a student who wants to
    tutor the Data Structure subject must have good results in the basic programming
    and object-oriented programming subjects. The system groups the potential tutors
    who most closely match the subject requirement. To realize the system, the K-Means
    clustering algorithm is used. To apply for a tutor job, students fill in their
    subject grades, and the grades are compared against the centroids to determine
    whether they belong to the right tutor group.

    1.2 Problem Statement

    To find the best tutors, we have to assign them to a group that fits their skills
    in a particular subject. The problem is how to correctly classify tutors among
    students for a given subject.

    1.3 Objectives

    There are three main objectives in developing this system:

    1.3.1 To analyze a group recommendation for the Tutor Selection System.

    1.3.2 To design a proposed Tutor Selection System based on students' academic
    achievement using the K-Means technique.

  • 2

    1.3.3 To develop a Tutor Selection System based on students' academic
    achievement using the K-Means technique.

    1.4 Scope

    There are two scopes in this system:

    1.4.1 Scope Admin

    1.4.1.1 Admin can log in to the system.

    1.4.1.2 Admin can manage the part-time tutor profiles.

    1.4.1.3 Admin can create, update, and delete user profiles.

    1.4.2 Scope Student

    1.4.2.1 Students can register in the system.

    1.4.2.2 Students can add, update and delete their details in the system.

    1.4.2.3 Students need to fill in the profile form and the educational form in
    the system.

    1.4.2.4 Students can view the subjects recommended for them to teach in the
    system.

    1.5 Limitation of Work

    1.5.1 The subject marks are entered manually by the students. It is up

    to the management to validate the data.

    1.5.2 The system can only cluster the results and give recommendations to
    part-time tutors.

    1.6 Expected Outcome

    This system is expected to group part-time tutors with similar course achievement
    and assign each group a subject to teach that suits their skills. Finally, students
    are given a list of recommended subjects suited to their group.

  • 3

    1.7 Report Structure

    This report has five (5) chapters. Chapter 1 covers the project background,
    problem statement, objectives and system scope. Chapter 2 is the literature
    review, which examines previous systems and the K-Means clustering algorithm.
    Chapter 3 describes the methodology, which follows the iterative model, together
    with the system's framework and design. Chapter 4 covers implementation and
    results. Lastly, Chapter 5 concludes the whole project.

  • 4

    Chapter 2

    LITERATURE REVIEW

    2.1 Introduction

    This chapter reviews the literature on the technique used to develop My Part Time
    Tutor Selection System, which is based on students' subject achievement and uses
    the K-Means clustering algorithm.

    2.2 Similar System

    2.2.1 Manual System

    My Part Time Tutor Selection System Using K-Means Algorithm is a project built to
    help an organization choose the best tutors among students. The system chooses a
    tutor based on the subjects the student is good at: tutors are selected according
    to their achievement in a particular subject, calculated from their grades in that
    subject. This is because not every student is good at every subject they take.
    Some have a strong understanding of, and good achievement in, a particular
    subject. These are the students we want, so that they can teach others who are not
    good at the subject. At present, the normal procedure for selecting a tutor,
    lecturer or teacher is based on CGPA and an interview session. This method does
    not guarantee that the selected tutor is good at the given job scope, because it
    lacks selection based on achievement in a specific subject.

  • 5

    2.2.2 Part Time Post

    Figure 2.1 Part Time Post

    Figure 2.1 above shows the Part Time Post system, which offers many part-time jobs
    to users based on the requirements that have been set. The system is very helpful
    for those looking for a part-time job, including work as a tutor.

    2.2.3 E-Rezeki

    Figure 2.2 E-Rezeki website

    Figure 2.2 above shows the E-Rezeki system, which integrates the Part Time Post
    system so that tutors can easily find a job anywhere.

  • 6

    2.3 K-Means Clustering Algorithm

    2.3.1 What is clustering technique

    Clustering is a technique for finding groups of similar items, called clusters, in
    data. It attempts to group the individuals in a population by similarity, without
    being driven by a specific purpose. Clustering is often called unsupervised
    learning, as the data carry no prescribed labels and no class values denoting an
    a priori grouping of the data instances are given (Manu Jeevan, 2017). K-Means
    clustering was proposed by J.B. MacQueen (Zhang Yufang, 2003).

    2.3.2 Introduction to K-Means Clustering Algorithm

    K-Means is a method of clustering observations into a specified number of disjoint
    clusters. The ‘K’ refers to the number of clusters specified. Various distance
    measures exist to determine which observation is assigned to which cluster. The
    algorithm aims to minimize the distance between each observation and the centroid
    of its cluster by iteratively reassigning observations to clusters, and it
    terminates when no reassignment lowers the distance measure any further.

    2.3.3 K-Means Clustering Algorithm

    K-Means defines a prototype in terms of a centroid, which is usually the mean

    of a group of points and is typically applied to objects in a continuous n-

    dimensional space. The K-Means clustering technique is simple and we begin

    with a description of the basic algorithm.

    2.3.3.1 Initial Centroids Selection

    We first choose K initial centroids. A centroid (k) is a cluster centre,
    represented by the feature values of the group of nearby objects assigned to it.
    It also serves as the reference point for assigning objects to a cluster based on
    their distance to the nearest centroid. At the beginning of the assignment
    process, a set of K initial centroids needs to be

  • 7

    predetermined so that the objects can be assigned accordingly. In basic K-

    Means, these initial centroids are randomly selected among objects.
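As a minimal sketch of this random selection, the following Python fragment picks K distinct objects as the starting centroids. Python is used here only for illustration; the function name `choose_initial_centroids`, the `seed` parameter and the sample marks are assumptions and do not come from the system itself.

```python
import random

def choose_initial_centroids(points, k, seed=None):
    """Pick k distinct objects from `points` to serve as the initial centroids."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)  # a fixed seed makes the random choice repeatable
    return rng.sample(points, k)

# Hypothetical (subject A mark, subject B mark) pairs, for illustration only.
marks = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0)]
initial = choose_initial_centroids(marks, k=2, seed=0)
print(initial)
```

Because the choice is random, different runs (or seeds) can start from different centroids and may end in different clusters.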

    2.3.3.2 Nearest Cluster Assignment

    Each point is then assigned to the closest centroid, and each collection of points
    assigned to a centroid forms a cluster. The clustering process begins by measuring
    the distance from each object to each centroid (mk).

    Figure 2.3 Nearest cluster assignment formula

    where Sik is the set of objects in cluster k, k = 0 to K, and d is a feature. Each
    object is assigned to the cluster whose centroid is closest to it. Distance is
    measured using the Euclidean distance, the typical K-Means measure of nearness.
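The formula image for Figure 2.3 is not reproduced in this copy. As a hedged reconstruction, the standard Euclidean nearest-centroid rule can be written as follows, assuming $x_i$ denotes object $i$ with feature values $x_{id}$, $m_k$ is the centroid of cluster $k$, and $S_k$ is the set of objects assigned to cluster $k$. The symbols $x_i$, $S_k$ and the index $j$ are assumptions and may differ from the notation in the original figure.

```latex
\[
  \operatorname{dist}(x_i, m_k) = \sqrt{\sum_{d} \left( x_{id} - m_{kd} \right)^2},
  \qquad
  S_k = \left\{\, x_i : \operatorname{dist}(x_i, m_k) \le \operatorname{dist}(x_i, m_j)
        \ \text{for all clusters } j \,\right\}
\]
```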

    2.3.3.3 Centroids Update

    Then, the centroid of each cluster is updated based on the points assigned to the
    cluster. We repeat the assignment and update steps until no point changes cluster
    or, equivalently, until the centroids remain the same. This is the final step:
    once the objects have been reassigned, the centroid of each cluster needs to be
    recalculated.

    Figure 2.4 Centroids update formula

    where M is the total number of objects in cluster k, k = 0 to K and d = 0 to D.
    This step ensures that every object currently assigned to a cluster really belongs
    to that cluster (i.e. it is nearest to its newly computed centroid) and is far
    from the other clusters. If an object turns out to be nearer to another centroid,
    it must be reassigned to that nearest cluster. Thus the cycle of nearest cluster
    assignment (Section 2.3.3.2) and centroids update (Section 2.3.3.3) is repeated
    until the centroids of all clusters no longer change.
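The formula image for Figure 2.4 is likewise not reproduced in this copy. A hedged reconstruction of the usual centroid update, assuming $S_k$ is the set of objects currently in cluster $k$, $M = |S_k|$ is their count and $x_{id}$ is feature $d$ of object $i$, is the per-feature mean below. The symbols $x_{id}$ and $S_k$ are assumptions and may differ from the original figure.

```latex
\[
  m_{kd} = \frac{1}{M} \sum_{x_i \in S_k} x_{id}
\]
```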

  • 8

    2.3.3.4 Basic K-Means Algorithm

    1: Select K points as initial centroids.

    2: repeat

    3:     Form K clusters by assigning each point to its closest centroid.

    4:     Recompute the centroid of each cluster.

    5: until centroids do not change.
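The five steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code of My Tutor (which is written in PHP): the function name `kmeans`, the `seed` and `max_iter` parameters, the rule that an empty cluster keeps its old centroid, and the sample data are all assumptions.

```python
import math
import random

def kmeans(points, k, seed=None, max_iter=100):
    """Basic K-Means on a list of equal-length numeric tuples.

    Returns (centroids, labels), where labels[i] is the cluster index of points[i].
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # step 1: select K points as initial centroids
    labels = []
    for _ in range(max_iter):              # step 2: repeat
        # step 3: assign each point to its closest centroid (Euclidean distance)
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # step 4: recompute each centroid as the mean of its members
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # assumption: empty cluster keeps its centroid
        if new_centroids == centroids:     # step 5: stop when centroids do not change
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical (subject A, subject B) marks forming two well-separated groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(data, k=2, seed=1)
print(centroids, labels)
```

The `max_iter` cap is only a safety limit; on small data sets like this one the loop normally stops earlier because the centroids stop moving.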

  • 9

    Chapter 3

    METHODOLOGY

    3.1 Introduction

    This chapter discusses the methodology used to develop the system from the
    beginning until completion. The methodology is very important in developing the
    system because it describes, step by step, how the system is developed, and it
    serves as a reference for those who will later extend or study the system. In
    addition, a methodology is a formalized approach to implementing the Software
    Development Life Cycle (SDLC). Various models have been defined and designed for
    the software development process. The SDLC model chosen to develop this system is
    the iterative model. Every phase involved in the development of this system is
    explained in this chapter.

    3.2 Iterative Model

    Figure 3.1 Iterative Model

    In this model, the process starts from the requirements, which are enhanced
    iteratively until the final software is implemented. Development begins by
    specifying and implementing just part of the software, which can

  • 10

    then be reviewed in order to identify further requirements. This process is then
    repeated, producing a new version of the software in each cycle of the model. The
    model has four phases: the requirement phase, design phase, implementation phase
    and evaluation phase. It was chosen because it allows better testing at each
    iteration. In addition, the model does not involve high complexity, and feedback
    is generated quickly. However, it requires technical planning and is not easy to
    understand.

    3.2.1 Requirement Phase

    In this phase, the requirements for the software are gathered and analyzed.
    Iteration should eventually result in a requirements phase that produces a
    complete and final specification of requirements.

    3.2.1.1 Software Requirement

    Software used to develop the My Part Time Tutor Selection System.

    Table 3.1 List of Software

  • 11

    3.2.1.2 Hardware Requirement

    Hardware used to develop the My Part Time Tutor Selection System.

    Hardware    Description
    Laptop      HP 15-r236TX
                Processor: Intel® Core™ i3-4005U CPU @ 1.7 GHz
                RAM: 8.00 GB
                OS: Windows 10
                GPU: NVIDIA GeForce FT 820M

    Table 3.2 List of Hardware

    3.3 Analysis and Design Phase

    In this phase, the software solution that meets the requirements is designed. The
    system framework diagram, Context Diagram (CD), Data Flow Diagram (DFD) and Entity
    Relationship Diagram (ERD) are built to clarify the actual system.

    3.3.1 Framework Design

    Figure 3.2 System Framework

    The figure above shows an overview of the system. Both admin and student register
    and log in to the system. The admin updates the available tutor subjects in the
    system, and students can view and apply for as many subjects as they

  • 12

    want. When applying for a subject, they enter their marks for the required
    subjects, and the marks are processed using the K-Means technique in the system.
    Once the calculation is done, the result is given to the admin for evaluation, and
    the admin updates the student on whether the application was successful.

    3.3.2 System Design

    3.3.2.1 Context Diagram

    A system context diagram (CD) is a diagram that defines the boundary

    between the system, or part of a system, and its environment, showing the

    entities that interact with it. This diagram is a high-level view of a system.

    Figure 3.3 Context Diagram

    The figure above shows the overall flow of the whole system, which involves two
    entities: Student and Admin.

    3.3.2.2 Data Flow Diagram

    A data flow diagram (DFD) is a graphical representation of the “flow” of data

    through an information system, modeling its process aspects. A DFD is often

    used as a preliminary step to create an overview of the system without going

    into great detail, which can later be elaborated.

  • 13

    3.3.2.2.1 Data Flow Diagram Level – 0

    Figure 3.4 Data Flow Diagram Level-0 [Admin]

    The figure above shows the DFD Level-0 for Admin, which includes six processes.
    First, the admin registers in the system and goes directly to the admin site.
    There, the admin updates the available subjects in the system and checks whether
    there are applications from students. Finally, the admin produces a report on the
    students chosen for each subject.

  • 14

    Figure 3.5 Data Flow Diagram Level-0 [Student]

    The figure above shows the DFD Level-0 for Student, which includes six processes.
    First, the student registers in the system and views the dashboard. Next, the
    student can view the list of available subjects and enter their subject marks.
    The student can then see their subject history and waits for the admin to update
    the report on the subject to teach.

    3.3.2.3 Entity Relationship Diagram

    Entity relationship diagram (ERD) is a graphical representation of entities and

    their relationships to each other, typically used in computing in regard to the

    organization of data within databases or information systems.

  • 15

    Figure 3.6 Entity Relationship Diagram

    The figure above shows the ERD of the system, which includes five entities and six
    relations.

    3.3.3 Data Model

    A data model is an abstract model that organizes elements of data and standardizes
    how they relate to one another and to the properties of real-world entities.

    3.3.3.1 Admin

    #  Name        Type          Pk/Fk        Description
    1  id          int(11)       Primary Key
    2  Username    varchar(255)
    3  Password    varchar(255)
    4  AdminPhoto  varchar(255)

    Table 3.3 Admin Data Model

    Table above shows the details of admin data.

  • 16

    3.3.3.2 Student

    #  Name          Type           Pk/Fk        Description
    1  TutorRegno    varchar(255)   Primary Key
    2  TutorPhoto    varchar(255)
    3  TutorName     varchar(255)
    4  TutorCgpa     decimal(10,2)
    5  TutorPwd      varchar(255)
    6  TutorRegdate  timestamp                   CURRENT_TIMESTAMP

    Table 3.4 Student Data Model

    Table above shows the details of student data.

    3.3.3.3 Subject

    #  Name       Type          Pk/Fk        Description
    1  subcode    varchar(255)  Primary Key
    2  subname    varchar(255)
    3  subcreate  timestamp                  CURRENT_TIMESTAMP

    Table 3.5 Subject Data Model

    Table above shows the details of subject data.

    3.3.3.4 Subject Enrollment

    #  Name        Type           Pk/Fk        Description
    1  id          int(255)       Primary Key
    2  subcode     varchar(255)   Foreign Key  Table subject
    3  subgrade    decimal(10,2)
    4  TutorRegno  varchar(500)   Foreign Key  Table Student
    5  subenroll   timestamp                   CURRENT_TIMESTAMP

    Table 3.6 Subject Mark Data Model

    Table above shows the details of subject mark data where the subcode is taken

    from table subject and TutorRegno is taken from table student.
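The foreign-key relationship described above can be illustrated with a small, self-contained Python sketch. The actual system stores its data in MySQL; SQLite is swapped in here only so the example runs without a database server. The table names `student`, `subject` and `subject_enrollment`, the reduced column lists and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE student (
    TutorRegno VARCHAR(255) PRIMARY KEY,
    TutorName  VARCHAR(255)
);
CREATE TABLE subject (
    subcode VARCHAR(255) PRIMARY KEY,
    subname VARCHAR(255)
);
CREATE TABLE subject_enrollment (
    id         INTEGER PRIMARY KEY,
    subcode    VARCHAR(255) REFERENCES subject(subcode),
    subgrade   DECIMAL(10,2),
    TutorRegno VARCHAR(255) REFERENCES student(TutorRegno),
    subenroll  TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO student (TutorRegno, TutorName) VALUES (?, ?)",
             ("S001", "Example Student"))
conn.execute("INSERT INTO subject (subcode, subname) VALUES (?, ?)",
             ("SUB101", "Example Subject"))
conn.execute(
    "INSERT INTO subject_enrollment (subcode, subgrade, TutorRegno) VALUES (?, ?, ?)",
    ("SUB101", 3.67, "S001"),
)

# A mark for a subject code that does not exist in `subject` is rejected.
try:
    conn.execute(
        "INSERT INTO subject_enrollment (subcode, subgrade, TutorRegno) VALUES (?, ?, ?)",
        ("UNKNOWN", 4.0, "S001"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Declaring the two foreign keys lets the database itself refuse a mark that points to a subject or student that is not registered, instead of relying only on checks in the application code.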

  • 17

    3.3.3.5 Subject Group

    #  Name       Type          Pk/Fk        Description
    1  group_id   varchar(255)  Primary Key
    2  groupname  varchar(255)
    3  subA       varchar(255)  Foreign Key  Table Subject
    4  subB       varchar(255)  Foreign Key  Table Subject

    Table 3.7 Subject Group Data Model

    The table above shows the details of subject group data, where the admin sets the
    two requirement subjects for each group, subA and subB.

    3.3.3.6 Student Group

    #  Name        Type          Pk/Fk        Description
    1  tgId        int(255)      Primary Key
    2  TutorRegno  varchar(25)   Foreign Key  Table Student
    3  group_id    varchar(255)  Foreign Key  Table subject Group

    Table 3.8 Student Group Data Model

    The table above shows the details of student group data, which records the final
    students chosen for a subject group and required to teach that subject.

    3.3.3.7 Kmeans

    #  Name        Type          Pk/Fk        Description
    1  kmeans_id   int(255)      Primary Key
    2  TutorRegno  varchar(255)  Foreign Key  Table Student
    3  subA        float         Foreign Key  Table subject
    4  subB        float         Foreign Key  Table subject
    5  cluster     int(255)

    Table 3.9 Kmeans Data Model

    The table above shows the details of K-means group data, where the marks for the
    requirement subjects are computed and each student is placed in a specific
    cluster.

  • 18

    3.3.3.8 Academic

    #  Name          Type          Pk/Fk        Description
    1  a_id          int(255)      Primary Key
    2  a_department  varchar(255)
    3  a_course      varchar(255)
    4  a_sem         varchar(255)
    5  TutorCgpa     varchar(255)
    6  TutorRegno    varchar(255)

    Table 3.10 Academic Data Model

    Table above shows the details of academic group data.

    3.3.4 Technique

    3.3.4.1 K-Means Clustering

    K-Means clustering is one of the simplest unsupervised learning techniques for
    solving the clustering problem. The procedure follows a simple way to classify a
    given data set into a certain number of clusters (assume k clusters) fixed
    a priori. In this project, two subject marks are selected for each student, based
    on their subject achievement. Below is an example of subject marks as listed in
    the records.

    Figure 3.7 Subject Mark

  • 19

    Figure 3.8 Subject Mark Graph

    There are three main steps in calculating K-Means clustering:

    3.3.4.1.1 Define k centroids, one for each cluster.

    First, an initial centroid is chosen at random for each cluster. In this example,
    the initial centroid of cluster one is (1.0, 1.0) and the initial centroid of
    cluster two is (3.0, 4.0). These initial centroids are used to calculate the
    Euclidean distance from each object to each centroid, so that the object can be
    assigned to the nearest one.

    Table 3.11 Define Centroid Example

    These centroids should be placed carefully, because different starting locations
    lead to different results. It is therefore better to place them as far away from
    each other as possible.

    3.3.4.1.2 Take each point belonging to the given data set and associate it with
    the nearest centroid.

    The clustering process begins by measuring the distance from each object to each
    centroid.

    Calculation for Record 2:

    Cluster 1 = Record 1 (1.0, 1.0)    Cluster 2 = Record 3 (3.0, 4.0)

  • 20

    Euclidean distance to cluster 1 = $\sqrt{(1.5 - 1.0)^2 + (2.0 - 1.0)^2} \approx 1.12$

    Euclidean distance to cluster 2 = $\sqrt{(1.5 - 3.0)^2 + (2.0 - 4.0)^2} = 2.5$

    The distance to cluster 1 is less than the distance to cluster 2, so Record 2 is
    placed in cluster 1. Cluster 1 therefore contains records 1 and 2.
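The calculation for Record 2 can be checked with a few lines of Python. This is only a verification sketch of the worked example; the variable names are chosen for illustration, and `math.dist` requires Python 3.8 or later.

```python
import math

record_2 = (1.5, 2.0)
centroid_1 = (1.0, 1.0)   # initial centroid of cluster 1
centroid_2 = (3.0, 4.0)   # initial centroid of cluster 2

d1 = math.dist(record_2, centroid_1)   # sqrt(0.5**2 + 1.0**2)
d2 = math.dist(record_2, centroid_2)   # sqrt(1.5**2 + 2.0**2)
nearest = 1 if d1 < d2 else 2          # cluster with the smaller distance

print(round(d1, 2), round(d2, 2), nearest)
```

The rounded distances agree with the hand calculation, and the smaller one identifies cluster 1 as the nearest cluster for Record 2.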

    When no point is pending, the first step is done. At this point, k new centroids
    must be recalculated as the centres of the clusters resulting from the previous
    step.

    3.3.4.1.3 After obtaining these k new centroids, a new binding has to be made
    between the same data points and the nearest new centroid.

    This is the last step: once the objects have been reassigned, the centroid of each
    cluster needs to be recalculated. After record 2 has been assigned to cluster 1,
    we need to calculate the new means.

    Cluster   1             2
    Record    1, 2          3 (no change)
    Means     (1.25, 1.5)   (3.0, 4.0)

    Table 3.12 Calculation of New Means

    New means for cluster 1 = $\left( \frac{1 + 1.5}{2}, \frac{2 + 1}{2} \right) = (1.25, 1.5)$
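The new mean of cluster 1 can be reproduced in Python by averaging each coordinate of its records. The variable names are illustrative assumptions; the two records are the values used in the worked example.

```python
cluster_1 = [(1.0, 1.0), (1.5, 2.0)]   # records 1 and 2

# Average each coordinate (subject mark) separately across the cluster's records.
new_mean = tuple(sum(axis) / len(cluster_1) for axis in zip(*cluster_1))
print(new_mean)
```

Cluster 2 still contains only record 3, so its mean stays at (3.0, 4.0) and needs no recalculation.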

    Thus a loop is formed. As a result of this loop, the k centroids change their
    location step by step until no more changes occur. In other words, the centroids
    no longer move.

  • 21

    3.3.4.2 Implementation K-means Clustering Algorithm in My Tutor

    3.3.4.2.1 Declaration and set alternative function.

    3.3.4.2.2 Initialized Centroid

  • 22

    3.3.4.2.3 Assign Cluster

  • 23

    3.3.4.2.4 Update Centroids

    3.3.5 Summary

    In conclusion, choosing the right development methodology is important because it affects the whole development process. A suitable methodology helps the project to proceed smoothly. In addition, the design and framework are important because they give a clear picture of the system, so that it can be built smoothly with a good system flow.


    Chapter 4

    IMPLEMENTATION AND RESULT

    4.1 Introduction

    Implementation and testing are carried out to ensure that the system is developed according to its main objective and meets the user requirements. This chapter presents the results of the My Part Time Tutor Selection System, also called the My Tutor System, that has been developed.

    4.2 Implementation of My Tutor System

    Several languages and tools have been used to develop the My Tutor System. For the interface template, Bootstrap 3.0 and startbootstrap-agency-gh-pages have been used. On the server side, PHP (Hypertext Preprocessor) has been used as the programming language. PHP is widely used because it is an open-source, general-purpose scripting language that can be embedded into HTML and is well suited to Web development.

    For validation, this system uses HTML5, PHP and JavaScript. Validation is important because it reduces the chance that users make simple mistakes when they key in their data. For example, users are required to enter their data in every input form; if they skip the data entry, they cannot submit the form. Finally, an open-source database has been used in this system: MySQL, version 10.1.22-MariaDB. Apache version 2.4.25 is used to run the localhost server, and Visual Studio Code is used for writing the code.


    4.3 Design Interface

    The interface design is divided into two parts: the Admin pages and the User pages.

    4.3.1 Main Interface

    Figure 4.1 Main Interface

    Figure 4.1 above shows the main login interface for both Admin and User.

    4.3.2 Register Page

    Figure 4.2 Register Page

    Figure 4.2 above shows the register page for users. Users are required to fill in their full name, ID number and password.


    4.3.3 Dashboard Page

    Figure 4.3 Dashboard Page

    Figure 4.3 above shows the dashboard page for both Admin and User.

    4.3.4 Profile Page

    Figure 4.4 Profile page

    Figure 4.4 above shows the profile page for users. Users can view their name and matric number. They can also update their full name and profile picture.


    4.3.5 Academic Page

    Figure 4.5 Academic Page

    Figure 4.5 above shows the academic page. Users can update their academic details such as CGPA, faculty, course and semester.

    4.3.6 Subject Page

    Figure 4.6 Subject Page

    Figure 4.6 above shows the subject page, where users need to choose a subject and enter their mark for that subject.


    4.3.7 History Page

    Figure 4.7 History Page

    Figure 4.7 above shows the enrolment history page for users. Users can view all of the subjects that they have keyed in, update their subject marks or delete a subject.

    4.3.8 Report Pages

    Figure 4.8 Report Pages

    Figure 4.8 shows the report page for users. Users can view the full details of their personal and academic information. At the bottom of the report, they can view the group recommended for them to teach.


    4.3.9 Profile Page

    Figure 4.9 Profile Page

    Figure 4.9 shows the profile page for the admin, where the admin can view and update his or her personal information.

    4.3.10 Manage Tutor Pages

    Figure 4.10 Manage Tutor Pages

    Figure 4.10 above shows the tutor pages for the admin, where the admin can view all of the tutors who have registered in the system and can delete tutors who are no longer active.


    4.3.11 Manage Subject Pages

    Figure 4.11 Manage Subject Pages

    Figure 4.11 above shows the subject pages managed by the admin. In this section, the admin can add a newly available subject, edit a subject name and delete a subject that is no longer available.

    4.3.12 Manage Group Page

    Figure 4.12 Manage Group Page

    Figure 4.12 above shows the group page managed by the admin. In this section, the admin needs to add two subjects that fit the group criteria. The admin can also edit the subject criteria and delete the group.


    4.3.13 Manage History Page

    Figure 4.13 Manage History Page

    Figure 4.13 above shows the history page managed by the admin. In this section, the admin can view all of the subjects that users have enrolled in and can delete a user's history.

    4.3.14 Calculation Page

    Figure 4.14 Calculation Page

    Figure 4.14 above shows the calculation page managed by the admin. In this section, the admin has to select the group name and the number of clusters to start the calculation.


    4.3.15 K-means Page

    Figure 4.15.1 K-means clustering table

    Figure 4.15.1 above shows the clustering result for three clusters in a table. In this section, the admin can add a chosen student to the group and can delete a student from the cluster if the student is already assigned to another group.

    Figure 4.15.2 K-means clustering graph

    Figure 4.15.2 above shows the clustering result for three clusters in a graph. In this section, the admin can view the clusters visually, which makes it easier to choose the right student by looking for the highest cluster at the top of the graph.
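Reading "the highest cluster" off the graph amounts to ranking the clusters by their centroids. The sketch below shows one possible way to do this in Python, assuming that a cluster is ranked by the average of its two centroid marks. This ranking rule, the function name `rank_clusters` and the centroid values are all hypothetical and are not taken from the My Tutor system or its data.

```python
# Hypothetical centroids (Subject Mark A, Subject Mark B) for three clusters.
# The values are made up for illustration only.
centroids = {
    "Cluster 1": (55.0, 60.0),
    "Cluster 2": (85.0, 90.0),
    "Cluster 3": (70.0, 72.5),
}

def rank_clusters(centroids):
    """Order cluster names from the highest to the lowest average centroid mark."""
    return sorted(
        centroids,
        key=lambda name: sum(centroids[name]) / len(centroids[name]),
        reverse=True,
    )

ranking = rank_clusters(centroids)
print(ranking)  # the first entry is the cluster that would appear highest on the graph
```

With these example values, Cluster 2 ranks first, so its members would be the first candidates for the admin to consider.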


    4.3.16 Report

    Figure 4.16 Report Page

    Figure 4.16 above shows the report of the clustering result. The admin can view the students who have been chosen for the recommendation group.

    4.4 Testing Analysis

    After the development of the system is complete, the system is tested using two software testing techniques, black box testing and white box testing, in order to examine the functionality of the system.

    4.4.1 Black Box Testing

    The modules involved in this testing are:

    I. Login

    II. Create, retrieve, update and delete subject

    4.4.2 White Box Testing

    The modules involved in this testing are:

    I. Generate Tutor’s Clustering Result

    II. Generate K-Means Clustering Graph


    4.5 Test Cases

    A test case is a set of conditions or variables under which a tester determines whether the system under test works correctly and satisfies its requirements. The process of developing test cases may also help to find problems in the requirements or design of an application. The test cases for several processes in the My Tutor System are shown below.

    4.5.1 Login

    Step  Procedure                       Expected Result                Pass/Fail
    1.    Go to login page                Preview page loaded            Pass
    2.    Enter the following details:    Message “successfull login”    Pass
          Admin Id: Admin
          Password: admin17
    3.    Click “Login” button                                           Pass

    Table 4.1 Test Case for Successful Admin Login

    4.5.2 Admin

    Step  Procedure                       Expected Result                Pass/Fail
    1.    Click “subject” page            Preview page loaded            Pass
    2.    Click “add” button                                             Pass
    3.    Enter the following details:                                   Pass
          Subject Code: C001
          Subject Name: English
    4.    Click “yes” button              Message “New data added”       Pass

    Table 4.3 Test Case Add Subject


    Step  Procedure                       Expected Result                Pass/Fail
    1.    Click “subject” page            Preview page loaded            Pass
    2.    Click “update” button                                          Pass
    3.    Enter the following details:                                   Pass
          Subject Code: C002
          Subject Name: English and communication
    4.    Click “yes” button              Message “Data Updated”         Pass

    Table 4.4 Test Case Update Subject

    Step  Procedure                  Expected Result                          Pass/Fail
    1.    Click “subject” page       Preview page loaded                      Pass
    2.    Click “delete” button      Message “are you sure want to delete”    Pass
    3.    Click “yes” button         One row deleted from table               Pass

    Table 4.5 Test Case Delete Subject

    4.6 Summary

    In conclusion, this chapter briefly discussed the implementation of the code, the interface design and the testing of the final system. After all of the elements were tested, the system showed excellent results, with everything working as planned.


    Chapter 5

    CONCLUSION

    5.1 Introduction

    This chapter presents the conclusion of this project. It covers a summary of the whole project, the project contribution, the project limitations and some suggestions for the future.

    5.2 Project Contribution

    The My Tutor System has been developed for final-year students in the Faculty of Informatics and Computing at UniSZA. It has achieved the objectives and scope of this project. The achievements of this project are listed below:

    5.2.1 The system generates students’ achievement groups using K-Means clustering.

    5.2.2 The system recommends to each student a tutor group that suits their skills.

    5.2.3 The system benefits the company by placing good tutors in its classrooms.

    5.3 Result Discussion

    Generally, this project has been carried out in line with the objectives explained in Chapter 1. This project introduced the two main criteria used to calculate the K-Means clustering result, namely the marks of two subjects. In addition, this project provides a better way for students to know their recommended group to teach, and it helps the admin to assign the best students to a group.

    5.4 Project Constraint and Limitations

    A few problems and limitations occurred throughout the development of this project. They are:


    5.4.1 The system is limited by default to only two criteria, Subject Mark A and Subject Mark B.

    5.4.2 The number of clusters is limited by default to between 1 and 5.

    5.4.3 The subject marks need to be entered manually by the student.

    5.5 Future Work

    Some suggestions can be made to upgrade the system to be more efficient in the future. The suggestions are:

    5.5.1 Allow applicants to upload a resume to become a tutor.

    5.5.2 Add more criteria and remove the limit of 1 to 5 clusters.

    5.5.3 Include the price per hour of tutoring lessons.

    5.6 Summary

    The My Part Time Tutor Selection System focuses on recommending a subject for the student to teach. Based on previous studies and discussion with the supervisor, the approach implemented in this project is the K-Means clustering technique. The system helps the admin to select the best student for the tutored subject by looking at the cluster ranking on the graph. Hopefully, this system will help students to become tutors who suit the subject requirements.


    REFERENCES

    Ju, C., & Xu, C. (2013). A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Algorithm, 2013.

    Kodinariya, T. M., & Makwana, P. R. (2013). Review on determining number of Cluster in K-Means Clustering. International Journal of Advance Research in Computer Science and Management Studies, 1(6), 2321–7782.

    Li, C. S. (2011). Cluster center initialization method for K-means algorithm over data sets with two clusters. Procedia Engineering, 24, 324–328. https://doi.org/10.1016/j.proeng.2011.11.2650

    Li, Y., & Wu, H. (2012). A Clustering Method Based on K-Means Algorithm. Physics Procedia, 25, 1104–1109. https://doi.org/10.1016/j.phpro.2012.03.206

    Yadav, S., Bharadwaj, B., & Pal, S. (2012). Data mining applications: A comparative study for predicting student’s performance. International Journal of Innovative Technology & Creative Engineering, 1(12), 13–19. Retrieved from http://arxiv.org/abs/1202.4815



    APPENDIX