CHATBOT DEVELOPMENT IN DATA REPRESENTATION FOR DIABETES EDUCATION ABBAS SALIIMI BIN LOKMAN MASTER OF SCIENCE (COMPUTER) UNIVERSITI MALAYSIA PAHANG

ABBAS SALIIMI BIN LOKMAN - Institutional repositoryumpir.ump.edu.my/17818/1/Chatbot development in data representation for... · ABBAS SALIIMI BIN LOKMAN MASTER OF SCIENCE (COMPUTER)


CHATBOT DEVELOPMENT IN DATA REPRESENTATION FOR DIABETES EDUCATION

ABBAS SALIIMI BIN LOKMAN

MASTER OF SCIENCE (COMPUTER)

UNIVERSITI MALAYSIA PAHANG


CHATBOT DEVELOPMENT IN DATA REPRESENTATION FOR DIABETES EDUCATION

ABBAS SALIIMI BIN LOKMAN

Thesis submitted in fulfilment of the requirements for the award of the degree of Master of Science (Computer)

Faculty of Computer Systems & Software Engineering UNIVERSITI MALAYSIA PAHANG

MAY 2011


SUPERVISOR’S DECLARATION

I hereby declare that I have checked this thesis and in my opinion, this thesis is adequate in terms of scope and quality for the award of the degree of Master of Science in Computer.

Signature:

Name of Supervisor: ASSOC. PROF. DR. JASNI BINTI MOHAMAD ZAIN

Position: DEAN,

FACULTY OF COMPUTER SYSTEMS

& SOFTWARE ENGINEERING,

UNIVERSITI MALAYSIA PAHANG

Date:


STUDENT’S DECLARATION

I hereby declare that the work in this thesis is my own except for quotations and summaries which have been duly acknowledged. The thesis has not been accepted for any degree and is not concurrently submitted for the award of any other degree.

Signature:

Name: ABBAS SALIIMI BIN LOKMAN

ID Number: MCC08004

Date:

The following articles have been published as a direct result of this research:

1. Abbas S. Lokman and Jasni M. Zain, “One-Match and All-Match Categories for Keywords Matching in Chatbot,” American Journal of Applied Sciences 7 (10): 2010, pp. 1406-1411, DOI: 10.3844/ajassp.2010.1406.1411.

2. Abbas S. Lokman and Jasni M. Zain, “Extension and Prerequisite: An Algorithm to Enable Relations Between Responses in Chatbot Technology,” Journal of Computer Science 6 (10): 2010, pp. 1212-1218, DOI: 10.3844/jcssp.2010.1212.1218.

3. Abbas S. Lokman and Jasni M. Zain, “Chatbot Enhanced Algorithms: A Case Study on Implementation in Bahasa Malaysia Human Language,” Networked Digital Technologies, F. Zavoral et al., eds., Springer-Verlag, 2010, ISBN: 978-3-642-14291-8, pp. 31-44, DOI: 10.1007/978-3-642-14292-5_5.


4. Abbas S. Lokman and Jasni M. Zain, “An Architectural Design of Virtual Dietitian (ViDi) for diabetic patients,” Proc. 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT 09), IEEE Press, August 2009, pp. 408-411, DOI: 10.1109/ICCSIT.2009.5234671.

5. Abbas S. Lokman and Jasni M. Zain, “Designing a Chatbot for diabetic patients,” Proceedings of International Conference on Software Engineering and Computer Systems, ICSECS 2009.

6. Jasni M. Zain and Abbas S. Lokman, “Developing Multimedia Content for Diabetes Education (in English and Malay Language): A First Prototype in Developing Content to Educate, Motivate and Monitor (Public and Diabetic),” Proceedings of The 1st Makassar International Conference on Electrical Engineering & Informatics, MICEEI 2008.

7. Abbas S. Lokman and Jasni M. Zain, “Sequence Words Deleted (SWD) Technique for Pattern Matching in Chatbot,” Proceedings of Malaysian Technical Universities Conference on Engineering and Technology, MUCEET 2009.

8. Abbas S. Lokman and Jasni M. Zain, “Sentence Processing Algorithm for Chatbot: An Attempt to Solve the Garden-path Effect,” presented at the National Conference of Postgraduate Research, NCON-PGR 2009.

The following award has been received as a direct result of this research:

1. Itex Gold Medal at the 19th International Invention, Innovation & Technology Exhibition (ITEX 2008, Kuala Lumpur, Malaysia)


ACKNOWLEDGEMENTS

Alhamdulillah. I am grateful and would like to express my sincere gratitude to my supervisor, Associate Professor Dr. Jasni binti Mohamad Zain, for her invaluable guidance, continuous encouragement and constant support in making this research possible. My honest appreciation goes to her for every opportunity she created, every lesson she taught and every support she gave, from before I enrolled in UMP’s graduate program until this concluding moment. I am also truly grateful for her continuous vision about my practice in science, her tolerance of my naïve mistakes, and her commitment to my future career. I would also like to express my very special thanks to three medical doctors from International Islamic University Malaysia (IIUM) Kuantan Campus, Prof. Dr. Mohammed Fauzi bin Abdul Rani, Dr. Marzuki bin Omar and Dr. Mohammad Yousuf Rathor, for their support in completing this research.

I acknowledge my sincere indebtedness and gratitude to my family and in-laws for their love, support and sacrifice throughout my life, especially my mother, Mariani binti Mat, who has always been there for me. I acknowledge my appreciation and thankfulness to my wife, Nurliyana binti Mohd Johari, for her absolute support in every part of my life as my best friend, my inspiration, my counsellor, my supporter, my teacher and in the many more roles that she has played. I cannot find the exact words to describe everything that she has done and sacrificed for me. All the thanks in the world for you, my dear.

My sincere thanks also extend to all the staff of the Faculty of Computer Systems & Software Engineering, UMP, IIUM and Hospital Tengku Ampuan Afzan (HTAA) Kuantan for their direct and indirect support throughout this research. Thank you all.


TABLE OF CONTENTS

Page

TITLE PAGE i

SUPERVISOR’S DECLARATION ii

STUDENT’S DECLARATION iii

DEDICATION v

ACKNOWLEDGEMENTS vi

ABSTRACT vii

ABSTRAK viii

TABLE OF CONTENTS ix

LIST OF TABLES xiii

LIST OF FIGURES xiv

LIST OF ABBREVIATIONS xvi

CHAPTER 1 INTRODUCTION

1.1 Introduction 1

    1.1.1 An Application of Known Technique to a Chosen Domain 2
    1.1.2 An Extension and Improvement of Existing Technique 3

1.2 Research Motivation and Problem Statements 3

1.3 Research Questions 4

1.4 Research Objectives 4

1.5 Research Methodology 4

1.6 E-CARE Components and the Previous Study 6

    1.6.1 The Previous Study: “Expectation and Feasibility of a Computer Aided Education in Diabetes Urban Area in Malaysia: Views from Patients, Healthcare Staff and Hospital Administrators” (Jasni and Fauzi, 2008) 6
    1.6.2 Multimedia Content 7
    1.6.3 ViDi Chatbot 8

1.7 Conclusion 9


CHAPTER 2 LITERATURE REVIEW

2.1 Introduction 10

2.2 Diabetes 10

    2.2.1 Significance of Diabetes Education 11
    2.2.2 Diabetes Educational Content Application 12

2.3 Educational Multimedia Design 13

    2.3.1 Analysis 13
    2.3.2 Audience, Needs and Delivery Environment 14

2.4 Chatbot 14

    2.4.1 ELIZA 15
    2.4.2 A.L.I.C.E. 16
    2.4.3 VPbot 18
    2.4.4 Focus of Study Within Chatbot Technology (Areas to be Improved) 19
        2.4.4.1 Responses’ Relations 19
        2.4.4.2 Keywords Matching 23
        2.4.4.3 Synonyms Replacement 28
        2.4.4.4 Response Selection or Generation 31

2.5 Conclusion 33

CHAPTER 3 E-CARE MULTIMEDIA CONTENT

3.1 Introduction 34

3.2 Planning 34

    3.2.1 Field Study 35
    3.2.2 Data Collection and Content Development 35
    3.2.3 Final Prototype Released 36

3.3 Architectural Design 36

3.4 Development 37

    3.4.1 Technology Used 37
    3.4.2 Interface Design 38

3.5 Conclusion 46


CHAPTER 4 ViDi CHATBOT

4.1 Introduction 47

4.2 Proposal and Planning 47

4.3 Architectural Design 49

4.4 Development 50

    4.4.1 Technology Used 51
    4.4.2 Interface Design 51
    4.4.3 Database Design 54
    4.4.4 Prototypes and Final Product 56

4.5 Proposed Approaches 57

    4.5.1 Vpath 57
    4.5.2 Sequence Words Deleted (SWD) 60
    4.5.3 Extension and Prerequisite 65
    4.5.4 One-Match and All-Match Categories (OMAMC) 70
    4.5.5 Synonyms and Root-words 72
    4.5.6 General Words Percentage (GWP) 75

4.6 Conclusion 77

CHAPTER 5 RESULTS AND DISCUSSIONS

5.1 Introduction 78

5.2 E-CARE Multimedia Content 78

5.3 ViDi Chatbot 80

    5.3.1 Vpath and Sequence Words Deleted (SWD) 80
    5.3.2 Extension and Prerequisite 83
    5.3.3 Other Proposed Approaches 87

5.4 Significance of Results 93

5.5 Conclusion 95

CHAPTER 6 CONCLUSION AND RECOMMENDATIONS

6.1 Introduction 96

6.2 Conclusion 96

6.3 Significant Contributions 97

6.4 Recommendations 99


REFERENCES 100

APPENDICES 103

A UI for E-CARE Multimedia Content Final Released Version 103

B Medical Doctors’ Curriculum Vitae 104

C Medical Doctors’ Comments on E-CARE 138

D ViDi Chatbot Complete Syntaxes 139

E Itex Gold Medal Certificate 189


LIST OF TABLES

Table No. Title Page

4.1 ViDi’s database table “responses” 54
4.2 ViDi’s database table “keywords” 55
4.3 ViDi’s database table “synonyms” 55
4.4 ViDi’s database table “rootWords” 56
4.5 ViDi’s database table “generalWords” 56
4.6 Sample array of possible variables to be matched 62
4.7 Root-words database arrangement 73
5.1 Incremental parsing result 81
5.2 SWD result 81
5.3 AIML result 88
5.4 VPbot’s keywords set result 88
5.5 One-match and All-match keywords result 89
5.6 Sample GWP calculation for different utterances variations 92
5.7 Extension and Prerequisite against topic mechanism 93
5.8 OMAMC against AIML Graphmaster and VPbot’s keywords set 94
5.9 Synonyms and Root-words against Synonyms alone 94
5.10 GWP against AIML Graphmaster and VPbot’s algorithm on response selection 94


LIST OF FIGURES

Figure No. Title Page

2.1 General chatbot process model 20
2.2 Normalization process within chatbot algorithm 29
2.3 General chatbot process model for response selection/generation 33
3.1 Live shot from diabetes patient-doctor session during the field study 35
3.2 E-CARE multimedia content architectural design model 37
3.3 E-CARE multimedia content version 1 (Start Page) 38
3.4 E-CARE multimedia content version 1 (Content Page) 39
3.5 E-CARE multimedia content version 2 (Start Page) 40
3.6 E-CARE multimedia content version 2 (Content Page for Public) 40
3.7 E-CARE multimedia content version 2 (Content Page for Diabetic) 41
3.8 E-CARE multimedia content final released version (Start Page) 42
3.9 E-CARE multimedia content final released version (Home Page) 42
3.10 E-CARE multimedia content final released version (Public Home Page) 43
3.11 E-CARE multimedia content final released version (Public Content Page) 43
3.12 E-CARE multimedia content final released version (Diabetic Home Page) 44
3.13 E-CARE multimedia content final released version (Diabetic Menu Page) 44
3.14 E-CARE multimedia content final released version (Diabetic Content Page sample 1) 45
3.15 E-CARE multimedia content final released version (Diabetic Content Page sample 2) 45


4.1 ViDi architectural design model 49
4.2 ViDi’s chatting interface 52
4.3 vBrain: Managing ViDi’s Responses data 53
4.4 vBrain: Managing ViDi’s each Response, Keywords and Extension/s data 53
4.5 vBrain: Managing Synonyms, Root-words and General words data 54
4.6 A sample path taken by patient in conversation with ViDi 58
4.7 Process flow for ViDi version one (Virtual Dietitian) 59
4.8 SWD flowchart 63
4.9 General chatbot process model 66
4.10 General chatbot process model with response’s relations 66
4.11 vBrain: Add new response UI 67
4.12 Sample relations between responses regarding the implementation of Extension and Prerequisite 68
4.13 vBrain’s UI for managing each Response, Keywords (One-match, All-match and Prerequisite keywords) and Extension data 69
4.14 Sample of One-match and All-match keywords in the vBrain 71
4.15 vBrain’s UI for managing Synonyms and Root-words data 73
4.16 Normalization process within chatbot algorithm with the component of Root-words replacement 74
4.17 Chatbot process model for response selection/generation with incorporation of GWP component 76
4.18 Sample database table for ViDi’s responses data with GWP value (first column from the right) 76
5.1 Sample conversation with ViDi (result 1) 83
5.2 Sample conversation with ViDi (result 2) 84


LIST OF ABBREVIATIONS

AIML Artificial Intelligence Markup Language
AJAX Asynchronous JavaScript + XML
A.L.I.C.E. Artificial Linguistic Internet Computer Entity
CAE Computer Aided Education
CD Compact Disc
CSS Cascading Style Sheets
DBMS Database Management System
DOM Document Object Model
GUI Graphical User Interface
GWP General Words Percentage
HTML Hypertext Markup Language
IIUM International Islamic University Malaysia
MIT Massachusetts Institute of Technology
OMAMC One-Match and All-Match Categories
PHP Hypertext Preprocessor
SQL Structured Query Language
SWD Sequence Words Deleted
UI User Interface
ViDi Virtual Diabetes physician
XML Extensible Markup Language