June 2019
DOI: http://doi.org/10.11591/ijece.v9i3
Table of Contents
The effect of load modelling on phase balancing in distribution networks using search harmony algorithm
Saeid Eftekhari, Mahmoud Oukati Sadegh
Total views : 717 times
1461-1471
Vol 9, No 3 http://ijece.iaescore.com/index.php/IJECE/issue/view/485
1 of 5 7/23/2019, 12:44 AM
Parameters affecting the selectivity of an electrical insecticide
Lahouaria Neddar, Samir Flazi
Total views : 213 times
1472-1478
Modified approach for harmonic reduction in three-phase to seven-phase using
transformer winding connections
Mehrdad Ahmadi Kamarposhti, Ashkan Abyar Hosseini
Total views : 144 times
1496-1505
Matlab/simulink simulation of unified power quality conditioner-battery energy storage system supplied by PV-wind hybrid using fuzzy logic controller
Amirullah Amirullah, Ontoseno Penangsang, Adi Soeprijanto
Total views : 526 times
1479-1495
Relevance vector machine based fault classification in wind energy conversion system
Rekha S. N., P. Aruna Jeyanthy, D. Devaraj
Total views : 184 times
1506-1513
Intelligent controller based power quality improvement of microgrid integration of photovoltaic power system using new cascade multilevel inverter
Gaddala Jayaraju, Gudapati Sambasiva Rao
Total views : 723 times
1514-1523
Low-voltage ride-through for a three-phase four-leg photovoltaic system using SRFPI control strategy
Haval Sardar Kamil, Dalila Mat Said, Mohd Wazir Mustafa, Mohammad Reza Miveh,
Nasarudin Ahmad
Total views : 130 times
1524-1530
Direct and indirect vector control of a doubly fed induction generator based in a wind energy conversion system
Manale Bouderbala, Badre Bossoufi, Ahmed Lagrioui, Mohammed Taoussi, Hala Alami Aroussi, Yasmine Ihedrane
Total views : 73 times
1531-1540
Wireless power transfer to a micro implant device from outside of human body
Kazuya Yamaguchi, Kazuma Onishi, Kenichi Iida
Total views : 341 times
1541-1545
Extended Kalman observer based sensor fault detection
Noura Rezika Bellahsene Hatem, Mohammed Mostefai, Oum El Kheir Aktouf
Total views : 755 times
1546-1552
Impact of compressed air energy storage system into diesel power plant with wind power penetration
Abdulla Ahmed, Tong Jiang
Total views : 167 times
1553-1560
Risk assessment for ancillary services
Omer Hadzic, Smajo Bisanovic
Total views : 158 times
1561-1568
An analysis of energy saving through delamping method
Mohd Firdaus Mohd Ab Halim, Muhamad Faizal Yaakub, Mohamad Haniff Harun,
Khalil Azha Mohd Annuar, Farriz Hj Md Basar, Mohd Nazri Omar
Total views : 85 times
1569-1575
Experimental dataset to develop a parametric model based of DC geared motor in feeder machine
Azlan. W. M., Salleh S. M., Mahzan S, Sadikin A, Ahmad S
Total views : 159 times
1576-1584
Dynamic model of A DC-DC quasi-Z-source converter (q-ZSC)
Muhammad Ado, Awang Jusoh, Abdulhamid Usman Mutawakkil, Tole Sutikno
Total views : 99 times
1585-1597
Energy optimization of 6T SRAM cell using low-voltage and high-performance inverter structures
M. Madhusudhan Reddy, M. Sailaja, K. Babulu
Total views : 173 times
1606-1619
Crosstalk in misaligned free space optical interconnects: modelling and simulation
Nedal Al-Ababneh
Total views : 86 times
1620-1629
Performance analysis on color image mosaicing techniques on FPGA
Jayalaxmi H, S. Ramachandran
Total views : 86 times
1630-1636
Demand-driven Gaussian window optimization for executing preferred population of jobs in cloud clusters
Vaidehi M, T. R. Gopalakrishnan
Total views : 92 times
1637-1644
Fault modeling and parametric fault detection in analog VLSI circuits using discretization
Baldev Raj, G. M. Bhat, Sandeep Thakur
Total views : 269 times
1598-1605
PID controller using rapid control prototyping techniques
Odair A. Trujillo, Nicolás Toro-García, Fredy E. Hoyos
Total views : 280 times
1645-1655
Classification of cow’s behaviors based on 3-DoF accelerations from cow’s movements
Phung Cong Phi Khanh, Kieu Thi Nguyen, Nguyen Dinh-Chinh, Tran Duc-Nghia,
Hoang Quang Trung, Nguyen Van Thang, Tran Duc-Tan
Total views : 112 times
1656-1662
“Brilliantreflect”: smart mirror for smart life
Shelena Soosay Nathan, Amelia Sulaiman, Aisha Amila Kamarulzaman, Felicia Tiera, Mazniha Berahim
1663-1668
Total views : 92 times
Analysis of harmonics using wavelet technique
Thangaraj. K, Subramaniam. N. P., Narmada R, Oma Mageswari. M
Total views : 115 times
1669-1675
Development of automatic healthcare instruction system via movement gesture sensor for paralysis patient
S. A. C. Aziz, A. F. Kadmin, N. Rahim, W. H. W. Hassan, I. F. A. Aziz, M. S. Hamid,
R. A. Hamzah
Total views : 107 times
1676-1682
PID-based temperature control device for electric kettle
Mohd Badril Nor Shah, Norfahaniza Zailany, Amar Faiz Zainal Abidin, Mohd Firdaus Halim, Khalil Azha Annuar, Arman Hadi Azahar, Muhamad Haniff Harun, Muhammad Faizal Yaakub
Total views : 322 times
1683-1693
Flood disaster indicator of water level monitoring system
Wan Haszerila Wan Hassan, Aiman Zakwan Jidin, Siti Asma Che Aziz, Norain Rahim
Total views : 139 times
1694-1699
Impulsive spike enhancement on gamelan audio using harmonic percussive separation
Solekhan Solekhan, Yoyon K. Suprapto, Wirawan Wirawan
Total views : 141 times
1700-1710
Moment invariant-based features for Jawi character recognition
Fitri Arnia, Khairun Saddami, Khairul Munadi
Total views : 95 times
1711-1719
A review of remotely sensed satellite image classification
Sakshi Dhingra, Dharminder Kumar
Total views : 44 times
1720-1731
Blind separation of complex-valued satellite-AIS data for marine surveillance: a spatial quadratic time-frequency domain approach
Omar Cherrak, Hicham Ghennioui, Nadege Thirion Moreau, El Hossein Abarkan
Total views : 272 times
1732-1741
Improve performance of the digital sinusoidal generator in FPGA by memory usage optimization
Aiman Zakwan Jidin, Irna Nadira Mahzan, A. Shamsul Rahimi A. Subki, Wan Haszerila Wan Hassan
Total views : 90 times
1742-1749
RTL Implementation of image compression techniques in WSN
S. Aruna Deepthi, E. Sreenivasa Rao, M. N. Giri Prasad
Total views : 162 times
1750-1756
CMOS ring oscillator delay cell performance: a comparative study
D. A. Hadi, A. Z. Jidin, N. Ab Wahab, Madiha Z., Nurliyana Abd Mutalib, Siti Halma Johari, Suziana Ahmad, M. Nuzaimah
Total views : 159 times
1757-1764
Optimum range of angle tracking radars: a theoretical computing
Sadegh Samadi, Mohammad Reza Khosravi, Jafar A. Alzubi, Omar A. Alzubi, Varun G. Menon
Total views : 291 times
1765-1772
Robot navigation in unknown environment with obstacle recognition using laser
sensor
Neerendra Kumar, Zoltán Vámossy
Total views : 160 times
1773-1779
Automatic traffic light controller for emergency vehicle using peripheral interface controller
Norlezah Hashim, Fakrulradzi Idris, Ahmad Fauzan Kadmin, Siti Suhaila Jaapar Sidek
Total views : 135 times
1788-1794
Development of an automatic can crusher using programmable logic controller
N. A. A. Hadi, Lim Hui Yee, K. A. M. Annuar, Zulhilmi Bin Zaid, Z. A Ghani, M.F.
Mohd Ab Halim, Amar Faiz Zainal Abidin, M. B. N. Shah
Total views : 147 times
1795-1804
Development of portable automatic number plate recognition (ANPR) system on Raspberry Pi
S. Fakhar A. G, M. Saad H, A. Fauzan K, R. Affendi H., M. Aidil A.
Total views : 268 times
1805-1813
Development of a portable community video surveillance system
S. Fakhar A. G, A. Fauzan K, M. Saad H, R. Affendi H, K. H. Fen
Total views : 156 times
1814-1821
Automated medical surgical trolley
N. M. Saad, A. R. Abdullah, N. S. M. Noor, N. A. Hamid, M. A. Muhammad Syahmi, N. M. Ali
Total views : 78 times
1822-1831
Automated segmentation and classification technique for brain stroke
N. S. M. Noor, N. M. Saad, A. R. Abdullah, N. M. Ali
Total views : 104 times
1832-1841
Intelligent fire detection and alert system using LabVIEW
Fakrulradzi Idris, Norlezah Hashim, Ahmad Fauzan Kadmin, Lee Boon Yee
Total views : 90 times
1842-1849
Tchebichef image watermarking along the edge using YCoCg-R color space for copyright protection
Ferda Ernawan
Total views : 94 times
1850-1860
Black-box modeling of nonlinear system using evolutionary neural NARX model
Nguyen Ngoc Son, Nguyen Duy Khanh, Tran Minh Chinh
Total views : 114 times
1861-1870
Driving cycle development for Kuala Terengganu city using k-means method
I. N. Anida, A. R. Salisa
Total views : 87 times
1780-1787
Modified honey encryption scheme for encoding natural language message
Abiodun Esther Omolara, Aman Jantan
Total views : 185 times
1871-1878
Fingereye: improvising security and optimizing ATM transaction time based on iris-scan authentication
Abiodun Esther Omolara, Aman Jantan, Oludare Isaac Abiodun, Humaira Arshad,
Nachaat AbdElatif Mohamed
Total views : 132 times
1879-1886
Bulk binding approach for PMIPv6 protocol to reduce handoff latency in IoT
Adnan J. Jabir
Total views : 125 times
1894-1901
Single feed circularly polarized crescent-cut and extended corner square microstrip antennas for wireless biotelemetry
Mehdi Hasan Chowdhury, Quazi Delwar Hossain, Md. Azad Hossain, Ray Chak Chung Cheung
Total views : 155 times
1902-1909
Novel steganography scheme using Arabic text features in Holy Quran
Huda Kadhim Tayyeh, Mohammed Salih Mahdi, Ahmed Sabah Ahmed AL-Jumaili
Total views : 117 times
1910-1918
Efficiency and effectiveness video on demand over worldwide interoperability for microwave access
Muthanna Jaafar Abbas, Abeer Abd Al Hameed Mahmood
Total views : 140 times
1919-1923
A collaborative physical layer security scheme
Dimitrios Efstathiou
Total views : 32 times
1924-1934
Adaptive resources assignment in OFDM-based cognitive radio systems
Shirin Razmi, Naser Parhizgar
Total views : 133 times
1935-1943
Improvement of crankshaft MAC protocol for wireless sensor networks: a simulation study
Madina Hamiane, Makiya J. Ahmed
Total views : 118 times
1944-1956
ERMO2 algorithm: an energy efficient mobility management in mobile cloud computing system for 5G heterogeneous networks
L. Pallavi, A. Jagan, B. Thirumala Rao
Total views : 54 times
1957-1967
Intelligent black hole detection in mobile AdHoc networks
Yaesr Khamayseh, Muneer Bani Yassein, Mai Abu-Jazoh
Total views : 61 times
1968-1977
Development of a Java-based application for environmental remote sensing data processing
Bad-reddine Boudriki Semlali, Chaker El Amrani, Siegfried Denys
Total views : 248 times
1978-1986
Evaluation of earth fault location algorithm in medium voltage distribution network
with correction technique
N. S. B. Jamili, M. R. Adzman, S. R. A. Rahim, S. M. Zali, M. Isa, H. Hanafi
Total views : 255 times
1987-1996
The telecom value chain, opportunities and revenues created by the Nigerian telecom boom
Nsikan Nkordeh, Uzairue Stanley Idiake, Ibinabo Bob-Manuel, Francis Anyasi
Total views : 126 times
2018-2024
An approximation delay between consecutive requests for congestion control in unicast CoAP-based group communication
Chanwit Suwannapong, Chatchai Khunboa
Total views : 144 times
1997-2005
How software size influence productivity and project duration
Mridul Bhardwaj, Ajay Rana, Neeraj Kumar Sharma
Total views : 92 times
2006-2017
A cellular base station antenna configuration for variable coverage
Abdul-Rahman Shakeeb, K. H. Sayidmarie
Total views : 238 times
1887-1893
Arabic named entity recognition using deep learning approach
Ismail El Bazi, Nabil Laachfoubi
Total views : 364 times
2025-2032
Design of a multi-agent system using the "MaSE" method for learners' metacognitive help
Hanane Elbasri, Adil Haddi, Hakim Allali
Total views : 157 times
2033-2040
Information system to enhance medical services quality in Indonesia
Susana Limanto, Andre Andre
Total views : 178 times
2049-2056
Equity-Based free channels assignment for secondary users in a cognitive radio network
Said Lakhal, Zouhair Guennoun
Total views : 93 times
2057-2063
Reengineering framework for open source software using decision tree approach
Jaswinder Singh, Kanwalvir Singh, Jaiteg Singh
Total views : 220 times
2041-2048
Parallel hybrid chicken swarm optimization for solving the quadratic assignment problem
Soukaina Cherif Bourki Semlali, Mohammed Essaid Riffi, Fayçal Chebihi
Total views : 111 times
2064-2074
Memetic chicken swarm algorithm for job shop scheduling problem
Soukaina Cherif Bourki Semlali, Mohammed Essaid Riffi, Fayçal Chebihi
Total views : 172 times
2075-2082
AWSQ: an approximated web server queuing algorithm for heterogeneous web server cluster
Kadiyala Ramana, M. Ponnavaikko
Total views : 48 times
2083-2093
Data mining techniques application for prediction in OLAP cube
Asma Lamani, Brahim Erraha, Malika Elkyal, Abdallah Sair
Total views : 39 times
2094-2102
Complete agglomerative hierarchy document’s clustering based on fuzzy luhn’s gibbs
latent dirichlet allocation
P. M. Prihatini, I. K. G. D. Putra, I. A. D. Giriantari, M. Sudarma
Total views : 16 times
2103-2111
Crops diseases detection and solution system
Mohammad Jahangir Alam, Md. Abdul Awal, Md. Nurul Mustafa
Total views : 349 times
2112-2120
Community detection of political blogs network based on structure-attribute graph clustering model
Ahmed F. Al-Mukhtar, Eman S. Al-shamery
Total views : 23 times
2121-2130
Data collection algorithm for wireless sensor networks using collaborative mobile
elements
Alhasanat Abdullah, Alhasanat Khaled, Ahmed Maamoun
Total views : 54 times
2131-2140
Testing embedded system through optimal mining technique (OMT) based on multi-input domain
J. K. R. Sastry, M. Lakshmi Prasad
Total views : 61 times
2141-2151
Opinion mining on newspaper headlines using SVM and NLP
Chaudhary Jashubhai Rameshbhai, Joy Paulose
Total views : 212 times
2152-2163
Optimal remote access trojans detection based on network behavior
Khin Swe Yin, May Aye Khine
Total views : 112 times
2177-2184
Image watermarking based on integer wavelet transform-singular value decomposition with variance pixels
Ferda Ernawan, Dhani Ariatmanto
Total views : 292 times
2185-2195
A design of license plate recognition system using convolutional neural network
P. Marzuki, A. R. Syafeeza, Y. C. Wong, N. A. Hamid, A. Nur Alisa, M. M. Ibrahim
Total views : 190 times
2196-2204
Prediction of Answer Keywords using Char-RNN
Pratheek I, Joy Paulose
Total views : 160 times
2164-2176
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 9, No. 3, June 2019, pp. 2103~2111
ISSN: 2088-8708, DOI: 10.11591/ijece.v9i3.pp2103-2111
Journal homepage: http://iaescore.com/journals/index.php/IJECE
Complete agglomerative hierarchy document’s clustering based
on fuzzy luhn’s gibbs latent dirichlet allocation
P. M. Prihatini¹, I. K. G. D. Putra², I. A. D. Giriantari³, M. Sudarma⁴
¹Study Program of Doctoral Engineering Science, Faculty of Engineering, Udayana University, Bali, Indonesia
²Study Program of Information Technology, Faculty of Engineering, Udayana University, Bali, Indonesia
³,⁴Study Program of Electrical Engineering, Faculty of Engineering, Udayana University, Bali, Indonesia
Article Info ABSTRACT
Article history:
Received Apr 24, 2018
Revised Nov 18, 2018
Accepted Dec 11, 2018
Agglomerative hierarchical clustering is a bottom-up clustering method in which the distances between documents are obtained by extracting feature values with the topic-based latent Dirichlet allocation method. To reduce the number of features, terms can be selected using Luhn's Idea. Together, these methods can build better clusters for documents, yet little research has examined them in combination. Therefore, in this research, the term weighting calculation uses Luhn's Idea to select terms by defining upper and lower cut-offs, and then extracts term features using Gibbs sampling latent Dirichlet allocation combined with term frequency and the fuzzy Sugeno method. The feature values serve as the distances between documents, which are clustered with the single, complete, and average link algorithms. The evaluations show little difference between feature extraction with and without the lower cut-off. However, determining the topic of each term from term frequency and the fuzzy Sugeno method finds more relevant documents than the Tsukamoto method. Using the lower cut-off and fuzzy Sugeno Gibbs latent Dirichlet allocation, complete agglomerative hierarchical clustering gives consistent metric values. This clustering method is therefore suggested as a better method for clustering documents, yielding clusters more relevant to the gold standard.
Keywords:
Fuzzy sugeno
Gibbs sampling
Hierarchical clustering
Latent dirichlet allocation
Luhn’s idea
Copyright © 2019 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
P. M. Prihatini,
Study Program of Doctoral Engineering Science,
Faculty of Engineering,
Udayana University,
Kampus UNUD Bukit Jimbaran Kuta Selatan, Badung, Bali, Indonesia, 80361.
Email: [email protected]
1. INTRODUCTION
Clustering is one of the tasks in data mining for analyzing large amounts of data; it can reveal hidden information that is very useful for decision making. Clustering is an unsupervised classification technique that groups similar data into clusters [1]. Clustering techniques include partitioning, hierarchical, grid-based, and density-based methods [2]. These methods have been used in various applications, such as K-Means for SME risk analysis documents [3], K-Prototype for clustering big data based on MapReduce [4], K-Means for earthquake cluster analysis [5], Ward's linkage method for classifying languages [6], grid- and density-based methods for trajectory clustering [7], and DBSCAN for categorizing districts [8]. The hierarchical clustering method uses a tree concept and is divided into agglomerative and divisive approaches [9]. Agglomerative clustering is known as the bottom-up method, while divisive clustering is known as the top-down method [10]. In general, agglomerative algorithms come in three variants: single link, complete link, and average link [11]. These algorithms differ in how they determine the distance between the data to be merged. The data to be merged can take diverse forms, such
as text, images, sound, or video. Textual data must first be processed through several steps, such as tokenization, filtering, and lemmatization or stemming [12].
The results of text processing are used to generate index terms, a vocabulary extracted from the collection of texts, and to determine a weight for each term [13]. The terms and their weights are then used to determine the distance between data to be merged in the agglomerative algorithm. There are several term weighting methods in text processing [14]. For the vector space model, Term Frequency-Inverse Document Frequency (TF-IDF) is commonly used [15],[16]. To overcome the weakness of TF-IDF in addressing synonymy and polysemy in natural language, Latent Semantic Indexing (LSI) was developed. Research on text clustering with the Agglomerative Hierarchical Clustering (AHC) algorithm has been done with these term weighting schemes. The AHC algorithm with TF-IDF has been used to cluster web pages [17], construct taxonomies from a corpus of text documents [18], construct a multi-keyword ranked search scheme [19], perform context-aware document clustering [20], and build taxonomies automatically from keywords [21]. The AHC algorithm has also been combined with LSI for document clustering [22], clustering of news articles [23], and information retrieval [24]. The weakness of LSI is overcome by a topic-based term weighting method called Latent Dirichlet Allocation (LDA). LDA is a generative probabilistic model of a corpus in which documents are represented as random mixtures over latent topics, and each topic is characterized by a distribution over words [25]. A document in a corpus is not identified with only a single topic, but can be identified with several topics, each with its own probability [26]-[28].
LDA has been extended hierarchically as hLDA [29], but this method is not able to capture the hierarchical relationships that are formed [30]. Therefore, research is needed on clustering documents hierarchically by combining a hierarchical clustering method with LDA for term weighting. Research that integrates LDA into hierarchical clustering, especially agglomerative clustering, has already been done. X. Li, H. Wang, G. Yin, T. Wang, C. Yang, Y. Yu, and D. Tang [31] used LDA to induce a taxonomy from tags based on word clustering: an AHC framework determines how similar every two tags are, and LDA then captures thematic correlations among the tags produced by AHC. D. Tu, L. Chen, and G. Chen [32] used LDA to extract the most typical words in every latent topic and applied a multi-way hierarchical agglomerative clustering algorithm (AHC and WordNet) to cluster candidate concept words. The problem is that those papers address English text. Until now, the performance of the LDA method with agglomerative hierarchical clustering on Indonesian text has never been published. If both of these methods prove to perform well in clustering Indonesian texts, they can also be used in other text mining tasks, for example document summarization.
To address this problem, in this research AHC and LDA are used to cluster documents, where LDA is not used for clustering itself but to generate the weights of the terms contained in the document text. This research differs from other related research on Indonesian text. First, the term weighting calculation uses Luhn's Idea to select terms by defining an upper cut-off and a lower cut-off, and then extracts term features using Gibbs Sampling LDA combined with term frequency values and fuzzy Sugeno logic; in earlier research, P. M. Prihatini, I. K. G. D. Putra, I. A. D. Giriantari, and M. Sudarma [26] used only TF-IDF for the term weighting calculation. Second, the calculation of the distance between documents for AHC is topic-based, because it uses the values produced by Fuzzy Luhn's Gibbs LDA. Third, document clustering with AHC uses three variants (single link, complete link, and average link) based on Fuzzy Luhn's Gibbs LDA, and then compares the best AHC variant using measurement metrics; in other research, Yuhefizar, B. Santosa, I. K. Eddy, and Y. K. Suprapto [33] used Euclidean distance and single linkage for document clustering, while M. A. A. Riyadi, D. S. Pratiwi, A. R. Irawan, and K. Fithriasari [34] used single link, complete link, and Ward's link based on autocorrelation distance. The discussion in this research is organized as follows: Section 2 describes the research method, Section 3 presents the results and their analysis, and Section 4 concludes the research.
2. RESEARCH METHOD
This research consists of several steps: document text processing, term weighting with Fuzzy Luhn's Gibbs LDA, document clustering with Fuzzy Luhn's Gibbs LDA, and evaluation, as shown in Figure 1.
2.1. Document text processing
In this research, the documents used are news text files obtained from Indonesian news website.
Each file is delimited into a collection of terms in the tokenization process. In the filtering process, each term
is filtered using a stop words list resulting in a meaningful set of terms. The terms generated by the filtering
process, some are already in the form of basic word, some are still have affixes. To make all terms in a
uniform shape, all terms are parsing into basic words through the stemming process. In this research,
stemming removes affixes using rules and a base-word dictionary. The stemming algorithm used is a modification of Nazief-Adriani [35].
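The pipeline described above (tokenization, stop-word filtering, stemming) can be sketched as follows. The stop-word subset and suffix rules are illustrative placeholders only, not the Nazief-Adriani rule set the paper actually uses:

```python
import re

# Hypothetical stop-word and suffix subsets, for illustration only.
STOP_WORDS = {"yang", "dan", "di", "ke", "dari"}
SUFFIXES = ("nya", "kan", "an", "i")

def tokenize(text):
    """Split a document into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def filter_stop_words(tokens):
    """Drop terms that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(term):
    """Naive suffix stripping; stands in for Nazief-Adriani stemming."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) > len(suffix) + 2:
            return term[: -len(suffix)]
    return term

def preprocess(text):
    """Tokenize, filter, and stem one document."""
    return [stem(t) for t in filter_stop_words(tokenize(text))]
```

In the paper's pipeline, the real stemmer consults a dictionary of base words before and after each affix-removal rule; the sketch above keeps only the overall shape of the three steps.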
Figure 1. The research method design
2.2. Term weighting with Fuzzy Luhn’s Gibbs LDA
In this research, term weighting is done through term selection and feature extraction. Term selection is based on the concept of Luhn's Idea, where each term is scored by its frequency relative to all terms in the document text [36]. Luhn describes the relationship between the occurrence frequency of a term (term frequency) and the importance of that term in the document: terms with medium frequency are more important than those with high or low frequency. Low-frequency terms fall below the lower cut-off, while high-frequency terms fall above the upper cut-off. Medium-frequency terms are obtained by applying both the upper and lower cut-offs. Terms above the upper cut-off can be eliminated by filtering terms against a stop-word list. However, for terms below the lower cut-off, no research so far has determined an effective way to set the lower cut-off limit.
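Luhn's mid-frequency band selection described above can be illustrated with a short sketch. The percentile cut-offs here are hypothetical; they stand in for the paper's per-document cut-off calculation of Equation (1):

```python
from collections import Counter

def select_mid_frequency_terms(terms, lower_pct=0.1, upper_pct=0.9):
    """Keep terms whose relative frequency falls inside an assumed
    lower/upper cut-off band, following Luhn's idea. The percentile
    bounds are illustrative, not the paper's Equation (1)."""
    counts = Counter(terms)
    total = sum(counts.values())
    rel = {t: c / total for t, c in counts.items()}
    # Derive the cut-off frequencies from the ranked frequency values.
    ranked = sorted(rel.values())
    lo = ranked[int(lower_pct * (len(ranked) - 1))]
    hi = ranked[int(upper_pct * (len(ranked) - 1))]
    return [t for t in terms if lo <= rel[t] <= hi]
```

The design point mirrors the text: the band is computed per document from that document's own frequencies, so no single constant works for all documents.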
In this research, the elimination of terms above the upper cut-off is done twice: first by filtering terms against a stop-word list, and second by filtering those results again through the stemming process. The elimination of terms below the lower cut-off is based on the stemming result, with a different percentage removal value for each text document, as in (1). This reflects the idea that each document has a different text length, so no single constant value can be used for all documents. Variable lcod (lower cut-off of document) refers to the lower cut-off constant for document d (a positive integer). Variable fsd (false-stemming of document) refers to the number of terms in document d that failed stemming. Variable frd (filtering result of document) refers to the number of terms in document d used for the stemming process. Variable tsd (true-stemming of document) refers to the number of terms in document d that were stemmed successfully.
(
) (1)
The term selection result is a collection of selected terms from each document that carry important meaning and are passed on to feature extraction. In this research, feature extraction is done with the topic-based LDA method. LDA has several inference algorithms, one of which is Gibbs Sampling, which has proven effective for the topic sampling process [28]. In general, during initialization, Gibbs Sampling assigns the topic of each term randomly using a multinomial random function. However, this function cannot represent the presence of each term within a topic. Therefore, in this research, the topic of each term is initialized from the highest occurrence frequency (tf) of the term across all topics, as in (2). Variable zt,k, like k, refers to the topic. Variable tft,k refers to the tf value of term t on topic k. The probability of each term in the sampling process is calculated as in (3). Variable pt,k refers to the sampling probability of term t for topic k. Variable nkw−1 refers to the value of the topic-term matrix with the current term ignored. Variable V is the number of unique terms in all documents. Variable ndk−1 refers to the value of the document-topic matrix with the current term ignored. Variable β determines the mixing proportion of documents over the topics, while α determines the mixture components of words over the topics [37]. Variable K is the number of topics.
$$z_{t,k} = \arg\max_{k}\; tf_{t,k} \qquad (2)$$

$$p_{t,k} = \frac{n_{k,w}^{-1} + \beta}{\sum_{w=1}^{V} n_{k,w}^{-1} + V\beta} \cdot \frac{n_{d,k}^{-1} + \alpha}{\sum_{k=1}^{K} n_{d,k}^{-1} + K\alpha} \qquad (3)$$
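The tf-based initialization of (2) and the sampling probability of (3) can be sketched as below; the collapsed-Gibbs form used here is the standard one and is assumed to match the paper's Equation (3):

```python
def init_topic_by_tf(tf_tk):
    """Equation (2) as described above: assign each term the topic in
    which it occurs most often, instead of a random multinomial draw.
    tf_tk is a list of per-term rows, each a list of per-topic counts."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in tf_tk]

def gibbs_conditional(n_kw, n_dk_d, w, alpha, beta):
    """Sampling probabilities for one term occurrence (standard
    collapsed Gibbs form, assumed to match Equation (3)).
    n_kw   : topic-term count matrix with the current term removed
    n_dk_d : topic counts of the current document, likewise decremented
    w      : index of the current word in the vocabulary"""
    K = len(n_kw)
    V = len(n_kw[0])
    p = [(n_kw[k][w] + beta) / (sum(n_kw[k]) + V * beta)
         * (n_dk_d[k] + alpha) / (sum(n_dk_d) + K * alpha)
         for k in range(K)]
    s = sum(p)
    return [x / s for x in p]  # normalized over topics
```

A sampler would draw the new topic assignment from this normalized distribution, increment the counts again, and repeat over all term occurrences.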
In general, Gibbs Sampling in LDA requires several iterations of the sampling process to reach convergence, which takes time and has high complexity. Adding fuzzy Tsukamoto logic to the sampling process can accelerate convergence while keeping good measurement values [26]. This research improves on that fuzzy logic concept by using the Sugeno method to increase accuracy, since the fuzzy output needed for sampling is a constant value. In this research, the upper and lower limits of the fuzzy curve are determined from the tf value of each term. Fuzzification uses a triangular curve over the probability value of the sampling result for each term p, as in (4). Variable u[t] refers to the degree of membership of term t. Variable a refers to the lower bound of the curve, b to its peak, and c to its upper bound. The implication function used is OR, because fuzzy logic here determines the probability value of a term for one topic, while all topics are handled in the sampling process. The rule composition generates the αp value as the maximum of all u[t], as in (5), and the value zo from the term probabilities across the topics whose values are not zero, as in (6). Variable t refers to the term probability of the sampling result. Variable zo refers to the composition output. Variable n refers to the number of topics whose term probability is not zero. For defuzzification, the final fuzzy output z is obtained by calculating the mean value, as in (7). The value of z is used as the probability value of term p for topic k in the subsequent sampling iterations until convergence is reached. After convergence, the final value of z becomes the feature value of each term, ready for clustering.
$$u[t] = \begin{cases} 0, & t \le a \\ \dfrac{t-a}{b-a}, & a < t \le b \\ \dfrac{c-t}{c-b}, & b < t < c \\ 0, & t \ge c \end{cases} \qquad (4)$$

$$\alpha_p = \max\big(u[t_1], u[t_2], \ldots, u[t_K]\big) \qquad (5)$$

$$z_o = \frac{\sum_{k:\, t_k \neq 0} t_k}{n} \qquad (6)$$

$$z = \frac{\alpha_p + z_o}{2} \qquad (7)$$
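Equations (4)-(7) can be sketched as follows. Since the exact defuzzification of Equation (7) could not be recovered from the source, the final averaging step here is an assumption:

```python
def triangular_membership(t, a, b, c):
    """Equation (4): triangular membership degree of probability t for
    a curve with lower bound a, peak b, and upper bound c."""
    if t <= a or t >= c:
        return 0.0
    if t <= b:
        return (t - a) / (b - a)
    return (c - t) / (c - b)

def sugeno_output(probs, a, b, c):
    """Sketch of Equations (5)-(7): OR (max) composition of the
    membership degrees, mean of the non-zero topic probabilities,
    and an average of the two as the crisp output. The averaging in
    the last step is an assumption, not the paper's exact Equation (7)."""
    alpha_p = max(triangular_membership(t, a, b, c) for t in probs)
    nonzero = [t for t in probs if t != 0]
    z_o = sum(nonzero) / len(nonzero) if nonzero else 0.0
    return (alpha_p + z_o) / 2
```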
2.3. Document clustering with Fuzzy Luhn’s Gibbs LDA
The feature values obtained from feature extraction are used to calculate the distance between documents for the clustering process. In this research, distances are calculated using cosine similarity, as in (8). Variable |di−dj| refers to the distance between documents i and j, where di refers to document i and dj to document j.
$$|d_i - d_j| = \frac{\sum_{t} d_{i,t}\, d_{j,t}}{\sqrt{\sum_{t} d_{i,t}^2}\, \sqrt{\sum_{t} d_{j,t}^2}} \qquad (8)$$
The distance between documents is used to cluster documents with three types of AHC algorithm. The Single Link AHC algorithm merges clusters based on the smallest distance between pairs of documents, as in (9); the Complete Link AHC algorithm on the largest distance, as in (10); and the Average Link AHC algorithm on the average distance, as in (11). Variable dij refers to the selected pair of documents i and j.
$$d_{single} = \min_{i,j}\, \big(|d_i - d_j|\big) \qquad (9)$$

$$d_{complete} = \max_{i,j}\, \big(|d_i - d_j|\big) \qquad (10)$$

$$d_{average} = \frac{1}{n}\sum_{i,j} |d_i - d_j| \qquad (11)$$
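The distance and linkage rules of (8)-(11) can be sketched as below, following the text's convention that the "distance" |di−dj| is the cosine value of Equation (8):

```python
import math

def cosine_distance(di, dj):
    """Equation (8)-style value between two document feature vectors."""
    dot = sum(x * y for x, y in zip(di, dj))
    norm = math.sqrt(sum(x * x for x in di)) * math.sqrt(sum(y * y for y in dj))
    return dot / norm

def linkage_distance(cluster_a, cluster_b, mode="complete"):
    """Equations (9)-(11): distance between two clusters of documents
    under single, complete, or average link."""
    pairs = [cosine_distance(di, dj) for di in cluster_a for dj in cluster_b]
    if mode == "single":
        return min(pairs)       # smallest pairwise distance, (9)
    if mode == "complete":
        return max(pairs)       # largest pairwise distance, (10)
    return sum(pairs) / len(pairs)  # average pairwise distance, (11)
```

An agglomerative loop would start from singleton clusters and repeatedly merge the pair of clusters selected by the chosen linkage until one hierarchy remains.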
2.4. Metrics evaluation
In this research, evaluation is done in two steps: evaluation of the feature extraction results and
evaluation of the clustering results. The text documents used in this research have been classified into five
categories by Indonesian news media websites, so they can be used as the gold standard for the evaluation process.
Evaluation of the feature extraction results is done by comparing results obtained with and without the
lower cut-off. An evaluation was also performed to compare the feature extraction results of the Fuzzy
Gibbs LDA method [26] and the Fuzzy Luhn's Gibbs LDA used in this research. The evaluation was
performed using two measurement metrics. First, perplexity is used to measure the ability of the Fuzzy
Luhn's Gibbs LDA feature extraction method to generalize to hidden data, as in (12) and (13) [25]. A
smaller perplexity value indicates better performance of the method. Variable M refers to the number of
documents, and N_m to the number of words in document m. Variable V is the number of unique terms in
all documents. Variable n_m^(t) refers to the number of occurrences of the word t in document m. Variable K is
the number of topics. Variable θ_{m,k} refers to the proportion of topic k in document m, and variable
φ_{k,t} refers to the probability of word t under topic k.
$$ \mathrm{Perplexity} = \exp\!\left( -\,\frac{\sum_{m=1}^{M} \log p(w_m)}{\sum_{m=1}^{M} N_m} \right) \qquad (12) $$
$$ \log p(w_m) = \sum_{t=1}^{V} n_m^{(t)} \log\!\left( \sum_{k=1}^{K} \theta_{m,k}\, \varphi_{k,t} \right) \qquad (13) $$
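A minimal sketch of the perplexity computation in (12)-(13), assuming dense Python lists for the term-count matrix and the estimated θ and φ distributions. The names and data layout are illustrative assumptions, not the paper's implementation:

```python
import math

def perplexity(counts, theta, phi):
    """Perplexity of a corpus of M documents, as in (12)-(13).

    counts[m][t]: occurrences of term t in document m (M x V)
    theta[m][k]:  proportion of topic k in document m (M x K)
    phi[k][t]:    probability of term t under topic k (K x V)
    """
    total_log_p = 0.0
    total_words = 0
    for m, doc in enumerate(counts):
        for t, n_mt in enumerate(doc):
            if n_mt == 0:
                continue
            # marginal probability of word t in document m, as in (13)
            p_word = sum(theta[m][k] * phi[k][t] for k in range(len(phi)))
            total_log_p += n_mt * math.log(p_word)
            total_words += n_mt
    return math.exp(-total_log_p / total_words)
```

As a sanity check, a uniform model over V = 2 terms assigns every word probability 0.5, so the perplexity equals 2.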
Second, Precision (P), Recall (R), and F-Score (F) metrics are used to measure the ability of the
Fuzzy Luhn's Gibbs LDA feature extraction method to find relevant documents according to the gold
standard, as in (14)-(16) [14]. Variable TP is true positive, the number of relevant items retrieved.
Variable FP is false positive, the number of non-relevant items retrieved. Variable FN is false
negative, the number of relevant items that are not retrieved. Greater values of P, R, and F
indicate better performance of the methods.
$$ P = \frac{TP}{TP + FP} \qquad (14) $$
$$ R = \frac{TP}{TP + FN} \qquad (15) $$
$$ F = \frac{2PR}{P + R} \qquad (16) $$
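Equations (14)-(16) translate directly into code. The following sketch (with hypothetical counts) shows the computation:

```python
def precision_recall_f(tp, fp, fn):
    """Precision, Recall, and F-Score from raw counts, as in (14)-(16)."""
    p = tp / (tp + fp)       # (14)
    r = tp / (tp + fn)       # (15)
    f = 2 * p * r / (p + r)  # (16)
    return p, r, f

# e.g. 8 relevant items retrieved, 2 non-relevant retrieved, 2 relevant missed
print(precision_recall_f(8, 2, 2))  # P, R, and F are all 0.8 here
```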
Evaluation of the clustering results is done by comparing the clustering results of Single Link,
Complete Link, and Average Link AHC using the feature extraction results. In this research, the evaluation was
performed using five measurement metrics. Precision, Recall, and F-Score (PRF) are used to measure the
ability of the methods to cluster relevant documents according to the gold standard, as in (14)-(16). The fourth is
Normalized Mutual Information (NMI), as in (17)-(20) [38]. Variable I(Ω; C) refers to the mutual
information between the classes Ω (gold standard) and the clusters C. Variable H(Ω) refers to the entropy of the classes,
and H(C) to the entropy of the clusters. Variable |ω_k| refers to the number of documents belonging to class k, and
|c_j| to the number of documents belonging to cluster j; N is the total number of documents.
$$ NMI(\Omega, C) = \frac{I(\Omega; C)}{[H(\Omega) + H(C)]/2} \qquad (17) $$
$$ I(\Omega; C) = \sum_{k} \sum_{j} \frac{|\omega_k \cap c_j|}{N} \log \frac{N\,|\omega_k \cap c_j|}{|\omega_k|\,|c_j|} \qquad (18) $$
$$ H(\Omega) = -\sum_{k} \frac{|\omega_k|}{N} \log \frac{|\omega_k|}{N} \qquad (19) $$
$$ H(C) = -\sum_{j} \frac{|c_j|}{N} \log \frac{|c_j|}{N} \qquad (20) $$
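A sketch of the NMI computation in (17)-(20), assuming the class/cluster overlaps are given as a class-by-cluster contingency table (a list of lists); the function name and input format are illustrative assumptions:

```python
import math

def nmi(contingency):
    """Normalized Mutual Information from a class-by-cluster table, as in (17)-(20)."""
    n = sum(sum(row) for row in contingency)
    class_sizes = [sum(row) for row in contingency]          # |ω_k|
    cluster_sizes = [sum(col) for col in zip(*contingency)]  # |c_j|
    # mutual information, as in (18)
    mi = 0.0
    for k, row in enumerate(contingency):
        for j, n_kj in enumerate(row):
            if n_kj > 0:
                mi += (n_kj / n) * math.log(n * n_kj / (class_sizes[k] * cluster_sizes[j]))
    # entropies of classes and clusters, as in (19) and (20)
    h_class = -sum((c / n) * math.log(c / n) for c in class_sizes if c > 0)
    h_cluster = -sum((c / n) * math.log(c / n) for c in cluster_sizes if c > 0)
    return mi / ((h_class + h_cluster) / 2)  # (17)

print(nmi([[2, 0], [0, 2]]))  # perfect class/cluster agreement
```

Perfect agreement between classes and clusters yields NMI = 1; independent partitions yield values near 0.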
The fifth is the Adjusted Rand Index (ARI), as in (21)-(24) [39]. Variable n_ij refers to the number of
documents belonging to class i and cluster j. Variable a_i refers to the number of documents belonging to class i,
and b_j to the number of documents belonging to cluster j. Greater values of P, R, F, NMI, and
ARI indicate better performance of the methods.
ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 9, No. 3, June 2019 : 2103 - 2111
2108
$$ \mathrm{Index} = \sum_{i} \sum_{j} \binom{n_{ij}}{2} \qquad (21) $$
$$ \mathrm{Expected} = \left[ \sum_{i} \binom{a_i}{2} \sum_{j} \binom{b_j}{2} \right] \Big/ \binom{n}{2} \qquad (22) $$
$$ \mathrm{Max} = \frac{1}{2} \left[ \sum_{i} \binom{a_i}{2} + \sum_{j} \binom{b_j}{2} \right] \qquad (23) $$
$$ ARI = \frac{\mathrm{Index} - \mathrm{Expected}}{\mathrm{Max} - \mathrm{Expected}} \qquad (24) $$
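The ARI terms in (21)-(24) can be sketched from the same class-by-cluster contingency-table format; again the function name and input layout are illustrative assumptions:

```python
from math import comb

def ari(contingency):
    """Adjusted Rand Index from a class-by-cluster table, as in (21)-(24)."""
    n = sum(sum(row) for row in contingency)
    index = sum(comb(n_ij, 2) for row in contingency for n_ij in row)   # (21)
    sum_a = sum(comb(sum(row), 2) for row in contingency)               # classes
    sum_b = sum(comb(sum(col), 2) for col in zip(*contingency))         # clusters
    expected = sum_a * sum_b / comb(n, 2)                               # (22)
    max_index = (sum_a + sum_b) / 2                                     # (23)
    return (index - expected) / (max_index - expected)                  # (24)

print(ari([[2, 0], [0, 2]]))  # perfect agreement gives the maximum value 1
```

Unlike the raw Rand index, ARI is corrected for chance: a random partition scores near 0, and it can be negative for worse-than-chance agreement.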
3. RESULTS AND ANALYSIS
3.1. Fuzzy Luhn’s Gibbs LDA
The evaluation results of the Fuzzy Luhn's Gibbs LDA feature extraction method can be seen in
Table 1. The values in the table compare the Fuzzy Luhn's Gibbs LDA feature extraction
method with the Fuzzy Gibbs LDA published by P.M. Prihatini, I.K.G.D. Putra, I.A.D. Giriantari, and M.
Sudarma [26]. The Fuzzy Luhn's Gibbs LDA feature extraction method applies feature selection both with
and without the lower cut-off, while Fuzzy Gibbs LDA did not use Luhn's concept.
Table 1. Comparison of Fuzzy Luhn's Gibbs LDA and Fuzzy Gibbs LDA

Metrics       Fuzzy Luhn's Gibbs LDA                       Fuzzy Gibbs LDA      The difference
evaluations   Lower cut-off (1)  Without lower cut-off (2)  (3)                 (1)&(2)  (1)&(3)  (2)&(3)
Perplexity    0.0375             0.0339                     0.0376              0.0036   0.0001   0.0037
Precision     0.9435             0.9515                     0.8975              0.0080   0.0460   0.0540
Recall        0.9280             0.9360                     0.8486              0.0080   0.0794   0.0874
F-Score       0.9296             0.9387                     0.8420              0.0091   0.0876   0.0967
The evaluation results in Table 1 show that feature extraction with the lower cut-off using equation
(1) gives evaluation values little different from those without the lower cut-off. The difference in metric
values between the two methods is very small, ranging from 0.0036 to 0.0091. This
insignificant difference occurs because feature selection in this research has already been done through two
upper cut-off steps: the filtering step with a stop-word list and the stemming step. These two steps
filter out terms that appear either very frequently or very rarely, resulting in a list of
meaningful terms for feature extraction. The lower cut-off process, with its value adjusted to the length of
the document, removes only a small portion of the meaningful terms in feature selection, so it does not
significantly affect the feature extraction results.
The evaluation results in Table 1 also show that the Fuzzy Gibbs LDA method resulted in a perplexity of
0.0376, while the Fuzzy Luhn's Gibbs LDA in this research gives a perplexity of 0.0375 with the lower cut-off
and 0.0339 without it. This indicates that the Fuzzy Luhn's Gibbs LDA algorithm performs
as well as Fuzzy Gibbs LDA in generalizing hidden data. However, the P, R, and F metrics indicate that
the Fuzzy Luhn's Gibbs LDA algorithm gives better results, ranging from 0.9280 to 0.9515, than the
Fuzzy Gibbs LDA algorithm, which ranges from 0.8420 to 0.8975. The increased PRF values show that
determining the topic of each term for initial sampling based on the highest occurrence
frequency (tf) of the term across all topics using Luhn's idea, together with the Fuzzy Sugeno method, is better
able to find documents relevant to the gold standard. This indicates that the Fuzzy Luhn's Gibbs LDA algorithm
is a better choice for performing feature extraction for clustering.
3.2. AHC with Fuzzy Luhn’s Gibbs LDA
The evaluation results of the AHC algorithms based on the Fuzzy Luhn's Gibbs LDA
feature extraction can be seen in Table 2.
Table 2. Evaluation Results of AHC with Fuzzy Luhn’s Gibbs LDA

Metrics      Fuzzy Luhn's Gibbs LDA with lower cut-off      Fuzzy Luhn's Gibbs LDA without lower cut-off   The difference
evaluations  Single Link  Complete Link (1)  Average Link   Single Link  Complete Link (2)  Average Link   (1)&(2)
Precision    0.8255       0.9549             0.9169         0.8642       0.9583             0.9340         0.0034
Recall       0.6021       0.9273             0.8179         0.6714       0.9247             0.8201         0.0026
F-Score      0.6963       0.9409             0.8646         0.7557       0.9412             0.8733         0.0003
NMI          0.5827       0.9196             0.7208         0.6075       0.8933             0.6474         0.0263
ARI          0.9523       0.9989             0.9128         0.9534       0.9974             0.9017         0.0015
The evaluation results in Table 2 show that feature selection with or without the lower cut-off
does not affect the performance of the AHC algorithms in the clustering process. It can be seen
from the metric values that both feature selection methods produce the Complete Link AHC
algorithm as the AHC clustering algorithm with the best metric values. The differences for the Complete Link
AHC algorithm between the two feature selection methods range from 0.0003 to 0.0263. This shows that both
feature selection methods are a good choice for the clustering process with AHC. However, in terms of the
consistency of the values produced by the five metrics, Complete Link AHC with Fuzzy
Luhn's Gibbs LDA with the lower cut-off has consistent metric values, ranging from 0.9196 to 0.9989, with
differences ranging from 0.0213 to 0.0793; while Complete Link AHC with Fuzzy Luhn's Gibbs LDA
without the lower cut-off has values ranging from 0.8933 to 0.9974, and its NMI value decreased by 0.0263
compared to the lower cut-off. The results of AHC with Fuzzy Luhn's Gibbs LDA were compared with the
results of AHC with autocorrelation distance published by M.A.A. Riyadi, D.S. Pratiwi, A.R. Irawan, and
K. Fithriasari [34]. In their research, Complete Link AHC with autocorrelation distance resulted in an
accuracy of 0.8235. Therefore, the use of Complete Link AHC and Fuzzy Luhn's Gibbs LDA with the lower
cut-off is more relevant as a better clustering method for clustering documents, especially Indonesian text news.
4. CONCLUSION
The Complete Link AHC and Fuzzy Luhn's Gibbs LDA with lower cut-off algorithm built in
this research can improve the quality of cluster generation for document clustering, especially for Indonesian
text news. This is shown by the values of the evaluation metrics: Precision, Recall, F-Score, Perplexity,
Normalized Mutual Information, and Adjusted Rand Index. The values of Precision, Recall, and F-Score with the
lower cut-off differ little from those without it, which means both methods can be used in the
term selection process. The Precision, Recall, and F-Score values of the Fuzzy Luhn's Gibbs LDA
algorithm increased and its Perplexity decreased, which means it performed better than Fuzzy Gibbs LDA. The
values of Precision, Recall, F-Score, Perplexity, Normalized Mutual Information, and Adjusted Rand Index showed
that Complete Link AHC with the Fuzzy Luhn's Gibbs LDA algorithm is the best AHC clustering algorithm, with or
without the lower cut-off. However, the Complete Link AHC algorithm with Fuzzy Luhn's Gibbs LDA with the lower
cut-off produces more consistent values across the five metrics, which means it is more relevant to its
gold standard.
REFERENCES
[1] Bansal A., et al., “Improved K-mean clustering algorithm for prediction analysis using classification technique in
data mining,” International Journal of Computer Applications, vol. 157, pp. 35-40, 2017.
[2] Arora P., et al., “Analysis of K-Means and K-Medoids algorithm for big data,” Procedia Computer Science, vol.
78, pp. 507-12, 2016.
[3] Wahyudin I., et al., “Cluster analysis for SME risk analysis documents based on Pillar K-Means,” TELKOMNIKA
Telecommunication Computing Electronics and Control, vol. 14, pp. 674-83, 2016.
[4] Bathla G., et al., “A novel approach for clustering big data based on MapReduce,” International Journal of
Electrical and Computer Engineering (IJECE), vol. 8, p. 1711, 2018.
[5] Kamat R. K., et al., “Earthquake cluster analysis: K-Means approach,” Journal of Chemical and Pharmaceutical
Sciences, vol. 10, pp. 250-3, 2017.
[6] Strauss T., et al., “Generalising ward's method for use with manhattan distances,” PloS one, vol. 12, 2017.
[7] Mao Y., et al., “An adaptive trajectory clustering method based on grid and density in mobile pattern analysis,”
Sensors (Basel), vol. 17, 2017.
[8] Majumdar J., et al., “Analysis of agriculture data using data mining techniques: application of big data,” Journal of
Big Data, vol. 4, 2017.
[9] Balcan M. F., et al., “Robust hierarchical clustering,” Journal of Machine Learning Research, vol. 15, pp. 4011-51,
2014.
[10] Goulas C., et al., “HCuRMD: Hierarchical clustering using relative minimal distances,” in Chbeir R. M. Y., et al.,
“Artificial Intelligence Applications and Innovations,” IFIP Advances in Information and Communication
Technology, Springer, pp. 440-7, 2015.
[11] Marathe M., et al., “A survey of clustering algorithms for similarity search,” International Journal of Pure and
Applied Mathematics, vol. 114, pp. 343-51, 2017.
[12] Allahyari M., et al., “A brief survey of text mining: classification, clustering and extraction techniques,” KDD
Bigdas, 2017.
[13] Ribeiro M. N., et al., “Local feature selection in text clustering,” in Köppen M. K. N. and Coghill G., “Advances in
Neuro-Information Processing ICONIP 2008,” Lecture Notes in Computer Science. Berlin, Heidelberg, Springer,
2009.
[14] Manning C. D., et al., “An Introduction to Information Retrieval,” England, Cambridge University Press, 2008.
[15] Islam M. R., et al., “Technical approach in text mining for stock market prediction: a systematic review,”
Indonesian Journal of Electrical Engineering and Computer Science, vol. 10, pp. 770-7, 2018.
[16] Amoli P. V., et al., “Scientific documents clustering based on text summarization,” International Journal of
Electrical and Computer Engineering (IJECE), vol. 5, pp. 782-7, 2015.
[17] Ramage D., et al., “Clustering the tagged web,” Proceedings of the Second ACM International Conference on Web
Search and Data Mining, 2009.
[18] Knijff J., et al., “Domain taxonomy learning from text: the subsumption method versus hierarchical clustering,”
Data & Knowledge Engineering, vol. 83, pp. 54-69, 2013.
[19] Indhuja A., et al., “A multi-keyword ranked search scheme over encrypted based on hierarchical clustering index,”
International Journal On Smart Sensing And Intelligent Systems, vol. 10, pp. 539-59, 2017.
[20] Venkateshkumar P., et al., “Using data fusion for a context aware document clustering,” International Journal of
Computer Applications, vol. 72, pp. 17-20, 2013.
[21] Song Y., et al., “Automatic taxonomy construction from keywords via scalable bayesian rose trees,” IEEE
Transactions on Knowledge and Data Engineering, vol. 27, pp. 1861-74, 2015.
[22] Kuta M., et al., “Comparison of latent semantic analysis and probabilistic latent semantic analysis for documents
clustering,” Computing and Informatics, vol. 33, pp. 652–66, 2014.
[23] Rott M., et al., “Investigation of latent semantic analysis for clustering of Czech news articles,” 25th International
Workshop on Database and Expert Systems Applications (DEXA), 2014.
[24] Park H., et al., “Agglomerative hierarchical clustering for information retrieval using latent semantic index,” IEEE
International Conference on Smart City/SocialCom/SustainCom (SmartCity), pp. 426-31, 2015.
[25] Blei D. M., et al., “Latent dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003.
[26] Prihatini P. M., et al., “Fuzzy-gibbs latent dirichlet allocation model for feature extraction on Indonesian
documents,” Contemporary Engineering Sciences, vol. 10, pp. 403-21, 2017.
[27] Prihatini P. M., et al., “Feature extraction for document text using Latent Dirichlet allocation,” Journal of Physics:
Conference Series, vol. 953, pp. 012047, 2018.
[28] Prihatini P. M., et al., “Indonesian text feature extraction using gibbs sampling and mean variational inference
latent dirichlet allocation,” Quality of Research (QIR): International Symposium on Electrical and Computer
Engineering, 2017 15th International Conference on, 2017.
[29] Blei D. M., et al., “Hierarchical topic models and the nested chinese restaurant process,” NIPS'03 Proceedings of
the 16th International Conference on Neural Information Processing Systems, pp. 17-24, 2003.
[30] Yerebakan H. Z., et al., “Hierarchical latent word clustering,” Bayesian Nonparametrics: The Next Generation
NIPS 2015 Workshop, 2015.
[31] Li X., et al., “Inducing taxonomy from tags : an agglomerative hierarchical clustering framework,” International
Conference on Advanced Data Mining and Applications ADMA 2012: Advanced Data Mining and Applications,
pp. 64-77, 2012.
[32] Tu D., et al., “WordNet based multi-way concept hierarchy construction from text corpus,” Proceedings of the
Twenty-Seventh AAAI Conference on Artificial Intelligence, pp. 1647-8, 2013.
[33] Yuhefizar Y., et al., “Combination of Cluster Method for Segmentation of Web Visitors,” TELKOMNIKA
(Telecommunication Computing Electronics and Control), vol. 11, p. 207, 2013.
[34] Riyadi M. A. A., et al., “Clustering stationary and non-stationary time series based on autocorrelation distance of
hierarchical and k-means algorithms,” International Journal of Advances in Intelligent Informatics, vol. 3, pp. 154-
60, 2017.
[35] Asian J., et al., “Stemming Indonesian,” Proceedings of the Twenty-eighth Australasian conference on Computer
Science, 2004.
[36] Kocabas I., et al., “Investigation of Luhn’s claim on information retrieval,” Turk J Elec Eng & Comp Sci, vol. 19,
pp. 993-1004, 2011.
[37] Heinrich G., “Parameter estimation for text analysis,” University of Leipzig, Germany, 2008.
[38] Fred A. L. N., et al., “Robust Data Clustering,” Proceedings of IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, vol. 3, pp. 128-36, 2003.
[39] Kuncheva L. I., et al., “Using diversity in cluster ensembles,” vol. 2, pp. 1214-9, 2004.
BIOGRAPHIES OF AUTHORS
Putu Manik Prihatini was born in Bali (Indonesia) on March 17, 1980. She earned a bachelor's
degree in informatics engineering at Sekolah Tinggi Teknologi Telkom Bandung (Indonesia) in
2002, and her master's degree at Universitas Udayana (Indonesia) in 2012. She has been a lecturer at
Politeknik Negeri Bali (Indonesia) since 2002. Currently, she is a doctoral student
of Ilmu Teknik at Universitas Udayana (Indonesia). Her research interests are Text Mining,
Information Retrieval Systems, and Soft Computing. Putu Manik Prihatini, ST, MT. Email:
I Ketut Gede Darma Putra was born in Bali (Indonesia) on April 24, 1974. He earned a
bachelor's degree in informatics at Institut Teknologi Sepuluh Nopember (ITS-Indonesia), and his master's
and doctoral degrees at Universitas Gadjah Mada (UGM-Indonesia). He has been a lecturer at Universitas
Udayana (Indonesia) since 1999. His research interests are Data Mining and Image
Processing. Prof. Dr. I Ketut Gede Darma Putra, S.Kom., MT. Email: [email protected]
Ida Ayu Dwi Giriantari was born in Bali (Indonesia) on December 13, 1965. She earned a
bachelor's degree in electrical engineering at Universitas Udayana (Indonesia), and her master's and
doctoral degrees at The University of New South Wales (Australia). She has been a lecturer at Universitas
Udayana (Indonesia) since 1991. Her research interests are electric power systems,
renewable energy technology and applications, smart grids, and control. Prof. Ir. Ida Ayu Dwi
Giriantari, M.Eng.Sc., PhD. Email: [email protected]
Made Sudarma was born in Bali (Indonesia) on December 31, 1965. He earned a bachelor's
degree in informatics at Institut Teknologi Sepuluh Nopember (ITS-Indonesia), his master's degree at the
School of Information Technology and Engineering, Ottawa University (Canada), and his doctoral
degree at Universitas Udayana (Indonesia). He has been a lecturer at Universitas Udayana (Indonesia)
since 1993. His research interests are Data Mining and Image Processing. Dr. Ir. Made
Sudarma, M.A.Sc. Email: [email protected]