15
Research Article Multiclass Event Classification from Text Daler Ali , Malik Muhammad Saad Missen , and Mujtaba Husnain e Islamia University Bahawalpur, Bahawalpur, Pakistan Correspondence should be addressed to Daler Ali; [email protected] and Malik Muhammad Saad Missen; [email protected] Received 3 November 2020; Accepted 24 December 2020; Published 13 January 2021 Academic Editor: Ligang He Copyright © 2021 Daler Ali et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Social media has become one of the most popular sources of information. People communicate with each other and share their ideas, commenting on global issues and events in a multilingual environment. While social media has been popular for several years, recently, it has given an exponential rise in online data volumes because of the increasing popularity of local languages on the web. is allows researchers of the NLP community to exploit the richness of different languages while overcoming the challenges posed by these languages. Urdu is also one of the most used local languages being used on social media. In this paper, we presented the first-ever event detection approach for Urdu language text. Multiclass event classification is performed by popular deep learning (DL) models, i.e.,Convolution Neural Network (CNN), Recurrence Neural Network (RNN), and Deep Neural Network (DNN). e one-hot- encoding, word embedding, and term-frequency inverse document frequency- (TF-IDF-) based feature vectors are used to evaluate the Deep Learning(DL) models. e dataset that is used for experimental work consists of more than 0.15 million (103965) labeled sentences. DNN classifier has achieved a promising accuracy of 84% in extracting and classifying the events in the Urdu language script. 1. Introduction In the current digital era, social media dominated other sources of communication, i.e., print and broadcast media [1]. Real-time availability [2] and multilingual support [3] are the key features that boost the usage of social media for communication. e usage of local languages on social media is overwhelming for the last few years. People share ideas, opinions, events, sentiments, and advertisements, etc. [4] in the world via social media using local languages. A considerable amount of heterogeneous data is being gen- erated which causes challenges to extract worthy insights, while this information plays a vital role in developing natural language processing (NLP) application, i.e., sentiment analysis [5], risk factor analysis [6], law and order predictor, timeline constructor, opining mining, decision-making systems [7], monitoring social media [8], spam detection, information retrieval, document classification [9], e-mail categorization [10], and sentence classification [11], topic modeling [12], content labeling, and finding the latest trend. In South Asia (https://www.worldometers.info/), about 24.98% population of the world live in different countries. Many languages are being spoken in Asia. e most famous among these are Arabic, Hindi, Malay, Persian, and Urdu, etc. 1.1.FeaturesofUrduLanguage. e Urdu language is one of the languages in South Asia that is frequently used for communication on social media, namely, Facebook, Twitter, News Channels, and Web Blogs [13]. It is also the national language of Pakistan which is the 6th (https://www. 
worldometers.info/world-population/population-by- country/) most populous country in the world. In other countries, i.e., India, Afghanistan, and Iran, the Urdu lan- guage is also spoken and understood. ere are 340 million people in the world who use the Urdu language on social media for various purposes [13]. e Urdu language follows the right-to-left writing script. Its grammatical structure is different from other languages. (1) Subject-object-verb (SOV) sentence structure [14] (2) No letter capitalization Hindawi Scientific Programming Volume 2021, Article ID 6660651, 15 pages https://doi.org/10.1155/2021/6660651

ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

Research ArticleMulticlass Event Classification from Text

Daler Ali Malik Muhammad Saad Missen and Mujtaba Husnain

e Islamia University Bahawalpur Bahawalpur Pakistan

Correspondence should be addressed to Daler Ali daleraliiubedupk and Malik Muhammad Saad Missensaadmisseniubedupk

Received 3 November 2020 Accepted 24 December 2020 Published 13 January 2021

Academic Editor Ligang He

Copyright copy 2021 Daler Ali et al )is is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Social media has become one of the most popular sources of information People communicate with each other and share their ideascommenting on global issues and events in amultilingual environmentWhile socialmedia has been popular for several years recently ithas given an exponential rise in online data volumes because of the increasing popularity of local languages on the web )is allowsresearchers of the NLP community to exploit the richness of different languages while overcoming the challenges posed by theselanguages Urdu is also one of the most used local languages being used on social media In this paper we presented the first-ever eventdetection approach for Urdu language text Multiclass event classification is performed by popular deep learning (DL) modelsieConvolution Neural Network (CNN) Recurrence Neural Network (RNN) and Deep Neural Network (DNN) )e one-hot-encoding word embedding and term-frequency inverse document frequency- (TF-IDF-) based feature vectors are used to evaluate theDeep Learning(DL)models)e dataset that is used for experimental work consists ofmore than 015million (103965) labeled sentencesDNN classifier has achieved a promising accuracy of 84 in extracting and classifying the events in the Urdu language script

1 Introduction

In the current digital era social media dominated othersources of communication ie print and broadcast media[1] Real-time availability [2] and multilingual support [3]are the key features that boost the usage of social media forcommunication )e usage of local languages on socialmedia is overwhelming for the last few years People shareideas opinions events sentiments and advertisements etc[4] in the world via social media using local languages Aconsiderable amount of heterogeneous data is being gen-erated which causes challenges to extract worthy insightswhile this information plays a vital role in developing naturallanguage processing (NLP) application ie sentimentanalysis [5] risk factor analysis [6] law and order predictortimeline constructor opining mining decision-makingsystems [7] monitoring social media [8] spam detectioninformation retrieval document classification [9] e-mailcategorization [10] and sentence classification [11] topicmodeling [12] content labeling and finding the latest trend

In South Asia (httpswwwworldometersinfo) about2498 population of the world live in different countries

Many languages are being spoken in Asia )e most famousamong these are Arabic Hindi Malay Persian and Urduetc

11 Features of Urdu Language )e Urdu language is one ofthe languages in South Asia that is frequently used forcommunication on social media namely Facebook TwitterNews Channels and Web Blogs [13] It is also the nationallanguage of Pakistan which is the 6th (httpswwwworldometersinfoworld-populationpopulation-by-country) most populous country in the world In othercountries ie India Afghanistan and Iran the Urdu lan-guage is also spoken and understood )ere are 340 millionpeople in the world who use the Urdu language on socialmedia for various purposes [13]

)e Urdu language follows the right-to-left writingscript Its grammatical structure is different from otherlanguages

(1) Subject-object-verb (SOV) sentence structure [14](2) No letter capitalization

HindawiScientific ProgrammingVolume 2021 Article ID 6660651 15 pageshttpsdoiorg10115520216660651

(3) Diacritics(4) Free word order [15]

)e Urdu language has 38 basic characters which can bewritten as joined and non-joined with other characters [16])e words having joined characters of Urdu alphabet set arecalled ligature and this joining feature of the alphabets madepossible to enrich the Urdu vocabulary having almost 24000ligatures [15 16] It is pertinent to mention that this alphabetset is also considered as a superset of all Urdu script-basedlanguages alphabets namely the Arabic and Persian whichcontain 28 and 32 alphabets respectively Furthermorethere are also some additional alphabets in Urdu script thatare used to express some Hindi phonemes [15 16]

12 EventClassification An event can be defined as ldquospecificactions situations or happenings occurring in a certainperiod [17 18]rdquo )e extracted information can representdifferent types of events ie sports politics terrorist attacksand inflation etc information can be detected and classifiedat a different level of granularity ie document level [19]sentence level [20] word level character level and phraselevel [21]

Event classification is an automated way to assign apredefined label to new instances It is pertinent to describethat the classification can be binary multiclass and mul-tilabel [22]

)e implementation of a neural network for text clas-sification provided help to handle a complex and largeamount of data [23] Semantically similar words are used togenerate feature vectors [24] that eliminate the sparsity ofn-grams models Urdu text classification is performed [25]to assess the quality of the product based on comments andfeedback In [25] an embedded layer of the neural networkwas used to convert text into numeric values and classifi-cation performed at the document level Contrary to [25]multiclass event classification is performed at the sentencelevel instead of the document level We further performedmultiple experiments to develop an efficient classificationsystem using TF-IDF one-hot-encoding pretrained Urduword embedding model and by creating custome pretrainedUrdu language word embedding models

13 Event Classification Challenges )e lack of processingresources ie part-of-speech (PoS) tagger name entityrecognizer and annotation tools is the other major hurdleto perform the event detection and classification for theUrdu language Many people are unfamiliar with themeaning and usage of some Urdu words It creates se-mantically ambiguous content that makes the event classi-fication process a nontrivial and challenging task )eunavailability of appropriate resourcesdatasets is anothermajor challenge for data-driven and knowledge-based ap-proaches to extract events and classify events

Our contributions are given as follows

(1) )e first-ever large-scale labeled Urdu dataset forevent classification that is the biggest in terms of

instances [15] and classes [25] in other Urdu textdatasets reported in state of the art [19 26 27]

(2) To our best knowledge it is the first multiclass eventclassification task at sentence level for the Urdulanguage

(3) Different feature vector generating methods ieone-hot-encoding word embedding and TF-IDFare used to evaluate the performance of DNN CNNand RNN deep learning models

(4) Pretrained and custom word embedding models forthe Urdu language are also explored

(5) Performance comparison of traditional machinelearning classifiers and deep learning classifiers

In this paper we performed a multiclass event classifi-cation on an imbalance dataset of Urdu language text Ourframework is a design to classify twelve different types ofevents ie sports inflation politics casualties law andorder terrorist attack sexual assault fraud (and corruption)showbiz business weather and earthquake Furthermorewe also presented a detailed comparative analysis of differentdeep learning algorithms ie long short-term memory(LSTM) and convolutional neural network (CNN) using TF-IDF one-hot-coding and word embedding methods Wealso compared the results of traditional machine learningclassifiers with deep learning classifiers

2 Related Work

In the past researchers were impassive in the Urdu languagebecause of limited processing resources ie datasets an-notators part-of-speech (PoS) taggers and translators [14]etc However now since the last few years feature-basedclassification for Urdu text documents started the use ofmachine learning models [28ndash30] A framework was pro-posed [31] to classify Chinese short texts into 7 kinds [32] ofemotion and product review )e event-level informationfrom the text and conceptual information from the externalknowledge base are provided as supplementary input to theneural models

A fusion of CNN and RNN models is used to classifysentences using a movie review dataset and achieved 93accuracy [33] A comparative research study of machinelearning (ML) and deep learning (DL) models is presented[25] for Urdu text classification at the document level CNNand RNN single-layermultilayer architectures are used toevaluate three different sizes of the dataset [26] )e purposeof their work was to analyze and to predict the quality ofproducts ie valuable not valuable relevant irrelevantbad good or very good [25]

Different datasets reported in state of the art ieNorthwestern Polytechnical University Urdu (NPUU)consist of 10K news articles labeled into six classes Naıvedataset including 5003 news articles consists of five classes[34] and Corpus of Urdu News Text Reuse (COUNTER)having 1200 news articles with five classes [27] A jointframework consisting of CNN and RNN layers was used forsentiment analysis [35] Stanford movie review dataset and

2 Scientific Programming

Stanford Treebank dataset were used to evaluate the per-formance of the system )eir proposed system showed933 and 892 accuracy respectively

In [35] the authors performed a supervised text clas-sification in the Urdu language by using a statistical ap-proach like Naıve Bayes and support vector machine (SVM))e classification is initiated by applying different pre-processing approaches namely stemming stop word re-moval and both stop words elimination and stemming )eexperimental results showed that the steaming process haslittle impact on improving performance On the other handthe elimination of stop words showed a positive effect onresults )e SVM outperformed the Naıve Bayes byachieving the classification accuracies of 8953 and 9334based on polynomial and radial function respectively

Similarly the SVM is also applied in the news headlinesclassification [36] in Urdu text showing a very low amount ofaccuracy improvement of 35 News headlines are a smallpiece of information that frequently does not describe thecontextual meaning of the contents In [36] the majorityvoting algorithm used for text classification in the Urdulanguage showed 94 accuracy )e classification is per-formed on seven different types of news text However thenumber of instances was very limited A dynamic neuralnetwork [37] was designed to model the sentiment ofsentences It consists of dynamic K-modeling pooling andglobal pooling over a linear sequence that performs mul-ticlass sentiment classification

A quite different task is performed [38] in which theauthors used a hybrid approach of rule-based and machinelearning-based techniques to perform the sentiment clas-sification while analyzing the Urdu script [38] at the phraselevel )e hybrid approach showed an accuracy of 3125846 and 216 using the performance metrics of recallprecision and accuracy respectively In [39] a variant ofrecurrent neural network (RNN) called long short-termmemory (LSTM) is used to overcome the weakness of bag-of-words and n-grams models and it outperformed theseconventional approaches

A neural network-based system [39] was developed toclassify events )e purpose of the system was to help thepeople in natural disasters like floods by analyzing tweets)e Markov model was used to classify and predict thelocation that showed 81 accuracy for classification tweetsas a request for help and 87 accuracy to locate the locationResearch work was conducted on life event detection andclassification ie marriage birthday and traveling etc toanticipate products and services to facilitate the people [40])e data about life events exist in a very small amountLinear regression Naıve Bayes and nearest neighbor al-gorithms were evaluated on the original dataset that was verysmall but did not show favorable results

A multiple minimal reduct extraction algorithm wasdesigned [41] by improving the quick reduct algorithm )emultiple reducts are used to generate the set of classificationrules which represent the rough set classifier To evaluate theproposed approach an Arabic corpus of 2700 documentswas used to categorize into nine classes By using multipleand single minimal reducts the proposed system showed

94 and 86 respectively Experimental results also showedthat both the K-NN and J48 algorithms outperformed re-garding classification accuracy using the dataset on hand

Table 1 depicts the summary of the related researchdiscussed previously

3 Dataset

31DataCollection Contrary to the dataset reported in stateof the art [27 34] in which no datasets were created for eventclassification we created a larger dataset specific for eventclassification Instead of focusing on a specific product [25]analysis or phrase-level sentiment analysis [38] we decidedto classify sentences into multiple event classes Instead ofusing the joint framework of CNN and RNN for sentimentanalysis [35] we evaluated the performance of deep learningmodels for multiclass event classification To collect data aPHP-based web scraper is written to crawl data from thepopular social media websites ie Geo News Channel(httpsurdugeotv) website BBC Urdu (httpswwwbbccomurdu) and Urdu point (httpswwwurdupointcomdaily) A complete post is retrieved from the website andstored in MariaDB (database) It consists of a title bodypublished date location and URL )e sample text or tweetof both languages of the South Asian countries ie Urdulanguage on Twitter and Hindi language on Facebook isshown in Figure 1

)ere are 015 million (150000) Urdu language sen-tences )e diversity of data collection sources helped us todevelop multiclass datasets It consists of twelve types ofevents )e subset of datasets can be useful for otherresearchers

32 Preprocessing In the first phase of dataset preparationwe performed some preprocessing steps ie noise removingand sentence annotationlabeling All non-Urdu wordssentences hyperlinks URLs and special symbols were re-moved It was necessary to clean out the dataset to annotatelabel the sentences properly

321 Annotation Guidelines

(1) Go through each sentence and assign a class label(2) Remove ambiguous sentences(3) Merge relevant sentences to a single class ie ac-

cident murder and death(4) Assign one of the twelve types of events ie sports

inflation murder and death terrorist attack politicslaw and order earthquake showbiz fraud andcorruption weather sexual assault and business toeach sentence

To annotate our dataset two MPhil (Urdu) level lan-guage experts were engaged )ey deeply read and analyzedthe dataset sentence by sentence before assigning eventlabels )ey recommended removing 46035 sentences fromthe dataset because those sentences would not contain in-formation that useful for event classification Finally after

Scientific Programming 3

annotation the dataset size was reduced to 103965 imbal-anced instances of twelve different types of events

)e annotation interagreement ie Cohen Kappa scoreis 093 which indicates the strong agreement between thelanguage and expert annotators )e annotated dataset isalmost perfect according to the annotation agreement score

In the second phase of preprocessing the following stepsare performed ie stop words eliminated word tokeniza-tion and sentence filtering

All those words which do not semantically contribute tothe classification process are removed as stop words ie ہو

ہریغوہریغوےسرواساںیم etc A list of standard stopwords of the Urdu language is available here (httpswwwkagglecomrtatmanurdu-stopwords-list)

After performing data cleaning and stop word removalevery sentence is tokenized into words based on white spaceAn example of sentence tokenization is given in Table 2

)e previous preprocessing step revealed that manysentences are varying in length Some sentences were soshort and many were very long We decided to define alength boundary for tokenized sentences We observed thatmany sentences exist in the dataset which have a lengthrange from 5 words to 250 words We selected sentences thatconsist of 5 words to 150 words An integer value is assignedto each type of event for all selected sentences )e detaileddescription of the different types of events and their cor-responding numeric (integer) values that are used in thedataset is also given in Table 3

In Figure 2 a few instances of the dataset after pre-processing are presented It is a comma-separated value

(CSV) file that consists of two fields ie sentence and labelie numeric value for each class (1ndash12)

In our dataset three types of events have a larger numberof instances ie sports (18746) politics (33421) and fraudand corruption (10078) contrary to three other types ofevents that have a smaller number of instances ie sexualassault (2916) inflation (3196) and earthquake (3238)

)e remaining types of events have a smaller differenceof instances among them )ere are 51814 unique words inour dataset )e visualization in Figure 3 shows that thedataset is imbalanced

4 Methodology

We analyzed the performance of deep learning ie deepneural network convolutional neural network and recur-rent neural network along with other machine learningclassifiers ie K-nearest neighbor decision tree randomforest support vector machine Naıve Bayes multinominaland linear regression

)e Urdu news headlines contain insufficient informa-tion ie few numbers of words and lack of contextualinformation to classify the events [29] However compar-atively to news headlines the sentences written in informalway contain more information )e sentence-level classifi-cation is performed using deep learning models instead ofonly machine learning algorithms )e majority voting al-gorithm outperforms on a limited number of instances forseven classes It showed 94 [36] accuracy but in our workmore than 015 million instances which are labeled intotwelve classes are used for classification

)ere exist several approaches to extract useful infor-mation from a large amount of data )ree common ap-proaches are rule-based a machine learning approach andhybrid approaches [42] )e selection of methodology istightly coupled with the research problem In our problemwe decided to use machine learning (traditional machinelearning and deep learning approaches) classifiers Sometraditional machine learning algorithms ie K-nearestneighbor (KNN) random forest (RF) support vector ma-chine (SVM) decision tree (DT) and multinomial NaıveBayes (MNB) are evaluated for multiclass eventclassification

Deep learning models ie convolutional neural network(CNN) deep neural network (DNN) and recurrent neuralnetwork (RNN) are also evaluated for multiclass eventclassification

Table 1 Summary of the related research

Paper reference Classifier used Dataset Accuracy ()[33] CNN and RNN Movie reviews 92

[34] CNN and RNN 1 Stanford movie review dataset2 Stanford Treebank dataset 933 and 892

[35] Naıve Bayes and SVM Corpus of Urdu documents 8953 and 9334[36] Dynamic neural network News articles 965[38] Rule-based modeling Urdu corpus of news headlines 3125[39] LSTM Tweets 8100[41] K-NN and J48 Arabic corpus of 2700 documents 95 and 86

Figure 1 Urdu and Hindi language text on social media

4 Scientific Programming

A collection of Urdu text documentsD d1 d2 dn1113864 1113865 is split into a set of sentencesS s1 s2 sn1113864 1113865 Our purpose is to classify the sentences toa predefined set of events E e1 e2 en1113864 1113865

Various feature generating methods are used to create afeature vector for deep learning and machine learning classi-fiers ie TF-IDF one-hot-encoding and word embeddingFeature vectors generated by all these techniques are fed up asinput into the embedding layer of neural networks )e outputgenerated by the embedding layers is fed up into the next fullyconnected layer (dense layer) of deep learning models ieRNN CNN and DNN A relevant class label out of twelvecategories is assigned to each sentence at the end of modelprocessing in the testingvalidation phase

Bag-of-words is a common method to represent text Itignores the sequence order and semantic of text [43] whilethe one-hot-coding method maintains the sequence of textWord embedding methods Word2Vec and Glove (httpsybbaigogitbooksio26pretrained-word-embeddingshtml)that are used to generate feature vectors for deep learningmodels are highly recommended for textual data Howeverin the case of Urdu text classification pre-existing wrod2vecand Glove are incompatible

)e framework of our designed system is represented inFigure 4 It shows the structure of our system from takinginput to producing output

5 Experimental Setup

We performed many experiments on our dataset by usingvarious traditional machine learning and deep learningclassifiers )e purpose of many experiments is to find themost efficient and accurate classification model for themulticlass event on an imbalance dataset for the Urdulanguage text A detailed comparison between traditionalclassifiers and deep neural classifiers is given in the nextsection

51 Feature Space Unigram and bigram tokens of the wholecorpus are used as features to create the feature space TF-IDF vectorization is used to create a dictionary-based modelIt consists of 656608 features )e training and testingdataset are converted to TF-IDF dictionary-based featurevectors A convolutional sequential model (see Figure 5)consists of three layers ie the input layer hidden layer andoutput layer which are used to evaluate our dataset Sim-ilarly word embedding and one-hot-encoding are also in-cluded in our feature space to enlarge the scope of ourresearch problem

52 Feature Vector Generating Techniques Feature vectorsare the numerical representation of text )ey are an actualform of input that can be processed by the machine learningclassifier )ere are several feature generating techniquesused for text processing We used the following featurevector generating techniques

521 Word Embedding A numerical representation of thetext is that each word is considered as a feature vector Itcreates a dense vector of real values that captures thecontextual semantical and syntactical meaning of the wordIt also ensures that similar words should have a relatedweighted value [29]

522 Pretrained Word Embedding Models Usage of apretrained word embedding model for the small amount ofdata is highly recommended by researchers in state of the artGlove and Word2Vec are famous word embedding modelsthat are developed by using a big amount of data Wordembedding models for text classification especially in theEnglish language showed promising results It has emergedas a powerful feature vector generating technique amongothers ie TF TF-IDF and one-hot encoding etc

In our research case sentence classification for differentevents in the Urdu language using the word embeddingtechnique is potentially preferable Unfortunately the Urdulanguage is lacking in processing resources We found onlythree word embedding models a word embedding model

Table 2 Sentence tokenization

Sentence Tokenized sentenceیلےلناجیکںوگولددعتمےنسرئاوانورک یل ےل ناج ںوگول ددعتم سرئاو انورک

یئگرگتھچیکںورھگیئکےسشرابینافوط ئگ رگ تھچ ںورھگ یئک شراب ینافوط

Table 3 Event label

Event LabelSports 1Inflation 2Murder and death 3Terrorist attack 4Politics 5Law and order 6Earthquake 7Showbiz 8Fraud and corruption 9Rainweather 10Sexual assault 11

Figure 2 )e few instances of the dataset

Scientific Programming 5

0

5000

10000

15000

20000

25000

30000

35000

40000To

tal n

umbe

r of i

nsta

nces

Imbalance dataset of the urdu label sentences for events

Type of events

18746

31966932

3034

33421

69603238

741710078

3406 2916 3617

Spor

ts

Infla

tion

Mur

der

Terr

orist

atta

ck

Polit

ics

Law

and

orde

r

Eart

hqua

ke

Show

biz

Frau

d an

dco

rrup

tion

Wea

ther

Sexu

al as

saul

t

Busin

ess

Figure 3 Imbalance instances of the dataset

Raw input Preprocessed input Word embedding layersDLM

RNN (LSTM-unidirection)

CNN

DNN

LabelsTF-IDF

Documents1

Documents2

Documents3

Documents

n

1

2

3

12

Figure 4 Methodology

00

050

055

060

065

070

075

080

085

05 10 15 20 25 30 35 40Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 5 DDNrsquos accuracy

6 Scientific Programming

[44] that is developed by using three publicly available Urdudatasets Wikipediarsquos Urdu text another corpus having 90million tokens [45] and 35 million tokens [46] It has 102214unique tokens Each token comprises 300-dimensional realvalues Another model publicly available for research purposesconsists of 25925 unique words of Urdu language [47] Everyword has a 400-dimensional value A word embedding modelcomprises web-based text created to classify text It consists of64653 unique Urdu words and 300 dimensions for each word

)e journey of research is not over here to expand ourresearch scope and find the most efficient word embeddingmodel for sentence classification we decided to developcustom word embedding models We developed four wordembedding models that contain 57251 unique words

)e results of pretrained existing word embedding modelsare good at the initial level but very low ie 6026 accuracyWe explored the contents of these models which revealed thatmanywords are irrelevant and borrowed from other languagesie Arabic and Persian )e contents of Wikipedia are entirelydifferent than news websites that also affect the performance ofembedding models Another major factor ie low amount ofdata affected the feature vector generation quality Stop wordsin the pretrained word embedding model are not eliminatedand considered as a token while in our dataset all the stopwords are removed It also reduces the size of the vocabulary ofthe model while generating a feature vector )erefore wedecided to develop a custom word embedding model on ourpreprocessed dataset To postulate the enlargement of theresearch task three different word embedding models aredeveloped )e details of all used pretrained word embeddingmodels are given in Table 4

523 One-Hot-Encoding Text cannot be processed directlybymachine learning classifiers therefore we need to convertthe text into a real value We used one-hot-encoding toconvert text to numeric features For example the sentencesgiven in Table 5 can be converted into a numeric featurevector using one-hot-encoding as shown in Table 6

524 TF-IDF TF and TF-IDF are feature engineeringtechniques that transform the text into the numerical for-mat It is one of the most highly used feature vectors forcreating a method for text data )ree deep learning modelswere evaluated on our corpus )e sequential model withembedding layers outperformed other pretrained wordembedding models [44] reported in state of the art [48] )edetailed summary of the evaluation results of CNN RNNand DNN is discussed in Section 7

53 Deep Learning Models

531 Deep Neural Network Architecture Our DNN archi-tecture consists of three layers ie n-input layer 150 hidden(dense) layers and 12 output layers Feature vector is givenas input into a dense layer that is fully connected )eSoftMax activation function is used in the output layer toclassify sentences into multiple classes

532 Recurrence Neural Network )e recurrence neuralnetwork is evaluated using a long short-term memory(LSTM) classifier RNN consists of embedding dropoutLSTM and dense layers A dictionary of 30000 unique mostfrequent tokens is made )e sentences are standardized tothe same length by using a padding sequence)e dimensionof the feature vector is set as 250 RNN showed an overall81 accuracy that is the second highest in our work

533 Convolutional Neural Network (CNN) CNN is a classof deep neural networks that are highly recommended forimage processing [49] It consists of the input layer (em-bedding layer) multiple hidden layers and an output layer)ere are a series of convolutional layers that convolve witha multiplication )e embedded sequence layer and averagelayer (GloobalAveragePooling1D) are also part of the hiddenlayer )e common activation of CNN is RELU Layer )edetails of the hypermeters that are used in our problem totrain the CNN model are given in Table 7

534 Hyperparameters In this section all the hyper-parameters that are used in our experiments are given in thetabular format Only those hyperparameters are being dis-cussed here which have achieved the highest accuracy ofDNN RNN and CNN models )e hyperparameters ofDNN that are fine-tuned in our work are given in Table 8

)e RNN model showed the highest accuracy (803and 81) on two sets of hyperparameters that are given inTable 9 Similarly Table 7 provides the details of thehyperparameters of the convolutional neural network

6 Performance Measuring Parameters

)emost common performance measuring [41] parametersie precision recall and F1-measure are used to evaluatethe proposed framework )e selection of these parameterswas decided because of the multiclass classification andimbalance dataset

Precision TP

(TP + FP) (1)

Recall TP

(TP + FN) (2)

F1 2lowast PrecisionlowastRecall( 1113857

(Precision + Recall) (3)

Accuracy (TP + TN)

(TP + TN + FP + FN) (4)

where TP TN FP and FN represent total positive totalnegative false positive and false negative values respec-tively Precision is defined as the closeness of the mea-surements to each other and recall is the ratio of the totalamount of relevant (ie TP values) instances that wereactually retrieved during the experimental work It is

Scientific Programming 7

noteworthy that both precision and recall are the relativevalues of measure of relevance

7 Results

71 Deep Learning Classifiers )e feature vector can begenerated using different techniques )e details of feature

vector generating techniques were discussed in Section 5)e results of feature vector generating techniques that wereused in our work ie ldquomulticlass event classification for theUrdu language textrdquo are given in the proceedingsubsections

711 Pretrained Word Embedding Models )e convolu-tional neural network model is evaluated on the featuresvectors that were generated by all pretrained word em-bedding models )e summary of all results generated by

Table 4 Pretrained word embedding model and custom word embedding model

Sr no Unique words Dimension Window sizeExisting pretrained word embedding models1 [11] 64653 300 mdash2 [19] 102214 100 mdash3 53454 300 mdashCustom pretrained word embedding models1 57251 50 22 57251 100 23 57251 100 34 57251 350 1

Table 5 Event sentence

Urdu sentence English sentenceےہاتلیھکلابٹفیلع Ali plays football

یلےلناجیکںوگولںوھکالےنسرئاوانورک Corona virus killed millions of people

Table 6 Event sentence converted using one-hot-encoding

Sentence اتلیھک لاب ٹف یلع ناج ںوگول ںوھکال سرئاو انورک1 1 1 1 1 0 0 0 0 02 0 0 0 0 1 1 1 1 1

Table 7 CNNrsquos hyperparameters

CNN (7928)Parameter ValueMax_words 20000Batch size 128Embedding_dim 50Activation function SoftMaxDense_node 256Trainingtesting 70ndash30No of epochs 20Loss function Categorical cross-entropy

Table 8 DNNrsquos hyperparameters

Parameter ValueMax_words 5000Batch size 128Embedding_dim 512Activation function SoftMaxLayers 04Trainingtesting 70ndash30No of epochs 15Loss function Sparse categorical cross-entropy

Table 9 RNNrsquos hyperparameters

Parameter ValueRNN (LSTM) (803)Max_words 50000Batch size 64Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 90ndash10No of epochs 05Loss function Sparse categorical cross-entropyRNN (LSTM) (81)Max_words 30000Batch size 128Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 80ndash20No of epochs 05Loss function Sparse categorical cross-entropy

8 Scientific Programming

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017

[15] M P Akhter Z Jiangbin I R Naqvi M AbdelmajeedA Mehmood and M T Sadiq ldquoDocument-level text clas-sification using single-layer multisize filters convolutionalneural networkrdquo IEEE Access vol 8 pp 42689ndash42707 2021

[16] U Pal and A Sarkar ldquoRecognition of printed Urdu scriptrdquo inProceedings of the 2003 Seventh International Conference onDocument Analysis and Recognition pp 1183ndash1187 IEEEEdinburgh Scotland August 2003

[17] Y Yang T Pierce and J Carbonell ldquoA study of retrospectiveand on-line event detectionrdquo in Proceedings of the 21st annualinternational ACM SIGIR conference on Research and

Scientific Programming 13

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T Kala ldquoEvent detection from text datardquo ComputationalIntelligence vol 31 pp 132ndash164 2015

[19] M Naughton N Stokes and J Carthy ldquoSentence-level eventclassification in unstructured textsrdquo Information Retrievalvol 13 no 2 pp 132ndash156 2010

[20] G Jacobs E Lefever and V Hoste ldquoEconomic event de-tection in company-specific news textrdquo in Proceedings of theFirst Workshop on Economics and Natural Language Pro-cessing pp 1ndash10 Melbourne Australia July 2018

[21] E DrsquoAndrea P Ducange A Bechini A Renda andF Marcelloni ldquoMonitoring the public opinion about thevaccination topic from tweets analysisrdquo Expert Systems withApplications vol 116 pp 209ndash226 2019

[22] M Sokolova and G Lapalme ldquoA systematic analysis ofperformance measures for classification tasksrdquo InformationProcessing amp Management vol 45 no 4 pp 427ndash437 2009

[23] T Mikolov I Sutskever K Chen G S Corrado and J DeanldquoDistributed representations of words and phrases and theircompositionalityrdquo in Proceedings Advances Neural Informa-tion Processing Systems vol 26 pp 3111ndash3119 Lake TahoeNV USA December 2013

[24] Y Bengio R Ducharme P Vincent and C Jauvin ldquoA neuralprobabilistic language modelrdquo Journal of Machine LearningResearch vol 3 pp 1137ndash1155 2003

[25] M P Akhter Z Jiangbin I R Naqvi M Abdelmajeed andM Fayyaz ldquoExploring deep learning approaches for Urdu textclassification in product manufacturingrdquo Enterprise Infor-mation Systems pp 1ndash26 2020

[26] G Liu and J Guo ldquoBidirectional LSTM with attentionmechanism and convolutional layer for text classificationrdquoNeurocomputing vol 337 pp 325ndash338 2019

[27] M Sharjeel R M A Nawab and P Rayson ldquoCOUNTERcorpus of Urdu news text reuserdquo Language Resources andEvaluation vol 51 no 3 pp 777ndash803 2017

[28] K Mehmood D Essam and K Shafi ldquoSentiment analysissystem for Roman Urdurdquo in Proceedings of the 2018 Scienceand Information Conference pp 29ndash42 Springer CasablancaMorocco July 2018

[29] K Ahmed M Ali S Khalid and M Kamran ldquoFramework forUrdu news headlines classificationrdquo Journal of Applied ComputerScience amp Mathematics vol 10 no 1 pp 17ndash21 2016

[30] Z Tehseen M P Akhter and Q Abbas ldquoComparative studyof feature selection approaches for Urdu text categorizationrdquoMalaysian Journal Computer Science vol 28 no 2 pp 93ndash109 2015

[31] W Yin and L Shen ldquoA short text classification approach withevent detection and conceptual informationrdquo in Proceedingsof the 2020 5th International Conference on Machine LearningTechnologies pp 129ndash135 Beijing China June 2020

[32] H Zhou M Huang T Zhang et al ldquoEmotional chattingmachine emotional conversation generation with internaland external memoryrdquo in Proceedings of the irty-SecondAAAI Conference on Artificial Intelligence New Orleans LAUSA February 2018

[33] A Hassan and A Mahmood ldquoConvolutional recurrent deeplearningmodel for sentence classificationrdquo IEEE Access vol 6pp 13949ndash13957 2018

[34] T Zia M P Akhter and Q Abbas ldquoComparative study offeature selection approaches for Urdu text categorizationrdquoMalaysian Journal of Computer Science vol 28 no 2pp 93ndash109 2015

[35] A R Ali andM Ijaz ldquoUrdu text classificationrdquo in Proceedingsof the 7th international conference on frontiers of informationtechnology pp 1ndash7 Abbottabad Pakistan December 2009

[36] M Usman Z Shafique S Ayub and K Malik ldquoUrdu textclassification using majority votingrdquo International Journal ofAdvanced Computer Science and Applications vol 7 no 8pp 265ndash273 2016

[37] N Kalchbrenner E Grefenstette and P Blunsom ldquoA con-volutional neural network for modelling sentencesrdquo 2014httparxivorgabs14042188

[38] D M Awais and DM Shoaib ldquoRole of discourse informationin Urdu sentiment classificationrdquo ACMTransactions on Asianand Low-Resource Language Information Processing vol 18no 4 pp 1ndash37 2019

[39] J P Singh Y K Dwivedi N P Rana A Kumar andK K Kapoor ldquoEvent classification and location predictionfrom tweets during disastersrdquo Annals of Operations Researchvol 283 no 1-2 pp 737ndash757 2019

[40] R C Paulo D Fillipe and M S C Sergo ldquoClassification ofevents on social mediardquo 2016

[41] Q A Al-Radaideh and M A Al-Abrat ldquoAn Arabic textcategorization approach using term weighting and multiplereductsrdquo Soft Computing vol 23 no 14 pp 5849ndash5863 2019

[42] J F Allen ldquoMaintaining knowledge about temporal inter-valsrdquo Communications of the ACM vol 26 no 11pp 832ndash843 1983

[43] T Joachims ldquoText categorization with support vector ma-chines learning with many relevant featuresrdquo in Proceedingsof the European conference on machine learning pp 137ndash142Springer Chemnitz Germany April 1998

[44] S Haider ldquoUrdu word embeddingsrdquo in Proceedings of theEleventh International Conference on Language Resources andEvaluation (LREC 2018) Miyazaki Japan May 2018

[45] B Jawaid A Kamran and O Bojar ldquoUrdu monolingual corpusrdquoLINDATCLARIN Digital Library at the Institute of Formal andApplied Linguistics Charles University Prague Czechia

[46] F Adeeba Q Akram H Khalid and S Hussain ldquoCLE Urdubooks n-gramsrdquo in Proceedings of the Conference on Languageand Technology CLT 14 Karachi Pakistan May 2014

[47] A Hassan and A Mahmood ldquoDeep learning for sentenceclassificationrdquo in Proceedings of the 2017 IEEE Long IslandSystems Applications and Technology Conference (LISAT)pp 1ndash5 IEEE New York NY USA May 2017

[48] D-X Zhou ldquoUniversality of deep convolutional neuralnetworksrdquo Applied and Computational Harmonic Analysisvol 48 no 2 pp 787ndash794 2020

[49] M V Valueva N N Nagornov P A Lyakhov G V Valuevand N I Chervyakov ldquoApplication of the residue numbersystem to reduce hardware costs of the convolutional neuralnetwork implementationrdquo Mathematics and Computers inSimulation vol 177 pp 232ndash243 2020

[50] G Guo H Wang D Bell Y Bi and K Greer ldquoKNN model-based approach in classificationrdquo in Proceedings of the OTMConfederated International Conferences ldquoOn the Move toMeaningful Internet Systemsrdquo Catania Italy November 2003

[51] Y Zhong ldquo)e analysis of cases based on decision treerdquo inProceedings of the 2016 7th IEEE international conference onsoftware engineering and service science (ICSESS) pp 142ndash147IEEE Beijing China August 2016

[52] S Xu ldquoBayesian Naıve Bayes classifiers to text classificationrdquoJournal of Information Science vol 44 no 1 pp 48ndash59 2018

[53] T Zhang and F Oles ldquoText categorization based on regu-larized linear classification methodsrdquo Information Retrievalvol 4 no 1 pp 5ndash31 2001

14 Scientific Programming

[54] J Ali R Khan N Ahmad and I Maqsood ldquoRandom forestsand decision treesrdquo International Journal of Computer ScienceIssues (IJCSI) vol 9 no 5 p 272 2012

[55] Y Zhang ldquoSupport vector machine classification algorithmand its applicationrdquo in Proceedings of the InternationalConference on Information Computing and Applicationspp 179ndash186 Springer Bhubaneswar India September 2012

Scientific Programming 15

Page 2: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

(3) Diacritics(4) Free word order [15]

)e Urdu language has 38 basic characters which can bewritten as joined and non-joined with other characters [16])e words having joined characters of Urdu alphabet set arecalled ligature and this joining feature of the alphabets madepossible to enrich the Urdu vocabulary having almost 24000ligatures [15 16] It is pertinent to mention that this alphabetset is also considered as a superset of all Urdu script-basedlanguages alphabets namely the Arabic and Persian whichcontain 28 and 32 alphabets respectively Furthermorethere are also some additional alphabets in Urdu script thatare used to express some Hindi phonemes [15 16]

12 EventClassification An event can be defined as ldquospecificactions situations or happenings occurring in a certainperiod [17 18]rdquo )e extracted information can representdifferent types of events ie sports politics terrorist attacksand inflation etc information can be detected and classifiedat a different level of granularity ie document level [19]sentence level [20] word level character level and phraselevel [21]

Event classification is an automated way to assign apredefined label to new instances It is pertinent to describethat the classification can be binary multiclass and mul-tilabel [22]

)e implementation of a neural network for text clas-sification provided help to handle a complex and largeamount of data [23] Semantically similar words are used togenerate feature vectors [24] that eliminate the sparsity ofn-grams models Urdu text classification is performed [25]to assess the quality of the product based on comments andfeedback In [25] an embedded layer of the neural networkwas used to convert text into numeric values and classifi-cation performed at the document level Contrary to [25]multiclass event classification is performed at the sentencelevel instead of the document level We further performedmultiple experiments to develop an efficient classificationsystem using TF-IDF one-hot-encoding pretrained Urduword embedding model and by creating custome pretrainedUrdu language word embedding models

13 Event Classification Challenges )e lack of processingresources ie part-of-speech (PoS) tagger name entityrecognizer and annotation tools is the other major hurdleto perform the event detection and classification for theUrdu language Many people are unfamiliar with themeaning and usage of some Urdu words It creates se-mantically ambiguous content that makes the event classi-fication process a nontrivial and challenging task )eunavailability of appropriate resourcesdatasets is anothermajor challenge for data-driven and knowledge-based ap-proaches to extract events and classify events

Our contributions are given as follows

(1) )e first-ever large-scale labeled Urdu dataset forevent classification that is the biggest in terms of

instances [15] and classes [25] in other Urdu textdatasets reported in state of the art [19 26 27]

(2) To our best knowledge it is the first multiclass eventclassification task at sentence level for the Urdulanguage

(3) Different feature vector generating methods ieone-hot-encoding word embedding and TF-IDFare used to evaluate the performance of DNN CNNand RNN deep learning models

(4) Pretrained and custom word embedding models forthe Urdu language are also explored

(5) Performance comparison of traditional machinelearning classifiers and deep learning classifiers

In this paper we performed a multiclass event classifi-cation on an imbalance dataset of Urdu language text Ourframework is a design to classify twelve different types ofevents ie sports inflation politics casualties law andorder terrorist attack sexual assault fraud (and corruption)showbiz business weather and earthquake Furthermorewe also presented a detailed comparative analysis of differentdeep learning algorithms ie long short-term memory(LSTM) and convolutional neural network (CNN) using TF-IDF one-hot-coding and word embedding methods Wealso compared the results of traditional machine learningclassifiers with deep learning classifiers

2 Related Work

In the past researchers were impassive in the Urdu languagebecause of limited processing resources ie datasets an-notators part-of-speech (PoS) taggers and translators [14]etc However now since the last few years feature-basedclassification for Urdu text documents started the use ofmachine learning models [28ndash30] A framework was pro-posed [31] to classify Chinese short texts into 7 kinds [32] ofemotion and product review )e event-level informationfrom the text and conceptual information from the externalknowledge base are provided as supplementary input to theneural models

A fusion of CNN and RNN models is used to classifysentences using a movie review dataset and achieved 93accuracy [33] A comparative research study of machinelearning (ML) and deep learning (DL) models is presented[25] for Urdu text classification at the document level CNNand RNN single-layermultilayer architectures are used toevaluate three different sizes of the dataset [26] )e purposeof their work was to analyze and to predict the quality ofproducts ie valuable not valuable relevant irrelevantbad good or very good [25]

Different datasets reported in state of the art ieNorthwestern Polytechnical University Urdu (NPUU)consist of 10K news articles labeled into six classes Naıvedataset including 5003 news articles consists of five classes[34] and Corpus of Urdu News Text Reuse (COUNTER)having 1200 news articles with five classes [27] A jointframework consisting of CNN and RNN layers was used forsentiment analysis [35] Stanford movie review dataset and

2 Scientific Programming

Stanford Treebank dataset were used to evaluate the per-formance of the system )eir proposed system showed933 and 892 accuracy respectively

In [35] the authors performed a supervised text clas-sification in the Urdu language by using a statistical ap-proach like Naıve Bayes and support vector machine (SVM))e classification is initiated by applying different pre-processing approaches namely stemming stop word re-moval and both stop words elimination and stemming )eexperimental results showed that the steaming process haslittle impact on improving performance On the other handthe elimination of stop words showed a positive effect onresults )e SVM outperformed the Naıve Bayes byachieving the classification accuracies of 8953 and 9334based on polynomial and radial function respectively

Similarly the SVM is also applied in the news headlinesclassification [36] in Urdu text showing a very low amount ofaccuracy improvement of 35 News headlines are a smallpiece of information that frequently does not describe thecontextual meaning of the contents In [36] the majorityvoting algorithm used for text classification in the Urdulanguage showed 94 accuracy )e classification is per-formed on seven different types of news text However thenumber of instances was very limited A dynamic neuralnetwork [37] was designed to model the sentiment ofsentences It consists of dynamic K-modeling pooling andglobal pooling over a linear sequence that performs mul-ticlass sentiment classification

A quite different task is performed [38] in which theauthors used a hybrid approach of rule-based and machinelearning-based techniques to perform the sentiment clas-sification while analyzing the Urdu script [38] at the phraselevel )e hybrid approach showed an accuracy of 3125846 and 216 using the performance metrics of recallprecision and accuracy respectively In [39] a variant ofrecurrent neural network (RNN) called long short-termmemory (LSTM) is used to overcome the weakness of bag-of-words and n-grams models and it outperformed theseconventional approaches

A neural network-based system [39] was developed toclassify events )e purpose of the system was to help thepeople in natural disasters like floods by analyzing tweets)e Markov model was used to classify and predict thelocation that showed 81 accuracy for classification tweetsas a request for help and 87 accuracy to locate the locationResearch work was conducted on life event detection andclassification ie marriage birthday and traveling etc toanticipate products and services to facilitate the people [40])e data about life events exist in a very small amountLinear regression Naıve Bayes and nearest neighbor al-gorithms were evaluated on the original dataset that was verysmall but did not show favorable results

A multiple minimal reduct extraction algorithm wasdesigned [41] by improving the quick reduct algorithm )emultiple reducts are used to generate the set of classificationrules which represent the rough set classifier To evaluate theproposed approach an Arabic corpus of 2700 documentswas used to categorize into nine classes By using multipleand single minimal reducts the proposed system showed

94 and 86 respectively Experimental results also showedthat both the K-NN and J48 algorithms outperformed re-garding classification accuracy using the dataset on hand

Table 1 depicts the summary of the related researchdiscussed previously

3 Dataset

31DataCollection Contrary to the dataset reported in stateof the art [27 34] in which no datasets were created for eventclassification we created a larger dataset specific for eventclassification Instead of focusing on a specific product [25]analysis or phrase-level sentiment analysis [38] we decidedto classify sentences into multiple event classes Instead ofusing the joint framework of CNN and RNN for sentimentanalysis [35] we evaluated the performance of deep learningmodels for multiclass event classification To collect data aPHP-based web scraper is written to crawl data from thepopular social media websites ie Geo News Channel(httpsurdugeotv) website BBC Urdu (httpswwwbbccomurdu) and Urdu point (httpswwwurdupointcomdaily) A complete post is retrieved from the website andstored in MariaDB (database) It consists of a title bodypublished date location and URL )e sample text or tweetof both languages of the South Asian countries ie Urdulanguage on Twitter and Hindi language on Facebook isshown in Figure 1

)ere are 015 million (150000) Urdu language sen-tences )e diversity of data collection sources helped us todevelop multiclass datasets It consists of twelve types ofevents )e subset of datasets can be useful for otherresearchers

32 Preprocessing In the first phase of dataset preparationwe performed some preprocessing steps ie noise removingand sentence annotationlabeling All non-Urdu wordssentences hyperlinks URLs and special symbols were re-moved It was necessary to clean out the dataset to annotatelabel the sentences properly

321 Annotation Guidelines

(1) Go through each sentence and assign a class label(2) Remove ambiguous sentences(3) Merge relevant sentences to a single class ie ac-

cident murder and death(4) Assign one of the twelve types of events ie sports

inflation murder and death terrorist attack politicslaw and order earthquake showbiz fraud andcorruption weather sexual assault and business toeach sentence

To annotate our dataset two MPhil (Urdu) level lan-guage experts were engaged )ey deeply read and analyzedthe dataset sentence by sentence before assigning eventlabels )ey recommended removing 46035 sentences fromthe dataset because those sentences would not contain in-formation that useful for event classification Finally after

Scientific Programming 3

annotation the dataset size was reduced to 103965 imbal-anced instances of twelve different types of events

)e annotation interagreement ie Cohen Kappa scoreis 093 which indicates the strong agreement between thelanguage and expert annotators )e annotated dataset isalmost perfect according to the annotation agreement score

In the second phase of preprocessing the following stepsare performed ie stop words eliminated word tokeniza-tion and sentence filtering

All those words which do not semantically contribute tothe classification process are removed as stop words ie ہو

ہریغوہریغوےسرواساںیم etc A list of standard stopwords of the Urdu language is available here (httpswwwkagglecomrtatmanurdu-stopwords-list)

After performing data cleaning and stop word removalevery sentence is tokenized into words based on white spaceAn example of sentence tokenization is given in Table 2

)e previous preprocessing step revealed that manysentences are varying in length Some sentences were soshort and many were very long We decided to define alength boundary for tokenized sentences We observed thatmany sentences exist in the dataset which have a lengthrange from 5 words to 250 words We selected sentences thatconsist of 5 words to 150 words An integer value is assignedto each type of event for all selected sentences )e detaileddescription of the different types of events and their cor-responding numeric (integer) values that are used in thedataset is also given in Table 3

In Figure 2 a few instances of the dataset after pre-processing are presented It is a comma-separated value

(CSV) file that consists of two fields ie sentence and labelie numeric value for each class (1ndash12)

In our dataset three types of events have a larger numberof instances ie sports (18746) politics (33421) and fraudand corruption (10078) contrary to three other types ofevents that have a smaller number of instances ie sexualassault (2916) inflation (3196) and earthquake (3238)

)e remaining types of events have a smaller differenceof instances among them )ere are 51814 unique words inour dataset )e visualization in Figure 3 shows that thedataset is imbalanced

4 Methodology

We analyzed the performance of deep learning ie deepneural network convolutional neural network and recur-rent neural network along with other machine learningclassifiers ie K-nearest neighbor decision tree randomforest support vector machine Naıve Bayes multinominaland linear regression

)e Urdu news headlines contain insufficient informa-tion ie few numbers of words and lack of contextualinformation to classify the events [29] However compar-atively to news headlines the sentences written in informalway contain more information )e sentence-level classifi-cation is performed using deep learning models instead ofonly machine learning algorithms )e majority voting al-gorithm outperforms on a limited number of instances forseven classes It showed 94 [36] accuracy but in our workmore than 015 million instances which are labeled intotwelve classes are used for classification

)ere exist several approaches to extract useful infor-mation from a large amount of data )ree common ap-proaches are rule-based a machine learning approach andhybrid approaches [42] )e selection of methodology istightly coupled with the research problem In our problemwe decided to use machine learning (traditional machinelearning and deep learning approaches) classifiers Sometraditional machine learning algorithms ie K-nearestneighbor (KNN) random forest (RF) support vector ma-chine (SVM) decision tree (DT) and multinomial NaıveBayes (MNB) are evaluated for multiclass eventclassification

Deep learning models ie convolutional neural network(CNN) deep neural network (DNN) and recurrent neuralnetwork (RNN) are also evaluated for multiclass eventclassification

Table 1 Summary of the related research

Paper reference Classifier used Dataset Accuracy ()[33] CNN and RNN Movie reviews 92

[34] CNN and RNN 1 Stanford movie review dataset2 Stanford Treebank dataset 933 and 892

[35] Naıve Bayes and SVM Corpus of Urdu documents 8953 and 9334[36] Dynamic neural network News articles 965[38] Rule-based modeling Urdu corpus of news headlines 3125[39] LSTM Tweets 8100[41] K-NN and J48 Arabic corpus of 2700 documents 95 and 86

Figure 1 Urdu and Hindi language text on social media

4 Scientific Programming

A collection of Urdu text documentsD d1 d2 dn1113864 1113865 is split into a set of sentencesS s1 s2 sn1113864 1113865 Our purpose is to classify the sentences toa predefined set of events E e1 e2 en1113864 1113865

Various feature generating methods are used to create afeature vector for deep learning and machine learning classi-fiers ie TF-IDF one-hot-encoding and word embeddingFeature vectors generated by all these techniques are fed up asinput into the embedding layer of neural networks )e outputgenerated by the embedding layers is fed up into the next fullyconnected layer (dense layer) of deep learning models ieRNN CNN and DNN A relevant class label out of twelvecategories is assigned to each sentence at the end of modelprocessing in the testingvalidation phase

Bag-of-words is a common method to represent text Itignores the sequence order and semantic of text [43] whilethe one-hot-coding method maintains the sequence of textWord embedding methods Word2Vec and Glove (httpsybbaigogitbooksio26pretrained-word-embeddingshtml)that are used to generate feature vectors for deep learningmodels are highly recommended for textual data Howeverin the case of Urdu text classification pre-existing wrod2vecand Glove are incompatible

)e framework of our designed system is represented inFigure 4 It shows the structure of our system from takinginput to producing output

5 Experimental Setup

We performed many experiments on our dataset by usingvarious traditional machine learning and deep learningclassifiers )e purpose of many experiments is to find themost efficient and accurate classification model for themulticlass event on an imbalance dataset for the Urdulanguage text A detailed comparison between traditionalclassifiers and deep neural classifiers is given in the nextsection

51 Feature Space Unigram and bigram tokens of the wholecorpus are used as features to create the feature space TF-IDF vectorization is used to create a dictionary-based modelIt consists of 656608 features )e training and testingdataset are converted to TF-IDF dictionary-based featurevectors A convolutional sequential model (see Figure 5)consists of three layers ie the input layer hidden layer andoutput layer which are used to evaluate our dataset Sim-ilarly word embedding and one-hot-encoding are also in-cluded in our feature space to enlarge the scope of ourresearch problem

52 Feature Vector Generating Techniques Feature vectorsare the numerical representation of text )ey are an actualform of input that can be processed by the machine learningclassifier )ere are several feature generating techniquesused for text processing We used the following featurevector generating techniques

521 Word Embedding A numerical representation of thetext is that each word is considered as a feature vector Itcreates a dense vector of real values that captures thecontextual semantical and syntactical meaning of the wordIt also ensures that similar words should have a relatedweighted value [29]

522 Pretrained Word Embedding Models Usage of apretrained word embedding model for the small amount ofdata is highly recommended by researchers in state of the artGlove and Word2Vec are famous word embedding modelsthat are developed by using a big amount of data Wordembedding models for text classification especially in theEnglish language showed promising results It has emergedas a powerful feature vector generating technique amongothers ie TF TF-IDF and one-hot encoding etc

In our research case sentence classification for differentevents in the Urdu language using the word embeddingtechnique is potentially preferable Unfortunately the Urdulanguage is lacking in processing resources We found onlythree word embedding models a word embedding model

Table 2 Sentence tokenization

Sentence Tokenized sentenceیلےلناجیکںوگولددعتمےنسرئاوانورک یل ےل ناج ںوگول ددعتم سرئاو انورک

یئگرگتھچیکںورھگیئکےسشرابینافوط ئگ رگ تھچ ںورھگ یئک شراب ینافوط

Table 3 Event label

Event LabelSports 1Inflation 2Murder and death 3Terrorist attack 4Politics 5Law and order 6Earthquake 7Showbiz 8Fraud and corruption 9Rainweather 10Sexual assault 11

Figure 2 )e few instances of the dataset

Scientific Programming 5

0

5000

10000

15000

20000

25000

30000

35000

40000To

tal n

umbe

r of i

nsta

nces

Imbalance dataset of the urdu label sentences for events

Type of events

18746

31966932

3034

33421

69603238

741710078

3406 2916 3617

Spor

ts

Infla

tion

Mur

der

Terr

orist

atta

ck

Polit

ics

Law

and

orde

r

Eart

hqua

ke

Show

biz

Frau

d an

dco

rrup

tion

Wea

ther

Sexu

al as

saul

t

Busin

ess

Figure 3 Imbalance instances of the dataset

Raw input Preprocessed input Word embedding layersDLM

RNN (LSTM-unidirection)

CNN

DNN

LabelsTF-IDF

Documents1

Documents2

Documents3

Documents

n

1

2

3

12

Figure 4 Methodology

00

050

055

060

065

070

075

080

085

05 10 15 20 25 30 35 40Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 5 DDNrsquos accuracy

6 Scientific Programming

[44] that is developed by using three publicly available Urdudatasets Wikipediarsquos Urdu text another corpus having 90million tokens [45] and 35 million tokens [46] It has 102214unique tokens Each token comprises 300-dimensional realvalues Another model publicly available for research purposesconsists of 25925 unique words of Urdu language [47] Everyword has a 400-dimensional value A word embedding modelcomprises web-based text created to classify text It consists of64653 unique Urdu words and 300 dimensions for each word

)e journey of research is not over here to expand ourresearch scope and find the most efficient word embeddingmodel for sentence classification we decided to developcustom word embedding models We developed four wordembedding models that contain 57251 unique words

)e results of pretrained existing word embedding modelsare good at the initial level but very low ie 6026 accuracyWe explored the contents of these models which revealed thatmanywords are irrelevant and borrowed from other languagesie Arabic and Persian )e contents of Wikipedia are entirelydifferent than news websites that also affect the performance ofembedding models Another major factor ie low amount ofdata affected the feature vector generation quality Stop wordsin the pretrained word embedding model are not eliminatedand considered as a token while in our dataset all the stopwords are removed It also reduces the size of the vocabulary ofthe model while generating a feature vector )erefore wedecided to develop a custom word embedding model on ourpreprocessed dataset To postulate the enlargement of theresearch task three different word embedding models aredeveloped )e details of all used pretrained word embeddingmodels are given in Table 4

523 One-Hot-Encoding Text cannot be processed directlybymachine learning classifiers therefore we need to convertthe text into a real value We used one-hot-encoding toconvert text to numeric features For example the sentencesgiven in Table 5 can be converted into a numeric featurevector using one-hot-encoding as shown in Table 6

524 TF-IDF TF and TF-IDF are feature engineeringtechniques that transform the text into the numerical for-mat It is one of the most highly used feature vectors forcreating a method for text data )ree deep learning modelswere evaluated on our corpus )e sequential model withembedding layers outperformed other pretrained wordembedding models [44] reported in state of the art [48] )edetailed summary of the evaluation results of CNN RNNand DNN is discussed in Section 7

53 Deep Learning Models

531 Deep Neural Network Architecture Our DNN archi-tecture consists of three layers ie n-input layer 150 hidden(dense) layers and 12 output layers Feature vector is givenas input into a dense layer that is fully connected )eSoftMax activation function is used in the output layer toclassify sentences into multiple classes

532 Recurrence Neural Network )e recurrence neuralnetwork is evaluated using a long short-term memory(LSTM) classifier RNN consists of embedding dropoutLSTM and dense layers A dictionary of 30000 unique mostfrequent tokens is made )e sentences are standardized tothe same length by using a padding sequence)e dimensionof the feature vector is set as 250 RNN showed an overall81 accuracy that is the second highest in our work

533 Convolutional Neural Network (CNN) CNN is a classof deep neural networks that are highly recommended forimage processing [49] It consists of the input layer (em-bedding layer) multiple hidden layers and an output layer)ere are a series of convolutional layers that convolve witha multiplication )e embedded sequence layer and averagelayer (GloobalAveragePooling1D) are also part of the hiddenlayer )e common activation of CNN is RELU Layer )edetails of the hypermeters that are used in our problem totrain the CNN model are given in Table 7

534 Hyperparameters In this section all the hyper-parameters that are used in our experiments are given in thetabular format Only those hyperparameters are being dis-cussed here which have achieved the highest accuracy ofDNN RNN and CNN models )e hyperparameters ofDNN that are fine-tuned in our work are given in Table 8

)e RNN model showed the highest accuracy (803and 81) on two sets of hyperparameters that are given inTable 9 Similarly Table 7 provides the details of thehyperparameters of the convolutional neural network

6 Performance Measuring Parameters

)emost common performance measuring [41] parametersie precision recall and F1-measure are used to evaluatethe proposed framework )e selection of these parameterswas decided because of the multiclass classification andimbalance dataset

Precision TP

(TP + FP) (1)

Recall TP

(TP + FN) (2)

F1 2lowast PrecisionlowastRecall( 1113857

(Precision + Recall) (3)

Accuracy (TP + TN)

(TP + TN + FP + FN) (4)

where TP TN FP and FN represent total positive totalnegative false positive and false negative values respec-tively Precision is defined as the closeness of the mea-surements to each other and recall is the ratio of the totalamount of relevant (ie TP values) instances that wereactually retrieved during the experimental work It is

Scientific Programming 7

noteworthy that both precision and recall are the relativevalues of measure of relevance

7 Results

71 Deep Learning Classifiers )e feature vector can begenerated using different techniques )e details of feature

vector generating techniques were discussed in Section 5)e results of feature vector generating techniques that wereused in our work ie ldquomulticlass event classification for theUrdu language textrdquo are given in the proceedingsubsections

711 Pretrained Word Embedding Models )e convolu-tional neural network model is evaluated on the featuresvectors that were generated by all pretrained word em-bedding models )e summary of all results generated by

Table 4 Pretrained word embedding model and custom word embedding model

Sr no Unique words Dimension Window sizeExisting pretrained word embedding models1 [11] 64653 300 mdash2 [19] 102214 100 mdash3 53454 300 mdashCustom pretrained word embedding models1 57251 50 22 57251 100 23 57251 100 34 57251 350 1

Table 5 Event sentence

Urdu sentence English sentenceےہاتلیھکلابٹفیلع Ali plays football

یلےلناجیکںوگولںوھکالےنسرئاوانورک Corona virus killed millions of people

Table 6 Event sentence converted using one-hot-encoding

Sentence اتلیھک لاب ٹف یلع ناج ںوگول ںوھکال سرئاو انورک1 1 1 1 1 0 0 0 0 02 0 0 0 0 1 1 1 1 1

Table 7 CNNrsquos hyperparameters

CNN (7928)Parameter ValueMax_words 20000Batch size 128Embedding_dim 50Activation function SoftMaxDense_node 256Trainingtesting 70ndash30No of epochs 20Loss function Categorical cross-entropy

Table 8 DNNrsquos hyperparameters

Parameter ValueMax_words 5000Batch size 128Embedding_dim 512Activation function SoftMaxLayers 04Trainingtesting 70ndash30No of epochs 15Loss function Sparse categorical cross-entropy

Table 9 RNNrsquos hyperparameters

Parameter ValueRNN (LSTM) (803)Max_words 50000Batch size 64Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 90ndash10No of epochs 05Loss function Sparse categorical cross-entropyRNN (LSTM) (81)Max_words 30000Batch size 128Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 80ndash20No of epochs 05Loss function Sparse categorical cross-entropy

8 Scientific Programming

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017

[15] M P Akhter Z Jiangbin I R Naqvi M AbdelmajeedA Mehmood and M T Sadiq ldquoDocument-level text clas-sification using single-layer multisize filters convolutionalneural networkrdquo IEEE Access vol 8 pp 42689ndash42707 2021

[16] U Pal and A Sarkar ldquoRecognition of printed Urdu scriptrdquo inProceedings of the 2003 Seventh International Conference onDocument Analysis and Recognition pp 1183ndash1187 IEEEEdinburgh Scotland August 2003

[17] Y Yang T Pierce and J Carbonell ldquoA study of retrospectiveand on-line event detectionrdquo in Proceedings of the 21st annualinternational ACM SIGIR conference on Research and

Scientific Programming 13

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T Kala ldquoEvent detection from text datardquo ComputationalIntelligence vol 31 pp 132ndash164 2015

[19] M Naughton N Stokes and J Carthy ldquoSentence-level eventclassification in unstructured textsrdquo Information Retrievalvol 13 no 2 pp 132ndash156 2010

[20] G Jacobs E Lefever and V Hoste ldquoEconomic event de-tection in company-specific news textrdquo in Proceedings of theFirst Workshop on Economics and Natural Language Pro-cessing pp 1ndash10 Melbourne Australia July 2018

[21] E DrsquoAndrea P Ducange A Bechini A Renda andF Marcelloni ldquoMonitoring the public opinion about thevaccination topic from tweets analysisrdquo Expert Systems withApplications vol 116 pp 209ndash226 2019

[22] M Sokolova and G Lapalme ldquoA systematic analysis ofperformance measures for classification tasksrdquo InformationProcessing amp Management vol 45 no 4 pp 427ndash437 2009

[23] T Mikolov I Sutskever K Chen G S Corrado and J DeanldquoDistributed representations of words and phrases and theircompositionalityrdquo in Proceedings Advances Neural Informa-tion Processing Systems vol 26 pp 3111ndash3119 Lake TahoeNV USA December 2013

[24] Y Bengio R Ducharme P Vincent and C Jauvin ldquoA neuralprobabilistic language modelrdquo Journal of Machine LearningResearch vol 3 pp 1137ndash1155 2003

[25] M P Akhter Z Jiangbin I R Naqvi M Abdelmajeed andM Fayyaz ldquoExploring deep learning approaches for Urdu textclassification in product manufacturingrdquo Enterprise Infor-mation Systems pp 1ndash26 2020

[26] G Liu and J Guo ldquoBidirectional LSTM with attentionmechanism and convolutional layer for text classificationrdquoNeurocomputing vol 337 pp 325ndash338 2019

[27] M Sharjeel R M A Nawab and P Rayson ldquoCOUNTERcorpus of Urdu news text reuserdquo Language Resources andEvaluation vol 51 no 3 pp 777ndash803 2017

[28] K Mehmood D Essam and K Shafi ldquoSentiment analysissystem for Roman Urdurdquo in Proceedings of the 2018 Scienceand Information Conference pp 29ndash42 Springer CasablancaMorocco July 2018

[29] K Ahmed M Ali S Khalid and M Kamran ldquoFramework forUrdu news headlines classificationrdquo Journal of Applied ComputerScience amp Mathematics vol 10 no 1 pp 17ndash21 2016

[30] Z Tehseen M P Akhter and Q Abbas ldquoComparative studyof feature selection approaches for Urdu text categorizationrdquoMalaysian Journal Computer Science vol 28 no 2 pp 93ndash109 2015

[31] W Yin and L Shen ldquoA short text classification approach withevent detection and conceptual informationrdquo in Proceedingsof the 2020 5th International Conference on Machine LearningTechnologies pp 129ndash135 Beijing China June 2020

[32] H Zhou M Huang T Zhang et al ldquoEmotional chattingmachine emotional conversation generation with internaland external memoryrdquo in Proceedings of the irty-SecondAAAI Conference on Artificial Intelligence New Orleans LAUSA February 2018

[33] A Hassan and A Mahmood ldquoConvolutional recurrent deeplearningmodel for sentence classificationrdquo IEEE Access vol 6pp 13949ndash13957 2018

[34] T Zia M P Akhter and Q Abbas ldquoComparative study offeature selection approaches for Urdu text categorizationrdquoMalaysian Journal of Computer Science vol 28 no 2pp 93ndash109 2015

[35] A R Ali andM Ijaz ldquoUrdu text classificationrdquo in Proceedingsof the 7th international conference on frontiers of informationtechnology pp 1ndash7 Abbottabad Pakistan December 2009

[36] M Usman Z Shafique S Ayub and K Malik ldquoUrdu textclassification using majority votingrdquo International Journal ofAdvanced Computer Science and Applications vol 7 no 8pp 265ndash273 2016

[37] N Kalchbrenner E Grefenstette and P Blunsom ldquoA con-volutional neural network for modelling sentencesrdquo 2014httparxivorgabs14042188

[38] D M Awais and DM Shoaib ldquoRole of discourse informationin Urdu sentiment classificationrdquo ACMTransactions on Asianand Low-Resource Language Information Processing vol 18no 4 pp 1ndash37 2019

[39] J P Singh Y K Dwivedi N P Rana A Kumar andK K Kapoor ldquoEvent classification and location predictionfrom tweets during disastersrdquo Annals of Operations Researchvol 283 no 1-2 pp 737ndash757 2019

[40] R C Paulo D Fillipe and M S C Sergo ldquoClassification ofevents on social mediardquo 2016

[41] Q A Al-Radaideh and M A Al-Abrat ldquoAn Arabic textcategorization approach using term weighting and multiplereductsrdquo Soft Computing vol 23 no 14 pp 5849ndash5863 2019

[42] J F Allen ldquoMaintaining knowledge about temporal inter-valsrdquo Communications of the ACM vol 26 no 11pp 832ndash843 1983

[43] T Joachims ldquoText categorization with support vector ma-chines learning with many relevant featuresrdquo in Proceedingsof the European conference on machine learning pp 137ndash142Springer Chemnitz Germany April 1998

[44] S Haider ldquoUrdu word embeddingsrdquo in Proceedings of theEleventh International Conference on Language Resources andEvaluation (LREC 2018) Miyazaki Japan May 2018

[45] B Jawaid A Kamran and O Bojar ldquoUrdu monolingual corpusrdquoLINDATCLARIN Digital Library at the Institute of Formal andApplied Linguistics Charles University Prague Czechia

[46] F Adeeba Q Akram H Khalid and S Hussain ldquoCLE Urdubooks n-gramsrdquo in Proceedings of the Conference on Languageand Technology CLT 14 Karachi Pakistan May 2014

[47] A Hassan and A Mahmood ldquoDeep learning for sentenceclassificationrdquo in Proceedings of the 2017 IEEE Long IslandSystems Applications and Technology Conference (LISAT)pp 1ndash5 IEEE New York NY USA May 2017

[48] D-X Zhou ldquoUniversality of deep convolutional neuralnetworksrdquo Applied and Computational Harmonic Analysisvol 48 no 2 pp 787ndash794 2020

[49] M V Valueva N N Nagornov P A Lyakhov G V Valuevand N I Chervyakov ldquoApplication of the residue numbersystem to reduce hardware costs of the convolutional neuralnetwork implementationrdquo Mathematics and Computers inSimulation vol 177 pp 232ndash243 2020

[50] G Guo H Wang D Bell Y Bi and K Greer ldquoKNN model-based approach in classificationrdquo in Proceedings of the OTMConfederated International Conferences ldquoOn the Move toMeaningful Internet Systemsrdquo Catania Italy November 2003

[51] Y Zhong ldquo)e analysis of cases based on decision treerdquo inProceedings of the 2016 7th IEEE international conference onsoftware engineering and service science (ICSESS) pp 142ndash147IEEE Beijing China August 2016

[52] S Xu ldquoBayesian Naıve Bayes classifiers to text classificationrdquoJournal of Information Science vol 44 no 1 pp 48ndash59 2018

[53] T Zhang and F Oles ldquoText categorization based on regu-larized linear classification methodsrdquo Information Retrievalvol 4 no 1 pp 5ndash31 2001

14 Scientific Programming

[54] J Ali R Khan N Ahmad and I Maqsood ldquoRandom forestsand decision treesrdquo International Journal of Computer ScienceIssues (IJCSI) vol 9 no 5 p 272 2012

[55] Y Zhang ldquoSupport vector machine classification algorithmand its applicationrdquo in Proceedings of the InternationalConference on Information Computing and Applicationspp 179ndash186 Springer Bhubaneswar India September 2012

Scientific Programming 15

Page 3: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

Stanford Treebank dataset were used to evaluate the per-formance of the system )eir proposed system showed933 and 892 accuracy respectively

In [35] the authors performed a supervised text clas-sification in the Urdu language by using a statistical ap-proach like Naıve Bayes and support vector machine (SVM))e classification is initiated by applying different pre-processing approaches namely stemming stop word re-moval and both stop words elimination and stemming )eexperimental results showed that the steaming process haslittle impact on improving performance On the other handthe elimination of stop words showed a positive effect onresults )e SVM outperformed the Naıve Bayes byachieving the classification accuracies of 8953 and 9334based on polynomial and radial function respectively

Similarly the SVM is also applied in the news headlinesclassification [36] in Urdu text showing a very low amount ofaccuracy improvement of 35 News headlines are a smallpiece of information that frequently does not describe thecontextual meaning of the contents In [36] the majorityvoting algorithm used for text classification in the Urdulanguage showed 94 accuracy )e classification is per-formed on seven different types of news text However thenumber of instances was very limited A dynamic neuralnetwork [37] was designed to model the sentiment ofsentences It consists of dynamic K-modeling pooling andglobal pooling over a linear sequence that performs mul-ticlass sentiment classification

A quite different task is performed [38] in which theauthors used a hybrid approach of rule-based and machinelearning-based techniques to perform the sentiment clas-sification while analyzing the Urdu script [38] at the phraselevel )e hybrid approach showed an accuracy of 3125846 and 216 using the performance metrics of recallprecision and accuracy respectively In [39] a variant ofrecurrent neural network (RNN) called long short-termmemory (LSTM) is used to overcome the weakness of bag-of-words and n-grams models and it outperformed theseconventional approaches

A neural network-based system [39] was developed toclassify events )e purpose of the system was to help thepeople in natural disasters like floods by analyzing tweets)e Markov model was used to classify and predict thelocation that showed 81 accuracy for classification tweetsas a request for help and 87 accuracy to locate the locationResearch work was conducted on life event detection andclassification ie marriage birthday and traveling etc toanticipate products and services to facilitate the people [40])e data about life events exist in a very small amountLinear regression Naıve Bayes and nearest neighbor al-gorithms were evaluated on the original dataset that was verysmall but did not show favorable results

A multiple minimal reduct extraction algorithm wasdesigned [41] by improving the quick reduct algorithm )emultiple reducts are used to generate the set of classificationrules which represent the rough set classifier To evaluate theproposed approach an Arabic corpus of 2700 documentswas used to categorize into nine classes By using multipleand single minimal reducts the proposed system showed

94 and 86 respectively Experimental results also showedthat both the K-NN and J48 algorithms outperformed re-garding classification accuracy using the dataset on hand

Table 1 depicts the summary of the related researchdiscussed previously

3 Dataset

31DataCollection Contrary to the dataset reported in stateof the art [27 34] in which no datasets were created for eventclassification we created a larger dataset specific for eventclassification Instead of focusing on a specific product [25]analysis or phrase-level sentiment analysis [38] we decidedto classify sentences into multiple event classes Instead ofusing the joint framework of CNN and RNN for sentimentanalysis [35] we evaluated the performance of deep learningmodels for multiclass event classification To collect data aPHP-based web scraper is written to crawl data from thepopular social media websites ie Geo News Channel(httpsurdugeotv) website BBC Urdu (httpswwwbbccomurdu) and Urdu point (httpswwwurdupointcomdaily) A complete post is retrieved from the website andstored in MariaDB (database) It consists of a title bodypublished date location and URL )e sample text or tweetof both languages of the South Asian countries ie Urdulanguage on Twitter and Hindi language on Facebook isshown in Figure 1

)ere are 015 million (150000) Urdu language sen-tences )e diversity of data collection sources helped us todevelop multiclass datasets It consists of twelve types ofevents )e subset of datasets can be useful for otherresearchers

32 Preprocessing In the first phase of dataset preparationwe performed some preprocessing steps ie noise removingand sentence annotationlabeling All non-Urdu wordssentences hyperlinks URLs and special symbols were re-moved It was necessary to clean out the dataset to annotatelabel the sentences properly

321 Annotation Guidelines

(1) Go through each sentence and assign a class label(2) Remove ambiguous sentences(3) Merge relevant sentences to a single class ie ac-

cident murder and death(4) Assign one of the twelve types of events ie sports

inflation murder and death terrorist attack politicslaw and order earthquake showbiz fraud andcorruption weather sexual assault and business toeach sentence

To annotate our dataset two MPhil (Urdu) level lan-guage experts were engaged )ey deeply read and analyzedthe dataset sentence by sentence before assigning eventlabels )ey recommended removing 46035 sentences fromthe dataset because those sentences would not contain in-formation that useful for event classification Finally after

Scientific Programming 3

annotation the dataset size was reduced to 103965 imbal-anced instances of twelve different types of events

)e annotation interagreement ie Cohen Kappa scoreis 093 which indicates the strong agreement between thelanguage and expert annotators )e annotated dataset isalmost perfect according to the annotation agreement score

In the second phase of preprocessing the following stepsare performed ie stop words eliminated word tokeniza-tion and sentence filtering

All those words which do not semantically contribute tothe classification process are removed as stop words ie ہو

ہریغوہریغوےسرواساںیم etc A list of standard stopwords of the Urdu language is available here (httpswwwkagglecomrtatmanurdu-stopwords-list)

After performing data cleaning and stop word removalevery sentence is tokenized into words based on white spaceAn example of sentence tokenization is given in Table 2

)e previous preprocessing step revealed that manysentences are varying in length Some sentences were soshort and many were very long We decided to define alength boundary for tokenized sentences We observed thatmany sentences exist in the dataset which have a lengthrange from 5 words to 250 words We selected sentences thatconsist of 5 words to 150 words An integer value is assignedto each type of event for all selected sentences )e detaileddescription of the different types of events and their cor-responding numeric (integer) values that are used in thedataset is also given in Table 3

In Figure 2 a few instances of the dataset after pre-processing are presented It is a comma-separated value

(CSV) file that consists of two fields ie sentence and labelie numeric value for each class (1ndash12)

In our dataset three types of events have a larger numberof instances ie sports (18746) politics (33421) and fraudand corruption (10078) contrary to three other types ofevents that have a smaller number of instances ie sexualassault (2916) inflation (3196) and earthquake (3238)

)e remaining types of events have a smaller differenceof instances among them )ere are 51814 unique words inour dataset )e visualization in Figure 3 shows that thedataset is imbalanced

4 Methodology

We analyzed the performance of deep learning ie deepneural network convolutional neural network and recur-rent neural network along with other machine learningclassifiers ie K-nearest neighbor decision tree randomforest support vector machine Naıve Bayes multinominaland linear regression

)e Urdu news headlines contain insufficient informa-tion ie few numbers of words and lack of contextualinformation to classify the events [29] However compar-atively to news headlines the sentences written in informalway contain more information )e sentence-level classifi-cation is performed using deep learning models instead ofonly machine learning algorithms )e majority voting al-gorithm outperforms on a limited number of instances forseven classes It showed 94 [36] accuracy but in our workmore than 015 million instances which are labeled intotwelve classes are used for classification

)ere exist several approaches to extract useful infor-mation from a large amount of data )ree common ap-proaches are rule-based a machine learning approach andhybrid approaches [42] )e selection of methodology istightly coupled with the research problem In our problemwe decided to use machine learning (traditional machinelearning and deep learning approaches) classifiers Sometraditional machine learning algorithms ie K-nearestneighbor (KNN) random forest (RF) support vector ma-chine (SVM) decision tree (DT) and multinomial NaıveBayes (MNB) are evaluated for multiclass eventclassification

Deep learning models ie convolutional neural network(CNN) deep neural network (DNN) and recurrent neuralnetwork (RNN) are also evaluated for multiclass eventclassification

Table 1 Summary of the related research

Paper reference Classifier used Dataset Accuracy ()[33] CNN and RNN Movie reviews 92

[34] CNN and RNN 1 Stanford movie review dataset2 Stanford Treebank dataset 933 and 892

[35] Naıve Bayes and SVM Corpus of Urdu documents 8953 and 9334[36] Dynamic neural network News articles 965[38] Rule-based modeling Urdu corpus of news headlines 3125[39] LSTM Tweets 8100[41] K-NN and J48 Arabic corpus of 2700 documents 95 and 86

Figure 1 Urdu and Hindi language text on social media

4 Scientific Programming

A collection of Urdu text documentsD d1 d2 dn1113864 1113865 is split into a set of sentencesS s1 s2 sn1113864 1113865 Our purpose is to classify the sentences toa predefined set of events E e1 e2 en1113864 1113865

Various feature generating methods are used to create afeature vector for deep learning and machine learning classi-fiers ie TF-IDF one-hot-encoding and word embeddingFeature vectors generated by all these techniques are fed up asinput into the embedding layer of neural networks )e outputgenerated by the embedding layers is fed up into the next fullyconnected layer (dense layer) of deep learning models ieRNN CNN and DNN A relevant class label out of twelvecategories is assigned to each sentence at the end of modelprocessing in the testingvalidation phase

Bag-of-words is a common method to represent text Itignores the sequence order and semantic of text [43] whilethe one-hot-coding method maintains the sequence of textWord embedding methods Word2Vec and Glove (httpsybbaigogitbooksio26pretrained-word-embeddingshtml)that are used to generate feature vectors for deep learningmodels are highly recommended for textual data Howeverin the case of Urdu text classification pre-existing wrod2vecand Glove are incompatible

)e framework of our designed system is represented inFigure 4 It shows the structure of our system from takinginput to producing output

5 Experimental Setup

We performed many experiments on our dataset by usingvarious traditional machine learning and deep learningclassifiers )e purpose of many experiments is to find themost efficient and accurate classification model for themulticlass event on an imbalance dataset for the Urdulanguage text A detailed comparison between traditionalclassifiers and deep neural classifiers is given in the nextsection

51 Feature Space Unigram and bigram tokens of the wholecorpus are used as features to create the feature space TF-IDF vectorization is used to create a dictionary-based modelIt consists of 656608 features )e training and testingdataset are converted to TF-IDF dictionary-based featurevectors A convolutional sequential model (see Figure 5)consists of three layers ie the input layer hidden layer andoutput layer which are used to evaluate our dataset Sim-ilarly word embedding and one-hot-encoding are also in-cluded in our feature space to enlarge the scope of ourresearch problem

52 Feature Vector Generating Techniques Feature vectorsare the numerical representation of text )ey are an actualform of input that can be processed by the machine learningclassifier )ere are several feature generating techniquesused for text processing We used the following featurevector generating techniques

521 Word Embedding A numerical representation of thetext is that each word is considered as a feature vector Itcreates a dense vector of real values that captures thecontextual semantical and syntactical meaning of the wordIt also ensures that similar words should have a relatedweighted value [29]

522 Pretrained Word Embedding Models Usage of apretrained word embedding model for the small amount ofdata is highly recommended by researchers in state of the artGlove and Word2Vec are famous word embedding modelsthat are developed by using a big amount of data Wordembedding models for text classification especially in theEnglish language showed promising results It has emergedas a powerful feature vector generating technique amongothers ie TF TF-IDF and one-hot encoding etc

In our research case sentence classification for differentevents in the Urdu language using the word embeddingtechnique is potentially preferable Unfortunately the Urdulanguage is lacking in processing resources We found onlythree word embedding models a word embedding model

Table 2 Sentence tokenization

Sentence Tokenized sentenceیلےلناجیکںوگولددعتمےنسرئاوانورک یل ےل ناج ںوگول ددعتم سرئاو انورک

یئگرگتھچیکںورھگیئکےسشرابینافوط ئگ رگ تھچ ںورھگ یئک شراب ینافوط

Table 3 Event label

Event LabelSports 1Inflation 2Murder and death 3Terrorist attack 4Politics 5Law and order 6Earthquake 7Showbiz 8Fraud and corruption 9Rainweather 10Sexual assault 11

Figure 2 )e few instances of the dataset

Scientific Programming 5

0

5000

10000

15000

20000

25000

30000

35000

40000To

tal n

umbe

r of i

nsta

nces

Imbalance dataset of the urdu label sentences for events

Type of events

18746

31966932

3034

33421

69603238

741710078

3406 2916 3617

Spor

ts

Infla

tion

Mur

der

Terr

orist

atta

ck

Polit

ics

Law

and

orde

r

Eart

hqua

ke

Show

biz

Frau

d an

dco

rrup

tion

Wea

ther

Sexu

al as

saul

t

Busin

ess

Figure 3 Imbalance instances of the dataset

Raw input Preprocessed input Word embedding layersDLM

RNN (LSTM-unidirection)

CNN

DNN

LabelsTF-IDF

Documents1

Documents2

Documents3

Documents

n

1

2

3

12

Figure 4 Methodology

00

050

055

060

065

070

075

080

085

05 10 15 20 25 30 35 40Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 5 DDNrsquos accuracy

6 Scientific Programming

[44] that is developed by using three publicly available Urdudatasets Wikipediarsquos Urdu text another corpus having 90million tokens [45] and 35 million tokens [46] It has 102214unique tokens Each token comprises 300-dimensional realvalues Another model publicly available for research purposesconsists of 25925 unique words of Urdu language [47] Everyword has a 400-dimensional value A word embedding modelcomprises web-based text created to classify text It consists of64653 unique Urdu words and 300 dimensions for each word

)e journey of research is not over here to expand ourresearch scope and find the most efficient word embeddingmodel for sentence classification we decided to developcustom word embedding models We developed four wordembedding models that contain 57251 unique words

)e results of pretrained existing word embedding modelsare good at the initial level but very low ie 6026 accuracyWe explored the contents of these models which revealed thatmanywords are irrelevant and borrowed from other languagesie Arabic and Persian )e contents of Wikipedia are entirelydifferent than news websites that also affect the performance ofembedding models Another major factor ie low amount ofdata affected the feature vector generation quality Stop wordsin the pretrained word embedding model are not eliminatedand considered as a token while in our dataset all the stopwords are removed It also reduces the size of the vocabulary ofthe model while generating a feature vector )erefore wedecided to develop a custom word embedding model on ourpreprocessed dataset To postulate the enlargement of theresearch task three different word embedding models aredeveloped )e details of all used pretrained word embeddingmodels are given in Table 4

523 One-Hot-Encoding Text cannot be processed directlybymachine learning classifiers therefore we need to convertthe text into a real value We used one-hot-encoding toconvert text to numeric features For example the sentencesgiven in Table 5 can be converted into a numeric featurevector using one-hot-encoding as shown in Table 6

524 TF-IDF TF and TF-IDF are feature engineeringtechniques that transform the text into the numerical for-mat It is one of the most highly used feature vectors forcreating a method for text data )ree deep learning modelswere evaluated on our corpus )e sequential model withembedding layers outperformed other pretrained wordembedding models [44] reported in state of the art [48] )edetailed summary of the evaluation results of CNN RNNand DNN is discussed in Section 7

53 Deep Learning Models

531 Deep Neural Network Architecture Our DNN archi-tecture consists of three layers ie n-input layer 150 hidden(dense) layers and 12 output layers Feature vector is givenas input into a dense layer that is fully connected )eSoftMax activation function is used in the output layer toclassify sentences into multiple classes

532 Recurrence Neural Network )e recurrence neuralnetwork is evaluated using a long short-term memory(LSTM) classifier RNN consists of embedding dropoutLSTM and dense layers A dictionary of 30000 unique mostfrequent tokens is made )e sentences are standardized tothe same length by using a padding sequence)e dimensionof the feature vector is set as 250 RNN showed an overall81 accuracy that is the second highest in our work

533 Convolutional Neural Network (CNN) CNN is a classof deep neural networks that are highly recommended forimage processing [49] It consists of the input layer (em-bedding layer) multiple hidden layers and an output layer)ere are a series of convolutional layers that convolve witha multiplication )e embedded sequence layer and averagelayer (GloobalAveragePooling1D) are also part of the hiddenlayer )e common activation of CNN is RELU Layer )edetails of the hypermeters that are used in our problem totrain the CNN model are given in Table 7

534 Hyperparameters In this section all the hyper-parameters that are used in our experiments are given in thetabular format Only those hyperparameters are being dis-cussed here which have achieved the highest accuracy ofDNN RNN and CNN models )e hyperparameters ofDNN that are fine-tuned in our work are given in Table 8

)e RNN model showed the highest accuracy (803and 81) on two sets of hyperparameters that are given inTable 9 Similarly Table 7 provides the details of thehyperparameters of the convolutional neural network

6 Performance Measuring Parameters

)emost common performance measuring [41] parametersie precision recall and F1-measure are used to evaluatethe proposed framework )e selection of these parameterswas decided because of the multiclass classification andimbalance dataset

Precision TP

(TP + FP) (1)

Recall TP

(TP + FN) (2)

F1 2lowast PrecisionlowastRecall( 1113857

(Precision + Recall) (3)

Accuracy (TP + TN)

(TP + TN + FP + FN) (4)

where TP TN FP and FN represent total positive totalnegative false positive and false negative values respec-tively Precision is defined as the closeness of the mea-surements to each other and recall is the ratio of the totalamount of relevant (ie TP values) instances that wereactually retrieved during the experimental work It is

Scientific Programming 7

noteworthy that both precision and recall are the relativevalues of measure of relevance

7 Results

71 Deep Learning Classifiers )e feature vector can begenerated using different techniques )e details of feature

vector generating techniques were discussed in Section 5)e results of feature vector generating techniques that wereused in our work ie ldquomulticlass event classification for theUrdu language textrdquo are given in the proceedingsubsections

711 Pretrained Word Embedding Models )e convolu-tional neural network model is evaluated on the featuresvectors that were generated by all pretrained word em-bedding models )e summary of all results generated by

Table 4 Pretrained word embedding model and custom word embedding model

Sr no Unique words Dimension Window sizeExisting pretrained word embedding models1 [11] 64653 300 mdash2 [19] 102214 100 mdash3 53454 300 mdashCustom pretrained word embedding models1 57251 50 22 57251 100 23 57251 100 34 57251 350 1

Table 5 Event sentence

Urdu sentence English sentenceےہاتلیھکلابٹفیلع Ali plays football

یلےلناجیکںوگولںوھکالےنسرئاوانورک Corona virus killed millions of people

Table 6 Event sentence converted using one-hot-encoding

Sentence اتلیھک لاب ٹف یلع ناج ںوگول ںوھکال سرئاو انورک1 1 1 1 1 0 0 0 0 02 0 0 0 0 1 1 1 1 1

Table 7 CNNrsquos hyperparameters

CNN (7928)Parameter ValueMax_words 20000Batch size 128Embedding_dim 50Activation function SoftMaxDense_node 256Trainingtesting 70ndash30No of epochs 20Loss function Categorical cross-entropy

Table 8 DNNrsquos hyperparameters

Parameter ValueMax_words 5000Batch size 128Embedding_dim 512Activation function SoftMaxLayers 04Trainingtesting 70ndash30No of epochs 15Loss function Sparse categorical cross-entropy

Table 9 RNNrsquos hyperparameters

Parameter ValueRNN (LSTM) (803)Max_words 50000Batch size 64Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 90ndash10No of epochs 05Loss function Sparse categorical cross-entropyRNN (LSTM) (81)Max_words 30000Batch size 128Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 80ndash20No of epochs 05Loss function Sparse categorical cross-entropy

8 Scientific Programming

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017

[15] M P Akhter Z Jiangbin I R Naqvi M AbdelmajeedA Mehmood and M T Sadiq ldquoDocument-level text clas-sification using single-layer multisize filters convolutionalneural networkrdquo IEEE Access vol 8 pp 42689ndash42707 2021

[16] U Pal and A Sarkar ldquoRecognition of printed Urdu scriptrdquo inProceedings of the 2003 Seventh International Conference onDocument Analysis and Recognition pp 1183ndash1187 IEEEEdinburgh Scotland August 2003

[17] Y Yang T Pierce and J Carbonell ldquoA study of retrospectiveand on-line event detectionrdquo in Proceedings of the 21st annualinternational ACM SIGIR conference on Research and

Scientific Programming 13

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T Kala ldquoEvent detection from text datardquo ComputationalIntelligence vol 31 pp 132ndash164 2015

[19] M Naughton N Stokes and J Carthy ldquoSentence-level eventclassification in unstructured textsrdquo Information Retrievalvol 13 no 2 pp 132ndash156 2010

[20] G Jacobs E Lefever and V Hoste ldquoEconomic event de-tection in company-specific news textrdquo in Proceedings of theFirst Workshop on Economics and Natural Language Pro-cessing pp 1ndash10 Melbourne Australia July 2018

[21] E DrsquoAndrea P Ducange A Bechini A Renda andF Marcelloni ldquoMonitoring the public opinion about thevaccination topic from tweets analysisrdquo Expert Systems withApplications vol 116 pp 209ndash226 2019

[22] M Sokolova and G Lapalme ldquoA systematic analysis ofperformance measures for classification tasksrdquo InformationProcessing amp Management vol 45 no 4 pp 427ndash437 2009

[23] T Mikolov I Sutskever K Chen G S Corrado and J DeanldquoDistributed representations of words and phrases and theircompositionalityrdquo in Proceedings Advances Neural Informa-tion Processing Systems vol 26 pp 3111ndash3119 Lake TahoeNV USA December 2013

[24] Y Bengio R Ducharme P Vincent and C Jauvin ldquoA neuralprobabilistic language modelrdquo Journal of Machine LearningResearch vol 3 pp 1137ndash1155 2003

[25] M P Akhter Z Jiangbin I R Naqvi M Abdelmajeed andM Fayyaz ldquoExploring deep learning approaches for Urdu textclassification in product manufacturingrdquo Enterprise Infor-mation Systems pp 1ndash26 2020

[26] G Liu and J Guo ldquoBidirectional LSTM with attentionmechanism and convolutional layer for text classificationrdquoNeurocomputing vol 337 pp 325ndash338 2019

[27] M Sharjeel R M A Nawab and P Rayson ldquoCOUNTERcorpus of Urdu news text reuserdquo Language Resources andEvaluation vol 51 no 3 pp 777ndash803 2017

[28] K Mehmood D Essam and K Shafi ldquoSentiment analysissystem for Roman Urdurdquo in Proceedings of the 2018 Scienceand Information Conference pp 29ndash42 Springer CasablancaMorocco July 2018

[29] K Ahmed M Ali S Khalid and M Kamran ldquoFramework forUrdu news headlines classificationrdquo Journal of Applied ComputerScience amp Mathematics vol 10 no 1 pp 17ndash21 2016

[30] Z Tehseen M P Akhter and Q Abbas ldquoComparative studyof feature selection approaches for Urdu text categorizationrdquoMalaysian Journal Computer Science vol 28 no 2 pp 93ndash109 2015

[31] W Yin and L Shen ldquoA short text classification approach withevent detection and conceptual informationrdquo in Proceedingsof the 2020 5th International Conference on Machine LearningTechnologies pp 129ndash135 Beijing China June 2020

[32] H Zhou M Huang T Zhang et al ldquoEmotional chattingmachine emotional conversation generation with internaland external memoryrdquo in Proceedings of the irty-SecondAAAI Conference on Artificial Intelligence New Orleans LAUSA February 2018

[33] A Hassan and A Mahmood ldquoConvolutional recurrent deeplearningmodel for sentence classificationrdquo IEEE Access vol 6pp 13949ndash13957 2018

[34] T Zia M P Akhter and Q Abbas ldquoComparative study offeature selection approaches for Urdu text categorizationrdquoMalaysian Journal of Computer Science vol 28 no 2pp 93ndash109 2015

[35] A R Ali andM Ijaz ldquoUrdu text classificationrdquo in Proceedingsof the 7th international conference on frontiers of informationtechnology pp 1ndash7 Abbottabad Pakistan December 2009

[36] M Usman Z Shafique S Ayub and K Malik ldquoUrdu textclassification using majority votingrdquo International Journal ofAdvanced Computer Science and Applications vol 7 no 8pp 265ndash273 2016

[37] N Kalchbrenner E Grefenstette and P Blunsom ldquoA con-volutional neural network for modelling sentencesrdquo 2014httparxivorgabs14042188

[38] D M Awais and DM Shoaib ldquoRole of discourse informationin Urdu sentiment classificationrdquo ACMTransactions on Asianand Low-Resource Language Information Processing vol 18no 4 pp 1ndash37 2019

[39] J P Singh Y K Dwivedi N P Rana A Kumar andK K Kapoor ldquoEvent classification and location predictionfrom tweets during disastersrdquo Annals of Operations Researchvol 283 no 1-2 pp 737ndash757 2019

[40] R C Paulo D Fillipe and M S C Sergo ldquoClassification ofevents on social mediardquo 2016

[41] Q A Al-Radaideh and M A Al-Abrat ldquoAn Arabic textcategorization approach using term weighting and multiplereductsrdquo Soft Computing vol 23 no 14 pp 5849ndash5863 2019

[42] J F Allen ldquoMaintaining knowledge about temporal inter-valsrdquo Communications of the ACM vol 26 no 11pp 832ndash843 1983

[43] T Joachims ldquoText categorization with support vector ma-chines learning with many relevant featuresrdquo in Proceedingsof the European conference on machine learning pp 137ndash142Springer Chemnitz Germany April 1998

[44] S Haider ldquoUrdu word embeddingsrdquo in Proceedings of theEleventh International Conference on Language Resources andEvaluation (LREC 2018) Miyazaki Japan May 2018

[45] B Jawaid A Kamran and O Bojar ldquoUrdu monolingual corpusrdquoLINDATCLARIN Digital Library at the Institute of Formal andApplied Linguistics Charles University Prague Czechia

[46] F Adeeba Q Akram H Khalid and S Hussain ldquoCLE Urdubooks n-gramsrdquo in Proceedings of the Conference on Languageand Technology CLT 14 Karachi Pakistan May 2014

[47] A Hassan and A Mahmood ldquoDeep learning for sentenceclassificationrdquo in Proceedings of the 2017 IEEE Long IslandSystems Applications and Technology Conference (LISAT)pp 1ndash5 IEEE New York NY USA May 2017

[48] D-X Zhou ldquoUniversality of deep convolutional neuralnetworksrdquo Applied and Computational Harmonic Analysisvol 48 no 2 pp 787ndash794 2020

[49] M V Valueva N N Nagornov P A Lyakhov G V Valuevand N I Chervyakov ldquoApplication of the residue numbersystem to reduce hardware costs of the convolutional neuralnetwork implementationrdquo Mathematics and Computers inSimulation vol 177 pp 232ndash243 2020

[50] G Guo H Wang D Bell Y Bi and K Greer ldquoKNN model-based approach in classificationrdquo in Proceedings of the OTMConfederated International Conferences ldquoOn the Move toMeaningful Internet Systemsrdquo Catania Italy November 2003

[51] Y Zhong ldquo)e analysis of cases based on decision treerdquo inProceedings of the 2016 7th IEEE international conference onsoftware engineering and service science (ICSESS) pp 142ndash147IEEE Beijing China August 2016

[52] S Xu ldquoBayesian Naıve Bayes classifiers to text classificationrdquoJournal of Information Science vol 44 no 1 pp 48ndash59 2018

[53] T Zhang and F Oles ldquoText categorization based on regu-larized linear classification methodsrdquo Information Retrievalvol 4 no 1 pp 5ndash31 2001

14 Scientific Programming

[54] J Ali R Khan N Ahmad and I Maqsood ldquoRandom forestsand decision treesrdquo International Journal of Computer ScienceIssues (IJCSI) vol 9 no 5 p 272 2012

[55] Y Zhang ldquoSupport vector machine classification algorithmand its applicationrdquo in Proceedings of the InternationalConference on Information Computing and Applicationspp 179ndash186 Springer Bhubaneswar India September 2012

Scientific Programming 15

Page 4: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

annotation the dataset size was reduced to 103965 imbal-anced instances of twelve different types of events

)e annotation interagreement ie Cohen Kappa scoreis 093 which indicates the strong agreement between thelanguage and expert annotators )e annotated dataset isalmost perfect according to the annotation agreement score

In the second phase of preprocessing the following stepsare performed ie stop words eliminated word tokeniza-tion and sentence filtering

All those words which do not semantically contribute tothe classification process are removed as stop words ie ہو

ہریغوہریغوےسرواساںیم etc A list of standard stopwords of the Urdu language is available here (httpswwwkagglecomrtatmanurdu-stopwords-list)

After performing data cleaning and stop word removalevery sentence is tokenized into words based on white spaceAn example of sentence tokenization is given in Table 2

)e previous preprocessing step revealed that manysentences are varying in length Some sentences were soshort and many were very long We decided to define alength boundary for tokenized sentences We observed thatmany sentences exist in the dataset which have a lengthrange from 5 words to 250 words We selected sentences thatconsist of 5 words to 150 words An integer value is assignedto each type of event for all selected sentences )e detaileddescription of the different types of events and their cor-responding numeric (integer) values that are used in thedataset is also given in Table 3

In Figure 2 a few instances of the dataset after pre-processing are presented It is a comma-separated value

(CSV) file that consists of two fields ie sentence and labelie numeric value for each class (1ndash12)

In our dataset three types of events have a larger numberof instances ie sports (18746) politics (33421) and fraudand corruption (10078) contrary to three other types ofevents that have a smaller number of instances ie sexualassault (2916) inflation (3196) and earthquake (3238)

)e remaining types of events have a smaller differenceof instances among them )ere are 51814 unique words inour dataset )e visualization in Figure 3 shows that thedataset is imbalanced

4 Methodology

We analyzed the performance of deep learning ie deepneural network convolutional neural network and recur-rent neural network along with other machine learningclassifiers ie K-nearest neighbor decision tree randomforest support vector machine Naıve Bayes multinominaland linear regression

)e Urdu news headlines contain insufficient informa-tion ie few numbers of words and lack of contextualinformation to classify the events [29] However compar-atively to news headlines the sentences written in informalway contain more information )e sentence-level classifi-cation is performed using deep learning models instead ofonly machine learning algorithms )e majority voting al-gorithm outperforms on a limited number of instances forseven classes It showed 94 [36] accuracy but in our workmore than 015 million instances which are labeled intotwelve classes are used for classification

)ere exist several approaches to extract useful infor-mation from a large amount of data )ree common ap-proaches are rule-based a machine learning approach andhybrid approaches [42] )e selection of methodology istightly coupled with the research problem In our problemwe decided to use machine learning (traditional machinelearning and deep learning approaches) classifiers Sometraditional machine learning algorithms ie K-nearestneighbor (KNN) random forest (RF) support vector ma-chine (SVM) decision tree (DT) and multinomial NaıveBayes (MNB) are evaluated for multiclass eventclassification

Deep learning models ie convolutional neural network(CNN) deep neural network (DNN) and recurrent neuralnetwork (RNN) are also evaluated for multiclass eventclassification

Table 1 Summary of the related research

Paper reference Classifier used Dataset Accuracy ()[33] CNN and RNN Movie reviews 92

[34] CNN and RNN 1 Stanford movie review dataset2 Stanford Treebank dataset 933 and 892

[35] Naıve Bayes and SVM Corpus of Urdu documents 8953 and 9334[36] Dynamic neural network News articles 965[38] Rule-based modeling Urdu corpus of news headlines 3125[39] LSTM Tweets 8100[41] K-NN and J48 Arabic corpus of 2700 documents 95 and 86

Figure 1 Urdu and Hindi language text on social media

4 Scientific Programming

A collection of Urdu text documentsD d1 d2 dn1113864 1113865 is split into a set of sentencesS s1 s2 sn1113864 1113865 Our purpose is to classify the sentences toa predefined set of events E e1 e2 en1113864 1113865

Various feature generating methods are used to create afeature vector for deep learning and machine learning classi-fiers ie TF-IDF one-hot-encoding and word embeddingFeature vectors generated by all these techniques are fed up asinput into the embedding layer of neural networks )e outputgenerated by the embedding layers is fed up into the next fullyconnected layer (dense layer) of deep learning models ieRNN CNN and DNN A relevant class label out of twelvecategories is assigned to each sentence at the end of modelprocessing in the testingvalidation phase

Bag-of-words is a common method to represent text Itignores the sequence order and semantic of text [43] whilethe one-hot-coding method maintains the sequence of textWord embedding methods Word2Vec and Glove (httpsybbaigogitbooksio26pretrained-word-embeddingshtml)that are used to generate feature vectors for deep learningmodels are highly recommended for textual data Howeverin the case of Urdu text classification pre-existing wrod2vecand Glove are incompatible

)e framework of our designed system is represented inFigure 4 It shows the structure of our system from takinginput to producing output

5 Experimental Setup

We performed many experiments on our dataset by usingvarious traditional machine learning and deep learningclassifiers )e purpose of many experiments is to find themost efficient and accurate classification model for themulticlass event on an imbalance dataset for the Urdulanguage text A detailed comparison between traditionalclassifiers and deep neural classifiers is given in the nextsection

51 Feature Space Unigram and bigram tokens of the wholecorpus are used as features to create the feature space TF-IDF vectorization is used to create a dictionary-based modelIt consists of 656608 features )e training and testingdataset are converted to TF-IDF dictionary-based featurevectors A convolutional sequential model (see Figure 5)consists of three layers ie the input layer hidden layer andoutput layer which are used to evaluate our dataset Sim-ilarly word embedding and one-hot-encoding are also in-cluded in our feature space to enlarge the scope of ourresearch problem

52 Feature Vector Generating Techniques Feature vectorsare the numerical representation of text )ey are an actualform of input that can be processed by the machine learningclassifier )ere are several feature generating techniquesused for text processing We used the following featurevector generating techniques

521 Word Embedding A numerical representation of thetext is that each word is considered as a feature vector Itcreates a dense vector of real values that captures thecontextual semantical and syntactical meaning of the wordIt also ensures that similar words should have a relatedweighted value [29]

522 Pretrained Word Embedding Models Usage of apretrained word embedding model for the small amount ofdata is highly recommended by researchers in state of the artGlove and Word2Vec are famous word embedding modelsthat are developed by using a big amount of data Wordembedding models for text classification especially in theEnglish language showed promising results It has emergedas a powerful feature vector generating technique amongothers ie TF TF-IDF and one-hot encoding etc

In our research case sentence classification for differentevents in the Urdu language using the word embeddingtechnique is potentially preferable Unfortunately the Urdulanguage is lacking in processing resources We found onlythree word embedding models a word embedding model

Table 2 Sentence tokenization

Sentence Tokenized sentenceیلےلناجیکںوگولددعتمےنسرئاوانورک یل ےل ناج ںوگول ددعتم سرئاو انورک

یئگرگتھچیکںورھگیئکےسشرابینافوط ئگ رگ تھچ ںورھگ یئک شراب ینافوط

Table 3 Event label

Event LabelSports 1Inflation 2Murder and death 3Terrorist attack 4Politics 5Law and order 6Earthquake 7Showbiz 8Fraud and corruption 9Rainweather 10Sexual assault 11

Figure 2 )e few instances of the dataset

Scientific Programming 5

0

5000

10000

15000

20000

25000

30000

35000

40000To

tal n

umbe

r of i

nsta

nces

Imbalance dataset of the urdu label sentences for events

Type of events

18746

31966932

3034

33421

69603238

741710078

3406 2916 3617

Spor

ts

Infla

tion

Mur

der

Terr

orist

atta

ck

Polit

ics

Law

and

orde

r

Eart

hqua

ke

Show

biz

Frau

d an

dco

rrup

tion

Wea

ther

Sexu

al as

saul

t

Busin

ess

Figure 3 Imbalance instances of the dataset

Raw input Preprocessed input Word embedding layersDLM

RNN (LSTM-unidirection)

CNN

DNN

LabelsTF-IDF

Documents1

Documents2

Documents3

Documents

n

1

2

3

12

Figure 4 Methodology

00

050

055

060

065

070

075

080

085

05 10 15 20 25 30 35 40Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 5 DDNrsquos accuracy

6 Scientific Programming

[44] that is developed by using three publicly available Urdudatasets Wikipediarsquos Urdu text another corpus having 90million tokens [45] and 35 million tokens [46] It has 102214unique tokens Each token comprises 300-dimensional realvalues Another model publicly available for research purposesconsists of 25925 unique words of Urdu language [47] Everyword has a 400-dimensional value A word embedding modelcomprises web-based text created to classify text It consists of64653 unique Urdu words and 300 dimensions for each word

)e journey of research is not over here to expand ourresearch scope and find the most efficient word embeddingmodel for sentence classification we decided to developcustom word embedding models We developed four wordembedding models that contain 57251 unique words

)e results of pretrained existing word embedding modelsare good at the initial level but very low ie 6026 accuracyWe explored the contents of these models which revealed thatmanywords are irrelevant and borrowed from other languagesie Arabic and Persian )e contents of Wikipedia are entirelydifferent than news websites that also affect the performance ofembedding models Another major factor ie low amount ofdata affected the feature vector generation quality Stop wordsin the pretrained word embedding model are not eliminatedand considered as a token while in our dataset all the stopwords are removed It also reduces the size of the vocabulary ofthe model while generating a feature vector )erefore wedecided to develop a custom word embedding model on ourpreprocessed dataset To postulate the enlargement of theresearch task three different word embedding models aredeveloped )e details of all used pretrained word embeddingmodels are given in Table 4

523 One-Hot-Encoding Text cannot be processed directlybymachine learning classifiers therefore we need to convertthe text into a real value We used one-hot-encoding toconvert text to numeric features For example the sentencesgiven in Table 5 can be converted into a numeric featurevector using one-hot-encoding as shown in Table 6

524 TF-IDF TF and TF-IDF are feature engineeringtechniques that transform the text into the numerical for-mat It is one of the most highly used feature vectors forcreating a method for text data )ree deep learning modelswere evaluated on our corpus )e sequential model withembedding layers outperformed other pretrained wordembedding models [44] reported in state of the art [48] )edetailed summary of the evaluation results of CNN RNNand DNN is discussed in Section 7

53 Deep Learning Models

531 Deep Neural Network Architecture Our DNN archi-tecture consists of three layers ie n-input layer 150 hidden(dense) layers and 12 output layers Feature vector is givenas input into a dense layer that is fully connected )eSoftMax activation function is used in the output layer toclassify sentences into multiple classes

532 Recurrence Neural Network )e recurrence neuralnetwork is evaluated using a long short-term memory(LSTM) classifier RNN consists of embedding dropoutLSTM and dense layers A dictionary of 30000 unique mostfrequent tokens is made )e sentences are standardized tothe same length by using a padding sequence)e dimensionof the feature vector is set as 250 RNN showed an overall81 accuracy that is the second highest in our work

533 Convolutional Neural Network (CNN) CNN is a classof deep neural networks that are highly recommended forimage processing [49] It consists of the input layer (em-bedding layer) multiple hidden layers and an output layer)ere are a series of convolutional layers that convolve witha multiplication )e embedded sequence layer and averagelayer (GloobalAveragePooling1D) are also part of the hiddenlayer )e common activation of CNN is RELU Layer )edetails of the hypermeters that are used in our problem totrain the CNN model are given in Table 7

534 Hyperparameters In this section all the hyper-parameters that are used in our experiments are given in thetabular format Only those hyperparameters are being dis-cussed here which have achieved the highest accuracy ofDNN RNN and CNN models )e hyperparameters ofDNN that are fine-tuned in our work are given in Table 8

)e RNN model showed the highest accuracy (803and 81) on two sets of hyperparameters that are given inTable 9 Similarly Table 7 provides the details of thehyperparameters of the convolutional neural network

6 Performance Measuring Parameters

)emost common performance measuring [41] parametersie precision recall and F1-measure are used to evaluatethe proposed framework )e selection of these parameterswas decided because of the multiclass classification andimbalance dataset

Precision TP

(TP + FP) (1)

Recall TP

(TP + FN) (2)

F1 2lowast PrecisionlowastRecall( 1113857

(Precision + Recall) (3)

Accuracy (TP + TN)

(TP + TN + FP + FN) (4)

where TP TN FP and FN represent total positive totalnegative false positive and false negative values respec-tively Precision is defined as the closeness of the mea-surements to each other and recall is the ratio of the totalamount of relevant (ie TP values) instances that wereactually retrieved during the experimental work It is

Scientific Programming 7

noteworthy that both precision and recall are the relativevalues of measure of relevance

7 Results

71 Deep Learning Classifiers )e feature vector can begenerated using different techniques )e details of feature

vector generating techniques were discussed in Section 5)e results of feature vector generating techniques that wereused in our work ie ldquomulticlass event classification for theUrdu language textrdquo are given in the proceedingsubsections

711 Pretrained Word Embedding Models )e convolu-tional neural network model is evaluated on the featuresvectors that were generated by all pretrained word em-bedding models )e summary of all results generated by

Table 4 Pretrained word embedding model and custom word embedding model

Sr no Unique words Dimension Window sizeExisting pretrained word embedding models1 [11] 64653 300 mdash2 [19] 102214 100 mdash3 53454 300 mdashCustom pretrained word embedding models1 57251 50 22 57251 100 23 57251 100 34 57251 350 1

Table 5 Event sentence

Urdu sentence English sentenceےہاتلیھکلابٹفیلع Ali plays football

یلےلناجیکںوگولںوھکالےنسرئاوانورک Corona virus killed millions of people

Table 6 Event sentence converted using one-hot-encoding

Sentence اتلیھک لاب ٹف یلع ناج ںوگول ںوھکال سرئاو انورک1 1 1 1 1 0 0 0 0 02 0 0 0 0 1 1 1 1 1

Table 7 CNNrsquos hyperparameters

CNN (7928)Parameter ValueMax_words 20000Batch size 128Embedding_dim 50Activation function SoftMaxDense_node 256Trainingtesting 70ndash30No of epochs 20Loss function Categorical cross-entropy

Table 8 DNNrsquos hyperparameters

Parameter ValueMax_words 5000Batch size 128Embedding_dim 512Activation function SoftMaxLayers 04Trainingtesting 70ndash30No of epochs 15Loss function Sparse categorical cross-entropy

Table 9 RNNrsquos hyperparameters

Parameter ValueRNN (LSTM) (803)Max_words 50000Batch size 64Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 90ndash10No of epochs 05Loss function Sparse categorical cross-entropyRNN (LSTM) (81)Max_words 30000Batch size 128Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 80ndash20No of epochs 05Loss function Sparse categorical cross-entropy

8 Scientific Programming

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017

[15] M P Akhter Z Jiangbin I R Naqvi M AbdelmajeedA Mehmood and M T Sadiq ldquoDocument-level text clas-sification using single-layer multisize filters convolutionalneural networkrdquo IEEE Access vol 8 pp 42689ndash42707 2021

[16] U Pal and A Sarkar ldquoRecognition of printed Urdu scriptrdquo inProceedings of the 2003 Seventh International Conference onDocument Analysis and Recognition pp 1183ndash1187 IEEEEdinburgh Scotland August 2003

[17] Y Yang T Pierce and J Carbonell ldquoA study of retrospectiveand on-line event detectionrdquo in Proceedings of the 21st annualinternational ACM SIGIR conference on Research and

Scientific Programming 13

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T Kala ldquoEvent detection from text datardquo ComputationalIntelligence vol 31 pp 132ndash164 2015

[19] M Naughton N Stokes and J Carthy ldquoSentence-level eventclassification in unstructured textsrdquo Information Retrievalvol 13 no 2 pp 132ndash156 2010

[20] G Jacobs E Lefever and V Hoste ldquoEconomic event de-tection in company-specific news textrdquo in Proceedings of theFirst Workshop on Economics and Natural Language Pro-cessing pp 1ndash10 Melbourne Australia July 2018

[21] E DrsquoAndrea P Ducange A Bechini A Renda andF Marcelloni ldquoMonitoring the public opinion about thevaccination topic from tweets analysisrdquo Expert Systems withApplications vol 116 pp 209ndash226 2019

[22] M Sokolova and G Lapalme ldquoA systematic analysis ofperformance measures for classification tasksrdquo InformationProcessing amp Management vol 45 no 4 pp 427ndash437 2009

[23] T Mikolov I Sutskever K Chen G S Corrado and J DeanldquoDistributed representations of words and phrases and theircompositionalityrdquo in Proceedings Advances Neural Informa-tion Processing Systems vol 26 pp 3111ndash3119 Lake TahoeNV USA December 2013

[24] Y Bengio R Ducharme P Vincent and C Jauvin ldquoA neuralprobabilistic language modelrdquo Journal of Machine LearningResearch vol 3 pp 1137ndash1155 2003

[25] M P Akhter Z Jiangbin I R Naqvi M Abdelmajeed andM Fayyaz ldquoExploring deep learning approaches for Urdu textclassification in product manufacturingrdquo Enterprise Infor-mation Systems pp 1ndash26 2020

[26] G Liu and J Guo ldquoBidirectional LSTM with attentionmechanism and convolutional layer for text classificationrdquoNeurocomputing vol 337 pp 325ndash338 2019

[27] M Sharjeel R M A Nawab and P Rayson ldquoCOUNTERcorpus of Urdu news text reuserdquo Language Resources andEvaluation vol 51 no 3 pp 777ndash803 2017

[28] K Mehmood D Essam and K Shafi ldquoSentiment analysissystem for Roman Urdurdquo in Proceedings of the 2018 Scienceand Information Conference pp 29ndash42 Springer CasablancaMorocco July 2018

[29] K Ahmed M Ali S Khalid and M Kamran ldquoFramework forUrdu news headlines classificationrdquo Journal of Applied ComputerScience amp Mathematics vol 10 no 1 pp 17ndash21 2016

[30] Z Tehseen M P Akhter and Q Abbas ldquoComparative studyof feature selection approaches for Urdu text categorizationrdquoMalaysian Journal Computer Science vol 28 no 2 pp 93ndash109 2015

[31] W Yin and L Shen ldquoA short text classification approach withevent detection and conceptual informationrdquo in Proceedingsof the 2020 5th International Conference on Machine LearningTechnologies pp 129ndash135 Beijing China June 2020

[32] H Zhou M Huang T Zhang et al ldquoEmotional chattingmachine emotional conversation generation with internaland external memoryrdquo in Proceedings of the irty-SecondAAAI Conference on Artificial Intelligence New Orleans LAUSA February 2018

[33] A Hassan and A Mahmood ldquoConvolutional recurrent deeplearningmodel for sentence classificationrdquo IEEE Access vol 6pp 13949ndash13957 2018

[34] T Zia M P Akhter and Q Abbas ldquoComparative study offeature selection approaches for Urdu text categorizationrdquoMalaysian Journal of Computer Science vol 28 no 2pp 93ndash109 2015

[35] A R Ali andM Ijaz ldquoUrdu text classificationrdquo in Proceedingsof the 7th international conference on frontiers of informationtechnology pp 1ndash7 Abbottabad Pakistan December 2009

[36] M Usman Z Shafique S Ayub and K Malik ldquoUrdu textclassification using majority votingrdquo International Journal ofAdvanced Computer Science and Applications vol 7 no 8pp 265ndash273 2016

[37] N Kalchbrenner E Grefenstette and P Blunsom ldquoA con-volutional neural network for modelling sentencesrdquo 2014httparxivorgabs14042188

[38] D M Awais and DM Shoaib ldquoRole of discourse informationin Urdu sentiment classificationrdquo ACMTransactions on Asianand Low-Resource Language Information Processing vol 18no 4 pp 1ndash37 2019

[39] J P Singh Y K Dwivedi N P Rana A Kumar andK K Kapoor ldquoEvent classification and location predictionfrom tweets during disastersrdquo Annals of Operations Researchvol 283 no 1-2 pp 737ndash757 2019

[40] R C Paulo D Fillipe and M S C Sergo ldquoClassification ofevents on social mediardquo 2016

[41] Q A Al-Radaideh and M A Al-Abrat ldquoAn Arabic textcategorization approach using term weighting and multiplereductsrdquo Soft Computing vol 23 no 14 pp 5849ndash5863 2019

[42] J F Allen ldquoMaintaining knowledge about temporal inter-valsrdquo Communications of the ACM vol 26 no 11pp 832ndash843 1983

[43] T Joachims ldquoText categorization with support vector ma-chines learning with many relevant featuresrdquo in Proceedingsof the European conference on machine learning pp 137ndash142Springer Chemnitz Germany April 1998

[44] S Haider ldquoUrdu word embeddingsrdquo in Proceedings of theEleventh International Conference on Language Resources andEvaluation (LREC 2018) Miyazaki Japan May 2018

[45] B Jawaid A Kamran and O Bojar ldquoUrdu monolingual corpusrdquoLINDATCLARIN Digital Library at the Institute of Formal andApplied Linguistics Charles University Prague Czechia

[46] F Adeeba Q Akram H Khalid and S Hussain ldquoCLE Urdubooks n-gramsrdquo in Proceedings of the Conference on Languageand Technology CLT 14 Karachi Pakistan May 2014

[47] A Hassan and A Mahmood ldquoDeep learning for sentenceclassificationrdquo in Proceedings of the 2017 IEEE Long IslandSystems Applications and Technology Conference (LISAT)pp 1ndash5 IEEE New York NY USA May 2017

[48] D-X Zhou ldquoUniversality of deep convolutional neuralnetworksrdquo Applied and Computational Harmonic Analysisvol 48 no 2 pp 787ndash794 2020

[49] M V Valueva N N Nagornov P A Lyakhov G V Valuevand N I Chervyakov ldquoApplication of the residue numbersystem to reduce hardware costs of the convolutional neuralnetwork implementationrdquo Mathematics and Computers inSimulation vol 177 pp 232ndash243 2020

[50] G Guo H Wang D Bell Y Bi and K Greer ldquoKNN model-based approach in classificationrdquo in Proceedings of the OTMConfederated International Conferences ldquoOn the Move toMeaningful Internet Systemsrdquo Catania Italy November 2003

[51] Y Zhong ldquo)e analysis of cases based on decision treerdquo inProceedings of the 2016 7th IEEE international conference onsoftware engineering and service science (ICSESS) pp 142ndash147IEEE Beijing China August 2016

[52] S Xu ldquoBayesian Naıve Bayes classifiers to text classificationrdquoJournal of Information Science vol 44 no 1 pp 48ndash59 2018

[53] T Zhang and F Oles ldquoText categorization based on regu-larized linear classification methodsrdquo Information Retrievalvol 4 no 1 pp 5ndash31 2001

14 Scientific Programming

[54] J Ali R Khan N Ahmad and I Maqsood ldquoRandom forestsand decision treesrdquo International Journal of Computer ScienceIssues (IJCSI) vol 9 no 5 p 272 2012

[55] Y Zhang ldquoSupport vector machine classification algorithmand its applicationrdquo in Proceedings of the InternationalConference on Information Computing and Applicationspp 179ndash186 Springer Bhubaneswar India September 2012

Scientific Programming 15

Page 5: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

A collection of Urdu text documentsD d1 d2 dn1113864 1113865 is split into a set of sentencesS s1 s2 sn1113864 1113865 Our purpose is to classify the sentences toa predefined set of events E e1 e2 en1113864 1113865

Various feature generating methods are used to create afeature vector for deep learning and machine learning classi-fiers ie TF-IDF one-hot-encoding and word embeddingFeature vectors generated by all these techniques are fed up asinput into the embedding layer of neural networks )e outputgenerated by the embedding layers is fed up into the next fullyconnected layer (dense layer) of deep learning models ieRNN CNN and DNN A relevant class label out of twelvecategories is assigned to each sentence at the end of modelprocessing in the testingvalidation phase

Bag-of-words is a common method to represent text Itignores the sequence order and semantic of text [43] whilethe one-hot-coding method maintains the sequence of textWord embedding methods Word2Vec and Glove (httpsybbaigogitbooksio26pretrained-word-embeddingshtml)that are used to generate feature vectors for deep learningmodels are highly recommended for textual data Howeverin the case of Urdu text classification pre-existing wrod2vecand Glove are incompatible

)e framework of our designed system is represented inFigure 4 It shows the structure of our system from takinginput to producing output

5 Experimental Setup

We performed many experiments on our dataset by usingvarious traditional machine learning and deep learningclassifiers )e purpose of many experiments is to find themost efficient and accurate classification model for themulticlass event on an imbalance dataset for the Urdulanguage text A detailed comparison between traditionalclassifiers and deep neural classifiers is given in the nextsection

51 Feature Space Unigram and bigram tokens of the wholecorpus are used as features to create the feature space TF-IDF vectorization is used to create a dictionary-based modelIt consists of 656608 features )e training and testingdataset are converted to TF-IDF dictionary-based featurevectors A convolutional sequential model (see Figure 5)consists of three layers ie the input layer hidden layer andoutput layer which are used to evaluate our dataset Sim-ilarly word embedding and one-hot-encoding are also in-cluded in our feature space to enlarge the scope of ourresearch problem

52 Feature Vector Generating Techniques Feature vectorsare the numerical representation of text )ey are an actualform of input that can be processed by the machine learningclassifier )ere are several feature generating techniquesused for text processing We used the following featurevector generating techniques

521 Word Embedding A numerical representation of thetext is that each word is considered as a feature vector Itcreates a dense vector of real values that captures thecontextual semantical and syntactical meaning of the wordIt also ensures that similar words should have a relatedweighted value [29]

522 Pretrained Word Embedding Models Usage of apretrained word embedding model for the small amount ofdata is highly recommended by researchers in state of the artGlove and Word2Vec are famous word embedding modelsthat are developed by using a big amount of data Wordembedding models for text classification especially in theEnglish language showed promising results It has emergedas a powerful feature vector generating technique amongothers ie TF TF-IDF and one-hot encoding etc

In our research case sentence classification for differentevents in the Urdu language using the word embeddingtechnique is potentially preferable Unfortunately the Urdulanguage is lacking in processing resources We found onlythree word embedding models a word embedding model

Table 2 Sentence tokenization

Sentence Tokenized sentenceیلےلناجیکںوگولددعتمےنسرئاوانورک یل ےل ناج ںوگول ددعتم سرئاو انورک

یئگرگتھچیکںورھگیئکےسشرابینافوط ئگ رگ تھچ ںورھگ یئک شراب ینافوط

Table 3 Event label

Event LabelSports 1Inflation 2Murder and death 3Terrorist attack 4Politics 5Law and order 6Earthquake 7Showbiz 8Fraud and corruption 9Rainweather 10Sexual assault 11

Figure 2 )e few instances of the dataset

Scientific Programming 5

0

5000

10000

15000

20000

25000

30000

35000

40000To

tal n

umbe

r of i

nsta

nces

Imbalance dataset of the urdu label sentences for events

Type of events

18746

31966932

3034

33421

69603238

741710078

3406 2916 3617

Spor

ts

Infla

tion

Mur

der

Terr

orist

atta

ck

Polit

ics

Law

and

orde

r

Eart

hqua

ke

Show

biz

Frau

d an

dco

rrup

tion

Wea

ther

Sexu

al as

saul

t

Busin

ess

Figure 3 Imbalance instances of the dataset

Raw input Preprocessed input Word embedding layersDLM

RNN (LSTM-unidirection)

CNN

DNN

LabelsTF-IDF

Documents1

Documents2

Documents3

Documents

n

1

2

3

12

Figure 4 Methodology

00

050

055

060

065

070

075

080

085

05 10 15 20 25 30 35 40Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 5 DDNrsquos accuracy

6 Scientific Programming

[44] that is developed by using three publicly available Urdudatasets Wikipediarsquos Urdu text another corpus having 90million tokens [45] and 35 million tokens [46] It has 102214unique tokens Each token comprises 300-dimensional realvalues Another model publicly available for research purposesconsists of 25925 unique words of Urdu language [47] Everyword has a 400-dimensional value A word embedding modelcomprises web-based text created to classify text It consists of64653 unique Urdu words and 300 dimensions for each word

)e journey of research is not over here to expand ourresearch scope and find the most efficient word embeddingmodel for sentence classification we decided to developcustom word embedding models We developed four wordembedding models that contain 57251 unique words

)e results of pretrained existing word embedding modelsare good at the initial level but very low ie 6026 accuracyWe explored the contents of these models which revealed thatmanywords are irrelevant and borrowed from other languagesie Arabic and Persian )e contents of Wikipedia are entirelydifferent than news websites that also affect the performance ofembedding models Another major factor ie low amount ofdata affected the feature vector generation quality Stop wordsin the pretrained word embedding model are not eliminatedand considered as a token while in our dataset all the stopwords are removed It also reduces the size of the vocabulary ofthe model while generating a feature vector )erefore wedecided to develop a custom word embedding model on ourpreprocessed dataset To postulate the enlargement of theresearch task three different word embedding models aredeveloped )e details of all used pretrained word embeddingmodels are given in Table 4

523 One-Hot-Encoding Text cannot be processed directlybymachine learning classifiers therefore we need to convertthe text into a real value We used one-hot-encoding toconvert text to numeric features For example the sentencesgiven in Table 5 can be converted into a numeric featurevector using one-hot-encoding as shown in Table 6

524 TF-IDF TF and TF-IDF are feature engineeringtechniques that transform the text into the numerical for-mat It is one of the most highly used feature vectors forcreating a method for text data )ree deep learning modelswere evaluated on our corpus )e sequential model withembedding layers outperformed other pretrained wordembedding models [44] reported in state of the art [48] )edetailed summary of the evaluation results of CNN RNNand DNN is discussed in Section 7

53 Deep Learning Models

531 Deep Neural Network Architecture Our DNN archi-tecture consists of three layers ie n-input layer 150 hidden(dense) layers and 12 output layers Feature vector is givenas input into a dense layer that is fully connected )eSoftMax activation function is used in the output layer toclassify sentences into multiple classes

532 Recurrence Neural Network )e recurrence neuralnetwork is evaluated using a long short-term memory(LSTM) classifier RNN consists of embedding dropoutLSTM and dense layers A dictionary of 30000 unique mostfrequent tokens is made )e sentences are standardized tothe same length by using a padding sequence)e dimensionof the feature vector is set as 250 RNN showed an overall81 accuracy that is the second highest in our work

533 Convolutional Neural Network (CNN) CNN is a classof deep neural networks that are highly recommended forimage processing [49] It consists of the input layer (em-bedding layer) multiple hidden layers and an output layer)ere are a series of convolutional layers that convolve witha multiplication )e embedded sequence layer and averagelayer (GloobalAveragePooling1D) are also part of the hiddenlayer )e common activation of CNN is RELU Layer )edetails of the hypermeters that are used in our problem totrain the CNN model are given in Table 7

534 Hyperparameters In this section all the hyper-parameters that are used in our experiments are given in thetabular format Only those hyperparameters are being dis-cussed here which have achieved the highest accuracy ofDNN RNN and CNN models )e hyperparameters ofDNN that are fine-tuned in our work are given in Table 8

)e RNN model showed the highest accuracy (803and 81) on two sets of hyperparameters that are given inTable 9 Similarly Table 7 provides the details of thehyperparameters of the convolutional neural network

6 Performance Measuring Parameters

)emost common performance measuring [41] parametersie precision recall and F1-measure are used to evaluatethe proposed framework )e selection of these parameterswas decided because of the multiclass classification andimbalance dataset

Precision TP

(TP + FP) (1)

Recall TP

(TP + FN) (2)

F1 2lowast PrecisionlowastRecall( 1113857

(Precision + Recall) (3)

Accuracy (TP + TN)

(TP + TN + FP + FN) (4)

where TP TN FP and FN represent total positive totalnegative false positive and false negative values respec-tively Precision is defined as the closeness of the mea-surements to each other and recall is the ratio of the totalamount of relevant (ie TP values) instances that wereactually retrieved during the experimental work It is

Scientific Programming 7

noteworthy that both precision and recall are the relativevalues of measure of relevance

7 Results

71 Deep Learning Classifiers )e feature vector can begenerated using different techniques )e details of feature

vector generating techniques were discussed in Section 5)e results of feature vector generating techniques that wereused in our work ie ldquomulticlass event classification for theUrdu language textrdquo are given in the proceedingsubsections

711 Pretrained Word Embedding Models )e convolu-tional neural network model is evaluated on the featuresvectors that were generated by all pretrained word em-bedding models )e summary of all results generated by

Table 4 Pretrained word embedding model and custom word embedding model

Sr no Unique words Dimension Window sizeExisting pretrained word embedding models1 [11] 64653 300 mdash2 [19] 102214 100 mdash3 53454 300 mdashCustom pretrained word embedding models1 57251 50 22 57251 100 23 57251 100 34 57251 350 1

Table 5 Event sentence

Urdu sentence English sentenceےہاتلیھکلابٹفیلع Ali plays football

یلےلناجیکںوگولںوھکالےنسرئاوانورک Corona virus killed millions of people

Table 6 Event sentence converted using one-hot-encoding

Sentence اتلیھک لاب ٹف یلع ناج ںوگول ںوھکال سرئاو انورک1 1 1 1 1 0 0 0 0 02 0 0 0 0 1 1 1 1 1

Table 7 CNNrsquos hyperparameters

CNN (7928)Parameter ValueMax_words 20000Batch size 128Embedding_dim 50Activation function SoftMaxDense_node 256Trainingtesting 70ndash30No of epochs 20Loss function Categorical cross-entropy

Table 8 DNNrsquos hyperparameters

Parameter ValueMax_words 5000Batch size 128Embedding_dim 512Activation function SoftMaxLayers 04Trainingtesting 70ndash30No of epochs 15Loss function Sparse categorical cross-entropy

Table 9 RNNrsquos hyperparameters

Parameter ValueRNN (LSTM) (803)Max_words 50000Batch size 64Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 90ndash10No of epochs 05Loss function Sparse categorical cross-entropyRNN (LSTM) (81)Max_words 30000Batch size 128Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 80ndash20No of epochs 05Loss function Sparse categorical cross-entropy

8 Scientific Programming

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017

[15] M P Akhter Z Jiangbin I R Naqvi M AbdelmajeedA Mehmood and M T Sadiq ldquoDocument-level text clas-sification using single-layer multisize filters convolutionalneural networkrdquo IEEE Access vol 8 pp 42689ndash42707 2021

[16] U Pal and A Sarkar ldquoRecognition of printed Urdu scriptrdquo inProceedings of the 2003 Seventh International Conference onDocument Analysis and Recognition pp 1183ndash1187 IEEEEdinburgh Scotland August 2003

[17] Y Yang T Pierce and J Carbonell ldquoA study of retrospectiveand on-line event detectionrdquo in Proceedings of the 21st annualinternational ACM SIGIR conference on Research and

Scientific Programming 13

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T Kala ldquoEvent detection from text datardquo ComputationalIntelligence vol 31 pp 132ndash164 2015

[19] M Naughton N Stokes and J Carthy ldquoSentence-level eventclassification in unstructured textsrdquo Information Retrievalvol 13 no 2 pp 132ndash156 2010

[20] G Jacobs E Lefever and V Hoste ldquoEconomic event de-tection in company-specific news textrdquo in Proceedings of theFirst Workshop on Economics and Natural Language Pro-cessing pp 1ndash10 Melbourne Australia July 2018

[21] E DrsquoAndrea P Ducange A Bechini A Renda andF Marcelloni ldquoMonitoring the public opinion about thevaccination topic from tweets analysisrdquo Expert Systems withApplications vol 116 pp 209ndash226 2019

[22] M Sokolova and G Lapalme ldquoA systematic analysis ofperformance measures for classification tasksrdquo InformationProcessing amp Management vol 45 no 4 pp 427ndash437 2009

[23] T Mikolov I Sutskever K Chen G S Corrado and J DeanldquoDistributed representations of words and phrases and theircompositionalityrdquo in Proceedings Advances Neural Informa-tion Processing Systems vol 26 pp 3111ndash3119 Lake TahoeNV USA December 2013

[24] Y Bengio R Ducharme P Vincent and C Jauvin ldquoA neuralprobabilistic language modelrdquo Journal of Machine LearningResearch vol 3 pp 1137ndash1155 2003

[25] M P Akhter Z Jiangbin I R Naqvi M Abdelmajeed andM Fayyaz ldquoExploring deep learning approaches for Urdu textclassification in product manufacturingrdquo Enterprise Infor-mation Systems pp 1ndash26 2020

[26] G Liu and J Guo ldquoBidirectional LSTM with attentionmechanism and convolutional layer for text classificationrdquoNeurocomputing vol 337 pp 325ndash338 2019

[27] M Sharjeel R M A Nawab and P Rayson ldquoCOUNTERcorpus of Urdu news text reuserdquo Language Resources andEvaluation vol 51 no 3 pp 777ndash803 2017

[28] K Mehmood D Essam and K Shafi ldquoSentiment analysissystem for Roman Urdurdquo in Proceedings of the 2018 Scienceand Information Conference pp 29ndash42 Springer CasablancaMorocco July 2018

[29] K Ahmed M Ali S Khalid and M Kamran ldquoFramework forUrdu news headlines classificationrdquo Journal of Applied ComputerScience amp Mathematics vol 10 no 1 pp 17ndash21 2016

[30] Z Tehseen M P Akhter and Q Abbas ldquoComparative studyof feature selection approaches for Urdu text categorizationrdquoMalaysian Journal Computer Science vol 28 no 2 pp 93ndash109 2015

[31] W Yin and L Shen ldquoA short text classification approach withevent detection and conceptual informationrdquo in Proceedingsof the 2020 5th International Conference on Machine LearningTechnologies pp 129ndash135 Beijing China June 2020

[32] H Zhou M Huang T Zhang et al ldquoEmotional chattingmachine emotional conversation generation with internaland external memoryrdquo in Proceedings of the irty-SecondAAAI Conference on Artificial Intelligence New Orleans LAUSA February 2018

[33] A Hassan and A Mahmood ldquoConvolutional recurrent deeplearningmodel for sentence classificationrdquo IEEE Access vol 6pp 13949ndash13957 2018

[34] T Zia M P Akhter and Q Abbas ldquoComparative study offeature selection approaches for Urdu text categorizationrdquoMalaysian Journal of Computer Science vol 28 no 2pp 93ndash109 2015

[35] A R Ali andM Ijaz ldquoUrdu text classificationrdquo in Proceedingsof the 7th international conference on frontiers of informationtechnology pp 1ndash7 Abbottabad Pakistan December 2009

[36] M Usman Z Shafique S Ayub and K Malik ldquoUrdu textclassification using majority votingrdquo International Journal ofAdvanced Computer Science and Applications vol 7 no 8pp 265ndash273 2016

[37] N Kalchbrenner E Grefenstette and P Blunsom ldquoA con-volutional neural network for modelling sentencesrdquo 2014httparxivorgabs14042188

[38] D M Awais and DM Shoaib ldquoRole of discourse informationin Urdu sentiment classificationrdquo ACMTransactions on Asianand Low-Resource Language Information Processing vol 18no 4 pp 1ndash37 2019

[39] J P Singh Y K Dwivedi N P Rana A Kumar andK K Kapoor ldquoEvent classification and location predictionfrom tweets during disastersrdquo Annals of Operations Researchvol 283 no 1-2 pp 737ndash757 2019

[40] R C Paulo D Fillipe and M S C Sergo ldquoClassification ofevents on social mediardquo 2016

[41] Q A Al-Radaideh and M A Al-Abrat ldquoAn Arabic textcategorization approach using term weighting and multiplereductsrdquo Soft Computing vol 23 no 14 pp 5849ndash5863 2019

[42] J F Allen ldquoMaintaining knowledge about temporal inter-valsrdquo Communications of the ACM vol 26 no 11pp 832ndash843 1983

[43] T Joachims ldquoText categorization with support vector ma-chines learning with many relevant featuresrdquo in Proceedingsof the European conference on machine learning pp 137ndash142Springer Chemnitz Germany April 1998

[44] S Haider ldquoUrdu word embeddingsrdquo in Proceedings of theEleventh International Conference on Language Resources andEvaluation (LREC 2018) Miyazaki Japan May 2018

[45] B Jawaid A Kamran and O Bojar ldquoUrdu monolingual corpusrdquoLINDATCLARIN Digital Library at the Institute of Formal andApplied Linguistics Charles University Prague Czechia

[46] F Adeeba Q Akram H Khalid and S Hussain ldquoCLE Urdubooks n-gramsrdquo in Proceedings of the Conference on Languageand Technology CLT 14 Karachi Pakistan May 2014

[47] A Hassan and A Mahmood ldquoDeep learning for sentenceclassificationrdquo in Proceedings of the 2017 IEEE Long IslandSystems Applications and Technology Conference (LISAT)pp 1ndash5 IEEE New York NY USA May 2017

[48] D-X Zhou ldquoUniversality of deep convolutional neuralnetworksrdquo Applied and Computational Harmonic Analysisvol 48 no 2 pp 787ndash794 2020

[49] M V Valueva N N Nagornov P A Lyakhov G V Valuevand N I Chervyakov ldquoApplication of the residue numbersystem to reduce hardware costs of the convolutional neuralnetwork implementationrdquo Mathematics and Computers inSimulation vol 177 pp 232ndash243 2020

[50] G Guo H Wang D Bell Y Bi and K Greer ldquoKNN model-based approach in classificationrdquo in Proceedings of the OTMConfederated International Conferences ldquoOn the Move toMeaningful Internet Systemsrdquo Catania Italy November 2003

[51] Y Zhong ldquo)e analysis of cases based on decision treerdquo inProceedings of the 2016 7th IEEE international conference onsoftware engineering and service science (ICSESS) pp 142ndash147IEEE Beijing China August 2016

[52] S Xu ldquoBayesian Naıve Bayes classifiers to text classificationrdquoJournal of Information Science vol 44 no 1 pp 48ndash59 2018

[53] T Zhang and F Oles ldquoText categorization based on regu-larized linear classification methodsrdquo Information Retrievalvol 4 no 1 pp 5ndash31 2001

14 Scientific Programming

[54] J Ali R Khan N Ahmad and I Maqsood ldquoRandom forestsand decision treesrdquo International Journal of Computer ScienceIssues (IJCSI) vol 9 no 5 p 272 2012

[55] Y Zhang ldquoSupport vector machine classification algorithmand its applicationrdquo in Proceedings of the InternationalConference on Information Computing and Applicationspp 179ndash186 Springer Bhubaneswar India September 2012

Scientific Programming 15

Page 6: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

0

5000

10000

15000

20000

25000

30000

35000

40000To

tal n

umbe

r of i

nsta

nces

Imbalance dataset of the urdu label sentences for events

Type of events

18746

31966932

3034

33421

69603238

741710078

3406 2916 3617

Spor

ts

Infla

tion

Mur

der

Terr

orist

atta

ck

Polit

ics

Law

and

orde

r

Eart

hqua

ke

Show

biz

Frau

d an

dco

rrup

tion

Wea

ther

Sexu

al as

saul

t

Busin

ess

Figure 3 Imbalance instances of the dataset

Raw input Preprocessed input Word embedding layersDLM

RNN (LSTM-unidirection)

CNN

DNN

LabelsTF-IDF

Documents1

Documents2

Documents3

Documents

n

1

2

3

12

Figure 4 Methodology

00

050

055

060

065

070

075

080

085

05 10 15 20 25 30 35 40Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 5 DDNrsquos accuracy

6 Scientific Programming

[44] that is developed by using three publicly available Urdudatasets Wikipediarsquos Urdu text another corpus having 90million tokens [45] and 35 million tokens [46] It has 102214unique tokens Each token comprises 300-dimensional realvalues Another model publicly available for research purposesconsists of 25925 unique words of Urdu language [47] Everyword has a 400-dimensional value A word embedding modelcomprises web-based text created to classify text It consists of64653 unique Urdu words and 300 dimensions for each word

)e journey of research is not over here to expand ourresearch scope and find the most efficient word embeddingmodel for sentence classification we decided to developcustom word embedding models We developed four wordembedding models that contain 57251 unique words

)e results of pretrained existing word embedding modelsare good at the initial level but very low ie 6026 accuracyWe explored the contents of these models which revealed thatmanywords are irrelevant and borrowed from other languagesie Arabic and Persian )e contents of Wikipedia are entirelydifferent than news websites that also affect the performance ofembedding models Another major factor ie low amount ofdata affected the feature vector generation quality Stop wordsin the pretrained word embedding model are not eliminatedand considered as a token while in our dataset all the stopwords are removed It also reduces the size of the vocabulary ofthe model while generating a feature vector )erefore wedecided to develop a custom word embedding model on ourpreprocessed dataset To postulate the enlargement of theresearch task three different word embedding models aredeveloped )e details of all used pretrained word embeddingmodels are given in Table 4

523 One-Hot-Encoding Text cannot be processed directlybymachine learning classifiers therefore we need to convertthe text into a real value We used one-hot-encoding toconvert text to numeric features For example the sentencesgiven in Table 5 can be converted into a numeric featurevector using one-hot-encoding as shown in Table 6

524 TF-IDF TF and TF-IDF are feature engineeringtechniques that transform the text into the numerical for-mat It is one of the most highly used feature vectors forcreating a method for text data )ree deep learning modelswere evaluated on our corpus )e sequential model withembedding layers outperformed other pretrained wordembedding models [44] reported in state of the art [48] )edetailed summary of the evaluation results of CNN RNNand DNN is discussed in Section 7

53 Deep Learning Models

531 Deep Neural Network Architecture Our DNN archi-tecture consists of three layers ie n-input layer 150 hidden(dense) layers and 12 output layers Feature vector is givenas input into a dense layer that is fully connected )eSoftMax activation function is used in the output layer toclassify sentences into multiple classes

532 Recurrence Neural Network )e recurrence neuralnetwork is evaluated using a long short-term memory(LSTM) classifier RNN consists of embedding dropoutLSTM and dense layers A dictionary of 30000 unique mostfrequent tokens is made )e sentences are standardized tothe same length by using a padding sequence)e dimensionof the feature vector is set as 250 RNN showed an overall81 accuracy that is the second highest in our work

533 Convolutional Neural Network (CNN) CNN is a classof deep neural networks that are highly recommended forimage processing [49] It consists of the input layer (em-bedding layer) multiple hidden layers and an output layer)ere are a series of convolutional layers that convolve witha multiplication )e embedded sequence layer and averagelayer (GloobalAveragePooling1D) are also part of the hiddenlayer )e common activation of CNN is RELU Layer )edetails of the hypermeters that are used in our problem totrain the CNN model are given in Table 7

534 Hyperparameters In this section all the hyper-parameters that are used in our experiments are given in thetabular format Only those hyperparameters are being dis-cussed here which have achieved the highest accuracy ofDNN RNN and CNN models )e hyperparameters ofDNN that are fine-tuned in our work are given in Table 8

)e RNN model showed the highest accuracy (803and 81) on two sets of hyperparameters that are given inTable 9 Similarly Table 7 provides the details of thehyperparameters of the convolutional neural network

6 Performance Measuring Parameters

)emost common performance measuring [41] parametersie precision recall and F1-measure are used to evaluatethe proposed framework )e selection of these parameterswas decided because of the multiclass classification andimbalance dataset

Precision TP

(TP + FP) (1)

Recall TP

(TP + FN) (2)

F1 2lowast PrecisionlowastRecall( 1113857

(Precision + Recall) (3)

Accuracy (TP + TN)

(TP + TN + FP + FN) (4)

where TP TN FP and FN represent total positive totalnegative false positive and false negative values respec-tively Precision is defined as the closeness of the mea-surements to each other and recall is the ratio of the totalamount of relevant (ie TP values) instances that wereactually retrieved during the experimental work It is

Scientific Programming 7

noteworthy that both precision and recall are the relativevalues of measure of relevance

7 Results

71 Deep Learning Classifiers )e feature vector can begenerated using different techniques )e details of feature

vector generating techniques were discussed in Section 5)e results of feature vector generating techniques that wereused in our work ie ldquomulticlass event classification for theUrdu language textrdquo are given in the proceedingsubsections

711 Pretrained Word Embedding Models )e convolu-tional neural network model is evaluated on the featuresvectors that were generated by all pretrained word em-bedding models )e summary of all results generated by

Table 4 Pretrained word embedding model and custom word embedding model

Sr no Unique words Dimension Window sizeExisting pretrained word embedding models1 [11] 64653 300 mdash2 [19] 102214 100 mdash3 53454 300 mdashCustom pretrained word embedding models1 57251 50 22 57251 100 23 57251 100 34 57251 350 1

Table 5 Event sentence

Urdu sentence English sentenceےہاتلیھکلابٹفیلع Ali plays football

یلےلناجیکںوگولںوھکالےنسرئاوانورک Corona virus killed millions of people

Table 6 Event sentence converted using one-hot-encoding

Sentence اتلیھک لاب ٹف یلع ناج ںوگول ںوھکال سرئاو انورک1 1 1 1 1 0 0 0 0 02 0 0 0 0 1 1 1 1 1

Table 7 CNNrsquos hyperparameters

CNN (7928)Parameter ValueMax_words 20000Batch size 128Embedding_dim 50Activation function SoftMaxDense_node 256Trainingtesting 70ndash30No of epochs 20Loss function Categorical cross-entropy

Table 8 DNNrsquos hyperparameters

Parameter ValueMax_words 5000Batch size 128Embedding_dim 512Activation function SoftMaxLayers 04Trainingtesting 70ndash30No of epochs 15Loss function Sparse categorical cross-entropy

Table 9 RNNrsquos hyperparameters

Parameter ValueRNN (LSTM) (803)Max_words 50000Batch size 64Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 90ndash10No of epochs 05Loss function Sparse categorical cross-entropyRNN (LSTM) (81)Max_words 30000Batch size 128Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 80ndash20No of epochs 05Loss function Sparse categorical cross-entropy

8 Scientific Programming

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017


A pretrained Urdu word embedding model [44] was developed using three publicly available Urdu datasets: Wikipedia's Urdu text, another corpus of 90 million tokens [45], and a third of 35 million tokens [46]. It has 102214 unique tokens, each comprising 300-dimensional real values. Another model, publicly available for research purposes, consists of 25925 unique Urdu words [47], each with a 400-dimensional vector. A further word embedding model, built from web-based text for text classification, consists of 64653 unique Urdu words with 300 dimensions per word.

To expand our research scope and find the most efficient word embedding model for sentence classification, we also developed custom word embedding models. We built four word embedding models, each containing 57251 unique words.

The results of the existing pretrained word embedding models are acceptable at the initial level but low in absolute terms, i.e., 60.26% accuracy at best. Exploring the contents of these models revealed that many words are irrelevant or borrowed from other languages, i.e., Arabic and Persian. The contents of Wikipedia are entirely different from those of news websites, which also affects the performance of the embedding models. Another major factor, the low amount of data, affected the quality of feature vector generation. Stop words are not eliminated in the pretrained word embedding models and are treated as tokens, while in our dataset all stop words are removed; this also reduces the vocabulary size of the model when generating feature vectors. Therefore, we decided to develop custom word embedding models on our preprocessed dataset. To broaden the research task, four different word embedding models were developed. The details of all pretrained word embedding models used are given in Table 4.
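As a minimal sketch of how such a custom model can be trained (assuming gensim 4.x; the corpus file name and the whitespace tokenization are illustrative stand-ins for our preprocessing, while the dimension and window values follow the largest custom model in Table 4):

# Minimal sketch: training a custom Urdu word embedding model with gensim.
# "urdu_corpus.txt" is a hypothetical file with one preprocessed sentence
# per line; dimension (350) and window (1) follow Table 4.
from gensim.models import Word2Vec

with open("urdu_corpus.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f if line.strip()]

model = Word2Vec(
    sentences,
    vector_size=350,  # embedding dimension (gensim 4.x parameter name)
    window=1,         # context window size
    min_count=1,      # keep every token so the vocabulary stays complete
    workers=4,
)
model.wv.save_word2vec_format("urdu_w2v_350d.txt")  # reusable text format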

5.2.3. One-Hot-Encoding. Text cannot be processed directly by machine learning classifiers; therefore, we need to convert the text into real values. We used one-hot-encoding to convert text into numeric features. For example, the sentences given in Table 5 can be converted into numeric feature vectors using one-hot-encoding, as shown in Table 6.
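The following minimal Python sketch illustrates the idea; the English token lists stand in for the preprocessed Urdu sentences of Table 5:

# Minimal sketch of sentence-level one-hot-encoding (cf. Tables 5 and 6).
sentences = [
    ["Ali", "plays", "football"],
    ["Corona", "virus", "killed", "millions", "of", "people"],
]

vocab = sorted({token for sent in sentences for token in sent})
index = {token: i for i, token in enumerate(vocab)}

def one_hot(sentence):
    # Binary vector: 1 if the vocabulary word occurs in the sentence.
    vec = [0] * len(vocab)
    for token in sentence:
        vec[index[token]] = 1
    return vec

for sent in sentences:
    print(one_hot(sent))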

5.2.4. TF-IDF. TF and TF-IDF are feature engineering techniques that transform text into a numerical format; TF-IDF is one of the most widely used methods for creating feature vectors from text data. Three deep learning models were evaluated on our corpus. The sequential model with an embedding layer outperformed the pretrained word embedding models [44] reported in the state of the art [48]. A detailed summary of the evaluation results of CNN, RNN, and DNN is discussed in Section 7.
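A minimal sketch of TF-IDF feature generation, using scikit-learn's TfidfVectorizer as a stand-in for our implementation (the English sentences are illustrative placeholders for the preprocessed Urdu corpus):

# Minimal sketch: TF-IDF feature vectors with scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "Ali plays football",
    "Corona virus killed millions of people",
]

# token_pattern=r"\S+" treats the input as already tokenized on whitespace.
vectorizer = TfidfVectorizer(token_pattern=r"\S+")
X = vectorizer.fit_transform(corpus)  # sparse matrix (n_sentences, n_terms)

print(vectorizer.get_feature_names_out())
print(X.toarray().round(2))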

5.3. Deep Learning Models

5.3.1. Deep Neural Network Architecture. Our DNN architecture consists of three layers, i.e., an n-node input layer, a 150-node hidden (dense) layer, and a 12-node output layer. The feature vector is given as input to the dense layer, which is fully connected. The SoftMax activation function is used in the output layer to classify sentences into multiple classes.
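A minimal Keras sketch of this architecture: the 150-node hidden layer, the 12-way SoftMax output, and the sparse categorical cross-entropy loss follow the text and Table 8, while the hidden-layer ReLU activation, the Adam optimizer, and the input width of 5000 (Max_words) are our assumptions:

# Minimal sketch of the described DNN: n-dimensional input,
# one 150-node dense (hidden) layer, and a 12-node SoftMax output.
from tensorflow import keras

n_features = 5000  # assumed input width, e.g., the TF-IDF vocabulary size

model = keras.Sequential([
    keras.layers.Dense(150, activation="relu", input_shape=(n_features,)),
    keras.layers.Dense(12, activation="softmax"),  # one node per event class
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # integer class labels 0..11
    metrics=["accuracy"],
)
model.summary()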

5.3.2. Recurrent Neural Network. The recurrent neural network is evaluated using a long short-term memory (LSTM) classifier. The RNN consists of embedding, dropout, LSTM, and dense layers. A dictionary of the 30000 most frequent unique tokens is built, and the sentences are standardized to the same length by padding the sequences. The dimension of the feature vector is set to 250. The RNN showed an overall accuracy of 81%, the second highest in our work.
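A minimal Keras sketch of this pipeline; the two placeholder sentences and the choice of 100 LSTM units are our assumptions, while the dictionary size, padding length, and layer order follow the text:

# Minimal sketch of the RNN (LSTM) pipeline: a 30000-token dictionary,
# padding to length 250, then embedding -> dropout -> LSTM -> dense.
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

texts = ["ali football khelta hai", "corona virus ne lakhon jan li"]  # stand-ins

tokenizer = Tokenizer(num_words=30000)
tokenizer.fit_on_texts(texts)
X = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=250)

model = keras.Sequential([
    keras.layers.Embedding(input_dim=30000, output_dim=100),
    keras.layers.SpatialDropout1D(0.2),
    keras.layers.LSTM(100, recurrent_dropout=0.2),  # 100 units assumed
    keras.layers.Dense(12, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])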

5.3.3. Convolutional Neural Network (CNN). CNN is a class of deep neural networks that is most commonly recommended for image processing [49]. It consists of an input (embedding) layer, multiple hidden layers, and an output layer. A series of convolutional layers convolve the input with learned filters. The embedded sequence layer and an average pooling layer (GlobalAveragePooling1D) are also part of the hidden layers. The most common activation function in CNNs is ReLU. The details of the hyperparameters used to train the CNN model for our problem are given in Table 7.
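A minimal Keras sketch of this CNN; the embedding size and dense width follow Table 7, while the number and width of the convolution filters are our assumptions:

# Minimal sketch of the CNN text classifier: embedding layer, 1-D
# convolution, GlobalAveragePooling1D, and dense layers (cf. Table 7).
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Embedding(input_dim=20000, output_dim=50),  # Max_words, Embedding_dim
    keras.layers.Conv1D(filters=128, kernel_size=5, activation="relu"),  # assumed
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(256, activation="relu"),   # Dense_node from Table 7
    keras.layers.Dense(12, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # loss from Table 7
              metrics=["accuracy"])
model.summary()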

5.3.4. Hyperparameters. All the hyperparameters used in our experiments are given in tabular format. Only the hyperparameter settings that achieved the highest accuracy for the DNN, RNN, and CNN models are discussed here. The fine-tuned hyperparameters of the DNN are given in Table 8.

The RNN model showed its highest accuracies (80.3% and 81%) on the two sets of hyperparameters given in Table 9. Similarly, Table 7 provides the details of the hyperparameters of the convolutional neural network.

6. Performance Measuring Parameters

The most common performance measuring parameters [41], i.e., precision, recall, and F1-measure, are used to evaluate the proposed framework. These parameters were selected because of the multiclass classification task and the imbalanced dataset:

Precision = TP / (TP + FP),  (1)

Recall = TP / (TP + FN),  (2)

F1 = (2 × Precision × Recall) / (Precision + Recall),  (3)

Accuracy = (TP + TN) / (TP + TN + FP + FN),  (4)

where TP, TN, FP, and FN represent the true positive, true negative, false positive, and false negative counts, respectively. Precision is the fraction of retrieved instances that are relevant, and recall is the fraction of relevant (i.e., TP) instances that were actually retrieved during the experimental work. It is noteworthy that both precision and recall are relative measures of relevance.
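As a worked example of equations (1)–(4), with illustrative confusion counts that are not taken from our experiments:

# Worked example of equations (1)-(4) on illustrative counts.
TP, FP, FN, TN = 80, 20, 10, 90

precision = TP / (TP + FP)                          # (1) -> 0.80
recall = TP / (TP + FN)                             # (2) -> ~0.89
f1 = 2 * precision * recall / (precision + recall)  # (3) -> ~0.84
accuracy = (TP + TN) / (TP + TN + FP + FN)          # (4) -> 0.85
print(precision, recall, f1, accuracy)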

7. Results

7.1. Deep Learning Classifiers. A feature vector can be generated using different techniques, the details of which were discussed in Section 5. The results of the feature vector generating techniques used in our work, i.e., multiclass event classification for Urdu language text, are given in the following subsections.

7.1.1. Pretrained Word Embedding Models. The convolutional neural network model is evaluated on the feature vectors generated by all the pretrained word embedding models. The summary of all the results generated by the pretrained [44] and custom pretrained word embedding models is given in Table 10.

Table 4: Pretrained word embedding models and custom word embedding models.

Existing pretrained word embedding models
Sr. no.   Unique words   Dimension   Window size
1 [11]    64653          300         N/A
2 [19]    102214         100         N/A
3         53454          300         N/A

Custom pretrained word embedding models
Sr. no.   Unique words   Dimension   Window size
1         57251          50          2
2         57251          100         2
3         57251          100         3
4         57251          350         1

Table 5: Event sentences.

Urdu sentence                                English sentence
ےہاتلیھکلابٹفیلع                              Ali plays football.
یلےلناجیکںوگولںوھکالےنسرئاوانورک             Corona virus killed millions of people.

Table 6: Event sentences converted using one-hot-encoding.

Sentence   اتلیھک   لاب   ٹف   یلع   ناج   ںوگول   ںوھکال   سرئاو   انورک
1          1        1     1    1     0     0       0        0       0
2          0        0     0    0     1     1       1        1       1

Table 7: CNN's hyperparameters (79.28% accuracy).

Parameter             Value
Max_words             20000
Batch size            128
Embedding_dim         50
Activation function   SoftMax
Dense_node            256
Training/testing      70–30
No. of epochs         20
Loss function         Categorical cross-entropy

Table 8: DNN's hyperparameters.

Parameter             Value
Max_words             5000
Batch size            128
Embedding_dim         512
Activation function   SoftMax
Layers                04
Training/testing      70–30
No. of epochs         15
Loss function         Sparse categorical cross-entropy

Table 9: RNN's hyperparameters.

RNN (LSTM) (80.3% accuracy)
Parameter             Value
Max_words             50000
Batch size            64
Embedding_dim         100
Activation function   SoftMax
Recurrent dropout     0.2
Training/testing      90–10
No. of epochs         05
Loss function         Sparse categorical cross-entropy

RNN (LSTM) (81% accuracy)
Parameter             Value
Max_words             30000
Batch size            128
Embedding_dim         100
Activation function   SoftMax
Recurrent dropout     0.2
Training/testing      80–20
No. of epochs         05
Loss function         Sparse categorical cross-entropy


Our custom pretrained word embedding model containing 57251 unique tokens, the larger dimension size of 350, and a window size of 1 showed 38.68% accuracy. The purpose of developing different custom pretrained word embedding models was to obtain a domain-specific model and achieve the highest accuracy. However, the results of both the pre-existing pretrained word embedding models and the domain-specific custom word embedding models are very low. The detailed summary of the results can be seen in Table 10.

7.1.2. TF-IDF Feature Vector. The DNN architecture consists of an input layer, a dense layer, and a max pooling layer. The dense layer, also called a fully connected layer, comprises 150 nodes. The SoftMax activation function and the sparse_categorical_cross-entropy loss are used to compile the model on the dataset.

A total of 25991 instances are used to validate the accuracy of the DNN model. The DNN with fully connected layer architecture showed 84% overall accuracy across all event classes; the details of the performance measuring parameters for each event class are given in Table 11. Law and order, the 6th event type in our dataset, has 2000 instances used for validation. It showed 66% accuracy, which is comparatively low and affected the overall performance of the DNN model. The main reason behind this result is that law and order sentences overlap with politics sentences; indeed, humans themselves can sometimes hardly distinguish between law and order and political statements.

For example:

"ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک"

"The irresponsible talk of the state minister is a threat to peace in the region."

The performance of the DNN model, which showed 84% accuracy over the multiple event classes, is given in Table 11, together with all the other performance measuring parameters, i.e., precision, recall, and F1-score, for each event class.

The accuracy of the DNN model can be viewed in Figure 5, where the y-axis represents the accuracy and the x-axis represents the number of epochs. The DNN achieved 84% accuracy for multiclass event classification.

The expected solution to the problem of sentences overlapping multiple classes is to use a pretrained word embedding model like W2Vec or GloVe. Unfortunately, unlike for the English language, there is still no open/close-domain pretrained word embedding model developed from a large corpus of Urdu language text.

The RNN sequential model architecture of deep learning is used in our experiments. The recurrent deep learning model consists of the following sequence of layers: an embedding layer of 100 dimensions, SpatialDropout1D, LSTM, and dense layers. The sparse_categorical_cross-entropy loss function is used to compile the model: multiclass categorical classification is handled by sparse categorical cross-entropy instead of categorical cross-entropy. A SoftMax activation function is used at the dense layer instead of the sigmoid function, since SoftMax can handle multiclass classification, while sigmoid is limited to binary classification.
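The practical difference between the two losses is only the label format, as this minimal Keras sketch shows (the dummy labels and uniform predictions are illustrative):

# Sparse categorical cross-entropy consumes integer class ids directly,
# while plain categorical cross-entropy expects one-hot label vectors.
import numpy as np
from tensorflow import keras

y_int = np.array([0, 3, 11])                      # integer event-class labels
y_onehot = keras.utils.to_categorical(y_int, 12)  # same labels, one-hot form
probs = np.full((3, 12), 1.0 / 12)                # dummy uniform predictions

sparse_loss = keras.losses.SparseCategoricalCrossentropy()
dense_loss = keras.losses.CategoricalCrossentropy()
print(float(sparse_loss(y_int, probs)),
      float(dense_loss(y_onehot, probs)))  # both print ln(12), about 2.485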

A bag-of-words consisting of 30000 unique Urdu language words is used to generate the feature vectors. The maximum length of a feature vector is 250 tokens.

The overall accuracy of the RNN model is presented in Table 12: it achieved 81% validation accuracy on our problem using TF-IDF feature vectors. The other performance evaluation parameters for each class are also given in Table 12.

The accuracy of the RNN model can be viewed in Figure 6, where the y-axis represents the accuracy and the x-axis represents the number of epochs. The RNN achieved 81% accuracy for multiclass event classification.

Although CNN is mainly recommended for image processing, it showed considerable results for multiclass event classification on textual data. The performance measuring parameters of the CNN classifier are given in Table 13.

The distribution of the CNN classifier's accuracy over the twelve classes can be viewed in Figure 7. There is more than one peak (higher accuracies) in Figure 7, which shows that the dataset is imbalanced.

7.1.3. One-Hot-Encoding. The performance of the deep learning classifiers used in our research work on one-hot-encoding features is presented in Figure 8. The one-hot-encoded feature vectors are given as input to the CNN, DNN, and RNN deep learning classifiers. The RNN showed better accuracy than the CNN, while the DNN outperformed them both.

Table 10: Classification accuracy of the CNN model.

Sr. no.   Existing pretrained model's validation accuracy (%)   Custom pretrained model's validation accuracy (%)
1         58.00                                                 36.85
2         60.26                                                 38.04
3         56.68                                                 37.38
4         N/A                                                   38.68

Table 11: Performance measuring parameters for the DNN model.

Class           Precision   Recall   F1-score   Support
1               0.96        0.95     0.96       4604
2               0.91        0.91     0.91       776
3               0.75        0.75     0.75       1697
4               0.78        0.70     0.74       770
5               0.81        0.85     0.83       8424
6               0.71        0.63     0.67       2000
7               1.00        1.00     1.00       817
8               0.92        0.90     0.91       1839
9               0.70        0.70     0.71       2524
10              0.95        0.99     0.97       856
11              0.95        0.99     0.97       741
12              0.82        0.73     0.77       943
Accuracy                             0.84       25991
Macro avg.      0.84        0.84     0.85       25991
Weighted avg.   0.84        0.84     0.84       25991


The RNN and DNN achieved 81% and 84% accuracy, respectively, for multiclass event classification.

7.2. Traditional Machine Learning Classifiers. We also performed multiclass event classification using traditional machine learning algorithms: K-nearest neighbor (KNN), decision tree (DT), Naïve Bayes multinomial (NBM), random forest (RF), linear regression (LR), and support vector machine (SVM). All these models were evaluated using TF-IDF and one-hot-encoding features as feature vectors. The results produced using TF-IDF features were better than those generated using one-hot-encoding features. A detailed summary of the results of the above-mentioned machine learning classifiers is given in the following subsections.
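A minimal scikit-learn sketch of this comparison; load_corpus() is a hypothetical loader for our labeled sentences, logistic regression stands in for the LR classifier, and all hyperparameters are library defaults except k = 5 for KNN (Section 7.2.1):

# Minimal sketch: evaluating the six traditional classifiers on TF-IDF
# features. load_corpus() is a hypothetical loader returning the labeled
# Urdu sentences and their event-class ids.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

texts, labels = load_corpus()  # hypothetical loader

X = TfidfVectorizer().fit_transform(texts)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "DT": DecisionTreeClassifier(),
    "NBM": MultinomialNB(),
    "RF": RandomForestClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": LinearSVC(),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    print(name)
    print(classification_report(y_te, clf.predict(X_te)))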

7.2.1. K-Nearest Neighbor (KNN). KNN classifies a new data point by measuring the similarity distance to its nearest neighbors. In our experiments, we set k = 5, so that the similarity distance is measured against the five nearest existing data points [50].

Table 12: Performance measuring parameters for the RNN model.

Class           Precision   Recall   F1-score   Support
1               0.95        0.95     0.95       4604
2               0.78        0.77     0.78       776
3               0.70        0.72     0.71       1697
4               0.78        0.64     0.70       770
5               0.78        0.84     0.81       8424
6               0.67        0.57     0.62       2000
7               1.00        1.00     1.00       817
8               0.91        0.87     0.89       1839
9               0.70        0.63     0.66       2524
10              0.93        0.98     0.95       856
11              0.86        0.94     0.90       741
12              0.76        0.67     0.71       943
Accuracy                             0.81       25991
Macro avg.      0.82        0.80     0.81       25991
Weighted avg.   0.81        0.81     0.81       25991

Table 13: Performance measuring parameters for the CNN model.

Class           Precision   Recall   F1-score   Support
1               0.96        0.93     0.95       5661
2               0.81        0.65     0.72       967
3               0.72        0.68     0.70       2115
4               0.78        0.54     0.64       878
5               0.73        0.88     0.80       10030
6               0.64        0.51     0.57       2293
7               0.99        0.99     0.99       970
8               0.91        0.86     0.88       2259
9               0.71        0.61     0.66       3044
10              0.93        0.94     0.93       1031
11              0.91        0.82     0.86       889
12              0.77        0.63     0.70       1052
Accuracy                             0.80       31189
Macro avg.      0.82        0.75     0.78       31189
Weighted avg.   0.80        0.80     0.80       31189

Figure 6: RNN's accuracy (validation accuracy vs. number of epochs for the train and test runs).

Figure 7: CNN's accuracy distribution.

Figure 8: CNN, RNN, and DNN accuracy using one-hot-encoding.


Although the performance of the traditional machine learning classifiers is considerable, it must be noted that it is lower than that of the deep learning classifiers. The main performance degrading factors are the imbalanced number of instances and sentence overlap. The performance of the KNN machine learning model is given in Table 14; it showed 78% accuracy.

7.2.2. Decision Tree (DT). A decision tree is a type of supervised machine learning algorithm [51] in which the input data are split according to certain parameters. The overall accuracy achieved by DT is 73%; further performance details of the classes and the DT model are given in Table 15.

7.2.3. Naïve Bayes Multinomial (NBM). Naïve Bayes multinomial is one of the most computationally efficient classifiers for text classification [52], but it showed only 70% accuracy, which is very low compared with KNN, DT, and RF. The performance details for all twelve classes are given in Table 16.

7.2.4. Linear Regression (LR). Linear regression is usually recommended for predicting continuous output rather than for categorical classification [53]. Table 17 shows the performance of the LR model, i.e., 80% overall accuracy for multiclass event classification.

Table 17: Performance measuring parameters for the LR model.

Class           Precision   Recall   F1-score   Support
1               0.95        0.94     0.94       5661
2               0.83        0.64     0.72       967
3               0.72        0.69     0.70       2115
4               0.77        0.55     0.64       878
5               0.73        0.88     0.80       10030
6               0.64        0.53     0.58       2293
7               1.00        1.00     1.00       970
8               0.91        0.84     0.88       2259
9               0.73        0.62     0.67       3044
10              0.94        0.92     0.93       1031
11              0.90        0.80     0.85       889
12              0.77        0.66     0.71       1052
Accuracy                             0.80       31189
Macro avg.      0.82        0.76     0.79       31189
Weighted avg.   0.80        0.80     0.80       31189

Table 18: Performance measuring parameters for the RF model.

Class           Precision   Recall   F1-score   Support
1               0.94        0.93     0.94       5661
2               0.94        0.96     0.95       967
3               0.72        0.63     0.67       2115
4               0.80        0.58     0.67       878
5               0.71        0.90     0.79       10030
6               0.67        0.41     0.51       2293
7               1.00        1.00     1.00       970
8               0.93        0.80     0.86       2259
9               0.75        0.58     0.65       3044
10              0.94        0.98     0.96       1031
11              0.96        0.98     0.97       889
12              0.84        0.63     0.72       1052
Accuracy                             0.80       31189
Macro avg.      0.85        0.78     0.81       31189
Weighted avg.   0.81        0.80     0.80       31189

Table 14: Performance measuring parameters for the KNN model.

Class           Precision   Recall   F1-score   Support
1               0.91        0.93     0.92       5661
2               0.62        0.83     0.71       967
3               0.67        0.71     0.69       2115
4               0.64        0.60     0.62       878
5               0.78        0.82     0.80       10030
6               0.66        0.50     0.57       2293
7               0.93        1.00     0.96       970
8               0.91        0.80     0.85       2259
9               0.71        0.62     0.66       3044
10              0.85        0.93     0.89       1031
11              0.72        0.85     0.78       889
12              0.75        0.61     0.67       1052
Accuracy                             0.78       31189
Macro avg.      0.76        0.77     0.76       31189
Weighted avg.   0.78        0.78     0.78       31189

Table 15: Performance measuring parameters for the DT model.

Class           Precision   Recall   F1-score   Support
1               0.91        0.89     0.90       5661
2               0.83        0.97     0.89       967
3               0.57        0.52     0.54       2115
4               0.58        0.54     0.56       878
5               0.72        0.75     0.73       10030
6               0.44        0.41     0.42       2293
7               0.99        1.00     1.00       970
8               0.79        0.77     0.78       2259
9               0.57        0.55     0.56       3044
10              0.98        0.98     0.93       1031
11              0.86        0.98     0.92       889
12              0.61        0.56     0.58       1052
Accuracy                             0.73       31189
Macro avg.      0.73        0.74     0.74       31189
Weighted avg.   0.73        0.73     0.73       31189

Table 16: Performance measuring parameters for the Naïve Bayes multinomial (NBM) model.

Class           Precision   Recall   F1-score   Support
1               0.94        0.91     0.93       5683
2               0.82        0.34     0.48       956
3               0.66        0.47     0.55       2121
4               0.91        0.20     0.32       919
5               0.56        0.95     0.70       10013
6               0.70        0.22     0.34       2387
7               0.98        0.95     0.97       959
8               0.94        0.75     0.83       2188
9               0.75        0.40     0.52       3031
10              0.96        0.78     0.86       998
11              0.96        0.32     0.48       863
12              0.84        0.25     0.39       1071
Accuracy                             0.70       31189
Macro avg.      0.84        0.54     0.61       31189
Weighted avg.   0.76        0.70     0.67       31189



7.2.5. Random Forest (RF). A random forest comprises many decision trees [54]. Its results showed the highest accuracy among all the evaluated machine learning classifiers. A detailed summary of the results is given in Table 18.

7.2.6. Support Vector Machine (SVM). The support vector machine (SVM) is one of the most highly recommended models for binary classification. It is based on statistical theory [55]. Its performance details are given in Table 19.

A comparative depiction of the results obtained by the traditional machine learning classifiers is given in Figure 9.

8. Discussion and Conclusion

Lack of resources is a major hurdle for research on Urdu language text. We explored many feature vector generating techniques, and different classification algorithms from the traditional machine learning and deep learning approaches were evaluated on these feature vectors. The purpose of performing many experiments on various feature vector generating techniques was to develop the most efficient and generic model of multiclass event classification for Urdu language text.

The word embedding feature generating technique is considered an efficient and powerful technique for text analysis. Word2Vector (W2Vec) feature vectors can be generated by pretrained word embedding models or by learning dynamic parameters in the embedding layers of deep neural networks. We performed sentence classification using pretrained word embedding models, one-hot-encoding, TF, TF-IDF, and dynamic embeddings. The results of all the other feature vector generating techniques are better than those of the pretrained word embedding models.

Another argument in support of this conclusion is that only a few pretrained word embedding models exist for Urdu language text, and these models are trained on limited, domain-specific Urdu text. There is a need to develop generic word embedding models for the Urdu language on a large corpus. For CNN and RNN (LSTM), the choice between single-layer and multilayer architectures did not affect the performance of the proposed system.

The experimental results vividly show that the one-hot-encoding method is better than the dynamic word embedding model and the pretrained word embedding models. However, among all the feature generating techniques mentioned (see Section 5.2), TF-IDF performed best: it showed the highest accuracy (84%) with the DNN deep learning classifier. Event classification on an imbalanced multiclass event dataset for the Urdu language using traditional machine learning classifiers showed considerable performance, but lower than that of the deep learning models. Deep learning algorithms, i.e., CNN, DNN, and RNN, are preferable over traditional machine learning algorithms because, unlike traditional machine learning, deep learning does not need a domain expert to find relevant features. DNN and RNN outperformed all the other classifiers, showing overall accuracies of 84% and 81%, respectively, over the twelve classes of events. Comparatively, the performance of CNN and RNN is better than that of Naïve Bayes and SVM.

Multiclass event classification at the sentence level was performed on an imbalanced dataset; event classes with a low number of instances affect the overall performance of the classifiers. We can improve the performance by balancing the instances of each class. The following can be concluded:

(1) Pretrained word embedding models are suitable for sentence classification only if they are developed from an immense amount of textual data.

(2) The existing word embedding models Word2Vec and GloVe, which were developed for English language text, are incompatible with Urdu language text.

Table 19: Performance measuring parameters for the SVM model.

Class           Precision   Recall   F1-score   Support
1               0.84        0.94     0.89       5683
2               0.72        0.43     0.54       956
3               0.72        0.49     0.58       2121
4               0.73        0.43     0.54       919
5               0.64        0.90     0.75       10013
6               0.74        0.24     0.36       2387
7               0.90        0.99     0.94       959
8               0.86        0.78     0.82       2188
9               0.65        0.47     0.57       3031
10              0.85        0.87     0.82       998
11              0.81        0.62     0.70       863
12              0.77        0.63     0.67       1071
Accuracy                             0.73       31189
Macro avg.      0.77        0.63     0.67       31189
Weighted avg.   0.77        0.73     0.71       31189

Figure 9: Machine learning algorithms' accuracy using TF-IDF.

(3) In our case, TF-IDF, one-hot-encoding, and a dynamic embedding layer are better feature generating techniques than the pre-existing Urdu language text word embedding models.

(4) The TF-IDF-based feature vectors showed the best results compared with the one-hot-encoding- and dynamic word embedding-based feature vectors.

(5) The imbalanced number of instances in the dataset affected the overall accuracy.

9. Future Work

In a comprehensive review of the Urdu literature, we found only a small number of related works on Urdu text processing. The main hurdle in exploring Urdu is the unavailability of processing resources, i.e., event datasets, close-domain part-of-speech taggers, lexicons, annotators, and other supporting tools.

Many tasks can be accomplished for Urdu language text in the future. Some of them are as follows:

(1) Generic word embedding models can be developed from a large corpus of Urdu language text.

(2) Different deep learning classifiers, i.e., BERT and ANN, can be evaluated.

(3) Event classification can be performed at the document and phrase levels.

(4) A balanced dataset can be used for better results.

(5) Multilabel event classification can be performed in the future.

(6) Unstructured Urdu text data can be classified into different event classes.

(7) Classification of events for the Urdu language can be further performed for other domains of knowledge, i.e., literacy ratio, top trends, famous foods, and religious events like Eid.

(8) Contextual information of a sentence, i.e., pre-sentence and post-sentence information, certainly plays a vital role in enhancing the performance accuracy of the classification model.

Data Availability

The data used to support this study are available at https://github.com/unique-world/Multiclass-Event-Classification-Dataset.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

References

[1] A. Lenhart, R. Ling, S. Campbell, and K. Purcell, Teens and Mobile Phones: Text Messaging Explodes as Teens Embrace it as the Centerpiece of Their Communication Strategies with Friends, Pew Internet & American Life Project, Washington, DC, USA, 2010.

[2] M. Motoyama, B. Meeder, K. Levchenko, G. M. Voelker, and S. Savage, "Measuring online service availability using Twitter," WOSN, vol. 10, p. 13, 2010.

[3] J. Rogstadius, M. Vukovic, C. A. Teixeira, V. Kostakos, E. Karapanos, and J. A. Laredo, "CrisisTracker: crowdsourced social media curation for disaster awareness," IBM Journal of Research and Development, vol. 57, no. 5, pp. 4:1–4:13, 2013.

[4] T. Reuter and P. Cimiano, "Event-based classification of social media streams," in Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, pp. 1–8, Bielefeld, Germany, June 2012.

[5] K. Sailunaz and R. Alhajj, "Emotion and sentiment analysis from Twitter text," Journal of Computational Science, vol. 36, Article ID 101003, 2019.

[6] P. Capet, T. Delavallade, T. Nakamura, A. Sandor, C. Tarsitano, and S. Voyatzi, "A risk assessment system with automatic extraction of event types," in Proceedings of the International Conference on Intelligent Information Processing, pp. 220–229, Springer, Beijing, China, October 2008.

[7] F. Hogenboom, F. Frasincar, U. Kaymak, F. De Jong, and E. Caron, "A survey of event extraction methods from text for decision support systems," Decision Support Systems, vol. 85, pp. 12–22, 2016.

[8] S. Jiang, H. Chen, J. F. Nunamaker, and D. Zimbra, "Analyzing firm-specific social media and market: a stakeholder-based event analysis framework," Decision Support Systems, vol. 67, pp. 30–39, 2014.

[9] B. Pang and L. Lee, "Opinion mining and sentiment analysis," Foundations and Trends in Information Retrieval, vol. 2, no. 1-2, pp. 1–135, 2008.

[10] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, "Indexing by latent semantic analysis," Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391–407, 1990.

[11] T. Mikolov, Q. V. Le, and I. Sutskever, "Exploiting similarities among languages for machine translation," 2013, http://arxiv.org/abs/1309.4168.

[12] R. Alghamdi and K. Alfalqi, "A survey of topic modeling in text mining," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 6, no. 1, 2015.

[13] D. M. Eberhard, S. F. Gary, and C. D. Fennig, Ethnologue: Languages of the World, SIL International, Dallas, TX, USA, 2019.

[14] A. Daud, W. Khan, and D. Che, "Urdu language processing: a survey," Artificial Intelligence Review, vol. 47, no. 3, pp. 279–311, 2017.

[15] M. P. Akhter, Z. Jiangbin, I. R. Naqvi, M. Abdelmajeed, A. Mehmood, and M. T. Sadiq, "Document-level text classification using single-layer multisize filters convolutional neural network," IEEE Access, vol. 8, pp. 42689–42707, 2020.

[16] U. Pal and A. Sarkar, "Recognition of printed Urdu script," in Proceedings of the 2003 Seventh International Conference on Document Analysis and Recognition, pp. 1183–1187, IEEE, Edinburgh, Scotland, August 2003.

[17] Y. Yang, T. Pierce, and J. Carbonell, "A study of retrospective and on-line event detection," in Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 28–36, Melbourne, Australia, August 1998.

[18] T. Kala, "Event detection from text data," Computational Intelligence, vol. 31, pp. 132–164, 2015.

[19] M. Naughton, N. Stokes, and J. Carthy, "Sentence-level event classification in unstructured texts," Information Retrieval, vol. 13, no. 2, pp. 132–156, 2010.

[20] G. Jacobs, E. Lefever, and V. Hoste, "Economic event detection in company-specific news text," in Proceedings of the First Workshop on Economics and Natural Language Processing, pp. 1–10, Melbourne, Australia, July 2018.

[21] E. D'Andrea, P. Ducange, A. Bechini, A. Renda, and F. Marcelloni, "Monitoring the public opinion about the vaccination topic from tweets analysis," Expert Systems with Applications, vol. 116, pp. 209–226, 2019.

[22] M. Sokolova and G. Lapalme, "A systematic analysis of performance measures for classification tasks," Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.

[23] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proceedings of Advances in Neural Information Processing Systems, vol. 26, pp. 3111–3119, Lake Tahoe, NV, USA, December 2013.

[24] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, pp. 1137–1155, 2003.

[25] M. P. Akhter, Z. Jiangbin, I. R. Naqvi, M. Abdelmajeed, and M. Fayyaz, "Exploring deep learning approaches for Urdu text classification in product manufacturing," Enterprise Information Systems, pp. 1–26, 2020.

[26] G. Liu and J. Guo, "Bidirectional LSTM with attention mechanism and convolutional layer for text classification," Neurocomputing, vol. 337, pp. 325–338, 2019.

[27] M. Sharjeel, R. M. A. Nawab, and P. Rayson, "COUNTER: corpus of Urdu news text reuse," Language Resources and Evaluation, vol. 51, no. 3, pp. 777–803, 2017.

[28] K. Mehmood, D. Essam, and K. Shafi, "Sentiment analysis system for Roman Urdu," in Proceedings of the 2018 Science and Information Conference, pp. 29–42, Springer, Casablanca, Morocco, July 2018.

[29] K. Ahmed, M. Ali, S. Khalid, and M. Kamran, "Framework for Urdu news headlines classification," Journal of Applied Computer Science & Mathematics, vol. 10, no. 1, pp. 17–21, 2016.

[30] Z. Tehseen, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.

[31] W. Yin and L. Shen, "A short text classification approach with event detection and conceptual information," in Proceedings of the 2020 5th International Conference on Machine Learning Technologies, pp. 129–135, Beijing, China, June 2020.

[32] H. Zhou, M. Huang, T. Zhang et al., "Emotional chatting machine: emotional conversation generation with internal and external memory," in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, February 2018.

[33] A. Hassan and A. Mahmood, "Convolutional recurrent deep learning model for sentence classification," IEEE Access, vol. 6, pp. 13949–13957, 2018.

[34] T. Zia, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.

[35] A. R. Ali and M. Ijaz, "Urdu text classification," in Proceedings of the 7th International Conference on Frontiers of Information Technology, pp. 1–7, Abbottabad, Pakistan, December 2009.

[36] M. Usman, Z. Shafique, S. Ayub, and K. Malik, "Urdu text classification using majority voting," International Journal of Advanced Computer Science and Applications, vol. 7, no. 8, pp. 265–273, 2016.

[37] N. Kalchbrenner, E. Grefenstette, and P. Blunsom, "A convolutional neural network for modelling sentences," 2014, http://arxiv.org/abs/1404.2188.

[38] D. M. Awais and D. M. Shoaib, "Role of discourse information in Urdu sentiment classification," ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 18, no. 4, pp. 1–37, 2019.

[39] J. P. Singh, Y. K. Dwivedi, N. P. Rana, A. Kumar, and K. K. Kapoor, "Event classification and location prediction from tweets during disasters," Annals of Operations Research, vol. 283, no. 1-2, pp. 737–757, 2019.

[40] R. C. Paulo, D. Fillipe, and M. S. C. Sergo, "Classification of events on social media," 2016.

[41] Q. A. Al-Radaideh and M. A. Al-Abrat, "An Arabic text categorization approach using term weighting and multiple reducts," Soft Computing, vol. 23, no. 14, pp. 5849–5863, 2019.

[42] J. F. Allen, "Maintaining knowledge about temporal intervals," Communications of the ACM, vol. 26, no. 11, pp. 832–843, 1983.

[43] T. Joachims, "Text categorization with support vector machines: learning with many relevant features," in Proceedings of the European Conference on Machine Learning, pp. 137–142, Springer, Chemnitz, Germany, April 1998.

[44] S. Haider, "Urdu word embeddings," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018.

[45] B. Jawaid, A. Kamran, and O. Bojar, "Urdu monolingual corpus," LINDAT/CLARIN Digital Library at the Institute of Formal and Applied Linguistics, Charles University, Prague, Czechia.

[46] F. Adeeba, Q. Akram, H. Khalid, and S. Hussain, "CLE Urdu books n-grams," in Proceedings of the Conference on Language and Technology CLT 14, Karachi, Pakistan, May 2014.

[47] A. Hassan and A. Mahmood, "Deep learning for sentence classification," in Proceedings of the 2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT), pp. 1–5, IEEE, New York, NY, USA, May 2017.

[48] D.-X. Zhou, "Universality of deep convolutional neural networks," Applied and Computational Harmonic Analysis, vol. 48, no. 2, pp. 787–794, 2020.

[49] M. V. Valueva, N. N. Nagornov, P. A. Lyakhov, G. V. Valuev, and N. I. Chervyakov, "Application of the residue number system to reduce hardware costs of the convolutional neural network implementation," Mathematics and Computers in Simulation, vol. 177, pp. 232–243, 2020.

[50] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, "KNN model-based approach in classification," in Proceedings of the OTM Confederated International Conferences "On the Move to Meaningful Internet Systems," Catania, Italy, November 2003.

[51] Y. Zhong, "The analysis of cases based on decision tree," in Proceedings of the 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 142–147, IEEE, Beijing, China, August 2016.

[52] S. Xu, "Bayesian Naïve Bayes classifiers to text classification," Journal of Information Science, vol. 44, no. 1, pp. 48–59, 2018.

[53] T. Zhang and F. Oles, "Text categorization based on regularized linear classification methods," Information Retrieval, vol. 4, no. 1, pp. 5–31, 2001.

[54] J. Ali, R. Khan, N. Ahmad, and I. Maqsood, "Random forests and decision trees," International Journal of Computer Science Issues (IJCSI), vol. 9, no. 5, p. 272, 2012.

[55] Y. Zhang, "Support vector machine classification algorithm and its application," in Proceedings of the International Conference on Information Computing and Applications, pp. 179–186, Springer, Bhubaneswar, India, September 2012.

Page 8: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

noteworthy that both precision and recall are the relativevalues of measure of relevance

7 Results

71 Deep Learning Classifiers )e feature vector can begenerated using different techniques )e details of feature

vector generating techniques were discussed in Section 5)e results of feature vector generating techniques that wereused in our work ie ldquomulticlass event classification for theUrdu language textrdquo are given in the proceedingsubsections

711 Pretrained Word Embedding Models )e convolu-tional neural network model is evaluated on the featuresvectors that were generated by all pretrained word em-bedding models )e summary of all results generated by

Table 4 Pretrained word embedding model and custom word embedding model

Sr no Unique words Dimension Window sizeExisting pretrained word embedding models1 [11] 64653 300 mdash2 [19] 102214 100 mdash3 53454 300 mdashCustom pretrained word embedding models1 57251 50 22 57251 100 23 57251 100 34 57251 350 1

Table 5 Event sentence

Urdu sentence English sentenceےہاتلیھکلابٹفیلع Ali plays football

یلےلناجیکںوگولںوھکالےنسرئاوانورک Corona virus killed millions of people

Table 6 Event sentence converted using one-hot-encoding

Sentence اتلیھک لاب ٹف یلع ناج ںوگول ںوھکال سرئاو انورک1 1 1 1 1 0 0 0 0 02 0 0 0 0 1 1 1 1 1

Table 7 CNNrsquos hyperparameters

CNN (7928)Parameter ValueMax_words 20000Batch size 128Embedding_dim 50Activation function SoftMaxDense_node 256Trainingtesting 70ndash30No of epochs 20Loss function Categorical cross-entropy

Table 8 DNNrsquos hyperparameters

Parameter ValueMax_words 5000Batch size 128Embedding_dim 512Activation function SoftMaxLayers 04Trainingtesting 70ndash30No of epochs 15Loss function Sparse categorical cross-entropy

Table 9 RNNrsquos hyperparameters

Parameter ValueRNN (LSTM) (803)Max_words 50000Batch size 64Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 90ndash10No of epochs 05Loss function Sparse categorical cross-entropyRNN (LSTM) (81)Max_words 30000Batch size 128Embedding_dim 100Activation function SoftMaxRecurrent dropout 02Trainingtesting 80ndash20No of epochs 05Loss function Sparse categorical cross-entropy

8 Scientific Programming

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

Many tasks can be accomplished for Urdu language text in the future. Some of them are as follows:

(1) Generic word embedding models can be developed on a large corpus of Urdu language text.

(2) Different deep learning classifiers, e.g., BERT and ANN, can be evaluated.

(3) Event classification can be performed at the document level.

(4) A balanced dataset can be used for better results.

(5) Multilabel event classification can be performed in the future.

(6) Unstructured Urdu text data can be classified into different event classes.

(7) Event classification for the Urdu language can be extended to other domains of knowledge, e.g., literacy ratio, top trends, famous foods, and religious events like Eid.

(8) Contextual information about a sentence, i.e., pre-sentence and post-sentence information, can play a vital role in enhancing the accuracy of the classification model.

(9) Event classification can be performed on a balanced dataset.

(10) Unstructured Urdu data can be used for event classification.

(11) Classification can be performed at the document and phrase levels.

Data Availability

The data used to support this study are available at https://github.com/unique-world/Multiclass-Event-Classification-Dataset.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

References

[1] A. Lenhart, R. Ling, S. Campbell, and K. Purcell, Teens and Mobile Phones: Text Messaging Explodes as Teens Embrace it as the Centerpiece of Their Communication Strategies with Friends, Pew Internet & American Life Project, Washington, DC, USA, 2010.
[2] M. Motoyama, B. Meeder, K. Levchenko, G. M. Voelker, and S. Savage, "Measuring online service availability using twitter," WOSN, vol. 10, p. 13, 2010.
[3] J. Rogstadius, M. Vukovic, C. A. Teixeira, V. Kostakos, E. Karapanos, and J. A. Laredo, "CrisisTracker: crowdsourced social media curation for disaster awareness," IBM Journal of Research and Development, vol. 57, no. 5, pp. 4–1, 2013.
[4] T. Reuter and P. Cimiano, "Event-based classification of social media streams," in Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, pp. 1–8, Bielefeld, Germany, June 2012.
[5] K. Sailunaz and R. Alhajj, "Emotion and sentiment analysis from Twitter text," Journal of Computational Science, vol. 36, Article ID 101003, 2019.
[6] P. Capet, T. Delavallade, T. Nakamura, A. Sandor, C. Tarsitano, and S. Voyatzi, "A risk assessment system with automatic extraction of event types," in Proceedings of the International Conference on Intelligent Information Processing, pp. 220–229, Springer, Beijing, China, October 2008.
[7] F. Hogenboom, F. Frasincar, U. Kaymak, F. De Jong, and E. Caron, "A survey of event extraction methods from text for decision support systems," Decision Support Systems, vol. 85, pp. 12–22, 2016.
[8] S. Jiang, H. Chen, J. F. Nunamaker, and D. Zimbra, "Analyzing firm-specific social media and market: a stakeholder-based event analysis framework," Decision Support Systems, vol. 67, pp. 30–39, 2014.
[9] B. Pang and L. Lee, "Opinion mining and sentiment analysis," Foundations and Trends in Information Retrieval, vol. 2, no. 1-2, pp. 1–135, 2008.
[10] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, "Indexing by latent semantic analysis," Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391–407, 1990.
[11] T. Mikolov, Q. V. Le, and I. Sutskever, "Exploiting similarities among languages for machine translation," 2013, http://arxiv.org/abs/1309.4168.
[12] R. Alghamdi and K. Alfalqi, "A survey of topic modeling in text mining," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 6, no. 1, 2015.
[13] D. M. Eberhard, S. F. Gary, and C. D. Fennig, Ethnologue: Languages of the World, SIL International, Dallas, TX, USA, 2019.
[14] A. Daud, W. Khan, and D. Che, "Urdu language processing: a survey," Artificial Intelligence Review, vol. 47, no. 3, pp. 279–311, 2017.
[15] M. P. Akhter, Z. Jiangbin, I. R. Naqvi, M. Abdelmajeed, A. Mehmood, and M. T. Sadiq, "Document-level text classification using single-layer multisize filters convolutional neural network," IEEE Access, vol. 8, pp. 42689–42707, 2021.
[16] U. Pal and A. Sarkar, "Recognition of printed Urdu script," in Proceedings of the 2003 Seventh International Conference on Document Analysis and Recognition, pp. 1183–1187, IEEE, Edinburgh, Scotland, August 2003.
[17] Y. Yang, T. Pierce, and J. Carbonell, "A study of retrospective and on-line event detection," in Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 28–36, Melbourne, Australia, August 1998.

[18] T. Kala, "Event detection from text data," Computational Intelligence, vol. 31, pp. 132–164, 2015.
[19] M. Naughton, N. Stokes, and J. Carthy, "Sentence-level event classification in unstructured texts," Information Retrieval, vol. 13, no. 2, pp. 132–156, 2010.
[20] G. Jacobs, E. Lefever, and V. Hoste, "Economic event detection in company-specific news text," in Proceedings of the First Workshop on Economics and Natural Language Processing, pp. 1–10, Melbourne, Australia, July 2018.
[21] E. D'Andrea, P. Ducange, A. Bechini, A. Renda, and F. Marcelloni, "Monitoring the public opinion about the vaccination topic from tweets analysis," Expert Systems with Applications, vol. 116, pp. 209–226, 2019.
[22] M. Sokolova and G. Lapalme, "A systematic analysis of performance measures for classification tasks," Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.
[23] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proceedings of Advances in Neural Information Processing Systems, vol. 26, pp. 3111–3119, Lake Tahoe, NV, USA, December 2013.
[24] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, pp. 1137–1155, 2003.
[25] M. P. Akhter, Z. Jiangbin, I. R. Naqvi, M. Abdelmajeed, and M. Fayyaz, "Exploring deep learning approaches for Urdu text classification in product manufacturing," Enterprise Information Systems, pp. 1–26, 2020.
[26] G. Liu and J. Guo, "Bidirectional LSTM with attention mechanism and convolutional layer for text classification," Neurocomputing, vol. 337, pp. 325–338, 2019.
[27] M. Sharjeel, R. M. A. Nawab, and P. Rayson, "COUNTER: corpus of Urdu news text reuse," Language Resources and Evaluation, vol. 51, no. 3, pp. 777–803, 2017.
[28] K. Mehmood, D. Essam, and K. Shafi, "Sentiment analysis system for Roman Urdu," in Proceedings of the 2018 Science and Information Conference, pp. 29–42, Springer, Casablanca, Morocco, July 2018.
[29] K. Ahmed, M. Ali, S. Khalid, and M. Kamran, "Framework for Urdu news headlines classification," Journal of Applied Computer Science & Mathematics, vol. 10, no. 1, pp. 17–21, 2016.
[30] Z. Tehseen, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.
[31] W. Yin and L. Shen, "A short text classification approach with event detection and conceptual information," in Proceedings of the 2020 5th International Conference on Machine Learning Technologies, pp. 129–135, Beijing, China, June 2020.
[32] H. Zhou, M. Huang, T. Zhang et al., "Emotional chatting machine: emotional conversation generation with internal and external memory," in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, February 2018.
[33] A. Hassan and A. Mahmood, "Convolutional recurrent deep learning model for sentence classification," IEEE Access, vol. 6, pp. 13949–13957, 2018.
[34] T. Zia, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.
[35] A. R. Ali and M. Ijaz, "Urdu text classification," in Proceedings of the 7th International Conference on Frontiers of Information Technology, pp. 1–7, Abbottabad, Pakistan, December 2009.
[36] M. Usman, Z. Shafique, S. Ayub, and K. Malik, "Urdu text classification using majority voting," International Journal of Advanced Computer Science and Applications, vol. 7, no. 8, pp. 265–273, 2016.
[37] N. Kalchbrenner, E. Grefenstette, and P. Blunsom, "A convolutional neural network for modelling sentences," 2014, http://arxiv.org/abs/1404.2188.
[38] D. M. Awais and D. M. Shoaib, "Role of discourse information in Urdu sentiment classification," ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 18, no. 4, pp. 1–37, 2019.
[39] J. P. Singh, Y. K. Dwivedi, N. P. Rana, A. Kumar, and K. K. Kapoor, "Event classification and location prediction from tweets during disasters," Annals of Operations Research, vol. 283, no. 1-2, pp. 737–757, 2019.
[40] R. C. Paulo, D. Fillipe, and M. S. C. Sergo, "Classification of events on social media," 2016.
[41] Q. A. Al-Radaideh and M. A. Al-Abrat, "An Arabic text categorization approach using term weighting and multiple reducts," Soft Computing, vol. 23, no. 14, pp. 5849–5863, 2019.
[42] J. F. Allen, "Maintaining knowledge about temporal intervals," Communications of the ACM, vol. 26, no. 11, pp. 832–843, 1983.
[43] T. Joachims, "Text categorization with support vector machines: learning with many relevant features," in Proceedings of the European Conference on Machine Learning, pp. 137–142, Springer, Chemnitz, Germany, April 1998.
[44] S. Haider, "Urdu word embeddings," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018.
[45] B. Jawaid, A. Kamran, and O. Bojar, "Urdu monolingual corpus," LINDAT/CLARIN Digital Library at the Institute of Formal and Applied Linguistics, Charles University, Prague, Czechia.
[46] F. Adeeba, Q. Akram, H. Khalid, and S. Hussain, "CLE Urdu books n-grams," in Proceedings of the Conference on Language and Technology CLT 14, Karachi, Pakistan, May 2014.
[47] A. Hassan and A. Mahmood, "Deep learning for sentence classification," in Proceedings of the 2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT), pp. 1–5, IEEE, New York, NY, USA, May 2017.
[48] D.-X. Zhou, "Universality of deep convolutional neural networks," Applied and Computational Harmonic Analysis, vol. 48, no. 2, pp. 787–794, 2020.
[49] M. V. Valueva, N. N. Nagornov, P. A. Lyakhov, G. V. Valuev, and N. I. Chervyakov, "Application of the residue number system to reduce hardware costs of the convolutional neural network implementation," Mathematics and Computers in Simulation, vol. 177, pp. 232–243, 2020.
[50] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, "KNN model-based approach in classification," in Proceedings of the OTM Confederated International Conferences "On the Move to Meaningful Internet Systems", Catania, Italy, November 2003.
[51] Y. Zhong, "The analysis of cases based on decision tree," in Proceedings of the 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 142–147, IEEE, Beijing, China, August 2016.
[52] S. Xu, "Bayesian Naïve Bayes classifiers to text classification," Journal of Information Science, vol. 44, no. 1, pp. 48–59, 2018.
[53] T. Zhang and F. Oles, "Text categorization based on regularized linear classification methods," Information Retrieval, vol. 4, no. 1, pp. 5–31, 2001.
[54] J. Ali, R. Khan, N. Ahmad, and I. Maqsood, "Random forests and decision trees," International Journal of Computer Science Issues (IJCSI), vol. 9, no. 5, p. 272, 2012.
[55] Y. Zhang, "Support vector machine classification algorithm and its application," in Proceedings of the International Conference on Information Computing and Applications, pp. 179–186, Springer, Bhubaneswar, India, September 2012.


Page 9: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

pretrained [44] and custom pretrained word embeddingmodels is given in Table 10 Our custom pretrained wordembedding model that contains 57251 unique tokens largerdimension size 350 and 1 as the size of a window showed3868 accuracy )e purpose of developing a differentcustom pretrained word embedding model was to develop adomain-specific model and achieve the highest accuracyHowever the results of both pre-existing pretrained wordembedding models and domain-specific custom word em-bedding models are very low )e detail summary of resultscan be seen in Table 10

712 TF-IDF Feature Vector DNN architecture consists ofan input layer a dense layer and a max pool layer)e denselayer is also called a fully connected layer comprised of 150nodes SoftMax activation function and sparse_categor-ical_cross-entropy are used to compile the model on thedataset

25991 instances are used to validate the accuracy of theDNN model )e DNN with connected layer architectureshowed 84 overall accuracy for all event classes)e details ofthe performance measuring parameters for each class of eventare given in Table 11 Law and order the 6th type of event inour dataset consists of 2000 instances that are used for vali-dation It showed 66 accuracy that is comparatively low to theaccuracy of other types of events It affected the overall per-formance of the DNN model )e main reason behind theseresults is that the sentence of law and order overlaps with thesentences of politics Generally sometimes humans hardlydistinguish between law and order and political statements

For example

ldquo ےطخوگتفگہنارادہمذریغیکریزویتموکحےہہرطخےیلےکنماےک rdquo

ldquo)e irresponsible talk of state minister is a threat topeace in the regionrdquo

)e performance of the DNN model is given in Table 11that showed 84 accuracy for multiple classes of events All theother performance measuring parameters ie precessionrecall and F1-score of each class of events are given in Table 11

)e accuracy of the DNN model can be viewed inFigure 5 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 84accuracy for multiclass event classification

)e expected solution to tackle the sentence over-lapping problem with multiple classes is to use a ldquopre-trained word embeddingrdquo model like W2Vec and GloveHowever unfortunately like the English language stillthere is no openclose domain pretrained word

embedding model that is developed by a large corpus ofthe Urdu language text

)e RNN sequential model architecture of deep learningis used in our experiments )e recurrent deep learningmodel architecture consists of a sequence of the followinglayers ie embedding layer having 100 dimensions Spa-tialDropout1D LSTM and dense layers Sparse_categor-ical_cross-entropy loss function has been used for thecompilation of the model Multiclass categorical classifica-tion is handled by a sparse categorical cross-entropy lossfunction instead of categorical cross-entropy A SoftMaxactivation function is used at a dense layer instead of thesigmoid function SoftMax can handle nonlinear classifi-cation ie multiple classes while sigmoid is limited to linearclassification and handles binary classification

A bag-of-words consisting of 30000 unique Urdu lan-guage words is used to generate a feature vector )emaximum length of the feature vector is 250 tokens

)e overall accuracy of the RNN model is presented inTable 12 that achieved 81 validation accuracy for our problemby using TF-IDF feature vectors Other performance evaluationparameters of each class are also given in Table 12

)e accuracy of the RNN model can be viewed inFigure 6 where the y-axis represents the accuracy and the x-axis represents the number of epochs RNN achieved 81accuracy for multiclass event classification

Although CNN is highly recommended for imageprocessing it showed considerable results for multiclassevent classification on textual data )e performance mea-suring parameters of the CNN classifier are given in Table 13

)e distributed accuracy of the CNN classifier for thetwelve classes can be viewed in Figure 7 )ere is more thanone peak (higher accuracies) in Figure 7 that showeddatasets are imbalanced

713 One-Hot-Encoding )e results of deep learning clas-sifiers are used in our researcher work and their performanceon one-hot-encoding features is presented in Figure 8)e one-hot-encoded feature vectors are given as input to CNN DNNand RNN deep learning classifiers RNN showed better ac-curacy as compared to CNN while the DNN outperformed

Table 10 Classification accuracy of the CNN model

Srno

Existing pretrained modelrsquosvalidation_accuracy

Custom pretrained modelrsquosvalidation_accuracy

1 5800 36852 6026 38043 5668 37384 mdash 3868

Table 11 Performance measuring parameters for the DNNmodel

Class Precision Recall F1-score Support1 096 095 096 46042 091 091 091 7763 075 075 075 16974 078 070 074 7705 081 085 083 84246 071 063 067 20007 100 100 100 8178 092 090 091 18399 070 070 071 252410 095 099 097 85611 095 099 097 74112 082 073 077 943Accuracy 084 25991Macro avg 084 084 085 25991Weighted avg 084 084 084 25991

Scientific Programming 9

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017

[15] M P Akhter Z Jiangbin I R Naqvi M AbdelmajeedA Mehmood and M T Sadiq ldquoDocument-level text clas-sification using single-layer multisize filters convolutionalneural networkrdquo IEEE Access vol 8 pp 42689ndash42707 2021

[16] U Pal and A Sarkar ldquoRecognition of printed Urdu scriptrdquo inProceedings of the 2003 Seventh International Conference onDocument Analysis and Recognition pp 1183ndash1187 IEEEEdinburgh Scotland August 2003

[17] Y Yang T Pierce and J Carbonell ldquoA study of retrospectiveand on-line event detectionrdquo in Proceedings of the 21st annualinternational ACM SIGIR conference on Research and

Scientific Programming 13

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T Kala ldquoEvent detection from text datardquo ComputationalIntelligence vol 31 pp 132ndash164 2015

[19] M Naughton N Stokes and J Carthy ldquoSentence-level eventclassification in unstructured textsrdquo Information Retrievalvol 13 no 2 pp 132ndash156 2010

[20] G Jacobs E Lefever and V Hoste ldquoEconomic event de-tection in company-specific news textrdquo in Proceedings of theFirst Workshop on Economics and Natural Language Pro-cessing pp 1ndash10 Melbourne Australia July 2018

[21] E DrsquoAndrea P Ducange A Bechini A Renda andF Marcelloni ldquoMonitoring the public opinion about thevaccination topic from tweets analysisrdquo Expert Systems withApplications vol 116 pp 209ndash226 2019

[22] M Sokolova and G Lapalme ldquoA systematic analysis ofperformance measures for classification tasksrdquo InformationProcessing amp Management vol 45 no 4 pp 427ndash437 2009

[23] T Mikolov I Sutskever K Chen G S Corrado and J DeanldquoDistributed representations of words and phrases and theircompositionalityrdquo in Proceedings Advances Neural Informa-tion Processing Systems vol 26 pp 3111ndash3119 Lake TahoeNV USA December 2013

[24] Y Bengio R Ducharme P Vincent and C Jauvin ldquoA neuralprobabilistic language modelrdquo Journal of Machine LearningResearch vol 3 pp 1137ndash1155 2003

[25] M P Akhter Z Jiangbin I R Naqvi M Abdelmajeed andM Fayyaz ldquoExploring deep learning approaches for Urdu textclassification in product manufacturingrdquo Enterprise Infor-mation Systems pp 1ndash26 2020

[26] G Liu and J Guo ldquoBidirectional LSTM with attentionmechanism and convolutional layer for text classificationrdquoNeurocomputing vol 337 pp 325ndash338 2019

[27] M Sharjeel R M A Nawab and P Rayson ldquoCOUNTERcorpus of Urdu news text reuserdquo Language Resources andEvaluation vol 51 no 3 pp 777ndash803 2017

[28] K Mehmood D Essam and K Shafi ldquoSentiment analysissystem for Roman Urdurdquo in Proceedings of the 2018 Scienceand Information Conference pp 29ndash42 Springer CasablancaMorocco July 2018

[29] K Ahmed M Ali S Khalid and M Kamran ldquoFramework forUrdu news headlines classificationrdquo Journal of Applied ComputerScience amp Mathematics vol 10 no 1 pp 17ndash21 2016

[30] Z Tehseen M P Akhter and Q Abbas ldquoComparative studyof feature selection approaches for Urdu text categorizationrdquoMalaysian Journal Computer Science vol 28 no 2 pp 93ndash109 2015

[31] W Yin and L Shen ldquoA short text classification approach withevent detection and conceptual informationrdquo in Proceedingsof the 2020 5th International Conference on Machine LearningTechnologies pp 129ndash135 Beijing China June 2020

[32] H Zhou M Huang T Zhang et al ldquoEmotional chattingmachine emotional conversation generation with internaland external memoryrdquo in Proceedings of the irty-SecondAAAI Conference on Artificial Intelligence New Orleans LAUSA February 2018

[33] A Hassan and A Mahmood ldquoConvolutional recurrent deeplearningmodel for sentence classificationrdquo IEEE Access vol 6pp 13949ndash13957 2018

[34] T Zia M P Akhter and Q Abbas ldquoComparative study offeature selection approaches for Urdu text categorizationrdquoMalaysian Journal of Computer Science vol 28 no 2pp 93ndash109 2015

[35] A R Ali andM Ijaz ldquoUrdu text classificationrdquo in Proceedingsof the 7th international conference on frontiers of informationtechnology pp 1ndash7 Abbottabad Pakistan December 2009

[36] M Usman Z Shafique S Ayub and K Malik ldquoUrdu textclassification using majority votingrdquo International Journal ofAdvanced Computer Science and Applications vol 7 no 8pp 265ndash273 2016

[37] N Kalchbrenner E Grefenstette and P Blunsom ldquoA con-volutional neural network for modelling sentencesrdquo 2014httparxivorgabs14042188

[38] D M Awais and DM Shoaib ldquoRole of discourse informationin Urdu sentiment classificationrdquo ACMTransactions on Asianand Low-Resource Language Information Processing vol 18no 4 pp 1ndash37 2019

[39] J P Singh Y K Dwivedi N P Rana A Kumar andK K Kapoor ldquoEvent classification and location predictionfrom tweets during disastersrdquo Annals of Operations Researchvol 283 no 1-2 pp 737ndash757 2019

[40] R C Paulo D Fillipe and M S C Sergo ldquoClassification ofevents on social mediardquo 2016

[41] Q A Al-Radaideh and M A Al-Abrat ldquoAn Arabic textcategorization approach using term weighting and multiplereductsrdquo Soft Computing vol 23 no 14 pp 5849ndash5863 2019

[42] J F Allen ldquoMaintaining knowledge about temporal inter-valsrdquo Communications of the ACM vol 26 no 11pp 832ndash843 1983

[43] T Joachims ldquoText categorization with support vector ma-chines learning with many relevant featuresrdquo in Proceedingsof the European conference on machine learning pp 137ndash142Springer Chemnitz Germany April 1998

[44] S Haider ldquoUrdu word embeddingsrdquo in Proceedings of theEleventh International Conference on Language Resources andEvaluation (LREC 2018) Miyazaki Japan May 2018

[45] B Jawaid A Kamran and O Bojar ldquoUrdu monolingual corpusrdquoLINDATCLARIN Digital Library at the Institute of Formal andApplied Linguistics Charles University Prague Czechia

[46] F Adeeba Q Akram H Khalid and S Hussain ldquoCLE Urdubooks n-gramsrdquo in Proceedings of the Conference on Languageand Technology CLT 14 Karachi Pakistan May 2014

[47] A Hassan and A Mahmood ldquoDeep learning for sentenceclassificationrdquo in Proceedings of the 2017 IEEE Long IslandSystems Applications and Technology Conference (LISAT)pp 1ndash5 IEEE New York NY USA May 2017

[48] D-X Zhou ldquoUniversality of deep convolutional neuralnetworksrdquo Applied and Computational Harmonic Analysisvol 48 no 2 pp 787ndash794 2020

[49] M V Valueva N N Nagornov P A Lyakhov G V Valuevand N I Chervyakov ldquoApplication of the residue numbersystem to reduce hardware costs of the convolutional neuralnetwork implementationrdquo Mathematics and Computers inSimulation vol 177 pp 232ndash243 2020

[50] G Guo H Wang D Bell Y Bi and K Greer ldquoKNN model-based approach in classificationrdquo in Proceedings of the OTMConfederated International Conferences ldquoOn the Move toMeaningful Internet Systemsrdquo Catania Italy November 2003

[51] Y Zhong ldquo)e analysis of cases based on decision treerdquo inProceedings of the 2016 7th IEEE international conference onsoftware engineering and service science (ICSESS) pp 142ndash147IEEE Beijing China August 2016

[52] S Xu ldquoBayesian Naıve Bayes classifiers to text classificationrdquoJournal of Information Science vol 44 no 1 pp 48ndash59 2018

[53] T Zhang and F Oles ldquoText categorization based on regu-larized linear classification methodsrdquo Information Retrievalvol 4 no 1 pp 5ndash31 2001

14 Scientific Programming

[54] J Ali R Khan N Ahmad and I Maqsood ldquoRandom forestsand decision treesrdquo International Journal of Computer ScienceIssues (IJCSI) vol 9 no 5 p 272 2012

[55] Y Zhang ldquoSupport vector machine classification algorithmand its applicationrdquo in Proceedings of the InternationalConference on Information Computing and Applicationspp 179ndash186 Springer Bhubaneswar India September 2012

Scientific Programming 15

Page 10: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

among them RNN and DNN achieved 81 and 84 accuracyrespectively for multiclass event classification

72 Traditional Machine Learning Classifiers We also per-formed a multiclass event classifier by using traditionalmachine learning algorithms K-nearest neighbor (KNN)decision tree (DT) Naıve Bayes multinomial (NBM) ran-dom forest (RF) linear regression (LR) and support vectormachine (SVM) All these models are evaluated using TF-IDF and one-hot encoding features as feature vectors It wasobserved that the results produced using TF-IDF featureswere better than the results generated using one-hot-encoding features A detailed summary of the results of theabove-mentioned machine learning classifiers is given in thenext section

721 K-Nearest Neighbor (KNN) KNN performs theclassification of a new data point by measuring the similaritydistance between the nearest neighbors In our experimentswe set the value of k 5 that measures the similarity distanceamong five existing data points [50]

Table 12 Performance measuring parameters for the RNN model

Class Precision Recall F1-score Support1 095 095 095 46042 078 077 078 7763 070 072 071 16974 078 064 070 7705 078 084 081 84246 067 057 062 20007 100 100 100 8178 091 087 089 18399 070 063 066 252410 093 098 095 85611 086 094 090 74112 076 067 071 943Accuracy 081 25991Macro avg 082 080 081 25991Weighted avg 081 081 081 25991

Table 13 Performance measuring parameters for the CNN model

Class Precision Recall F1-score Support1 096 093 095 56612 081 065 072 9673 072 068 070 21154 078 054 064 8785 073 088 080 100306 064 051 057 22937 099 099 099 9708 091 086 088 22599 071 061 066 304410 093 094 093 103111 091 082 086 88912 077 063 07 1052Accuracy 080 31189Macro avg 082 075 078 31189Weighted avg 080 080 080 31189

0

065

070

075

080

085

090

2 4 6 8Epochs

Accuracy

Valid

ation_

accuracy

TrainTest

Figure 6 RNNrsquos accuracy

00

1000

2000

3000

4000

5000

6000

7000

8000

20 40 60 80 100

Figure 7 CNNrsquos accuracy distribution

CNN

79

80

808181

82

83

8484

RNN DNNDeep learning models

One-hot-encoding

78

85

Valid

atio

n_ac

cura

cy

Figure 8 CNN RNN andDNN accuracy using one-hot-encoding

10 Scientific Programming

Although the performance of traditional machinelearning classifiers is considerable it must be noted that it islower than deep learning classifiers )e main performancedegrading factor of the classifiers is the imbalanced numberof instances and sentences overlapping )e performance ofthe KNN machine learning model is given in Table 14 Itshowed 78 accuracy

722 Decision Tree (DT) Decision Tree (DT)Decision tree(DT) is a type of supervised machine learning algorithm [51]where the data input is split according to certain parameters)e overall accuracy achieved by DT is 73 while anotherperformance detail of classes andDTmodel is given in Table 15

723 Naive Bayes Multinominal (NBM) Naıve Bayesmultinominal is one of the computational [52] efficientclassifiers for text classification but it showed only 70accuracy that is very low as compared to KNN DT and RF)e performance details of all twelve types of classes aregiven in Table 16

724 Linear Regression (LR) Linear regression is highlyrecommended for the prediction of continuous output in-stead of categorical classification [53] Table 17 shows the

Table 17 Performance measuring parameters for the LR model

Class Precision Recall F1-score Support1 095 094 094 56612 083 064 072 9673 072 069 070 21154 077 055 064 8785 073 088 080 100306 064 053 058 22937 100 100 100 9708 091 084 088 22599 073 062 067 304410 094 092 093 103111 090 080 085 88912 077 066 071 1052Accuracy 080 31189Macro avg 082 076 079 31189Weighted avg 080 080 0 80 31189

Table 18 Performance measuring parameters for the RF model

Class Precision Recall F1-score Support1 094 093 094 56612 094 096 095 9673 072 063 067 21154 080 058 067 8785 071 090 079 100306 067 041 051 22937 100 100 100 9708 093 080 086 22599 075 058 065 304410 094 098 096 103111 096 098 097 88912 084 063 072 1052Accuracy 080 31189Macro avg 085 078 081 31189Weighted avg 081 080 0 80 31189

Table 14 Performance measuring parameters for the KNN model

Class Precision Recall F1-score Support1 091 093 092 56612 062 083 071 9673 067 071 069 21154 064 060 062 8785 078 082 080 100306 066 050 057 22937 093 100 096 9708 091 080 085 22599 071 062 066 304410 085 093 089 103111 072 085 078 88912 075 061 067 1052Accuracy 078 31189Macro avg 076 077 076 31189Weighted avg 078 078 0 78 31189

Table 15 Performance measuring parameters for the DT model

Class Precision Recall F1-score Support1 091 089 090 56612 083 097 089 9673 057 052 054 21154 058 054 056 8785 072 075 073 100306 044 041 042 22937 099 100 100 9708 079 077 078 22599 057 055 056 304410 098 098 093 103111 086 098 092 88912 061 056 058 1031Accuracy 073 31189Macro avg 073 074 074 31189Weighted avg 073 073 0 73 31189

Table 16 Performance measuring parameters for the NB Multi-nominal model

Class Precision Recall F1-score Support1 094 091 093 56832 082 034 048 9563 066 047 055 21214 091 020 032 9195 056 095 070 100136 070 022 034 23877 098 095 097 9598 094 075 083 21889 075 040 052 303110 096 078 086 99811 096 032 048 86312 084 025 039 1071Accuracy 070 31189Macro avg 084 054 061 31189Weighted avg 076 070 0 67 31189

Scientific Programming 11

performance of the LR model ie 84 overall accuracy formulticlass event classification

725 Random Forest (RF) It comprises many decision trees[54] Its results showed the highest accuracy among allevaluated machine learning classifiers A detailed summaryof the results is given in Table 18

726 Support Vector Machine (SVM) )e support vectormachine (SVM) is one of the highly recommended modelsfor binary classification It is based on statistical theory [55]Its performance details are given in Table 19

A comparative depiction of results obtained by thetraditional machine learning classifiers is given in Figure 9

8 Discussion and Conclusion

Lack of resources is a major hurdle in research for Urdulanguage texts We explored many feature vectors generatingtechniques Different classification algorithms of traditional

machine learning and deep learning approaches are evaluatedon these feature vectors )e purpose of performing manyexperiments on various feature vector generating techniqueswas to develop the most efficient and generic model of mul-ticlass event classification for Urdu language text

Word embedding feature generating technique is con-sidered an efficient and powerful technique for text analysisWord2Vector (W2Vec) feature vectors can be generated bypretrained word embedding models or using dynamic pa-rameters in embedding layers of deep neural networks Weperformed sentence classification using pretrained wordembedding models one-hot-encoding TF TF-IDF anddynamic embeddings )e results of the rest of the featurevector generating techniques are better than pretrained wordembedding models

Another argument in support of this conclusion is thatonly a few pretrained word embedding models exist forUrdu language texts )ese models are trained on consid-erable tokens and domain-specific Urdu text )ere is a needto develop generic word embedding models for the Urdulanguage on a large corpus CNN and RNN (LSTM) single-layer architecture and multilayer architecture do not affectthe performance of the proposed system

Experimental results are the vivid depiction that theone-hot-encoding method is better than the word em-bedding model and pretrained word embedding modelHowever among all mentioned (see Section 52) featuregenerating techniques TF-IDF outperformed It showedthe highest accuracy (84) by using DNN deep learningclassifier while event classification on an imbalancedataset of multiclass events for Urdu language usingtraditional machine learning classifiers showed consid-erable performance but lower than deep learning modelsDeep learning algorithms ie CNN DNN and RNN arepreferable over traditional machine learning algorithmsbecause there is no need for a domain expert to findrelevant features in deep learning like traditional machinelearning DNN and RNN outperformed among all otherclassifiers and showed overall 84 and 81 accuracyrespectively for the twelve classes of events Compara-tively the performance of CNN and RNN is better thanNaıve Bayes and SVM

Multiclass event classification at the sentence levelperformed on an imbalance dataset events that are having alow number of instances for a specific class affect the overallperformance of the classifiers We can improve the per-formance by balancing the instances of each class )efollowing can be concluded

(1) Pretrained word embedding models are suitable onlyfor sentence classification if pretrained models aredeveloped by an immense amount of textual data

(2) Existing word embedding models Word2Vec andGlove that were developed for the English languagetext are incompatible for Urdu language text

(3) In our case TF-IDF one-hot-encoding and dy-namic embedding layer are better feature generating

Table 19 Performance measuring parameters for the SVM model

Class Precision Recall F1-score Support1 084 094 089 56832 072 043 054 9563 072 049 058 21214 073 043 054 9195 064 090 075 100136 074 024 036 23877 090 099 094 9598 086 078 082 21889 065 047 057 303110 085 087 082 99811 081 062 070 86312 077 063 067 1071Accuracy 073 31189Macro avg 077 063 067 31189Weighted avg 077 073 0 71 31189

SVM

70

7573

70

80

73

80

7880

RF LRMachine learing classifiers

65

85

DT KNNNBM

Valid

atio

n_ac

cura

cy

Figure 9 Machine learning algorithmsrsquo accuracy using TF-IDF

12 Scientific Programming

techniques as compared to pre-existing Urdu lan-guage text word embedding models

(4) )e TF-IDF-based feature vectors showed thehighest results as compared to one-hot-encoding-and dynamic word embedding-based feature vectors

(5) Imbalance number of instances in the dataset af-fected the overall accuracy

9 Future Work

In a comprehensive review of Urdu literature we found onlya few numbers of referential works related to Urdu textprocessing )e main hurdle in Urdu exploration is theunavailability of the processing resources ie event datasetclose-domain part-of-speech tagger lexicons annotatorsand other supporting tools

)ere are a lot of tasks that can be accomplished forUrdu language text in the future Some of those are men-tioned as follows

(1) Generic word embedding models can be developedfor a large corpus of Urdu language text

(2) Different deep learning classifiers can be evaluatedie BERT and ANN

(3) Event classification can be performed at the doc-ument level

(4) A balance dataset can be used for better results(5) Multilabel event classification can be performed in

the future(6) Unstructured data of Urdu text can be classified

into different event classes(7) Classification of events for the Urdu language can

be further performed for other domains ofknowledge ie literacy ratio top trends famousfoods and a religious event like Eid

(8) Contextual information of sentence ie presen-tence and postsentence information certainly playsa vital role in enhancing the performance accuracyof the classification model

(9) Event classification can be performed on a balanceddataset

(10) Unstructured Urdu data can be used for eventclassification

(11) Classification can be performed at a document andphrase level

Data Availability

)e data used to support this study are available at httpsgithubcomunique-worldMulticlass-Event-Classification-Dataset

Conflicts of Interest

)e authors declare that there are no conflicts of interest

References

[1] A Lenhart R Ling S Campbell and K Purcell Teens andMobile Phones Text Messaging Explodes as Teens Embrace it asthe Centerpiece of eir Communication Strategies withFriends Pew Internet amp American Life Project WashingtonDC USA 2010

[2] M Motoyama B Meeder K Levchenko G M Voelker andS Savage ldquoMeasuring online service availability using twit-terrdquo WOSN vol 10 p 13 2010

[3] J Rogstadius M Vukovic C A Teixeira V KostakosE Karapanos and J A Laredo ldquoCrisisTracker crowdsourcedsocial media curation for disaster awarenessrdquo IBM Journal ofResearch and Development vol 57 no 5 pp 4ndash1 2013

[4] T Reuter and P Cimiano ldquoEvent-based classification of socialmedia streamsrdquo in Proceedings of the 2nd ACM InternationalConference on Multimedia Retrieval pp 1ndash8 Bielefeld Ger-many June 2012

[5] K Sailunaz and R Alhajj ldquoEmotion and sentiment analysisfrom Twitter textrdquo Journal of Computational Science vol 36Article ID 101003 2019

[6] P Capet T Delavallade T Nakamura A SandorC Tarsitano and S Voyatzi ldquoA risk assessment system withautomatic extraction of event typesrdquo in Proceedings of theInternational Conference on Intelligent Information Process-ing pp 220ndash229 Springer Beijing China October 2008

[7] F Hogenboom F Frasincar U Kaymak F De Jong andE Caron ldquoA survey of event extraction methods from text fordecision support systemsrdquo Decision Support Systems vol 85pp 12ndash22 2016

[8] S Jiang H Chen J F Nunamaker and D Zimbra ldquoAna-lyzing firm-specific social media and market a stakeholder-based event analysis frameworkrdquo Decision Support Systemsvol 67 pp 30ndash39 2014

[9] B Pang and L Lee ldquoOpinion mining and sentiment analysisrdquoFoundations and Trends in Information Retrieval vol 2 no 1-2 pp 1ndash135 2008

[10] S Deerwester S T Dumais G W Furnas T K Landauerand R Harshman ldquoIndexing by latent semantic analysisrdquoJournal of the American Society for Information Sciencevol 41 no 6 pp 391ndash407 1990

[11] T Mikolov Q V Le and I Sutskever ldquoExploiting similaritiesamong languages for machine translationrdquo 2013 httparxivorgabs13094168

[12] R Alghamdi and K Alfalqi ldquoA survey of topic modeling intext miningrdquo International Journal of Advanced ComputerScience and Applications (IJACSA)vol 6 no 1 2015

[13] D M Eberhard S F Gary and C D Fennig EthnologueLanguages of the World SIL International Dallas TX USA2019

[14] A Daud W Khan and D Che ldquoUrdu language processing asurveyrdquo Artificial Intelligence Review vol 47 no 3pp 279ndash311 2017

[15] M P Akhter Z Jiangbin I R Naqvi M AbdelmajeedA Mehmood and M T Sadiq ldquoDocument-level text clas-sification using single-layer multisize filters convolutionalneural networkrdquo IEEE Access vol 8 pp 42689ndash42707 2021

[16] U Pal and A Sarkar ldquoRecognition of printed Urdu scriptrdquo inProceedings of the 2003 Seventh International Conference onDocument Analysis and Recognition pp 1183ndash1187 IEEEEdinburgh Scotland August 2003

[17] Y Yang T Pierce and J Carbonell ldquoA study of retrospectiveand on-line event detectionrdquo in Proceedings of the 21st annualinternational ACM SIGIR conference on Research and

Scientific Programming 13

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T. Kala, "Event detection from text data," Computational Intelligence, vol. 31, pp. 132–164, 2015.

[19] M. Naughton, N. Stokes, and J. Carthy, "Sentence-level event classification in unstructured texts," Information Retrieval, vol. 13, no. 2, pp. 132–156, 2010.

[20] G. Jacobs, E. Lefever, and V. Hoste, "Economic event detection in company-specific news text," in Proceedings of the First Workshop on Economics and Natural Language Processing, pp. 1–10, Melbourne, Australia, July 2018.

[21] E. D'Andrea, P. Ducange, A. Bechini, A. Renda, and F. Marcelloni, "Monitoring the public opinion about the vaccination topic from tweets analysis," Expert Systems with Applications, vol. 116, pp. 209–226, 2019.

[22] M. Sokolova and G. Lapalme, "A systematic analysis of performance measures for classification tasks," Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.

[23] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proceedings of Advances in Neural Information Processing Systems, vol. 26, pp. 3111–3119, Lake Tahoe, NV, USA, December 2013.

[24] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, pp. 1137–1155, 2003.

[25] M. P. Akhter, Z. Jiangbin, I. R. Naqvi, M. Abdelmajeed, and M. Fayyaz, "Exploring deep learning approaches for Urdu text classification in product manufacturing," Enterprise Information Systems, pp. 1–26, 2020.

[26] G. Liu and J. Guo, "Bidirectional LSTM with attention mechanism and convolutional layer for text classification," Neurocomputing, vol. 337, pp. 325–338, 2019.

[27] M. Sharjeel, R. M. A. Nawab, and P. Rayson, "COUNTER: corpus of Urdu news text reuse," Language Resources and Evaluation, vol. 51, no. 3, pp. 777–803, 2017.

[28] K. Mehmood, D. Essam, and K. Shafi, "Sentiment analysis system for Roman Urdu," in Proceedings of the 2018 Science and Information Conference, pp. 29–42, Springer, Casablanca, Morocco, July 2018.

[29] K. Ahmed, M. Ali, S. Khalid, and M. Kamran, "Framework for Urdu news headlines classification," Journal of Applied Computer Science & Mathematics, vol. 10, no. 1, pp. 17–21, 2016.

[30] Z. Tehseen, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.

[31] W. Yin and L. Shen, "A short text classification approach with event detection and conceptual information," in Proceedings of the 2020 5th International Conference on Machine Learning Technologies, pp. 129–135, Beijing, China, June 2020.

[32] H. Zhou, M. Huang, T. Zhang et al., "Emotional chatting machine: emotional conversation generation with internal and external memory," in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, February 2018.

[33] A. Hassan and A. Mahmood, "Convolutional recurrent deep learning model for sentence classification," IEEE Access, vol. 6, pp. 13949–13957, 2018.

[34] T. Zia, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.

[35] A. R. Ali and M. Ijaz, "Urdu text classification," in Proceedings of the 7th International Conference on Frontiers of Information Technology, pp. 1–7, Abbottabad, Pakistan, December 2009.

[36] M. Usman, Z. Shafique, S. Ayub, and K. Malik, "Urdu text classification using majority voting," International Journal of Advanced Computer Science and Applications, vol. 7, no. 8, pp. 265–273, 2016.

[37] N. Kalchbrenner, E. Grefenstette, and P. Blunsom, "A convolutional neural network for modelling sentences," 2014, http://arxiv.org/abs/1404.2188.

[38] D. M. Awais and D. M. Shoaib, "Role of discourse information in Urdu sentiment classification," ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 18, no. 4, pp. 1–37, 2019.

[39] J. P. Singh, Y. K. Dwivedi, N. P. Rana, A. Kumar, and K. K. Kapoor, "Event classification and location prediction from tweets during disasters," Annals of Operations Research, vol. 283, no. 1-2, pp. 737–757, 2019.

[40] R. C. Paulo, D. Fillipe, and M. S. C. Sergo, "Classification of events on social media," 2016.

[41] Q. A. Al-Radaideh and M. A. Al-Abrat, "An Arabic text categorization approach using term weighting and multiple reducts," Soft Computing, vol. 23, no. 14, pp. 5849–5863, 2019.

[42] J. F. Allen, "Maintaining knowledge about temporal intervals," Communications of the ACM, vol. 26, no. 11, pp. 832–843, 1983.

[43] T. Joachims, "Text categorization with support vector machines: learning with many relevant features," in Proceedings of the European Conference on Machine Learning, pp. 137–142, Springer, Chemnitz, Germany, April 1998.

[44] S. Haider, "Urdu word embeddings," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018.

[45] B. Jawaid, A. Kamran, and O. Bojar, "Urdu monolingual corpus," LINDAT/CLARIN Digital Library at the Institute of Formal and Applied Linguistics, Charles University, Prague, Czechia.

[46] F. Adeeba, Q. Akram, H. Khalid, and S. Hussain, "CLE Urdu books n-grams," in Proceedings of the Conference on Language and Technology CLT 14, Karachi, Pakistan, May 2014.

[47] A. Hassan and A. Mahmood, "Deep learning for sentence classification," in Proceedings of the 2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT), pp. 1–5, IEEE, New York, NY, USA, May 2017.

[48] D.-X. Zhou, "Universality of deep convolutional neural networks," Applied and Computational Harmonic Analysis, vol. 48, no. 2, pp. 787–794, 2020.

[49] M. V. Valueva, N. N. Nagornov, P. A. Lyakhov, G. V. Valuev, and N. I. Chervyakov, "Application of the residue number system to reduce hardware costs of the convolutional neural network implementation," Mathematics and Computers in Simulation, vol. 177, pp. 232–243, 2020.

[50] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, "KNN model-based approach in classification," in Proceedings of the OTM Confederated International Conferences "On the Move to Meaningful Internet Systems," Catania, Italy, November 2003.

[51] Y. Zhong, "The analysis of cases based on decision tree," in Proceedings of the 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 142–147, IEEE, Beijing, China, August 2016.

[52] S. Xu, "Bayesian Naïve Bayes classifiers to text classification," Journal of Information Science, vol. 44, no. 1, pp. 48–59, 2018.

[53] T. Zhang and F. Oles, "Text categorization based on regularized linear classification methods," Information Retrieval, vol. 4, no. 1, pp. 5–31, 2001.


[54] J. Ali, R. Khan, N. Ahmad, and I. Maqsood, "Random forests and decision trees," International Journal of Computer Science Issues (IJCSI), vol. 9, no. 5, p. 272, 2012.

[55] Y. Zhang, "Support vector machine classification algorithm and its application," in Proceedings of the International Conference on Information Computing and Applications, pp. 179–186, Springer, Bhubaneswar, India, September 2012.






Page 14: ResearchArticle MulticlassEventClassificationfromText2020/11/03  · script. Its grammatical structure is different from other languages. (1)Subject-object-verb(SOV)sentencestructure[14]

development in information retrieval pp 28ndash36 MelbourneAustralia August 1998

[18] T. Kala, "Event detection from text data," Computational Intelligence, vol. 31, pp. 132–164, 2015.

[19] M. Naughton, N. Stokes, and J. Carthy, "Sentence-level event classification in unstructured texts," Information Retrieval, vol. 13, no. 2, pp. 132–156, 2010.

[20] G. Jacobs, E. Lefever, and V. Hoste, "Economic event detection in company-specific news text," in Proceedings of the First Workshop on Economics and Natural Language Processing, pp. 1–10, Melbourne, Australia, July 2018.

[21] E. D'Andrea, P. Ducange, A. Bechini, A. Renda, and F. Marcelloni, "Monitoring the public opinion about the vaccination topic from tweets analysis," Expert Systems with Applications, vol. 116, pp. 209–226, 2019.

[22] M. Sokolova and G. Lapalme, "A systematic analysis of performance measures for classification tasks," Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.

[23] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Advances in Neural Information Processing Systems, vol. 26, pp. 3111–3119, Lake Tahoe, NV, USA, December 2013.

[24] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, pp. 1137–1155, 2003.

[25] M. P. Akhter, Z. Jiangbin, I. R. Naqvi, M. Abdelmajeed, and M. Fayyaz, "Exploring deep learning approaches for Urdu text classification in product manufacturing," Enterprise Information Systems, pp. 1–26, 2020.

[26] G. Liu and J. Guo, "Bidirectional LSTM with attention mechanism and convolutional layer for text classification," Neurocomputing, vol. 337, pp. 325–338, 2019.

[27] M. Sharjeel, R. M. A. Nawab, and P. Rayson, "COUNTER: corpus of Urdu news text reuse," Language Resources and Evaluation, vol. 51, no. 3, pp. 777–803, 2017.

[28] K. Mehmood, D. Essam, and K. Shafi, "Sentiment analysis system for Roman Urdu," in Proceedings of the 2018 Science and Information Conference, pp. 29–42, Springer, Casablanca, Morocco, July 2018.

[29] K. Ahmed, M. Ali, S. Khalid, and M. Kamran, "Framework for Urdu news headlines classification," Journal of Applied Computer Science & Mathematics, vol. 10, no. 1, pp. 17–21, 2016.

[30] Z. Tehseen, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.

[31] W. Yin and L. Shen, "A short text classification approach with event detection and conceptual information," in Proceedings of the 2020 5th International Conference on Machine Learning Technologies, pp. 129–135, Beijing, China, June 2020.

[32] H. Zhou, M. Huang, T. Zhang et al., "Emotional chatting machine: emotional conversation generation with internal and external memory," in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, February 2018.

[33] A. Hassan and A. Mahmood, "Convolutional recurrent deep learning model for sentence classification," IEEE Access, vol. 6, pp. 13949–13957, 2018.

[34] T. Zia, M. P. Akhter, and Q. Abbas, "Comparative study of feature selection approaches for Urdu text categorization," Malaysian Journal of Computer Science, vol. 28, no. 2, pp. 93–109, 2015.

[35] A. R. Ali and M. Ijaz, "Urdu text classification," in Proceedings of the 7th International Conference on Frontiers of Information Technology, pp. 1–7, Abbottabad, Pakistan, December 2009.

[36] M. Usman, Z. Shafique, S. Ayub, and K. Malik, "Urdu text classification using majority voting," International Journal of Advanced Computer Science and Applications, vol. 7, no. 8, pp. 265–273, 2016.

[37] N. Kalchbrenner, E. Grefenstette, and P. Blunsom, "A convolutional neural network for modelling sentences," 2014, http://arxiv.org/abs/1404.2188.

[38] D. M. Awais and D. M. Shoaib, "Role of discourse information in Urdu sentiment classification," ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 18, no. 4, pp. 1–37, 2019.

[39] J. P. Singh, Y. K. Dwivedi, N. P. Rana, A. Kumar, and K. K. Kapoor, "Event classification and location prediction from tweets during disasters," Annals of Operations Research, vol. 283, no. 1-2, pp. 737–757, 2019.

[40] R. C. Paulo, D. Fillipe, and M. S. C. Sergo, "Classification of events on social media," 2016.

[41] Q. A. Al-Radaideh and M. A. Al-Abrat, "An Arabic text categorization approach using term weighting and multiple reducts," Soft Computing, vol. 23, no. 14, pp. 5849–5863, 2019.

[42] J. F. Allen, "Maintaining knowledge about temporal intervals," Communications of the ACM, vol. 26, no. 11, pp. 832–843, 1983.

[43] T. Joachims, "Text categorization with support vector machines: learning with many relevant features," in Proceedings of the European Conference on Machine Learning, pp. 137–142, Springer, Chemnitz, Germany, April 1998.

[44] S. Haider, "Urdu word embeddings," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018.

[45] B. Jawaid, A. Kamran, and O. Bojar, "Urdu monolingual corpus," LINDAT/CLARIN Digital Library, Institute of Formal and Applied Linguistics, Charles University, Prague, Czechia.

[46] F. Adeeba, Q. Akram, H. Khalid, and S. Hussain, "CLE Urdu books n-grams," in Proceedings of the Conference on Language and Technology (CLT 14), Karachi, Pakistan, May 2014.

[47] A. Hassan and A. Mahmood, "Deep learning for sentence classification," in Proceedings of the 2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT), pp. 1–5, IEEE, New York, NY, USA, May 2017.

[48] D.-X. Zhou, "Universality of deep convolutional neural networks," Applied and Computational Harmonic Analysis, vol. 48, no. 2, pp. 787–794, 2020.

[49] M. V. Valueva, N. N. Nagornov, P. A. Lyakhov, G. V. Valuev, and N. I. Chervyakov, "Application of the residue number system to reduce hardware costs of the convolutional neural network implementation," Mathematics and Computers in Simulation, vol. 177, pp. 232–243, 2020.

[50] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, "KNN model-based approach in classification," in Proceedings of the OTM Confederated International Conferences "On the Move to Meaningful Internet Systems," Catania, Italy, November 2003.

[51] Y. Zhong, "The analysis of cases based on decision tree," in Proceedings of the 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 142–147, IEEE, Beijing, China, August 2016.

[52] S. Xu, "Bayesian Naïve Bayes classifiers to text classification," Journal of Information Science, vol. 44, no. 1, pp. 48–59, 2018.

[53] T. Zhang and F. Oles, "Text categorization based on regularized linear classification methods," Information Retrieval, vol. 4, no. 1, pp. 5–31, 2001.


[54] J. Ali, R. Khan, N. Ahmad, and I. Maqsood, "Random forests and decision trees," International Journal of Computer Science Issues (IJCSI), vol. 9, no. 5, p. 272, 2012.

[55] Y. Zhang, "Support vector machine classification algorithm and its application," in Proceedings of the International Conference on Information Computing and Applications, pp. 179–186, Springer, Bhubaneswar, India, September 2012.
