Research Article
A Supervised Approach to Predict the Hierarchical Structure of Conversation Threads for Comments

A. Balali, H. Faili, and M. Asadpour

School of ECE, College of Engineering, University of Tehran, P.O. Box 515-14395, Tehran, Iran

Correspondence should be addressed to H. Faili; hfaili@ut.ac.ir

Received 30 August 2013; Accepted 27 November 2013; Published 11 February 2014

Academic Editors: O. Greevy and J. Sarangapani

Copyright © 2014 A. Balali et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

User-generated texts such as comments in social media are rich sources of information. In general, the reply structure of comments is not publicly accessible on the web. Websites present comments as a list in chronological order; this way, some information is lost. A solution to this problem is to reconstruct the thread structure (RTS) automatically. RTS predicts a semantic tree for the reply structure, which is useful for understanding users' behaviours and for facilitating the following of the actual conversation streams. This paper works on the RTS task in blogs, online news agencies, and news websites. These types of websites cover various types of articles reflecting real-world events. People with different views participate in arguments by writing comments. Comments express opinions, sentiments, or ideas about articles. The reply structure of threads in these types of websites is basically different from threads in forums, chats, and emails. To perform RTS, we define a set of textual and nontextual features. Then, we use supervised learning to combine these features. The proposed method is evaluated on five different datasets. The accuracy of the proposed method is compared with baselines. The results reveal higher accuracy for our method in comparison with the baselines on all datasets.

1. Introduction

In recent years, interactive online websites such as weblogs, discussion boards, and news websites have grown in popularity, so these online threads have become a valuable source of information. This information is obtained by interaction among users who create, share, and exchange information and ideas on various topics such as politics, economy, society, and environment. People tend to express their ideas and opinions in public and online [1]. Millions of web users spend hours a day on these online threads in order to read news and articles, write their opinions, and discuss with each other.

Nowadays, most websites use content management systems that allow them to receive feedback from visitors and collect their comments. These comments are publicly shown to the visitors, usually after confirmation by a moderator. They are displayed mainly in chronological order (sometimes in reverse order); that is, when a user posts a comment, it is appended to the end of the list.

Although content management systems nowadays allow nested comments (in hierarchical order), due to space limitations showing the complete hierarchy might not be possible.

That is why in many websites either the nested hierarchy is not shown or the depth of the hierarchy is limited to something like 3 or 4 levels. This makes following the discussions very difficult and time consuming. Here, we try to automatically reconstruct the thread structure (RTS).

We try to build a semantic tree whose nodes (except the root node) are comments and whose edges specify which comment is in reply to which comment. The root of the tree is the main article.

Blogs and online news agencies are very important for the opinions they receive from visitors. Also, since they cover various events of our social life and many people with different views participate in the arguments, they are a valuable source of information for researchers as well.

We divide discussion threads into interrogative and declarative threads. Discussion threads in which users write a question and other users try to answer it, such as forums, chats, and emails [2–7], are interrogative threads. Threads like blog comments and comments that discuss news articles do not fit a question-answer format. We call them declarative.

Valuable studies have been done on RTS in interrogative threads [2–7]. However, due to the difference between these



threads, they are not effective enough on declarative threads. In this paper, we focus on declarative threads and try to devise an effective RTS method for them. In summary, the RTS task that we address here is suitable for the following websites:

(i) websites that show comments in a list-based structure in chronological order; this way of showing comments is prevalent [2–4], for example, Reuters (http://www.reuters.com) and ABC NEWS (http://www.abcnews.com), which are listed, respectively, as the 301st and 462nd most popular websites globally based on Alexa's traffic rank (Alexa, The Web Information Company, http://www.alexa.com);

(ii) blog service providers who do not support a hierarchical structure for their comments, such as Blogfa (http://www.blogfa.com), which is listed as the 137th most popular website globally and the 3rd most popular website in Iran after Google and Yahoo;

(iii) websites that cannot support a tree-like reply structure with more than a few levels due to the limitation of space on pages, such as Facebook, Fin24 (http://www.fin24.com), and Skysport (http://www.skysports.com);

(iv) websites that have been created with content management systems that do not support nested comments, for example, those whose site templates are based on an old version of WordPress (before 2009) and have not been changed.

RTS can be beneficial in many applications. To name a few, it is useful for facilitating search and finding of the user's favorite content in a large volume of comments and for improving retrieval accuracy [2, 8, 9]; identifying users who have the ability to answer questions [10]; isolating discussions related to specific subtopics [11]; understanding online users' behavior [12]; facilitating the following of the actual conversation stream in threads [4]; conversation summarization, which includes both the initiation and the response as a coherent unit [13]; automatic question and answer detection [13, 14]; and finding the reply relations among comments to be used in other tasks like topic detection.

Another benefit of RTS is in topic detection on comments. Comments are usually short, and this makes the topic detection task quite difficult. The topic of a comment is usually similar or related to the topic of its parent and its children. So knowing the hierarchy can provide extra information to enhance topic detection. This can be considered as a sort of word expansion.

In this work, we propose a method to automatically reconstruct the thread structure and organize comments into a tree-like structure by considering information about authors, content, and the date and time of posts. A set of relevant textual and nontextual features is defined. Then, a learning algorithm based on the ranking SVM model is used to learn a proper model that is exploited to identify the reply relation between a root and a set of comments. In other words, a set of comments is fed into the trained model to determine whether there is any relation between them or not. The proposed RTS method is called SLARTS (a Supervised Learning Approach to Reconstruct Thread Structure). We combine our knowledge and techniques from the Information Retrieval, Natural Language Processing, Machine Learning, and Social Network disciplines.

The main contributions of this work are as follows:

(i) The focus of this paper is the RTS task on declarative threads in blogs, online news agencies, and news websites, on which few studies have been carried out.

(ii) We describe and show the differences between declarative and interrogative threads. To better illustrate the differences, we use the Apple discussion forum and compare it with the declarative datasets.

(iii) We propose a supervised approach to the RTS task, namely SLARTS, based on the ranking SVM model and using novel textual and nontextual features which are related to declarative threads.

(iv) The proposed method is tested on 5 datasets in different languages: one in Persian, one in Russian, and three in English. In three of these websites, the comments are confirmed by moderators. Some of these websites present comments as user posts. All datasets contain many reply structures among comments. The reply structures of comments in the ground truth come from the real structures created by users, and they are marked by reply tags in our datasets.

(v) In order to better evaluate our method, some evaluation metrics proposed in previous works have been modified, and an evaluation metric appropriate for declarative threads is proposed for the RTS task.

(vi) The evaluation results reveal higher accuracy in comparison with the baseline methods on all datasets.

This paper is an extension of the work we published in our conference paper [15]. We have defined new textual and nontextual features to improve the accuracy of the proposed method. We also evaluate our method on more datasets and with new accuracy measures.

The rest of this paper is organized as follows. In Section 2 we describe related work. It is followed by the problem definition in Section 3. We explain the proposed method and the features used for the RTS task in Section 4. In Section 5, the experiments, datasets, evaluation metrics, and experimental results are described. Finally, we discuss the results and conclude with the valuable information which can be extracted by visualizing the reply relation structure.

2. Related Work

In this section, we cover previous works on the RTS task and then investigate various issues around threads in online platforms.

2.1. Thread Detection. The thread detection task, which is sometimes called topic detection, should be accomplished as preprocessing for the RTS task. In this task, all comments are split into a number of threads. After that, RTS could


be exploited to discover the tree structure of the threads [5, 10, 16].

The thread detection task is not necessary for declarative threads, since usually all comments are related to the root. Based on this, all comments can be considered as belonging to just one thread. Consequently, we focus only on the RTS task in this study.

2.2. RTS Task. There are very few studies in the literature that directly address the problem of the RTS task for comments in declarative threads. In general, the RTS task can be done either in a supervised or an unsupervised manner. In unsupervised methods, the relations between comments are weighted by using text similarity measures. Then they are adjusted using other metrics such as time distance and position in chronological order. Finally, the relations whose weights are higher than a predefined threshold are selected as the parent-child relations [11, 17]. Lin et al. [18] proposed a method which computes the similarity between each post and the previous posts to find candidate parents. Then the post with the highest score is chosen as the parent candidate. If the parent candidate is not similar enough to the child comment, the candidate parent is assumed to start a new discussion branch of the thread.

In supervised methods, the existence of a relation between two messages is determined using a supervised learning algorithm [2, 4, 19]. In these methods, a set of features is defined and weighted using a training set. Then the trained model is used to discover the comment relations in the test data using the extracted features.

The methods proposed in [4, 19] are supervised, and a set of simple features and a classifier are used in them. The features are divided into two groups. The first group is called structural or nontextual features, such as time information, reply distance, and author's name. The second group of features is called semantic or textual features, such as sentence types and similarity among comments.

Seo et al. [2] proposed a learning technique that exploits the hierarchical structure of replies in forums or emails. They introduced a structure discovery technique that uses a variety of features to model the relations among posts of the same thread. In fact, their method is the most similar to ours. However, we focus on blogs and news agencies, while they have worked on forums and emails. The existence of quoted text in forums and emails makes the RTS task easier than in our case.

Wang et al. [3] proposed a probabilistic model, cast as a supervised structure learning problem, to predict the dependency among the posts within one thread on forums, based on general conditional random fields. Their method is based on various kinds of features. The features describe the interactions of both posts and authors. The weights of the designed features are estimated in a supervised manner. Similar to the previous work, Wang et al. [20] proposed a discriminative probabilistic model which can handle both local features and knowledge propagation rules.

The only existing work on the RTS task in declarative threads has been proposed by Schuth et al. [19]. This work focused on the RTS task in online news agency threads. They used several features to detect authors' names in the comments' text.

Then the features are combined by a tree-learner algorithm, and eventually a classifier detects relations among comments and the root. Since, in our case, there are many comments that do not refer to any author's name, this method does not achieve good accuracy.

There are also some related works on other types of data, such as email data [6, 7, 21]. However, some specific features exist in those environments that are not applicable here. For instance, extra information about a message's recipients, like the "To/CC" tag in email data, or information about the affiliation of the message's author, like a signature, is exploited to improve the reconstruction of conversation threads in email data.

2.3. Thread Structure Analysis. Some works have been done on the analysis of the social media emerging from user comment activity [22, 23]. Message boards such as Slashdot and Reddit frequently publish short news posts and allow their readers to comment on them. The works proposed in [24, 25] focused on these threads. They explored the structure and topical hierarchies of comment threads to gain a deeper understanding of the users' behaviour, which allows these types of user-powered websites to operate better. Laniado et al. [26] analyzed the structural properties of the threads on Wikipedia pages to extract and study different kinds of interactions. Wu et al. [1] introduced a model to explain the human view and reply behaviors in forums, which is helpful for discovering collective patterns of human behavior. They found that the view and reply behaviours have the form of a power-law distribution [1, 27].

2.4. Detection of Initiation-Response Pairs. The works proposed in [13, 14] focused on the detection of initiation-response pairs, such as question-answer and assessment-agreement relationships. Initiation-response pairs are pairs of utterances in which the first pair part sets up an expectation for the second pair part [13]. Wang et al. [14] introduced a list of dialogue act labels for edges. A dialogue act label, such as answer-answer or answer-question, is assigned to each relation between two messages. Kim et al. [28] proposed a dialogue act tag set and a method for annotating post-to-post discourse structure on forums. They used three feature sets (structural features, post context features, and semantic features), and they experimented with three discriminative learners: SVM, HMM, and maximum entropy. Andreas et al. [29] introduced a new corpus of sentence-level agreement and disagreement annotations. Two sentences are in agreement if they express the same fact or opinion.

A list of dialogue act labels and an approach for modeling dialogue acts have also been proposed for conversational speech [30–32]. Detection of dialogue act labels for each post is suitable for thread detection [5, 16] and for finding relevant answers in forums [8].

2.5. Automatic Meta-Information Extraction from HTML Pages. Information of threads, such as authors' names and content, is usually extracted by humans. This is time consuming. Some methods have been proposed to extract the main content and remove noisy information from a web page


automatically [33]. Hu et al. [34] proposed an algorithm to extract all meta-information of threads from different kinds of forums without human interaction. The algorithm consists of two steps: thread extraction and detailed information extraction from thread pages.

3. Problem Definition

In this section, some concepts related to the RTS task are defined. A comment is an utterance written by a user, comprising one or several sentences. A commenter is a person who writes comments and replies to the root article or to other comments. In this paper, the root is defined as the starter of a conversation, which can be a news article or any other content written by an author or a journalist. Figure 1 shows a part of a real thread with 145 comments. In Figure 1, the root post is shown with label R, and edges denote the reply relations. In a thread, a reply comment (illustrated in Figure 1 by numbered labels) responds to previous comments or to the root item. For example, the nodes with labels 1 and 32 are reply comments to the root item, which is considered their parent. The sequence of labels in Figure 1 is in chronological order. A thread is a sequence of comments that starts with the root item and contains a series of reply comments which are usually related to the same topic as the root item. Each comment has a single parent, and all comments are descended from the root post in a thread [24, 29]. In other words, threads are considered a special case of human conversations [35] that consist of a set of comments contributed by two or more participants via the reply operation [36]. The candidate set of the ith comment is the set of comments that could be considered as the parent of the ith comment; it includes the comments which appear before the ith comment in chronological order. A starter discussion comment is a comment replying to the root item that has at least one child, like the comments with labels 1 and 32 in Figure 1.

The thread detection task means finding the clusters of comments that belong to the same topic in a given text stream without any previous knowledge about the number of threads. The reconstructing thread structure (RTS) task means reconstructing the reply structure of the comments in a thread. This leads to the construction of a tree-like reply structure [2–5] or a directed acyclic graph (DAG) [11, 17, 19]. Since the threads extracted from websites have a tree-like reply structure, they are usually modeled as a tree in most papers. The structure of threads can also be modeled as a DAG. The information about DAG-like reply structures is mostly prepared by manually annotating the data, which is a time consuming and difficult task. In addition, since manually annotated datasets contain a small number of threads [11, 19], they cannot properly evaluate RTS algorithms. In this paper, we assume a tree-like reply structure.
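To make these definitions concrete, the following is a minimal sketch (not taken from the paper; the class and function names are hypothetical) of how a thread and the candidate set of the ith comment can be represented, with the root at chronological position 0:

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Comment:
        position: int          # chronological position; the root is position 0
        author: str
        text: str
        parent: Optional[int]  # position of the replied-to item; None for the root

    def candidate_set(i: int) -> List[int]:
        """Candidate parents of the i-th comment: the root and every earlier comment."""
        return list(range(i))

    # A toy thread: the root article plus three comments.
    thread = [
        Comment(0, "journalist", "Main article ...", None),
        Comment(1, "UserA", "Reply to the article ...", 0),
        Comment(2, "UserB", "Reply to UserA ...", 1),
        Comment(3, "UserC", "Another reply to the article ...", 0),
    ]

    assert candidate_set(3) == [0, 1, 2]   # comments 1 and 2 plus the root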

Declarative and interrogative threads are different in essence:

(i) Users in declarative threads mostly express their opinions or sentiments about the root post informally, while in interrogative threads users mostly express their questions and answers in a more formal way.

(ii) In declarative threads, the topic of the root is most likely a news article or content reflecting real-world events, while in interrogative threads it is most likely a user's question.

(iii) There is meta-information, such as quotations, in interrogative threads that greatly improves the accuracy of the RTS task.

(iv) Comments in declarative threads usually wait for moderation to be published, and this usually takes some time. Moderators are not always online; they log in a few times per day, accept the submitted comments, and log out. Therefore, multiple comments appear at nearly the same time. So some features that are based on time distance [11] and position in chronological order have good performance in interrogative threads but do not work well in declarative threads.

Declarative and interrogative threads are technically different as well:

(i) In interrogative threads, each comment can most likely be connected to its previous comment (1-Distance) [4]; this simple heuristic leads to a great improvement in the accuracy of the RTS task [3, 4], since many posts are most likely written to answer the last question. However, this heuristic does not perform well in declarative threads. According to Figure 2(a), users most likely reply to the immediately preceding comment in the Apple discussion forum, whereas Figure 2(b) shows that users most likely reply to the root in the Thestandard online news agency. The 11th comment in Apple is the parent of the 12th comment with probability 0.49; however, in Thestandard this probability is equal to 0.137.

(ii) Figure 3 shows three real threads in the Thestandard dataset, each of which includes 30 comments; the comments are numbered and sorted in chronological order. Since users usually express their opinions or sentiments and reply to any comment regardless of its submission time and position, the structure of the replies is not predictable.

(iii) The length of the root's text in declarative threads is usually larger than that of the comments; thus the length of the root or comments should be normalized in similarity measurements.

There are websites known as message boards, such as Digg, Reddit, and Slashdot. Message boards are valuable threads to analyze users' behavior [24, 25]. However, they are not suitable to be used as training data, since they have different designs for showing the comments. These differences change the behavior of users' replies; for example, Slashdot (http://slashdot.org) shows comments based on their scores. This causes the comments which have higher scores to get more replies. So we cannot create a general model appropriate for all message boards.


Figure 1: A part of a real thread from Thestandard (http://www.thestandard.org.nz) online news website.

4. Method

In this section, we describe our supervised approach to the RTS task. First, we define some textual and nontextual features and learn a proper model to combine the features using a ranking SVM (Support Vector Machine). Then the model is employed in the reconstruction of the reply structure for the threads in the test data.

4.1. Ranking SVM. SVM is a supervised learning method that is used for classification, regression, and ranking.

The implementation we use is the ranking SVM classifier (http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html) [2, 37]. The ranking SVM classifier learns to assign a weight to pairs of comments. The whole procedure for choosing the parent of the ith comment in a thread is described in Figure 4.

Since the purpose of the RTS task is to predict a tree-like reply structure for a thread, the RTS algorithm needs to find N reply relations among the comments and the root, where N is the number of comments in the thread. Thus, the RTS algorithm needs O(N^2) pairwise comparisons.
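The per-comment parent selection summarized in Figure 4 can be sketched as follows. This is not the authors' implementation: feature_fn and weights are placeholders for the extracted pair features and the weights learned by the ranking SVM, and candidate 0 stands for the root.

    from typing import Callable, List, Sequence

    def choose_parent(i: int,
                      feature_fn: Callable[[int, int], Sequence[float]],
                      weights: Sequence[float]) -> int:
        """Return the predicted parent of comment i (0 denotes the root)."""
        best_j, best_score = 0, float("-inf")
        for j in range(i):                    # every earlier item is a candidate
            # linear kernel: dot product of the learned weights and the pair features
            score = sum(w * x for w, x in zip(weights, feature_fn(i, j)))
            if score > best_score:
                best_j, best_score = j, score
        return best_j

    def reconstruct_thread(n_comments: int,
                           feature_fn: Callable[[int, int], Sequence[float]],
                           weights: Sequence[float]) -> List[int]:
        """Predict a parent for every comment 1..n_comments; O(N^2) pair scores in total."""
        return [choose_parent(i, feature_fn, weights) for i in range(1, n_comments + 1)]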


Figure 2: The probability of child-parent reply relations: (a) Apple discussion forum and (b) Thestandard online news agency.

Figure 3: Three real threads whose node labels are in chronological order.

4.2. Features. The ranking SVM needs some features to be defined. In this section, we introduce eleven textual and nontextual features. Their definition is one of the main contributions of our work. It should be mentioned that about 20 percent of the replies in the ground truth do not have any common word with their parent. The textual features, therefore, are not enough, and we need to define some nontextual features as well.

4.2.1. Similarity Measurement. In order to measure the similarity of sentences, we utilize a vector space model to


Figure 4: The RTS algorithm. For each comment i, every candidate j (the root and comments 1 to i - 1) is scored with the learned feature weights via a linear kernel (A[j] += model.weight[k] × feature[j][k]), and the candidate with the highest score is returned as the parent.

represent the text body of comments. A comment's text can be considered as a vector of terms weighted by TF-IDF. TF (Term Frequency) is the number of times a word appears in a comment, and IDF (Inverse Document Frequency) is computed according to the following formula:

\[ \mathrm{IDF}(w) = \log\left(\frac{N}{n_w}\right), \tag{1} \]

where N is the total number of comments in a news item and n_w is the number of comments that contain the word w. IDF is usually extracted from the whole training data, but we have limited it to the set of comments in a thread. We believe this makes it more accurate, for example, when a word has a low frequency in the whole training data but a high frequency in a thread.

In order to measure the similarity between two comments, the stop-words are deleted first and then the words of the comments are stemmed by the Porter algorithm (for the English datasets). This step is not performed for the Russian and Persian datasets. The common words of the two comments are extracted and their weights are calculated based on TF-IDF. Then the final score is obtained by aggregating the weights of the common words according to (2). Since the root text is usually longer than the comments, the log of the product of their lengths is used in the denominator:

\[ \mathrm{Score}(C_1, C_2) = \sum_{w \in C\text{-}words} \left(1 + \log\left(\mathrm{TF}(w, C_1) \times \mathrm{TF}(w, C_2)\right)\right) \times \mathrm{IDF}(w) \times \left(\log\left(|C_1| \times |C_2|\right)\right)^{-1}, \tag{2} \]

where C-words is the set of words held in common between comments C_1 and C_2 and |C| is the length of the comment's text.

Comments are usually informal and contain typos. Some words in two comments might be the same, but due to spelling errors it is difficult to find this out. In order to solve this issue, we use the minimum edit distance (MED) algorithm. The minimum edit distance of two words is the minimum number of edit operations (insertion, deletion, substitution, and transposition) needed to transform one word into the other [14]. The costs of insertion, deletion, substitution, and transposition are 1, 1, 2, and 1, respectively.

Two words in different comments are considered common words if either they are exactly the same or they seem to be the same but contain some typos. In the latter case, if the length of the words is greater than five, their first two letters are the same, and their edit distance is lower than 4, the two words are considered common words. For example, the two words "Beautiful" and "Beuatiful" are considered common words.
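A minimal sketch of the similarity measure of formula (2) combined with the typo-tolerant common-word test described above. It assumes stop-word removal and stemming have already been applied, each comment is given as a list of tokens, and the natural logarithm is used; the helper names are hypothetical and this is not the authors' code.

    import math
    from collections import Counter

    def edit_distance(a: str, b: str) -> int:
        """MED with costs: insertion 1, deletion 1, substitution 2, transposition 1."""
        m, n = len(a), len(b)
        d = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            d[i][0] = i
        for j in range(n + 1):
            d[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 2
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + cost)   # substitution / match
                if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                    d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
        return d[m][n]

    def is_common(w1: str, w2: str) -> bool:
        """Common words: identical, or long with the same first two letters and MED < 4."""
        if w1 == w2:
            return True
        return (len(w1) > 5 and len(w2) > 5 and w1[:2] == w2[:2]
                and edit_distance(w1, w2) < 4)

    def similarity(c1: list, c2: list, thread: list) -> float:
        """Formula (2): TF-IDF weight of common words, normalised by comment lengths.
        c1, c2 and every element of `thread` are token lists; IDF is computed within
        the thread and, as a simplification, on the first word's surface form."""
        n_docs = len(thread)
        tf1, tf2 = Counter(c1), Counter(c2)
        score = 0.0
        for w1 in set(c1):
            for w2 in set(c2):
                if is_common(w1, w2):
                    n_w = sum(1 for c in thread if w1 in c)
                    idf = math.log(n_docs / n_w) if n_w else 0.0
                    score += (1 + math.log(tf1[w1] * tf2[w2])) * idf
        denom = math.log(len(c1) * len(c2)) if len(c1) * len(c2) > 1 else 1.0
        return score / denom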

4.2.2. Authors' Language Model. The idea is that commenters who talk to each other are more likely to use similar words in their comments. In order to take advantage of this feature, all comments that are related to a commenter are appended together. This makes a collection of words for each commenter. If the collections of words of two commenters are very similar, they can be related to each other. In Figure 5, commenter "A1" wrote three comments. These three comments are appended and make a collection of words for commenter "A1", and then, as for the first feature, the similarity is calculated between the collections of words of commenters "A1" and "A2". The similarity score obtained


between the two commenters "A1" and "A2" is considered as this feature's score for the relations between their comments.
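A sketch of how the authors' language model feature can be computed. The paper reuses the similarity measure of the first feature over each commenter's concatenated comments; a plain cosine over word counts is used here only to keep the example self-contained, and the names are hypothetical.

    import math
    from collections import Counter, defaultdict

    def author_profiles(comments):
        """Append all comments of each commenter into one bag of words.
        `comments` is an iterable of (author, tokens) pairs."""
        profiles = defaultdict(Counter)
        for author, tokens in comments:
            profiles[author].update(tokens)
        return profiles

    def cosine(bag1: Counter, bag2: Counter) -> float:
        """Cosine similarity between two bags of words."""
        num = sum(bag1[w] * bag2[w] for w in set(bag1) & set(bag2))
        den = math.sqrt(sum(v * v for v in bag1.values())) * \
              math.sqrt(sum(v * v for v in bag2.values()))
        return num / den if den else 0.0

    comments = [
        ("A1", ["tax", "jobs", "economy"]),
        ("A2", ["jobs", "wages", "economy"]),
        ("A1", ["jobs", "unemployment"]),
    ]
    profiles = author_profiles(comments)
    # Feature score for any pair of comments written by A1 and A2:
    print(cosine(profiles["A1"], profiles["A2"]))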

4.2.3. Prior Location. The position of comments can reveal some information about their hierarchy; for example, the first comments usually have more children than the others, or the comments located just before the ith comment are more likely to be its parent. In general, we would like to estimate p(i | j), that is, knowing that a comment is in position j, what the likelihood is that the comment in position i is its parent [2]. So we calculate prior probabilities of being the parent for the different positions.

To calculate the prior probability for j, we count the number of times a comment in position i is the parent of a comment in position j in the training set. Figure 6 shows the prior probabilities for comments in positions 1 to 100 in the Thestandard dataset. The highest prior probability belongs to the root and then to the comments located just before the comment. The sample points in Figure 6 show five parent positions (the root, 10, 30, 57, and 59) and how probable it is for the 60th comment to be their child: the root has the highest prior probability, equal to 0.1852, followed by the 59th comment, with a probability of 0.1236. Also, Figure 7 shows the prior probability of child-parent relations from comment 40 to 60.
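A sketch of how the prior-location probabilities can be estimated from the training set. Threads are assumed here to be given as lists of parent positions (0 denoting the root), which is not the paper's data format but is enough to illustrate the counting; the names are hypothetical.

    from collections import defaultdict

    def prior_location(threads):
        """Estimate p(parent position i | child position j) from training threads.
        Each thread is a list `parents` whose jth entry (1-based child position)
        is the chronological position of that comment's parent (0 = root)."""
        counts = defaultdict(lambda: defaultdict(int))
        for parents in threads:
            for j, i in enumerate(parents, start=1):
                counts[j][i] += 1
        prior = {}
        for j, row in counts.items():
            total = sum(row.values())
            prior[j] = {i: c / total for i, c in row.items()}
        return prior

    # Toy training set: two threads given as parent-position lists.
    threads = [
        [0, 1, 0, 3],        # thread 1: comments 1..4
        [0, 0, 2, 0],        # thread 2: comments 1..4
    ]
    prior = prior_location(threads)
    print(prior[4])          # how likely each earlier position is to parent comment 4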

4.2.4. Reference to Authors' Name. If the name of an author appears in a comment, then all his/her comments are considered as potential candidates for being the parent of that comment [19]. Sometimes only a part of the complete name is referenced. Sometimes the author's name is made up of two parts, and both of these parts could be used by other authors for reference. We also consider these types of references. We keep each part of the author's name, and the parts which are stop-words are removed.
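A sketch of the author-name reference check: every candidate whose author's name (or any non-stop-word part of it) occurs in the comment's text is kept as a potential parent. The function and variable names are hypothetical.

    def name_reference_candidates(comment_text, authors_of_candidates, stop_words):
        """Return indices of candidates whose author's name is mentioned in the text.
        `authors_of_candidates` maps candidate index -> author name string."""
        tokens = set(comment_text.lower().split())
        referenced = set()
        for idx, author in authors_of_candidates.items():
            parts = [p for p in author.lower().split() if p not in stop_words]
            if any(part in tokens for part in parts):
                referenced.add(idx)
        return referenced

    candidates = {1: "User A", 2: "John Smith"}
    print(name_reference_candidates("I agree with smith on this", candidates, {"a", "user"}))
    # -> {2}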

4.2.5. Global Similarity. This feature is based on ranking the similarity of the comments' text and the root's text, which gives a global view of the text similarity. According to this feature, if comment "A" has the highest similarity with comment "B" and, inversely, comment "B" has the highest similarity with comment "A" among the other candidates, it is more likely that there is a relation between comment "A" and comment "B". To relax this feature, the similarity measurement is calculated for each comment with respect to all comments, and then the comments are sorted based on their similarity scores. For example, in Figure 8 we are looking for the parent of the fifth comment. In the first step, according to formula (2), the comments are sorted based on their text similarity score with the fifth comment. Comments that do not belong to the candidate set of the fifth comment are removed (the removed comments are shown in black in Figure 8). In the second step, in the same way as the first step, the comments are sorted per each candidate of the fifth comment, and again comments which do not belong to the candidate set of the fifth comment are removed, except the fifth comment itself. Finally, formula (3) is used to calculate the Ranking-distance score. Two comments which are the most similar to each other have a higher Ranking-distance score. In Figure 8, the highest score belongs to the relation with the fourth comment. This feature is symmetric, and the similarity among comments is only calculated once. In other words, this feature needs O(N^2) time complexity for the text similarity pairwise comparisons:

\[ \text{Ranking-distance}(C_1, C_2) = \left(|C_1 \,\&\, C_2| + |C_2 \,\&\, C_1|\right)^{-1}, \tag{3} \]

where C_1 and C_2 are two comments and |C_1 & C_2| is the text similarity distance between C_1 and C_2.
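A sketch of the Ranking-distance computation of formula (3), assuming a precomputed symmetric similarity table sim[a][b] (formula (2)) and that the candidates of comment i are 0..i-1 (0 being the root). The filtering of non-candidates is implemented here in a simplified form, so this is only an approximation of the feature as described.

    def ranking_distance(sim, i: int, j: int) -> float:
        """1 / (rank of j among i's candidates + rank of i in j's list)."""
        # rank of candidate j among the candidates of i, sorted by similarity to i
        cands_i = sorted(range(i), key=lambda c: sim[i][c], reverse=True)
        rank_j = cands_i.index(j) + 1
        # rank of i among (the candidates of i, minus j, plus i), sorted by similarity to j
        others = [c for c in range(i) if c != j] + [i]
        cands_j = sorted(others, key=lambda c: sim[j][c], reverse=True)
        rank_i = cands_j.index(i) + 1
        return 1.0 / (rank_j + rank_i)

    # Toy symmetric similarity table for a root (0) and five comments (1..5).
    sim = [[0, 1, 2, 1, 3, 2],
           [1, 0, 1, 2, 1, 1],
           [2, 1, 0, 1, 2, 2],
           [1, 2, 1, 0, 1, 1],
           [3, 1, 2, 1, 0, 4],
           [2, 1, 2, 1, 4, 0]]
    print(ranking_distance(sim, 5, 4))   # the fourth comment is the closest candidate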

4.2.6. Frequent Patterns of Author Appearance. The idea is that the comments of two commenters who talk to each other usually appear close together in chronological order. So if their comments appear many times just after each other, this feature gives a high score to their comments. In order to implement this feature, we use the following formula:

\[ \mathrm{FPScore}(i, j) = f(A_i, A_j) + f(A_j, A_i), \tag{4} \]

where A_i is the author of comment i and f(a, b) is the number of times that comments of author a appear just before comments of author b (see Pseudocode 1).

Figure 9 shows a timeline which includes 7 comments written by 4 commenters. The feature score is calculated for the relations among the comments. The score of the relation between the comments of A and B is 3, which is more than that of A and C.

4.2.7. Length Ratio. The length of the parent's text is usually longer than that of its children. The length ratio score is calculated according to

\[ \text{length ratio score}(C_C, C_P) = \frac{|C_P|}{|C_C|}, \tag{5} \]

where C_C is a comment looking for a parent and C_P is a candidate parent.

4.2.8. Frequent Words in Parent-Child Comments. Sometimes a pair of words appears frequently, with one word in the parent and the other word in its children. For example, in the ENENews dataset, the most frequent pairs are (believe, people), (accident, fukushima), (new, discovered), (idea, report), and (people, public). We use pointwise mutual information (PMI) [38] to find the frequent patterns:

\[ \mathrm{Score}(W_1, W_2) = \frac{\mathrm{Count}(W_1, W_2)}{\mathrm{Count}(W_1) \times \mathrm{Count}(W_2)}, \tag{6} \]

where W_1 is a word in the comment whose parent we are looking for, W_2 is a word in its candidate parent, and Count(W_1, W_2) is the number of times W_1 has appeared in the parent and W_2 has appeared in the child. The numerator computes how often the two words appear together, and the denominator computes how often each of the words appears. Finally, according to Pseudocode 2, the score of the relation between two comments is calculated.
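A sketch of how the word-pair scores of formula (6) can be built from training parent-child pairs and then used, as in Pseudocode 2, to score a candidate relation. Counting each word pair once per training pair is a simplification, and all names are hypothetical.

    from collections import Counter
    from itertools import product

    def train_word_pair_scores(child_parent_pairs):
        """Build Score[(W1, W2)] of formula (6) from (child_tokens, parent_tokens) pairs."""
        pair_count, word_count = Counter(), Counter()
        for child, parent in child_parent_pairs:
            for w1, w2 in product(set(child), set(parent)):
                pair_count[(w1, w2)] += 1
            word_count.update(set(child))
            word_count.update(set(parent))
        return {pair: c / (word_count[pair[0]] * word_count[pair[1]])
                for pair, c in pair_count.items()}

    def frequent_words_score(comment, candidate, scores):
        """Pseudocode 2: sum the pair scores over all word pairs of the two comments."""
        return sum(scores.get((w1, w2), 0.0) for w1 in comment for w2 in candidate)

    pairs = [(["accident", "report"], ["fukushima", "news"]),
             (["believe", "report"], ["people", "idea"])]
    scores = train_word_pair_scores(pairs)
    print(frequent_words_score(["accident"], ["fukushima"], scores))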


Figure 5: The authors' language model feature based on author "A1" and author "A2".

Figure 6: The prior probability of child-parent relations for comments 1 to 100 in Thestandard.

Figure 7: The prior probability of child-parent relations for comments 40 to 60.

Figure 8: An example of the global similarity feature.


Step 1:
For i = 1 to Comments - 1
    if Author(i) ≠ Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments                      ⊳ for each comment which is looking for its parent
    For j = 0 to i - 1                     ⊳ for each candidate
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Pseudocode to calculate frequent patterns of author appearance.

Figure 9: An example of the author appearance sequence.

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates (a short sketch of these rules follows the list):

(1) Usually a commenter does not reply to the root in his/her second comment. So if a commenter has written more than one comment, the root is removed from the parent candidates of the second and subsequent comments. This was shown to be a useful heuristic because the root is an important candidate, so if we can remove it correctly, the results improve significantly.

(2) A commenter does not reply to himself/herself. So we simply remove all comments of a commenter from the candidates of his/her own comments [2].

(3) Commenters who post only one comment in a thread are more likely to reply to the root post. So the other candidates can be removed, except the root.
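The three heuristics can be combined into a single filtering step over the candidate list, as in the sketch below. This is illustrative only, not the authors' code: authors[k] gives the author of the comment at position k (position 0 is the root), and num_comments_by_author counts the comments each commenter wrote in the thread.

    def filter_candidates(i, authors, num_comments_by_author):
        """Apply the three candidate-filtering heuristics for comment i."""
        candidates = list(range(i))            # the root (0) and all earlier comments
        author_i = authors[i]

        # (3) a commenter with a single comment most likely replies to the root
        if num_comments_by_author.get(author_i, 0) == 1:
            return [0]

        # (1) the second and later comments of a commenter rarely reply to the root
        if any(authors[k] == author_i for k in range(1, i)):
            candidates = [c for c in candidates if c != 0]

        # (2) a commenter does not reply to him/herself
        candidates = [c for c in candidates if c == 0 or authors[c] != author_i]

        # fall back to all earlier items if every candidate was filtered out
        return candidates or list(range(i))

    authors = ["journalist", "UserA", "UserB", "UserA"]   # position 0 is the root
    print(filter_candidates(3, authors, {"UserA": 2, "UserB": 1}))   # -> [2]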

5. Experiments

In this section, we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies established in August 2007 and in 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English, and Alef is in Persian. Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively, established in 2003 and 2012.

The reasons for selecting the mentioned websites are as follows: (1) they support multilevel replies; (2) their users are active, and articles usually receive many comments; (3) the author and time of comments are available; (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each Word1 in comment
        For each Word2 in candidate-comment
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: A pseudocode for calculating the frequent-words score between two comments.

Table 1: The dataset-specific properties.

Datasets        Language    Confirmation by moderator    Users register
Thestandard     English     Yes                          No
Alef            Persian     Yes                          No
ENENews         English     No                           Yes
Russianblog     Russian     No                           Yes
Courantblogs    English     Yes                          No

For each website, we crawled the webpages that were published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as the ground truth. We removed the threads with fewer than 4 comments, because these kinds of threads do not give us much information and usually their comments reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets, users can leave comments in a thread with different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The length of comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews, the root usually includes a tweet, which is why the length of the root's text is shorter than in the other datasets. All comments have an author's name, except for some comments in Alef. Therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments per article. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard, threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape; articles are published between 7 a.m. and 7 p.m., and the highest rate belongs to the 4-hour period between 9 a.m. and 1 p.m. Since there is only one

author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10 a.m. and 6 p.m.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog have similar behavior: about one hour after the article is published, it has received most of its comments. After that, the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16–21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after the publication of the main article. Deeper comments indicate longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations continue around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10–20 words. Except for the comments of Russianblog, the datasets are similar in this respect. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has comments that are shorter than in the other datasets; this dataset is a personal blog, and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (directly below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning their conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results, we use several metrics. For edge prediction, we use the precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree. This causes the precision, recall, and F-score values to be equal [2], since FP (False Positive) and FN (False Negative)


Table 2: Datasets' statistics split up per website.

Datasets        Avg. comment length (words)    Avg. root length (words)    Comments    Avg. comments per news    News    Commenters
Thestandard     72.98     357.68    149058    49.81    2992    3431
Alef            74.66     766.55    23210     27.96    830     4387
ENENews         68.19     162.64    16661     38.92    428     831
Courantblogs    79.24     398.17    4952      19.19    258     1716
Russianblog     18.56     237.16    30877     16.18    1908    1258

Figure 10: Distribution of the number of comments per news article.

Figure 11: Publication rate of articles per hour.

Figure 12: Average number of published comments per hour after the root.


Figure 13: Time difference between a node and its parent, per hour.

Figure 14: Depth of comments averaged over the elapsed time from publication of the root.

in precision and recall are always equal. Instead, we use the following edge accuracy measure:

\[ \text{Accuracy (Acc)}_{\mathrm{edge}} = \frac{\left|\text{reply relations} \cap \text{detected relations}\right|}{N}, \tag{7} \]

where the tree is comprised of N comments and one root. The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers paths from nodes to the root:

\[ \text{Accuracy (Acc)}_{\mathrm{path}} = \frac{\sum_{i=1}^{|C|} \left|\mathrm{path}_G(i) = \mathrm{path}_P(i)\right|}{|C|}, \tag{8} \]

where path_G(i) is the ground-truth structure for the ith comment, path_P(i) is the predicted structure for it, and |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it to be completely wrong, so it is very strict. To relax this, Wang et al. introduced a metric that computes the overlap between the path in the ground truth and the predicted path:

\[ \text{Precision } (P)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left(\left|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)\right| / \left|\mathrm{path}_P(i)\right|\right)}{|C|}, \]
\[ \text{Recall } (R)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left(\left|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)\right| / \left|\mathrm{path}_G(i)\right|\right)}{|C|}, \tag{9} \]

where |path_P(i)| is the number of comments in the predicted path of the ith comment and |path_G(i)| is the number of comments in the ground-truth path of the ith comment. The F_1-score is the harmonic mean of precision and recall:

\[ F_1\text{-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \tag{10} \]
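A sketch of how the edge and path metrics of (7)-(10) can be computed, assuming a thread is given as an array of parent positions in chronological order (position 0 is the root); the function names are hypothetical.

    def edge_accuracy(gold_parent, pred_parent):
        """Formula (7): fraction of comments whose predicted parent is correct."""
        n = len(gold_parent) - 1                 # number of comments, excluding the root
        return sum(gold_parent[i] == pred_parent[i] for i in range(1, n + 1)) / n

    def path_to_root(parent, i):
        """Positions on the path from comment i up to (but excluding) the root."""
        path = []
        while i != 0:
            path.append(i)
            i = parent[i]
        return path

    def path_metrics(gold_parent, pred_parent):
        """Formulas (8)-(10): strict path accuracy and relaxed path precision/recall/F1."""
        n = len(gold_parent) - 1
        acc = prec = rec = 0.0
        for i in range(1, n + 1):
            g_path = path_to_root(gold_parent, i)
            p_path = path_to_root(pred_parent, i)
            acc += 1.0 if g_path == p_path else 0.0
            overlap = len(set(g_path) & set(p_path))
            prec += overlap / len(p_path)
            rec += overlap / len(g_path)
        acc, prec, rec = acc / n, prec / n, rec / n
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        return acc, prec, rec, f1

    gold = [0, 0, 1, 1, 0]    # parent positions; entry 0 is a placeholder for the root
    pred = [0, 0, 1, 2, 0]
    print(edge_accuracy(gold, pred), path_metrics(gold, pred))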

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or main content, which is different from the root of interrogative threads. This causes the structure of threads and reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words.

Figure 16: Histogram of the depth of comments.

(1) comment-root, that is, the relation between comments and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views around the main article (root). The comment-comment relations show conversations among visitors, which are a valuable source of information. In Figure 17, there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to detect both types of relations. So we have to modify the evaluation metrics or define new ones.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comments Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

\[ \text{Accuracy (Acc)}_{\mathrm{CTD}} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \tag{11} \]

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

\[ \text{Precision } (P)_{\mathrm{edge}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \]
\[ \text{Recall } (R)_{\mathrm{edge}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \]
\[ F_1\text{-score } (F)_{\mathrm{edge}} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \tag{12} \]

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of


Figure 17: An illustration of a declarative thread, showing comment-root relations, comment-comment relations, and starter discussion comments.

comment-comment relations in the ground truth that were not predicted at all.

The path accuracy metric mentioned earlier is a mod-ified version of 119875path and 119877path which is appropriate todeclarative platform This metric consider discussion pathsfrom each PD comment to the starter discussion commentbut not to root

$$\text{Precision } (P)_{\text{path}} = \begin{cases} 1 & \text{if } |C|_P = 0 \\[4pt] \dfrac{\sum_{i=1}^{|C|_P} \left|\,\text{path}_G(i) = \text{path}_P(i)\,\right|}{|C|_P} & \text{otherwise} \end{cases}$$

$$\text{Recall } (R)_{\text{path}} = \begin{cases} 1 & \text{if } |C|_R = 0 \\[4pt] \dfrac{\sum_{i=1}^{|C|_R} \left|\,\text{path}_G(i) = \text{path}_P(i)\,\right|}{|C|_R} & \text{otherwise} \end{cases} \quad (13)$$

where $|C|_R$ is the number of PD comments in the ground-truth thread and $|C|_P$ is the number of PD comments in the predicted thread. $\text{path}_G(i)$ is the discussion path from the $i$-th node to its discussion starter comment in the ground truth, and $\text{path}_P(i)$ is the discussion path from the $i$-th node to its discussion starter comment in the predicted thread.

Also, the relaxed $\text{Precision } (P)_{R\text{-path}}$ and $\text{Recall } (R)_{R\text{-path}}$ are modified to be suitable for the declarative platform:

$$\text{Precision } (P)_{R\text{-path}} = \begin{cases} 1 & \text{if } |C|_P = 0 \\[4pt] \dfrac{\sum_{i=1}^{|C|_P} \left( \left|\text{path}_G(i) \cap \text{path}_P(i)\right| / \left|\text{path}_P(i)\right| \right)}{|C|_P} & \text{otherwise} \end{cases}$$

$$\text{Recall } (R)_{R\text{-path}} = \begin{cases} 1 & \text{if } |C|_R = 0 \\[4pt] \dfrac{\sum_{i=1}^{|C|_R} \left( \left|\text{path}_G(i) \cap \text{path}_P(i)\right| / \left|\text{path}_G(i)\right| \right)}{|C|_R} & \text{otherwise} \end{cases} \quad (14)$$

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3 the predicted thread of Figure 18 has been evaluated with the metrics from interrogative threads; the results show high values. Table 4 shows the results of evaluation with the metrics from declarative threads; these results show that the declarative metrics evaluate the prediction more appropriately.

There are two reasons why the declarative metrics better evaluate the predicted structure: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics still show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance when we simply link every comment to its previous comment (we name it the Last baseline) [3, 4]; this method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).
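For reference, both baselines can be written in a few lines; this sketch assumes comments are given in chronological order and uses "R" as the root label, mirroring the assumptions of the earlier sketches.

```python
def first_baseline(comments):
    """First baseline: link every comment to the root."""
    return {c: "R" for c in comments}

def last_baseline(comments):
    """Last baseline: link every comment to the immediately preceding
    comment; the first comment is linked to the root."""
    return {c: (comments[i - 1] if i > 0 else "R")
            for i, c in enumerate(comments)}
```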

The only work that has been done on RTS in declarative threads focused on the author's name in online news agencies [19]. Schuth et al.'s method used several features to find the commenter's name in online news articles. This method achieves good precision, but it has low recall, because many comments do not refer to any author's name. So we selected Seo et al.'s method [2], which is a strong method on interrogative platforms.

Seo et al.'s method focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets. Therefore, we use all their proposed features except the quoted text.

We have applied 5-fold cross-validation to minimize bias, reporting 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which lots of replies are connected to the root.

According to $P_{R\text{-path}}$ and $R_{R\text{-path}}$, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects every comment to its previous comment, which causes all comments to appear in the path.


Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Accuracy_edge:
  Reply relations (ground truth) = 1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5
  Detected relations = 1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7
  Accuracy_edge = 8/11 ≈ 0.73

Acc_path:
  |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / number of comments = 8/11 ≈ 0.73

P_R-path:
  [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / number of comments
  = [(1+1+1+1+1+1+1+1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11 ≈ 0.92

R_R-path:
  [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / number of comments
  = [(1+1+1+1+1+1+1+1) + (2/3 + 1 + 2/3)] / 11 = 10.34/11 ≈ 0.94

F1-score_R-path:
  2 × 0.92 × 0.94 / (0.92 + 0.94) ≈ 0.93


Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread (the corresponding reply relations are those listed in the Accuracy_edge row of Table 3).

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Acc_CTD:
  TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}
  Acc_CTD = (5 + 5) / (5 + 5 + 1 + 0) ≈ 0.91

P_edge, R_edge, F1_edge:
  TP = {3→1, 5→2, 6→5}; FP = {4→1, 7→2, 8→7}; FN = {4→3, 8→5}
  P_edge = 3/(3 + 3) = 0.5; R_edge = 3/(3 + 2) = 0.6; F1-score_edge ≈ 0.55

P_path, R_path, F1_path:
  P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5
  R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6
  F1-score_path ≈ 0.55

P_R-path, R_R-path, F1_R-path:
  P_R-path = (3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2) / |{3, 4, 5, 6, 7, 8}| = (1 + 1 + 1 + 1 + 0 + 1/2)/6 = 4.5/6 = 0.75
  R_R-path = (3→1, 4→3→1, 5→2, 6→5→2, 8→5→2) / |{3, 4, 5, 6, 8}| = (1 + 1/2 + 1 + 1 + 1/2)/5 = 4/5 = 0.8
  F1-score_R-path = 2 × 0.75 × 0.8 / (0.75 + 0.8) ≈ 0.77

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef, because more comments are linked directly to the root. SLARTS and Seo et al.'s methods usually cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads: based on these metrics, First shows high performance although this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for $F_{\text{edge}}$, SLARTS performs better than Seo et al.'s method. For $P_{\text{edge}}$, Seo et al.'s method performs better, but its $R_{\text{edge}}$ is lower than that of SLARTS in all datasets.

It is important to note that when the difference between $P_{\text{edge}}$ and $R_{\text{edge}}$ is high and $P_{\text{edge}}$ is greater than $R_{\text{edge}}$, the method most likely connects comments to the root and does not appropriately detect comment-comment relations (like the First baseline).


Figure 19: Three real threads in which nodes are labeled with the commenter's name and the index of that commenter's comment, for example, D(3) is user D's third comment; panels (a), (b), and (c) each show one thread.

On the other hand, when $R_{\text{edge}}$ is greater than $P_{\text{edge}}$, the method does not appropriately detect comment-root relations (like the Last baseline).

So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. In contrast, the $P_{\text{edge}}$ and $R_{\text{edge}}$ of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than those of Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex than in the other datasets (Figure 16). According to $F_{\text{edge}}$, the accuracy on Thestandard and ENENews is lower than on the other datasets. Note that $F_{\text{edge}}$ is a strict metric for declarative threads.

SLARTS has a better $F_{\text{edge}}$ than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets many of the defined features perform well (the importance of features in the ENENews and Russianblog datasets will be shown in Table 7 and explained further in the next section).


Table 5: Experimental results with interrogative evaluation metrics.

Dataset        Method            Acc_edge        Acc_path        P_R-path        R_R-path        F1_R-path
Thestandard    SLARTS            0.4669±0.009    0.3059±0.013    0.6927±0.018    0.7450±0.010    0.7179±0.013
               Seo et al.'s [2]  0.3941±0.005    0.3413±0.006    0.8303±0.009    0.6563±0.005    0.7331±0.003
               First             0.3630±0.010    0.3630±0.010    1.0000±0.000    0.5846±0.009    0.7379±0.007
               Last              0.1824±0.006    0.0495±0.004    0.2214±0.008    1.0000±0.000    0.3625±0.010
ENENews        SLARTS            0.4837±0.018    0.3695±0.027    0.7625±0.026    0.7770±0.013    0.7697±0.017
               Seo et al.'s [2]  0.4661±0.017    0.4134±0.021    0.8720±0.011    0.7086±0.013    0.7818±0.011
               First             0.4524±0.018    0.4524±0.018    1.0000±0.000    0.6634±0.016    0.7976±0.011
               Last              0.2090±0.016    0.0596±0.003    0.2319±0.011    1.0000±0.000    0.3764±0.015
Courantblogs   SLARTS            0.6843±0.028    0.6453±0.033    0.9127±0.013    0.8619±0.014    0.8865±0.013
               Seo et al.'s [2]  0.6700±0.040    0.6628±0.043    0.9667±0.014    0.8250±0.018    0.8902±0.015
               First             0.6490±0.036    0.6490±0.036    1.0000±0.000    0.8001±0.019    0.8889±0.012
               Last              0.2037±0.017    0.1082±0.008    0.3183±0.011    1.0000±0.000    0.4829±0.012
Russianblog    SLARTS            0.7463±0.017    0.6904±0.017    0.8741±0.008    0.8787±0.006    0.8764±0.007
               Seo et al.'s [2]  0.5446±0.010    0.5174±0.011    0.9293±0.003    0.7821±0.004    0.8494±0.003
               First             0.4968±0.006    0.4968±0.006    1.0000±0.000    0.7134±0.004    0.8327±0.003
               Last              0.3632±0.018    0.1549±0.012    0.3695±0.015    1.0000±0.000    0.5395±0.016
Alef           SLARTS            0.7753±0.017    0.7641±0.018    0.9579±0.006    0.9033±0.008    0.9298±0.007
               Seo et al.'s [2]  0.7802±0.019    0.7800±0.019    0.9950±0.003    0.8828±0.010    0.9355±0.006
               First             0.7853±0.018    0.7853±0.018    1.0000±0.000    0.8826±0.010    0.9376±0.006
               Last              0.0993±0.005    0.0822±0.005    0.2526±0.012    1.0000±0.000    0.4032±0.015

In the original typeset table, boldface marks the best result for each metric.

As shown in Table 6, Russianblog has better results than the other datasets on all metrics. The main reason is that its comments are not confirmed by a moderator. This causes the Acc_edge of the Last baseline on Russianblog to be 0.3632, higher than on the other datasets (Table 5); the Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detecting the reply relation becomes easier. In other words, as the parent appears closer to its child, features such as frequent patterns and location prior, which are based on the position of comments in chronological order, work better.

The difference between $F_{R\text{-path}}$ and $F_{\text{path}}$ is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s methods appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not mention an author's name, which makes the features perform poorly.

Since the SLARTS method has features specifically focused on detecting comment-root relations (e.g., by adding candidate filtering), the Acc_CTD of SLARTS is better than that of Seo et al.'s method in all datasets.


The best result is 0.91 for Russianblog and the worst is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, and this makes the textual features perform poorly at detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We use the $P$-value to assess the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their scores and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant ($P$-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features. We use backward feature selection: to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results with the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
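Schematically, the backward feature selection described above amounts to a leave-one-feature-out ablation loop; the sketch below assumes a helper train_and_evaluate(dataset, features) that trains the ranking SVM on the given feature subset and returns the metric values, which is not part of the paper's published code.

```python
ALL_FEATURES = ["similarity", "authors_language", "global_similarity",
                "frequent_words", "length_ratio", "authors_name",
                "frequent_patterns", "location_prior",
                "candidate_filtering_rule_1", "candidate_filtering_rule_2",
                "candidate_filtering_rule_3"]

def feature_importance(train_and_evaluate, dataset):
    """Leave-one-feature-out ablation: a positive difference means the
    metric drops when the feature is removed, i.e. the feature helps
    (this is the quantity reported in Table 7)."""
    full = train_and_evaluate(dataset, ALL_FEATURES)
    importance = {}
    for feature in ALL_FEATURES:
        reduced = train_and_evaluate(dataset,
                                     [f for f in ALL_FEATURES if f != feature])
        importance[feature] = {m: full[m] - reduced[m] for m in full}
    return importance
```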

It can be seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, which mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement from textual features is low. To increase the performance of the textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to $R_{\text{edge}}$, the improvements from this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature gives a small improvement in Russianblog, like the similarity feature. It improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. Its value in ENENews is higher than in Russianblog because the author's name in Russianblog is different from the other datasets: it is an email address, and no one would refer to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and also gives a good improvement in Acc_CTD. According to $P_{\text{edge}}$, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and gives a good improvement in Acc_CTD. The maximum improvement for $R_{\text{edge}}$ is 0.34, for Russianblog.

Candidate filtering rule 2 gives small improvements in the precision and recall of the evaluation metrics in both datasets; the maximum improvement is gained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, the detection of comment-root relations is improved. This rule also improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and the number of their posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualizing the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations, because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comment.


Table 6: Experimental results with declarative evaluation metrics.

Metric order in each row: P_edge, R_edge, F_edge, P_path, R_path, F_path, P_R-path, R_R-path, F_R-path, Acc_CTD.

Thestandard
  SLARTS:           0.3782±0.008, 0.4155±0.011, 0.3960±0.006, 0.1565±0.007, 0.1713±0.008, 0.1636±0.007, 0.3628±0.012, 0.4166±0.016, 0.3876±0.005, 0.7525±0.005
  Seo et al.'s [2]: 0.3014±0.031, 0.1414±0.015, 0.1921±0.016, 0.1544±0.025, 0.0576±0.011, 0.0835±0.013, 0.3013±0.038, 0.1319±0.018, 0.1831±0.021, 0.5831±0.007
  Difference:       0.0778, 0.2741, 0.2039, 0.0021, 0.1137, 0.0801, 0.0615, 0.2847, 0.2045, 0.1694

ENENews
  SLARTS:           0.3797±0.038, 0.3891±0.019, 0.3839±0.024, 0.1873±0.040, 0.1855±0.026, 0.1857±0.030, 0.3768±0.048, 0.3768±0.017, 0.3759±0.025, 0.6872±0.014
  Seo et al.'s [2]: 0.4060±0.030, 0.1713±0.016, 0.2407±0.018, 0.2378±0.034, 0.0837±0.016, 0.1235±0.021, 0.3947±0.035, 0.1523±0.015, 0.2196±0.019, 0.5954±0.009
  Difference:       -0.0263, 0.2178, 0.1432, -0.0505, 0.1018, 0.0622, -0.0179, 0.2245, 0.1563, 0.0918

Courantblogs
  SLARTS:           0.5236±0.030, 0.3963±0.073, 0.4495±0.052, 0.4220±0.057, 0.3110±0.058, 0.3566±0.049, 0.5325±0.049, 0.3853±0.075, 0.4452±0.057, 0.7625±0.033
  Seo et al.'s [2]: 0.7320±0.090, 0.2049±0.062, 0.3189±0.082, 0.6815±0.105, 0.1896±0.070, 0.2951±0.093, 0.7309±0.094, 0.2010±0.065, 0.3140±0.086, 0.7001±0.031
  Difference:       -0.2084, 0.1914, 0.1306, -0.2595, 0.1214, 0.0615, -0.1984, 0.1843, 0.1312, 0.0624

Russianblog
  SLARTS:           0.5940±0.028, 0.5674±0.022, 0.5803±0.023, 0.4907±0.031, 0.4664±0.020, 0.4782±0.024, 0.5878±0.028, 0.5823±0.021, 0.5849±0.020, 0.9116±0.011
  Seo et al.'s [2]: 0.4705±0.020, 0.3030±0.015, 0.3685±0.016, 0.3823±0.023, 0.2534±0.013, 0.3046±0.015, 0.4682±0.020, 0.2822±0.014, 0.3521±0.015, 0.5862±0.013
  Difference:       0.1235, 0.2644, 0.2118, 0.1084, 0.2130, 0.1736, 0.1196, 0.3001, 0.2328, 0.3254

Alef
  SLARTS:           0.6189±0.028, 0.4131±0.040, 0.4952±0.037, 0.5639±0.027, 0.3819±0.037, 0.4551±0.034, 0.6277±0.027, 0.4069±0.043, 0.4934±0.039, 0.8044±0.013
  Seo et al.'s [2]: 0.8827±0.052, 0.2637±0.051, 0.4045±0.060, 0.8797±0.053, 0.2631±0.051, 0.4034±0.060, 0.8840±0.052, 0.2635±0.051, 0.4043±0.060, 0.7846±0.018
  Difference:       -0.2638, 0.1494, 0.0907, -0.3158, 0.1188, 0.0517, -0.2563, 0.1434, 0.0891, 0.0198

In the original typeset table, boldface marks the best result for each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have started long conversations. Such users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2), H(2), H(3), I(2), O(1) and F(2), H(2), I(1) show two conversations in thread "C".

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method has higher accuracy in comparison with the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can be useful in interrogative threads as well.

In the future we would like to apply the proposed method to interrogative platforms such as forums. Also, we would like to analyze the tree-like reply structure in more depth; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Metric order in each row: P_edge, R_edge, P_path, R_path, P_R-path, R_R-path, Acc_CTD.

Similarity
  ENENews:     -0.00233, 0.07415, -0.00538, 0.03289, -0.01095, 0.08996, 0.04684
  Russianblog:  0.00874, 0.00012, 0.00963, 0.00347, 0.00695, -0.01079, 0.00355
Authors' language
  ENENews:     -0.01354, 0.03246, -0.01785, 0.00693, -0.02757, 0.04052, 0.00458
  Russianblog: -0.00229, 0.01052, -0.00179, 0.00750, -0.00590, 0.00807, -0.00182
Global similarity
  ENENews:     -0.00402, 0.05604, -0.01338, 0.01581, -0.01129, 0.07086, 0.02742
  Russianblog:  0.01090, 0.01443, 0.00582, 0.00897, 0.00368, -0.00317, -0.00087
Frequent words
  ENENews:      0.00252, 0.00931, 0.00170, 0.00424, -0.00132, 0.00973, 0.00292
  Russianblog:  0.00008, 0.00023, 0.00016, 0.00029, 0.00034, 0.00047, -0.00011
Length ratio
  ENENews:      0.04206, -0.02984, 0.04450, 0.01700, 0.06421, -0.05336, -0.00850
  Russianblog:  0.00651, -0.00355, 0.00920, 0.00080, 0.00784, -0.00592, 0.00235
Authors' name
  ENENews:      0.03753, 0.06392, 0.01920, 0.03285, 0.02341, 0.05998, 0.01788
  Russianblog:  0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000
Frequent pattern
  ENENews:     -0.02028, 0.02930, -0.02357, 0.00300, -0.03793, 0.03410, 0.00526
  Russianblog:  0.00458, 0.06100, 0.00830, 0.05206, -0.00545, 0.06150, 0.00404
Location prior
  ENENews:      0.08778, -0.03690, 0.08556, 0.03280, 0.11792, -0.07852, 0.02925
  Russianblog:  0.16091, 0.09671, 0.17766, 0.13515, 0.16699, 0.08793, 0.02414
Candidate filtering rule 1
  ENENews:     -0.05431, 0.06138, -0.04102, 0.01296, -0.06458, 0.07545, 0.06949
  Russianblog: -0.12506, 0.34370, -0.13916, 0.27445, -0.12683, 0.37003, 0.32546
Candidate filtering rule 2
  ENENews:      0.00014, -0.00004, 0.00048, 0.00041, -0.00038, -0.00035, -0.00009
  Russianblog:  0.02575, 0.02475, 0.02462, 0.02341, 0.01746, 0.00802, 0.00000
Candidate filtering rule 3
  ENENews:      0.00357, -0.00121, 0.00452, 0.00254, 0.00551, -0.00161, 0.00161
  Russianblog:  0.08521, -0.01978, 0.09636, 0.01702, 0.09380, -0.03825, 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text-processing tools such as Wordnet and named entity recognition (as they are not available with the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


threads, they are not effective enough on declarative threads. In this paper we focus on declarative threads and try to devise an effective RTS method for them. In summary, the RTS task that we are doing here is suitable for the following websites:

(i) websites that show comments in a list-based structure in chronological order; this way of showing comments is prevalent [2-4], for example, Reuters (http://www.reuters.com) and ABCNEWS (http://www.abcnews.com), which are listed respectively as the 301st and 462nd most popular websites globally based on Alexa's traffic rank (Alexa, the web information company, http://www.alexa.com);

(ii) blog service providers that do not support a hierarchical structure for their comments, such as Blogfa (http://www.blogfa.com), which is listed as the 137th most popular website globally and the 3rd most popular website in Iran, after Google and Yahoo;

(iii) websites that cannot support a tree-like reply structure with more than a few levels due to the limitation of space on pages, such as Facebook, Fin24 (http://www.fin24.com), and Skysports (http://www.skysports.com);

(iv) websites built on content management systems that do not support nested comments, for example, those that have designed the template of their sites based on an old version of Wordpress (before 2009) and have not changed it.

RTS can be beneficial in many applications. To name a few, it is useful for facilitating search and the finding of a user's favorite content in a large volume of comments and for improving retrieval accuracy [2, 8, 9]; identifying users who have the ability to answer questions [10]; isolating discussions related to specific subtopics [11]; understanding online users' behavior [12]; facilitating the following of the actual conversation stream in threads [4]; conversation summarization, which includes both the initiation and the response as a coherent unit [13]; automatic question and answer detection [13, 14]; and finding the reply relations among comments to be used in other tasks such as topic detection.

Another benefit of RTS is in topic detection on comments. Comments are usually short, and this makes the topic detection task quite difficult. The topic of a comment is usually similar or related to the topic of its parent and its children, so knowing the hierarchy can provide extra information to enhance topic detection. This can be considered a sort of word expansion.

In this work we propose a method to automatically reconstruct the thread structure and organize comments into a tree-like structure by considering information about authors, content, and the date and time of the posts. A set of relevant textual and nontextual features is defined. Then a learning algorithm based on the ranking SVM model is used to learn a proper model that is exploited to identify the reply relation between a root and a set of comments. In other words, a set of comments is fed into the trained model to determine whether there is a relation between them or not. The proposed RTS method is called SLARTS (a Supervised Learning Approach to Reconstruct Thread Structure). We combine knowledge and techniques from the Information Retrieval, Natural Language Processing, Machine Learning, and Social Network disciplines.

The main contributions of this work are as follows

(i) The focus of this paper is the RTS task on declarative threads in blogs, online news agencies, and news websites, on which few studies have been carried out.

(ii) We describe and show the differences between declarative and interrogative threads. To better illustrate the differences, we used the Apple discussion forum and compared it with the declarative datasets.

(iii) We propose a supervised approach to the RTS task, namely SLARTS, based on a ranking SVM model, using novel textual and nontextual features which are related to declarative threads.

(iv) The proposed method is tested on 5 datasets in different languages: one in Persian, one in Russian, and three in English. In three of these websites the comments are confirmed by moderators. Some of these websites present comments as user posts. All datasets contain many reply structures among comments. The reply structures of comments in the ground truth come from real structures created by users, and they are marked by reply tags in our datasets.

(v) In order to better evaluate our method, some evaluation metrics proposed in previous works have been modified, and an evaluation metric appropriate for declarative threads is proposed for the RTS task.

(vi) The evaluation results reveal higher accuracy in comparison with the baseline methods in all datasets.

This paper is an extension of the work we published in our conference paper [15]. We have defined new textual and nontextual features to improve the accuracy of the proposed method. We also evaluate our method on more datasets and with new measures of accuracy.

The rest of this paper is organized as follows. In Section 2 we describe related work. It is followed by the problem definition in Section 3. We explain the proposed method and the features used for the RTS task in Section 4. In Section 5 the experiments, datasets, evaluation metrics, and experimental results are presented. Finally, we discuss the results and conclude with the valuable information which can be extracted by visualizing the reply structure.

2. Related Work

In this section we cover previous work on the RTS task, and then we investigate various issues around threads in online platforms.

2.1. Thread Detection. The thread detection task, which is sometimes called topic detection, should be accomplished as preprocessing for the RTS task. In this task, all comments are split into a number of threads. After that, RTS can be exploited to discover the tree structure of the threads [5, 10, 16].

The thread detection task is not necessary for declarative threads, since usually all comments are related to the root. Based on this, all comments can be considered to form just one thread. Consequently, we focus only on the RTS task in this study.

2.2. RTS Task. There are very few studies in the literature that directly address the problem of the RTS task for comments in declarative threads. In general, the RTS task can be done either in a supervised or an unsupervised manner. In unsupervised methods, the relation between comments is weighted using text similarity measures. The weights are then adjusted using other measures such as the time distance and the position in chronological order. Finally, the relations whose weights are higher than a predefined threshold are selected as the parent-child relations [11, 17]. Lin et al. [18] proposed a method which computes the similarity between each post and the previous posts to find candidate parents. Then the post with the highest score is chosen as the parent candidate. If the parent candidate is not similar enough to the child comment, the candidate parent is assumed to start a new discussion branch of the thread.
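The unsupervised strategy sketched above can be summarized as follows; the similarity function, the threshold, and the root label are placeholders, and this is a schematic illustration rather than the exact procedure of [11, 17, 18].

```python
def unsupervised_parents(comments, similarity, threshold, root="R"):
    """For each comment (in chronological order), pick the most similar
    earlier comment as its parent; if the best similarity is below the
    threshold, the comment is assumed to start a new branch under the root."""
    parent = {}
    for i, comment in enumerate(comments):
        best, best_sim = root, 0.0
        for candidate in comments[:i]:
            sim = similarity(comment, candidate)
            if sim > best_sim:
                best, best_sim = candidate, sim
        parent[comment] = best if best_sim >= threshold else root
    return parent
```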

In supervised methods, the existence of a relation between two messages is determined using a supervised learning algorithm [2, 4, 19]. In these methods, a set of features is defined and weighted using a training set. Then the trained model is used to discover the comment relations in the test data using the extracted features.

The methods proposed in [4, 19] are supervised, in which a set of simple features and a classifier are used. The features are divided into two groups. The first group is called structural or nontextual features, such as time information, reply distance, and author's name. The second group is called semantic or textual features, such as sentence type and the similarity among comments.

Seo et al. [2] proposed a learning technique that exploits the hierarchical structure of replies in forums or emails. They introduced a structure discovery technique that uses a variety of features to model the relations among posts of the same thread. In fact, their method is the most similar to ours. However, we focus on blogs and news agencies, while they worked on forums and emails. The existence of quoted text in forums and emails makes the RTS task easier than in our case.

Wang et al. [3] proposed a probabilistic model, as a supervised structure learning problem, to predict the dependency among the posts within one thread in forums, based on general conditional random fields. Their method is based on various kinds of features describing the interactions of both posts and authors. The weights of the designed features are estimated in a supervised manner. Similarly, Wang et al. [20] proposed a discriminative probabilistic model which can handle both local features and knowledge propagation rules.

The only existing work on the RTS task in declarative threads was proposed by Schuth et al. [19]. This work focused on the RTS task in online news agency threads. They used several features to detect authors' names in the comments' text. The features are then combined by a tree-learner algorithm, and eventually a classifier detects relations among the comments and the root. Since, in our case, there are many comments that do not refer to any author's name, this method does not achieve good accuracy.

There are also some related works on other types of data, such as email data [6, 7, 21]. However, some specific features exist in those environments that are not applicable here. For instance, extra information about a message's recipients, like the "To/CC" tag in email data, or information about the affiliation of a message's author, like a signature, is exploited to improve the reconstruction of conversation threads in email data.

2.3. Thread Structure Analysis. Some works analyze the social media emerging from user comment activity [22, 23]. Message boards such as Slashdot and Reddit frequently publish short news posts and allow their readers to comment on them. The works proposed in [24, 25] focused on these threads. They explored the structure and topical hierarchies of comment threads to gain a deeper understanding of the users' behaviour, allowing these types of user-powered websites to operate better. Laniado et al. [26] analyzed the structural properties of the threads on Wikipedia pages to extract and study different kinds of interactions. Wu et al. [1] introduced a model to explain human view and reply behaviors in forums, which is helpful for discovering collective patterns of human behavior. They found that the view and reply behaviours follow a power-law distribution [1, 27].

2.4. Detection of Initiation-Response Pairs. The works proposed in [13, 14] focused on the detection of initiation-response pairs, such as question-answer and assessment-agreement relationships. Initiation-response pairs are pairs of utterances in which the first pair part sets up an expectation for the second pair part [13]. Wang et al. [14] introduced a list of dialogue act labels for edges; a dialogue act label, such as answer-answer or answer-question, is assigned to each relation between two messages. Kim et al. [28] proposed a dialogue act tag set and a method for annotating post-to-post discourse structure in forums. They used three feature sets (structural features, post context features, and semantic features), and they experimented with three discriminative learners: SVM, HMM, and maximum entropy. Andreas et al. [29] introduced a new corpus of sentence-level agreement and disagreement annotations; two sentences are in agreement if they state the same fact or opinion.

A list of dialogue act labels and an approach for modeling dialogue acts have also been proposed for conversational speech [30-32]. Detection of dialogue act labels for each post is suitable for thread detection [5, 16] and for finding relevant answers in forums [8].

2.5. Automatic Meta-Information Extraction from HTML Pages. Information about threads, such as authors' names and content, is usually extracted manually, which is time consuming. Some methods have been proposed to extract the main content and remove noisy information from a web page automatically [33]. Hu et al. [34] proposed an algorithm to extract all the meta-information of threads from different kinds of forums without human interaction. The algorithm consists of two steps: thread extraction and detailed information extraction from thread pages.

3. Problem Definition

In this section some concepts related to the RTS task are defined. A comment is an utterance written by a user, comprising one or several sentences. A commenter is a person who writes comments and replies to the root article or to other comments. In this paper, the root is defined as the starter of a conversation, which can be a news article or any other content written by an author or a journalist. Figure 1 shows a part of a real thread with 145 comments. In Figure 1, the root post is shown with label R, and edges denote the reply relations. In a thread, a reply comment (illustrated in Figure 1 by numbered labels) responds to a previous comment or to the root item. For example, the nodes with labels 1 and 32 are reply comments to the root item, which is considered their parent. The sequence of labels in Figure 1 is in chronological order. A thread is a sequence of comments that starts with the root item and contains a series of reply comments, which are usually related to the same topic as the root item. Each comment has a single parent, and all comments are descended from the root post in a thread [24, 29]. In other words, threads are a special case of human conversations [35] that consist of a set of comments contributed by two or more participants through the reply operation [36]. The candidate set of the i-th comment is the set of comments that could be considered as the parent of the i-th comment; it includes the comments that appear before the i-th comment in chronological order. A starter discussion comment is a comment replying to the root item that has at least one child, like the comments with labels 1 and 32 in Figure 1.

The thread detection task means finding the clusters of comments that belong to the same topic in a given text stream, without any previous knowledge about the number of threads. The reconstructing thread structure (RTS) task means reconstructing the reply structure over the comments in a thread. This leads to the construction of a tree-like reply structure [2-5] or a directed acyclic graph (DAG) [11, 17, 19]. Since the threads extracted from websites have a tree-like reply structure, they are usually modeled as a tree in most papers. The structure of threads can also be modeled as a DAG; however, information about DAG-like reply structures is mostly prepared by manually annotating the data, which is a time-consuming and difficult task. In addition, since manually annotated datasets contain a small number of threads [11, 19], they cannot properly evaluate RTS algorithms. In this paper we assume a tree-like reply structure.

Declarative and interrogative threads are different in essence:

(i) Users in declarative threads mostly express their opinions or sentiments about the root post informally, while in interrogative threads users mostly express their questions and answers in a more formal way.

(ii) In declarative threads the topic of the root is most likely a news article or content reflecting real-world events, while in interrogative threads it is most likely a user's question.

(iii) There is meta-information, such as quotations, in interrogative threads that greatly improves the accuracy of the RTS task.

(iv) Comments in declarative threads usually wait for moderation before being published, which usually takes some time. Moderators are not always online; they log in a few times per day, accept the submitted comments, and log out. Therefore, multiple comments appear nearly at the same time. So some features that are based on time distance [11] and position in chronological order perform well in interrogative threads but not in declarative threads.

Declarative and interrogative threads are technically different as well:

(i) In interrogative threads, each comment can most likely be connected to its previous comment (1-Distance) [4]; this simple heuristic leads to a great improvement in the accuracy of the RTS task [3, 4], since many posts are most likely written to answer the last question. However, this heuristic does not perform well in declarative threads. According to Figure 2(a), users most likely reply to the first previous comment in the Apple discussion forum, whereas Figure 2(b) shows that users most likely reply to the root in the Thestandard online news agency. The 11th comment in Apple is the parent of the 12th comment with probability 0.49; in Thestandard, this probability is only 0.137.

(ii) Figure 3 shows three real threads from the Thestandard dataset, each including 30 comments, with the comments numbered and sorted in chronological order. Since users usually express their opinions or sentiments and reply to any comment regardless of the submission time and position of comments, the structure of the replies is not predictable.

(iii) The length of the root's text in declarative threads is usually larger than that of the comments; thus, the length of the root or of the comments should be normalized in similarity measurements.

There are websites which are known as message boards, such as Digg, Reddit, and Slashdot. Message boards are valuable threads for analyzing users' behavior [24, 25]. However, they are not suitable to be used as training data, since they have different designs for showing the comments. These differences change the behavior of users' replies; for example, Slashdot (http://slashdot.org) shows comments based on their scores, which causes the comments with higher scores to get more replies. So we cannot create a general model appropriate for all message boards.


Figure 1: A part of a real thread from Thestandard (http://www.thestandard.org.nz) online news website. The root is a news article ("The president loses 30,000 jobs in a single year"); the numbered comments, written by users A to E, reply either to the root or to earlier comments, forming a tree.

4. Method

In this section we describe our supervised approach to the RTS task. First, we define some textual and nontextual features and learn a proper model to combine them using a ranking SVM (Support Vector Machine). Then the model is employed to reconstruct the reply structure of the threads in the test data.

4.1. Ranking SVM. SVM is a supervised learning method that is used for classification, regression, and ranking.

The implementation we use is the ranking SVM classifier (http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html) [2, 37]. The ranking SVM classifier learns to assign a weight to pairs of comments. The whole procedure for choosing the parent of the i-th comment in a thread is described in Figure 4.

Since the purpose of the RTS task is to predict a tree-like reply structure in a thread, the RTS algorithm needs to find N reply relations among the comments and the root, where N is the number of comments in a thread. Thus, the RTS algorithm needs O(N^2) pairwise comparisons.
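The parent-selection step pictured in Figure 4 can be sketched compactly as follows (a minimal illustration, not the authors' released code; it assumes a features(i, j) routine that returns the feature vector of candidate parent j for comment i, and the weight vector learned by the ranking SVM):

def choose_parent(i, features, weights):
    # Candidates are the root (index 0) and all earlier comments 1..i-1, so scoring
    # every comment of an N-comment thread costs O(N^2) pairwise comparisons.
    best_j, best_score = 0, float("-inf")
    for j in range(i):
        # linear kernel: dot product of the learned weights and the feature vector
        score = sum(w * f for w, f in zip(weights, features(i, j)))
        if score > best_score:
            best_j, best_score = j, score
    return best_j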


Figure 2: The probability of child-parent reply relations (X: parent position, Y: child position, Z: probability): (a) Apple discussion forum and (b) Thestandard online news agency.

Figure 3: Three real threads in which the nodes' labels are in chronological order.

4.2. Features. The ranking SVM needs a set of features to be defined. In this section we introduce eleven textual and nontextual features; their definition is one of the main contributions of our work. It should be mentioned that about 20 percent of the replies in the ground truth do not have any word in common with their parent. The textual features alone are therefore not enough, and we need to define some nontextual features as well.

Figure 4: RTS algorithm. For each candidate j (j = 0 to i - 1) and each feature k, a linear kernel accumulates A[j] += model.weight[k] x feature[j][k]; the predicted parent label is argmax_j A[j].

4.2.1. Similarity Measurement. In order to measure the similarity of sentences, we utilize a vector space model to represent the text body of comments. A comment's text can be considered as a vector of terms weighted by TF-IDF. TF (Term Frequency) is the number of times a word appears in a comment, and IDF (Inverse Document Frequency) is computed according to the following formula:

IDF(w) = log(N / n_w),   (1)

where N is the total number of comments in a news item and n_w is the number of comments that contain the word w. IDF is usually extracted from the whole training data, but we have limited it to the set of comments in a thread. We believe this makes it more accurate, for example, when a word has a low frequency in the whole training data but a high frequency in a thread.

In order to measure the similarity between two comments, the stop-words are deleted first and then the words of the comments are stemmed by the Porter algorithm (for the English datasets); this step is not performed for the Russian and Persian datasets. The common words of the two comments are extracted and their weights are calculated based on TF-IDF. The final score is then obtained by aggregating the weights of the common words according to (2). Since the root text is usually longer than the comments, the log of the product of their lengths is used in the denominator:

Score(C1, C2) = Σ_{w ∈ C-words} (1 + log(TF(w, C1) × TF(w, C2))) × IDF(w) / log(|C1| × |C2|),   (2)

where C-words is the set of words held in common between comments C1 and C2, and |C| is the length of the comment's text.
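A minimal sketch of this score, assuming comments are already stop-word-filtered and stemmed token lists (an illustration, not the authors' implementation; lengths greater than one are assumed so the denominator is nonzero):

import math
from collections import Counter

def idf(word, thread_comments):
    # IDF is computed within the thread, as in (1)
    n_w = sum(1 for c in thread_comments if word in c)
    return math.log(len(thread_comments) / n_w)

def similarity(c1, c2, thread_comments):
    tf1, tf2 = Counter(c1), Counter(c2)
    common = set(tf1) & set(tf2)                        # C-words
    total = sum((1 + math.log(tf1[w] * tf2[w])) * idf(w, thread_comments)
                for w in common)
    return total / math.log(len(c1) * len(c2))          # normalize by comment lengths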

Comments are usually informal and contain typos. Some words in two comments might be the same, but due to spelling errors it is difficult to detect this. In order to solve this issue, we use the minimum edit distance (MED) algorithm. The minimum edit distance of two words is the minimum number of edit operations (insertion, deletion, substitution, and transposition) needed to transform one word into another [14]. The costs of insertion, deletion, substitution, and transposition are 1, 1, 2, and 1, respectively.

Two words in different comments are considered common words if either they are exactly the same or they seem to be the same but contain typos. In the latter case, if the length of the words is greater than five, their first two letters are the same, and their edit distance is lower than 4, the two words are considered a common word. For example, the two words "Beautiful" and "Beuatiful" are considered common words.
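A small sketch of this fuzzy matching rule; the damerau_levenshtein helper (with costs 1/1/2/1 for insertion/deletion/substitution/transposition) is assumed to be provided, for example as a hand-rolled dynamic-programming routine:

def same_word(w1, w2, damerau_levenshtein):
    if w1 == w2:
        return True
    return (len(w1) > 5 and len(w2) > 5        # only longer words are fuzzily matched
            and w1[:2] == w2[:2]               # same first two letters
            and damerau_levenshtein(w1, w2) < 4)

# e.g. same_word("beautiful", "beuatiful", dl) -> True (one transposition, cost 1)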

4.2.2. Authors' Language Model. The idea is that commenters who talk to each other are more likely to use similar words in their comments. In order to take advantage of this, all comments written by a commenter are appended, which makes a collection of words for each commenter. If the word collections of two commenters are very similar, the commenters may be related to each other. In Figure 5, commenter "A1" wrote three comments; these three comments are appended to make a collection of words for commenter "A1", and then, as for the first feature, the similarity is calculated between the word collections of commenters "A1" and "A2". The similarity score obtained


between the two commenters "A1" and "A2" is considered as this feature's score for relations between their comments.

4.2.3. Prior Location. The position of comments can reveal some information about their hierarchy; for example, the first comments usually have more children than the others, and comments located just before the i-th comment are more likely to be its parent. In general, we would like to estimate p(i | j), that is, knowing that a comment is in position j, the likelihood that the comment in position i is its parent [2]. So we calculate prior probabilities of being the parent for different positions.

To calculate the prior probability for j, we count the number of times a comment in position i is the parent of j in the training set. Figure 6 shows the prior probability for comments in positions 1 to 100 in the Thestandard dataset. The highest prior probability belongs to the root and then to the comments located just before the comment. The sample points in Figure 6 show five parent positions (the root, 10, 30, 57, and 59) and how probable it is for the 60th comment to be a child of each of them: the root has the highest prior probability, 0.1852, followed by the 59th comment with probability 0.1236. Figure 7 shows the prior probability of child-parent relations from comment 40 to 60.
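This prior can be estimated with a simple counting pass over the training threads (a sketch under the assumption that each thread is given as a mapping from child position to parent position, with 0 standing for the root):

from collections import Counter, defaultdict

def location_priors(training_threads):
    counts = defaultdict(Counter)                 # counts[j][i]: i was parent of position j
    for thread in training_threads:
        for child_pos, parent_pos in thread.items():
            counts[child_pos][parent_pos] += 1
    return {j: {i: n / sum(c.values()) for i, n in c.items()}   # p(i | j)
            for j, c in counts.items()}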

4.2.4. Reference to Authors' Name. If the name of an author appears in a comment, then all his/her comments are considered potential candidates for being the parent of that comment [19]. Sometimes only a part of the complete name is referenced. Sometimes the author's name is made up of two parts, and either of these parts could be used by other authors for reference; we also consider these types of references. We keep each part of the author's name, and the parts that are stop-words are removed.
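A minimal sketch of the name-matching test (assuming simple case-insensitive token matching and a given stop-word list):

def mentions_author(comment_tokens, author_name, stop_words):
    # every non-stop-word part of the author's name is a possible reference
    parts = [p.lower() for p in author_name.split() if p.lower() not in stop_words]
    tokens = {t.lower() for t in comment_tokens}
    return any(part in tokens for part in parts)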

4.2.5. Global Similarity. This feature is based on ranking the similarity of the comments' texts and the root's text, taking a global view of text similarity. According to this feature, if comment "A" is the most similar to comment "B" and, inversely, comment "B" is the most similar to comment "A" among the other candidates, it is more likely that there is a relation between comments "A" and "B". To relax this feature, the similarity measurement is calculated for each comment with respect to all comments, and the comments are then sorted by their similarity score. For example, in Figure 8 we are looking for the parent of the fifth comment. In the first step, according to formula (2), comments are sorted based on their text similarity score with respect to the fifth comment; comments that are not candidates of the fifth comment are removed (shown in black in Figure 8). In the second step, in the same way, comments are sorted with respect to each candidate of the fifth comment, and comments that are not candidates of the fifth comment are removed, except the fifth comment itself. Finally, formula (3) is used to calculate the Ranking-distance score. Two comments that are the most similar to each other get a higher Ranking-distance

score. In Figure 8, the highest score belongs to the relation with the fourth comment. This feature is symmetric, and the similarity between two comments is only calculated once. In other words, this feature needs O(N^2) pairwise text similarity comparisons.

Ranking-distance(C1, C2) = (|C1 & C2| + |C2 & C1|)^(-1),   (3)

where C1 and C2 are two comments and |C1 & C2| is the text similarity distance between C1 and C2.
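Interpreting that distance as the 1-based rank position in the similarity-sorted candidate list, which is what the worked example in Figure 8 suggests, a rough sketch is:

def ranking_distance(c1, c2, candidates_of, similarity):
    def rank(target, anchor, pool):
        if target not in pool:                    # keep the comment itself in the list
            pool = pool + [target]
        ordered = sorted(pool, key=lambda c: similarity(anchor, c), reverse=True)
        return ordered.index(target) + 1
    return 1.0 / (rank(c2, c1, list(candidates_of(c1))) +
                  rank(c1, c2, list(candidates_of(c2))))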

4.2.6. Frequent Patterns of Authors' Appearance. The idea is that the comments of two commenters who talk to each other usually appear close together in chronological order. So, if their comments appear many times just after each other, this feature gives a high score to their comments. In order to implement this feature we use the following formula:

FPScore(i, j) = f(A_i, A_j) + f(A_j, A_i),   (4)

where A_i is the author of comment i and f(a, b) is the number of times a comment of author a appears just before a comment of author b (see Pseudocode 1).

Figure 9 shows a timeline that includes 7 comments written by 4 commenters. The feature score is calculated for the relations among the comments. The score of the relation between the comments of authors A and B is 3, which is more than that between A and C.

4.2.7. Length Ratio. The text of a parent is usually longer than that of its children. The length ratio score is calculated according to

length ratio score(C_C, C_P) = |C_P| / |C_C|,   (5)

where C_C is the comment looking for a parent and C_P is a candidate parent.

4.2.8. Frequent Words in Parent-Child Comments. Sometimes a pair of words appears frequently, one word in the parent and the other word in its children. For example, in the ENENews dataset the most frequent pairs are (believe, people), (accident, fukushima), (new, discovered), (idea, report), and (people, public). We use pointwise mutual information (PMI) [38] to find these frequent pairs:

Score(W1, W2) = Count(W1, W2) / (Count(W1) × Count(W2)),   (6)

where W1 is a word in the comment whose parent we are looking for, W2 is a word in its candidate parent, and Count(W1, W2) is the number of times W1 has appeared in a child comment while W2 appeared in its parent. The numerator computes how often the two words appear together, and the denominator computes how often each of the words appears. Finally, according to Pseudocode 2, the score of the relation between two comments is calculated.
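A compact sketch of how the word-pair scores of (6) could be collected from training child-parent pairs and then plugged into the Pseudocode 2 scoring (an illustration under the assumption that comments are token lists, not the authors' exact counting):

from collections import Counter

def train_pair_scores(child_parent_pairs):
    pair_count, word_count = Counter(), Counter()
    for child, parent in child_parent_pairs:
        for w in set(child) | set(parent):           # how often each word appears
            word_count[w] += 1
        for w1 in set(child):
            for w2 in set(parent):
                pair_count[(w1, w2)] += 1            # co-occurrence: w1 in child, w2 in parent
    return {pair: n / (word_count[pair[0]] * word_count[pair[1]])
            for pair, n in pair_count.items()}

def frequent_words_score(comment, candidate_parent, pair_scores):
    # Pseudocode 2: sum the pair scores over all word pairs of the two comments
    return sum(pair_scores.get((w1, w2), 0.0)
               for w1 in comment for w2 in candidate_parent)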


Figure 5: The authors' language feature based on author "A1" and author "A2".

Figure 6: The prior probability of child-parent relations from comments 1 to 100 in Thestandard (X: parent position, Y: child position).

Figure 7: The prior probability of child-parent relations from comments 40 to 60.

Figure 8: An example of the global similarity feature; for each comment, the other comments are sorted by their text similarity score.


Step 1:
For i = 1 to Comments - 1
    if Author(i) ≠ Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments (for each comment looking for its parent)
    For j = 0 to i - 1 (for each candidate)
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Pseudocode to calculate the frequent patterns of authors' appearance.

Figure 9: An example of an authors' appearance sequence. The timeline A B C A B A D yields f(A,B) = 2, f(B,C) = 1, f(C,A) = 1, f(B,A) = 1, and f(A,D) = 1, so Frequent(A, B) = 3 and Frequent(A, C) = 1.

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates:

(1) Usually a commenter does not reply to the root in his/her second comment. So, if a commenter has written more than one comment, the root is removed from the parent candidates of the second and subsequent comments. This was shown to be a useful heuristic: the root is an important candidate, so removing it correctly improves the results significantly.

(2) A commenter does not reply to him/herself. So we simply remove all comments of a commenter from his/her comments' candidate lists [2].

(3) Commenters who post only one comment in a thread are more likely to reply to the root post. So the other candidates can be removed, keeping only the root. A sketch of the three rules is given below.
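A minimal sketch of the three filtering rules (an illustration assuming that authors is the chronological list of commenter names, with authors[0] standing for the root article):

def filter_candidates(i, authors):
    author_i = authors[i]
    wrote_earlier = author_i in authors[1:i]
    writes_elsewhere = wrote_earlier or (author_i in authors[i + 1:])

    if not writes_elsewhere:
        return [0]                                   # rule 3: only the root stays
    candidates = []
    for j in range(i):
        if j > 0 and authors[j] == author_i:
            continue                                 # rule 2: no self-replies
        if j == 0 and wrote_earlier:
            continue                                 # rule 1: drop the root for 2nd+ comments
        candidates.append(j)
    return candidates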

5. Experiments

In this section we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies established in August 2007 and 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English and Alef is in Persian; Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively, established in 2003 and 2012.

The reasons for selecting the mentioned websites are: (1) they support multilevel replies; (2) their users are active and articles usually get many comments; (3) the author and time of comments are available; (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each Word1 in comment
        For each Word2 in candidate-comment
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: A pseudocode for calculating the frequent-words score between two comments.

Table 1: The dataset-specific properties.

Datasets     | Language | Confirmation by moderator | Users register
Thestandard  | English  | yes                       | no
Alef         | Persian  | yes                       | no
ENENews      | English  | no                        | yes
Russianblog  | Russian  | no                        | yes
Courantblogs | English  | yes                       | no

For each website we crawled the webpages published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as the ground truth. We removed threads with fewer than 4 comments, because such threads do not give us much information and usually their comments reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets users can leave comments in a thread under different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The length of comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews the root usually includes a tweet, which is why the length of the root's text is shorter than in the other datasets. All comments have an author's name except for some comments in Alef; therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments per article. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. In Thestandard, however, threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape: articles are published between 7 a.m. and 7 p.m., and the highest rate belongs to the 4-hour period between 9 a.m. and 1 p.m. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10 a.m. and 6 p.m.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog behave similarly: about one hour after the article is published, it has received most of its comments, and after that the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16-21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after publication of the main article. Deeper comments indicate longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations continue around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10-20 words. Except for the comments of Russianblog, the datasets are similar. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has shorter comments than the other datasets; it is a personal blog and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (directly below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning the conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results we use several metrics. For edge prediction we use precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree; this causes the precision, recall, and F-score values to be equal [2], since FP (False Positive) and FN (False Negative)


Table 2: Datasets' statistics split up per website.

Datasets     | Avg length of comments' text (in words) | Avg length of the root's text (in words) | Number of comments | Avg comments per news | Number of news | Number of commenters
Thestandard  | 72.98 | 357.68 | 149,058 | 49.81 | 2,992 | 3,431
Alef         | 74.66 | 766.55 | 23,210  | 27.96 | 830   | 4,387
ENENews      | 68.19 | 162.64 | 16,661  | 38.92 | 428   | 831
Courantblogs | 79.24 | 398.17 | 4,952   | 19.19 | 258   | 1,716
Russianblog  | 18.56 | 237.16 | 30,877  | 16.18 | 1,908 | 1,258

Figure 10: Distribution of the number of comments in news articles.

Figure 11: Publication rate of articles per hour.

Figure 12: Average number of published comments per hour after the root.


Figure 13: Time difference between a node and its corresponding parent, in hours.

Figure 14: Depth of comments averaged over the elapsed time from publication of the root.

in precision and recall are always equal. Instead, we use the following edge accuracy measure:

Acc_edge = |reply relations ∩ detected relations| / N,   (7)

where the tree comprises N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers paths from nodes to the root:

Acc_path = (1/|C|) Σ_{i=1}^{|C|} [path_G(i) = path_P(i)],   (8)

where path_G(i) is the ground-truth path for the i-th comment, path_P(i) is the predicted path for it, and |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers the path completely wrong, so it is very strict. To relax this, Wang et al. introduced a metric that computes the overlap between the ground-truth and predicted paths:

P_R-path = (1/|C|) Σ_{i=1}^{|C|} ( |path_G(i) ∩ path_P(i)| / |path_P(i)| ),

R_R-path = (1/|C|) Σ_{i=1}^{|C|} ( |path_G(i) ∩ path_P(i)| / |path_G(i)| ),   (9)

where |path_P(i)| is the number of comments in the predicted path of the i-th comment and |path_G(i)| is the number of comments in its ground-truth path. The F1-score is the harmonic mean of precision and recall:

F1-score = 2 × Precision × Recall / (Precision + Recall).   (10)

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of a declarative thread is a news article or main content, which is different from the root of an interrogative thread. This causes the structure of the threads and the reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words.

Figure 16: Histogram of the depth of comments.

(1) comment-root, that is, the relation between comments and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. Comment-root relations show different views on the main article (root). Comment-comment relations show conversations among visitors, which are a valuable source of information. In Figure 17 there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to other users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to assess both types of relations. So we have to modify the evaluation metrics or define new ones.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comment Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

Acc_CTD = (TP + TN) / (TP + TN + FP + FN),   (11)

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

P_edge = TP / (TP + FP),
R_edge = TP / (TP + FN),
F1-score_edge = 2 × P_edge × R_edge / (P_edge + R_edge),   (12)

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of


Figure 17: An illustration of a declarative thread, showing comment-root relations, comment-comment relations, and discussion-starter comments.

comment-comment relations in the ground truth that were not predicted at all.

The path accuracy metric mentioned earlier is modified into P_path and R_path versions appropriate for the declarative platform. These metrics consider discussion paths from each PD comment to the discussion-starter comment, not to the root:

P_path = (1/|C|_P) Σ_{i=1}^{|C|_P} [path_G(i) = path_P(i)]   (or 1 if |C|_P = 0),

R_path = (1/|C|_R) Σ_{i=1}^{|C|_R} [path_G(i) = path_P(i)]   (or 1 if |C|_R = 0),   (13)

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread. Path_G(i) is the discussion path from the i-th node to the discussion-starter comment in the ground truth, and Path_P(i) is the discussion path from the i-th node to the discussion-starter comment in the predicted thread. Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to suit the declarative platform:

P_R-path = (1/|C|_P) Σ_{i=1}^{|C|_P} ( |path_G(i) ∩ path_P(i)| / |path_P(i)| )   (or 1 if |C|_P = 0),

R_R-path = (1/|C|_R) Σ_{i=1}^{|C|_R} ( |path_G(i) ∩ path_P(i)| / |path_G(i)| )   (or 1 if |C|_R = 0).   (14)
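As an illustration, the comment-comment metrics of (12) and Acc_CTD of (11) could be computed as follows (a sketch assuming each thread structure is a mapping from comment index to parent index, with 0 denoting the root; not the authors' evaluation script):

def declarative_metrics(ground, predicted):
    gold = {(c, p) for c, p in ground.items() if p != 0}       # comment-comment edges (PD)
    pred = {(c, p) for c, p in predicted.items() if p != 0}
    tp = len(gold & pred)
    p_edge = tp / len(pred) if pred else 1.0
    r_edge = tp / len(gold) if gold else 1.0
    f_edge = 2 * p_edge * r_edge / (p_edge + r_edge) if p_edge + r_edge else 0.0
    # comment type detection: NP (reply to the root) vs. PD (reply to a comment)
    acc_ctd = sum((ground[c] == 0) == (predicted[c] == 0) for c in ground) / len(ground)
    return p_edge, r_edge, f_edge, acc_ctd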

Figure 18 shows two threads, one the ground truth and the other the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3 the predicted thread of Figure 18 is evaluated by the metrics for interrogative threads; the results show high values. Table 4 shows the results of evaluation by the metrics for declarative threads; these results are more appropriate.

There are two reasons why the declarative metrics better evaluate the predicted structure: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics still show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance obtained by simply linking each comment to its previous comment (we name it the Last baseline) [3, 4]; this method gives good results for the RTS task in interrogative threads. The second baseline links all comments to the root (we name it the First baseline).

The only previous work on RTS of this kind focused on the author's name in online news agencies [19]. Schuth et al.'s method uses several features to find the commenter's name in online news articles. It achieves good precision but low recall, because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method on interrogative platforms.

Seo et al.'s method focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all of their proposed features except the quoted text.

We applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS gives higher accuracy except in Alef, in which many replies are connected to the root.

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way irrelevant comments do not appear in the path from a comment to the root, and the root appears in every path. Last connects all comments to their previous comments, which causes all comments to appear in the path.


Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Evaluation metric | Calculation
Acc_edge | Reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}; detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}; Acc_edge = 8/11 ≈ 0.73
Acc_path | |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / number of comments = 8/11 ≈ 0.73
P_R-path | [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / number of comments = [(1+1+1+1+1+1+1+1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11 ≈ 0.92
R_R-path | [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / number of comments = [(1+1+1+1+1+1+1+1) + (2/3 + 1 + 2/3)] / 11 = 10.34/11 ≈ 0.94
F1-score_R-path | 2 × 0.92 × 0.94 / (0.92 + 0.94) ≈ 0.93


Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Evaluation metric | Calculation
Acc_CTD | TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}; Acc_CTD = (5 + 5) / (5 + 5 + 1 + 0) ≈ 0.91
P_edge, R_edge, F1_edge | TP = {3→1, 5→2, 6→5}, FN = {4→3, 8→5}, FP = {4→1, 7→2, 8→7}; P_edge = 3/(3 + 3) = 0.5, R_edge = 3/(3 + 2) = 0.6, F1-score_edge ≈ 0.55
P_path, R_path, F1_path | P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5; R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6; F1-score_path ≈ 0.55
P_R-path, R_R-path, F1_R-path | P_R-path = overlap of {3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2} with the ground truth / |{3, 4, 5, 6, 7, 8}| = (1 + 1 + 1 + 1 + 0 + 1/2)/6 = 4.5/6 = 0.75; R_R-path = overlap of {3→1, 4→3→1, 5→2, 6→5→2, 8→5→2} with the prediction / |{3, 4, 5, 6, 8}| = (1 + 1/2 + 1 + 1 + 1/2)/5 = 4/5 = 0.8; F1-score_R-path = 2 × 0.75 × 0.8 / (0.75 + 0.8) ≈ 0.77

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef because more comments are linked directly to the root. SLARTS and Seo et al.'s methods usually cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, interrogative evaluation metrics are not appropriate for declarative threads because, based on these metrics, First shows high performance although this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge SLARTS performs better than Seo et al.'s method. For P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that when the difference between P_edge and R_edge is large and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations


Figure 19: Three real threads in which nodes are labeled with the commenter's name and the number of his/her comments.

(like the First baseline). On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline). So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. In contrast, the P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is greater than in the other datasets (Table 2) and that their paths are much longer and more complex (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets; note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets many of the defined features perform well. (The importance of the features in the ENENews and Russianblog datasets is shown in Table 7 and explained further in the next section.)


Table 5: Experimental results with interrogative evaluation metrics.

Datasets | Methods | Acc_edge | Acc_path | P_R-path | R_R-path | F1_R-path
Thestandard | SLARTS | 0.4669±0.009 | 0.3059±0.013 | 0.6927±0.018 | 0.7450±0.010 | 0.7179±0.013
Thestandard | Seo et al.'s [2] | 0.3941±0.005 | 0.3413±0.006 | 0.8303±0.009 | 0.6563±0.005 | 0.7331±0.003
Thestandard | First | 0.3630±0.010 | 0.3630±0.010 | 1.0000±0.000 | 0.5846±0.009 | 0.7379±0.007
Thestandard | Last | 0.1824±0.006 | 0.0495±0.004 | 0.2214±0.008 | 1.0000±0.000 | 0.3625±0.010
ENENews | SLARTS | 0.4837±0.018 | 0.3695±0.027 | 0.7625±0.026 | 0.7770±0.013 | 0.7697±0.017
ENENews | Seo et al.'s [2] | 0.4661±0.017 | 0.4134±0.021 | 0.8720±0.011 | 0.7086±0.013 | 0.7818±0.011
ENENews | First | 0.4524±0.018 | 0.4524±0.018 | 1.0000±0.000 | 0.6634±0.016 | 0.7976±0.011
ENENews | Last | 0.2090±0.016 | 0.0596±0.003 | 0.2319±0.011 | 1.0000±0.000 | 0.3764±0.015
Courantblogs | SLARTS | 0.6843±0.028 | 0.6453±0.033 | 0.9127±0.013 | 0.8619±0.014 | 0.8865±0.013
Courantblogs | Seo et al.'s [2] | 0.6700±0.040 | 0.6628±0.043 | 0.9667±0.014 | 0.8250±0.018 | 0.8902±0.015
Courantblogs | First | 0.6490±0.036 | 0.6490±0.036 | 1.0000±0.000 | 0.8001±0.019 | 0.8889±0.012
Courantblogs | Last | 0.2037±0.017 | 0.1082±0.008 | 0.3183±0.011 | 1.0000±0.000 | 0.4829±0.012
Russianblog | SLARTS | 0.7463±0.017 | 0.6904±0.017 | 0.8741±0.008 | 0.8787±0.006 | 0.8764±0.007
Russianblog | Seo et al.'s [2] | 0.5446±0.010 | 0.5174±0.011 | 0.9293±0.003 | 0.7821±0.004 | 0.8494±0.003
Russianblog | First | 0.4968±0.006 | 0.4968±0.006 | 1.0000±0.000 | 0.7134±0.004 | 0.8327±0.003
Russianblog | Last | 0.3632±0.018 | 0.1549±0.012 | 0.3695±0.015 | 1.0000±0.000 | 0.5395±0.016
Alef | SLARTS | 0.7753±0.017 | 0.7641±0.018 | 0.9579±0.006 | 0.9033±0.008 | 0.9298±0.007
Alef | Seo et al.'s [2] | 0.7802±0.019 | 0.7800±0.019 | 0.9950±0.003 | 0.8828±0.010 | 0.9355±0.006
Alef | First | 0.7853±0.018 | 0.7853±0.018 | 1.0000±0.000 | 0.8826±0.010 | 0.9376±0.006
Alef | Last | 0.0993±0.005 | 0.0822±0.005 | 0.2526±0.012 | 1.0000±0.000 | 0.4032±0.015

The bold style in the original shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets on all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be 0.3632, which is higher than in the other datasets (Table 5); Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier. In other words, as the parent appears closer to its child, features such as frequent patterns and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., the candidate filtering rules), Acc_CTD of SLARTS is better than Seo


et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst is 0.69 for ENENews. The root of an ENENews thread is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, and this makes the textual features perform poorly in detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We use P-values to assess the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their scores and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features using backward feature selection. That is, to measure the importance of a feature, we use all features except that one, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, since they mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15), and ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the length of comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low; to increase the performance of the textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature also has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature gives a small improvement in Russianblog, similar to the similarity feature. It improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. The value of this feature is higher in ENENews than in Russianblog because the author's name in Russianblog, unlike in the other datasets, is an e-mail address, and no one refers to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and gives a good improvement in Acc_CTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets: it removes the root candidate accurately and gives a good improvement in Acc_CTD. The maximum improvement in R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 gives small improvements in the precision and recall of the evaluation metrics in both datasets; the maximum improvement is obtained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root, so the detection of comment-root relations is improved; it also improves Acc_CTD.

6. Information Extraction by Visualizing Threads Structure

In this section we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and the number of comments they posted are shown as the node labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? Threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread with short conversations about the root. The height and width of the trees can serve as a measure of whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations because users read them first and then read the other comments to better understand the conversations.

(iii) Which comments have participated in a conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comment.


Table 6: Experimental results with declarative evaluation metrics.

Datasets | Method | P_edge | R_edge | F_edge | P_path | R_path | F_path | P_R-path | R_R-path | F_R-path | Acc_CTD
Thestandard | SLARTS | 0.3782±0.008 | 0.4155±0.011 | 0.3960±0.006 | 0.1565±0.007 | 0.1713±0.008 | 0.1636±0.007 | 0.3628±0.012 | 0.4166±0.016 | 0.3876±0.005 | 0.7525±0.005
Thestandard | Seo et al.'s [2] | 0.3014±0.031 | 0.1414±0.015 | 0.1921±0.016 | 0.1544±0.025 | 0.0576±0.011 | 0.0835±0.013 | 0.3013±0.038 | 0.1319±0.018 | 0.1831±0.021 | 0.5831±0.007
Thestandard | Difference | 0.0778 | 0.2741 | 0.2039 | 0.0021 | 0.1137 | 0.0801 | 0.0615 | 0.2847 | 0.2045 | 0.1694
ENENews | SLARTS | 0.3797±0.038 | 0.3891±0.019 | 0.3839±0.024 | 0.1873±0.040 | 0.1855±0.026 | 0.1857±0.030 | 0.3768±0.048 | 0.3768±0.017 | 0.3759±0.025 | 0.6872±0.014
ENENews | Seo et al.'s [2] | 0.4060±0.030 | 0.1713±0.016 | 0.2407±0.018 | 0.2378±0.034 | 0.0837±0.016 | 0.1235±0.021 | 0.3947±0.035 | 0.1523±0.015 | 0.2196±0.019 | 0.5954±0.009
ENENews | Difference | -0.0263 | 0.2178 | 0.1432 | -0.0505 | 0.1018 | 0.0622 | -0.0179 | 0.2245 | 0.1563 | 0.0918
Courantblogs | SLARTS | 0.5236±0.030 | 0.3963±0.073 | 0.4495±0.052 | 0.4220±0.057 | 0.3110±0.058 | 0.3566±0.049 | 0.5325±0.049 | 0.3853±0.075 | 0.4452±0.057 | 0.7625±0.033
Courantblogs | Seo et al.'s [2] | 0.7320±0.090 | 0.2049±0.062 | 0.3189±0.082 | 0.6815±0.105 | 0.1896±0.070 | 0.2951±0.093 | 0.7309±0.094 | 0.2010±0.065 | 0.3140±0.086 | 0.7001±0.031
Courantblogs | Difference | -0.2084 | 0.1914 | 0.1306 | -0.2595 | 0.1214 | 0.0615 | -0.1984 | 0.1843 | 0.1312 | 0.0624
Russianblog | SLARTS | 0.5940±0.028 | 0.5674±0.022 | 0.5803±0.023 | 0.4907±0.031 | 0.4664±0.020 | 0.4782±0.024 | 0.5878±0.028 | 0.5823±0.021 | 0.5849±0.020 | 0.9116±0.011
Russianblog | Seo et al.'s [2] | 0.4705±0.020 | 0.3030±0.015 | 0.3685±0.016 | 0.3823±0.023 | 0.2534±0.013 | 0.3046±0.015 | 0.4682±0.020 | 0.2822±0.014 | 0.3521±0.015 | 0.5862±0.013
Russianblog | Difference | 0.1235 | 0.2644 | 0.2118 | 0.1084 | 0.213 | 0.1736 | 0.1196 | 0.3001 | 0.2328 | 0.3254
Alef | SLARTS | 0.6189±0.028 | 0.4131±0.040 | 0.4952±0.037 | 0.5639±0.027 | 0.3819±0.037 | 0.4551±0.034 | 0.6277±0.027 | 0.4069±0.043 | 0.4934±0.039 | 0.8044±0.013
Alef | Seo et al.'s [2] | 0.8827±0.052 | 0.2637±0.051 | 0.4045±0.060 | 0.8797±0.053 | 0.2631±0.051 | 0.4034±0.060 | 0.8840±0.052 | 0.2635±0.051 | 0.4043±0.060 | 0.7846±0.018
Alef | Difference | -0.2638 | 0.1494 | 0.0907 | -0.3158 | 0.1188 | 0.0517 | -0.2563 | 0.1434 | 0.0891 | 0.0198

The bold style in the original shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have made long conversations. Such users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2), H(2), H(3), I(2), O(1) and F(2), H(2), I(1) show two conversations in thread "C".

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM that predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, extract relations among users and can be useful in interrogative threads as well.

In the future we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Feature | Dataset | P_edge | R_edge | P_path | R_path | P_R-path | R_R-path | Acc_CTD
Similarity | ENENews | -0.00233 | 0.07415 | -0.00538 | 0.03289 | -0.01095 | 0.08996 | 0.04684
Similarity | Russianblog | 0.00874 | 0.00012 | 0.00963 | 0.00347 | 0.00695 | -0.01079 | 0.00355
Authors' language | ENENews | -0.01354 | 0.03246 | -0.01785 | 0.00693 | -0.02757 | 0.04052 | 0.00458
Authors' language | Russianblog | -0.00229 | 0.01052 | -0.00179 | 0.00750 | -0.00590 | 0.00807 | -0.00182
Global similarity | ENENews | -0.00402 | 0.05604 | -0.01338 | 0.01581 | -0.01129 | 0.07086 | 0.02742
Global similarity | Russianblog | 0.01090 | 0.01443 | 0.00582 | 0.00897 | 0.00368 | -0.00317 | -0.00087
Frequent words | ENENews | 0.00252 | 0.00931 | 0.00170 | 0.00424 | -0.00132 | 0.00973 | 0.00292
Frequent words | Russianblog | 0.00008 | 0.00023 | 0.00016 | 0.00029 | 0.00034 | 0.00047 | -0.00011
Length ratio | ENENews | 0.04206 | -0.02984 | 0.04450 | 0.01700 | 0.06421 | -0.05336 | -0.00850
Length ratio | Russianblog | 0.00651 | -0.00355 | 0.00920 | 0.00080 | 0.00784 | -0.00592 | 0.00235
Authors' name | ENENews | 0.03753 | 0.06392 | 0.01920 | 0.03285 | 0.02341 | 0.05998 | 0.01788
Authors' name | Russianblog | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000
Frequent pattern | ENENews | -0.02028 | 0.02930 | -0.02357 | 0.00300 | -0.03793 | 0.03410 | 0.00526
Frequent pattern | Russianblog | 0.00458 | 0.06100 | 0.00830 | 0.05206 | -0.00545 | 0.06150 | 0.00404
Location prior | ENENews | 0.08778 | -0.03690 | 0.08556 | 0.03280 | 0.11792 | -0.07852 | 0.02925
Location prior | Russianblog | 0.16091 | 0.09671 | 0.17766 | 0.13515 | 0.16699 | 0.08793 | 0.02414
Candidate filtering rule 1 | ENENews | -0.05431 | 0.06138 | -0.04102 | 0.01296 | -0.06458 | 0.07545 | 0.06949
Candidate filtering rule 1 | Russianblog | -0.12506 | 0.34370 | -0.13916 | 0.27445 | -0.12683 | 0.37003 | 0.32546
Candidate filtering rule 2 | ENENews | 0.00014 | -0.00004 | 0.00048 | 0.00041 | -0.00038 | -0.00035 | -0.00009
Candidate filtering rule 2 | Russianblog | 0.02575 | 0.02475 | 0.02462 | 0.02341 | 0.01746 | 0.00802 | 0.00000
Candidate filtering rule 3 | ENENews | 0.00357 | -0.00121 | 0.00452 | 0.00254 | 0.00551 | -0.00161 | 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521 | -0.01978 | 0.09636 | 0.01702 | 0.09380 | -0.03825 | 0.07371

Since our goal is to provide a language-independent model for reconstructing the thread structure, we did not use text processing tools such as Wordnet and Named Entity Recognition (as they are not available with the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Computation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


be exploited to discover the tree structure of the threads [5, 10, 16].

The thread detection task is not necessary for declarative threads, since usually all comments are related to the root. Based on this, all comments can be considered as just one thread. Consequently, we focus only on the RTS task in this study.

2.2. RTS Task. There are very few studies in the literature that directly address the problem of RTS for comments in declarative threads. In general, the RTS task can be done either in a supervised or an unsupervised manner. In unsupervised methods, the relation between comments is weighted using text similarity measures. Then the weights are adjusted using other metrics such as time distance and position in chronological order. Finally, the relations whose weights are higher than a predefined threshold are selected as the parent-child relations [11, 17]. Lin et al. [18] proposed a method which computes the similarity between each post and the previous posts to find candidate parents. Then the post with the highest score is chosen as the parent candidate. If the parent candidate is not similar enough to the child comment, the comment is assumed to start a new discussion branch of the thread.
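To make the unsupervised strategy concrete, the following minimal Python sketch (illustrative only, not the method proposed in this paper) links each comment to its most similar predecessor and falls back to the root when no predecessor is similar enough; the similarity function and threshold are assumed to be given.

def unsupervised_rts(comments, similarity, threshold=0.1):
    """Link each comment to its most similar earlier comment, otherwise to the root.

    comments: list of comment texts in chronological order.
    Returns a dict mapping comment index -> parent index (-1 denotes the root).
    """
    parent = {}
    for i, comment in enumerate(comments):
        best_j, best_score = -1, 0.0
        for j in range(i):  # only earlier comments are candidate parents
            score = similarity(comment, comments[j])
            if score > best_score:
                best_j, best_score = j, score
        # not similar enough to any predecessor: start a new branch under the root
        parent[i] = best_j if best_score >= threshold else -1
    return parent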

In supervised methods, the existence of a relation between two messages is determined using a supervised learning algorithm [2, 4, 19]. In these methods, a set of features is defined and weighted using a training set. Then the trained model is used to discover the comment relations in the test data using the extracted features.

The methods proposed in [4, 19] are supervised, in which a set of simple features and a classifier are used. The features are divided into two groups. The first group is called structural or nontextual features, such as time information, reply distance, and author's name. The next group of features is called semantic or textual features, such as sentence type and similarity among comments.

Seo et al. [2] proposed a learning technique that exploits the hierarchical structure of replies in forums or emails. They introduced a structure discovery technique that uses a variety of features to model the relations among posts of the same thread. In fact, their method is the most similar to ours. However, we focus on blogs and news agencies, while they have worked on forums and emails. The existence of quoted texts in forums and emails makes the RTS task easier than in our case.

Wang et al. [3] proposed a probabilistic model, as a supervised structure learning problem, to predict the dependency among the posts within one thread on forums, based on general conditional random fields. Their method is based on various kinds of features. The features describe the interactions between both posts and authors. The weights for the designed features are estimated in a supervised manner. Similar to the previous work, Wang et al. [20] proposed a discriminative probabilistic model which can handle both local features and knowledge propagation rules.

The only existing work on the RTS task in declarative threads has been proposed by Schuth et al. [19]. This work focused on the RTS task in online news agency threads. They used several features to detect authors' names in the comments' text. Then the features are combined by a tree-learner algorithm, and eventually a classifier detects relations among comments and the root. Since there are many comments that do not refer to any author's name in our case, this method does not have a good accuracy.

There are also some related works on other types of data such as email data [6, 7, 21]. However, some specific features exist in those environments that are not applicable here. For instance, extra information about a message's recipients, like the "To/CC" tag in email data, or information about the affiliation of the message's author, like a signature, is exploited to improve the reconstruction of conversation threads in email data.

2.3. Thread Structure Analysis. Some works have been done on analysis of the social media emerging from the user comment activity [22, 23]. Message boards such as Slashdot and Reddit frequently publish short news posts and allow their readers to comment on them. The works proposed in [24, 25] focused on these threads. They explored the structure and topical hierarchies of comment threads to gain a deeper understanding of the users' behaviour, which allows these types of user-powered websites to operate better. Laniado et al. [26] analyzed the structural properties of the threads on Wikipedia pages to extract and study different kinds of interactions. Wu et al. [1] introduced a model to explain the human view and reply behaviors in forums, which is helpful for discovering collective patterns of human behavior. They found that the view and reply behaviours have the form of a power-law distribution [1, 27].

2.4. Detection of Initiation-Response Pairs. The works proposed in [13, 14] focused on detection of initiation-response pairs, such as the question-answer and assessment-agreement relationships. Initiation-response pairs are pairs of utterances in which the first pair part sets up an expectation for the second pair part [13]. Wang et al. [14] introduced a list of dialogue act labels for edges. A dialogue act label, such as answer-answer or answer-question, is assigned to each relation between two messages. Kim et al. [28] proposed a dialogue act tag set and a method for annotating post-to-post discourse structure on forums. They used three feature sets (structural features, post context features, and semantic features), and they experimented with three discriminative learners: SVM, HMM, and maximum entropy. Andreas et al. [29] introduced a new corpus of sentence-level agreement and disagreement annotations. Two sentences are in agreement if they state the same fact or opinion.

A list of dialogue act labels and an approach for modeling dialogue acts have been proposed for conversational speech [30–32]. Detection of dialogue act labels for each post is suitable for thread detection [5, 16] and for finding relevant answers in forums [8].

2.5. Automatic Meta-Information Extraction from HTML Pages. Information of threads, such as authors' names and content, is usually extracted by humans. This is time consuming. Some methods have been proposed to extract the main content and remove noisy information from a web page


automatically [33]. Hu et al. [34] proposed an algorithm to extract all meta-information of threads from different kinds of forums without human interaction. The algorithm consists of two steps: thread extraction and detailed information extraction from thread pages.

3. Problem Definition

In this section, some concepts about the RTS task are defined. A comment is an utterance written by a user, comprising one or several sentences. A commenter is a person who writes comments and replies to the root article or to other comments. In this paper, the root is defined as the starter of a conversation, which can be a news article or any other content written by an author or a journalist. Figure 1 shows a part of a real thread with 145 comments. In Figure 1, the root post is shown with label R, and edges denote the reply relations. In a thread, a reply comment (illustrated in Figure 1 by numbered labels) responds to previous comments or to the root item. For example, the nodes with labels 1 and 32 are reply comments to the root item, which is considered as their parent. The sequence of labels in Figure 1 is in chronological order. A thread is a sequence of comments that starts with the root item and contains a series of reply comments, usually related to the same topic as the root item. Each comment has a single parent, and all comments are descended from the root post in a thread [24, 29]. In other words, threads are considered a special case of human conversations [35] that consist of a set of comments contributed by two or more participants through the reply operation [36]. The candidate set of the ith comment is the set of comments that could be considered as the parent of the ith comment; it includes the comments which appear before the ith comment in chronological order. The starter discussion comment is a comment replying to the root item that has at least one child, like the comments with labels 1 and 32 in Figure 1.
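These definitions translate directly into a simple representation. The Python sketch below is illustrative only; the Comment class and the candidate_set helper are our own names, not part of the paper.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Comment:
    index: int                     # position in chronological order
    author: str
    text: str
    parent: Optional[int] = None   # index of the parent comment, or None if it replies to the root

def candidate_set(comments: List[Comment], i: int) -> List[Comment]:
    """Candidate parents of the ith comment: all comments published before it."""
    return comments[:i]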

The thread detection task means finding the clusters of comments that belong to the same topic in a given text stream without any previous knowledge about the number of threads. The reconstructing thread structure (RTS) task means reconstructing the reply structure of the comments in a thread. This leads to the construction of a tree-like reply structure [2–5] or a directed acyclic graph (DAG) [11, 17, 19]. Since the threads extracted from websites have a tree-like reply structure, they are usually modeled as a tree in most papers. Alternatively, the structure of threads can be modeled as a DAG. The information about DAG-like reply structures is mostly prepared by manually annotating the data, which is a time-consuming and difficult task. In addition, since a manually annotated dataset contains a small number of threads [11, 19], it cannot properly evaluate RTS algorithms. In this paper, we assume a tree-like reply structure.

Declarative and interrogative threads are different in essence:

(i) Users in declarative threads mostly express their opinions or sentiments about the root post informally, while in interrogative threads users mostly express their questions and answers in a more formal way.

(ii) In declarative threads, the topic of the root is most likely a news article or content reflecting real-world events, while in interrogative threads it is most likely a user's question.

(iii) There is meta-information, such as quotation, in interrogative threads that greatly improves the accuracy of the RTS task.

(iv) Comments in declarative threads usually wait for moderation to be published, and this usually takes some time. Moderators are not always online; they log in a few times per day, accept the submitted comments, and log out. Therefore, multiple comments appear nearly at the same time. So some features that are based on time distance [11] and position in chronological order perform well in interrogative threads but not in declarative threads.

Declarative and interrogative threads are technically different as well:

(i) Each comment can most likely be connected to its previous comment (1-Distance) in interrogative threads [4]; this simple heuristic leads to a great improvement in the accuracy of the RTS task [3, 4], since many posts are most likely written to answer the last question. However, this heuristic cannot perform well in declarative threads. According to Figure 2(a), users most likely reply to the first previous comment in the Apple discussion forum, although Figure 2(b) shows that users most likely reply to the root in the Thestandard online news agency. The 11th comment in Apple is the parent of the 12th comment with probability 0.49; however, in Thestandard the probability is equal to 0.137.

(ii) Figure 3 includes three real threads from the Thestandard dataset, each with 30 comments, where each comment is numbered and sorted in chronological order. Since users usually express their opinions or sentiments and reply to any comment regardless of submission time and position, the structure of the replies is not predictable.

(iii) The length of the root's text in declarative threads is usually larger than that of the comments; thus the length of the root or comments should be normalized in similarity measurements.

There are websites which are known as message boards, such as Digg, Reddit, and Slashdot. Message boards are valuable threads to analyze users' behavior [24, 25]. However, they are not suitable to be used as training data, since they have different designs for showing the comments. These differences change the behavior of users' replies; for example, Slashdot (http://slashdot.org) shows comments based on their scores. This causes the comments which have higher scores to get more replies. So we cannot create a general model appropriate for all message boards.


Figure 1: A part of a real thread from the Thestandard (http://www.thestandard.org.nz) online news website.

4. Method

In this section, we describe our supervised approach to the RTS task. First, we define some textual and nontextual features and learn a proper model to combine the features using a ranking SVM (Support Vector Machine). Then the model is employed in the reconstruction of the reply structure for threads in the test data.

4.1. Ranking SVM. SVM is a supervised learning method that is used for classification, regression, and ranking.

The implementation we use is the ranking SVM classifier (http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html) [2, 37]. The ranking SVM classifier learns to assign a weight to pairs of comments. The whole procedure for choosing the parent of the ith comment in a thread is described in Figure 4.

Since the purpose of the RTS task is to predict a tree-like reply structure in a thread, the RTS algorithm needs to find N reply relations among the comments and the root, where N is the number of comments in a thread. Thus the RTS algorithm needs O(N^2) pairwise comparisons.
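The per-thread parent selection loop described in Figure 4 can be sketched in Python as follows. This is only an illustrative sketch: feature_vector (computing the features of Section 4.2) and the learned weight vector w are assumed to be given, and candidate filtering (Section 4.2.9) is omitted.

def choose_parents(thread, w, feature_vector):
    """Pick a parent for each comment with a linear scoring model (ranking SVM weights).

    thread: list of comments in chronological order; index -1 stands for the root.
    w: learned feature weights.
    feature_vector(child, candidate): list of feature values, same length as w.
    Returns a dict mapping child index -> chosen parent index.
    """
    parent = {}
    for i in range(len(thread)):
        candidates = [-1] + list(range(i))       # the root plus all earlier comments
        scores = []
        for j in candidates:
            f = feature_vector(i, j)
            scores.append(sum(wk * fk for wk, fk in zip(w, f)))   # linear kernel
        parent[i] = candidates[scores.index(max(scores))]          # argmax over candidates
    return parent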


Figure 2: The probability of child-parent reply relations: (a) Apple discussion forum and (b) Thestandard online news agency.

Figure 3: Three real threads in which the nodes' labels are in chronological order.

4.2. Features. The ranking SVM needs some features to be defined. In this section, we introduce eleven textual and nontextual features. Their definition is one of the main contributions of our work. It should be mentioned that about 20 percent of the replies in the ground truth do not have any common word with their parent. The textual features therefore are not enough, and we need to define some nontextual features as well.

Figure 4: The RTS algorithm. For the ith comment, each candidate parent j (the root and comments 1 to i - 1) is scored with the learned model as A[j] = Sum_k model.weight[k] x feature[j][k], and the candidate with the highest score (argmax_j A[j]) is returned as the parent.

4.2.1. Similarity Measurement. In order to measure the similarity of sentences, we utilize a vector space model to represent the text body of comments. A comment's text can be considered as a vector of terms weighted by TF-IDF. TF (Term Frequency) is the number of times a word appears in a comment, and IDF (Inverse Document Frequency) is computed according to the following formula:

IDF(w) = log(N / n_w),  (1)

where N is the total number of comments in a news item and n_w is the number of comments that contain the word w. IDF is usually extracted from the whole training data, but we have limited it to the set of comments in a thread. We believe this makes it more accurate, for example, when a word has a low frequency in the whole training data but a high frequency in a thread.

In order to measure the similarity between two comments, the stop-words are deleted first, and then the words of the comments are stemmed by the Porter algorithm (for the English datasets). This step is not performed for the Russian and Persian datasets. The common words of the two comments are extracted, and their weights are calculated based on TF-IDF. Then the final score is obtained by aggregating the weights of the common words according to (2). Since the root text is usually longer than the comments, the log of the product of their lengths is used in the denominator:

Score(C1, C2) = Sum_{w in C-words} (1 + log(TF(w, C1) x TF(w, C2))) x IDF(w) x (log(|C1| x |C2|))^(-1),  (2)

where C-words is the set of words held in common between comments C1 and C2, and |C| is the length of the comment's text.
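A minimal Python sketch of equations (1) and (2) is given below; it assumes the comments are already tokenized, stop-word filtered, and stemmed, and the function names are ours.

import math
from collections import Counter

def idf(word, thread_comments):
    """Equation (1): IDF computed over the comments of a single thread."""
    n_w = sum(1 for c in thread_comments if word in c)
    return math.log(len(thread_comments) / n_w) if n_w else 0.0

def similarity_score(c1, c2, thread_comments):
    """Equation (2): TF-IDF weighted score over the words common to c1 and c2.

    c1, c2, and every element of thread_comments are lists of preprocessed tokens.
    """
    tf1, tf2 = Counter(c1), Counter(c2)
    prod = len(c1) * len(c2)
    norm = math.log(prod) if prod > 1 else 1.0
    score = 0.0
    for w in set(tf1) & set(tf2):
        score += (1 + math.log(tf1[w] * tf2[w])) * idf(w, thread_comments) / norm
    return score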

Comments are usually informal and contain typos. Some words in two comments might be the same, but due to spelling errors it is difficult to find this out. In order to solve this issue, we use the minimum edit distance (MED) algorithm. The minimum edit distance of two words is the minimum number of edit operations (insertion, deletion, substitution, and transposition) needed to transform one word into the other [14]. The costs of insertion, deletion, substitution, and transposition are 1, 1, 2, and 1, respectively.

Two words in different comments are considered as common words if either they are exactly the same or they seem to be the same but contain some typos. In the latter case, if the length of the words is bigger than five, their first two letters are the same, and their edit distance is lower than 4, the two words are considered as a common word. For example, the two words "Beautiful" and "Beuatiful" are considered as common words.
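The matching rule can be sketched in Python as follows; the edit-distance routine uses the stated operation costs, and we read the length condition as requiring both words to be longer than five characters.

def edit_distance(a, b):
    """Minimum edit distance with costs: insertion 1, deletion 1, substitution 2, transposition 1."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 2
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + sub)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)     # transposition
    return d[m][n]

def is_common_word(w1, w2):
    if w1 == w2:
        return True
    return (len(w1) > 5 and len(w2) > 5 and w1[:2] == w2[:2]
            and edit_distance(w1, w2) < 4)

# "Beautiful" and "Beuatiful" differ by one transposition, so they are matched.
assert is_common_word("Beautiful", "Beuatiful")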

4.2.2. Authors' Language Model. The idea is that commenters who talk to each other are more likely to use similar words in their comments. In order to take advantage of this feature, all comments that are related to a commenter are appended. This makes a collection of words for each commenter. If the collections of words of two commenters are very similar, they can be related to each other. In Figure 5, commenter "A1" wrote three comments. These three comments are appended and make a collection of words for commenter "A1"; then, like the first feature, the similarity is calculated between the collections of words of commenters "A1" and "A2". The similarity score obtained between two commenters "A1" and "A2" is considered as this feature's score for relations between their comments.
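As an illustrative Python sketch (reusing the similarity_score routine sketched above; the helper name is ours), the feature can be computed once per pair of commenters:

from collections import defaultdict

def author_language_scores(comments, thread_tokens):
    """Score every pair of commenters by the similarity of their concatenated comments.

    comments: list of (author, tokens) pairs for one thread.
    Returns a dict mapping frozenset({author_a, author_b}) -> similarity score.
    """
    bag = defaultdict(list)
    for author, tokens in comments:
        bag[author].extend(tokens)          # append all comments of the same commenter
    authors = list(bag)
    scores = {}
    for i, a in enumerate(authors):
        for b in authors[i + 1:]:
            scores[frozenset((a, b))] = similarity_score(bag[a], bag[b], thread_tokens)
    return scores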

4.2.3. Prior Location. The position of comments can reveal some information about their hierarchy; for example, the first comments usually have more children than the others, or comments which are located just before the ith comment are more likely to be its parent. In general, we would like to estimate p(i | j), that is, knowing that a comment is in position j, what is the likelihood that the comment in position i is its parent [2]. So we calculate prior probabilities of being the parent for the different positions.

To calculate the prior probability for position j, we count the number of times a comment in position i is the parent of the comment in position j in the training set. Figure 6 shows the prior probability for comments in positions 1 to 100 in the Thestandard dataset. The highest prior probability belongs to the root and then to the comments which are located just before the comment. The sample points in Figure 6 show five parent positions (the root, 10, 30, 57, and 59) and how probable it is for the 60th comment to be a child of them; the root has the highest prior probability, which is equal to 0.1852, followed by the 59th comment, which has a probability of 0.1236. Also, Figure 7 shows the prior probability of child-parent relations from comments 40 to 60.
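A Python sketch of how the prior table can be estimated from the training threads (the helper name and data layout are ours):

from collections import defaultdict

def location_prior(training_threads):
    """Estimate p(parent position = i | child position = j) from gold reply structures.

    training_threads: list of dicts mapping child position -> parent position,
    where position 0 is the root and positions 1..N are comments in chronological order.
    """
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for thread in training_threads:
        for child, parent in thread.items():
            counts[child][parent] += 1
            totals[child] += 1
    return {j: {i: c / totals[j] for i, c in parents.items()}
            for j, parents in counts.items()}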

4.2.4. Reference to Authors' Names. If the name of an author appears in a comment, then all his/her comments are considered as potential candidates for being the parent of that comment [19]. Sometimes only a part of the complete name is referenced. Sometimes the author's name is made up of two parts, and both of these parts could be used by other authors for reference. We also consider these types of references. We hold each part of the author's name, and the parts which are stop-words are removed.

4.2.5. Global Similarity. This feature is based on ranking the similarity of the comments' texts and the root's text, which gives a global view of the text similarity. According to this feature, if comment "A" has the most similarity with comment "B" and, inversely, comment "B" has the most similarity with comment "A" among the other candidates, it is more likely that there is a relation between comment "A" and comment "B". To relax this feature, the similarity measurement is calculated for each comment with respect to all comments; then the comments are sorted based on their similarity scores. For example, in Figure 8 we are looking for the parent of the fifth comment. In the first step, according to formula (2), comments are sorted based on the text similarity score with respect to the fifth comment. Comments that do not belong to the candidate set of the fifth comment are removed (the removed comments are shown in black in Figure 8). In the second step, in the same way as the first step, comments are sorted with respect to each candidate of the fifth comment, and comments which do not belong to the candidate set of the fifth comment are removed, except the fifth comment itself. Finally, formula (3) is used to calculate the Ranking-distance score. Two comments which are the most similar to each other get a higher Ranking-distance score. In Figure 8, the highest score belongs to the relation with the fourth comment. This feature is symmetric, and the similarity between comments is calculated only one time. In other words, this feature needs O(N^2) time complexity for the pairwise text similarity comparisons:

Ranking-distance(C1, C2) = (|C1 & C2| + |C2 & C1|)^(-1),  (3)

where C1 and C2 are two comments and |C1 & C2| is the text-similarity rank distance between C1 and C2 (the position of C2 in the candidate list of C1, sorted by similarity).
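An illustrative Python sketch of the ranking-distance computation (the names are ours, similarity is any pairwise text similarity such as equation (2), and our reading of the candidate pool used in the second ranking step is an assumption):

def ranking_distance_scores(i, items, similarity, candidates_of):
    """Equation (3): score each candidate j of comment i by the two mutual similarity ranks."""
    cands = candidates_of(i)
    # rank of each candidate from the point of view of comment i (rank 1 = most similar)
    by_i = sorted(cands, key=lambda c: similarity(items[i], items[c]), reverse=True)
    scores = {}
    for j in cands:
        # rank of comment i from the point of view of candidate j, restricted to
        # the candidate set of i (without j itself) plus comment i
        pool = [c for c in cands if c != j] + [i]
        by_j = sorted(pool, key=lambda c: similarity(items[j], items[c]), reverse=True)
        scores[j] = 1.0 / ((by_i.index(j) + 1) + (by_j.index(i) + 1))
    return scores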

4.2.6. Frequent Patterns of Author Appearance. The idea is that the comments of two commenters who talk to each other usually appear close together in chronological order. So if their comments appear many times just after each other, this feature gives a high score to their comments. In order to implement this feature, we use the following formula:

FPScore(i, j) = f(A_i, A_j) + f(A_j, A_i),  (4)

where A_i is the author of comment i and f(a, b) is the number of times comments of author a appear just before comments of author b (see Pseudocode 1).

Figure 9 shows a time line which includes 7 comments written by 4 commenters. The feature score is calculated for the relations among the comments. The score of the relation between the comments of A and B is 3, which is more than that between A and C.

4.2.7. Length Ratio. The length of the parent text is usually longer than that of its children. The length ratio score is calculated according to

length ratio score(C_C, C_P) = |C_P| / |C_C|,  (5)

where C_C is the comment looking for a parent and C_P is a candidate parent.

4.2.8. Frequent Words in Parent-Child Comments. Sometimes a pair of words appears frequently, with one word in the parent and the other word in its children. For example, in the ENENews dataset the most frequent pairs are (believe, people), (accident, fukushima), (new, discovered), (idea, report), and (people, public). We use pointwise mutual information (PMI) [38] to find the frequent patterns:

Score(W1, W2) = Count(W1, W2) / (Count(W1) * Count(W2)),  (6)

where W1 is a word in the comment whose parent we are looking for, W2 is a word in its candidate parent, and Count(W1, W2) is the number of times W1 has appeared in the child and W2 has appeared in its parent. The numerator computes how often the two words appear together, and the denominator computes how often each of the words appears on its own. Finally, according to Pseudocode 2, the score of the relation between two comments is calculated.
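A Python sketch of how the table of equation (6) could be built from the training parent-child pairs and then used to score a candidate relation (the names are ours, and the scoring loop mirrors Pseudocode 2):

from collections import Counter

def build_word_pair_table(child_parent_pairs):
    """child_parent_pairs: list of (child_tokens, parent_tokens) from the training set."""
    pair_count, child_count, parent_count = Counter(), Counter(), Counter()
    for child, parent in child_parent_pairs:
        for w1 in set(child):
            child_count[w1] += 1
            for w2 in set(parent):
                pair_count[(w1, w2)] += 1
        for w2 in set(parent):
            parent_count[w2] += 1
    # equation (6): how often the pair occurs relative to the individual word counts
    return {(w1, w2): c / (child_count[w1] * parent_count[w2])
            for (w1, w2), c in pair_count.items()}

def frequent_words_score(comment, candidate, table):
    """Sum the table scores over all word pairs, as in Pseudocode 2."""
    return sum(table.get((w1, w2), 0.0) for w1 in comment for w2 in candidate)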


Figure 5: The authors' language feature, based on author "A1" and author "A2".

Figure 6: The prior probability of child-parent relations for comments 1 to 100 in Thestandard.

Figure 7: The prior probability of child-parent relations for comments 40 to 60.

Figure 8: An example of the global similarity feature.


Step 1:
For i = 1 to Comments - 1
    if Author(i) != Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments                       // for each comment looking for its parent
    For j = 0 to i - 1                      // for each candidate
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Pseudocode for calculating the frequent patterns of author appearance.

Figure 9: An example of an author appearance sequence (time line A, B, C, A, B, A, D), with relation counts A to B = 2, B to C = 1, C to A = 1, B to A = 1, A to D = 1, giving Frequent(A, B) = 3 and Frequent(A, C) = 1.

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates (a sketch follows this list):

(1) Usually a commenter does not reply to the root in his/her second comment. So if a commenter has written more than one comment, the root is removed from the parent candidates of the second and later comments. This was shown to be a useful heuristic, because the root is an important candidate, so if we can remove it correctly, the results improve significantly.

(2) A commenter does not reply to him/herself. So we simply remove all comments of a commenter from his/her comments' candidates [2].

(3) Commenters who post only one comment in a thread are more likely to reply to the root post. So the other candidates can be removed, except the root.
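The three heuristics can be sketched in Python as a filter over the candidate set (an illustrative sketch with our own names; author_of maps a comment index to its commenter, comment_counts counts the comments of each commenter in the thread, and index -1 denotes the root):

def filter_candidates(i, candidates, author_of, comment_counts):
    """Apply the three candidate-filtering rules to the candidate parents of comment i."""
    me = author_of(i)
    if comment_counts[me] == 1:
        return [-1]      # rule 3: single-comment commenters most likely reply to the root
    is_first = all(author_of(j) != me for j in range(i))
    filtered = []
    for c in candidates:
        if c == -1 and not is_first:
            continue     # rule 1: later comments of a multi-comment commenter rarely reply to the root
        if c != -1 and author_of(c) == me:
            continue     # rule 2: a commenter does not reply to him/herself
        filtered.append(c)
    return filtered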

5. Experiments

In this section, we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies, established in August 2007 and 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English and Alef is in Persian. Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively; they were established in 2003 and 2012.

The reasons for selecting the mentioned websites are as follows: (1) they support multilevel replies; (2) their users are active and articles usually get many comments; (3) the author and time of comments are available; (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each (Word1 in comment)
        For each (Word2 in candidate-comment)
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: Pseudocode for calculating the frequent-words score between two comments.

Table 1: The dataset-specific properties.

Datasets     | Language | Confirmation by moderator | Users register
Thestandard  | English  | yes                       | no
Alef         | Persian  | yes                       | no
ENENews      | English  | no                        | yes
Russianblog  | Russian  | no                        | yes
Courantblogs | English  | yes                       | no

For each website, we crawled the webpages that were published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as ground truth. We have removed the threads with fewer than 4 comments, because these kinds of threads do not give us much information and usually their comments reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets users can leave comments in a thread with different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The length of comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews the root usually includes a tweet, which is why the length of the root's text is shorter than in the other datasets. All comments have an author's name, except for some comments in Alef. Therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments in the articles. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard the threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape; articles are published between 7 am and 7 pm, and the highest rate belongs to the 4-hour period between 9 am and 1 pm. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10 am and 6 pm.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog have similar behavior: about one hour after the article is published, it has received most of its comments. After that, the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16-21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after publication of the main article. Deeper comments show longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations continue around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10-20 words. Except for the comments of Russianblog, the datasets are similar. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has comments that are shorter than in the other datasets; this dataset is a personal blog and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (directly below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results, we use several metrics. For edge prediction, we use the precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree. This causes the precision, recall, and F-score values to be equal [2], since FP (False Positive) and FN (False Negative) in precision and recall are always equal.


Table 2: Datasets' statistics, split up per website.

Datasets     | Avg. length of comments' text (in words) | Avg. length of the root's text (in words) | Number of comments | Avg. comments per news | Number of news | Number of commenters
Thestandard  | 72.98 | 357.68 | 149058 | 49.81 | 2992 | 3431
Alef         | 74.66 | 766.55 | 23210  | 27.96 | 830  | 4387
ENENews      | 68.19 | 162.64 | 16661  | 38.92 | 428  | 831
Courantblogs | 79.24 | 398.17 | 4952   | 19.19 | 258  | 1716
Russianblog  | 18.56 | 237.16 | 30877  | 16.18 | 1908 | 1258

Figure 10: Distribution of the number of comments per news article.

Figure 11: Publication rate of articles per hour.

Figure 12: Average number of published comments per hour after the root.


Figure 13: Time difference between a node and its parent, in hours.

Figure 14: Depth of comments, averaged over the elapsed time from publication of the root.

Instead, we use the following edge accuracy measure:

Accuracy (Acc)_edge = |reply relations ∩ detected relations| / N,  (7)

where the tree is comprised of N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers the paths from the nodes to the root:

Accuracy (Acc)_path = (Sum_{i=1}^{|C|} |path_G(i) = path_P(i)|) / |C|,  (8)

where path_G(i) is the ground-truth path for the ith comment and path_P(i) is the predicted path for it; |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it to be completely wrong, so it is very strict. To relax this, Wang et al. introduced

a metric that computes the overlap between the paths in the ground truth and the predicted paths:

Precision (P)_R-path = (Sum_{i=1}^{|C|} |path_G(i) ∩ path_P(i)| / |path_P(i)|) / |C|,

Recall (R)_R-path = (Sum_{i=1}^{|C|} |path_G(i) ∩ path_P(i)| / |path_G(i)|) / |C|,  (9)

where |path_P(i)| is the number of comments in the predicted path of the ith comment and |path_G(i)| is the number of comments in the ground-truth path of the ith comment. The F1-score is the harmonic mean of precision and recall:

F1-score = 2 x (Precision x Recall) / (Precision + Recall).  (10)

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or main content, which is different from the root of interrogative threads. This causes the structure of threads and reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words.

Figure 16: Histogram of the depth of comments.

(1) comment-root, that is, the relation between comments and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views about the main article (root). The comment-comment relations show conversation among visitors, which is a valuable source of information. In Figure 17, there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to detect both types of relations. So we have to modify the evaluation metrics or define new metrics.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comments Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

Accuracy (Acc)_CTD = (TP + TN) / (TP + TN + FP + FN),  (11)

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

Precision (P)_edge = TP / (TP + FP),

Recall (R)_edge = TP / (TP + FN),

F1-score (F)_edge = 2 x (Precision x Recall) / (Precision + Recall),  (12)

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of comment-comment relations in the ground truth that were not predicted at all.
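A Python sketch of Acc_CTD and the edge metrics of equations (11) and (12), given the gold and predicted parent assignments of one thread (the function and variable names are ours):

def declarative_edge_metrics(gold_parent, pred_parent, root=-1):
    """gold_parent, pred_parent: dicts mapping comment id -> parent id (the root is denoted by root)."""
    tp = fp = fn = correct_type = 0
    for c, gold in gold_parent.items():
        pred = pred_parent.get(c)
        if (gold == root) == (pred == root):
            correct_type += 1                  # NP vs PD type detected correctly
        if gold != root:                       # gold comment-comment relation
            if pred == gold:
                tp += 1
            else:
                fn += 1                        # missed comment-comment relation
        if pred != root and pred != gold:
            fp += 1                            # wrongly predicted comment-comment relation
    acc_ctd = correct_type / len(gold_parent)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return acc_ctd, p, r, f1

On the ground-truth and predicted threads of Figure 18, this reproduces the values reported in Table 4: Acc_CTD of about 0.91, P_edge = 0.5, R_edge = 0.6, and F1_edge of about 0.55.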


Figure 17: An illustration of a declarative thread, with comment-root relations, comment-comment relations, and starter discussion comments.

The path accuracy metric mentioned earlier is modified into P_path and R_path versions which are appropriate for the declarative platform. These metrics consider the discussion paths from each PD comment to the starter discussion comment, but not to the root:

Precision (P)_path = (Sum_{i=1}^{|C|_P} |path_G(i) = path_P(i)|) / |C|_P, or 1 if |C|_P = 0,

Recall (R)_path = (Sum_{i=1}^{|C|_R} |path_G(i) = path_P(i)|) / |C|_R, or 1 if |C|_R = 0,  (13)

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread. path_G(i) is the discussion path from the ith node to the discussion starter comment in the ground truth, and path_P(i) is the discussion path from the ith node to the discussion starter comment in the predicted thread.

Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

Precision (P)_R-path = (Sum_{i=1}^{|C|_P} |path_G(i) ∩ path_P(i)| / |path_P(i)|) / |C|_P, or 1 if |C|_P = 0,

Recall (R)_R-path = (Sum_{i=1}^{|C|_R} |path_G(i) ∩ path_P(i)| / |path_G(i)|) / |C|_R, or 1 if |C|_R = 0.  (14)

Figure 18 shows two threads, of which one is the ground truth and the other one is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3, the predicted thread of Figure 18 is evaluated by the metrics for interrogative threads. The results show high values. Table 4 shows the results of evaluation by the metrics for declarative threads. The results show that the declarative metrics give a more appropriate picture.

There are two reasons why the declarative metrics better evaluate the predicted structure: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance obtained when we simply link all comments to their previous comment (we name it the Last baseline) [3, 4]. This method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).

The only work that has been done on RTS has focused on the author's name in online news agencies [19]. Schuth et al.'s method used several features to find the commenter's name in online news articles. This method achieves good precision, but it has low recall, because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method on interrogative platforms.

Seo et al.'s method has focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets. Therefore, we use all their proposed features except the quoted text.

We have applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which lots of replies are connected to the root.

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.


Table 3: Example of the calculation of the interrogative metrics for the example shown in Figure 18.

Accuracy_edge: Reply relations = {1-R, 2-R, 7-R, 9-R, 10-R, 11-R, 3-1, 4-3, 5-2, 6-5, 8-5}; Detected relations = {1-R, 2-R, 9-R, 10-R, 11-R, 3-1, 4-1, 5-2, 6-5, 7-2, 8-7}; 8/11, approximately 0.73.

Acc_path: |{1-R, 2-R, 9-R, 10-R, 11-R, 3-1-R, 5-2-R, 6-5-2-R}| / (number of comments) = 8/11, approximately 0.73.

P_R-path: [(1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11, approximately 0.92.

R_R-path: [(1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (2/3 + 1 + 2/3)] / 11 = 10.34/11, approximately 0.94.

F1-score_R-path: 2 x 0.92 x 0.94 / (0.92 + 0.94), approximately 0.93.


Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of the calculation of the declarative metrics for the example shown in Figure 18.

Acc_CTD: TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}; (5 + 5) / (5 + 5 + 1 + 0), approximately 0.91.

P_edge, R_edge, F1_edge: TP = {3-1, 5-2, 6-5}, FN = {4-3, 8-5}, FP = {4-1, 7-2, 8-7}; P_edge = 3/(3 + 3) = 0.5; R_edge = 3/(3 + 2) = 0.6; F1-score_edge approximately 0.55.

P_path, R_path, F1_path: P_path = |{3-1, 5-2, 6-5-2}| / |{3-1, 4-1, 5-2, 6-5-2, 7-2, 8-7-2}| = 3/6 = 0.5; R_path = |{3-1, 5-2, 6-5-2}| / |{3-1, 4-3-1, 5-2, 6-5-2, 8-5-2}| = 3/5 = 0.6; F1-score_path approximately 0.55.

P_R-path, R_R-path, F1_R-path: P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2) / 6 = 4.5/6 = 0.75; R_R-path = (1 + 1/2 + 1 + 1 + 1/2) / 5 = 4/5 = 0.8; F1-score_R-path = 2 x 0.75 x 0.8 / (0.75 + 0.8), approximately 0.77.

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef, because more comments are linked directly to the root. Usually SLARTS and Seo et al.'s methods cannot predict the full path in Thestandard and ENENews, because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, interrogative evaluation metrics are not appropriate for declarative threads because, based on these metrics, First shows high performance although this baseline does not detect any comment-comment relations.

Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge, SLARTS performs better than Seo et al.'s method. However, for P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations



Figure 19: Three real threads in which nodes are labeled according to the commenter's name and the number of his/her comments.

(like the First baseline). On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So, Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. On the other hand, P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than those of Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex than in the other datasets (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets. Note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog. In these datasets, many of the defined features perform well (the importance of features in the ENENews and Russianblog datasets will be shown in Table 7, and we will explain more in the next section).


Table 5: Experimental results with interrogative evaluation metrics.

Dataset | Method | Acc_edge | Acc_path | P_R-path | R_R-path | F1_R-path
Thestandard | SLARTS | 0.4669±0.009 | 0.3059±0.013 | 0.6927±0.018 | 0.7450±0.010 | 0.7179±0.013
Thestandard | Seo et al.'s [2] | 0.3941±0.005 | 0.3413±0.006 | 0.8303±0.009 | 0.6563±0.005 | 0.7331±0.003
Thestandard | First | 0.3630±0.010 | 0.3630±0.010 | 1.0000±0.000 | 0.5846±0.009 | 0.7379±0.007
Thestandard | Last | 0.1824±0.006 | 0.0495±0.004 | 0.2214±0.008 | 1.0000±0.000 | 0.3625±0.010
ENENews | SLARTS | 0.4837±0.018 | 0.3695±0.027 | 0.7625±0.026 | 0.7770±0.013 | 0.7697±0.017
ENENews | Seo et al.'s [2] | 0.4661±0.017 | 0.4134±0.021 | 0.8720±0.011 | 0.7086±0.013 | 0.7818±0.011
ENENews | First | 0.4524±0.018 | 0.4524±0.018 | 1.0000±0.000 | 0.6634±0.016 | 0.7976±0.011
ENENews | Last | 0.2090±0.016 | 0.0596±0.003 | 0.2319±0.011 | 1.0000±0.000 | 0.3764±0.015
Courantblogs | SLARTS | 0.6843±0.028 | 0.6453±0.033 | 0.9127±0.013 | 0.8619±0.014 | 0.8865±0.013
Courantblogs | Seo et al.'s [2] | 0.6700±0.040 | 0.6628±0.043 | 0.9667±0.014 | 0.8250±0.018 | 0.8902±0.015
Courantblogs | First | 0.6490±0.036 | 0.6490±0.036 | 1.0000±0.000 | 0.8001±0.019 | 0.8889±0.012
Courantblogs | Last | 0.2037±0.017 | 0.1082±0.008 | 0.3183±0.011 | 1.0000±0.000 | 0.4829±0.012
Russianblog | SLARTS | 0.7463±0.017 | 0.6904±0.017 | 0.8741±0.008 | 0.8787±0.006 | 0.8764±0.007
Russianblog | Seo et al.'s [2] | 0.5446±0.010 | 0.5174±0.011 | 0.9293±0.003 | 0.7821±0.004 | 0.8494±0.003
Russianblog | First | 0.4968±0.006 | 0.4968±0.006 | 1.0000±0.000 | 0.7134±0.004 | 0.8327±0.003
Russianblog | Last | 0.3632±0.018 | 0.1549±0.012 | 0.3695±0.015 | 1.0000±0.000 | 0.5395±0.016
Alef | SLARTS | 0.7753±0.017 | 0.7641±0.018 | 0.9579±0.006 | 0.9033±0.008 | 0.9298±0.007
Alef | Seo et al.'s [2] | 0.7802±0.019 | 0.7800±0.019 | 0.9950±0.003 | 0.8828±0.010 | 0.9355±0.006
Alef | First | 0.7853±0.018 | 0.7853±0.018 | 1.0000±0.000 | 0.8826±0.010 | 0.9376±0.006
Alef | Last | 0.0993±0.005 | 0.0822±0.005 | 0.2526±0.012 | 1.0000±0.000 | 0.4032±0.015

The bold style shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes the Acc_edge of the Last baseline in Russianblog to be equal to 0.3632, which is higher than in the other datasets (Table 5). The Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier. In other words, as the parent appears closer to its child, some features, such as frequent pattern and location prior, that are based on the position of comments in chronological order work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., by adding candidate filtering), the AccCTD of SLARTS is better than Seo et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst result is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet. According to Table 2, this makes the average length of the root's text shorter than in the other datasets, and this makes the textual features perform poorly on detecting comment-root relations.

Confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute P-values to specify the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their scores and selects the first candidate from the list, only the first candidate is important, so p@1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section, we evaluate the role of the introduced features. We use backward feature selection; that is, to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
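A minimal sketch of this backward (leave-one-feature-out) evaluation is shown below; it is our illustration, not the paper's code, and the function train_and_score is a hypothetical placeholder for the training/evaluation pipeline described above.

# Sketch of backward feature selection (ablation). train_and_score(features)
# is assumed to train the ranking SVM with the given feature subset and
# return a dict of evaluation metrics (e.g., P_edge, R_edge, AccCTD).
ALL_FEATURES = ["similarity", "authors_language", "global_similarity",
                "frequent_words", "length_ratio", "authors_name",
                "frequent_pattern", "location_prior",
                "filtering_rule_1", "filtering_rule_2", "filtering_rule_3"]

def feature_importance(train_and_score):
    baseline = train_and_score(ALL_FEATURES)          # all features present
    importance = {}
    for feature in ALL_FEATURES:
        reduced = [f for f in ALL_FEATURES if f != feature]
        scores = train_and_score(reduced)             # feature removed
        # A positive difference means the feature helps that metric.
        importance[feature] = {m: baseline[m] - scores[m] for m in baseline}
    return importance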

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, as they most likely tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features most likely tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is known as a morphologically rich language and the length of comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low. To increase the performance of textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on AccCTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature has a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog because the author's name in Russianblog is different from the other datasets: it is an email address, and no one refers to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and has a good improvement on AccCTD. According to P_edge, the best improvement is 0.16091 for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and has a good improvement on AccCTD. The maximum improvement for R_edge is 0.34 for Russianblog.

Candidate filtering rule 2 has small improvements on the precision and recall of the evaluation metrics in both datasets. The maximum improvement is gained with Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, detection of comment-root relations is improved. Also, this rule improves AccCTD.

6. Information Extraction by Visualizing Thread Structure

In this section, we discuss the information which can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and their number of posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in this conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics.

Dataset | Method | P_edge | R_edge | F_edge | P_path | R_path | F_path | P_R-path | R_R-path | F_R-path | AccCTD
Thestandard | SLARTS | 0.3782±0.008 | 0.4155±0.011 | 0.3960±0.006 | 0.1565±0.007 | 0.1713±0.008 | 0.1636±0.007 | 0.3628±0.012 | 0.4166±0.016 | 0.3876±0.005 | 0.7525±0.005
Thestandard | Seo et al.'s [2] | 0.3014±0.031 | 0.1414±0.015 | 0.1921±0.016 | 0.1544±0.025 | 0.0576±0.011 | 0.0835±0.013 | 0.3013±0.038 | 0.1319±0.018 | 0.1831±0.021 | 0.5831±0.007
Thestandard | Difference | 0.0778 | 0.2741 | 0.2039 | 0.0021 | 0.1137 | 0.0801 | 0.0615 | 0.2847 | 0.2045 | 0.1694
ENENews | SLARTS | 0.3797±0.038 | 0.3891±0.019 | 0.3839±0.024 | 0.1873±0.040 | 0.1855±0.026 | 0.1857±0.030 | 0.3768±0.048 | 0.3768±0.017 | 0.3759±0.025 | 0.6872±0.014
ENENews | Seo et al.'s [2] | 0.4060±0.030 | 0.1713±0.016 | 0.2407±0.018 | 0.2378±0.034 | 0.0837±0.016 | 0.1235±0.021 | 0.3947±0.035 | 0.1523±0.015 | 0.2196±0.019 | 0.5954±0.009
ENENews | Difference | -0.0263 | 0.2178 | 0.1432 | -0.0505 | 0.1018 | 0.0622 | -0.0179 | 0.2245 | 0.1563 | 0.0918
Courantblogs | SLARTS | 0.5236±0.030 | 0.3963±0.073 | 0.4495±0.052 | 0.4220±0.057 | 0.3110±0.058 | 0.3566±0.049 | 0.5325±0.049 | 0.3853±0.075 | 0.4452±0.057 | 0.7625±0.033
Courantblogs | Seo et al.'s [2] | 0.7320±0.090 | 0.2049±0.062 | 0.3189±0.082 | 0.6815±0.105 | 0.1896±0.070 | 0.2951±0.093 | 0.7309±0.094 | 0.2010±0.065 | 0.3140±0.086 | 0.7001±0.031
Courantblogs | Difference | -0.2084 | 0.1914 | 0.1306 | -0.2595 | 0.1214 | 0.0615 | -0.1984 | 0.1843 | 0.1312 | 0.0624
Russianblog | SLARTS | 0.5940±0.028 | 0.5674±0.022 | 0.5803±0.023 | 0.4907±0.031 | 0.4664±0.020 | 0.4782±0.024 | 0.5878±0.028 | 0.5823±0.021 | 0.5849±0.020 | 0.9116±0.011
Russianblog | Seo et al.'s [2] | 0.4705±0.020 | 0.3030±0.015 | 0.3685±0.016 | 0.3823±0.023 | 0.2534±0.013 | 0.3046±0.015 | 0.4682±0.020 | 0.2822±0.014 | 0.3521±0.015 | 0.5862±0.013
Russianblog | Difference | 0.1235 | 0.2644 | 0.2118 | 0.1084 | 0.2130 | 0.1736 | 0.1196 | 0.3001 | 0.2328 | 0.3254
Alef | SLARTS | 0.6189±0.028 | 0.4131±0.040 | 0.4952±0.037 | 0.5639±0.027 | 0.3819±0.037 | 0.4551±0.034 | 0.6277±0.027 | 0.4069±0.043 | 0.4934±0.039 | 0.8044±0.013
Alef | Seo et al.'s [2] | 0.8827±0.052 | 0.2637±0.051 | 0.4045±0.060 | 0.8797±0.053 | 0.2631±0.051 | 0.4034±0.060 | 0.8840±0.052 | 0.2635±0.051 | 0.4043±0.060 | 0.7846±0.018
Alef | Difference | -0.2638 | 0.1494 | 0.0907 | -0.3158 | 0.1188 | 0.0517 | -0.2563 | 0.1434 | 0.0891 | 0.0198

The bold style shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than other users. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have made a long conversation. These users are known as a hub or a starter in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, {F(2), H(2), H(3), I(2), O(1)} and {F(2), H(2), I(1)} show two conversations in thread "C".

7. Conclusion and Future Work

In this paper, we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of this. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine these features.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method has higher accuracy than the baselines on five datasets. Also, we showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can be useful in interrogative threads as well.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of features.

Feature | Dataset | P_edge | R_edge | P_path | R_path | P_R-path | R_R-path | AccCTD
Similarity | ENENews | -0.00233 | 0.07415 | -0.00538 | 0.03289 | -0.01095 | 0.08996 | 0.04684
Similarity | Russianblog | 0.00874 | 0.00012 | 0.00963 | 0.00347 | 0.00695 | -0.01079 | 0.00355
Authors' language | ENENews | -0.01354 | 0.03246 | -0.01785 | 0.00693 | -0.02757 | 0.04052 | 0.00458
Authors' language | Russianblog | -0.00229 | 0.01052 | -0.00179 | 0.00750 | -0.00590 | 0.00807 | -0.00182
Global similarity | ENENews | -0.00402 | 0.05604 | -0.01338 | 0.01581 | -0.01129 | 0.07086 | 0.02742
Global similarity | Russianblog | 0.01090 | 0.01443 | 0.00582 | 0.00897 | 0.00368 | -0.00317 | -0.00087
Frequent words | ENENews | 0.00252 | 0.00931 | 0.00170 | 0.00424 | -0.00132 | 0.00973 | 0.00292
Frequent words | Russianblog | 0.00008 | 0.00023 | 0.00016 | 0.00029 | 0.00034 | 0.00047 | -0.00011
Length ratio | ENENews | 0.04206 | -0.02984 | 0.04450 | 0.01700 | 0.06421 | -0.05336 | -0.00850
Length ratio | Russianblog | 0.00651 | -0.00355 | 0.00920 | 0.00080 | 0.00784 | -0.00592 | 0.00235
Authors' name | ENENews | 0.03753 | 0.06392 | 0.01920 | 0.03285 | 0.02341 | 0.05998 | 0.01788
Authors' name | Russianblog | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000
Frequent pattern | ENENews | -0.02028 | 0.02930 | -0.02357 | 0.00300 | -0.03793 | 0.03410 | 0.00526
Frequent pattern | Russianblog | 0.00458 | 0.06100 | 0.00830 | 0.05206 | -0.00545 | 0.06150 | 0.00404
Location prior | ENENews | 0.08778 | -0.03690 | 0.08556 | 0.03280 | 0.11792 | -0.07852 | 0.02925
Location prior | Russianblog | 0.16091 | 0.09671 | 0.17766 | 0.13515 | 0.16699 | 0.08793 | 0.02414
Candidate filtering rule 1 | ENENews | -0.05431 | 0.06138 | -0.04102 | 0.01296 | -0.06458 | 0.07545 | 0.06949
Candidate filtering rule 1 | Russianblog | -0.12506 | 0.34370 | -0.13916 | 0.27445 | -0.12683 | 0.37003 | 0.32546
Candidate filtering rule 2 | ENENews | 0.00014 | -0.00004 | 0.00048 | 0.00041 | -0.00038 | -0.00035 | -0.00009
Candidate filtering rule 2 | Russianblog | 0.02575 | 0.02475 | 0.02462 | 0.02341 | 0.01746 | 0.00802 | 0.00000
Candidate filtering rule 3 | ENENews | 0.00357 | -0.00121 | 0.00452 | 0.00254 | 0.00551 | -0.00161 | 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521 | -0.01978 | 0.09636 | 0.01702 | 0.09380 | -0.03825 | 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as Wordnet and named entity recognition (as they are not available in the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of the Web Information System Engineering Conference (WISE '11), pp. 265–272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


automatically [33]. Hu et al. [34] proposed an algorithm to extract all meta-information of threads from different kinds of forums without human interaction. The algorithm consists of two steps: thread extraction and detailed information extraction from thread pages.

3. Problem Definition

In this section, some concepts related to the RTS task are defined. A comment is an utterance written by a user, comprising one or several sentences. A commenter is a person who writes comments and replies to the root article or to other comments. In this paper, the root is defined as the starter of a conversation, which can be a news article or any other content written by an author or a journalist. Figure 1 shows a part of a real thread with 145 comments. In Figure 1, the root post is shown with label R, and edges denote the reply relations. In a thread, a reply comment (illustrated in Figure 1 by numbered labels) responds to a previous comment or to the root item. For example, the nodes with labels 1 and 32 are reply comments to the root item, which is considered as their parent. The sequences of labels in Figure 1 are in chronological order. A thread is a sequence of comments that starts with the root item and contains a series of reply comments which are usually related to the same topic as the root item. Each comment has a single parent, and all comments are descended from the root post in a thread [24, 29]. In other words, threads are considered as a special case of human conversations [35] that consist of a set of comments contributed by two or more participants through the reply operation [36]. The candidate set of the i-th comment is the set of comments that could be considered as the parent of the i-th comment and includes the comments which appear before the i-th comment in chronological order. A starter discussion comment is a comment replying to the root item that has at least one child, like the comments with labels 1 and 32 in Figure 1.
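As an illustration of these definitions (not part of the original paper), a thread can be represented minimally as a chronological list of comments with parent indices; the hypothetical Python sketch below shows one possible encoding and how the candidate set of the i-th comment would follow from it.

# A minimal, hypothetical encoding of a thread: comments are stored in
# chronological order; parent = -1 denotes a reply to the root item R.
from dataclasses import dataclass

@dataclass
class Comment:
    author: str
    text: str
    parent: int  # index of the parent comment, or -1 for the root

def candidate_set(comments, i):
    """Candidates for the parent of the i-th comment: the root (-1) plus
    every comment that appears before it in chronological order."""
    return [-1] + list(range(i))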

The thread detection task means finding the clusters of comments that belong to the same topic in a given text stream, without any previous knowledge about the number of threads. The reconstructing thread structure (RTS) task means reconstructing the reply structure of the comments in a thread. This leads to the construction of a tree-like reply structure [2-5] or a directed acyclic graph (DAG) [11, 17, 19]. Since the threads extracted from websites have a tree-like reply structure, they are usually modeled as a tree in most papers. The structure of threads can also be modeled as a DAG. The information about DAG-like reply structures is mostly prepared by manually annotating the data, which is a time-consuming and difficult task. In addition, since manually annotated datasets contain a small number of threads [11, 19], they cannot properly evaluate RTS algorithms. In this paper, we assume a tree-like reply structure.

Declarative and interrogative threads are different in essence:

(i) Users in declarative threads mostly express their opinions or sentiments about the root post informally, while in interrogative threads users mostly express their questions and answers in a more formal way.

(ii) In declarative threads, the topic of the root is most likely a news article or content reflecting real-world events, while in interrogative threads it is most likely a user's question.

(iii) There is meta-information, such as quotation, in interrogative threads that greatly improves the accuracy of the RTS task.

(iv) Comments in declarative threads usually wait for moderation to be published, which usually takes some time. Moderators are not always online; they log in a few times per day, accept the submitted comments, and log out. Therefore, multiple comments appear nearly at the same time. So, some features that are based on time distance [11] and position in chronological order have good performance in interrogative threads but are not good in declarative threads.

Declarative and interrogative threads are technically different as well:

(i) Each comment can most likely be connected to its previous comment (1-Distance) in interrogative threads [4]; this simple heuristic leads to a great improvement in the accuracy of the RTS task [3, 4], since many posts are most likely written to answer the last question. However, this heuristic does not perform well in declarative threads. According to Figure 2(a), users most likely reply to the immediately previous comment in the Apple discussion forum, whereas Figure 2(b) shows that users most likely reply to the root in the Thestandard online news agency. The 11th comment in Apple is the parent of the 12th comment with probability 0.49; however, in Thestandard this probability is equal to 0.137.

(ii) Figure 3 includes three real threads from the Thestandard dataset, each with 30 comments, where each comment is numbered and sorted in chronological order. Since users usually express their opinions or sentiments and reply to any comment regardless of the submission time and position of comments, the structure of the replies is not predictable.

(iii) The length of the root's text in declarative threads is usually larger than that of the comments; thus, the length of the root or comments should be normalized in similarity measurements.

There are websites which are known as message boards, such as Digg, Reddit, and Slashdot. Message boards are valuable threads for analyzing users' behavior [24, 25]. However, they are not suitable to be used as training data, since they have different designs for showing the comments. These differences change the behavior of users' replies; for example, Slashdot (http://slashdot.org) shows comments based on their scores, which causes the comments with higher scores to get more replies. So we cannot create a general model appropriate for all message boards.

Figure 1: A part of a real thread from Thestandard (http://www.thestandard.org.nz) online news website.

4. Method

In this section, we describe our supervised approach to the RTS task. First, we define some textual and nontextual features and learn a proper model to combine the features using a ranking SVM (Support Vector Machine). Then, the model is employed in the reconstruction of the reply structure for the threads in the test data.

4.1. Ranking SVM. SVM is a supervised learning method that is used for classification, regression, and ranking. The implementation we use is the ranking SVM classifier (http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html) [2, 37]. The ranking SVM classifier learns to assign a weight to pairs of comments. The whole procedure for choosing the parent of the i-th comment in a thread is described in Figure 4.

Since the purpose of the RTS task is to predict a tree-like reply structure in a thread, the RTS algorithm needs to find N reply relations among the comments and the root, where N is the number of comments in a thread. Thus, the RTS algorithm needs O(N^2) pairwise comparisons.
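The following sketch is our illustration of how, at prediction time, a linear ranking model of this kind would score the candidates of the i-th comment and pick the parent, mirroring the procedure of Figure 4; `weights` and `feature_vector` are assumed placeholders for the learned weight vector and the feature extraction, respectively.

# Hedged sketch (not the paper's implementation): score each candidate parent
# of comment i with a learned linear model and return the best-scoring one.
def predict_parent(thread, i, weights, feature_vector):
    """The root is denoted by -1; the other candidates are comments 0..i-1.
    feature_vector(thread, i, j) returns the feature values of pair (i, j)."""
    best, best_score = None, float("-inf")
    for j in [-1] + list(range(i)):
        score = sum(w * f for w, f in zip(weights, feature_vector(thread, i, j)))
        if score > best_score:
            best, best_score = j, score
    return best  # predicted parent of comment i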

Figure 2: The probability of child-parent reply relations: (a) the Apple discussion forum and (b) the Thestandard online news agency.

Figure 3: Three real threads, in which the nodes' labels are in chronological order.

4.2. Features. The ranking SVM needs some features to be defined. In this section, we introduce eleven textual and nontextual features. Their definition is one of the main contributions of our work. It should be mentioned that about 20 percent of replies in the ground truth do not have any word in common with their parent. The textual features are therefore not enough, and we need to define some nontextual features as well.

Figure 4: The RTS algorithm. For the i-th comment of a test thread, the candidate set (the root R and comments 1 to i-1) is ranked with the model learned by SVM-rank on the training set; for each candidate j, a score A[j] = sum over features k of model.weight[k] x feature[j][k] is computed with a linear kernel, and the candidate with the maximum score is returned as the parent's label.

4.2.1. Similarity Measurement. In order to measure the similarity of sentences, we utilize a vector space model to

represent the text body of comments. A comment's text can be considered as a vector of terms weighted by TF-IDF. TF (term frequency) is the number of times a word appears in a comment, and IDF (inverse document frequency) is computed according to the following formula:

IDF(w) = log(N / n_w), (1)

where N is the total number of comments in a news item and n_w is the number of comments that contain the word w. IDF is usually extracted from the whole training data, but we have limited it to the set of comments in a thread. We believe this makes it more accurate, for example, when a word has a low frequency in the whole training data but a high frequency in a thread.

In order to measure the similarity between two comments, the stop-words are deleted first, and then the words of the comments are stemmed by the Porter algorithm (for the English datasets). This step is not performed for the Russian and Persian datasets. The common words of the two comments are extracted, and their weights are calculated based on TF-IDF. Then the final score is obtained by aggregating the weights of the common words according to (2). Since the root text is usually longer than the comments, the log of the product of their lengths is used in the denominator:

Score(C1, C2) = Σ_{w ∈ C-words} (1 + log(TF(w, C1) × TF(w, C2))) × IDF(w) × (log(|C1| × |C2|))^(-1), (2)

where C-words is the set of words held in common between comments C1 and C2 and |C| is the length of the comment's text.
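A minimal sketch of this scoring, assuming the comments have already been tokenized, stop-word-filtered, and stemmed (the helper names are illustrative, not from the paper):

import math
from collections import Counter

def thread_idf(thread_tokens):
    """Formula (1): IDF computed over the comments of a single thread."""
    n = len(thread_tokens)
    df = Counter(w for tokens in thread_tokens for w in set(tokens))
    return {w: math.log(n / df_w) for w, df_w in df.items()}

def similarity_score(c1_tokens, c2_tokens, idf):
    """Formula (2): aggregate TF-IDF weights of common words, normalized by
    the log of the product of the two comment lengths."""
    tf1, tf2 = Counter(c1_tokens), Counter(c2_tokens)
    common = set(tf1) & set(tf2)                       # the C-words
    norm = math.log(len(c1_tokens) * len(c2_tokens)) or 1.0  # guard log(1) = 0
    return sum((1 + math.log(tf1[w] * tf2[w])) * idf[w] for w in common) / norm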

Comments are usually informal and contain typos. Some words in two comments might be the same, but due to spelling errors it is difficult to find this out. In order to solve this issue, we use the minimum edit distance (MED) algorithm. The minimum edit distance of two words is the minimum number of edit operations (insertion, deletion, substitution, and transposition) needed to transform one word into another [14]. The costs of insertion, deletion, substitution, and transposition are 1, 1, 2, and 1, respectively.

Two words in different comments are considered as common words if either they are exactly the same or they seem to be the same but contain some typos. In the latter case, if the length of the words is bigger than five, their first two letters are the same, and their edit distance is lower than 4, the two words are considered as common words. For example, the words "Beautiful" and "Beuatiful" are considered as common words.
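A sketch of this matching rule, using a Damerau-Levenshtein distance with the stated operation costs (our illustration, not the paper's code):

def edit_distance(a, b):
    """Weighted edit distance: insertion 1, deletion 1, substitution 2,
    transposition 1."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 2
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def is_common_word(w1, w2):
    if w1 == w2:
        return True
    return (len(w1) > 5 and len(w2) > 5 and w1[:2] == w2[:2]
            and edit_distance(w1, w2) < 4)

For example, is_common_word("Beautiful", "Beuatiful") returns True, since one transposition (cost 1) turns one word into the other.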

4.2.2. Authors' Language Model. The idea is that commenters who talk to each other are more likely to use similar words in their comments. In order to take advantage of this feature, all comments written by a commenter are appended, which makes a collection of words for each commenter. If the collections of words of two commenters are very similar, they can be related to each other. In Figure 5, commenter "A1" wrote three comments. These three comments are appended and make a collection of words for commenter "A1"; then, as with the first feature, the similarity is calculated between the collections of words of commenters "A1" and "A2". The similarity score obtained between two commenters "A1" and "A2" is considered as this feature's score for relations between their comments.

4.2.3. Prior Location. The position of comments can reveal some information about their hierarchy; for example, the first comments usually have more children than the others, or the comments located just before the i-th comment are more likely to be its parent. In general, we would like to estimate p(i | j), that is, knowing that a comment is in position j, the likelihood that the comment in position i is its parent [2]. So we calculate prior probabilities for being the parent of different positions.

To calculate the prior probability for j, we count the number of times a comment in position i is the parent of the comment in position j in the training set. Figure 6 shows the prior probabilities for comments in positions 1 to 100 in the Thestandard dataset. The highest prior probability belongs to the root and then to the comments located just before the comment. The sample points in Figure 6 show five positions, namely, the root, 10, 30, 57, and 59, and how probable it is for the 60th comment to be a child of them: the root has the highest prior probability, which is equal to 0.1852, and then the 59th comment, which has a probability of 0.1236. Also, Figure 7 shows the prior probability of child-parent relations from comments 40 to 60.
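A sketch of how these priors could be estimated from the training threads (our illustration; each thread is assumed to be given as a list of parent indices in chronological order, with -1 for the root):

from collections import defaultdict

def location_priors(threads):
    """Estimate p(parent position i | child position j) by counting
    child-parent position pairs in the training threads (position 0 = root)."""
    counts = defaultdict(lambda: defaultdict(int))
    for parents in threads:                 # parents[j] = -1 means the root
        for j, i in enumerate(parents):
            counts[j + 1][i + 1] += 1       # shift so the root is position 0
    priors = {}
    for j, row in counts.items():
        total = sum(row.values())
        priors[j] = {i: c / total for i, c in row.items()}
    return priors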

4.2.4. Reference to Authors' Name. If the name of an author appears in a comment, then all his/her comments are considered as potential candidates for being the parent of that comment [19]. Sometimes only a part of the complete name is referenced. Sometimes the author's name is made up of two parts, and both of these parts could be used by other authors for reference; we also consider these types of references. We keep each part of the author's name, and the parts which are stop-words are removed.

4.2.5. Global Similarity. This feature is based on ranking the similarity of the comments' text and the root's text, which takes a global view of the text similarity. According to this feature, if comment "A" is most similar to comment "B" and, inversely, comment "B" is most similar to comment "A" among the other candidates, it is more likely that there is a relation between comment "A" and comment "B". To relax this feature, the similarity measurement is calculated for each comment with respect to all comments, and then the comments are sorted based on their similarity scores. For example, in Figure 8 we are looking for the parent of the fifth comment. In the first step, according to formula (2), comments are sorted based on their text similarity score with respect to the fifth comment; comments that do not belong to the candidate set of the fifth comment are removed (the removed comments are shown in black in Figure 8). In the second step, in the same way as the first step, comments are sorted with respect to each candidate of the fifth comment, and comments which do not belong to the candidate set of the fifth comment are removed, except the fifth comment itself. Finally, formula (3) is used to calculate the ranking-distance score. Two comments which are the most similar to each other get a higher ranking-distance score. In Figure 8, the highest score belongs to the relation with the fourth comment. This feature is symmetric, and the similarity among comments is only calculated once; in other words, this feature needs O(N^2) time complexity for the text similarity pairwise comparisons:

Ranking-distance(C1, C2) = (|C1 & C2| + |C2 & C1|)^(-1), (3)

where C1 and C2 are two comments and |C1 & C2| is the text similarity ranking distance between C1 and C2.
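The following sketch is our interpretation of formula (3), taking the ranking distance as the rank of one comment in the other's similarity-sorted candidate list; it is illustrative only:

def rank_of(c_from, c_to, candidates, sim):
    """1-based position of c_to when candidates are sorted by decreasing
    text similarity to c_from; sim is the similarity of formula (2)."""
    ordered = sorted(candidates, key=lambda c: sim(c_from, c), reverse=True)
    return ordered.index(c_to) + 1

def ranking_distance(c1, c2, candidates1, candidates2, sim):
    # Formula (3): comments that rank each other highly get a larger score.
    return 1.0 / (rank_of(c1, c2, candidates1, sim) + rank_of(c2, c1, candidates2, sim))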

4.2.6. Frequent Patterns of Authors' Appearance. The idea is that the comments of two commenters who talk to each other usually appear close together in chronological order. So, if their comments appear many times just after each other, this feature gives a high score to their comments. In order to implement this feature, we use the following formula:

FPScore(i, j) = f(A_i, A_j) + f(A_j, A_i), (4)

where A_i is the author of comment i and f(a, b) is the number of times comments of author a appear just before comments of author b (see Pseudocode 1).

Figure 9 shows a time line which includes 7 comments written by 4 commenters. The feature score is calculated for the relations among comments. The score of the relation between the comments of authors A and B is 3, which is higher than that of A and C.

4.2.7. Length Ratio. The text of the parent is usually longer than that of its children. The length ratio score is calculated according to

length ratio score(C_C, C_P) = |C_P| / |C_C|, (5)

where C_C is the comment looking for a parent and C_P is a candidate parent.

4.2.8. Frequent Words in Parent-Child Comments. Sometimes a pair of words appears frequently, one word in the parent and the other word in its children. For example, in the ENENews dataset the most frequent pairs are (believe, people), (accident, fukushima), (new, discovered), (idea, report), and (people, public). We use pointwise mutual information (PMI) [38] to find the frequent pairs:

Score(W1, W2) = Count(W1, W2) / (Count(W1) * Count(W2)), (6)

where W1 is a word in the comment whose parent we are looking for, W2 is a word in its candidate parent, and Count(W1, W2) is the number of times W1 has appeared in the parent and W2 has appeared in the child. The numerator computes how often the two words appear together, and the denominator computes how often each of the words appears. Finally, according to Pseudocode 2, the score of the relation between two comments is calculated.

Figure 5: The authors' language feature, based on author "A1" and author "A2".

Figure 6: The prior probability of child-parent relations from comments 1 to 100 in Thestandard.

Figure 7: The prior probability of child-parent relations from comments 40 to 60.

Figure 8: An example of the global similarity feature.


Step 1:
For i = 1 to Comments - 1
    if Author(i) ≠ Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments            // for each comment which is looking for its parent
    For j = 0 to i - 1           // for each candidate
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Pseudocode to calculate the frequent patterns of authors' appearance.

Figure 9: An example of the authors' appearance sequence (time line: A, B, C, A, B, A, D; Frequent(A, B) = 3, Frequent(A, C) = 1).

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates (a small sketch of these rules follows the list below).

(1) Usually, a commenter does not reply to the root in his/her second comment. So, if a commenter has written more than one comment, the root is removed from the parent candidates of his/her second and subsequent comments. This was shown to be a useful rule: because the root is an important candidate, removing it correctly improves the results significantly.

(2) A commenter does not reply to himself/herself. So we simply remove all comments of a commenter from his/her comment's candidates [2].

(3) Commenters who post only one comment in a thread are more likely to reply to the root post. So the other candidates can be removed, except the root.
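A sketch of the three filtering rules (our illustration; the thread is given as the chronological list of commenter names, and rule 3 is applied here by looking at the whole thread, which is an assumption on our part):

def filter_candidates(authors, i):
    """Candidate filtering for comment i; -1 denotes the root."""
    candidates = [-1] + list(range(i))
    # Rule 2: a commenter does not reply to him/herself.
    candidates = [j for j in candidates if j == -1 or authors[j] != authors[i]]
    # Rule 1: from a commenter's second comment on, the root is removed.
    if authors[:i].count(authors[i]) >= 1:
        candidates = [j for j in candidates if j != -1]
    # Rule 3: a commenter with a single comment in the thread is assumed
    # to reply to the root.
    if authors.count(authors[i]) == 1:
        candidates = [-1]
    return candidates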

5. Experiments

In this section, we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies established in August 2007 and in 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English and Alef is in Persian. Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively, established in 2003 and 2012.

The reasons for selecting the mentioned websites are as follows: (1) they support multilevel replies; (2) their users are active and articles usually get many comments; (3) the author and time of comments are available; (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each (Word1 in comment)
        For each (Word2 in candidate-comment)
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: Pseudocode for calculating the frequent-words score between two comments.

Table 1: The dataset-specific properties.

Dataset | Language | Confirmation by moderator | Users register
Thestandard | English | ✓ | ×
Alef | Persian | ✓ | ×
ENENews | English | × | ✓
Russianblog | Russian | × | ✓
Courantblogs | English | ✓ | ×

For each website, we crawled the webpages that were published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as ground truth. We removed the threads with fewer than 4 comments because these kinds of threads do not give us much information and their comments usually reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets, users can leave comments in a thread under different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The length of comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews, the root usually includes a tweet, which is why the length of the root's text is shorter than in the other datasets. All comments have an author's name except for some comments in Alef; therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments in the articles. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard, threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape: articles are published between 7 a.m. and 7 p.m., and the highest rate belongs to the 4-hour period between 9 a.m. and 1 p.m. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10 a.m. and 6 p.m.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog behave similarly: about one hour after the article is published, it has received most of its comments; after that, the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16-21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after the publication of the main article. Deeper comments indicate longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations are continued around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10–20 words. Except for the comments of Russianblog, the datasets are similar. This shows that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has comments that are shorter than in the other datasets; this dataset is a personal blog, and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (directly below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning their conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results, we use several metrics. For edge prediction, we use the precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree. This causes the precision, recall, and F-score values to be equal [2], since FP (False Positive) and FN (False Negative)


Table 2: Datasets' statistics split up per website.

Datasets | Avg. length of comments' text (in words) | Avg. length of the root's text (in words) | Number of comments | Avg. comments per news | Number of news | Number of commenters
Thestandard | 72.98 | 357.68 | 149,058 | 49.81 | 2,992 | 3,431
Alef | 74.66 | 766.55 | 23,210 | 27.96 | 830 | 4,387
ENENews | 68.19 | 162.64 | 16,661 | 38.92 | 428 | 831
Courantblogs | 79.24 | 398.17 | 4,952 | 19.19 | 258 | 1,716
Russianblog | 18.56 | 237.16 | 30,877 | 16.18 | 1,908 | 1,258

Figure 10: Distribution of the number of comments in news articles (x-axis: number of comments; y-axis: number of news).

Figure 11: Publication rate of articles per hour (x-axis: hour of day; y-axis: normalized news count).

Figure 12: Average number of published comments per hour after the root (x-axis: elapsed time after publishing the root, in hours; y-axis: normalized comment count).


Figure 13: Time difference between a node and its corresponding parent, per hour (x-axis: time difference in hours; y-axis: normalized comment count).

Figure 14: Depth of comments averaged over the elapsed time from the publication of the root (x-axis: elapsed time in hours; y-axis: average comment depth).

in precision and recall are always equal. Instead, we use the following edge accuracy measure:

\[ \text{Accuracy (Acc)}_{\text{edge}} = \frac{\left| \text{reply relations} \cap \text{detected relations} \right|}{N}, \quad (7) \]

where the tree is comprised of N comments and one root. The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers the paths from nodes to the root:

\[ \text{Accuracy (Acc)}_{\text{path}} = \frac{\sum_{i=1}^{|C|} \left| \text{path}_{G}(i) = \text{path}_{P}(i) \right|}{|C|}, \quad (8) \]

where path_G(i) is the ground-truth path for the i-th comment and path_P(i) is the predicted path for it; |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it to be completely wrong, so it is very strict. To relax this, Wang et al. introduced a metric that computes the overlap between the path in the ground truth and the predicted path:

\[ \text{Precision } (P)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left( \left| \text{path}_{G}(i) \cap \text{path}_{P}(i) \right| / \left| \text{path}_{P}(i) \right| \right)}{|C|}, \]
\[ \text{Recall } (R)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left( \left| \text{path}_{G}(i) \cap \text{path}_{P}(i) \right| / \left| \text{path}_{G}(i) \right| \right)}{|C|}, \quad (9) \]

where |path_P(i)| is the number of comments in the predicted path of the i-th comment and |path_G(i)| is the number of comments in its ground-truth path. The F1-score is the harmonic mean of precision and recall:

\[ F_{1}\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \quad (10) \]

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or main content, which is different from the root of interrogative threads. This causes the structure of threads and the reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words (x-axis: length of comments in words; y-axis: normalized comment count).

Figure 16: Histogram of the depth of comments (x-axis: depth; y-axis: normalized comment count).

(1) comment-root, that is, the relation between a comment and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views around the main article (root). The comment-comment relations show conversation among visitors, which is a valuable source of information. In Figure 17, there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to detect both types of relations, so we have to modify the evaluation metrics or define new metrics.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comment Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

\[ \text{Accuracy (Acc)}_{\text{CTD}} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}}, \quad (11) \]

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

\[ \text{Precision } (P)_{\text{edge}} = \frac{\text{TP}}{\text{TP} + \text{FP}}, \qquad \text{Recall } (R)_{\text{edge}} = \frac{\text{TP}}{\text{TP} + \text{FN}}, \]
\[ F_{1}\text{-score } (F)_{\text{edge}} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \quad (12) \]

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of


Figure 17: An illustration of a declarative thread, showing comment-root relations, comment-comment relations, and discussion-starter comments.

comment-comment relations in the ground truth that were not predicted at all.

The path accuracy metric mentioned earlier is modified into P_path and R_path versions appropriate for the declarative platform. These metrics consider the discussion path from each PD comment to the discussion-starter comment, but not to the root:

\[ \text{Precision } (P)_{\text{path}} = \begin{cases} 1, & \text{if } |C|_{P} = 0 \\ \dfrac{\sum_{i=1}^{|C|_{P}} \left| \text{path}_{G}(i) = \text{path}_{P}(i) \right|}{|C|_{P}}, & \text{otherwise}, \end{cases} \]
\[ \text{Recall } (R)_{\text{path}} = \begin{cases} 1, & \text{if } |C|_{R} = 0 \\ \dfrac{\sum_{i=1}^{|C|_{R}} \left| \text{path}_{G}(i) = \text{path}_{P}(i) \right|}{|C|_{R}}, & \text{otherwise}, \end{cases} \quad (13) \]

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread; path_G(i) is the discussion path from the i-th node to the discussion-starter comment in the ground truth, and path_P(i) is the discussion path from the i-th node to the discussion-starter comment in the predicted thread. Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

\[ \text{Precision } (P)_{R\text{-path}} = \begin{cases} 1, & \text{if } |C|_{P} = 0 \\ \dfrac{\sum_{i=1}^{|C|_{P}} \left( \left| \text{path}_{G}(i) \cap \text{path}_{P}(i) \right| / \left| \text{path}_{P}(i) \right| \right)}{|C|_{P}}, & \text{otherwise}, \end{cases} \]
\[ \text{Recall } (R)_{R\text{-path}} = \begin{cases} 1, & \text{if } |C|_{R} = 0 \\ \dfrac{\sum_{i=1}^{|C|_{R}} \left( \left| \text{path}_{G}(i) \cap \text{path}_{P}(i) \right| / \left| \text{path}_{G}(i) \right| \right)}{|C|_{R}}, & \text{otherwise}. \end{cases} \quad (14) \]
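The comment type detection accuracy of (11) and the edge measures of (12) can likewise be computed from the parent assignments. The sketch below is only an illustration under the same parent-dictionary representation as before; the worked example uses the relations of the Figure 18 example as listed in Tables 3 and 4.

```python
def declarative_edge_metrics(gold, pred):
    """gold, pred: dict comment -> parent ('R' = root).
    Returns Acc_CTD (11) and P/R/F1 over comment-comment relations (12)."""
    # Comment type detection: NP = reply to the root, PD = part of a discussion.
    tp = sum(1 for c in gold if gold[c] == 'R' and pred[c] == 'R')   # NP detected as NP
    tn = sum(1 for c in gold if gold[c] != 'R' and pred[c] != 'R')   # PD detected as PD
    fn = sum(1 for c in gold if gold[c] == 'R' and pred[c] != 'R')
    fp = sum(1 for c in gold if gold[c] != 'R' and pred[c] == 'R')
    acc_ctd = (tp + tn) / (tp + tn + fp + fn)

    gold_cc = {(c, p) for c, p in gold.items() if p != 'R'}   # comment-comment edges
    pred_cc = {(c, p) for c, p in pred.items() if p != 'R'}
    hit = len(gold_cc & pred_cc)
    prec = hit / len(pred_cc) if pred_cc else 1.0
    rec = hit / len(gold_cc) if gold_cc else 1.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc_ctd, prec, rec, f1

# The thread of Figure 18 (relations as listed in Tables 3 and 4):
gold = {1: 'R', 2: 'R', 7: 'R', 9: 'R', 10: 'R', 11: 'R', 3: 1, 4: 3, 5: 2, 6: 5, 8: 5}
pred = {1: 'R', 2: 'R', 9: 'R', 10: 'R', 11: 'R', 3: 1, 4: 1, 5: 2, 6: 5, 7: 2, 8: 7}
print(declarative_edge_metrics(gold, pred))   # about (0.91, 0.50, 0.60, 0.55), as in Table 4
```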

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3, the predicted thread in Figure 18 has been evaluated by the metrics for interrogative threads; the results show high values. Table 4 shows the results of evaluation by the metrics for declarative threads. The results show that the declarative metrics give a more appropriate assessment.

There are two reasons why the declarative metrics evaluate the predicted structure better. (1) The root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics show good results. (2) The interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance when we simply link each comment to its previous comment (we name it the Last baseline) [3, 4]; this method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).
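Both baselines are trivial to state; the sketch below is a minimal illustration using the same parent-dictionary representation as in Section 5.2 (the function names are ours).

```python
def last_baseline(num_comments):
    """Link every comment to the immediately preceding comment (the first one to the root)."""
    return {i: (i - 1 if i > 1 else 'R') for i in range(1, num_comments + 1)}

def first_baseline(num_comments):
    """Link every comment directly to the root article."""
    return {i: 'R' for i in range(1, num_comments + 1)}
```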

The only work that has been done on RTS in declarative threads has focused on the author's name in online news agencies [19]. Schuth et al.'s method uses several features to find the commenter's name in online news articles. This method achieves good precision, but it has low recall, because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method for interrogative platforms.

Seo et al.'s method focuses on forums and emails and uses the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all their proposed features except the quoted text.

We applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which lots of replies are connected to the root.

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.


Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Evaluation metric | Calculation
Accuracy_edge | Reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}; Detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}; 8/11 ≈ 0.73
Acc_path | |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / number of comments = 8/11 ≈ 0.73
P_R-path | [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / number of comments = [(1+1+1+1+1+1+1+1) + (1 + 1/2 + 2/3)] / 11 ≈ 10.17/11 ≈ 0.92
R_R-path | [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / number of comments = [(1+1+1+1+1+1+1+1) + (2/3 + 1 + 2/3)] / 11 ≈ 10.33/11 ≈ 0.94
F1-score_R-path | 2 × 0.92 × 0.94 / (0.92 + 0.94) ≈ 0.93


Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Evaluation metric | Calculation
Acc_CTD | TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}; (5 + 5) / (5 + 5 + 1 + 0) ≈ 0.91
P_edge, R_edge, F1_edge | TP = {3→1, 5→2, 6→5}; FN = {4→3, 8→5}; FP = {4→1, 7→2, 8→7}; P_edge = 3/(3 + 3) = 0.5; R_edge = 3/(3 + 2) = 0.6; F1-score_edge ≈ 0.55
P_path, R_path, F1_path | P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5; R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6; F1-score_path ≈ 0.55
P_R-path, R_R-path, F1_R-path | P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2) / |{3, 4, 5, 6, 7, 8}| = 4.5/6 = 0.75; R_R-path = (1 + 1/2 + 1 + 1 + 1/2) / |{3, 4, 5, 6, 8}| = 4/5 = 0.8; F1-score_R-path = 2 × 0.75 × 0.8 / (0.75 + 0.8) ≈ 0.77

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef, because more comments are linked directly to the root. Usually, SLARTS and Seo et al.'s methods cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads, because based on these metrics First shows high performance although this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge, SLARTS performs better than Seo et al.'s method. However, for P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that, when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations


Figure 19: Three real threads in which nodes are labeled with the commenter's name and the number of his/her comments.

(like the First baseline). On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline). So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations, whereas P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than those of Seo et al.'s method under the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is larger than in the other datasets (Table 2) and that their paths are much longer and more complex (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets. Note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets, many of the defined features perform well. (The importance of the features in the ENENews and Russianblog datasets is shown in Table 7 and explained in the next section.)


Table 5: Experimental results with interrogative evaluation metrics.

Datasets | Methods | Acc_edge | Acc_path | P_R-path | R_R-path | F1_R-path
Thestandard | SLARTS | 0.4669±0.009 | 0.3059±0.013 | 0.6927±0.018 | 0.7450±0.010 | 0.7179±0.013
Thestandard | Seo et al.'s [2] | 0.3941±0.005 | 0.3413±0.006 | 0.8303±0.009 | 0.6563±0.005 | 0.7331±0.003
Thestandard | First | 0.3630±0.010 | 0.3630±0.010 | 1.0000±0.000 | 0.5846±0.009 | 0.7379±0.007
Thestandard | Last | 0.1824±0.006 | 0.0495±0.004 | 0.2214±0.008 | 1.0000±0.000 | 0.3625±0.010
ENENews | SLARTS | 0.4837±0.018 | 0.3695±0.027 | 0.7625±0.026 | 0.7770±0.013 | 0.7697±0.017
ENENews | Seo et al.'s [2] | 0.4661±0.017 | 0.4134±0.021 | 0.8720±0.011 | 0.7086±0.013 | 0.7818±0.011
ENENews | First | 0.4524±0.018 | 0.4524±0.018 | 1.0000±0.000 | 0.6634±0.016 | 0.7976±0.011
ENENews | Last | 0.2090±0.016 | 0.0596±0.003 | 0.2319±0.011 | 1.0000±0.000 | 0.3764±0.015
Courantblogs | SLARTS | 0.6843±0.028 | 0.6453±0.033 | 0.9127±0.013 | 0.8619±0.014 | 0.8865±0.013
Courantblogs | Seo et al.'s [2] | 0.6700±0.040 | 0.6628±0.043 | 0.9667±0.014 | 0.8250±0.018 | 0.8902±0.015
Courantblogs | First | 0.6490±0.036 | 0.6490±0.036 | 1.0000±0.000 | 0.8001±0.019 | 0.8889±0.012
Courantblogs | Last | 0.2037±0.017 | 0.1082±0.008 | 0.3183±0.011 | 1.0000±0.000 | 0.4829±0.012
Russianblog | SLARTS | 0.7463±0.017 | 0.6904±0.017 | 0.8741±0.008 | 0.8787±0.006 | 0.8764±0.007
Russianblog | Seo et al.'s [2] | 0.5446±0.010 | 0.5174±0.011 | 0.9293±0.003 | 0.7821±0.004 | 0.8494±0.003
Russianblog | First | 0.4968±0.006 | 0.4968±0.006 | 1.0000±0.000 | 0.7134±0.004 | 0.8327±0.003
Russianblog | Last | 0.3632±0.018 | 0.1549±0.012 | 0.3695±0.015 | 1.0000±0.000 | 0.5395±0.016
Alef | SLARTS | 0.7753±0.017 | 0.7641±0.018 | 0.9579±0.006 | 0.9033±0.008 | 0.9298±0.007
Alef | Seo et al.'s [2] | 0.7802±0.019 | 0.7800±0.019 | 0.9950±0.003 | 0.8828±0.010 | 0.9355±0.006
Alef | First | 0.7853±0.018 | 0.7853±0.018 | 1.0000±0.000 | 0.8826±0.010 | 0.9376±0.006
Alef | Last | 0.0993±0.005 | 0.0822±0.005 | 0.2526±0.012 | 1.0000±0.000 | 0.4032±0.015

The bold style shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be equal to 0.3632, which is more than in the other datasets (Table 5); Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier. In other words, as the parent appears closer to its child, some features, such as frequent patterns and location prior, that are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., by adding candidate filtering), Acc_CTD of SLARTS is better than that of Seo et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst result is 0.69 for ENENews. The root in the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, which makes the textual features perform poorly on detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are larger than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute P values to specify the significance of the differences between SLARTS and Seo et al.'s method on the declarative metrics. Since the ranking SVM ranks candidates based on their scores and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section, we evaluate the role of the introduced features. We use backward feature selection; that is, to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
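This backward analysis amounts to retraining the model once per left-out feature and recording the drop in each metric. The following sketch only illustrates the loop; train_and_evaluate is a hypothetical placeholder for the full SLARTS training and evaluation pipeline and is not part of the original code.

```python
def feature_ablation(all_features, train_and_evaluate):
    """Backward feature analysis: the importance of a feature is the drop in each
    metric when the model is retrained without it.
    `train_and_evaluate(features)` is assumed to return a dict of metric values."""
    baseline = train_and_evaluate(all_features)
    importance = {}
    for feat in all_features:
        reduced = [f for f in all_features if f != feat]
        scores = train_and_evaluate(reduced)
        # A positive difference means the feature helps that metric (cf. Table 7).
        importance[feat] = {m: baseline[m] - scores[m] for m in baseline}
    return importance
```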

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, which mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is known as a morphologically rich language and the length of comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of textual features is low. To increase the performance of textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature has a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog, because authors' names in Russianblog are different from the other datasets: the name is an email address, and no one would refer to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and has a good impact on Acc_CTD. According to P_edge, its best improvement is 0.16091, for Russianblog, since comments there are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets; this rule removes the root candidate accurately and has a good impact on Acc_CTD. The maximum improvement for R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 has small improvements on the precision and recall of the evaluation metrics in both datasets; the maximum improvement is obtained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, the detection of comment-root relations is improved. This rule also improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section, we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3 except that the commenters and their numbers of posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations, because users read them and then read the other comments to better understand the conversations.

(iii) Which comments have participated in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics.

Datasets | Method | P_edge | R_edge | F_edge | P_path | R_path | F_path | P_R-path | R_R-path | F_R-path | Acc_CTD
Thestandard | SLARTS | 0.3782±0.008 | 0.4155±0.011 | 0.3960±0.006 | 0.1565±0.007 | 0.1713±0.008 | 0.1636±0.007 | 0.3628±0.012 | 0.4166±0.016 | 0.3876±0.005 | 0.7525±0.005
Thestandard | Seo et al.'s [2] | 0.3014±0.031 | 0.1414±0.015 | 0.1921±0.016 | 0.1544±0.025 | 0.0576±0.011 | 0.0835±0.013 | 0.3013±0.038 | 0.1319±0.018 | 0.1831±0.021 | 0.5831±0.007
Thestandard | Difference | 0.0778 | 0.2741 | 0.2039 | 0.0021 | 0.1137 | 0.0801 | 0.0615 | 0.2847 | 0.2045 | 0.1694
ENENews | SLARTS | 0.3797±0.038 | 0.3891±0.019 | 0.3839±0.024 | 0.1873±0.040 | 0.1855±0.026 | 0.1857±0.030 | 0.3768±0.048 | 0.3768±0.017 | 0.3759±0.025 | 0.6872±0.014
ENENews | Seo et al.'s [2] | 0.4060±0.030 | 0.1713±0.016 | 0.2407±0.018 | 0.2378±0.034 | 0.0837±0.016 | 0.1235±0.021 | 0.3947±0.035 | 0.1523±0.015 | 0.2196±0.019 | 0.5954±0.009
ENENews | Difference | −0.0263 | 0.2178 | 0.1432 | −0.0505 | 0.1018 | 0.0622 | −0.0179 | 0.2245 | 0.1563 | 0.0918
Courantblogs | SLARTS | 0.5236±0.030 | 0.3963±0.073 | 0.4495±0.052 | 0.4220±0.057 | 0.3110±0.058 | 0.3566±0.049 | 0.5325±0.049 | 0.3853±0.075 | 0.4452±0.057 | 0.7625±0.033
Courantblogs | Seo et al.'s [2] | 0.7320±0.090 | 0.2049±0.062 | 0.3189±0.082 | 0.6815±0.105 | 0.1896±0.070 | 0.2951±0.093 | 0.7309±0.094 | 0.2010±0.065 | 0.3140±0.086 | 0.7001±0.031
Courantblogs | Difference | −0.2084 | 0.1914 | 0.1306 | −0.2595 | 0.1214 | 0.0615 | −0.1984 | 0.1843 | 0.1312 | 0.0624
Russianblog | SLARTS | 0.5940±0.028 | 0.5674±0.022 | 0.5803±0.023 | 0.4907±0.031 | 0.4664±0.020 | 0.4782±0.024 | 0.5878±0.028 | 0.5823±0.021 | 0.5849±0.020 | 0.9116±0.011
Russianblog | Seo et al.'s [2] | 0.4705±0.020 | 0.3030±0.015 | 0.3685±0.016 | 0.3823±0.023 | 0.2534±0.013 | 0.3046±0.015 | 0.4682±0.020 | 0.2822±0.014 | 0.3521±0.015 | 0.5862±0.013
Russianblog | Difference | 0.1235 | 0.2644 | 0.2118 | 0.1084 | 0.2130 | 0.1736 | 0.1196 | 0.3001 | 0.2328 | 0.3254
Alef | SLARTS | 0.6189±0.028 | 0.4131±0.040 | 0.4952±0.037 | 0.5639±0.027 | 0.3819±0.037 | 0.4551±0.034 | 0.6277±0.027 | 0.4069±0.043 | 0.4934±0.039 | 0.8044±0.013
Alef | Seo et al.'s [2] | 0.8827±0.052 | 0.2637±0.051 | 0.4045±0.060 | 0.8797±0.053 | 0.2631±0.051 | 0.4034±0.060 | 0.8840±0.052 | 0.2635±0.051 | 0.4043±0.060 | 0.7846±0.018
Alef | Difference | −0.2638 | 0.1494 | 0.0907 | −0.3158 | 0.1188 | 0.0517 | −0.2563 | 0.1434 | 0.0891 | 0.0198

The bold style shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than other users. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have started long conversations. These users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations which reply directly to the root in thread "C"; for example, {F(2), H(2), H(3), I(2), O(1)} and {F(2), H(2), I(1)} show two conversations in thread "C".

7. Conclusion and Future Work

In this paper, we proposed SLARTS, a method based on ranking SVM, which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine these features.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can be useful in interrogative threads as well.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure in more depth; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of features.

Feature | Dataset | P_edge | R_edge | P_path | R_path | P_R-path | R_R-path | Acc_CTD
Similarity | ENENews | −0.00233 | 0.07415 | −0.00538 | 0.03289 | −0.01095 | 0.08996 | 0.04684
Similarity | Russianblog | 0.00874 | 0.00012 | 0.00963 | 0.00347 | 0.00695 | −0.01079 | 0.00355
Authors' language | ENENews | −0.01354 | 0.03246 | −0.01785 | 0.00693 | −0.02757 | 0.04052 | 0.00458
Authors' language | Russianblog | −0.00229 | 0.01052 | −0.00179 | 0.00750 | −0.00590 | 0.00807 | −0.00182
Global similarity | ENENews | −0.00402 | 0.05604 | −0.01338 | 0.01581 | −0.01129 | 0.07086 | 0.02742
Global similarity | Russianblog | 0.01090 | 0.01443 | 0.00582 | 0.00897 | 0.00368 | −0.00317 | −0.00087
Frequent words | ENENews | 0.00252 | 0.00931 | 0.00170 | 0.00424 | −0.00132 | 0.00973 | 0.00292
Frequent words | Russianblog | 0.00008 | 0.00023 | 0.00016 | 0.00029 | 0.00034 | 0.00047 | −0.00011
Length ratio | ENENews | 0.04206 | −0.02984 | 0.04450 | 0.01700 | 0.06421 | −0.05336 | −0.00850
Length ratio | Russianblog | 0.00651 | −0.00355 | 0.00920 | 0.00080 | 0.00784 | −0.00592 | 0.00235
Authors' name | ENENews | 0.03753 | 0.06392 | 0.01920 | 0.03285 | 0.02341 | 0.05998 | 0.01788
Authors' name | Russianblog | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000
Frequent pattern | ENENews | −0.02028 | 0.02930 | −0.02357 | 0.00300 | −0.03793 | 0.03410 | 0.00526
Frequent pattern | Russianblog | 0.00458 | 0.06100 | 0.00830 | 0.05206 | −0.00545 | 0.06150 | 0.00404
Location prior | ENENews | 0.08778 | −0.03690 | 0.08556 | 0.03280 | 0.11792 | −0.07852 | 0.02925
Location prior | Russianblog | 0.16091 | 0.09671 | 0.17766 | 0.13515 | 0.16699 | 0.08793 | 0.02414
Candidate filtering rule 1 | ENENews | −0.05431 | 0.06138 | −0.04102 | 0.01296 | −0.06458 | 0.07545 | 0.06949
Candidate filtering rule 1 | Russianblog | −0.12506 | 0.34370 | −0.13916 | 0.27445 | −0.12683 | 0.37003 | 0.32546
Candidate filtering rule 2 | ENENews | 0.00014 | −0.00004 | 0.00048 | 0.00041 | −0.00038 | −0.00035 | −0.00009
Candidate filtering rule 2 | Russianblog | 0.02575 | 0.02475 | 0.02462 | 0.02341 | 0.01746 | 0.00802 | 0.00000
Candidate filtering rule 3 | ENENews | 0.00357 | −0.00121 | 0.00452 | 0.00254 | 0.00551 | −0.00161 | 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521 | −0.01978 | 0.09636 | 0.01702 | 0.09380 | −0.03825 | 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as Wordnet and Named Entity Recognition (as they are not available with the same quality for all languages). In the future, we would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Computation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of the Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


Figure 1: A part of a real thread from the Thestandard (http://www.thestandard.org.nz) online news website, showing an article and its nested comments with authors and timestamps.

4. Method

In this section, we describe our supervised approach to the RTS task. First, we define a set of textual and nontextual features and learn a model to combine the features using a ranking SVM (Support Vector Machine). Then, the model is employed to reconstruct the reply structure of the threads in the test data.

4.1. Ranking SVM. SVM is a supervised learning method that is used for classification, regression, and ranking.

The implementation we use is the ranking SVM classifier (http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html) [2, 37]. The ranking SVM classifier learns to assign a weight to pairs of comments. The whole procedure for choosing the parent of the i-th comment in a thread is described in Figure 4.

Since the purpose of the RTS task is to predict a tree-like reply structure in a thread, the RTS algorithm needs to find N reply relations among the comments and the root, where N is the number of comments in a thread. Thus, the RTS algorithm needs O(N²) pairwise comparisons.
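The parent-selection step shown in Figure 4 reduces to scoring every candidate with the learned linear weights and taking the argmax. The sketch below is a minimal illustration of that loop; the function name and the plain feature-vector representation are our assumptions, not the SVMrank interface.

```python
def choose_parent(weights, candidate_features):
    """candidate_features: one feature vector per candidate of the i-th comment
    (index 0 = the root, indices 1..i-1 = the earlier comments).
    weights: feature weights learned by the ranking SVM (linear kernel).
    Returns the index of the best-scoring candidate, i.e. the predicted parent."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in candidate_features]
    return max(range(len(scores)), key=lambda j: scores[j])
```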


Figure 2: The probability of child-parent reply relations: (a) Apple discussion forum and (b) Thestandard online news agency (axes: parent position, child position, and child-parent relation probability).

Figure 3: Three real threads in which the nodes' labels are in chronological order.

4.2. Features. The ranking SVM needs some features to be defined. In this section, we introduce eleven textual and nontextual features; their definition is one of the main contributions of our work. It should be mentioned that about 20 percent of the replies in the ground truth do not have any word in common with their parent. The textual features therefore are not enough, and we need to define some nontextual features as well.

4.2.1. Similarity Measurement. In order to measure the similarity of sentences, we utilize a vector space model to represent the text body of comments.


Figure 4: RTS algorithm. The training set is used by SVM-rank to learn a model (a list of weights, one per feature); for the i-th comment of a test thread, the candidates are the root and the i − 1 preceding comments, each candidate j is scored with a linear kernel, A[j] += model.weight[k] × feature[j][k] over all features k, and the candidate with the maximum score, argmax_j A[j], is returned as the parent.

A comment's text can be considered as a vector of terms weighted by TF-IDF. TF (Term Frequency) is the number of times a word appears in a comment, and IDF (Inverse Document Frequency) is computed according to the following formula:

\[ \text{IDF}(w) = \log\left(\frac{N}{n_{w}}\right), \quad (1) \]

where N is the total number of comments in a news item and n_w is the number of comments that contain the word w. IDF is usually extracted from the whole training data, but we have limited it to the set of comments in a thread. We believe this makes it more accurate, for example, when a word has a low frequency in the whole training data but a high frequency in a thread.

In order to measure the similarity between two comments, the stop-words are deleted first, and then the words of the comments are stemmed by the Porter algorithm (for the English datasets); this step is not performed for the Russian and Persian datasets. The words held in common by the two comments are extracted, and their weights are calculated based on TF-IDF. Then the final score is obtained by aggregating the weights of the common words according to (2). Since the root text is usually longer than the comments, the log of the product of their lengths is used in the denominator:

\[ \text{Score}(C_{1}, C_{2}) = \sum_{w \in C\text{-words}} \left(1 + \log\left(\text{TF}(w, C_{1}) \times \text{TF}(w, C_{2})\right)\right) \times \text{IDF}(w) \times \left(\log\left(|C_{1}| \times |C_{2}|\right)\right)^{-1}, \quad (2) \]

where C-words is the set of words held in common between comments C_1 and C_2, and |C| is the length of a comment's text.

Comments are usually informal and contain typos. Some words in two comments might be intended to be the same, but due to spelling errors it is difficult to detect this. In order to solve this issue, we use the minimum edit distance (MED) algorithm. The minimum edit distance between two words is the minimum number of edit operations (insertion, deletion, substitution, and transposition) needed to transform one word into the other [14]. The costs of insertion, deletion, substitution, and transposition are 1, 1, 2, and 1, respectively.

Two words in different comments are considered common words if either they are exactly the same or they appear to be the same except for some typos. In the latter case, if the length of the words is greater than five, their first two letters are the same, and their edit distance is lower than 4, the two words are considered a common word. For example, the two words "Beautiful" and "Beuatiful" are considered common words.
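A minimal sketch of this similarity feature is given below, assuming comments are already stop-word-filtered and stemmed lists of words and that an IDF table computed over the thread is available; the edit-distance costs follow the values stated above, and the function names are ours.

```python
import math

def edit_distance(a, b):
    """Minimum edit distance with costs: insert 1, delete 1, substitute 2, transpose 1."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 2
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)   # transposition
    return d[m][n]

def same_word(w1, w2):
    """Exact match, or a typo-tolerant match as described above."""
    if w1 == w2:
        return True
    return (len(w1) > 5 and len(w2) > 5 and w1[:2] == w2[:2]
            and edit_distance(w1, w2) < 4)

def similarity_score(c1_words, c2_words, idf):
    """Formula (2): aggregate the TF-IDF weights of the words held in common."""
    score = 0.0
    for w1 in set(c1_words):
        for w2 in set(c2_words):
            if same_word(w1, w2):
                tf1, tf2 = c1_words.count(w1), c2_words.count(w2)
                score += (1 + math.log(tf1 * tf2)) * idf.get(w1, 0.0)
    denom = math.log(len(c1_words) * len(c2_words)) or 1.0   # guard for one-word comments
    return score / denom

print(same_word("beautiful", "beuatiful"))   # True: one transposition, same first two letters
```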

4.2.2. Authors' Language Model. The idea is that commenters who talk to each other are more likely to use similar words in their comments. In order to take advantage of this, all comments written by a commenter are appended, which makes a collection of words for each commenter. If the word collections of two commenters are very similar, they can be related to each other. In Figure 5, commenter "A1" wrote three comments; these three comments are appended and make a collection of words for commenter "A1", and then, as for the first feature, the similarity is calculated between the word collections of commenters "A1" and "A2". The similarity score obtained between the two commenters "A1" and "A2" is considered this feature's score for relations between their comments.

4.2.3. Location Prior. The position of comments can reveal some information about their hierarchy; for example, the first comments usually have more children than the others, and comments which are located just before the i-th comment are more likely to be its parent. In general, we would like to estimate p(i | j), that is, knowing that a comment is in position j, the likelihood that the comment in position i is its parent [2]. So we calculate prior probabilities of being the parent for the different positions.

To calculate the prior probability for position j, we count the number of times a comment in position i is the parent of a comment in position j in the training set. Figure 6 shows the prior probabilities for comments in positions 1 to 100 in the Thestandard dataset. The highest prior probability belongs to the root and then to the comments which are located just before the comment. The sample points in Figure 6 show five positions (the root, 10, 30, 57, and 59) and how probable it is for the 60th comment to be a child of each of them: the root has the highest prior probability, equal to 0.1852, followed by the 59th comment, which has a probability of 0.1236. Figure 7 shows the prior probability of child-parent relations for comments 40 to 60.
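A sketch of how such a prior table can be estimated from the training threads is shown below; the representation (a dictionary per thread mapping a comment's chronological position to its parent's position, with 'R' for the root) and the function name are assumptions of ours.

```python
from collections import defaultdict

def location_prior(training_threads):
    """training_threads: iterable of dicts, child position -> parent position ('R' = root).
    Returns prior[j][i] = p(the comment at position i is the parent | the child is at j)."""
    counts = defaultdict(lambda: defaultdict(int))
    for thread in training_threads:
        for child, parent in thread.items():
            counts[child][parent] += 1
    prior = {}
    for child, parent_counts in counts.items():
        total = sum(parent_counts.values())
        prior[child] = {parent: c / total for parent, c in parent_counts.items()}
    return prior
```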

4.2.4. Reference to Authors' Names. If the name of an author appears in a comment, then all his/her comments are considered potential candidates for being the parent of that comment [19]. Sometimes only a part of the complete name is referenced; sometimes the author's name is made up of two parts, and both of these parts could be used by other authors for reference. We also consider these types of references: we hold each part of the author's name, and the parts which are stop-words are removed.

4.2.5. Global Similarity. This feature is based on ranking the similarity of comments' texts and the root's text, which gives a global view of text similarity. According to this feature, if comment "A" is the most similar to comment "B" and, inversely, comment "B" is the most similar to comment "A" among the other candidates, it is more likely that there is a relation between comments "A" and "B". To relax this, the similarity measurement is calculated for each comment with respect to all comments, and the comments are then sorted based on their similarity scores. For example, in Figure 8 we are looking for the parent of the fifth comment. In the first step, according to formula (2), the comments are sorted based on their text similarity score with the fifth comment; comments that do not belong to the candidates of the fifth comment are removed (the removed comments are shown in black in Figure 8). In the second step, in the same way as the first step, comments are sorted per each candidate of the fifth comment, and comments which do not belong to the candidates of the fifth comment are removed, except the fifth comment itself. Finally, formula (3) is used to calculate the ranking-distance score; two comments which are the most similar to each other get a higher ranking-distance score. In Figure 8, the highest score belongs to the relation with the fourth comment. This feature is symmetric, and the similarity among comments is only calculated once; in other words, this feature needs O(N²) text similarity pairwise comparisons:

Ranking-distance (1198621 1198622) = (|1198621amp1198622| + |1198622amp1198621|)minus1(3)

where 1198621 and 1198622 are two comments and |1198621amp1198622| is the textsimilarity distance between 1198621 and 1198622
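The following sketch shows one way to compute the Ranking-distance of Formula (3); the helper names are hypothetical, and the exact candidate filtering of the two ranking steps described above is only approximated.

def ranking_distance(i, j, sim_matrix, candidates_of):
    # sim_matrix[a][b]: text similarity of comments a and b.
    # candidates_of(x): admissible parent candidates of comment x (assumed to contain y).
    def rank(x, y):
        ranked = sorted(candidates_of(x), key=lambda z: sim_matrix[x][z], reverse=True)
        return ranked.index(y) + 1  # 1-based position of y in x's similarity ranking

    return 1.0 / (rank(i, j) + rank(j, i))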

4.2.6. Frequent Patterns of Authors' Appearance. The idea is that the comments of two commenters who talk to each other usually appear close together in chronological order. So, if their comments appear many times just after each other, this feature gives a high score to their comments. In order to implement this feature, we use the following formula:

FPScore(i, j) = f(A_i, A_j) + f(A_j, A_i),   (4)

where A_i is the author of comment i and f(a, b) is the number of times a comment of author a appears just before a comment of author b (see Pseudocode 1).

Figure 9 shows a timeline which includes 7 comments written by 4 commenters. The feature score is calculated for the relations among the comments; the score of the relation between commenters A and B is 3, which is higher than that between A and C.

4.2.7. Length Ratio. The text of a parent is usually longer than that of its children. The length ratio score is calculated according to

length ratio score(C_C, C_P) = |C_P| / |C_C|,   (5)

where C_C is the comment looking for a parent and C_P is a candidate parent.

4.2.8. Frequent Words in Parent-Child Comments. Sometimes a pair of words appears frequently, one word in the parent and the other word in its children. For example, in the ENENews dataset the most frequent pairs are (believe, people), (accident, fukushima), (new, discovered), (idea, report), and (people, public). We use pointwise mutual information (PMI) [38] to find the frequent pairs:

Score(W1, W2) = Count(W1, W2) / (Count(W1) × Count(W2)),   (6)

where W1 is a word in the comment whose parent we are looking for, W2 is a word in its candidate parent, and Count(W1, W2) is the number of times W1 has appeared in the child and W2 has appeared in its parent. The numerator computes how often the two words appear together, and the denominator computes how often each of the words appears individually. Finally, according to Pseudocode 2, the score of the relation between two comments is calculated.


Figure 5: The authors' language feature, based on the comments of authors "A1" and "A2".

Figure 6: The prior probability of child-parent relations for comments 1 to 100 in Thestandard (for the 60th comment, the labelled points give a prior of about 0.186 for the root and 0.124 for the 59th comment).

Figure 7: The prior probability of child-parent relations from comments 40 to 60.

Figure 8: An example of the global similarity feature.


Step 1:
For i = 1 to Comments - 1
    if Author(i) ≠ Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments              // for each comment which is looking for its parent
    For j = 0 to i - 1             // for each candidate
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Calculating the frequent patterns of authors' appearance.
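For clarity, the same computation can be expressed in Python roughly as follows; the list-of-authors input format is an assumption of this sketch, and the root is ignored here.

from collections import defaultdict

def frequent_pattern_scores(authors):
    # authors: comment authors in chronological order (authors[0] is the first comment).
    f = defaultdict(int)
    # Step 1: count adjacent comments written by two different authors.
    for i in range(len(authors) - 1):
        if authors[i] != authors[i + 1]:
            f[(authors[i + 1], authors[i])] += 1
    # Step 2: FPScore(i, j) for every comment i and every earlier candidate j.
    fp = {}
    for i in range(1, len(authors)):
        for j in range(i):
            fp[(i, j)] = f[(authors[i], authors[j])] + f[(authors[j], authors[i])]
    return fp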

Figure 9: An example of the authors' appearance sequence; on the timeline A, B, C, A, B, A, D, the adjacency counts give Frequent(A, B) = 3 and Frequent(A, C) = 1.

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates (a sketch combining the three rules is given after this list).

(1) Usually a commenter does not reply to the root in his/her second comment. So, if a commenter has written more than one comment, the root is removed from the parent candidates of the second and subsequent comments. This proved to be a useful heuristic: the root is an important candidate, so removing it correctly improves the results significantly.

(2) A commenter does not reply to him/herself. So we simply remove all comments of a commenter from his/her comments' candidates [2].

(3) Commenters who post only one comment in a thread are more likely to reply to the root post, so the other candidates can be removed, keeping only the root.
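A minimal sketch combining the three rules could look as follows; the comment representation (index 0 being the root article) is an assumption made for illustration.

def filter_candidates(i, comments, candidates):
    # comments: list of dicts with an 'author' key; index 0 is the root article.
    # candidates: list of indices of candidate parents for comment i.
    author = comments[i]['author']
    earlier_by_author = [k for k in range(1, i) if comments[k]['author'] == author]
    later_by_author = [k for k in range(i + 1, len(comments))
                       if comments[k]['author'] == author]

    # Rule 2: a commenter does not reply to him/herself.
    candidates = [c for c in candidates if c == 0 or comments[c]['author'] != author]

    # Rule 1: from the commenter's second comment on, remove the root candidate.
    if earlier_by_author:
        candidates = [c for c in candidates if c != 0]

    # Rule 3: a commenter with a single comment in the thread most likely replies
    # to the root, so keep only the root candidate when it is available.
    if not earlier_by_author and not later_by_author and 0 in candidates:
        candidates = [0]
    return candidates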

5. Experiments

In this section we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies established in August 2007 and 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English and Alef is in Persian; Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively, established in 2003 and 2012.

The reasons for selecting these websites are as follows: (1) they support multilevel replies; (2) their users are active, and articles usually get many comments; (3) the author and time of comments are available; (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each Word1 in comment
        For each Word2 in candidate-comment
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: A pseudocode for calculating the frequent-words score between two comments.

Table 1: The datasets' specific properties.

Datasets     | Language | Confirmation by moderator | Users register
Thestandard  | English  | yes                       | no
Alef         | Persian  | yes                       | no
ENENews      | English  | no                        | yes
Russianblog  | Russian  | no                        | yes
Courantblogs | English  | yes                       | no

For each website, we crawled the webpages published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as ground truth. We removed the threads with fewer than 4 comments, because these kinds of threads do not give us much information and their comments usually reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets users can leave comments in a thread with different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The length of the comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews the root usually includes a tweet, which is why the length of the root's text is shorter than in the other datasets. All comments have an author's name, except for some comments in Alef; therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments per article. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard the threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape: articles are published between 7:00 and 19:00, and the highest rate belongs to the 4-hour period between 9:00 and 13:00. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10:00 and 18:00.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog behave similarly: about one hour after the article is published, it has received most of its comments, and after that the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16-21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after the publication of the main article. Deeper comments indicate longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours; after a few hours, conversations continue around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10-20 words, and except for the comments of Russianblog the datasets are similar. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has shorter comments than the other datasets; this dataset is a personal blog, and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning their conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results we use several metrics. For edge prediction we use the precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree; this causes the precision, recall, and F-score values to be equal [2], since FP (False Positive) and FN (False Negative)


Table 2: Datasets' statistics split up per website.

Datasets     | Avg. length of comments' text (words) | Avg. length of the root's text (words) | Number of comments | Avg. comments per news | Number of news | Number of commenters
Thestandard  | 72.98 | 357.68 | 149058 | 49.81 | 2992 | 3431
Alef         | 74.66 | 766.55 | 23210  | 27.96 | 830  | 4387
ENENews      | 68.19 | 162.64 | 16661  | 38.92 | 428  | 831
Courantblogs | 79.24 | 398.17 | 4952   | 19.19 | 258  | 1716
Russianblog  | 18.56 | 237.16 | 30877  | 16.18 | 1908 | 1258

Figure 10: Distribution of the number of comments per news article.

Figure 11: Publication rate of articles per hour.

Figure 12: Average number of published comments per hour after the root.


Figure 13: Time difference between a node and its corresponding parent, per hour.

Figure 14: Depth of comments, averaged over the elapsed time from publication of the root.

in precision and recall are always equal. Instead, we use the following edge accuracy measure:

Accuracy (Acc)_edge = |reply relations ∩ detected relations| / N,   (7)

where the tree is comprised of N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers the paths from nodes to the root:

Accuracy (Acc)_path = ( Σ_{i=1}^{|C|} |path_G(i) = path_P(i)| ) / |C|,   (8)

where path_G(i) is the ground-truth structure for the ith comment, path_P(i) is the predicted structure for it, and |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it completely wrong, so it is very strict. To relax this, Wang et al. introduced a metric that computes the overlap between the path in the ground truth and the predicted path:

Precision (P)_R-path = ( Σ_{i=1}^{|C|} |path_G(i) ∩ path_P(i)| / |path_P(i)| ) / |C|,

Recall (R)_R-path = ( Σ_{i=1}^{|C|} |path_G(i) ∩ path_P(i)| / |path_G(i)| ) / |C|,   (9)

where |path_P(i)| is the number of comments in the predicted path of the ith comment and |path_G(i)| is the number of comments in its ground-truth path. The F1-score is the harmonic mean of precision and recall:

F1-score = 2 × Precision × Recall / (Precision + Recall).   (10)
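A compact sketch of Formulas (7)-(10) is shown below; it assumes each thread is given as a mapping from comment id to parent id with the root denoted by 0, and it is meant only to make the definitions concrete.

def interrogative_metrics(gold_parent, pred_parent):
    # gold_parent / pred_parent: dict comment id -> parent id; the root id is 0.
    def ancestors(parent, i):
        nodes = []
        while i != 0:
            i = parent[i]
            nodes.append(i)
        return nodes  # path from the comment up to (and including) the root

    n = len(gold_parent)
    acc_edge = sum(gold_parent[i] == pred_parent[i] for i in gold_parent) / n
    acc_path = sum(ancestors(gold_parent, i) == ancestors(pred_parent, i)
                   for i in gold_parent) / n

    p = sum(len(set(ancestors(gold_parent, i)) & set(ancestors(pred_parent, i)))
            / len(ancestors(pred_parent, i)) for i in gold_parent) / n
    r = sum(len(set(ancestors(gold_parent, i)) & set(ancestors(pred_parent, i)))
            / len(ancestors(gold_parent, i)) for i in gold_parent) / n
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return acc_edge, acc_path, p, r, f1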

The metrics mentioned above are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or other main content, which is different from the root of interrogative threads; this causes the structure of threads and the reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words.

Figure 16: Histogram of the depth of comments.

(1) comment-root, that is, the relation between comments and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views around the main article (root). The comment-comment relations show conversation among visitors, which is a valuable source of information. In Figure 17 there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to other users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to detect both types of relations, so we have to modify the evaluation metrics or define new metrics.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comment Type Detection. It is defined as the proportion of correctly detected comment types: the comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

Accuracy (Acc)_CTD = (TP + TN) / (TP + TN + FP + FN),   (11)

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

Precision (P)_edge = TP / (TP + FP),

Recall (R)_edge = TP / (TP + FN),

F1-score (F)_edge = 2 × Precision × Recall / (Precision + Recall),   (12)

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of


Figure 17: An illustration of a declarative thread (comment-root relations, comment-comment relations, and discussion-starter comments).

comment-comment relations in the ground truth that were not predicted at all.
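Under the same parent-mapping representation as before (root denoted by 0), the comment-type accuracy of Formula (11) and the edge measures of Formula (12) can be sketched as follows; the function and variable names are illustrative only.

def declarative_edge_metrics(gold_parent, pred_parent, root=0):
    # Comment-type detection: NP = reply to the root, PD = reply to another comment.
    tp = sum(gold_parent[i] == root and pred_parent[i] == root for i in gold_parent)
    tn = sum(gold_parent[i] != root and pred_parent[i] != root for i in gold_parent)
    fp = sum(gold_parent[i] != root and pred_parent[i] == root for i in gold_parent)
    fn = sum(gold_parent[i] == root and pred_parent[i] != root for i in gold_parent)
    acc_ctd = (tp + tn) / (tp + tn + fp + fn)

    # Comment-comment relations only.
    gold_cc = {(i, p) for i, p in gold_parent.items() if p != root}
    pred_cc = {(i, p) for i, p in pred_parent.items() if p != root}
    tp_cc = len(gold_cc & pred_cc)
    p_edge = tp_cc / len(pred_cc) if pred_cc else 1.0
    r_edge = tp_cc / len(gold_cc) if gold_cc else 1.0
    f_edge = 2 * p_edge * r_edge / (p_edge + r_edge) if p_edge + r_edge else 0.0
    return acc_ctd, p_edge, r_edge, f_edge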

The path accuracy metric mentioned earlier is modified into P_path and R_path, which are appropriate for the declarative platform. These metrics consider the discussion path from each PD comment to the discussion-starter comment, but not to the root:

Precision (P)_path = ( Σ_{i=1}^{|C|_P} |path_G(i) = path_P(i)| ) / |C|_P, or 1 if |C|_P = 0,

Recall (R)_path = ( Σ_{i=1}^{|C|_R} |path_G(i) = path_P(i)| ) / |C|_R, or 1 if |C|_R = 0,   (13)

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread. Path_G(i) is the discussion path from the ith node to the discussion-starter comment in the ground truth, and Path_P(i) is the discussion path from the ith node to the discussion-starter comment in the predicted thread. Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

Precision (P)_R-path = ( Σ_{i=1}^{|C|_P} |path_G(i) ∩ path_P(i)| / |path_P(i)| ) / |C|_P, or 1 if |C|_P = 0,

Recall (R)_R-path = ( Σ_{i=1}^{|C|_R} |path_G(i) ∩ path_P(i)| / |path_G(i)| ) / |C|_R, or 1 if |C|_R = 0.   (14)
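The strict path measures of Formula (13) can be sketched in the same style; the relaxed variants of Formula (14) follow by replacing exact path equality with the overlap ratios. The discussion-path convention used here (the comment plus its ancestors up to the discussion starter) follows the definitions above and is an implementation assumption of this sketch.

def declarative_path_metrics(gold_parent, pred_parent, root=0):
    def discussion_path(parent, i):
        nodes = [i]
        while parent[i] != root:
            i = parent[i]
            nodes.append(i)
        return nodes  # the comment plus its ancestors up to the discussion starter

    pred_pd = [i for i in pred_parent if pred_parent[i] != root]
    gold_pd = [i for i in gold_parent if gold_parent[i] != root]

    def exact(i):
        # The path counts only if both structures treat i as PD and agree on its path.
        return (gold_parent[i] != root and pred_parent[i] != root
                and discussion_path(gold_parent, i) == discussion_path(pred_parent, i))

    p_path = sum(exact(i) for i in pred_pd) / len(pred_pd) if pred_pd else 1.0
    r_path = sum(exact(i) for i in gold_pd) / len(gold_pd) if gold_pd else 1.0
    return p_path, r_path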

Figure 18 shows two threads, one being the ground truth and the other the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3 the predicted thread of Figure 18 has been evaluated with the metrics for interrogative threads; the results show high values. Table 4 shows the results of evaluation with the metrics for declarative threads; these results are more appropriate.

There are two reasons why the declarative metrics evaluate the predicted structure better: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics still show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline links every comment to its previous comment (we name it the Last baseline) [3, 4]; this method leads to good results for the RTS task in interrogative threads. The second baseline links every comment to the root (we name it the First baseline).

The only previous work on RTS in this setting has focused on the author's name in online news agencies [19]. Schuth et al.'s method uses several features to find the commenter's name in online news articles; it achieves good precision but low recall, because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method for interrogative platforms.

Seo et al.'s method focuses on forums and emails and uses the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all their proposed features except the quoted text.

We applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which many replies are connected to the root.

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.


Table 3: Example of calculation of the interrogative metrics for the example shown in Figure 18.

Accuracy_edge:
  Reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}
  Detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}
  Acc_edge = 8/11 ≈ 0.73

Acc_path:
  |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / (number of comments) = 8/11 ≈ 0.73

P_R-path:
  [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / 11
  = [(1+1+1+1+1+1+1+1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11 ≈ 0.92

R_R-path:
  [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / 11
  = [(1+1+1+1+1+1+1+1) + (2/3 + 1 + 2/3)] / 11 = 10.34/11 ≈ 0.94

F1-score_R-path:
  2 × 0.92 × 0.94 / (0.92 + 0.94) ≈ 0.93


Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of the declarative metrics for the example shown in Figure 18.

Acc_CTD:
  TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}
  Acc_CTD = (5 + 5) / (5 + 5 + 1 + 0) ≈ 0.91

P_edge, R_edge, F1_edge:
  TP = {3→1, 5→2, 6→5}, FN = {4→3, 8→5}, FP = {4→1, 7→2, 8→7}
  P_edge = 3 / (3 + 3) = 0.5, R_edge = 3 / (3 + 2) = 0.6, F1-score_edge ≈ 0.55

P_path, R_path, F1_path:
  P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5
  R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6
  F1-score_path ≈ 0.55

P_R-path, R_R-path, F1_R-path:
  P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2) / 6 = 4.5/6 = 0.75
  R_R-path = (1 + 1/2 + 1 + 1 + 1/2) / 5 = 4/5 = 0.8
  F1-score_R-path = 2 × 0.75 × 0.8 / (0.75 + 0.8) ≈ 0.77

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef, because more comments are linked directly to the root. SLARTS and Seo et al.'s method usually cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads, because based on these metrics First shows high performance although this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge SLARTS performs better than Seo et al.'s method. For P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that, when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations


Figure 19: Three real threads in which the nodes are labeled with the commenter's name and the number of his/her comments.

(like the First baseline). On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations, whereas the P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than those of Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex (Figure 16). According to F_edge, the accuracy in Thestandard and ENENews is lower than in the other datasets; note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets, many of the defined features perform well (the importance of the features in the ENENews and Russianblog datasets is shown in Table 7 and is explained further in the next section).


Table 5: Experimental results with the interrogative evaluation metrics.

Datasets     | Method           | Acc_edge     | Acc_path     | P_R-path     | R_R-path     | F1_R-path
Thestandard  | SLARTS           | 0.4669±0.009 | 0.3059±0.013 | 0.6927±0.018 | 0.7450±0.010 | 0.7179±0.013
Thestandard  | Seo et al.'s [2] | 0.3941±0.005 | 0.3413±0.006 | 0.8303±0.009 | 0.6563±0.005 | 0.7331±0.003
Thestandard  | First            | 0.3630±0.010 | 0.3630±0.010 | 1.0000±0.000 | 0.5846±0.009 | 0.7379±0.007
Thestandard  | Last             | 0.1824±0.006 | 0.0495±0.004 | 0.2214±0.008 | 1.0000±0.000 | 0.3625±0.010
ENENews      | SLARTS           | 0.4837±0.018 | 0.3695±0.027 | 0.7625±0.026 | 0.7770±0.013 | 0.7697±0.017
ENENews      | Seo et al.'s [2] | 0.4661±0.017 | 0.4134±0.021 | 0.8720±0.011 | 0.7086±0.013 | 0.7818±0.011
ENENews      | First            | 0.4524±0.018 | 0.4524±0.018 | 1.0000±0.000 | 0.6634±0.016 | 0.7976±0.011
ENENews      | Last             | 0.2090±0.016 | 0.0596±0.003 | 0.2319±0.011 | 1.0000±0.000 | 0.3764±0.015
Courantblogs | SLARTS           | 0.6843±0.028 | 0.6453±0.033 | 0.9127±0.013 | 0.8619±0.014 | 0.8865±0.013
Courantblogs | Seo et al.'s [2] | 0.6700±0.040 | 0.6628±0.043 | 0.9667±0.014 | 0.8250±0.018 | 0.8902±0.015
Courantblogs | First            | 0.6490±0.036 | 0.6490±0.036 | 1.0000±0.000 | 0.8001±0.019 | 0.8889±0.012
Courantblogs | Last             | 0.2037±0.017 | 0.1082±0.008 | 0.3183±0.011 | 1.0000±0.000 | 0.4829±0.012
Russianblog  | SLARTS           | 0.7463±0.017 | 0.6904±0.017 | 0.8741±0.008 | 0.8787±0.006 | 0.8764±0.007
Russianblog  | Seo et al.'s [2] | 0.5446±0.010 | 0.5174±0.011 | 0.9293±0.003 | 0.7821±0.004 | 0.8494±0.003
Russianblog  | First            | 0.4968±0.006 | 0.4968±0.006 | 1.0000±0.000 | 0.7134±0.004 | 0.8327±0.003
Russianblog  | Last             | 0.3632±0.018 | 0.1549±0.012 | 0.3695±0.015 | 1.0000±0.000 | 0.5395±0.016
Alef         | SLARTS           | 0.7753±0.017 | 0.7641±0.018 | 0.9579±0.006 | 0.9033±0.008 | 0.9298±0.007
Alef         | Seo et al.'s [2] | 0.7802±0.019 | 0.7800±0.019 | 0.9950±0.003 | 0.8828±0.010 | 0.9355±0.006
Alef         | First            | 0.7853±0.018 | 0.7853±0.018 | 1.0000±0.000 | 0.8826±0.010 | 0.9376±0.006
Alef         | Last             | 0.0993±0.005 | 0.0822±0.005 | 0.2526±0.012 | 1.0000±0.000 | 0.4032±0.015

The bold style shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets on all metrics. The main reason is that its comments are not confirmed by a moderator. This causes the Acc_edge of the Last baseline in Russianblog to be 0.3632, which is higher than in the other datasets (Table 5); the Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier; in other words, as the parent appears closer to its child, features such as frequent patterns and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., the candidate filtering), the Acc_CTD of SLARTS is better than that of Seo


et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst result is 0.69 for ENENews. The root in the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, which makes the textual features perform poorly in detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets, because many of their relations are comment-root (Figure 16). This biases the ranking SVM towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute P-values to specify the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their scores and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features. We use backward feature selection: to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3; they mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15), and ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.
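The backward selection procedure can be summarised by the following sketch; train_and_score is a placeholder for retraining the ranking SVM on a feature subset and computing the evaluation metrics, so its name and return format are assumptions of this sketch.

def backward_feature_ablation(features, train_and_score):
    # train_and_score(feature_subset) is assumed to return a dict of metric values.
    full = train_and_score(features)
    impact = {}
    for f in features:
        reduced = train_and_score([g for g in features if g != f])
        # Positive values mean the feature helps that metric (as reported in Table 7).
        impact[f] = {m: full[m] - reduced[m] for m in full}
    return impact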

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the length of the comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low; to increase the performance of the textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature has a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog, because authors' names in Russianblog are different from the other datasets: the name is an email address, and no one refers to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and has a good impact on Acc_CTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and has a good impact on Acc_CTD. The maximum improvement for R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 has small improvements on the precision and recall of the evaluation metrics in both datasets. The maximum improvement is obtained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, the detection of comment-root relations is improved. This rule also improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information which can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and the numbers of their posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, such as the following.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. The thread "A" shows a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations, because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with the declarative evaluation metrics.

Datasets     | Method           | P_edge       | R_edge       | F_edge       | P_path       | R_path       | F_path       | P_R-path     | R_R-path     | F_R-path     | Acc_CTD
Thestandard  | SLARTS           | 0.3782±0.008 | 0.4155±0.011 | 0.3960±0.006 | 0.1565±0.007 | 0.1713±0.008 | 0.1636±0.007 | 0.3628±0.012 | 0.4166±0.016 | 0.3876±0.005 | 0.7525±0.005
Thestandard  | Seo et al.'s [2] | 0.3014±0.031 | 0.1414±0.015 | 0.1921±0.016 | 0.1544±0.025 | 0.0576±0.011 | 0.0835±0.013 | 0.3013±0.038 | 0.1319±0.018 | 0.1831±0.021 | 0.5831±0.007
Thestandard  | Difference       | 0.0778 | 0.2741 | 0.2039 | 0.0021 | 0.1137 | 0.0801 | 0.0615 | 0.2847 | 0.2045 | 0.1694
ENENews      | SLARTS           | 0.3797±0.038 | 0.3891±0.019 | 0.3839±0.024 | 0.1873±0.040 | 0.1855±0.026 | 0.1857±0.030 | 0.3768±0.048 | 0.3768±0.017 | 0.3759±0.025 | 0.6872±0.014
ENENews      | Seo et al.'s [2] | 0.4060±0.030 | 0.1713±0.016 | 0.2407±0.018 | 0.2378±0.034 | 0.0837±0.016 | 0.1235±0.021 | 0.3947±0.035 | 0.1523±0.015 | 0.2196±0.019 | 0.5954±0.009
ENENews      | Difference       | -0.0263 | 0.2178 | 0.1432 | -0.0505 | 0.1018 | 0.0622 | -0.0179 | 0.2245 | 0.1563 | 0.0918
Courantblogs | SLARTS           | 0.5236±0.030 | 0.3963±0.073 | 0.4495±0.052 | 0.4220±0.057 | 0.3110±0.058 | 0.3566±0.049 | 0.5325±0.049 | 0.3853±0.075 | 0.4452±0.057 | 0.7625±0.033
Courantblogs | Seo et al.'s [2] | 0.7320±0.090 | 0.2049±0.062 | 0.3189±0.082 | 0.6815±0.105 | 0.1896±0.070 | 0.2951±0.093 | 0.7309±0.094 | 0.2010±0.065 | 0.3140±0.086 | 0.7001±0.031
Courantblogs | Difference       | -0.2084 | 0.1914 | 0.1306 | -0.2595 | 0.1214 | 0.0615 | -0.1984 | 0.1843 | 0.1312 | 0.0624
Russianblog  | SLARTS           | 0.5940±0.028 | 0.5674±0.022 | 0.5803±0.023 | 0.4907±0.031 | 0.4664±0.020 | 0.4782±0.024 | 0.5878±0.028 | 0.5823±0.021 | 0.5849±0.020 | 0.9116±0.011
Russianblog  | Seo et al.'s [2] | 0.4705±0.020 | 0.3030±0.015 | 0.3685±0.016 | 0.3823±0.023 | 0.2534±0.013 | 0.3046±0.015 | 0.4682±0.020 | 0.2822±0.014 | 0.3521±0.015 | 0.5862±0.013
Russianblog  | Difference       | 0.1235 | 0.2644 | 0.2118 | 0.1084 | 0.2130 | 0.1736 | 0.1196 | 0.3001 | 0.2328 | 0.3254
Alef         | SLARTS           | 0.6189±0.028 | 0.4131±0.040 | 0.4952±0.037 | 0.5639±0.027 | 0.3819±0.037 | 0.4551±0.034 | 0.6277±0.027 | 0.4069±0.043 | 0.4934±0.039 | 0.8044±0.013
Alef         | Seo et al.'s [2] | 0.8827±0.052 | 0.2637±0.051 | 0.4045±0.060 | 0.8797±0.053 | 0.2631±0.051 | 0.4034±0.060 | 0.8840±0.052 | 0.2635±0.051 | 0.4043±0.060 | 0.7846±0.018
Alef         | Difference       | -0.2638 | 0.1494 | 0.0907 | -0.3158 | 0.1188 | 0.0517 | -0.2563 | 0.1434 | 0.0891 | 0.0198

The bold style shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have started long conversations. These users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2), H(2), H(3), I(2), O(1) and F(2), H(2), L(1) show two conversations in thread "C".

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly; we also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, extract relations among users and can be useful in interrogative threads as well.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Feature                    | Dataset     | P_edge   | R_edge   | P_path   | R_path   | P_R-path | R_R-path | Acc_CTD
Similarity                 | ENENews     | -0.00233 | 0.07415  | -0.00538 | 0.03289  | -0.01095 | 0.08996  | 0.04684
Similarity                 | Russianblog | 0.00874  | 0.00012  | 0.00963  | 0.00347  | 0.00695  | -0.01079 | 0.00355
Authors' language          | ENENews     | -0.01354 | 0.03246  | -0.01785 | 0.00693  | -0.02757 | 0.04052  | 0.00458
Authors' language          | Russianblog | -0.00229 | 0.01052  | -0.00179 | 0.00750  | -0.00590 | 0.00807  | -0.00182
Global similarity          | ENENews     | -0.00402 | 0.05604  | -0.01338 | 0.01581  | -0.01129 | 0.07086  | 0.02742
Global similarity          | Russianblog | 0.01090  | 0.01443  | 0.00582  | 0.00897  | 0.00368  | -0.00317 | -0.00087
Frequent words             | ENENews     | 0.00252  | 0.00931  | 0.00170  | 0.00424  | -0.00132 | 0.00973  | 0.00292
Frequent words             | Russianblog | 0.00008  | 0.00023  | 0.00016  | 0.00029  | 0.00034  | 0.00047  | -0.00011
Length ratio               | ENENews     | 0.04206  | -0.02984 | 0.04450  | 0.01700  | 0.06421  | -0.05336 | -0.00850
Length ratio               | Russianblog | 0.00651  | -0.00355 | 0.00920  | 0.00080  | 0.00784  | -0.00592 | 0.00235
Authors' name              | ENENews     | 0.03753  | 0.06392  | 0.01920  | 0.03285  | 0.02341  | 0.05998  | 0.01788
Authors' name              | Russianblog | 0.00000  | 0.00000  | 0.00000  | 0.00000  | 0.00000  | 0.00000  | 0.00000
Frequent pattern           | ENENews     | -0.02028 | 0.02930  | -0.02357 | 0.00300  | -0.03793 | 0.03410  | 0.00526
Frequent pattern           | Russianblog | 0.00458  | 0.06100  | 0.00830  | 0.05206  | -0.00545 | 0.06150  | 0.00404
Location prior             | ENENews     | 0.08778  | -0.03690 | 0.08556  | 0.03280  | 0.11792  | -0.07852 | 0.02925
Location prior             | Russianblog | 0.16091  | 0.09671  | 0.17766  | 0.13515  | 0.16699  | 0.08793  | 0.02414
Candidate filtering rule 1 | ENENews     | -0.05431 | 0.06138  | -0.04102 | 0.01296  | -0.06458 | 0.07545  | 0.06949
Candidate filtering rule 1 | Russianblog | -0.12506 | 0.34370  | -0.13916 | 0.27445  | -0.12683 | 0.37003  | 0.32546
Candidate filtering rule 2 | ENENews     | 0.00014  | -0.00004 | 0.00048  | 0.00041  | -0.00038 | -0.00035 | -0.00009
Candidate filtering rule 2 | Russianblog | 0.02575  | 0.02475  | 0.02462  | 0.02341  | 0.01746  | 0.00802  | 0.00000
Candidate filtering rule 3 | ENENews     | 0.00357  | -0.00121 | 0.00452  | 0.00254  | 0.00551  | -0.00161 | 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521  | -0.01978 | 0.09636  | 0.01702  | 0.09380  | -0.03825 | 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as Wordnet and Named Entity Recognition (as they are not available with the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547-571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435-444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26-33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35-42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323-332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350-361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673-676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13-25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87-113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581-588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103-1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516-527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846-863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467-471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645-654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605-611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 40th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Computation (LREC '12), pp. 818-822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339-373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171-177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265-272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727-734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289-1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361-368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217-226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793-794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245-251, 2010.


Page 6: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

6 The Scientific World Journal

2468101214

15

1015

0

03

02

01

04

05

06

07

08

09

1

X 7Y 12Z 001661

Child

-par

ent r

elat

ion

prob

abili

ty

RootY child

X parent

X 11Y 12

Z 049

X rootY12Z 0211

(a)

2468101214

1

5

10

15

0010203040506070809

1

X 7Y 12Z 006037

X 11Y 12Z 01371Ch

ild-p

aren

t rel

atio

n pr

obab

ility

X parent

Y child

X rootY 12Z 03652

Root

(b)

Figure 2 The probability of child-parent reply relations (a) Apple discussion forum and (b) Thestandard online news agencies

R

10 14

298

124

7

16 45

3

30

28

25

2726

20

13

19

23

15

12

11

19

13

22

18

21

26

17

221

2947

112

9

26325

23 27

30

5

12

6

810

14

21 15 20

16

181719

24

28

1

310

209

4

5

1622

6

2

15

R

826

30

19

2421

2923

277

17

18

11

13

14

12

28

25

9

11

R

Figure 3 Three real threads which the nodesrsquo label are in chronology order

42 Features The ranking SVM needs some features to bedefined In this section we introduce eleven textual andnontextual features Their definition is one of the maincontributions of our work It should be mentioned that about20 percent of replies in ground truth donot have any common

word with their parent The textual features therefore are notenough and we need to define some nontextual features

4.2.1. Similarity Measurement. In order to measure the similarity of sentences, we utilize a vector space model to represent the text body of comments.


Figure 4: The RTS algorithm. A ranking SVM (SVM-rank) is trained on the training set, producing a learning model (a list of weights, one per feature). For the i-th comment of a test thread, every candidate j (the root and comments 1 to i − 1) is scored with the linear kernel A[j] = Σ_k model.weight[k] × feature[j][k], and the candidate with the maximum score is returned as the parent.
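Since the learned model is linear, the prediction step of Figure 4 reduces to scoring every candidate (the root and the earlier comments) and taking the arg max. The following is a minimal sketch of that step, assuming a trained weight vector is available and that feature_vector(i, j, thread) returns the feature values of Section 4.2 for the pair (comment i, candidate j); both names are hypothetical placeholders, not the authors' code.

```python
def predict_parent(i, thread, weights, feature_vector):
    """Return the predicted parent of comment i; candidate 0 denotes the root."""
    best_candidate, best_score = 0, float("-inf")
    for j in range(i):                                          # root and comments 1 .. i-1
        features = feature_vector(i, j, thread)
        score = sum(w * f for w, f in zip(weights, features))   # linear kernel of Figure 4
        if score > best_score:
            best_candidate, best_score = j, score
    return best_candidate
```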

A comment's text can be considered as a vector of terms weighted by TF-IDF. TF (Term Frequency) is the number of times a word appears in a comment, and IDF (Inverse Document Frequency) is computed according to the following formula:

IDF(w) = log(N / n_w)    (1)

where N is the total number of comments in a news item and n_w is the number of comments that contain the word w. IDF is usually extracted from the whole training data, but we have limited it to the set of comments in a thread. We believe this makes it more accurate, for example, when a word has a low frequency in the whole training data but a high frequency in a thread.

In order to measure the similarity between two comments, the stop-words are deleted first and then the words of the comments are stemmed by the Porter algorithm (for the English datasets). This step is not performed for the Russian and Persian datasets. The common words of the two comments are extracted and their weights are calculated based on TF-IDF. The final score is then obtained by aggregating the weights of the common words according to (2). Since the root text is usually longer than the comments, the log of the product of their lengths is used in the denominator:

Score(C_1, C_2) = Σ_{w ∈ C-words} (1 + log(TF(w, C_1) × TF(w, C_2))) × IDF(w) × (log(|C_1| × |C_2|))^{-1}    (2)

where C-words is the set of words held in common between comments C_1 and C_2, and |C| is the length of the comment's text.
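The following is a minimal sketch of the score in (1)-(2), assuming each comment is already given as a stop-word-filtered and stemmed list of tokens and that thread is the list of all comments of the thread (so that the IDF of (1) is computed per thread). The names are illustrative, not the authors' implementation.

```python
import math
from collections import Counter

def idf(word, thread):
    n_w = sum(1 for comment in thread if word in comment)   # comments containing the word
    return math.log(len(thread) / n_w)                      # Equation (1)

def similarity(c1, c2, thread):
    tf1, tf2 = Counter(c1), Counter(c2)
    common = set(c1) & set(c2)
    if not common:
        return 0.0
    score = sum((1 + math.log(tf1[w] * tf2[w])) * idf(w, thread) for w in common)
    return score / math.log(max(len(c1) * len(c2), 2))      # Equation (2); guard for tiny texts
```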

Comments are usually informal and contain typos. Some words in two comments might be the same, but due to spelling errors it is difficult to find this out. In order to solve this issue, we use the minimum edit distance (MED) algorithm. The minimum edit distance of two words is the minimum number of edit operations (insertion, deletion, substitution, and transposition) needed to transform one word into another [14]. The costs of insertion, deletion, substitution, and transposition are 1, 1, 2, and 1, respectively.

Two words in different comments are considered as common words if either they are exactly the same or they seem to be the same but contain some typos. In the latter case, if the length of the words is bigger than five, their first two letters are the same, and their edit distance is lower than 4, the two words are considered as a common word. For example, the two words "Beautiful" and "Beuatiful" are considered as common words.
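A hedged sketch of this typo-tolerant matching rule is shown below, using a restricted Damerau-Levenshtein distance with the stated operation costs (insertion 1, deletion 1, substitution 2, transposition 1); it illustrates the rule rather than reproducing the authors' implementation.

```python
def edit_distance(a, b):
    """Minimum edit distance with costs: insert 1, delete 1, substitute 2, transpose 1."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i                                  # deletions
    for j in range(len(b) + 1):
        d[0][j] = j                                  # insertions
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 2
            d[i][j] = min(d[i - 1][j] + 1,           # deletion
                          d[i][j - 1] + 1,           # insertion
                          d[i - 1][j - 1] + sub)     # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)   # transposition
    return d[len(a)][len(b)]

def is_common_word(w1, w2):
    if w1 == w2:
        return True
    return (len(w1) > 5 and len(w2) > 5 and
            w1[:2] == w2[:2] and edit_distance(w1, w2) < 4)

# Example: is_common_word("Beautiful", "Beuatiful") -> True (one transposition).
```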

4.2.2. Authors' Language Model. The idea is that commenters who talk to each other are more likely to use similar words in their comments. In order to take advantage of this feature, all comments that are related to a commenter are appended. This makes a collection of words for each commenter. If the collections of words of two commenters are very similar, the commenters can be related to each other. In Figure 5, commenter "A1" wrote three comments. These three comments are appended and make a collection of words for commenter "A1"; then, as in the first feature, the similarity is calculated between the collections of words of commenters "A1" and "A2". The similarity score obtained


between the two commenters "A1" and "A2" is considered as this feature's score for relations between their comments.

4.2.3. Prior Location. The position of comments can reveal some information about their hierarchy; for example, the first comments usually have more children than the others, or comments which are located just before the i-th comment are more likely to be its parent. In general, we would like to estimate p(i | j), that is, knowing that a comment is in position j, the likelihood that the comment in position i is its parent [2]. So we calculate prior probabilities of being the parent for the different positions.

To calculate the prior probability for j, we count the number of times a comment in position i is the parent of j in the training set. Figure 6 shows the prior probability for comments in positions 1 to 100 in the Thestandard dataset. The highest prior probability belongs to the root and then to the comments which are located just before the comment. The sample points in Figure 6 show five positions, namely the root, 10, 30, 57, and 59, and how probable it is for the 60th comment to be a child of them: the root has the highest prior probability, equal to 0.1852, followed by the 59th comment with a probability of 0.1236. Also, Figure 7 shows the prior probability of child-parent relations from comment 40 to 60.
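A minimal sketch of estimating this prior from the training set is given below, assuming each training thread is represented as a list of (child position, parent position) pairs, with position 0 standing for the root; the names are illustrative only.

```python
from collections import defaultdict

def location_prior(training_threads):
    """prior[j][i] estimates p(parent position i | child position j)."""
    counts = defaultdict(lambda: defaultdict(int))
    for thread in training_threads:
        for child_pos, parent_pos in thread:
            counts[child_pos][parent_pos] += 1
    prior = {}
    for j, row in counts.items():
        total = sum(row.values())
        prior[j] = {i: c / total for i, c in row.items()}
    return prior
```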

4.2.4. Reference to Authors' Name. If the name of an author appears in a comment, then all his/her comments are considered as potential candidates for being the parent of that comment [19]. Sometimes only a part of the complete name is referenced. Sometimes the author's name is made up of two parts, and either part could be used by other authors for reference; we also consider these types of references. We hold each part of the author's name, and the parts which are stop-words are removed.

4.2.5. Global Similarity. This feature is based on ranking the similarity of the comments' texts and the root's text, which gives a global view on text similarity. According to this feature, if comment "A" has the highest similarity with comment "B" and, inversely, comment "B" has the highest similarity with comment "A" among the other candidates, it is more likely that there is a relation between comment "A" and comment "B". To relax this feature, the similarity measurement is calculated between each comment and all other comments, and the comments are then sorted based on their similarity score. For example, in Figure 8 we are looking for the parent of the fifth comment. In the first step, according to formula (2), comments are sorted based on their text similarity score with the fifth comment; comments that do not belong to the candidates of the fifth comment are removed (the removed comments are shown in black in Figure 8). In the second step, in the same way, comments are sorted per each candidate of the fifth comment, and again the comments which do not belong to the candidates of the fifth comment are removed, except the fifth comment itself. Finally, formula (3) is used to calculate the Ranking-distance score. Two comments which are the most similar to each other have a higher Ranking-distance score; in Figure 8, the highest score belongs to the relation with the fourth comment. This feature is symmetric, and the similarity among comments is only calculated once; in other words, this feature needs O(N^2) time complexity for the pairwise text similarity comparisons.

Ranking-distance(C_1, C_2) = (|C_1 & C_2| + |C_2 & C_1|)^{-1}    (3)

where C_1 and C_2 are two comments and |C_1 & C_2| is the text-similarity ranking distance between C_1 and C_2, that is, the position of C_2 in the list of C_1's candidates sorted by text similarity.
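The sketch below illustrates the Ranking-distance computation of (3), assuming similarity is the function of Equation (2) and that the candidate lists of the two comments are given; the helper names are hypothetical, not the authors' code.

```python
def rank_of(target, source, candidates, thread, similarity):
    """1-based position of `target` among `source`'s candidates sorted by text similarity."""
    ranked = sorted(candidates, key=lambda c: similarity(source, c, thread), reverse=True)
    return ranked.index(target) + 1

def ranking_distance(c1, c2, candidates1, candidates2, thread, similarity):
    r12 = rank_of(c2, c1, candidates1, thread, similarity)
    r21 = rank_of(c1, c2, candidates2, thread, similarity)
    return 1.0 / (r12 + r21)                       # Equation (3)
```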

4.2.6. Frequent Patterns of Authors' Appearance. The idea is that the comments of two commenters who talk to each other usually appear close together in chronological order. So, if their comments appear many times just after each other, this feature gives a high score to their comments. In order to implement this feature, we use the following formula:

FPScore(i, j) = f(A_i, A_j) + f(A_j, A_i)    (4)

where A_i is the author of comment i and f(a, b) is the number of times that comments of author a appear just before comments of author b (see Pseudocode 1).

Figure 9 shows a time line which includes 7 comments written by 4 commenters. The feature score is calculated for the relations among the comments. The score of the relation between comments of authors A and B is 3, which is more than that of A and C.

4.2.7. Length Ratio. The text of a parent is usually longer than that of its children. The length ratio score is calculated according to

length ratio score(C_C, C_P) = |C_P| / |C_C|    (5)

where C_C is the comment looking for a parent and C_P is a candidate parent.

4.2.8. Frequent Words in Parent-Child Comments. Sometimes a pair of words appears frequently, one word in the parent and the other word in its children. For example, in the ENENews dataset the most frequent pairs are (believe, people), (accident, fukushima), (new, discovered), (idea, report), and (people, public). We use pointwise mutual information (PMI) [38] to find the frequent pairs:

Score(W_1, W_2) = Count(W_1, W_2) / (Count(W_1) × Count(W_2))    (6)

where W_1 is a word in the comment whose parent we are looking for, W_2 is a word in its candidate parent, and Count(W_1, W_2) is the number of times the pair has appeared together, one word in the child and the other in its parent. The numerator computes how often the two words appear together, and the denominator computes how often each of the words appears. Finally, according to Pseudocode 2, the score of the relation between two comments is calculated.


Figure 5: The authors' language feature, based on the comment collections of author "A1" and author "A2".

Figure 6: The prior probability of child-parent relations for comments 1 to 100 in Thestandard (X: parent position, Y: child position).

Figure 7: The prior probability of child-parent relations from comment 40 to 60 (X: parent position, Y: child position).

Figure 8: An example of the global similarity feature: for each comment, the other comments are sorted based on the text similarity score.


Step 1:
For i = 1 to Comments − 1
    if Author(i) ≠ Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments            // for each comment looking for its parent
    For j = 0 to i − 1           // for each candidate
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Pseudocode to calculate the frequent patterns of authors' appearance.
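A runnable Python version of Pseudocode 1 might look as follows, assuming authors is the list of comment authors in chronological order; the names are illustrative, not the paper's implementation.

```python
from collections import Counter

def adjacency_counts(authors):
    """f[(a, b)]: number of times a comment of author a appears just before one of author b."""
    f = Counter()
    for prev, curr in zip(authors, authors[1:]):
        if prev != curr:
            f[(prev, curr)] += 1
    return f

def fp_score(i, j, authors, f):
    """FPScore(i, j) of Equation (4) for comment i and candidate parent j."""
    return f[(authors[i], authors[j])] + f[(authors[j], authors[i])]
```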

Figure 9: An example of the authors' appearance sequence. For the time line A, B, C, A, B, A, D the adjacency counts are A→B: 2, B→C: 1, C→A: 1, B→A: 1, A→D: 1, which gives Frequent(A, B) = 3 and Frequent(A, C) = 1.

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates (a sketch in code follows the list):

(1) Usually a commenter does not reply to the root in his/her second comment. So, if a commenter has written more than one comment, the root is removed from the parent candidates of the second and subsequent comments. This was shown to be a useful heuristic: the root is an important candidate, so removing it correctly improves the results significantly.

(2) A commenter does not reply to him/herself. So we simply remove all comments of a commenter from his/her comment's candidates [2].

(3) Commenters who post only one comment on a thread are more likely to reply to the root post. So the other candidates can be removed, except the root.
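The sketch below illustrates the three filtering rules, assuming authors lists the comment authors in chronological order (index 0 is a placeholder for the root) and that the whole thread has already been observed for rule 3; the names and the ROOT constant are illustrative assumptions, not the authors' code.

```python
ROOT = 0   # candidate index used for the root article

def filter_candidates(i, authors):
    """Candidate parents of comment i after applying filtering rules (1)-(3)."""
    me = authors[i]

    # Rule 3: a commenter with a single comment in the whole thread
    # most likely replies to the root, so only the root is kept.
    if sum(1 for a in authors[1:] if a == me) == 1:
        return {ROOT}

    # Rule 2: a commenter does not reply to him/herself.
    candidates = {j for j in range(1, i) if authors[j] != me}

    # Rule 1: from the author's second comment on, the root is removed.
    if not any(authors[j] == me for j in range(1, i)):
        candidates.add(ROOT)          # this is the author's first comment: keep the root
    return candidates
```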

5. Experiments

In this section we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies established in August 2007 and 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English and Alef is in Persian; Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively, established in 2003 and 2012.

The reasons for selecting the mentioned websites are that (1) they support multilevel replies, (2) their users are active and articles usually receive many comments, (3) the author and time of comments are available, and (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each (Word1 in comment)
        For each (Word2 in candidate-comment)
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: A pseudocode for calculating the score of frequent words between two comments.
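A minimal sketch of this feature is shown below: the pair scores of Equation (6) are estimated from the ground-truth child-parent pairs of the training threads and then aggregated as in Pseudocode 2. It assumes training_pairs is an iterable of (child_tokens, parent_tokens) pairs; the names are illustrative only.

```python
from collections import Counter

def pair_scores(training_pairs):
    """Equation (6): Count(W1, W2) / (Count(W1) * Count(W2)) over child-parent pairs."""
    pair_count, word_count = Counter(), Counter()
    for child, parent in training_pairs:
        for w1 in set(child):
            word_count[w1] += 1
        for w2 in set(parent):
            word_count[w2] += 1
        for w1 in set(child):
            for w2 in set(parent):
                pair_count[(w1, w2)] += 1
    return {(w1, w2): c / (word_count[w1] * word_count[w2])
            for (w1, w2), c in pair_count.items()}

def frequent_words_score(comment, candidate, scores):
    """Pseudocode 2: sum the pair scores over all word pairs of the two comments."""
    return sum(scores.get((w1, w2), 0.0) for w1 in comment for w2 in candidate)
```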

Table 1: The dataset-specific properties.

Datasets        Language    Confirmation by moderator    Users register
Thestandard     English     Yes                          No
Alef            Persian     Yes                          No
ENENews         English     No                           Yes
Russianblog     Russian     No                           Yes
Courantblogs    English     Yes                          No

For each website, we crawled the webpages that were published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as ground truth. We removed the threads with fewer than 4 comments, because these kinds of threads do not give us much information and their comments usually reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets users can leave comments in a thread with different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews the root usually includes a tweet, which is why the length of the root's text is shorter than in the other datasets. All comments have an author's name except for some comments in Alef; therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments in the articles. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard the threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape: articles are published between 7 a.m. and 7 p.m., and the highest rate belongs to the 4-hour period between 9 a.m. and 1 p.m. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10 a.m. and 6 p.m.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog behave similarly: about one hour after an article is published, it has received most of its comments; after that, the rate of comment reception decreases. Russianblog shows a different behavior: it seems that a part of its visitors reply to its articles the next morning, 16-21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after publication of the main article. Deeper comments indicate longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations continue around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10-20 words. Except for the comments of Russianblog, the datasets are similar. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has comments that are shorter than in the other datasets; this dataset is a personal blog, and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (right below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning the conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results, we use several metrics. For edge prediction, we use the precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree. This causes the precision, recall, and F-score values to be equal [2], since FP (False Positive) and FN (False Negative)


Table 2: Datasets' statistics split up per website.

Datasets       Avg. length of comments' text (words)   Avg. length of the root's text (words)   Number of comments   Avg. comments per news   Number of news   Number of commenters
Thestandard    72.98                                   357.68                                   149058               49.81                     2992             3431
Alef           74.66                                   766.55                                   23210                27.96                     830              4387
ENENews        68.19                                   162.64                                   16661                38.92                     428              831
Courantblogs   79.24                                   398.17                                   4952                 19.19                     258              1716
Russianblog    18.56                                   237.16                                   30877                16.18                     1908             1258

Figure 10: Distribution of the number of comments in news articles.

Figure 11: Publication rate of articles per hour.

Figure 12: Average number of published comments per hour after the root.


Figure 13: Time difference between a node and its corresponding parent, per hour.

Figure 14: Depth of comments, averaged over the elapsed time from publication of the root.

in precision and recall are always equal. Instead, we use the following edge accuracy measure:

Accuracy (Acc)_edge = |reply relations ∩ detected relations| / N    (7)

where the tree is comprised of N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers the paths from nodes to the root:

Accuracy (Acc)_path = [Σ_{i=1}^{|C|} |path_G(i) = path_P(i)|] / |C|    (8)

where path_G(i) is the ground-truth path for the i-th comment and path_P(i) is the predicted path for it; |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it to be completely wrong, so it is very strict. To relax this, Wang et al. introduced

a metric that computes the overlap between the paths in the ground truth and the predicted paths:

Precision (P)_R-path = [Σ_{i=1}^{|C|} (|path_G(i) ∩ path_P(i)| / |path_P(i)|)] / |C|
Recall (R)_R-path = [Σ_{i=1}^{|C|} (|path_G(i) ∩ path_P(i)| / |path_G(i)|)] / |C|    (9)

where |path_P(i)| is the number of comments in the predicted path of the i-th comment and |path_G(i)| is the number of comments in its ground-truth path. The F1-score is the harmonic mean of precision and recall:

F1-score = 2 × Precision × Recall / (Precision + Recall)    (10)

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or main content, which is different from the root of interrogative threads. This causes the structure of threads and the reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words.

Figure 16: Histogram of the depth of comments.

(1) comment-root, that is, the relation between comments and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views about the main article (root). The comment-comment relations show conversations among visitors, which are a valuable source of information. In Figure 17 there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to other users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to detect both types of relations. So we have to modify the evaluation metrics or define new metrics.

We propose an evaluation metric Accuracy (Acc)_CTD, where CTD stands for Comment Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

Accuracy (Acc)_CTD = (TP + TN) / (TP + TN + FP + FN)    (11)

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

Precision (P)_edge = TP / (TP + FP)
Recall (R)_edge = TP / (TP + FN)
F1-score (F)_edge = 2 × Precision × Recall / (Precision + Recall)    (12)

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of comment-comment relations in the ground truth that were not predicted at all.
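The following is a hedged sketch of Acc_CTD and the edge measures of (11)-(12), assuming gold and pred are dictionaries mapping each comment to its ground-truth or predicted parent and that ROOT = 0 denotes the root; the names are illustrative, not the authors' code.

```python
ROOT = 0

def declarative_edge_metrics(gold, pred):
    # Comment type detection (Equation (11)): NP = reply to root, PD = reply to a comment.
    tp = sum(1 for c in gold if gold[c] == ROOT and pred[c] == ROOT)   # correct NP
    tn = sum(1 for c in gold if gold[c] != ROOT and pred[c] != ROOT)   # correct PD
    fp = sum(1 for c in gold if gold[c] != ROOT and pred[c] == ROOT)   # wrongly detected NP
    fn = sum(1 for c in gold if gold[c] == ROOT and pred[c] != ROOT)   # wrongly detected PD
    acc_ctd = (tp + tn) / (tp + tn + fp + fn)

    # Comment-comment relations only (Equation (12)).
    gold_cc = {(c, p) for c, p in gold.items() if p != ROOT}
    pred_cc = {(c, p) for c, p in pred.items() if p != ROOT}
    correct = len(gold_cc & pred_cc)
    precision = correct / len(pred_cc) if pred_cc else 1.0
    recall = correct / len(gold_cc) if gold_cc else 1.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return acc_ctd, precision, recall, f1
```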


Figure 17: An illustration of a declarative thread, showing comment-root relations, comment-comment relations, and starter discussion comments.

The path accuracy metric mentioned earlier is modified into P_path and R_path versions appropriate for the declarative platform. These metrics consider discussion paths from each PD comment to the starter discussion comment, but not to the root:

Precision (P)_path = [Σ_{i=1}^{|C|_P} |path_G(i) = path_P(i)|] / |C|_P,  or 1 if |C|_P = 0
Recall (R)_path = [Σ_{i=1}^{|C|_R} |path_G(i) = path_P(i)|] / |C|_R,  or 1 if |C|_R = 0    (13)

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread. path_G(i) is the discussion path from the i-th node to the discussion starter comment in the ground truth, and path_P(i) is the discussion path from the i-th node to the discussion starter comment in the predicted thread.

Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

Precision (P)_R-path = [Σ_{i=1}^{|C|_P} (|path_G(i) ∩ path_P(i)| / |path_P(i)|)] / |C|_P,  or 1 if |C|_P = 0
Recall (R)_R-path = [Σ_{i=1}^{|C|_R} (|path_G(i) ∩ path_P(i)| / |path_G(i)|)] / |C|_R,  or 1 if |C|_R = 0    (14)
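A hedged sketch of the relaxed declarative path metrics of (14) is given below, under the same representation as before (gold and pred map each comment to its parent, with ROOT = 0); the path of a PD comment is followed up to, and excluding, the root, that is, to the discussion starter comment. The names are illustrative only.

```python
ROOT = 0

def path_to_starter(c, parent):
    """Ancestors of comment c up to and including the discussion starter, excluding the root."""
    path = []
    while parent[c] != ROOT:
        c = parent[c]
        path.append(c)
    return path

def relaxed_path_metrics(gold, pred):
    pd_pred = [c for c in pred if pred[c] != ROOT]
    pd_gold = [c for c in gold if gold[c] != ROOT]
    precision = (sum(len(set(path_to_starter(c, gold)) & set(path_to_starter(c, pred)))
                     / max(len(path_to_starter(c, pred)), 1) for c in pd_pred) / len(pd_pred)
                 ) if pd_pred else 1.0
    recall = (sum(len(set(path_to_starter(c, gold)) & set(path_to_starter(c, pred)))
                  / max(len(path_to_starter(c, gold)), 1) for c in pd_gold) / len(pd_gold)
              ) if pd_gold else 1.0
    return precision, recall
```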

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3 the predicted thread of Figure 18 has been evaluated with the metrics for interrogative threads; the results show high values. Table 4 shows the results of evaluation with the metrics for declarative threads; these results are more appropriate.

There are two reasons why the declarative metrics evaluate the predicted structure better: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance when we simply link all comments to their previous comment (we name it the Last baseline) [3, 4]. This method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).

The only work that has been done on RTS has focused on the author's name in online news agencies [19]. Schuth et al.'s method uses several features to find the commenter's name in online news articles. This method achieves good precision, but it has low recall, because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method for interrogative platforms.

Seo et al.'s method focuses on forums and emails and uses the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all of their proposed features except the quoted text.

We applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which many replies are connected to the root.

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.


Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Accuracy_edge:
  Reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}
  Detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}
  Acc_edge = 8/11 ≈ 0.73

Acc_path:
  |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / (number of comments) = 8/11 ≈ 0.73

P_R-path:
  [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / (number of comments)
  = ((1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (1 + 1/2 + 2/3)) / 11 = 10.17/11 ≈ 0.92

R_R-path:
  [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / (number of comments)
  = ((1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (2/3 + 1 + 2/3)) / 11 = 10.34/11 ≈ 0.94

F1-score_R-path:
  2 × 0.92 × 0.94 / (0.92 + 0.94) ≈ 0.93


Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Acc_CTD:
  TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}
  Acc_CTD = (5 + 5) / (5 + 5 + 1 + 0) ≈ 0.91

P_edge, R_edge, F1_edge:
  TP = {3→1, 5→2, 6→5}, FP = {4→1, 7→2, 8→7}, FN = {4→3, 8→5}
  P_edge = 3 / (3 + 3) = 0.5, R_edge = 3 / (3 + 2) = 0.6, F1-score_edge ≈ 0.55

P_path, R_path, F1_path:
  P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5
  R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6
  F1-score_path ≈ 0.55

P_R-path, R_R-path, F1_R-path:
  P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2) / |{3, 4, 5, 6, 7, 8}| = 4.5/6 = 0.75
  R_R-path = (1 + 1/2 + 1 + 1 + 1/2) / |{3, 4, 5, 6, 8}| = 4/5 = 0.8
  F1-score_R-path = 2 × 0.75 × 0.8 / (0.75 + 0.8) ≈ 0.77

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef, because more comments are linked directly to the root. Usually, SLARTS and Seo et al.'s methods cannot predict the full path in Thestandard and ENENews, because paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads: based on these metrics, First shows high performance even though this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge SLARTS performs better than Seo et al.'s method. For P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations


Figure 19: Three real threads in which nodes are labeled with the commenter's name and the sequence number of his/her comments.

(like the First baseline). On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So, Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. In contrast, the P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than those of Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex than in the other datasets (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets. Note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog. In these datasets, many of the defined features perform well (the importance of the features in the ENENews and Russianblog datasets will be shown in Table 7, and we will explain more in the next section).


Table 5: Experimental results with interrogative evaluation metrics.

Datasets       Methods            Acc_edge        Acc_path        P_R-path        R_R-path        F1_R-path
Thestandard    SLARTS             0.4669±0.009    0.3059±0.013    0.6927±0.018    0.7450±0.010    0.7179±0.013
               Seo et al.'s [2]   0.3941±0.005    0.3413±0.006    0.8303±0.009    0.6563±0.005    0.7331±0.003
               First              0.3630±0.010    0.3630±0.010    1.0000±0.000    0.5846±0.009    0.7379±0.007
               Last               0.1824±0.006    0.0495±0.004    0.2214±0.008    1.0000±0.000    0.3625±0.010
ENENews        SLARTS             0.4837±0.018    0.3695±0.027    0.7625±0.026    0.7770±0.013    0.7697±0.017
               Seo et al.'s [2]   0.4661±0.017    0.4134±0.021    0.8720±0.011    0.7086±0.013    0.7818±0.011
               First              0.4524±0.018    0.4524±0.018    1.0000±0.000    0.6634±0.016    0.7976±0.011
               Last               0.2090±0.016    0.0596±0.003    0.2319±0.011    1.0000±0.000    0.3764±0.015
Courantblogs   SLARTS             0.6843±0.028    0.6453±0.033    0.9127±0.013    0.8619±0.014    0.8865±0.013
               Seo et al.'s [2]   0.6700±0.040    0.6628±0.043    0.9667±0.014    0.8250±0.018    0.8902±0.015
               First              0.6490±0.036    0.6490±0.036    1.0000±0.000    0.8001±0.019    0.8889±0.012
               Last               0.2037±0.017    0.1082±0.008    0.3183±0.011    1.0000±0.000    0.4829±0.012
Russianblog    SLARTS             0.7463±0.017    0.6904±0.017    0.8741±0.008    0.8787±0.006    0.8764±0.007
               Seo et al.'s [2]   0.5446±0.010    0.5174±0.011    0.9293±0.003    0.7821±0.004    0.8494±0.003
               First              0.4968±0.006    0.4968±0.006    1.0000±0.000    0.7134±0.004    0.8327±0.003
               Last               0.3632±0.018    0.1549±0.012    0.3695±0.015    1.0000±0.000    0.5395±0.016
Alef           SLARTS             0.7753±0.017    0.7641±0.018    0.9579±0.006    0.9033±0.008    0.9298±0.007
               Seo et al.'s [2]   0.7802±0.019    0.7800±0.019    0.9950±0.003    0.8828±0.010    0.9355±0.006
               First              0.7853±0.018    0.7853±0.018    1.0000±0.000    0.8826±0.010    0.9376±0.006
               Last               0.0993±0.005    0.0822±0.005    0.2526±0.012    1.0000±0.000    0.4032±0.015

Bold style in the original marks the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets on all metrics. The main reason is that its comments are not confirmed by a moderator. This causes the Acc_edge of the Last baseline on Russianblog to be 0.3632, which is higher than on the other datasets (Table 5); the Acc_edge of the Last baseline has an inverse relationship with the complexity of the replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier; in other words, as the parent appears closer to its child, features such as frequent pattern and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s methods appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., the candidate filtering rules), the Acc_CTD of SLARTS is better than that of Seo


et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, which in turn makes the textual features perform poorly in detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets, because many of their relations are comment-root (Figure 16). This biases the ranking SVM towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We computed P values to determine the significance of the differences between the SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their score and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features using backward feature selection. That means that, to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, because they mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15), and ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the length of the comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low. To increase the performance of the textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog, and recall in ENENews.

The frequent words feature brings a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog, because the author's name in Russianblog is different from the other datasets: it is an email address, and no one would refer to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and has a good effect on Acc_CTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets: it removes the root candidate accurately and also has a good effect on Acc_CTD. The maximum improvement of R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 brings small improvements in the precision and recall of the evaluation metrics in both datasets; the maximum improvement is obtained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root, so the detection of comment-root relations is improved. It also improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information which can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and the numbers of their posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" is a common thread with short conversations about the root. The height and width of the trees can serve as measures to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations, because users read them and then read the other comments to better understand the conversations.

(iii) Which comments have participated in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics.

Thestandard
  SLARTS:           P_edge 0.3782±0.008, R_edge 0.4155±0.011, F_edge 0.3960±0.006, P_path 0.1565±0.007, R_path 0.1713±0.008, F_path 0.1636±0.007, P_R-path 0.3628±0.012, R_R-path 0.4166±0.016, F_R-path 0.3876±0.005, Acc_CTD 0.7525±0.005
  Seo et al.'s [2]: P_edge 0.3014±0.031, R_edge 0.1414±0.015, F_edge 0.1921±0.016, P_path 0.1544±0.025, R_path 0.0576±0.011, F_path 0.0835±0.013, P_R-path 0.3013±0.038, R_R-path 0.1319±0.018, F_R-path 0.1831±0.021, Acc_CTD 0.5831±0.007
  Difference:       0.0778, 0.2741, 0.2039, 0.0021, 0.1137, 0.0801, 0.0615, 0.2847, 0.2045, 0.1694

ENENews
  SLARTS:           P_edge 0.3797±0.038, R_edge 0.3891±0.019, F_edge 0.3839±0.024, P_path 0.1873±0.040, R_path 0.1855±0.026, F_path 0.1857±0.030, P_R-path 0.3768±0.048, R_R-path 0.3768±0.017, F_R-path 0.3759±0.025, Acc_CTD 0.6872±0.014
  Seo et al.'s [2]: P_edge 0.4060±0.030, R_edge 0.1713±0.016, F_edge 0.2407±0.018, P_path 0.2378±0.034, R_path 0.0837±0.016, F_path 0.1235±0.021, P_R-path 0.3947±0.035, R_R-path 0.1523±0.015, F_R-path 0.2196±0.019, Acc_CTD 0.5954±0.009
  Difference:       −0.0263, 0.2178, 0.1432, −0.0505, 0.1018, 0.0622, −0.0179, 0.2245, 0.1563, 0.0918

Courantblogs
  SLARTS:           P_edge 0.5236±0.030, R_edge 0.3963±0.073, F_edge 0.4495±0.052, P_path 0.4220±0.057, R_path 0.3110±0.058, F_path 0.3566±0.049, P_R-path 0.5325±0.049, R_R-path 0.3853±0.075, F_R-path 0.4452±0.057, Acc_CTD 0.7625±0.033
  Seo et al.'s [2]: P_edge 0.7320±0.090, R_edge 0.2049±0.062, F_edge 0.3189±0.082, P_path 0.6815±0.105, R_path 0.1896±0.070, F_path 0.2951±0.093, P_R-path 0.7309±0.094, R_R-path 0.2010±0.065, F_R-path 0.3140±0.086, Acc_CTD 0.7001±0.031
  Difference:       −0.2084, 0.1914, 0.1306, −0.2595, 0.1214, 0.0615, −0.1984, 0.1843, 0.1312, 0.0624

Russianblog
  SLARTS:           P_edge 0.5940±0.028, R_edge 0.5674±0.022, F_edge 0.5803±0.023, P_path 0.4907±0.031, R_path 0.4664±0.020, F_path 0.4782±0.024, P_R-path 0.5878±0.028, R_R-path 0.5823±0.021, F_R-path 0.5849±0.020, Acc_CTD 0.9116±0.011
  Seo et al.'s [2]: P_edge 0.4705±0.020, R_edge 0.3030±0.015, F_edge 0.3685±0.016, P_path 0.3823±0.023, R_path 0.2534±0.013, F_path 0.3046±0.015, P_R-path 0.4682±0.020, R_R-path 0.2822±0.014, F_R-path 0.3521±0.015, Acc_CTD 0.5862±0.013
  Difference:       0.1235, 0.2644, 0.2118, 0.1084, 0.2130, 0.1736, 0.1196, 0.3001, 0.2328, 0.3254

Alef
  SLARTS:           P_edge 0.6189±0.028, R_edge 0.4131±0.040, F_edge 0.4952±0.037, P_path 0.5639±0.027, R_path 0.3819±0.037, F_path 0.4551±0.034, P_R-path 0.6277±0.027, R_R-path 0.4069±0.043, F_R-path 0.4934±0.039, Acc_CTD 0.8044±0.013
  Seo et al.'s [2]: P_edge 0.8827±0.052, R_edge 0.2637±0.051, F_edge 0.4045±0.060, P_path 0.8797±0.053, R_path 0.2631±0.051, F_path 0.4034±0.060, P_R-path 0.8840±0.052, R_R-path 0.2635±0.051, F_R-path 0.4043±0.060, Acc_CTD 0.7846±0.018
  Difference:       −0.2638, 0.1494, 0.0907, −0.3158, 0.1188, 0.0517, −0.2563, 0.1434, 0.0891, 0.0198

(Difference rows list, in order: P_edge, R_edge, F_edge, P_path, R_path, F_path, P_R-path, R_R-path, F_R-path, Acc_CTD. Bold style in the original marks the best result in each metric.)

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have started long conversations. These users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2), H(2), H(3), I(2), O(1) and F(2), H(2), I(1) show two conversations in thread "C".

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of this. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine these features.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. Also, we showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, that extract relations among users can be useful in interrogative threads as well.

In the future we would like to apply the proposed method to interrogative platforms such as forums. Also, we would like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Feature                      Dataset        P_edge      R_edge      P_path      R_path      P_R-path    R_R-path    Acc_CTD
Similarity                   ENENews       −0.00233    0.07415    −0.00538     0.03289    −0.01095     0.08996     0.04684
                             Russianblog    0.00874    0.00012     0.00963     0.00347     0.00695    −0.01079     0.00355
Authors' language            ENENews       −0.01354    0.03246    −0.01785     0.00693    −0.02757     0.04052     0.00458
                             Russianblog   −0.00229    0.01052    −0.00179     0.00750    −0.00590     0.00807    −0.00182
Global similarity            ENENews       −0.00402    0.05604    −0.01338     0.01581    −0.01129     0.07086     0.02742
                             Russianblog    0.01090    0.01443     0.00582     0.00897     0.00368    −0.00317    −0.00087
Frequent words               ENENews        0.00252    0.00931     0.00170     0.00424    −0.00132     0.00973     0.00292
                             Russianblog    0.00008    0.00023     0.00016     0.00029     0.00034     0.00047    −0.00011
Length ratio                 ENENews        0.04206   −0.02984     0.04450     0.01700     0.06421    −0.05336    −0.00850
                             Russianblog    0.00651   −0.00355     0.00920     0.00080     0.00784    −0.00592     0.00235
Authors' name                ENENews        0.03753    0.06392     0.01920     0.03285     0.02341     0.05998     0.01788
                             Russianblog    0.00000    0.00000     0.00000     0.00000     0.00000     0.00000     0.00000
Frequent pattern             ENENews       −0.02028    0.02930    −0.02357     0.00300    −0.03793     0.03410     0.00526
                             Russianblog    0.00458    0.06100     0.00830     0.05206    −0.00545     0.06150     0.00404
Location prior               ENENews        0.08778   −0.03690     0.08556     0.03280     0.11792    −0.07852     0.02925
                             Russianblog    0.16091    0.09671     0.17766     0.13515     0.16699     0.08793     0.02414
Candidate filtering rule 1   ENENews       −0.05431    0.06138    −0.04102     0.01296    −0.06458     0.07545     0.06949
                             Russianblog   −0.12506    0.34370    −0.13916     0.27445    −0.12683     0.37003     0.32546
Candidate filtering rule 2   ENENews        0.00014   −0.00004     0.00048     0.00041    −0.00038    −0.00035    −0.00009
                             Russianblog    0.02575    0.02475     0.02462     0.02341     0.01746     0.00802     0.00000
Candidate filtering rule 3   ENENews        0.00357   −0.00121     0.00452     0.00254     0.00551    −0.00161     0.00161
                             Russianblog    0.08521   −0.01978     0.09636     0.01702     0.09380    −0.03825     0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text-processing tools such as Wordnet and Named Entity Recognition (as they are not available in the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rosé, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rosé, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gómez, A. Kaltenbrunner, and V. López, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


[Figure 4 shows the RTS algorithm: a ranking SVM (SVM-rank) is trained on the training set and produces a learning model, that is, a list of weights, one per feature. At test time, the candidates of the i-th comment (the root R and comments 1 ... i−1) are scored with a linear kernel:

    For each candidate: for j = 0 to i − 1 do
        For each feature: for k = 0 to |Features| do
            A[j] += model.weight[k] * feature[j][k]   (linear kernel function)
    Return the parent's label: return argmax_j A[j]]

Figure 4: The RTS algorithm.
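The scoring step of Figure 4 is a plain linear ranking function over the learned feature weights. A minimal sketch of this step, assuming the feature vectors and weights are already available (the names `feature`, `weights`, and the candidate indexing are illustrative, not taken from the authors' code), could look like this:

```python
def predict_parent(i, feature, weights):
    """Pick the parent of the i-th comment among candidates 0..i-1.

    feature[j][k] is the k-th feature value of candidate j for comment i,
    weights[k] is the weight learned for feature k by the ranking SVM.
    Candidate 0 is conventionally taken to be the root here (an assumption).
    """
    scores = []
    for j in range(i):                                        # every earlier comment (and the root) is a candidate
        s = sum(w * f for w, f in zip(weights, feature[j]))   # linear kernel: w . x
        scores.append(s)
    return max(range(i), key=lambda j: scores[j])             # argmax over candidates
```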

… represent the text body of comments. A comment's text can be considered as a vector of terms weighted by TF-IDF. TF (Term Frequency) is the number of times a word appears in a comment, and IDF (Inverse Document Frequency) is computed according to the following formula:

$$ \mathrm{IDF}(w) = \log\left(\frac{N}{n_w}\right), \quad (1) $$

where N is the total number of comments in a news item and n_w is the number of comments that contain the word w. IDF is usually extracted from the whole training data, but we have limited it to the set of comments in a thread. We believe this makes it more accurate, for example, when a word has a low frequency in the whole training data but a high frequency in a thread.

In order to measure the similarity between two comments, the stop-words are deleted first, and then the words of the comments are stemmed by the Porter algorithm (for the English datasets); this step is not performed for the Russian and Persian datasets. The common words of the two comments are extracted and their weights are calculated based on TF-IDF. Then the final score is obtained by aggregating the weights of the common words according to (2). Since the root text is usually longer than the comments, the log of the product of their lengths is used in the denominator:

$$ \mathrm{Score}(C_1, C_2) = \sum_{w \in C\text{-words}} \bigl(1 + \log\left(\mathrm{TF}(w, C_1) \times \mathrm{TF}(w, C_2)\right)\bigr) \times \mathrm{IDF}(w) \times \bigl(\log\left(|C_1| \times |C_2|\right)\bigr)^{-1}, \quad (2) $$

where C-words is the set of words held in common between comments C_1 and C_2, and |C| is the length of the comment's text.

Comments are usually informal and contain typographical errors. Some words in two comments might be the same, but due to spelling errors it is difficult to detect this. In order to solve this issue, we use the minimum edit distance (MED) algorithm. The minimum edit distance of two words is the minimum number of edit operations (insertion, deletion, substitution, and transposition) needed to transform one word into the other [14]. The costs of insertion, deletion, substitution, and transposition are 1, 1, 2, and 1, respectively.

Two words in different comments are considered common words if either they are exactly the same or they seem to be the same but contain some typographical errors. In the latter case, if the length of the words is greater than five, their first two letters are the same, and their edit distance is lower than 4, the two words are considered common words. For example, the two words "Beautiful" and "Beuatiful" are considered common words.
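A rough sketch of this similarity feature, combining the typo-tolerant common-word test with the score of equation (2), might look as follows. The tokenization, stop-word handling, and the simplified edit distance (without the transposition operation) are assumptions for illustration, not the authors' exact implementation:

```python
import math
from collections import Counter

def edit_distance(a, b, sub_cost=2):
    """Weighted edit distance: insert/delete cost 1, substitution cost 2.
    The transposition operation from the paper is omitted in this sketch."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else sub_cost))
    return d[len(a)][len(b)]

def is_common(w1, w2):
    # exact match, or a likely typo: long words, same first two letters, small edit distance
    if w1 == w2:
        return True
    return (len(w1) > 5 and len(w2) > 5 and w1[:2] == w2[:2]
            and edit_distance(w1, w2) < 4)

def similarity(c1, c2, idf):
    """Equation (2): sum over common words of (1 + log(tf1*tf2)) * IDF(w), length-normalized.
    c1, c2 are lists of stemmed, stop-word-free tokens; idf maps word -> IDF value."""
    tf1, tf2 = Counter(c1), Counter(c2)
    score = 0.0
    for w1 in tf1:
        for w2 in tf2:
            if is_common(w1, w2):
                score += (1 + math.log(tf1[w1] * tf2[w2])) * idf.get(w2, 0.0)
    return score / max(math.log(len(c1) * len(c2)), 1.0)   # guard for very short comments (assumption)
```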

4.2.2. Authors' Language Model. The idea is that commenters who talk to each other are more likely to use similar words in their comments. In order to take advantage of this, all comments written by a commenter are appended, which makes a collection of words for each commenter. If the collections of words of two commenters are very similar, the commenters can be related to each other. In Figure 5, commenter "A1" wrote three comments; these three comments are appended and make a collection of words for commenter "A1", and then, as for the first feature, the similarity is calculated between the collections of words of commenters "A1" and "A2". The similarity score obtained between the two commenters "A1" and "A2" is used as this feature's score for the relations between their comments.
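A compact way to express this feature, under the assumption that comments are already grouped per author, is to concatenate each author's comments and reuse the comment-similarity sketch above (the function and variable names are illustrative):

```python
from collections import defaultdict

def author_language_scores(comments, authors, idf, similarity):
    """comments[i]: token list of the i-th comment; authors[i]: its author.
    Returns a score for every ordered author pair, used as the feature value
    for all relations between their comments."""
    bag = defaultdict(list)
    for toks, a in zip(comments, authors):
        bag[a].extend(toks)                       # append all comments of one author
    score = {}
    for a1 in bag:
        for a2 in bag:
            if a1 != a2:
                score[(a1, a2)] = similarity(bag[a1], bag[a2], idf)
    return score
```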

4.2.3. Prior Location. The position of comments can reveal some information about their hierarchy; for example, the first comments usually have more children than the others, or the comments located just before the i-th comment are more likely to be its parent. In general, we would like to estimate p(i | j), that is, knowing that a comment is in position j, the likelihood that the comment in position i is its parent [2]. So we calculate prior probabilities of being the parent for the different positions.

To calculate the prior probability for position j, we count the number of times a comment in position i is the parent of a comment in position j in the training set. Figure 6 shows the prior probability for comments in positions 1 to 100 in the Thestandard dataset. The highest prior probability belongs to the root and then to the comments located just before the comment. The sample points in Figure 6 show five parent positions (the root, 10, 30, 57, and 59) and how probable it is for the 60th comment to be their child: the root has the highest prior probability, equal to 0.1852, followed by the 59th comment with a probability of 0.1236. Also, Figure 7 shows the prior probability of child-parent relations from comment 40 to 60.
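The prior itself is a normalized count over the training threads. A minimal sketch, assuming each thread is represented as a map from child position to parent position (0 standing for the root; the naming is ours):

```python
from collections import Counter

def location_prior(training_threads):
    """training_threads: list of threads, each a dict {child_pos: parent_pos},
    with parent_pos 0 meaning the root.
    Returns p[(i, j)] = P(parent is at position i | child is at position j)."""
    pair_counts = Counter()
    child_totals = Counter()
    for thread in training_threads:
        for j, i in thread.items():
            pair_counts[(i, j)] += 1
            child_totals[j] += 1
    return {(i, j): c / child_totals[j] for (i, j), c in pair_counts.items()}
```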

4.2.4. Reference to Authors' Name. If the name of an author appears in a comment, then all his/her comments are considered potential candidates for being the parent of that comment [19]. Sometimes only a part of the complete name is referenced; sometimes the author's name is made up of two parts, and both of these parts could be used by other authors for reference. We also consider these types of references: we keep each part of the author's name and then remove the parts which are stop-words.
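A minimal sketch of this name-reference check (splitting a name into parts, dropping stop-word parts, and flagging any candidate author mentioned in a comment; the helper names are hypothetical):

```python
def name_parts(author, stop_words):
    """Split an author name into its parts and drop parts that are stop-words."""
    return [p for p in author.lower().split() if p not in stop_words]

def mentions_author(comment_tokens, author, stop_words):
    """True if any kept part of the author's name appears in the comment tokens."""
    toks = {t.lower() for t in comment_tokens}
    return any(part in toks for part in name_parts(author, stop_words))
```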

4.2.5. Global Similarity. This feature is based on ranking the similarity of the comments' text and the root's text, which takes a global view of text similarity. According to this feature, if comment "A" has the highest similarity with comment "B" and, inversely, comment "B" has the highest similarity with comment "A" among the other candidates, it is more likely that there is a relation between comment "A" and comment "B". To relax this feature, the similarity measurement is calculated for each comment against all comments, and then the comments are sorted based on their similarity score. For example, in Figure 8 we are looking for the parent of the fifth comment. In the first step, according to formula (2), comments are sorted based on the text similarity score with respect to the fifth comment, and comments that are not candidates of the fifth comment are removed (the removed comments are shown in black in Figure 8). In the second step, in the same way, comments are sorted with respect to each candidate of the fifth comment, and comments that are not candidates of the fifth comment are removed, except the fifth comment itself. Finally, formula (3) is used to calculate the Ranking-distance score; two comments which are the most similar to each other receive a higher Ranking-distance score. In Figure 8 the highest score belongs to the relation with the fourth comment. This feature is symmetric and the similarity between two comments is only calculated once; in other words, this feature needs O(N^2) pairwise text similarity comparisons.

$$ \mathrm{Ranking\text{-}distance}(C_1, C_2) = \bigl(|C_1 \& C_2| + |C_2 \& C_1|\bigr)^{-1}, \quad (3) $$

where C_1 and C_2 are two comments and |C_1 & C_2| is the text-similarity rank distance between C_1 and C_2 (the position of C_2 in the similarity-sorted candidate list of C_1).
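In code, the Ranking-distance of equation (3) can be sketched as below, where rank_of(a, b) is the 1-based position of comment b in the similarity-sorted candidate list of comment a; the pairwise similarities are computed once and reused, which is where the O(N^2) cost comes from (names are illustrative):

```python
def rank_of(a, b, sim, candidates):
    """Position of b when the candidates of a are sorted by decreasing similarity to a."""
    order = sorted(candidates[a], key=lambda c: sim[(a, c)], reverse=True)
    return order.index(b) + 1

def ranking_distance(c1, c2, sim, candidates):
    # equation (3): the more mutually top-ranked two comments are, the larger the score,
    # e.g. 1/(1+3) and 1/(2+1) in the Figure 8 example
    return 1.0 / (rank_of(c1, c2, sim, candidates) + rank_of(c2, c1, sim, candidates))
```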

4.2.6. Frequent Patterns of Authors' Appearance. The idea is that the comments of two commenters who talk to each other usually appear close together in chronological order. So, if their comments appear many times just after each other, this feature gives a high score to their comments. In order to implement this feature, we use the following formula:

$$ \mathrm{FPScore}(i, j) = f(A_i, A_j) + f(A_j, A_i), \quad (4) $$

where A_i is the author of comment i and f(a, b) is the number of times that comments of author a appear just before comments of author b (see Pseudocode 1).

Figure 9 shows a time line which includes 7 comments written by 4 commenters. The feature score is calculated for the relations among the comments; the score of the relation between the comments of A and B is 3, which is more than that of A and C.

4.2.7. Length Ratio. The text of a parent is usually longer than that of its children. The length ratio score is calculated according to

$$ \mathrm{length\ ratio\ score}(C_C, C_P) = \frac{|C_P|}{|C_C|}, \quad (5) $$

where C_C is the comment looking for a parent and C_P is a candidate parent.

4.2.8. Frequent Words in Parent-Child Comments. Sometimes a pair of words appears frequently, one word in the parent and the other word in its children. For example, in the ENENews dataset the most frequent pairs are (believe, people), (accident, fukushima), (new, discovered), (idea, report), and (people, public). We use pointwise mutual information (PMI) [38] to find the frequent patterns:

$$ \mathrm{Score}(W_1, W_2) = \frac{\mathrm{Count}(W_1, W_2)}{\mathrm{Count}(W_1) \times \mathrm{Count}(W_2)}, \quad (6) $$

where W_1 is a word in the comment whose parent we are looking for, W_2 is a word in its candidate parent, and Count(W_1, W_2) is the number of times W_1 has appeared in the child and W_2 has appeared in the parent. The numerator computes how often the two words appear together, and the denominator computes how often each of the words appears. Finally, according to Pseudocode 2, the score of the relation between two comments is calculated.
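The PMI-style score of equation (6) and its use in Pseudocode 2 can be sketched as below; how the counts are collected from training parent-child pairs is our assumption for illustration, and the names are not the authors' own:

```python
from collections import Counter

def train_pair_scores(training_pairs):
    """training_pairs: list of (child_tokens, parent_tokens) taken from training threads."""
    pair_count, word_count = Counter(), Counter()
    for child, parent in training_pairs:
        for w1 in child:
            word_count[w1] += 1
            for w2 in parent:
                pair_count[(w1, w2)] += 1        # W1 in the child, W2 in the parent
        for w2 in parent:
            word_count[w2] += 1
    # equation (6): co-occurrence count normalized by the individual word counts
    return {p: c / (word_count[p[0]] * word_count[p[1]]) for p, c in pair_count.items()}

def frequent_words_score(comment, candidate, pair_score):
    # Pseudocode 2: sum the learned pair scores over all word pairs of the two comments
    return sum(pair_score.get((w1, w2), 0.0) for w1 in comment for w2 in candidate)
```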

Figure 5: The authors' language feature based on author "A1" and author "A2": the comments of each author are appended into one collection, and the similarity between the two collections is computed.

Figure 6: The prior probability of child-parent relations for comments 1 to 100 in Thestandard (X: parent position, Y: child position, Z: prior probability).

Figure 7: The prior probability of child-parent relations from comments 40 to 60 (X: parent position, Y: child position).

Figure 8: An example of the global similarity feature: for each comment, the other comments are sorted based on the text similarity score, non-candidate comments are removed, and the Ranking-distance of equation (3) is computed (e.g., 1/(1 + 3) and 1/(2 + 1)).


Step 1:
For i = 1 to Comments − 1
    if Author(i) ≠ Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments                       (for each comment which is looking for its parent)
    For j = 0 to i − 1                      (for each candidate)
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Pseudocode to calculate frequent patterns of authors' appearance.

Figure 9: An example of the authors' appearance sequence. The time line A, B, C, A, B, A, D contains 7 comments written by 4 commenters; counting adjacent pairs gives A→B: 2, B→C: 1, C→A: 1, B→A: 1, A→D: 1, so Frequent(A, B) = 3 and Frequent(A, C) = 1.

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates (a minimal sketch of these rules is given after the list):

(1) Usually a commenter does not reply to the root in his/her second comment. So, if a commenter has written more than one comment, the root is removed from the parent candidates of the second and subsequent comments. This was shown to be a useful heuristic: the root is an important candidate, so when we remove it correctly, the results improve significantly.

(2) A commenter does not reply to him/herself. So we simply remove all comments of a commenter from his/her comments' candidates [2].

(3) Commenters who post only one comment in a thread are more likely to reply to the root post. So the other candidates can be removed, keeping only the root.
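A sketch of these three filtering rules, applied before ranking; the helpers `author_of` and `comments_by_author` and the use of 0 for the root are illustrative assumptions:

```python
def filter_candidates(i, candidates, author_of, comments_by_author):
    """candidates: parent-candidate ids of comment i, where 0 denotes the root;
    comments_by_author maps an author to the ids of all his/her comments in the thread."""
    me = author_of(i)
    mine = comments_by_author[me]
    if len(mine) == 1:
        return [0]                            # rule 3: single-comment authors most likely reply to the root
    kept = []
    for c in candidates:
        if c == 0 and min(mine) < i:
            continue                          # rule 1: drop the root for an author's second and later comments
        if c != 0 and author_of(c) == me:
            continue                          # rule 2: a commenter does not reply to him/herself
        kept.append(c)
    return kept
```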

5. Experiments

In this section we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies established in August 2007 and 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English and Alef is in Persian; Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively; they were established in 2003 and 2012, respectively.

The reasons for selecting these websites are as follows: (1) they support multilevel replies; (2) their users are active and articles usually receive many comments; (3) the author and time of comments are available; (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each Word1 in comment
        For each Word2 in candidate-comment
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: A pseudocode for calculating the score of frequent words between two comments.

Table 1: The dataset-specific properties.

Datasets       Language   Confirmation by moderator   Users register
Thestandard    English    ✓                           ×
Alef           Persian    ✓                           ×
ENENews        English    ×                           ✓
Russianblog    Russian    ×                           ✓
Courantblogs   English    ✓                           ×

For each website we crawled the webpages that were published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as ground truth. We removed the threads with fewer than 4 comments, because these kinds of threads do not give us much information and usually their comments reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets users can leave comments in a thread with different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The length of comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews the root usually includes a tweet, which is why the length of the root's text is shorter than in the other datasets. All comments have an author's name except for some comments in Alef; therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments in the articles. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard the threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape: articles are published between 7:00 and 19:00, and the highest rate belongs to the four-hour period between 9:00 and 13:00. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10:00 and 18:00.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog behave similarly: about one hour after the article is published, it has received most of its comments; after that, the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16–21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after publication of the main article. Deeper comments show longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations continue around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10–20 words. Except for the comments of Russianblog, the datasets are similar. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has comments that are shorter than in the other datasets; this dataset is a personal blog and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning that conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results we use several metrics. For edge prediction we use precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree; this causes the precision, recall, and F-score values to be equal [2].

Table 2: Datasets' statistics split up per website.

Datasets       Avg. length of comments' text (words)   Avg. length of the root's text (words)   Number of comments   Avg. comments per news   Number of news   Number of commenters
Thestandard    72.98                                   357.68                                   149,058              49.81                    2,992            3,431
Alef           74.66                                   766.55                                    23,210              27.96                      830            4,387
ENENews        68.19                                   162.64                                    16,661              38.92                      428              831
Courantblogs   79.24                                   398.17                                     4,952              19.19                      258            1,716
Russianblog    18.56                                   237.16                                    30,877              16.18                    1,908            1,258

Figure 10: Distribution of the number of comments in news articles.

Figure 11: Publication rate of articles per hour.

Figure 12: Average number of published comments per hour after the root.

Figure 13: Time difference between a node and its parent (hours).

Figure 14: Depth of comments averaged over the elapsed time from publication of the root.

Since FP (False Positive) and FN (False Negative) in precision and recall are always equal, we instead use the following edge accuracy measure:

$$ \mathrm{Accuracy\ (Acc)}_{\mathrm{edge}} = \frac{\left|\text{reply relations} \cap \text{detected relations}\right|}{N}, \quad (7) $$

where the tree is comprised of N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers paths from nodes to the root:

$$ \mathrm{Accuracy\ (Acc)}_{\mathrm{path}} = \frac{\sum_{i=1}^{|C|} \left[\mathrm{path}_G(i) = \mathrm{path}_P(i)\right]}{|C|}, \quad (8) $$

where path_G(i) is the ground-truth structure for the i-th comment and path_P(i) is the predicted structure for it; |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it completely wrong, so it is very strict. To relax this, Wang et al. introduced a metric that computes the overlap between the ground-truth path and the predicted path:

$$ \mathrm{Precision}\ (P)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \bigl(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)| \,/\, |\mathrm{path}_P(i)|\bigr)}{|C|}, \qquad \mathrm{Recall}\ (R)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \bigl(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)| \,/\, |\mathrm{path}_G(i)|\bigr)}{|C|}, \quad (9) $$

where |path_P(i)| is the number of comments in the predicted path of the i-th comment and |path_G(i)| is the number of comments in the ground-truth path of the i-th comment. The F1-score is the harmonic mean of precision and recall:

$$ F_1\text{-score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \quad (10) $$
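A sketch of these measures on predicted and gold parent maps, where each thread is a dict child -> parent with the root denoted "R" (the representation and helper names are our own, chosen to mirror the worked example in Table 3):

```python
def path_to_root(node, parent):
    """Chain of ancestors of a node, up to and including the root "R"."""
    path, cur = [], node
    while cur != "R":
        cur = parent[cur]
        path.append(cur)
    return path

def interrogative_metrics(gold, pred):
    n = len(gold)
    acc_edge = sum(pred[c] == gold[c] for c in gold) / n
    acc_path = sum(path_to_root(c, pred) == path_to_root(c, gold) for c in gold) / n
    overlap = {c: len(set(path_to_root(c, gold)) & set(path_to_root(c, pred))) for c in gold}
    p_rel = sum(overlap[c] / len(path_to_root(c, pred)) for c in gold) / n
    r_rel = sum(overlap[c] / len(path_to_root(c, gold)) for c in gold) / n
    f1 = 2 * p_rel * r_rel / (p_rel + r_rel)
    return acc_edge, acc_path, p_rel, r_rel, f1
```

On the Figure 18 example this reproduces the values of Table 3 (Acc_edge ≈ 0.73, P_R-path ≈ 0.92, R_R-path ≈ 0.94).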

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or other main content, which is different from the root of interrogative threads. This causes the structure of threads and reply relations to be different: there are two types of reply relations in declarative threads,

Figure 15: Histogram of the length of comments in words.

Figure 16: Histogram of the depth of comments.

(1) comment-root, that is, the relation between a comment and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views about the main article (root). The comment-comment relations show conversations among visitors, which are a valuable source of information. In Figure 17 there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to assess both types of relations, so we have to modify the evaluation metrics or define new ones.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comment Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

$$ \mathrm{Accuracy\ (Acc)}_{\mathrm{CTD}} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \quad (11) $$

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

$$ \mathrm{Precision}\ (P)_{\mathrm{edge}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad \mathrm{Recall}\ (R)_{\mathrm{edge}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad F_1\text{-score}\ (F)_{\mathrm{edge}} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \quad (12) $$

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of comment-comment relations in the ground truth that were not predicted at all.

Figure 17: An illustration of a declarative thread: the root R with comment-root relations, and comment-comment relations forming discussions that begin with starter discussion comments.

The path accuracy metric mentioned earlier is modified into versions of P_path and R_path that are appropriate for the declarative platform. These metrics consider discussion paths from each PD comment to the discussion-starter comment, but not to the root:

$$ \mathrm{Precision}\ (P)_{\mathrm{path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_P} \left[\mathrm{path}_G(i) = \mathrm{path}_P(i)\right]}{|C|_P} & \text{otherwise} \\[1ex] 1 & \text{if } |C|_P = 0 \end{cases} \qquad \mathrm{Recall}\ (R)_{\mathrm{path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_R} \left[\mathrm{path}_G(i) = \mathrm{path}_P(i)\right]}{|C|_R} & \text{otherwise} \\[1ex] 1 & \text{if } |C|_R = 0 \end{cases} \quad (13) $$

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread; path_G(i) is the discussion path from the i-th node to the discussion-starter comment in the ground truth, and path_P(i) is the discussion path from the i-th node to the discussion-starter comment in the predicted thread.

Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

$$ \mathrm{Precision}\ (P)_{R\text{-path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_P} \bigl(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)| \,/\, |\mathrm{path}_P(i)|\bigr)}{|C|_P} & \text{otherwise} \\[1ex] 1 & \text{if } |C|_P = 0 \end{cases} \qquad \mathrm{Recall}\ (R)_{R\text{-path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_R} \bigl(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)| \,/\, |\mathrm{path}_G(i)|\bigr)}{|C|_R} & \text{otherwise} \\[1ex] 1 & \text{if } |C|_R = 0 \end{cases} \quad (14) $$
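A sketch of the type-detection accuracy and the edge precision/recall/F1 over comment-comment relations only; gold and pred are again child -> parent maps with "R" for the root, and the code mirrors the worked example in Table 4 (illustrative names, not the authors' implementation):

```python
def declarative_edge_metrics(gold, pred):
    # comment type: NP = replies to the root, PD = part of a discussion
    tp_type = sum(1 for c in gold if gold[c] == "R" and pred[c] == "R")
    tn_type = sum(1 for c in gold if gold[c] != "R" and pred[c] != "R")
    acc_ctd = (tp_type + tn_type) / len(gold)

    gold_cc = {(c, p) for c, p in gold.items() if p != "R"}   # comment-comment relations
    pred_cc = {(c, p) for c, p in pred.items() if p != "R"}
    tp = len(gold_cc & pred_cc)
    p_edge = tp / len(pred_cc) if pred_cc else 1.0
    r_edge = tp / len(gold_cc) if gold_cc else 1.0
    f_edge = 2 * p_edge * r_edge / (p_edge + r_edge) if (p_edge + r_edge) else 0.0
    return acc_ctd, p_edge, r_edge, f_edge
```

On the Figure 18 example this gives Acc_CTD ≈ 0.91, P_edge = 0.5, R_edge = 0.6, and F_edge ≈ 0.55, matching Table 4.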

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3 the predicted thread in Figure 18 is evaluated by the metrics from interrogative threads; the results show high values. Table 4 shows the results of the evaluation by the metrics from declarative threads; the results show that the declarative metrics give a more appropriate assessment.

There are two reasons why the declarative metrics better evaluate the predicted structure: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics still show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance when we simply link all comments to their previous comment (we name it the Last baseline) [3, 4]; this method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).

The only work that has been done on RTS in this setting has focused on the author's name in online news agencies [19]. Schuth et al.'s method used several features to find the commenter's name in online news articles. This method achieves good precision, but it has low recall because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method for interrogative platforms.

Seo et al.'s method focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all their proposed features except the quoted text.

We applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which many replies are connected to the root.

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.

Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Evaluation metric   Calculation
Acc_edge            Reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5};
                    Detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7};
                    Acc_edge = 8/11 ≈ 0.73
Acc_path            |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / Number of comments = 8/11 ≈ 0.73
P_R-path            [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / Number of comments
                    = [(1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11 ≈ 0.92
R_R-path            [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / Number of comments
                    = [(1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (2/3 + 1 + 2/3)] / 11 = 10.34/11 ≈ 0.94
F1-score_R-path     2 × 0.92 × 0.94 / (0.92 + 0.94) ≈ 0.93

Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Evaluation metric             Calculation
Acc_CTD                       TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {};
                              Acc_CTD = (5 + 5) / (5 + 5 + 1 + 0) ≈ 0.91
P_edge, R_edge, F1_edge       TP = {3→1, 5→2, 6→5}: P_edge = 3 / (3 + 3) = 0.5;
                              FN = {4→3, 8→5}: R_edge = 3 / (3 + 2) = 0.6;
                              FP = {4→1, 7→2, 8→7}: F1-score_edge ≈ 0.55
P_path, R_path, F1_path       P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5;
                              R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6;
                              F1-score_path ≈ 0.55
P_R-path, R_R-path, F1_R-path P_R-path: predicted paths {3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2} over |{3, 4, 5, 6, 7, 8}|
                              = (1 + 1 + 1 + 1 + 0 + 1/2) / 6 = 4.5/6 = 0.75;
                              R_R-path: ground-truth paths {3→1, 4→3→1, 5→2, 6→5→2, 8→5→2} over |{3, 4, 5, 6, 8}|
                              = (1 + 1/2 + 1 + 1 + 1/2) / 5 = 4/5 = 0.8;
                              F1-score_R-path = 2 × 0.75 × 0.8 / (0.75 + 0.8) ≈ 0.77

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef because more comments are linked directly to the root. Usually SLARTS and Seo et al.'s methods cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads because, based on these metrics, First shows high performance even though this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge SLARTS performs better than Seo et al.'s method. However, for P_edge Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations

Figure 19: Three real threads in which nodes are labeled with the commenter's name and the number of his/her comments.

(like the First baseline). On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. By contrast, P_edge and R_edge of SLARTS are close.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex (Figure 16). According to F_edge, the accuracy for Thestandard and ENENews is lower than for the other datasets; note that F_edge is a strict metric in declarative threads.

SLARTS has better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets many of the defined features perform well (the importance of features in the ENENews and Russianblog datasets is shown in Table 7 and explained further in the next section).

Table 5: Experimental results with interrogative evaluation metrics.

Datasets       Methods            Acc_edge        Acc_path        P_R-path        R_R-path        F1_R-path
Thestandard    SLARTS             0.4669±0.009    0.3059±0.013    0.6927±0.018    0.7450±0.010    0.7179±0.013
               Seo et al.'s [2]   0.3941±0.005    0.3413±0.006    0.8303±0.009    0.6563±0.005    0.7331±0.003
               First              0.3630±0.010    0.3630±0.010    1.0000±0.000    0.5846±0.009    0.7379±0.007
               Last               0.1824±0.006    0.0495±0.004    0.2214±0.008    1.0000±0.000    0.3625±0.010
ENENews        SLARTS             0.4837±0.018    0.3695±0.027    0.7625±0.026    0.7770±0.013    0.7697±0.017
               Seo et al.'s [2]   0.4661±0.017    0.4134±0.021    0.8720±0.011    0.7086±0.013    0.7818±0.011
               First              0.4524±0.018    0.4524±0.018    1.0000±0.000    0.6634±0.016    0.7976±0.011
               Last               0.2090±0.016    0.0596±0.003    0.2319±0.011    1.0000±0.000    0.3764±0.015
Courantblogs   SLARTS             0.6843±0.028    0.6453±0.033    0.9127±0.013    0.8619±0.014    0.8865±0.013
               Seo et al.'s [2]   0.6700±0.040    0.6628±0.043    0.9667±0.014    0.8250±0.018    0.8902±0.015
               First              0.6490±0.036    0.6490±0.036    1.0000±0.000    0.8001±0.019    0.8889±0.012
               Last               0.2037±0.017    0.1082±0.008    0.3183±0.011    1.0000±0.000    0.4829±0.012
Russianblog    SLARTS             0.7463±0.017    0.6904±0.017    0.8741±0.008    0.8787±0.006    0.8764±0.007
               Seo et al.'s [2]   0.5446±0.010    0.5174±0.011    0.9293±0.003    0.7821±0.004    0.8494±0.003
               First              0.4968±0.006    0.4968±0.006    1.0000±0.000    0.7134±0.004    0.8327±0.003
               Last               0.3632±0.018    0.1549±0.012    0.3695±0.015    1.0000±0.000    0.5395±0.016
Alef           SLARTS             0.7753±0.017    0.7641±0.018    0.9579±0.006    0.9033±0.008    0.9298±0.007
               Seo et al.'s [2]   0.7802±0.019    0.7800±0.019    0.9950±0.003    0.8828±0.010    0.9355±0.006
               First              0.7853±0.018    0.7853±0.018    1.0000±0.000    0.8826±0.010    0.9376±0.006
               Last               0.0993±0.005    0.0822±0.005    0.2526±0.012    1.0000±0.000    0.4032±0.015

The bold style (in the original typeset table) marks the best result for each metric.

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be 0.3632, which is higher than in the other datasets (Table 5); Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier. In other words, as the parent appears closer to its child, features such as frequent patterns and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s methods appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., by adding candidate filtering), Acc_CTD of SLARTS is better than Seo et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst result is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, and this makes the textual features perform poorly on detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We computed P-values to specify the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their score and selects the first candidate from the list, only the first candidate is important, so p@1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features. We use backward feature selection; that is, to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, which most likely tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features most likely tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the length of comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low; to increase the performance of the textual features for Russian, we would need a stemmer, a WordNet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature brings a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful in all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog because authors' names in Russianblog are different from the other datasets: the name is an email address, and no one refers to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and has a good improvement on Acc_CTD. According to P_edge, the best improvement is 0.16091 for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets; it removes the root candidate accurately and has a good improvement on Acc_CTD. The maximum improvement for R_edge is 0.34 for Russianblog.

Candidate filtering rule 2 has small improvements on precision and recall of the evaluation metrics in both datasets. The maximum improvement is gained with Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, detection of comment-root relations is improved. Also, this rule improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information which can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and their numbers of posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information such as the following.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread with short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments play an important role in the conversations because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in a conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.

Table 6: Experimental results with declarative evaluation metrics.

Datasets       Method             P_edge         R_edge         F_edge         P_path         R_path         F_path         P_R-path       R_R-path       F_R-path       Acc_CTD
Thestandard    SLARTS             0.3782±0.008   0.4155±0.011   0.3960±0.006   0.1565±0.007   0.1713±0.008   0.1636±0.007   0.3628±0.012   0.4166±0.016   0.3876±0.005   0.7525±0.005
               Seo et al.'s [2]   0.3014±0.031   0.1414±0.015   0.1921±0.016   0.1544±0.025   0.0576±0.011   0.0835±0.013   0.3013±0.038   0.1319±0.018   0.1831±0.021   0.5831±0.007
               Difference         0.0778         0.2741         0.2039         0.0021         0.1137         0.0801         0.0615         0.2847         0.2045         0.1694
ENENews        SLARTS             0.3797±0.038   0.3891±0.019   0.3839±0.024   0.1873±0.040   0.1855±0.026   0.1857±0.030   0.3768±0.048   0.3768±0.017   0.3759±0.025   0.6872±0.014
               Seo et al.'s [2]   0.4060±0.030   0.1713±0.016   0.2407±0.018   0.2378±0.034   0.0837±0.016   0.1235±0.021   0.3947±0.035   0.1523±0.015   0.2196±0.019   0.5954±0.009
               Difference         -0.0263        0.2178         0.1432         -0.0505        0.1018         0.0622         -0.0179        0.2245         0.1563         0.0918
Courantblogs   SLARTS             0.5236±0.030   0.3963±0.073   0.4495±0.052   0.4220±0.057   0.3110±0.058   0.3566±0.049   0.5325±0.049   0.3853±0.075   0.4452±0.057   0.7625±0.033
               Seo et al.'s [2]   0.7320±0.090   0.2049±0.062   0.3189±0.082   0.6815±0.105   0.1896±0.070   0.2951±0.093   0.7309±0.094   0.2010±0.065   0.3140±0.086   0.7001±0.031
               Difference         -0.2084        0.1914         0.1306         -0.2595        0.1214         0.0615         -0.1984        0.1843         0.1312         0.0624
Russianblog    SLARTS             0.5940±0.028   0.5674±0.022   0.5803±0.023   0.4907±0.031   0.4664±0.020   0.4782±0.024   0.5878±0.028   0.5823±0.021   0.5849±0.020   0.9116±0.011
               Seo et al.'s [2]   0.4705±0.020   0.3030±0.015   0.3685±0.016   0.3823±0.023   0.2534±0.013   0.3046±0.015   0.4682±0.020   0.2822±0.014   0.3521±0.015   0.5862±0.013
               Difference         0.1235         0.2644         0.2118         0.1084         0.2130         0.1736         0.1196         0.3001         0.2328         0.3254
Alef           SLARTS             0.6189±0.028   0.4131±0.040   0.4952±0.037   0.5639±0.027   0.3819±0.037   0.4551±0.034   0.6277±0.027   0.4069±0.043   0.4934±0.039   0.8044±0.013
               Seo et al.'s [2]   0.8827±0.052   0.2637±0.051   0.4045±0.060   0.8797±0.053   0.2631±0.051   0.4034±0.060   0.8840±0.052   0.2635±0.051   0.4043±0.060   0.7846±0.018
               Difference         -0.2638        0.1494         0.0907         -0.3158        0.1188         0.0517         -0.2563        0.1434         0.0891         0.0198

The bold style (in the original typeset table) marks the best result for each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have started long conversations. These users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, {F(2), H(2), H(3), I(2), O(1)} and {F(2), H(2), I(1)} are two conversations in thread "C".

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that, because of these differences, many of the previously proposed features perform poorly on declarative threads. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. Also, we showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can also be useful in interrogative threads.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. Also, we would like to analyze the tree-like reply structure in more depth; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.

22 The Scientific World Journal

Table 7: The difference between the evaluation metrics in presence and absence of features.

Feature                      Dataset       P_edge     R_edge     P_path     R_path     P_R-path   R_R-path   Acc_CTD
Similarity                   ENENews       -0.00233    0.07415   -0.00538    0.03289   -0.01095    0.08996    0.04684
                             Russianblog    0.00874    0.00012    0.00963    0.00347    0.00695   -0.01079    0.00355
Authors' language            ENENews       -0.01354    0.03246   -0.01785    0.00693   -0.02757    0.04052    0.00458
                             Russianblog   -0.00229    0.01052   -0.00179    0.00750   -0.00590    0.00807   -0.00182
Global similarity            ENENews       -0.00402    0.05604   -0.01338    0.01581   -0.01129    0.07086    0.02742
                             Russianblog    0.01090    0.01443    0.00582    0.00897    0.00368   -0.00317   -0.00087
Frequent words               ENENews        0.00252    0.00931    0.00170    0.00424   -0.00132    0.00973    0.00292
                             Russianblog    0.00008    0.00023    0.00016    0.00029    0.00034    0.00047   -0.00011
Length ratio                 ENENews        0.04206   -0.02984    0.04450    0.01700    0.06421   -0.05336   -0.00850
                             Russianblog    0.00651   -0.00355    0.00920    0.00080    0.00784   -0.00592    0.00235
Authors' name                ENENews        0.03753    0.06392    0.01920    0.03285    0.02341    0.05998    0.01788
                             Russianblog    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000
Frequent pattern             ENENews       -0.02028    0.02930   -0.02357    0.00300   -0.03793    0.03410    0.00526
                             Russianblog    0.00458    0.06100    0.00830    0.05206   -0.00545    0.06150    0.00404
Location prior               ENENews        0.08778   -0.03690    0.08556    0.03280    0.11792   -0.07852    0.02925
                             Russianblog    0.16091    0.09671    0.17766    0.13515    0.16699    0.08793    0.02414
Candidate filtering rule 1   ENENews       -0.05431    0.06138   -0.04102    0.01296   -0.06458    0.07545    0.06949
                             Russianblog   -0.12506    0.34370   -0.13916    0.27445   -0.12683    0.37003    0.32546
Candidate filtering rule 2   ENENews        0.00014   -0.00004    0.00048    0.00041   -0.00038   -0.00035   -0.00009
                             Russianblog    0.02575    0.02475    0.02462    0.02341    0.01746    0.00802    0.00000
Candidate filtering rule 3   ENENews        0.00357   -0.00121    0.00452    0.00254    0.00551   -0.00161    0.00161
                             Russianblog    0.08521   -0.01978    0.09636    0.01702    0.09380   -0.03825    0.07371
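Each entry in Table 7 is the change in a metric when one feature is ablated, that is, the metric value with all features minus its value without that feature. A hedged sketch of how such deltas could be computed follows; evaluate_model and dataset are placeholders, not the paper's actual code.

```python
def ablation_deltas(all_features, evaluate_model, dataset):
    """Report metric(all features) - metric(all features minus one) per feature.

    evaluate_model(features, dataset) is a placeholder assumed to train the
    ranker with the given feature set and return a dict of metric name -> value
    (e.g. {"P_edge": ..., "R_edge": ...}).
    """
    baseline = evaluate_model(all_features, dataset)
    deltas = {}
    for feature in all_features:
        reduced = [f for f in all_features if f != feature]
        scores = evaluate_model(reduced, dataset)
        deltas[feature] = {m: baseline[m] - scores[m] for m in baseline}
    return deltas
```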

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as WordNet and named entity recognition (as they are not available in the same quality for all languages). In the future, we would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of the Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


8 The Scientific World Journal

between two commenters ldquo1198601rdquo and ldquo1198602rdquo are considered asthis featurersquos score for relations between their comments

423 Prior Location Position of comments can reveal someinformation about their hierarchy for example first com-ments usually have more children than the others or com-ments which are located just before the 119894th comment aremorelikely to be its parent In general we would like to estimate119901(119894 | 119895) that is knowing that a comment is in position 119895whichis the likelihood that the comment in position 119894 is its parent[2] So we calculate prior probabilities for being the parent ofdifferent positions

To calculate prior probability for 119895 we count the numberof times a comment in position 119894 is the parent of 119895 inthe training set Figure 6 shows the prior probability forcomments in positions 1 to 100 in Thestandard dataset Thehighest prior probability belongs to the root and then tothe comments which are located just before the commentThe sample points in Figure 6 show five commentrsquos positionssuch as the roots 10 30 57 and 59 and how it is probablefor 60th comment to be a child of them that the root hasthe most prior probability which is equal to 01852 and thenthe 59th comment which has the probability of 01236 AlsoFigure 7 shows the prior probability of child-parent relationfrom comment 40 to 60

424 Reference to Authorsrsquo Name If the name of an authorappears in a comment then all hisher comments are con-sidered as a potential candidate for being parent of thatcomment [19] Sometimes a part of the complete name isreferenced Sometimes the authorrsquos name is made up of twoparts and both of these parts could be used by other authorsfor reference We also consider these types of references Wehold each part of the authorrsquos name and then parts which arestop-words are removed

425 Global Similarity This feature is based on ranking thesimilarity of commentsrsquo text and the rootrsquos text which has aglobal view on the text similarity According to this feature ifcomment ldquo119860rdquo has the most similarity with comment ldquo119861rdquo andinversely comment ldquo119861rdquo has themost similarity with commentldquo119860rdquo among other candidates it is more likely that there is arelation between comment ldquo119860rdquo and comment ldquo119861rdquo To relaxthis feature the similarity measurement is calculated for eachcomment corresponding with all comments then the com-ments are sorted based on their similarity score For examplein Figure 8 we are looking for parent of the fifth commentIn the first step according to Formula (2) comments aresorted based on score of text similarity measurement per thefifth comment Comments that do not belong to candidateof the fifth comment are removed The removed commentshave been shown with black color in Figure 8 In the secondstep the same as the first step comments are sorted per eachcandidate of fifth comment and also comments which donot belong to candidate of the fifth comment are removedexcept the fifth comment itself Finally formula (3) is used tocalculate Ranking-distance score Two comments which arethe most similar to each other have more Ranking-distance

score In Figure 8 the most score belong to relation with thefourth commentThis feature is symmetric and the similarityamong comments is only calculated one time In other wordsthis feature needs 119874(1198732) time complexity for text similaritypairwise comparisons

Ranking-distance (1198621 1198622) = (|1198621amp1198622| + |1198622amp1198621|)minus1(3)

where 1198621 and 1198622 are two comments and |1198621amp1198622| is the textsimilarity distance between 1198621 and 1198622

426 Frequent Patterns of Authors Appearance The idea isthat the comments of two commenters who talk to eachother usually appear closely in chronological order So iftheir comments appear many times just after each other thisfeature gives a high score to their comments In order toimplement this feature we use the following formula

FPScore (119894 119895) = 119891 (119860119894 119860119895) + 119891 (119860

119895 119860119894) (4)

where 119860119894is the author of comment 119894 119891(119886 119887) is the number

of times of comments of author 119886 that appear just beforecomments of author 119887 (see Pseudocode 1)

Figure 9 shows a time line which includes 7 commentswhich have been written by 4 commenters The feature scoreis calculated for the relation among comments The score ofrelation between comments 119860 and 119861 is 3 which is more than119860 and 119862

427 Length Ratio The length of the parent text is usuallylonger than its children The length ratio score is calculatedaccording to

length ratio score (119862119862 119862119875) =

10038161003816100381610038161198621198751003816100381610038161003816

10038161003816100381610038161198621198621003816100381610038161003816

(5)

where 119862119862is a comment looking for a parent and 119862

119875is a

candidate parent

428 FrequentWords in Parent-Child Comments Sometimesa pair of words appears to be frequently one word in theparent and the next word in its children For example in ENE-News dataset the most frequent pairs are (believe people)(accident fukushima) (new discovered) (idea report) and(people public)We use pointwisemutual information (PMI)[38] to find the frequent pattern

Score (11988211198822) =

Count (11988211198822)

Count (1198821) lowast Count (119882

2) (6)

where 1198821is a word in a comment whose parent we are

looking for1198822is a word in its candidate parent and Count

(11988211198822) is the number of time119882

1has appeared in the parent

and1198822has appeared in child The numerator computes how

often two words appear together and denominator computeshow often one of the words appears Finally according toPseudocode 2 the score of relation between two comments iscalculated

The Scientific World Journal 9

+

C

C

C

A1

+

C

C

C

A2Similarity

C

Figure 5 The authorsrsquo language features based on author ldquo1198601rdquo and author ldquo1198602rdquo

102030405060708090100

1 20 4060 80

100

0

01

02

03

04

05

06

07

08

09

1

Prio

r pro

babi

lity

X RootY 60Z01862

X 10Y 60Z00011

X 30Y 60Z00011

X 57Y 60Z 006

Root

Y child X parent

X 59Y 60Z01236

Figure 6 The prior probability of child-parent relation from comments 1 to 100 inThestandard

102030405060

0

005

01

015

02

025

X rootY 60Z01862

X 59Y 60Z01236

Root

4045

5055

60

Y child

X parent

Prio

r pro

babi

lity

Figure 7 The prior probability of child-parent relation from comments 40 to 60

Comment other comments are sorted based on the text similarity measurement score

5 15 2 4 R 3 21 1

2 233 4 5 R13 1

4 235 1 R 313 2

5

2 4

1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

1(1 + 3) 1(2 + 1)

Figure 8 An example for global similarity features

10 The Scientific World Journal

Step 1For 119894 = 1 to comments-1

if Author(119894) = Author(119894 + 1)119891[Author(119894 + 1) Author(119894)]++

End ifEnd forStep 2For 119894 = 2 to Comments For each comment which looking for its parent

For 119895 = 0 to 119894 minus 1 For each candidateFPScore(119894119895)= 119891[Author(119894) Author(119895)] + 119891[Author(119895) Author(119894)]

End forEnd for

Pseudocode 1 Pseudocode to calculate frequent patterns of authors appearance

Time line

Relation CountFrequentpattern

Sum

Aminus gt B 2 A B 3

Bminus gt C 1 B C 1

Cminus gt A 1 C A 1

Bminus gt A 1 AD 1Aminus gt D 1

B

AFrequent (A B) = 3 Frequent (A C) = 1

C

Step 1 Step 2

A B C A B A D

Figure 9 An example for authors appearance sequence

429 Candidate Filtering In addition to the features we usesome heuristics to filter some inappropriate candidates

(1) Usually a commenter does not reply to the root inhisher second comment So if a commenter haswritten more than one comment the root is removedfrom the parent candidates of the second and nextcomment This was shown to be a useful featurebecause the root is an important candidate so if wecould remove it correctly the results are improvedsignificantly

(2) A commenter does not reply to himherself So wesimply remove all comments of a commenter fromhisher commentrsquos candidates [2]

(3) Commenters who post only one comment on a threadare more likely to reply to the root post So othercandidates can be removed except the root

5 Experiments

In this section we provide details about the datasets used forevaluation describe the evaluation metrics and then presentthe results of the SLARTS method and compare them withthe results of the baseline approaches

51 Dataset Unfortunately there is no standard datasetfor RTS task in declarative threads while severaldatasets are available for interrogative threads [3 4 14]Therefore Thestandard (httpthestandardorgnz) Alef(httpwwwalefir) ENENews (httpenenewscom) Rus-sianblog (wwwkaverydreamwidthorg) and Courant-blogs(wwwcourantblogscom) websites were crawled to provideevaluation data Thestandard in New Zealand and Alef inIran are two online news agencies established in August2007 and 2006 respectively They publish daily news andarticles by journalists in different categories such as economyenvironment politics and society Thestandard is in Englishand Alef is in PersianThestandard has provided the tree-likereply structure of comments since 2010 ENENews is a newswebsite covering the latest energy-related developmentsIt was established shortly after the Fukushima Daiichidisaster in March 2011 It has grown rapidly to serveapproximately 2000000 page views per month Russianblogand Courantblogs are two blogs that are in Russian andEnglish respectively Russsianblog and Courantblogs areestablished since 2003 and 2012 respectively

The reason for selecting the mentioned websites are (1)They support multilevel reply (2) their users are active andarticles usually get many comments (3) author and time ofcomments are available (4) they cover different contexts andlanguages (news agencies cultural sites and blogs in EnglishRussian and Persian)

The Scientific World Journal 11

procedure frequent-words (comment candidate comment)Sum = 0For each (Word1 in comment)

For each (Word2 in candidate-comment)Sum += Score [Word1 Word2]

End ifEnd forreturn Sum

Pseudocode 2 A pseudocode for calculating score of frequent words between two comments

Table 1 The datasets specific properties

Datasets Language Confirmation bymoderator

Usersregister

Thestandard English times

Alef Persian times

ENENews English times

Russianblog Russian times

Courantblogs English times

For each website we crawled the webpages that werepublished until the end of 2012We then parsed the pages andextracted the reply structure and used it for ground truthWehave removed the threadswith less than 4 comments becausethese kinds of threads do not give us much information andusually their comments reply to the root Table 1 summarizesthe information about the prepared datasetsThe datasets areavailable at httpeceutacirnlpresourceshtml

In ENENews and Russianblog users have to register inorder to leave a comment However in other datasets userscan leave comments in a thread with different names ordifferent users can use the same names

Table 2 reports some statistics on the crawled websitesThe length of commentsrsquo text in Russianblog is shorter thanthe other datasets which causes text similarity measures toperform poorly on it In ENENews the root usually includesa tweet that is why the length of the rootrsquos text is shorterthan the other datasets All comments have authorrsquos nameexcept for some comments in Alef Therefore the numbersof commenters for Alef was calculated from the commentsthat have authorrsquos name The average number of commentsper article in Thestandard and ENENews are about 50 and39 respectively which are larger than the other datasets

In order to gain some insights into the data we show somecharts extracted from the datasets

Figure 10 shows the distribution of the number of com-ments in the articles It is seen thatmost threads have between5 and 15 comments in Russianblog and Alef However inThestandard length of threads is longer than the otherdatasets andmost threads have between 12 and 24 comments

Publication rate of articles is shown in Figure 11 Thepublication rate follows a bell-shape and articles are publishedbetween 7 am and 19 pm and the highest rate belongs to the 4-hour period between 9 am and 13 pm Since there is only one

author who publishes the articles in Russianblog its chart hasless variation and the root is usually published between 10 amand 18 pm

Figure 12 shows the publication rate of comments afterthe root It is seen that all datasets except Russianblog havesimilar behavior about one hour after the article is publishedit has received most of the comments After that the rate ofcomment reception decreases Russianblog shows differentbehavior It seems that a part of its visitors reply to its articlesthe next morning 16ndash21 hours after the publication

Figure 13 shows the time difference between publicationtime of comments and their replies It is seen that the maxi-mum difference is usually less than one hour ENENews andRussianblog do not moderate comments and Thestandardhas very active moderators who immediately check andaccept comments However in Courantblogs and Alef wheremoderators are not always online the time difference isbetween one and two hours

Figure 14 shows how depth of comments increases whentime passes after publication of the main article Deepercomments show longer and more serious conversations Asshown in the figure comments usually reply to the root inearly hours After a few hours conversations are continuedaround the comments which causes the depth of thread toincrease Visitors of Alef and Courantblogs talk more aboutthe root article However Thestandard and ENENews havelonger and deeper discussions

Figure 15 shows length of comments in words Most ofcomments include 10ndash20 words Except for comments ofRussianblog the other datasets are similar This tells thatthe similarity measure is not enough and we have to usenontextual features as well Russianblog has comments thatare shorter than the other datasets This dataset is a personalblog and users usually write friendly comments

Figure 16 shows depth of comments Depth is directlyrelated to the conversations It is seen that comments areusually found in the first depth (below the root) RussianblogENENews and Thestandard have more comments in higherlevels meaning conversations are longer

52 Evaluation Metrics To evaluate the experiment resultswe use several metrics For edge prediction we use precisionrecall and 119865-score measures [4 13 17 18] Output of RTS isa tree This causes precision recall and 119865-score values to beequal [2] since FP (False Positive) and FN (False Negative)

12 The Scientific World Journal

Table 2 Datasetsrsquo statistics split up per website

DatasetsAvg length of

commentsrsquo text (inwords)

Avg length of therootrsquos text (in words)

Number ofcomments

Avg comments pernews

Number ofnews

Number ofcommenter

Thestandard 7298 35768 149058 4981 2992 3431Alef 7466 76655 23210 2796 830 4387ENENews 6819 16264 16661 3892 428 831Courantblogs 7924 39817 4952 1919 258 1716Russianblog 1856 23716 30877 1618 1908 1258

0

20

40

60

80

100

120

4 8 16 32 64 128 256

Num

ber o

f new

s

Number of comments

AlefEnenewsRussianblog

CourantblogsThestandard

Figure 10 Distribution of the number of comments in news article

0

005

01

015

02

025

0 2 4 6 8 10 12 14 16 18 20 22

Nor

mal

ized

new

s cou

nt

Hour

AlefEnenewsRussianblog

CourantblogsThestandard

Figure 11 Publication rate of articles per hour

0002004006008

01012014016

0 2 4 6 8 10 12 14 16 18 20 22 24

Nor

mal

ized

com

men

ts co

unt

Elapsed time after published the root (hour)

AlefEnenewsCourantblogs

RussianblogThestandard

Figure 12 Average number of published comments per hour after the root

The Scientific World Journal 13

0005

01015

02025

03035

04045

05

0 1 2 3 4 5 6 7 8 9 10 11 12

Nor

mal

ized

com

men

ts co

unt

Time difference

AlefEnenewsRusssiablog

CourantblogsThestandard

Figure 13 Time difference between a node and corresponding parent per hour

0

05

1

15

2

25

3

35

4

0 2 4 6 8 10 12 14 16 18 20 22 24

Aver

age c

omm

ents

dept

h

Elapsed time after publishing the root (hour)

AlefEnenewsRusssianblog

CourantblogsThestandard

Figure 14 Depth of comments averaged on the elapsed time from publication of root

in precision and recall are always equal Instead we use thefollowing edge accuracy measure

Accuracy (Acc)edge

=

1003816100381610038161003816reply relations cap detected relations1003816100381610038161003816119873

(7)

where the tree is comprised of119873 comments and one rootThe secondmetric is path accuracy which was introduced

byWang et al [3]Thismetric has a global view and considerspaths from nodes to the root

Accuracy (Acc)path =sum|119862|

119894=1

1003816100381610038161003816path119866 (119894) = path119875(119894)1003816100381610038161003816

|119862| (8)

where path119866(119894) is the ground-truth structure for 119894th comment

and path119875(119894) is the predicted structure for it |119862| is the number

of comments in a thread If any irrelevant comment appearsin the path this metric considers it to be completely wrongSo it is very strict To relax this Wang et al introduced

a metric that computes the overlap between the paths inground truth and the predicted path

Precision (119875)119877-path =

sum|119862|

119894=1(1003816100381610038161003816path119866 (119894) cap path

119875(119894)1003816100381610038161003816 1003816100381610038161003816path119875 (119894)

1003816100381610038161003816)

|119862|

Recall (119877)119877-path =

sum|119862|

119894=1(1003816100381610038161003816path119866 (119894) cap path

119875(119894)1003816100381610038161003816 1003816100381610038161003816path119866 (119894)

1003816100381610038161003816)

|119862|

(9)

where |path119875(119894)| is the number of comments in the prediction

path of 119894th comment and |path119866(119894)| is the number of com-

ments in the ground-truth path of 119894th comment 1198651-score isthe harmonic mean of precision and recall

1198651-score = 2 times Precision times recallPrecision + recall

(10)

The above mentioned metrics are appropriate in inter-rogative threads As mentioned before the root of declar-ative threads is news articles or main contents which aredifferent from the root of interrogative threads This causesthe structure of threads and reply relations to be differentThere are two types of reply relations in declarative threads

14 The Scientific World Journal

0

001

002

003

004

005

006

1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91Nor

mal

ized

com

men

ts co

unt

Length of comments (words)

AlefEnenewsRussianblog

CourantblogsThestandard

Figure 15 Histogram of length of comments in words

0

01

02

03

04

05

06

07

08

1 2 3 4 5 6 7 8 9Nor

mal

ized

com

men

ts co

unt

Depth

AlefEnenewsRusssainblog

CourantblogsThestandard

Figure 16 Histogram of the depth of comments

(1) comment-root that is the relation between commentsand the root article and (2) comment-comment that is therelation between two comments one parent and one childThe comment-root relations show different views around themain article (root) The comment-comment relations showconversation among visitors which is a valuable source ofinformation In Figure 17 there are three comment-root andthree comment-comment relations When a user reads theroot and comments heshe can write hisher opinion orsentiment about the root by replying to root or participatingin a discussion by replying to usersrsquo comments

We believe that the evaluation metrics mentioned beforeare not enough to cover both types of reply relations dueto differences between interrogative and declarative threadsAn appropriate metric should be able to detect both typesof relations So we have to modify the evaluation metrics ordefine new metrics

We propose an evaluation metric Accuracy (Acc)CTDwhere CTD stands for Comments Type Detection It isdefined as the proportion of correctly detected commenttypes The comment-root type includes comments whichinitiate a new view about the root (NP) and the comment-

comment type includes comments which are part of adiscussion (PD)

Accuracy (Acc)CTD =TP + TN

TP + TN + FP + FN (11)

where TP is the number of correctly detected NP commentsTN is the number of correctly detected PD comments FP isthe number of incorrectly detected NP comments and FN isthe number of incorrectly detected PD comments

To evaluate accuracy of comment-comment relations thestandard precision recall and 119865-score measures are used

Precision (119875)edge =TP

TP + FP

Recall (119877)edge =TP

TP + FN

1198651-score (119865)edge = 2 timesPrecision times recallPrecision + recall

(12)

where TP is the number of correctly detected comment-comment relations FP is the number of incorrectly detectedcomment-comment relations and FN is the number of

The Scientific World Journal 15

C

C

Comment-root relation

Comment-comment relation

C

C C C Starter discussion commentStarter discussion comment

R

Figure 17 An illustration of a declarative thread

comment-comment relations in the ground truth that werenot predicted at all

The path accuracy metric mentioned earlier is a mod-ified version of 119875path and 119877path which is appropriate todeclarative platform This metric consider discussion pathsfrom each PD comment to the starter discussion commentbut not to root

Precision (119875)path

=

sum|119862|119901

119894=1

1003816100381610038161003816path119866 (119894) = path119875(119894)1003816100381610038161003816

|119862|119901

ow

1 if (|119862|119901 = 0)

Recall (119877)path

=

sum|119862|119877

119894=1

1003816100381610038161003816path119866 (119894) = path119875(119894)1003816100381610038161003816

|119862|119877

ow

1 if (|119862|119877= 0)

(13)

where |119862|119877is the number of PD comments in the ground-

truth thread and |119862|119901is the number of PD comments in the

predicted thread Path119866(119894) is the discussion path from 119894th

node to the discussion starter comment in the ground truthPath119875(119894) is discussion path from 119894th node to the discussion

starter comment in the predicted threadAlso the relaxed Precision (119875)

119877-path and Recall (119877)119877-path

are modified to be suitable for declarative platform

Precision (119875)119877-path

=

sum|119862|119901

119894=1(1003816100381610038161003816path119866 (119894) cap path

119875(119894)1003816100381610038161003816 1003816100381610038161003816path119875 (119894)

1003816100381610038161003816)

|119862|119901

ow

1 if (|119862|119901= 0)

Recall (119877)119877-path

=

sum|119862|119877

119894=1(10038161003816100381610038161003816path119866 (119894)

cap path119875(119894)

1003816100381610038161003816100381610038161003816100381610038161003816path119866(119894)

10038161003816100381610038161003816)

|119862|119877

ow

1 if (|119862|119877= 0)

(14)

Figure 18 shows two threads of which one is the groundtruth and the other one is the predicted thread In order tobetter understand the evaluation metrics we calculate themfor this example

In Table 3 the predicted thread in Figure 18 has beenevaluated by metrics from interrogative threads The resultsshow high values Table 4 shows the results of evaluationby metrics from declarative threads The results show thatdeclarative metrics have more appropriate results

There are two reasons which lead declarative metricsto better evaluate the predicted structure (1) The root indeclarative threads has many children So if a methodconnects all comments to the root interrogative metricsshow good results (2) Interrogative metrics cannot properlyevaluate comment-comment relations in declarative threadsHowever the declarative metrics can evaluate both types ofrelations

53 Experimental Results and Analysis We use several base-lines to compare the effectiveness of SLARTS The first base-line is the performance when we simply link all comments totheir previous comment (we name it Last baseline) [3 4]Thismethod leads to good results for RTS task in interrogativethreads The second baseline is to link all comments to theroot (we name it First baseline)

The only work that has been done on RTS has focused onthe authorrsquos name in online news agencies [19] Schuth et alrsquosmethod used several features to find the commenterrsquos name inonline news articles This method achieves a good precisionbut it has a low recall because many comments do not referto any authorrsquos name So we selected Seo et alrsquos method [2]which is a strong method in interrogative platforms

Seo et alrsquos method has focused on forums and emails andused the quoted text as one of theirmain featuresThis featureis not available in our datasets Therefore we use all theirproposed features except the quoted text

We have applied 5fold cross-validation to minimize biaswith 95 as confidence interval Table 5 shows the results ofour experiments based on interrogative evaluation metricsAccording to Accedge SLARTS reveals higher accuracy exceptin Alef in which lots of replies are connected to the root

According to 119875119877-path and 119877119877-path First and Last baselines

have the best performance respectively First connects allcomments to the root this way irrelevant comments do notappear in the path from the comment to the root and also theroot appears in all paths Last connects all comments to theirprevious comments This causes all comments to appear inthe path

16 The Scientific World Journal

Table3Ex

ampleo

fcalculatio

nof

interrogativem

etric

sfor

thee

xamples

hownin

Figure

18

Evaluatio

nmetric

Calculation

Accuracy

edge

Replyrelatio

ns=1rarr

1198772rarr

1198777rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

14rarr

35rarr

26rarr

58rarr

5

Detectedrelatio

ns==1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

14rarr

15rarr

26rarr

57rarr

28rarr

7

8 11asymp073

Acc p

ath

|(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)|

Num

berof

comments

=8 11asymp073

119875119877-path

(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)+(4rarr

1rarr

1198777rarr

2rarr

1198778rarr

7rarr

2rarr

119877)

Num

berof

comments

=(1+1+1+1+1+1+1+1)+(1+12+23)

11

=1017

11

asymp092

119877119877-path

(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)+(4rarr

3rarr

1rarr

1198777rarr

1198778rarr

5rarr

2rarr

119877)

Num

berof

comments

=(1+1+1+1+1+1+1+1)+(23+1+23)

11

=1034

11

asymp094

1198651-score119877-path

2lowast092lowast094

092+094

asymp093

The Scientific World Journal 17

R

9 10 11

3 5

8

21 7

4 6

(a)

R

9 10 11

3 7

8

21

4 5

6

(b)

Figure 18 An example of two threads (a) ground-truth thread and (b) the prediction thread

Table 4 Example of calculation of declarative metrics for the example shown in Figure 18

Evaluation metric Calculation

AccCTD

TP = 1 2 9 10 11TN = 3 4 5 6 8

FN = 7

FP =

5 + 5

5 + 5 + 1 + 0asymp 091

119875edge

119877edge

1198651edge

TP = 3 rarr 1 5 rarr 2 6 rarr 5 119875edge =3

3 + 3= 05

FN = 4 rarr 3 8 rarr 5 119877edge =3

3 + 2= 06

FP = 4 rarr 1 7 rarr 2 8 rarr 7 1198651 minus scoreedge asymp 055

119875path

119877path

1198651path

119875path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2 |=3

6= 05

119877path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2 |=3

5= 06

1198651-score119877-path asymp 055

119875119877-path

119877119877-path

1198651119877-path

119875119877-path =

3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2

| 3 4 5 6 7 8 |=1 + 1 + 1 + 1 + 0 + 12

6=45

6= 075

119877119877-path =

3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2

| 3 4 5 6 8 |=1 + 12 + 1 + 1 + 12

5=4

5= 08

1198651-score119877-path =

2 lowast 075 lowast 08

075 + 08asymp 077

According to Accpath First has better performance inThestandard ENENews and Alef because more commentsare linked directly to the root Usually SLARTS and Seo etalrsquos methods cannot predict the full path inThestandard andENENews because according to Figure 15 paths are verylong and complex in these datasets

As wementioned earlier interrogative evaluationmetricsare not appropriate for declarative threads because based onthese metrics First shows high performance although thisbaseline does not detect any comment-comment relations

Table 6 shows the results when we use declarative evaluationmetrics proposed in Section 52

According to Table 6 for 119865edge SLARTS performs betterthan Seo et alrsquos method However for119875edge Seo et alrsquos methodperforms better but its 119877edge is lower than SLARTS in all da-tasets

It is important to say that when the difference between119875edge and 119877edge is high and 119875edge is greater than 119877edge themethod most likely connects comments to the root and itdoes not appropriately detect comment-comment relations

18 The Scientific World Journal

J(1)M(1)

V(1)

H(1)

A(1)

R(1)

G(1)

N(1)D(1)

E(1)

C(1)

P(2)

U(1)

S(1)

T(2)

T(1)

M(2)

L(1)

Q(1)

C(2)

K(1)

F(1)L(2)

C(3)

P(1)

A(2)

B(1)O(1)

I(1)

H(2)

R

(a)

J(1)

M(1)

R

A(1)

N(4)

D(1)

G(1)

I(1)

B(1)H(1)

O(1)

C(1)

N(2)

G(2)O(2)

P(1)

E(1)

D(2)

F(1)

E(2)F(2)

D(3)

L(1) F(3) N(1)

D(4)

F(4)K(1)D(5)

D(6)

N(3)

(b)

1)

1)

1)

1)

1)

1)

1)

1)1)

1)

1)

1)

R

F(2)

K(3)

R(1)

K(2)

M(2)A(3)

Q(1)

N(2)

K(4)A(2)

J(2)

1)

H(2)

H(3)

I(2)

I(1)

P(1)

O(1)

(c)

Figure 19 Three real threads in which nodes are labeled according to commenterrsquos name and number of its comments

(like First baseline) On the other hand when 119877edge isgreater than 119875edge the method does not appropriately detectcomment-root relations (like Last baseline)

So Seo et alrsquos method most likely connects comments tothe root and this is not appropriately detecting comment-comment relations On the other hands 119875edge and 119877edge ofSLARTS are close

Several features in SLARTS such as Global similarityfrequent words Frequent patterns and authorsrsquo languagefocus on detection of comment-comment relations Thismakes the results of SLARTS better than Seo et alrsquos methodin declarative evaluation metrics

We already saw that the average length of threads inThestandard and ENENews is longer than the other datasets(Table 2) and their paths are much longer and more complexthan the other datasets (Figure 16) According to 119865edge theaccuracy of Thestandard and ENENews is less than otherdatasets Note that 119865edge is a strict metric in declarativethreads

SLARTS has better 119865edge than Seo et alrsquos method in alldatasetsThemaximumdifference occurs inThestandard andRussianblog In these datasets many of the defined featureshave goodperformance (Importance of features in ENENewsand Russianblog datasets will be shown in Table 7 and we willexplain more in the next section)

The Scientific World Journal 19

Table 5 Experimental results with interrogative evaluation metrics

Datasets Methods Accedge Accpath 119875119877-path 119877

119877-path 1198651119877-path

Thestandard

SLARTS 04669plusmn0009

03059plusmn0013

06927plusmn0018

07450plusmn0010

07179plusmn0013

Seo et alrsquos [2] 03941plusmn0005

03413plusmn0006

08303plusmn0009

06563plusmn0005

07331plusmn0003

First 03630plusmn0010

03630plusmn0010

10000plusmn0000

05846plusmn0009

07379plusmn0007

Last 01824plusmn0006

00495plusmn0004

02214plusmn0008

10000plusmn0000

03625plusmn0010

ENENews

SLARTS 04837plusmn0018

03695plusmn0027

07625plusmn0026

07770plusmn0013

07697plusmn0017

Seo et alrsquos [2] 04661plusmn0017

04134plusmn0021

08720plusmn0011

07086plusmn0013

07818plusmn0011

First 04524plusmn0018

04524plusmn0018

10000plusmn0000

06634plusmn0016

07976plusmn0011

Last 02090plusmn0016

00596plusmn0003

02319plusmn0011

10000plusmn0000

03764plusmn0015

Courantblogs

SLARTS 06843plusmn0028

06453plusmn0033

09127plusmn0013

08619plusmn0014

08865plusmn0013

Seo et alrsquos [2] 06700plusmn0040

06628plusmn0043

09667plusmn0014

08250plusmn0018

08902plusmn0015

First 06490plusmn0036

06490plusmn0036

10000plusmn0000

08001plusmn0019

08889plusmn0012

Last 02037plusmn0017

01082plusmn0008

03183plusmn0011

10000plusmn0000

04829plusmn0012

Russianblog

SLARTS 07463plusmn0017

06904plusmn0017

08741plusmn0008

08787plusmn0006

08764plusmn0007

Seo et alrsquos [2] 05446plusmn0010

05174plusmn0011

09293plusmn0003

07821plusmn0004

08494plusmn0003

First 04968plusmn0006

04968plusmn0006

10000plusmn0000

07134plusmn0004

08327plusmn0003

Last 03632plusmn0018

01549plusmn0012

03695plusmn0015

10000plusmn0000

05395plusmn0016

Alef

SLARTS 07753plusmn0017

07641plusmn0018

09579plusmn0006

09033plusmn0008

09298plusmn0007

Seo et alrsquos [2] 07802plusmn0019

07800plusmn0019

09950plusmn0003

08828plusmn0010

09355plusmn0006

First 07853plusmn0018

07853plusmn0018

10000plusmn0000

08826plusmn0010

09376plusmn0006

Last 00993plusmn0005

00822plusmn0005

02526plusmn0012

10000plusmn0000

04032plusmn0015

The bold style shows the best result in each metric

As shown in Table 6 Russianblog has better resultsthan the other datasets in all metrics The main reason isthat its comments are not confirmed by a moderator Thiscauses Accedge of Last baseline in Russianblog to be equalto 03632 that is more than other datasets (Table 5) Accedgeof Last baseline has inverse relationship with complexity ofreplies Also we showed in Figure 13 that the Russianbloghas the lowest time difference between a comment and itsparent When time difference between a child and its parentdecreases detection of reply relationwould be easier In otherwords as the parent appears to be closer to its child somefeatures such as frequent pattern and location Prior that are

based on position of comments in chronological order workbetter

The difference between 119865119877-path and 119865path is about 20

in Thestandard and ENENews where threads are larger andpaths have more complex structures

The minimum difference between the results of SLARTSand Seo et alrsquos methods appears in Alef datasets In Alef manyrelations are comment root and many comments do not haveauthorrsquos name which make the features perform poorly

Since SLARTS method has features which speciallyfocused on detecting comment-root relations (eg by addingcandidate filtering) AccCTD of SLARTS is better than Seo

20 The Scientific World Journal

et al method in all datasets The best result is 091 forRussianblog and the worst result is 069 for ENENewsThe root of ENENews dataset is usually a tweet Accordingto Table 2 it makes the average length of the rootrsquos textto be shorter than the other datasets and this makes thetextual features perform poorly on detecting comment-rootrelations

Confidence intervals of Alef andCourantblog datasets arehigher than the other datasets becausemany of their relationsare comment root (Figure 16) This makes ranking SVM tobe bias towards connecting comments to the root especiallywhen a thread includes very few comment-comment rela-tions

We compare 119875-value to specify the significance of differ-ences between SLARTS and Seo et alrsquosmethods on declarativemetrics Since ranking SVM ranks candidates based on theirscore and selects the first candidate from the list only thefirst candidate is important So p1 is computed The resultsindicate that all improvements are statistically significant (119875-value lt 0005) in all datasets

54 Evaluation of the Features In this section we evaluatethe role of the introduced features We use backward featureselection That means to measure the importance of a featurewe use all features except that feature repeat the experimentsin its absence and compare the results to the case whereall features are present The difference between the values ofmetrics in presence of a feature and its absence is reported inTable 7

It is seen that some features improve precision of themetrics for example location prior and candidate filteringrule 3 where they most likely tend to detect comment-rootrelations Some features improve recall of the metrics such asauthorsrsquo language global similarity candidate filtering rule1 and frequent patterns These features most likely tendto detect comment-comment relations Some features affectboth precision and recall for example authorsrsquo name

As stated earlier Russianblog has different behaviourin comparison with other datasets (Figures 12 13 and 15)Also ENENews has larger threads and more complex paths(Figure 16 and Table 2) So we use these datasets to evaluatefeatures in depth

Similarity feature improves recall of evaluation metricsSince Russian language is known as a morphologically richlanguage and the length of commentsrsquo text is very short(about average 18 words according to Table 2) in comparisonwith other datasets improvement of textual features is lowTo increase the performance of textual feature in Russianwe need a stemmer a Wordnet and a tokenizer Similarityfeature has rather a good impact on AccCTD

Authorsrsquo language feature improves recall of evaluationmetrics According to 119877edge the improvements of this featureare 003 and 001 in ENENews and Russianblog respectively

Global similarity feature improves both precision andrecall in Russianblog and recall in ENENews

The frequent words feature has a small improve-ment in Russianblog like similarity feature This feature

improves recall of the evaluation metrics in ENEnews about001

Length ratio feature improves precision in both datasetsHowever since longer comments in ENENews have manychildren this feature is more prominent there

Authorsrsquo name feature is useful in all evaluation metricsThe value of this feature in ENENews is more than Russian-blog because authorsrsquo name in Russianblog is different fromother datasets it is an email and no one would refer to it as aname

The frequent patterns feature focuses on detectingcomment-comment relations This feature improves recall ofevaluation metrics in both datasets

The location prior feature improves precision in bothdatasets This feature has a good improvement on AccCTDAccording to 119875edge the best improvement is 016091 forRussianblog since comments are not moderated

The candidate filtering rule 1 improves recall of evaluationmetrics in both datasets This feature removes the rootcandidate accurately This feature has a good improvementon AccCTD The maximum improvement for 119877edge is 034 forRussianblog

The candidate filtering rule 2 has small improvements onprecision and recall of evaluation metrics in both datasetsMaximum improvement is gained with Russianblog

Finally The candidate filtering rule 3 improves precisionof metrics in both datasets This feature removes all can-didates except the root Thus detection of comment-rootrelations is improved Also this feature improves AccCTD

6 Information Extraction byVisualizing Threads Structure

In this section we discuss the information which can beextracted from hierarchical structure of threads Figure 19 isthe same as Figure 3 except that the commenters and theirnumber of posted comments are shown as the nodesrsquo labelfor example ldquo119863(3)rdquo is the third comment sent by user ldquo119863rdquoVisualization of the structure reveals valuable information asfollowing

(i) Is the root article controversial The threads ldquo119861rdquo andldquo119862rdquo in Figure 19 aremore controversial than ldquo119860rdquo Visi-tors have expressed different opinions and sentimentsabout the root in ldquo119861rdquo and ldquo119862rdquo leading to formationof longer conversations The thread ldquo119860rdquo shows acommon thread which has short conversations aboutthe root The height and width of trees can help as ameasure to recognize whether a root is controversialor not

(ii) Which comments are starters [36] Starter commentshave important role in the conversations becauseusers read them and then read other comments tobetter understand the conversations

(iii) Which comments have participated in this conversa-tion Users can follow the conversations that seeminteresting to them without reading any unrelatedcomment

The Scientific World Journal 21

Table 6 Experimental results with declarative evaluation metrics

Datasets Method 119875edge 119877edge 119865edge 119875path 119877path 119865path 119875119877-path 119877

119877-path 119865119877-path AccCTD

ThestandardSLARTS 03782

plusmn000804155plusmn0011

03960plusmn0006

01565plusmn0007

01713plusmn0008

01636plusmn0007

03628plusmn0012

04166plusmn0016

03876plusmn0005

07525plusmn0005

Seo et alrsquos[2]

03014plusmn0031

01414plusmn0015

01921plusmn0016

01544plusmn0025

00576plusmn0011

00835plusmn0013

03013plusmn0038

01319plusmn0018

01831plusmn0021

05831plusmn0007

Difference 00778 02741 02039 00021 01137 00801 00615 02847 02045 01694

ENENewsSLARTS 03797

plusmn003803891plusmn0019

03839plusmn0024

01873plusmn0040

01855plusmn0026

01857plusmn0030

03768plusmn0048

03768plusmn0017

03759plusmn0025

06872plusmn0014

Seo et alrsquos[2]

04060plusmn0030

01713plusmn0016

02407plusmn0018

02378plusmn0034

00837plusmn0016

01235plusmn0021

03947plusmn0035

01523plusmn0015

02196plusmn0019

05954plusmn0009

Difference minus00263 02178 01432 minus00505 01018 00622 minus00179 02245 01563 00918

CourantblogsSLARTS 05236

plusmn003003963plusmn0073

04495plusmn0052

04220plusmn0057

03110plusmn0058

03566plusmn0049

05325plusmn0049

03853plusmn0075

04452plusmn0057

07625plusmn0033

Seo et alrsquos[2]

07320plusmn0090

02049plusmn0062

03189plusmn0082

06815plusmn0105

01896plusmn0070

02951plusmn0093

07309plusmn0094

02010plusmn0065

03140plusmn0086

07001plusmn0031

Difference minus02084 01914 01306 minus02595 01214 00615 minus01984 01843 01312 00624

RussianblogSLARTS 05940

plusmn002805674plusmn0022

05803plusmn0023

04907plusmn0031

04664plusmn0020

04782plusmn0024

05878plusmn0028

05823plusmn0021

05849plusmn0020

09116plusmn0011

Seo et alrsquos[2]

04705plusmn0020

03030plusmn0015

03685plusmn0016

03823plusmn0023

02534plusmn0013

03046plusmn0015

04682plusmn0020

02822plusmn0014

03521plusmn0015

05862plusmn0013

Difference 01235 02644 02118 01084 0213 01736 01196 03001 02328 03254

AlefSLARTS 06189

plusmn002804131plusmn0040

04952plusmn0037

05639plusmn0027

03819plusmn0037

04551plusmn0034

06277plusmn0027

04069plusmn0043

04934plusmn0039

08044plusmn0013

Seo et alrsquos[2]

08827plusmn0052

02637plusmn0051

04045plusmn0060

08797plusmn0053

02631plusmn0051

04034plusmn0060

08840plusmn0052

02635plusmn0051

04043plusmn0060

07846plusmn0018

Difference minus02638 01494 00907 minus03158 01188 00517 minus02563 01434 00891 00198The bold style shows the best result in each metric

(iv) Who plays an important role in a discussion Someusers playmore important roles in conversations thanother users For example users ldquo119863(1)rdquo in threadldquoBrdquo and ldquo119860(1)rdquo in thread ldquoCrdquo have made a longconversation These users are known as a hub or astarter in a thread and their degree of importance canbe compared according to their indegree [39] Manyanalyses can be performed on this structure which isknown as popularity features [12]

(v) How many conversations are about the root Thereare thirteen conversations which are replyingdirectly to the root in thread ldquoCrdquo for exampleF(2)H(2)H(3) I(2)O(1)orF(2)H(2) l(1) showtwo conversations in thread ldquo119862rdquo

7 Conclusion and Future Work

In this paper we proposed SLARTS a method based onranking SVMwhich predicts the tree-like structure of declar-ative threads for example blogs and online news agenciesWe emphasized on the differences between declarative andinterrogative threads and showed that many of the previouslyproposed features perform poorly on declarative threadsbecause of this Instead we defined a set of novel textual andnontextual features and used a ranking SVM algorithm tocombine these features

We detect two types of reply relations in declarativethreads comment-root relation and comment-commentrelation An appropriate method should be able to detectboth types of reply relations and an appropriatemetric shouldconsider both types of reply relations So in order to havefair judge on the quality of the predicted structures wemodified the evaluation metrics accordingly We also defineda novel metric that measures the accuracy of comment typedetection

The results of our experiments showed that accord-ing to declarative evaluation metrics our method showshigher accuracy in comparison with the baselines on fivedatasets Also we showed that all improvements in detectingcomment-comment relations are statistically significant in alldatasets

We believe that the defined features are not limited todeclarative threads Some features such as authorrsquos languageand frequent patterns extract relations among users can beuseful in interrogative threads

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in presence and absence of features.

Columns: Feature; Dataset; P_edge, R_edge, P_path, R_path, P_R-path, R_R-path, Acc_CTD.

Similarity: ENENews −0.00233, 0.07415, −0.00538, 0.03289, −0.01095, 0.08996, 0.04684; Russianblog 0.00874, 0.00012, 0.00963, 0.00347, 0.00695, −0.01079, 0.00355

Authors' language: ENENews −0.01354, 0.03246, −0.01785, 0.00693, −0.02757, 0.04052, 0.00458; Russianblog −0.00229, 0.01052, −0.00179, 0.00750, −0.00590, 0.00807, −0.00182

Global similarity: ENENews −0.00402, 0.05604, −0.01338, 0.01581, −0.01129, 0.07086, 0.02742; Russianblog 0.01090, 0.01443, 0.00582, 0.00897, 0.00368, −0.00317, −0.00087

Frequent words: ENENews 0.00252, 0.00931, 0.00170, 0.00424, −0.00132, 0.00973, 0.00292; Russianblog 0.00008, 0.00023, 0.00016, 0.00029, 0.00034, 0.00047, −0.00011

Length ratio: ENENews 0.04206, −0.02984, 0.04450, 0.01700, 0.06421, −0.05336, −0.00850; Russianblog 0.00651, −0.00355, 0.00920, 0.00080, 0.00784, −0.00592, 0.00235

Authors' name: ENENews 0.03753, 0.06392, 0.01920, 0.03285, 0.02341, 0.05998, 0.01788; Russianblog 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000

Frequent pattern: ENENews −0.02028, 0.02930, −0.02357, 0.00300, −0.03793, 0.03410, 0.00526; Russianblog 0.00458, 0.06100, 0.00830, 0.05206, −0.00545, 0.06150, 0.00404

Location prior: ENENews 0.08778, −0.03690, 0.08556, 0.03280, 0.11792, −0.07852, 0.02925; Russianblog 0.16091, 0.09671, 0.17766, 0.13515, 0.16699, 0.08793, 0.02414

Candidate filtering rule 1: ENENews −0.05431, 0.06138, −0.04102, 0.01296, −0.06458, 0.07545, 0.06949; Russianblog −0.12506, 0.34370, −0.13916, 0.27445, −0.12683, 0.37003, 0.32546

Candidate filtering rule 2: ENENews 0.00014, −0.00004, 0.00048, 0.00041, −0.00038, −0.00035, −0.00009; Russianblog 0.02575, 0.02475, 0.02462, 0.02341, 0.01746, 0.00802, 0.00000

Candidate filtering rule 3: ENENews 0.00357, −0.00121, 0.00452, 0.00254, 0.00551, −0.00161, 0.00161; Russianblog 0.08521, −0.01978, 0.09636, 0.01702, 0.09380, −0.03825, 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as WordNet and named entity recognition (as they are not available with the same quality for all languages). In the future, we would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Computation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of the Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.




[22] A Balali A Rajabi S Ghassemi M Asadpour and H FailildquoContent diffusion prediction in social networksrdquo in Pro-ceedings of the 5th Conference on Information and KnowledgeTechnology (IKT rsquo13) pp 467ndash471 2013

[23] G Mishne and N Glance ldquoLeave a reply an analysis of weblogcommentsrdquo in Proceedings of the 3rd Annual Workshop onthe Weblogging ecosystem Aggregation Analysis and DynamicsEdinburgh UK 2006

[24] V Gomez A Kaltenbrunner and V Lopez ldquoStatistical analysisof the social network and discussion threads in Slashdotrdquo inProceedings of the 17th International Conference on World WideWeb (WWW rsquo08) pp 645ndash654 April 2008

[25] T Weninger X A Zhu and J Han ldquoAn exploration ofdiscussion threads in social news sites a case study of the redditcommunityrdquo 2013

[26] D Laniado R Tasso Y Volkovich and A KaltenbrunnerldquoWhen the Wikipedians talk network and tree structure of

Wikipedia discussion pagesrdquo in Proceedings of the 5th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo11) 2011

[27] A Grabowski ldquoHuman behavior in online social systemsrdquoTheEuropean Physical Journal B vol 69 no 4 pp 605ndash611 2009

[28] S N Kim L Wang and T Baldwin ldquoTagging and linkingweb forum postsrdquo in Proceedings of the 40th Conference onComputational Natural Language Learning Uppsala Sweden2010

[29] J Andreas S Rosenthal and K McKeown ldquoAnnotating agree-ment and disagreement in threaded discussionrdquo in Proceedingsof the 8th International Conference on Language Resources andComputation (LREC rsquo12) pp 818ndash822 Istanbul Turkey May2012

[30] A Stolcke K Ries N Coccaro et al ldquoDialogue actmodeling forautomatic tagging and recognition of conversational speechrdquoComputational Linguistics vol 26 no 3 pp 339ndash373 2000

[31] E N Forsyth Improving Automated Lexical and DiscourseAnalysis of Online Chat Dialog Naval Postgraduate SchoolMonterey Calif USA 2007

[32] D Feng E Shaw J Kim and E Hovy ldquoAn intelligentdiscussion-bot for answering student queries in threaded dis-cussionsrdquo in Proceedings of the 11th international Conference onIntelligent User Interfaces pp 171ndash177 February 2006

[33] F Hu T Ruan Z Shao and J Ding ldquoAutomatic web informa-tion extraction based on rulesrdquo in Proceedings of the Web Infor-mation System Engineering (WISE rsquo11) pp 265ndash272 Springer2011

[34] F Hu T Ruan and Z Shao ldquoComplete-thread extraction fromweb forumsrdquo inWebTechnologies andApplications pp 727ndash734Springer 2012

[35] D Feng J Kim E Shaw and E Hovy ldquoTowards modelingthreaded discussions using induced ontology knowledgerdquo inProceedings of the National Conference on Artificial Intelligencepp 1289ndash1294 July 2006

[36] G D Venolia and C Neustaedter ldquoUnderstanding sequenceand reply relationships within email conversations a mixed-model visualizationrdquo in Proceedings of the SIGCHI Conferenceon Human Factors in Computing Systems pp 361ndash368 April2003

[37] T Joachims ldquoTraining linear SVMs in linear timerdquo in Pro-ceedings of the 12th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 217ndash226 Philadel-phia Pa USA August 2006

[38] RM Fano ldquoTransmission of information a statistical theory ofcommunicationsrdquo American Journal of Physics vol 29 pp 793ndash794 1961

[39] T Opsahl F Agneessens and J Skvoretz ldquoNode centrality inweighted networks generalizing degree and shortest pathsrdquoSocial Networks vol 32 no 3 pp 245ndash251 2010

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

10 The Scientific World Journal

Step 1:
For i = 1 to Comments - 1
    if Author(i) ≠ Author(i + 1)
        f[Author(i + 1), Author(i)]++
    End if
End for

Step 2:
For i = 2 to Comments                      // for each comment that is looking for its parent
    For j = 0 to i - 1                     // for each candidate
        FPScore(i, j) = f[Author(i), Author(j)] + f[Author(j), Author(i)]
    End for
End for

Pseudocode 1: Pseudocode to calculate frequent patterns of authors' appearance.

Figure 9: An example of an authors' appearance sequence (time line: A B C A B A D), the counts of consecutive-author relations (A→B: 2, B→C: 1, C→A: 1, B→A: 1, A→D: 1), and the resulting frequent-pattern sums (e.g., Frequent(A, B) = 3, Frequent(A, C) = 1).
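For concreteness, the following Python sketch implements the computation in Pseudocode 1 (the function and variable names are our own illustrative choices, not from the original implementation); on the appearance sequence of Figure 9 it reproduces Frequent(A, B) = 3.

```python
from collections import defaultdict

def frequent_pattern_scores(authors):
    """authors[i] is the author of the i-th comment in chronological order."""
    # Step 1: count how often one author is immediately followed by another.
    follows = defaultdict(int)
    for i in range(len(authors) - 1):
        if authors[i] != authors[i + 1]:
            follows[(authors[i + 1], authors[i])] += 1

    # Step 2: for every comment i and every earlier candidate j, the score is
    # the total frequency of the (unordered) pair of their authors.
    scores = {}
    for i in range(1, len(authors)):
        for j in range(i):
            scores[(i, j)] = (follows[(authors[i], authors[j])]
                              + follows[(authors[j], authors[i])])
    return scores

# Appearance sequence of Figure 9: A B C A B A D
seq = ["A", "B", "C", "A", "B", "A", "D"]
scores = frequent_pattern_scores(seq)
# Frequent(A, B) = f[A -> B] + f[B -> A] = 2 + 1 = 3
print(scores[(4, 3)])  # comment 5 (author B) vs. candidate comment 4 (author A) -> 3
```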

4.2.9. Candidate Filtering. In addition to the features, we use some heuristics to filter out inappropriate candidates (a small sketch of these rules follows the list):

(1) Usually a commenter does not reply to the root in his/her second comment. So, if a commenter has written more than one comment, the root is removed from the parent candidates of the second and subsequent comments. This was shown to be a useful rule because the root is an important candidate, so removing it correctly improves the results significantly.

(2) A commenter does not reply to him/herself. So we simply remove all comments of a commenter from his/her own comments' candidates [2].

(3) Commenters who post only one comment on a thread are more likely to reply to the root post. So the other candidates can be removed, except the root.
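The sketch below illustrates one way these three rules could be applied in Python; the data layout and the function name are illustrative assumptions, not the paper's implementation.

```python
def filter_candidates(idx, candidates, authors, root=0):
    """Prune implausible parent candidates for comment `idx` (illustrative sketch).

    `authors[i]` is the author of post i in chronological order (index 0 is the
    root article) and `candidates` holds indices of posts published before `idx`.
    """
    author = authors[idx]
    # All comments written by this commenter anywhere in the thread.
    own_comments = [i for i, a in enumerate(authors) if i > 0 and a == author]

    filtered = list(candidates)

    # Rule 2: a commenter does not reply to him/herself.
    filtered = [c for c in filtered if c == root or authors[c] != author]

    # Rule 1: from the commenter's second comment onwards, drop the root.
    if idx != own_comments[0]:
        filtered = [c for c in filtered if c != root]

    # Rule 3: a commenter with a single comment in the whole thread most
    # likely replies to the root, so keep only the root as a candidate.
    if len(own_comments) == 1:
        filtered = [root]

    return filtered
```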

5. Experiments

In this section, we provide details about the datasets used for evaluation, describe the evaluation metrics, and then present the results of the SLARTS method and compare them with the results of the baseline approaches.

5.1. Dataset. Unfortunately, there is no standard dataset for the RTS task in declarative threads, while several datasets are available for interrogative threads [3, 4, 14]. Therefore, the Thestandard (http://thestandard.org.nz), Alef (http://www.alef.ir), ENENews (http://enenews.com), Russianblog (www.kavery.dreamwidth.org), and Courantblogs (www.courantblogs.com) websites were crawled to provide evaluation data. Thestandard in New Zealand and Alef in Iran are two online news agencies established in August 2007 and in 2006, respectively. They publish daily news and articles by journalists in different categories such as economy, environment, politics, and society. Thestandard is in English and Alef is in Persian. Thestandard has provided the tree-like reply structure of comments since 2010. ENENews is a news website covering the latest energy-related developments. It was established shortly after the Fukushima Daiichi disaster in March 2011 and has grown rapidly to serve approximately 2,000,000 page views per month. Russianblog and Courantblogs are two blogs, in Russian and English, respectively; they have been running since 2003 and 2012, respectively.

The reasons for selecting these websites are as follows: (1) they support multilevel replies; (2) their users are active and articles usually receive many comments; (3) the author and publication time of comments are available; (4) they cover different contexts and languages (news agencies, cultural sites, and blogs in English, Russian, and Persian).


procedure frequent-words(comment, candidate-comment)
    Sum = 0
    For each Word1 in comment
        For each Word2 in candidate-comment
            Sum += Score[Word1, Word2]
        End for
    End for
    return Sum

Pseudocode 2: A pseudocode for calculating the frequent-words score between two comments.

Table 1: The datasets' specific properties.

Dataset        Language   Confirmation by moderator   Users register
Thestandard    English    yes                         no
Alef           Persian    yes                         no
ENENews        English    no                          yes
Russianblog    Russian    no                          yes
Courantblogs   English    yes                         no

For each website, we crawled the webpages that were published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as the ground truth. We removed the threads with fewer than 4 comments, because such threads do not give us much information and usually their comments reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets, users can leave comments in a thread with different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews, the root usually includes a tweet, which is why the root's text is shorter than in the other datasets. All comments have an author's name except for some comments in Alef; therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average numbers of comments per article in Thestandard and ENENews are about 50 and 39, respectively, which are larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments in the articles. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard, threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape: articles are published between 7:00 and 19:00, and the highest rate belongs to the 4-hour period between 9:00 and 13:00. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10:00 and 18:00.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog behave similarly: about one hour after the article is published, it has received most of its comments; after that, the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16–21 hours after publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after the publication of the main article. Deeper comments indicate longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations continue around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments contain 10–20 words. Except for the comments of Russianblog, the datasets are similar. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has comments that are shorter than in the other datasets; this dataset is a personal blog, and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (directly below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning that their conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results, we use several metrics. For edge prediction, we use the precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree; this causes the precision, recall, and F-score values to be equal [2], since the numbers of false positives (FP) and false negatives (FN) are always equal.


Table 2: Datasets' statistics split up per website.

Dataset        Avg. length of comments' text (words)   Avg. length of the root's text (words)   Number of comments   Avg. comments per news   Number of news   Number of commenters
Thestandard    72.98                                   357.68                                   149058               49.81                    2992             3431
Alef           74.66                                   766.55                                   23210                27.96                    830              4387
ENENews        68.19                                   162.64                                   16661                38.92                    428              831
Courantblogs   79.24                                   398.17                                   4952                 19.19                    258              1716
Russianblog    18.56                                   237.16                                   30877                16.18                    1908             1258

Figure 10: Distribution of the number of comments in news articles (number of news versus number of comments; one curve per dataset: Alef, ENENews, Russianblog, Courantblogs, and Thestandard).

Figure 11: Publication rate of articles per hour (normalized news count versus hour of the day; one curve per dataset).

Figure 12: Average number of published comments per hour after the root (normalized comment count versus elapsed time; one curve per dataset).


Figure 13: Time difference between a node and its corresponding parent, per hour (normalized comment count versus time difference; one curve per dataset).

Figure 14: Depth of comments averaged over the elapsed time from the publication of the root (average comment depth versus elapsed hours; one curve per dataset).

Instead, we use the following edge accuracy measure:

$$\mathrm{Acc}_{\mathrm{edge}} = \frac{\left|\ \text{reply relations} \cap \text{detected relations}\ \right|}{N} \qquad (7)$$

where the tree comprises N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers the paths from the nodes to the root:

$$\mathrm{Acc}_{\mathrm{path}} = \frac{\sum_{i=1}^{|C|} \left|\ \mathrm{path}_{G}(i) = \mathrm{path}_{P}(i)\ \right|}{|C|} \qquad (8)$$

where path_G(i) is the ground-truth path for the i-th comment, path_P(i) is the predicted path for it, and |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers the whole path to be wrong, so it is very strict. To relax this, Wang et al. introduced

a metric that computes the overlap between the path in the ground truth and the predicted path:

$$P_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left( \left|\mathrm{path}_{G}(i) \cap \mathrm{path}_{P}(i)\right| / \left|\mathrm{path}_{P}(i)\right| \right)}{|C|},$$

$$R_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left( \left|\mathrm{path}_{G}(i) \cap \mathrm{path}_{P}(i)\right| / \left|\mathrm{path}_{G}(i)\right| \right)}{|C|} \qquad (9)$$

where |path_P(i)| is the number of comments in the predicted path of the i-th comment and |path_G(i)| is the number of comments in its ground-truth path. The F1-score is the harmonic mean of precision and recall:

$$F_{1}\text{-score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (10)$$
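As a concrete reading of (8)–(10), the following Python sketch (the function names are ours) computes Acc_path and the relaxed P_R-path / R_R-path from two parent maps: each comment id is mapped to its parent id, the root is denoted by 0, and a path is the chain of ancestors of a comment up to and including the root, excluding the comment itself.

```python
def path_to_root(node, parent):
    """Ancestors of `node` up to and including the root (id 0), excluding `node`."""
    path = []
    while node != 0:
        node = parent[node]
        path.append(node)
    return path

def interrogative_path_metrics(gold, pred):
    """`gold` and `pred` map every comment id to its parent id (0 = root)."""
    comments = list(gold)
    n = len(comments)
    acc_path = sum(path_to_root(c, gold) == path_to_root(c, pred)
                   for c in comments) / n
    p = sum(len(set(path_to_root(c, gold)) & set(path_to_root(c, pred)))
            / len(path_to_root(c, pred)) for c in comments) / n
    r = sum(len(set(path_to_root(c, gold)) & set(path_to_root(c, pred)))
            / len(path_to_root(c, gold)) for c in comments) / n
    f1 = 2 * p * r / (p + r)
    return acc_path, p, r, f1

gold = {1: 0, 2: 1, 3: 1}   # comments 2 and 3 reply to comment 1
pred = {1: 0, 2: 1, 3: 0}   # comment 3 is mistakenly attached to the root
print(interrogative_path_metrics(gold, pred))  # ~ (0.667, 1.0, 0.833, 0.909)
```

Note how attaching comment 3 directly to the root still yields a perfect relaxed precision; this is exactly the kind of behavior that motivates the declarative metrics introduced below.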

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of a declarative thread is a news article or other main content, which is different from the root of an interrogative thread. This causes the structure of the threads and the reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words (normalized comment count versus comment length; one curve per dataset).

Figure 16: Histogram of the depth of comments (normalized comment count versus depth; one curve per dataset).

(1) comment-root, that is, the relation between a comment and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views around the main article (root). The comment-comment relations show conversations among visitors, which are a valuable source of information. In Figure 17, there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to other users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to evaluate both types of relations. So we have to modify the evaluation metrics or define new metrics.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comments Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

$$\mathrm{Acc}_{\mathrm{CTD}} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (11)$$

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

$$P_{\mathrm{edge}} = \frac{TP}{TP + FP}, \qquad R_{\mathrm{edge}} = \frac{TP}{TP + FN}, \qquad F_{1}\text{-score}_{\mathrm{edge}} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (12)$$

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of comment-comment relations in the ground truth that were not predicted at all.

Figure 17: An illustration of a declarative thread, showing the root R, comment-root relations, comment-comment relations, and the discussion-starter comments.

The path accuracy metrics mentioned earlier are modified into P_path and R_path versions that are appropriate for the declarative platform. These metrics consider the discussion path from each PD comment to its discussion-starter comment, but not to the root:

$$P_{\mathrm{path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_{P}} \left|\ \mathrm{path}_{G}(i) = \mathrm{path}_{P}(i)\ \right|}{|C|_{P}}, & \text{otherwise} \\ 1, & \text{if } |C|_{P} = 0 \end{cases}$$

$$R_{\mathrm{path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_{R}} \left|\ \mathrm{path}_{G}(i) = \mathrm{path}_{P}(i)\ \right|}{|C|_{R}}, & \text{otherwise} \\ 1, & \text{if } |C|_{R} = 0 \end{cases} \qquad (13)$$

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread; path_G(i) is the discussion path from the i-th node to its discussion-starter comment in the ground truth, and path_P(i) is the discussion path from the i-th node to its discussion-starter comment in the predicted thread. The relaxed P_R-path and R_R-path are likewise modified to suit the declarative platform:

$$P_{R\text{-path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_{P}} \left( \left|\mathrm{path}_{G}(i) \cap \mathrm{path}_{P}(i)\right| / \left|\mathrm{path}_{P}(i)\right| \right)}{|C|_{P}}, & \text{otherwise} \\ 1, & \text{if } |C|_{P} = 0 \end{cases}$$

$$R_{R\text{-path}} = \begin{cases} \dfrac{\sum_{i=1}^{|C|_{R}} \left( \left|\mathrm{path}_{G}(i) \cap \mathrm{path}_{P}(i)\right| / \left|\mathrm{path}_{G}(i)\right| \right)}{|C|_{R}}, & \text{otherwise} \\ 1, & \text{if } |C|_{R} = 0 \end{cases} \qquad (14)$$

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3, the predicted thread of Figure 18 has been evaluated by the metrics for interrogative threads; the results show high values. Table 4 shows the results of the evaluation by the metrics for declarative threads. The results show that the declarative metrics evaluate the prediction more appropriately.

There are two reasons why the declarative metrics better evaluate the predicted structure: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance when we simply link each comment to its previous comment (we name it the Last baseline) [3, 4]; this method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).

The only work that has been done on RTS in declarative threads has focused on the author's name in online news agencies [19]. Schuth et al.'s method used several features to find the commenter's name in online news articles. This method achieves good precision, but it has low recall, because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method on interrogative platforms.

Seo et al.'s method focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all their proposed features except the quoted text.

We applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy, except in Alef, in which many replies are connected to the root.
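The ± values reported in Tables 5 and 6 are 95% confidence intervals over the five folds. One standard way to obtain such an interval (a sketch under our own assumptions, not necessarily the authors' exact procedure) is the t-based interval over the fold scores:

```python
import numpy as np
from scipy import stats

def mean_with_ci(fold_scores, confidence=0.95):
    """Mean and half-width of the t-based confidence interval over k folds."""
    scores = np.asarray(fold_scores, dtype=float)
    sem = stats.sem(scores)  # standard error of the mean
    half_width = sem * stats.t.ppf((1 + confidence) / 2, len(scores) - 1)
    return scores.mean(), half_width

# Hypothetical Acc_edge values from 5 folds:
print(mean_with_ci([0.47, 0.46, 0.49, 0.48, 0.44]))
```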

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.


Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Acc_edge: reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}; detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}; Acc_edge = 8/11 ≈ 0.73.

Acc_path: |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / number of comments = 8/11 ≈ 0.73.

P_R-path: [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / number of comments = [(1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11 ≈ 0.92.

R_R-path: [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / number of comments = [(1 + 1 + 1 + 1 + 1 + 1 + 1 + 1) + (2/3 + 1 + 2/3)] / 11 = 10.34/11 ≈ 0.94.

F1-score_R-path: (2 × 0.92 × 0.94)/(0.92 + 0.94) ≈ 0.93.

Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Acc_CTD: TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}; Acc_CTD = (5 + 5)/(5 + 5 + 1 + 0) ≈ 0.91.

P_edge, R_edge, F1_edge: TP = {3→1, 5→2, 6→5}; FN = {4→3, 8→5}; FP = {4→1, 7→2, 8→7}; P_edge = 3/(3 + 3) = 0.5; R_edge = 3/(3 + 2) = 0.6; F1-score_edge ≈ 0.55.

P_path, R_path, F1_path: P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5; R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6; F1-score_path ≈ 0.55.

P_R-path, R_R-path, F1_R-path: P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2)/|{3, 4, 5, 6, 7, 8}| = 4.5/6 = 0.75; R_R-path = (1 + 1/2 + 1 + 1 + 1/2)/|{3, 4, 5, 6, 8}| = 4/5 = 0.8; F1-score_R-path = (2 × 0.75 × 0.8)/(0.75 + 0.8) ≈ 0.77.
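To make these definitions concrete, the short Python sketch below (the helper names are ours) computes the declarative metrics for the Figure 18 example and reproduces the values of Table 4.

```python
def starter_path(node, parent, root=0):
    """Ancestors of `node` up to its discussion-starter comment
    (excludes `node` itself and the root)."""
    path = []
    while parent[node] != root:
        node = parent[node]
        path.append(node)
    return path

def declarative_metrics(gold, pred, root=0):
    """`gold` and `pred` map every comment id to its parent id (root = 0)."""
    comments = list(gold)

    # Comment-type detection: NP = reply to root, PD = part of a discussion.
    acc_ctd = sum((gold[c] == root) == (pred[c] == root) for c in comments) / len(comments)

    # Edge precision/recall over comment-comment relations only.
    gold_edges = {(c, gold[c]) for c in comments if gold[c] != root}
    pred_edges = {(c, pred[c]) for c in comments if pred[c] != root}
    p_edge = len(gold_edges & pred_edges) / len(pred_edges) if pred_edges else 1.0
    r_edge = len(gold_edges & pred_edges) / len(gold_edges) if gold_edges else 1.0

    # Strict and relaxed path metrics over PD comments.
    pd_pred = [c for c in comments if pred[c] != root]
    pd_gold = [c for c in comments if gold[c] != root]

    def exact(c):
        return starter_path(c, gold) == starter_path(c, pred)

    def overlap(c, denom_parent):
        shared = set(starter_path(c, gold)) & set(starter_path(c, pred))
        return len(shared) / len(starter_path(c, denom_parent))

    p_path = sum(exact(c) for c in pd_pred) / len(pd_pred) if pd_pred else 1.0
    r_path = sum(exact(c) for c in pd_gold) / len(pd_gold) if pd_gold else 1.0
    p_rpath = sum(overlap(c, pred) for c in pd_pred) / len(pd_pred) if pd_pred else 1.0
    r_rpath = sum(overlap(c, gold) for c in pd_gold) / len(pd_gold) if pd_gold else 1.0
    return acc_ctd, p_edge, r_edge, p_path, r_path, p_rpath, r_rpath

# Ground truth and prediction of Figure 18 (0 denotes the root article).
gold = {1: 0, 2: 0, 7: 0, 9: 0, 10: 0, 11: 0, 3: 1, 4: 3, 5: 2, 6: 5, 8: 5}
pred = {1: 0, 2: 0, 9: 0, 10: 0, 11: 0, 3: 1, 4: 1, 5: 2, 6: 5, 7: 2, 8: 7}
print(declarative_metrics(gold, pred))
# -> (0.909..., 0.5, 0.6, 0.5, 0.6, 0.75, 0.8), matching Table 4
```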

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef, because more comments are linked directly to the root. Usually, SLARTS and Seo et al.'s methods cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads, because based on these metrics First shows high performance although this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge, SLARTS performs better than Seo et al.'s method. However, for P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations (like the First baseline).


Figure 19: Three real threads in which the nodes are labeled with the commenter's name and the ordinal number of his/her comments.

On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. On the other hand, the P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is larger than in the other datasets (Table 2) and that their paths are much longer and more complex than in the other datasets (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets; note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets, many of the defined features perform well. (The importance of the features in the ENENews and Russianblog datasets is shown in Table 7 and explained further in the next section.)


Table 5: Experimental results with interrogative evaluation metrics.

Dataset       Method            Acc_edge        Acc_path        P_R-path        R_R-path        F1_R-path
Thestandard   SLARTS            0.4669±0.009    0.3059±0.013    0.6927±0.018    0.7450±0.010    0.7179±0.013
              Seo et al.'s [2]  0.3941±0.005    0.3413±0.006    0.8303±0.009    0.6563±0.005    0.7331±0.003
              First             0.3630±0.010    0.3630±0.010    1.0000±0.000    0.5846±0.009    0.7379±0.007
              Last              0.1824±0.006    0.0495±0.004    0.2214±0.008    1.0000±0.000    0.3625±0.010
ENENews       SLARTS            0.4837±0.018    0.3695±0.027    0.7625±0.026    0.7770±0.013    0.7697±0.017
              Seo et al.'s [2]  0.4661±0.017    0.4134±0.021    0.8720±0.011    0.7086±0.013    0.7818±0.011
              First             0.4524±0.018    0.4524±0.018    1.0000±0.000    0.6634±0.016    0.7976±0.011
              Last              0.2090±0.016    0.0596±0.003    0.2319±0.011    1.0000±0.000    0.3764±0.015
Courantblogs  SLARTS            0.6843±0.028    0.6453±0.033    0.9127±0.013    0.8619±0.014    0.8865±0.013
              Seo et al.'s [2]  0.6700±0.040    0.6628±0.043    0.9667±0.014    0.8250±0.018    0.8902±0.015
              First             0.6490±0.036    0.6490±0.036    1.0000±0.000    0.8001±0.019    0.8889±0.012
              Last              0.2037±0.017    0.1082±0.008    0.3183±0.011    1.0000±0.000    0.4829±0.012
Russianblog   SLARTS            0.7463±0.017    0.6904±0.017    0.8741±0.008    0.8787±0.006    0.8764±0.007
              Seo et al.'s [2]  0.5446±0.010    0.5174±0.011    0.9293±0.003    0.7821±0.004    0.8494±0.003
              First             0.4968±0.006    0.4968±0.006    1.0000±0.000    0.7134±0.004    0.8327±0.003
              Last              0.3632±0.018    0.1549±0.012    0.3695±0.015    1.0000±0.000    0.5395±0.016
Alef          SLARTS            0.7753±0.017    0.7641±0.018    0.9579±0.006    0.9033±0.008    0.9298±0.007
              Seo et al.'s [2]  0.7802±0.019    0.7800±0.019    0.9950±0.003    0.8828±0.010    0.9355±0.006
              First             0.7853±0.018    0.7853±0.018    1.0000±0.000    0.8826±0.010    0.9376±0.006
              Last              0.0993±0.005    0.0822±0.005    0.2526±0.012    1.0000±0.000    0.4032±0.015

As shown in Table 6, Russianblog has better results than the other datasets on all metrics. The main reason is that its comments are not confirmed by a moderator. This causes the Acc_edge of the Last baseline in Russianblog to be 0.3632, which is higher than in the other datasets (Table 5); the Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier; in other words, as the parent appears closer to its child, features such as frequent pattern and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s methods appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., the candidate filtering rules), the Acc_CTD of SLARTS is better than that of Seo et al.'s method in all datasets.


The best result is 0.91 for Russianblog, and the worst result is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, and this makes the textual features perform poorly on detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets, because many of their relations are comment-root (Figure 16). This biases the ranking SVM towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute P-values to assess the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their scores and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section, we evaluate the role of the introduced features. We use backward feature selection; that is, to measure the importance of a feature, we use all features except that one, repeat the experiments in its absence, and compare the results with the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7 (a small sketch of this procedure is shown below).
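A minimal sketch of this leave-one-feature-out loop is shown below; train_and_evaluate is a placeholder for a routine that trains the ranking SVM with the given feature set and returns a dict of evaluation metrics (it is our assumption, not the paper's code).

```python
ALL_FEATURES = ["similarity", "authors_language", "global_similarity",
                "frequent_words", "length_ratio", "authors_name",
                "frequent_pattern", "location_prior",
                "candidate_filtering_rule_1", "candidate_filtering_rule_2",
                "candidate_filtering_rule_3"]

def feature_importance(train_and_evaluate, features=ALL_FEATURES):
    """Importance of a feature = metric values with all features present
    minus metric values with that feature left out (cf. Table 7)."""
    baseline = train_and_evaluate(features)          # all features present
    importance = {}
    for feat in features:
        reduced = train_and_evaluate([f for f in features if f != feat])
        importance[feat] = {m: baseline[m] - reduced[m] for m in baseline}
    return importance
```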

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3; these features mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is known as a morphologically rich language and the comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low. To increase the performance of the textual features in Russian, we would need a stemmer, a WordNet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature yields a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog, because the author's name in Russianblog is different from the other datasets: it is an email address, and no one would refer to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets. This feature brings a good improvement on Acc_CTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and brings a good improvement on Acc_CTD. The maximum improvement for R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 yields small improvements in the precision and recall of the evaluation metrics in both datasets. The maximum improvement is gained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus the detection of comment-root relations is improved. This rule also improves Acc_CTD.

6. Information Extraction by Visualizing Threads' Structure

In this section, we discuss the information which can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and the numbers of their posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. The thread "A" is a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations, because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics.

Dataset       Method            P_edge         R_edge         F_edge         P_path         R_path         F_path         P_R-path       R_R-path       F_R-path       Acc_CTD
Thestandard   SLARTS            0.3782±0.008   0.4155±0.011   0.3960±0.006   0.1565±0.007   0.1713±0.008   0.1636±0.007   0.3628±0.012   0.4166±0.016   0.3876±0.005   0.7525±0.005
              Seo et al.'s [2]  0.3014±0.031   0.1414±0.015   0.1921±0.016   0.1544±0.025   0.0576±0.011   0.0835±0.013   0.3013±0.038   0.1319±0.018   0.1831±0.021   0.5831±0.007
              Difference        0.0778         0.2741         0.2039         0.0021         0.1137         0.0801         0.0615         0.2847         0.2045         0.1694
ENENews       SLARTS            0.3797±0.038   0.3891±0.019   0.3839±0.024   0.1873±0.040   0.1855±0.026   0.1857±0.030   0.3768±0.048   0.3768±0.017   0.3759±0.025   0.6872±0.014
              Seo et al.'s [2]  0.4060±0.030   0.1713±0.016   0.2407±0.018   0.2378±0.034   0.0837±0.016   0.1235±0.021   0.3947±0.035   0.1523±0.015   0.2196±0.019   0.5954±0.009
              Difference        -0.0263        0.2178         0.1432         -0.0505        0.1018         0.0622         -0.0179        0.2245         0.1563         0.0918
Courantblogs  SLARTS            0.5236±0.030   0.3963±0.073   0.4495±0.052   0.4220±0.057   0.3110±0.058   0.3566±0.049   0.5325±0.049   0.3853±0.075   0.4452±0.057   0.7625±0.033
              Seo et al.'s [2]  0.7320±0.090   0.2049±0.062   0.3189±0.082   0.6815±0.105   0.1896±0.070   0.2951±0.093   0.7309±0.094   0.2010±0.065   0.3140±0.086   0.7001±0.031
              Difference        -0.2084        0.1914         0.1306         -0.2595        0.1214         0.0615         -0.1984        0.1843         0.1312         0.0624
Russianblog   SLARTS            0.5940±0.028   0.5674±0.022   0.5803±0.023   0.4907±0.031   0.4664±0.020   0.4782±0.024   0.5878±0.028   0.5823±0.021   0.5849±0.020   0.9116±0.011
              Seo et al.'s [2]  0.4705±0.020   0.3030±0.015   0.3685±0.016   0.3823±0.023   0.2534±0.013   0.3046±0.015   0.4682±0.020   0.2822±0.014   0.3521±0.015   0.5862±0.013
              Difference        0.1235         0.2644         0.2118         0.1084         0.2130         0.1736         0.1196         0.3001         0.2328         0.3254
Alef          SLARTS            0.6189±0.028   0.4131±0.040   0.4952±0.037   0.5639±0.027   0.3819±0.037   0.4551±0.034   0.6277±0.027   0.4069±0.043   0.4934±0.039   0.8044±0.013
              Seo et al.'s [2]  0.8827±0.052   0.2637±0.051   0.4045±0.060   0.8797±0.053   0.2631±0.051   0.4034±0.060   0.8840±0.052   0.2635±0.051   0.4043±0.060   0.7846±0.018
              Difference        -0.2638        0.1494         0.0907         -0.3158        0.1188         0.0517         -0.2563        0.1434         0.0891         0.0198

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have produced long conversations. Such users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations that reply directly to the root in thread "C"; for example, {F(2), H(2), H(3), I(2), O(1)} and {F(2), H(2), I(1)} are two conversations in thread "C".

7. Conclusion and Future Work

In this paper, we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, in blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that, because of these differences, many of the previously proposed features perform poorly on declarative threads. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can be useful in interrogative threads as well.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. Also, we would like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help understand the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Feature                      Dataset       P_edge     R_edge     P_path     R_path     P_R-path   R_R-path   Acc_CTD
Similarity                   ENENews       -0.00233   0.07415    -0.00538   0.03289    -0.01095   0.08996    0.04684
                             Russianblog   0.00874    0.00012    0.00963    0.00347    0.00695    -0.01079   0.00355
Authors' language            ENENews       -0.01354   0.03246    -0.01785   0.00693    -0.02757   0.04052    0.00458
                             Russianblog   -0.00229   0.01052    -0.00179   0.00750    -0.00590   0.00807    -0.00182
Global similarity            ENENews       -0.00402   0.05604    -0.01338   0.01581    -0.01129   0.07086    0.02742
                             Russianblog   0.01090    0.01443    0.00582    0.00897    0.00368    -0.00317   -0.00087
Frequent words               ENENews       0.00252    0.00931    0.00170    0.00424    -0.00132   0.00973    0.00292
                             Russianblog   0.00008    0.00023    0.00016    0.00029    0.00034    0.00047    -0.00011
Length ratio                 ENENews       0.04206    -0.02984   0.04450    0.01700    0.06421    -0.05336   -0.00850
                             Russianblog   0.00651    -0.00355   0.00920    0.00080    0.00784    -0.00592   0.00235
Authors' name                ENENews       0.03753    0.06392    0.01920    0.03285    0.02341    0.05998    0.01788
                             Russianblog   0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000
Frequent pattern             ENENews       -0.02028   0.02930    -0.02357   0.00300    -0.03793   0.03410    0.00526
                             Russianblog   0.00458    0.06100    0.00830    0.05206    -0.00545   0.06150    0.00404
Location prior               ENENews       0.08778    -0.03690   0.08556    0.03280    0.11792    -0.07852   0.02925
                             Russianblog   0.16091    0.09671    0.17766    0.13515    0.16699    0.08793    0.02414
Candidate filtering rule 1   ENENews       -0.05431   0.06138    -0.04102   0.01296    -0.06458   0.07545    0.06949
                             Russianblog   -0.12506   0.34370    -0.13916   0.27445    -0.12683   0.37003    0.32546
Candidate filtering rule 2   ENENews       0.00014    -0.00004   0.00048    0.00041    -0.00038   -0.00035   -0.00009
                             Russianblog   0.02575    0.02475    0.02462    0.02341    0.01746    0.00802    0.00000
Candidate filtering rule 3   ENENews       0.00357    -0.00121   0.00452    0.00254    0.00551    -0.00161   0.00161
                             Russianblog   0.08521    -0.01978   0.09636    0.01702    0.09380    -0.03825   0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as WordNet and named entity recognition (as they are not available with the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro, et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


procedure frequent-words (comment, candidate-comment)
    Sum = 0
    For each (Word1 in comment)
        For each (Word2 in candidate-comment)
            If ((Word1, Word2) in Score)
                Sum += Score[Word1, Word2]
            End if
        End for
    End for
    return Sum

Pseudocode 2: A pseudocode for calculating the score of frequent words between two comments.
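As a rough illustration (ours, not the authors' implementation), the following Python sketch mirrors Pseudocode 2. The Score table is assumed to map word pairs, mined as frequently co-occurring pairs, to precomputed weights (e.g., PMI-style values); pairs missing from the table contribute nothing.

from itertools import product

def frequent_words_score(comment_words, candidate_words, score):
    """Sum the pairwise scores of word pairs taken one from each comment.

    `score` is assumed to be a dict mapping (word1, word2) tuples to a
    precomputed weight; unknown pairs count as zero.
    """
    total = 0.0
    for w1, w2 in product(comment_words, candidate_words):
        total += score.get((w1, w2), 0.0)
    return total

# Hypothetical usage:
score_table = {("vote", "election"): 2.1, ("price", "inflation"): 1.7}
print(frequent_words_score(["the", "vote", "matters"], ["election", "results"], score_table))  # 2.1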

Table 1: The datasets' specific properties.

Datasets     | Language | Confirmation by moderator | Users register
Thestandard  | English  | ✓                         | ×
Alef         | Persian  | ✓                         | ×
ENENews      | English  | ×                         | ✓
Russianblog  | Russian  | ×                         | ✓
Courantblogs | English  | ✓                         | ×

For each website, we crawled the webpages that were published until the end of 2012. We then parsed the pages, extracted the reply structure, and used it as ground truth. We removed the threads with fewer than 4 comments, because such threads do not give us much information and usually their comments reply to the root. Table 1 summarizes the information about the prepared datasets. The datasets are available at http://ece.ut.ac.ir/nlp/resources.html.
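A minimal sketch of this filtering step, assuming threads have already been parsed into records with a hypothetical "comments" field:

def keep_informative_threads(threads, min_comments=4):
    """Drop threads with fewer than `min_comments` comments; very short
    threads mostly reply to the root and carry little structural signal."""
    return [t for t in threads if len(t["comments"]) >= min_comments]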

In ENENews and Russianblog, users have to register in order to leave a comment. However, in the other datasets, users can leave comments in a thread under different names, or different users can use the same name.

Table 2 reports some statistics on the crawled websites. The length of comments' text in Russianblog is shorter than in the other datasets, which causes text similarity measures to perform poorly on it. In ENENews the root usually includes a tweet; that is why the length of the root's text is shorter than in the other datasets. All comments have an author's name except for some comments in Alef. Therefore, the number of commenters for Alef was calculated from the comments that have an author's name. The average number of comments per article in Thestandard and ENENews is about 50 and 39, respectively, which is larger than in the other datasets.

In order to gain some insight into the data, we show some charts extracted from the datasets.

Figure 10 shows the distribution of the number of comments in the articles. It is seen that most threads have between 5 and 15 comments in Russianblog and Alef. However, in Thestandard threads are longer than in the other datasets, and most threads have between 12 and 24 comments.

The publication rate of articles is shown in Figure 11. The publication rate follows a bell shape: articles are published between 7:00 and 19:00, and the highest rate belongs to the 4-hour period between 9:00 and 13:00. Since there is only one author who publishes the articles in Russianblog, its chart has less variation, and the root is usually published between 10:00 and 18:00.

Figure 12 shows the publication rate of comments after the root. It is seen that all datasets except Russianblog have similar behavior: about one hour after the article is published, it has received most of its comments. After that, the rate of comment reception decreases. Russianblog shows different behavior; it seems that a part of its visitors reply to its articles the next morning, 16–21 hours after the publication.

Figure 13 shows the time difference between the publication time of comments and their replies. It is seen that the maximum difference is usually less than one hour. ENENews and Russianblog do not moderate comments, and Thestandard has very active moderators who immediately check and accept comments. However, in Courantblogs and Alef, where moderators are not always online, the time difference is between one and two hours.

Figure 14 shows how the depth of comments increases as time passes after publication of the main article. Deeper comments indicate longer and more serious conversations. As shown in the figure, comments usually reply to the root in the early hours. After a few hours, conversations are continued around the comments, which causes the depth of the thread to increase. Visitors of Alef and Courantblogs talk more about the root article, whereas Thestandard and ENENews have longer and deeper discussions.

Figure 15 shows the length of comments in words. Most comments include 10–20 words. Except for the comments of Russianblog, the datasets are similar. This tells us that the similarity measure is not enough and we have to use nontextual features as well. Russianblog has comments that are shorter than those of the other datasets; this dataset is a personal blog and users usually write friendly comments.

Figure 16 shows the depth of comments. Depth is directly related to the conversations. It is seen that comments are usually found at the first depth (directly below the root). Russianblog, ENENews, and Thestandard have more comments at deeper levels, meaning that conversations are longer.

5.2. Evaluation Metrics. To evaluate the experimental results, we use several metrics. For edge prediction, we use precision, recall, and F-score measures [4, 13, 17, 18]. The output of RTS is a tree. This causes the precision, recall, and F-score values to be equal [2], since the FP (False Positive) and FN (False Negative) counts are always equal.

Table 2: Datasets' statistics split up per website.

Datasets     | Avg. length of comments' text (words) | Avg. length of the root's text (words) | Number of comments | Avg. comments per news | Number of news | Number of commenters
Thestandard  | 72.98 | 357.68 | 149058 | 49.81 | 2992 | 3431
Alef         | 74.66 | 766.55 | 23210  | 27.96 | 830  | 4387
ENENews      | 68.19 | 162.64 | 16661  | 38.92 | 428  | 831
Courantblogs | 79.24 | 398.17 | 4952   | 19.19 | 258  | 1716
Russianblog  | 18.56 | 237.16 | 30877  | 16.18 | 1908 | 1258

Figure 10: Distribution of the number of comments in news articles (x-axis: number of comments; y-axis: number of news).

Figure 11: Publication rate of articles per hour (x-axis: hour; y-axis: normalized news count).

Figure 12: Average number of published comments per hour after the root (x-axis: elapsed time after the root is published, in hours; y-axis: normalized comments count).

Figure 13: Time difference between a node and its corresponding parent, per hour (x-axis: time difference; y-axis: normalized comments count).

Figure 14: Depth of comments averaged over the elapsed time from publication of the root (x-axis: elapsed time after publishing the root, in hours; y-axis: average comment depth).

Instead, we use the following edge accuracy measure:

\[ \text{Accuracy (Acc)}_{\text{edge}} = \frac{\left| \text{reply relations} \cap \text{detected relations} \right|}{N} \tag{7} \]

where the tree is comprised of N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers the paths from nodes to the root:

\[ \text{Accuracy (Acc)}_{\text{path}} = \frac{\sum_{i=1}^{|C|} \left| \text{path}_G(i) = \text{path}_P(i) \right|}{|C|} \tag{8} \]

where path_G(i) is the ground-truth structure for the i-th comment and path_P(i) is the predicted structure for it. |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it to be completely wrong, so it is very strict. To relax this, Wang et al. introduced a metric that computes the overlap between the path in the ground truth and the predicted path:

\[ \text{Precision } (P)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left( \left| \text{path}_G(i) \cap \text{path}_P(i) \right| / \left| \text{path}_P(i) \right| \right)}{|C|} \]

\[ \text{Recall } (R)_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \left( \left| \text{path}_G(i) \cap \text{path}_P(i) \right| / \left| \text{path}_G(i) \right| \right)}{|C|} \tag{9} \]

where |path_P(i)| is the number of comments in the predicted path of the i-th comment and |path_G(i)| is the number of comments in the ground-truth path of the i-th comment. F1-score is the harmonic mean of precision and recall:

\[ F_1\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{10} \]
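To make these formulas concrete, here is a small Python sketch (our own illustration, not the authors' code) that computes Acc_edge, Acc_path, and the relaxed path precision/recall/F1 for one thread, given the ground-truth and predicted parent of each comment; the root is denoted by "R". The "- 1" terms exclude the comment itself from its path, which matches the per-comment counts in the worked example of Table 3 below.

def path_to_root(node, parent):
    """List of nodes on the path from `node` up to the root 'R'."""
    path = [node]
    while path[-1] != "R":
        path.append(parent[path[-1]])
    return path

def interrogative_metrics(gold, pred):
    """`gold` and `pred` map each comment id to its parent (comment id or 'R')."""
    comments = list(gold)
    n = len(comments)
    acc_edge = sum(pred[c] == gold[c] for c in comments) / n               # Eq. (7)
    gold_paths = {c: path_to_root(c, gold) for c in comments}
    pred_paths = {c: path_to_root(c, pred) for c in comments}
    acc_path = sum(gold_paths[c] == pred_paths[c] for c in comments) / n   # Eq. (8)
    p_rpath = sum((len(set(gold_paths[c]) & set(pred_paths[c])) - 1) / (len(pred_paths[c]) - 1)
                  for c in comments) / n                                   # Eq. (9), precision
    r_rpath = sum((len(set(gold_paths[c]) & set(pred_paths[c])) - 1) / (len(gold_paths[c]) - 1)
                  for c in comments) / n                                   # Eq. (9), recall
    f_rpath = 2 * p_rpath * r_rpath / (p_rpath + r_rpath)                  # Eq. (10)
    return acc_edge, acc_path, p_rpath, r_rpath, f_rpath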

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or main content, which is different from the root of interrogative threads. This causes the structure of threads and the reply relations to be different. There are two types of reply relations in declarative threads:

Figure 15: Histogram of the length of comments in words (x-axis: length of comments in words; y-axis: normalized comments count).

Figure 16: Histogram of the depth of comments (x-axis: depth; y-axis: normalized comments count).

(1) comment-root, that is, the relation between a comment and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views around the main article (root). The comment-comment relations show conversation among visitors, which is a valuable source of information. In Figure 17 there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to evaluate both types of relations. So we have to modify the evaluation metrics or define new metrics.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comments Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

\[ \text{Accuracy (Acc)}_{\text{CTD}} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}} \tag{11} \]

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

\[ \text{Precision } (P)_{\text{edge}} = \frac{\text{TP}}{\text{TP} + \text{FP}}, \qquad \text{Recall } (R)_{\text{edge}} = \frac{\text{TP}}{\text{TP} + \text{FN}}, \qquad F_1\text{-score } (F)_{\text{edge}} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{12} \]

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of comment-comment relations in the ground truth that were not predicted at all.

Figure 17: An illustration of a declarative thread (comment-root relations connect comments directly to the root R; comment-comment relations connect comments to discussion-starter comments).
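As an informal illustration (ours, not the paper's code), the sketch below computes Acc_CTD and the edge precision/recall/F1 from ground-truth and predicted parent assignments. A comment is treated as NP when its parent is the root "R" and as PD otherwise; the fallbacks for empty sets are our own convention for degenerate cases.

def declarative_edge_metrics(gold, pred):
    """`gold`/`pred` map comment id -> parent id ('R' means the root)."""
    comments = list(gold)
    # Comment type detection: NP = reply to the root, PD = part of a discussion.
    tp = sum(gold[c] == "R" and pred[c] == "R" for c in comments)
    tn = sum(gold[c] != "R" and pred[c] != "R" for c in comments)
    fp = sum(gold[c] != "R" and pred[c] == "R" for c in comments)
    fn = sum(gold[c] == "R" and pred[c] != "R" for c in comments)
    acc_ctd = (tp + tn) / (tp + tn + fp + fn)                                   # Eq. (11)

    # Edge metrics over comment-comment relations only.
    gold_cc = {(c, gold[c]) for c in comments if gold[c] != "R"}
    pred_cc = {(c, pred[c]) for c in comments if pred[c] != "R"}
    tp_e = len(gold_cc & pred_cc)
    p_edge = tp_e / len(pred_cc) if pred_cc else 1.0
    r_edge = tp_e / len(gold_cc) if gold_cc else 1.0
    f_edge = 2 * p_edge * r_edge / (p_edge + r_edge) if (p_edge + r_edge) else 0.0  # Eq. (12)
    return acc_ctd, p_edge, r_edge, f_edge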

The path accuracy metric mentioned earlier is modified into P_path and R_path, which are appropriate to the declarative platform. These metrics consider the discussion path from each PD comment to the discussion-starter comment, but not to the root:

\[ \text{Precision } (P)_{\text{path}} = \begin{cases} 1, & \text{if } |C|_P = 0 \\[1ex] \dfrac{\sum_{i=1}^{|C|_P} \left| \text{path}_G(i) = \text{path}_P(i) \right|}{|C|_P}, & \text{otherwise} \end{cases} \]

\[ \text{Recall } (R)_{\text{path}} = \begin{cases} 1, & \text{if } |C|_R = 0 \\[1ex] \dfrac{\sum_{i=1}^{|C|_R} \left| \text{path}_G(i) = \text{path}_P(i) \right|}{|C|_R}, & \text{otherwise} \end{cases} \tag{13} \]

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread. path_G(i) is the discussion path from the i-th node to the discussion-starter comment in the ground truth, and path_P(i) is the discussion path from the i-th node to the discussion-starter comment in the predicted thread. Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

\[ \text{Precision } (P)_{R\text{-path}} = \begin{cases} 1, & \text{if } |C|_P = 0 \\[1ex] \dfrac{\sum_{i=1}^{|C|_P} \left( \left| \text{path}_G(i) \cap \text{path}_P(i) \right| / \left| \text{path}_P(i) \right| \right)}{|C|_P}, & \text{otherwise} \end{cases} \]

\[ \text{Recall } (R)_{R\text{-path}} = \begin{cases} 1, & \text{if } |C|_R = 0 \\[1ex] \dfrac{\sum_{i=1}^{|C|_R} \left( \left| \text{path}_G(i) \cap \text{path}_P(i) \right| / \left| \text{path}_G(i) \right| \right)}{|C|_R}, & \text{otherwise} \end{cases} \tag{14} \]
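A compact sketch of these declarative path metrics (again our own illustration, assuming every parent chain terminates at the root "R"); discussion paths stop at the discussion-starter comment rather than at the root:

def disc_path(node, parent):
    """Ancestors of `node` up to (but excluding) the root 'R'; empty for an NP comment."""
    path, cur = [], parent[node]
    while cur != "R":
        path.append(cur)
        cur = parent[cur]
    return path

def declarative_path_metrics(gold, pred):
    gold_pd = [c for c in gold if gold[c] != "R"]   # PD comments in the ground truth
    pred_pd = [c for c in pred if pred[c] != "R"]   # PD comments in the prediction
    gp = {c: disc_path(c, gold) for c in gold}
    pp = {c: disc_path(c, pred) for c in pred}
    # Strict variant, Eq. (13): only exactly matching discussion paths count.
    p_path = (sum(gp[c] == pp[c] for c in pred_pd) / len(pred_pd)) if pred_pd else 1.0
    r_path = (sum(gp[c] == pp[c] for c in gold_pd) / len(gold_pd)) if gold_pd else 1.0
    # Relaxed variant, Eq. (14): credit the overlap between the two paths.
    p_r = (sum(len(set(gp[c]) & set(pp[c])) / len(pp[c]) for c in pred_pd) / len(pred_pd)) if pred_pd else 1.0
    r_r = (sum(len(set(gp[c]) & set(pp[c])) / len(gp[c]) for c in gold_pd) / len(gold_pd)) if gold_pd else 1.0
    return p_path, r_path, p_r, r_r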

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3, the predicted thread in Figure 18 has been evaluated by the metrics for interrogative threads. The results show high values. Table 4 shows the results of evaluation by the metrics for declarative threads. The results show that the declarative metrics give more appropriate results.

There are two reasons why the declarative metrics better evaluate the predicted structure: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance when we simply link all comments to their previous comment (we name it the Last baseline) [3, 4]. This method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).
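The two baselines are trivial to implement; a sketch (ours) for a thread whose comments are given in chronological order:

def last_baseline(comment_ids):
    """Link every comment to the immediately preceding comment (the first one to the root)."""
    return {c: (comment_ids[i - 1] if i > 0 else "R") for i, c in enumerate(comment_ids)}

def first_baseline(comment_ids):
    """Link every comment directly to the root article."""
    return {c: "R" for c in comment_ids}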

The only work that has been done on RTS has focused on the author's name in online news agencies [19]. Schuth et al.'s method used several features to find the commenter's name in online news articles. This method achieves a good precision, but it has a low recall because many comments do not refer to any author's name. So we selected Seo et al.'s method [2], which is a strong method for interrogative platforms.

Seo et al.'s method focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets. Therefore, we use all their proposed features except the quoted text.

We have applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which lots of replies are connected to the root.

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.

Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Accuracy_edge: reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}; detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}; Acc_edge = 8/11 ≈ 0.73.

Acc_path: |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / (number of comments) = 8/11 ≈ 0.73.

P_R-path: [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / (number of comments) = [(1+1+1+1+1+1+1+1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11 ≈ 0.92.

R_R-path: [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / (number of comments) = [(1+1+1+1+1+1+1+1) + (2/3 + 1 + 2/3)] / 11 ≈ 0.94.

F1-score_R-path = (2 × 0.92 × 0.94) / (0.92 + 0.94) ≈ 0.93.

Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Acc_CTD: TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}; Acc_CTD = (5 + 5) / (5 + 5 + 1 + 0) ≈ 0.91.

P_edge, R_edge, F1_edge: TP = {3→1, 5→2, 6→5}, FN = {4→3, 8→5}, FP = {4→1, 7→2, 8→7}; P_edge = 3/(3+3) = 0.5, R_edge = 3/(3+2) = 0.6, F1-score_edge ≈ 0.55.

P_path, R_path, F1_path: P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5; R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6; F1-score_path ≈ 0.55.

P_R-path, R_R-path, F1_R-path: P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2) / |{3, 4, 5, 6, 7, 8}| = 4.5/6 = 0.75; R_R-path = (1 + 1/2 + 1 + 1 + 1/2) / |{3, 4, 5, 6, 8}| = 4/5 = 0.8; F1-score_R-path = (2 × 0.75 × 0.8) / (0.75 + 0.8) ≈ 0.77.

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef because more comments are linked directly to the root. Usually, SLARTS and Seo et al.'s methods cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads, because based on these metrics First shows high performance even though this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge SLARTS performs better than Seo et al.'s method. However, for P_edge Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to say that when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations (like the First baseline).

Figure 19: Three real threads in which nodes are labeled according to the commenter's name and the number of his/her comments.

On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. On the other hand, P_edge and R_edge of SLARTS are close.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is greater than in the other datasets (Table 2) and that their paths are much longer and more complex than in the other datasets (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets. Note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog. In these datasets, many of the defined features have good performance. (The importance of the features in the ENENews and Russianblog datasets is shown in Table 7, and we explain more in the next section.)

Table 5: Experimental results with interrogative evaluation metrics (columns: Acc_edge, Acc_path, P_R-path, R_R-path, F1_R-path). In the original table, bold marks the best result in each metric.

Thestandard
  SLARTS:           0.4669±0.009, 0.3059±0.013, 0.6927±0.018, 0.7450±0.010, 0.7179±0.013
  Seo et al.'s [2]: 0.3941±0.005, 0.3413±0.006, 0.8303±0.009, 0.6563±0.005, 0.7331±0.003
  First:            0.3630±0.010, 0.3630±0.010, 1.0000±0.000, 0.5846±0.009, 0.7379±0.007
  Last:             0.1824±0.006, 0.0495±0.004, 0.2214±0.008, 1.0000±0.000, 0.3625±0.010

ENENews
  SLARTS:           0.4837±0.018, 0.3695±0.027, 0.7625±0.026, 0.7770±0.013, 0.7697±0.017
  Seo et al.'s [2]: 0.4661±0.017, 0.4134±0.021, 0.8720±0.011, 0.7086±0.013, 0.7818±0.011
  First:            0.4524±0.018, 0.4524±0.018, 1.0000±0.000, 0.6634±0.016, 0.7976±0.011
  Last:             0.2090±0.016, 0.0596±0.003, 0.2319±0.011, 1.0000±0.000, 0.3764±0.015

Courantblogs
  SLARTS:           0.6843±0.028, 0.6453±0.033, 0.9127±0.013, 0.8619±0.014, 0.8865±0.013
  Seo et al.'s [2]: 0.6700±0.040, 0.6628±0.043, 0.9667±0.014, 0.8250±0.018, 0.8902±0.015
  First:            0.6490±0.036, 0.6490±0.036, 1.0000±0.000, 0.8001±0.019, 0.8889±0.012
  Last:             0.2037±0.017, 0.1082±0.008, 0.3183±0.011, 1.0000±0.000, 0.4829±0.012

Russianblog
  SLARTS:           0.7463±0.017, 0.6904±0.017, 0.8741±0.008, 0.8787±0.006, 0.8764±0.007
  Seo et al.'s [2]: 0.5446±0.010, 0.5174±0.011, 0.9293±0.003, 0.7821±0.004, 0.8494±0.003
  First:            0.4968±0.006, 0.4968±0.006, 1.0000±0.000, 0.7134±0.004, 0.8327±0.003
  Last:             0.3632±0.018, 0.1549±0.012, 0.3695±0.015, 1.0000±0.000, 0.5395±0.016

Alef
  SLARTS:           0.7753±0.017, 0.7641±0.018, 0.9579±0.006, 0.9033±0.008, 0.9298±0.007
  Seo et al.'s [2]: 0.7802±0.019, 0.7800±0.019, 0.9950±0.003, 0.8828±0.010, 0.9355±0.006
  First:            0.7853±0.018, 0.7853±0.018, 1.0000±0.000, 0.8826±0.010, 0.9376±0.006
  Last:             0.0993±0.005, 0.0822±0.005, 0.2526±0.012, 1.0000±0.000, 0.4032±0.015

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be equal to 0.3632, which is higher than in the other datasets (Table 5). Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier. In other words, as the parent appears closer to its child, features such as frequent patterns and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s methods appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not have an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specially focus on detecting comment-root relations (e.g., by adding candidate filtering), Acc_CTD of SLARTS is better than that of Seo et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst result is 0.69 for ENENews. The root in the ENENews dataset is usually a tweet. According to Table 2, this makes the average length of the root's text shorter than in the other datasets, and this makes the textual features perform poorly on detecting comment-root relations.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute P-values to specify the significance of the differences between the SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their score and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features. We use backward feature selection. That means, to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
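A schematic version of this ablation loop (our own sketch; the hypothetical `train_and_evaluate` stands in for the ranking-SVM training and the declarative metrics described above):

def backward_feature_ablation(all_features, train_and_evaluate):
    """For each feature, retrain without it and report the change in each metric.

    `train_and_evaluate(features)` is assumed to return a dict of metric values
    (e.g. P_edge, R_edge, Acc_CTD) obtained by cross-validation with that feature set.
    """
    full = train_and_evaluate(all_features)
    importance = {}
    for feat in all_features:
        reduced = train_and_evaluate([f for f in all_features if f != feat])
        # Positive difference: the feature helps that metric (as reported in Table 7).
        importance[feat] = {m: full[m] - reduced[m] for m in full}
    return importance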

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3; they most likely tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features most likely tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog shows different behaviour in comparison with the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is known as a morphologically rich language and the length of comments' text is very short (on average about 18 words, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low. To increase the performance of the textual features in Russian, we would need a stemmer, a WordNet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature has a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful in all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog because the author's name in Russianblog is different from the other datasets; it is an email address, and no one would refer to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets. This feature has a good improvement on Acc_CTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This feature removes the root candidate accurately and has a good improvement on Acc_CTD. The maximum improvement for R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 has small improvements on the precision and recall of the evaluation metrics in both datasets. The maximum improvement is gained with Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This feature removes all candidates except the root; thus detection of comment-root relations is improved. Also, this feature improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information which can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and the number of their posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations, because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in this conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.

Table 6: Experimental results with declarative evaluation metrics (columns: P_edge, R_edge, F_edge, P_path, R_path, F_path, P_R-path, R_R-path, F_R-path, Acc_CTD; the Difference row is SLARTS minus Seo et al.'s). In the original table, bold marks the best result in each metric.

Thestandard
  SLARTS:           0.3782±0.008, 0.4155±0.011, 0.3960±0.006, 0.1565±0.007, 0.1713±0.008, 0.1636±0.007, 0.3628±0.012, 0.4166±0.016, 0.3876±0.005, 0.7525±0.005
  Seo et al.'s [2]: 0.3014±0.031, 0.1414±0.015, 0.1921±0.016, 0.1544±0.025, 0.0576±0.011, 0.0835±0.013, 0.3013±0.038, 0.1319±0.018, 0.1831±0.021, 0.5831±0.007
  Difference:       0.0778, 0.2741, 0.2039, 0.0021, 0.1137, 0.0801, 0.0615, 0.2847, 0.2045, 0.1694

ENENews
  SLARTS:           0.3797±0.038, 0.3891±0.019, 0.3839±0.024, 0.1873±0.040, 0.1855±0.026, 0.1857±0.030, 0.3768±0.048, 0.3768±0.017, 0.3759±0.025, 0.6872±0.014
  Seo et al.'s [2]: 0.4060±0.030, 0.1713±0.016, 0.2407±0.018, 0.2378±0.034, 0.0837±0.016, 0.1235±0.021, 0.3947±0.035, 0.1523±0.015, 0.2196±0.019, 0.5954±0.009
  Difference:       -0.0263, 0.2178, 0.1432, -0.0505, 0.1018, 0.0622, -0.0179, 0.2245, 0.1563, 0.0918

Courantblogs
  SLARTS:           0.5236±0.030, 0.3963±0.073, 0.4495±0.052, 0.4220±0.057, 0.3110±0.058, 0.3566±0.049, 0.5325±0.049, 0.3853±0.075, 0.4452±0.057, 0.7625±0.033
  Seo et al.'s [2]: 0.7320±0.090, 0.2049±0.062, 0.3189±0.082, 0.6815±0.105, 0.1896±0.070, 0.2951±0.093, 0.7309±0.094, 0.2010±0.065, 0.3140±0.086, 0.7001±0.031
  Difference:       -0.2084, 0.1914, 0.1306, -0.2595, 0.1214, 0.0615, -0.1984, 0.1843, 0.1312, 0.0624

Russianblog
  SLARTS:           0.5940±0.028, 0.5674±0.022, 0.5803±0.023, 0.4907±0.031, 0.4664±0.020, 0.4782±0.024, 0.5878±0.028, 0.5823±0.021, 0.5849±0.020, 0.9116±0.011
  Seo et al.'s [2]: 0.4705±0.020, 0.3030±0.015, 0.3685±0.016, 0.3823±0.023, 0.2534±0.013, 0.3046±0.015, 0.4682±0.020, 0.2822±0.014, 0.3521±0.015, 0.5862±0.013
  Difference:       0.1235, 0.2644, 0.2118, 0.1084, 0.213, 0.1736, 0.1196, 0.3001, 0.2328, 0.3254

Alef
  SLARTS:           0.6189±0.028, 0.4131±0.040, 0.4952±0.037, 0.5639±0.027, 0.3819±0.037, 0.4551±0.034, 0.6277±0.027, 0.4069±0.043, 0.4934±0.039, 0.8044±0.013
  Seo et al.'s [2]: 0.8827±0.052, 0.2637±0.051, 0.4045±0.060, 0.8797±0.053, 0.2631±0.051, 0.4034±0.060, 0.8840±0.052, 0.2635±0.051, 0.4043±0.060, 0.7846±0.018
  Difference:       -0.2638, 0.1494, 0.0907, -0.3158, 0.1188, 0.0517, -0.2563, 0.1434, 0.0891, 0.0198

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than other users. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have made a long conversation. These users are known as a hub or a starter in a thread, and their degree of importance can be compared according to their in-degree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations which reply directly to the root in thread "C"; for example, F(2), H(2), H(3), I(2), O(1) and F(2), H(2), I(1) show two conversations in thread "C".
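As a small illustration of the kind of analysis this visualization supports (our own sketch, not part of SLARTS), the following helper derives thread height, width, the number of conversations about the root, and per-user reply counts (in-degree) from a predicted parent map:

from collections import Counter

def thread_statistics(parent, author):
    """`parent` maps comment id -> parent id ('R' is the root); `author` maps comment id -> user name."""
    def depth(node):
        return 0 if node == "R" else 1 + depth(parent[node])

    depths = [depth(c) for c in parent]
    return {
        "height": max(depths),                                    # longest conversation chain
        "width": max(Counter(depths).values()),                   # most comments at a single level
        "conversations_about_root": sum(p == "R" for p in parent.values()),
        # Replies attracted by each user's comments (an in-degree measure [39]).
        "replies_per_user": Counter(author[p] for p in parent.values() if p != "R"),
    }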

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, in blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of this. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine these features.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types of reply relations. So, in order to have a fair judgement on the quality of the predicted structures, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method shows higher accuracy in comparison with the baselines on five datasets. Also, we showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can be useful in interrogative threads.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. Also, we would like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.

Table 7: The difference between the evaluation metrics in the presence and absence of features (columns: P_edge, R_edge, P_path, R_path, P_R-path, R_R-path, Acc_CTD).

Similarity                 | ENENews     | -0.00233, 0.07415, -0.00538, 0.03289, -0.01095, 0.08996, 0.04684
Similarity                 | Russianblog | 0.00874, 0.00012, 0.00963, 0.00347, 0.00695, -0.01079, 0.00355
Authors' language          | ENENews     | -0.01354, 0.03246, -0.01785, 0.00693, -0.02757, 0.04052, 0.00458
Authors' language          | Russianblog | -0.00229, 0.01052, -0.00179, 0.00750, -0.00590, 0.00807, -0.00182
Global similarity          | ENENews     | -0.00402, 0.05604, -0.01338, 0.01581, -0.01129, 0.07086, 0.02742
Global similarity          | Russianblog | 0.01090, 0.01443, 0.00582, 0.00897, 0.00368, -0.00317, -0.00087
Frequent words             | ENENews     | 0.00252, 0.00931, 0.00170, 0.00424, -0.00132, 0.00973, 0.00292
Frequent words             | Russianblog | 0.00008, 0.00023, 0.00016, 0.00029, 0.00034, 0.00047, -0.00011
Length ratio               | ENENews     | 0.04206, -0.02984, 0.04450, 0.01700, 0.06421, -0.05336, -0.00850
Length ratio               | Russianblog | 0.00651, -0.00355, 0.00920, 0.00080, 0.00784, -0.00592, 0.00235
Authors' name              | ENENews     | 0.03753, 0.06392, 0.01920, 0.03285, 0.02341, 0.05998, 0.01788
Authors' name              | Russianblog | 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000
Frequent pattern           | ENENews     | -0.02028, 0.02930, -0.02357, 0.00300, -0.03793, 0.03410, 0.00526
Frequent pattern           | Russianblog | 0.00458, 0.06100, 0.00830, 0.05206, -0.00545, 0.06150, 0.00404
Location prior             | ENENews     | 0.08778, -0.03690, 0.08556, 0.03280, 0.11792, -0.07852, 0.02925
Location prior             | Russianblog | 0.16091, 0.09671, 0.17766, 0.13515, 0.16699, 0.08793, 0.02414
Candidate filtering rule 1 | ENENews     | -0.05431, 0.06138, -0.04102, 0.01296, -0.06458, 0.07545, 0.06949
Candidate filtering rule 1 | Russianblog | -0.12506, 0.34370, -0.13916, 0.27445, -0.12683, 0.37003, 0.32546
Candidate filtering rule 2 | ENENews     | 0.00014, -0.00004, 0.00048, 0.00041, -0.00038, -0.00035, -0.00009
Candidate filtering rule 2 | Russianblog | 0.02575, 0.02475, 0.02462, 0.02341, 0.01746, 0.00802, 0.00000
Candidate filtering rule 3 | ENENews     | 0.00357, -0.00121, 0.00452, 0.00254, 0.00551, -0.00161, 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521, -0.01978, 0.09636, 0.01702, 0.09380, -0.03825, 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as WordNet and Named Entity Recognition (as they are not available with the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 40th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Computation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 12: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

12 The Scientific World Journal

Table 2 Datasetsrsquo statistics split up per website

DatasetsAvg length of

commentsrsquo text (inwords)

Avg length of therootrsquos text (in words)

Number ofcomments

Avg comments pernews

Number ofnews

Number ofcommenter

Thestandard 7298 35768 149058 4981 2992 3431Alef 7466 76655 23210 2796 830 4387ENENews 6819 16264 16661 3892 428 831Courantblogs 7924 39817 4952 1919 258 1716Russianblog 1856 23716 30877 1618 1908 1258

0

20

40

60

80

100

120

4 8 16 32 64 128 256

Num

ber o

f new

s

Number of comments

AlefEnenewsRussianblog

CourantblogsThestandard

Figure 10 Distribution of the number of comments in news article

0

005

01

015

02

025

0 2 4 6 8 10 12 14 16 18 20 22

Nor

mal

ized

new

s cou

nt

Hour

AlefEnenewsRussianblog

CourantblogsThestandard

Figure 11 Publication rate of articles per hour

0002004006008

01012014016

0 2 4 6 8 10 12 14 16 18 20 22 24

Nor

mal

ized

com

men

ts co

unt

Elapsed time after published the root (hour)

AlefEnenewsCourantblogs

RussianblogThestandard

Figure 12 Average number of published comments per hour after the root

The Scientific World Journal 13

0005

01015

02025

03035

04045

05

0 1 2 3 4 5 6 7 8 9 10 11 12

Nor

mal

ized

com

men

ts co

unt

Time difference

AlefEnenewsRusssiablog

CourantblogsThestandard

Figure 13 Time difference between a node and corresponding parent per hour

0

05

1

15

2

25

3

35

4

0 2 4 6 8 10 12 14 16 18 20 22 24

Aver

age c

omm

ents

dept

h

Elapsed time after publishing the root (hour)

AlefEnenewsRusssianblog

CourantblogsThestandard

Figure 14 Depth of comments averaged on the elapsed time from publication of root

in precision and recall are always equal. Instead, we use the following edge accuracy measure:

\[
\mathrm{Acc}_{\mathrm{edge}} = \frac{|\{\text{reply relations}\} \cap \{\text{detected relations}\}|}{N}, \tag{7}
\]

where the tree is comprised of N comments and one root.

The second metric is path accuracy, which was introduced by Wang et al. [3]. This metric has a global view and considers paths from nodes to the root:

\[
\mathrm{Acc}_{\mathrm{path}} = \frac{\sum_{i=1}^{|C|} \big|\,\mathrm{path}_G(i) = \mathrm{path}_P(i)\,\big|}{|C|}, \tag{8}
\]

where path_G(i) is the ground-truth structure for the i-th comment, path_P(i) is the predicted structure for it, and |C| is the number of comments in a thread. If any irrelevant comment appears in the path, this metric considers it to be completely wrong, so it is very strict. To relax this, Wang et al. introduced a metric that computes the overlap between the path in the ground truth and the predicted path:

\[
P_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \big(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)|\,/\,|\mathrm{path}_P(i)|\big)}{|C|},
\qquad
R_{R\text{-path}} = \frac{\sum_{i=1}^{|C|} \big(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)|\,/\,|\mathrm{path}_G(i)|\big)}{|C|}, \tag{9}
\]

where |path_P(i)| is the number of comments in the predicted path of the i-th comment and |path_G(i)| is the number of comments in the ground-truth path of the i-th comment. The F1-score is the harmonic mean of precision and recall:

\[
F_1\text{-score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{10}
\]
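To make these definitions concrete, the sketch below computes Acc_edge, Acc_path, and the relaxed path precision/recall. It is written in Python, which the paper itself does not use, and the encoding of a thread as a dictionary mapping each comment to its parent (with the root having no entry) is our own illustrative assumption, not part of the original method.

    from fractions import Fraction

    def path_to_root(node, parent):
        """Ancestors of `node`, in order, up to and including the root.
        `parent` maps each comment to its parent; the root has no entry."""
        path = []
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    def interrogative_metrics(gold, pred):
        """Acc_edge (7), Acc_path (8), and the relaxed P/R/F1_R-path (9)-(10)."""
        comments = list(gold)
        n = len(comments)
        # Edge accuracy: fraction of comments whose predicted parent is correct.
        acc_edge = sum(pred.get(c) == gold[c] for c in comments) / n

        gp = {c: path_to_root(c, gold) for c in comments}
        pp = {c: path_to_root(c, pred) for c in comments}
        # Strict path accuracy: the whole path to the root must match.
        acc_path = sum(gp[c] == pp[c] for c in comments) / n

        # Relaxed path precision/recall: per-comment path overlap, averaged.
        p = sum(Fraction(len(set(gp[c]) & set(pp[c])), len(pp[c])) for c in comments) / n
        r = sum(Fraction(len(set(gp[c]) & set(pp[c])), len(gp[c])) for c in comments) / n
        f1 = 2 * p * r / (p + r) if p + r else 0
        return acc_edge, acc_path, float(p), float(r), float(f1)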

The above-mentioned metrics are appropriate for interrogative threads. As mentioned before, the root of declarative threads is a news article or other main content, which differs from the root of interrogative threads. This causes the structure of threads and the reply relations to be different. There are two types of reply relations in declarative threads:


Figure 15: Histogram of the length of comments in words (x-axis: length of comments in words; y-axis: normalized comments count; one curve per dataset).

Figure 16: Histogram of the depth of comments (x-axis: depth; y-axis: normalized comments count; one curve per dataset).

(1) comment-root, that is, the relation between a comment and the root article, and (2) comment-comment, that is, the relation between two comments, one parent and one child. The comment-root relations show different views around the main article (root). The comment-comment relations show conversation among visitors, which is a valuable source of information. In Figure 17 there are three comment-root and three comment-comment relations. When a user reads the root and the comments, he/she can write his/her opinion or sentiment about the root by replying to the root, or participate in a discussion by replying to other users' comments.

We believe that the evaluation metrics mentioned before are not enough to cover both types of reply relations, due to the differences between interrogative and declarative threads. An appropriate metric should be able to assess both types of relations. So we have to modify the evaluation metrics or define new metrics.

We propose an evaluation metric, Accuracy (Acc)_CTD, where CTD stands for Comments Type Detection. It is defined as the proportion of correctly detected comment types. The comment-root type includes comments which initiate a new view about the root (NP), and the comment-comment type includes comments which are part of a discussion (PD):

\[
\mathrm{Acc}_{\mathrm{CTD}} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \tag{11}
\]

where TP is the number of correctly detected NP comments, TN is the number of correctly detected PD comments, FP is the number of incorrectly detected NP comments, and FN is the number of incorrectly detected PD comments.

To evaluate the accuracy of comment-comment relations, the standard precision, recall, and F-score measures are used:

\[
P_{\mathrm{edge}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad
R_{\mathrm{edge}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad
F_{1,\mathrm{edge}} = \frac{2 \times P_{\mathrm{edge}} \times R_{\mathrm{edge}}}{P_{\mathrm{edge}} + R_{\mathrm{edge}}}, \tag{12}
\]

where TP is the number of correctly detected comment-comment relations, FP is the number of incorrectly detected comment-comment relations, and FN is the number of comment-comment relations in the ground truth that were not predicted at all.


Figure 17: An illustration of a declarative thread (root R, comment-root relations, comment-comment relations, and starter discussion comments).

The path accuracy metric mentioned earlier is modified into P_path and R_path, which are appropriate for the declarative platform. These metrics consider discussion paths from each PD comment to the discussion starter comment, not to the root:

\[
P_{\mathrm{path}} =
\begin{cases}
1, & \text{if } |C|_P = 0,\\[0.5ex]
\dfrac{\sum_{i=1}^{|C|_P} \big|\,\mathrm{path}_G(i) = \mathrm{path}_P(i)\,\big|}{|C|_P}, & \text{otherwise},
\end{cases}
\qquad
R_{\mathrm{path}} =
\begin{cases}
1, & \text{if } |C|_R = 0,\\[0.5ex]
\dfrac{\sum_{i=1}^{|C|_R} \big|\,\mathrm{path}_G(i) = \mathrm{path}_P(i)\,\big|}{|C|_R}, & \text{otherwise},
\end{cases} \tag{13}
\]

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread; path_G(i) is the discussion path from the i-th node to the discussion starter comment in the ground truth, and path_P(i) is the discussion path from the i-th node to the discussion starter comment in the predicted thread.

Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

\[
P_{R\text{-path}} =
\begin{cases}
1, & \text{if } |C|_P = 0,\\[0.5ex]
\dfrac{\sum_{i=1}^{|C|_P} \big(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)|\,/\,|\mathrm{path}_P(i)|\big)}{|C|_P}, & \text{otherwise},
\end{cases}
\qquad
R_{R\text{-path}} =
\begin{cases}
1, & \text{if } |C|_R = 0,\\[0.5ex]
\dfrac{\sum_{i=1}^{|C|_R} \big(|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)|\,/\,|\mathrm{path}_G(i)|\big)}{|C|_R}, & \text{otherwise}.
\end{cases} \tag{14}
\]
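A compact sketch of these declarative metrics follows, again in Python and using the same hypothetical parent-map encoding as the earlier sketch: Acc_CTD and the edge precision/recall treat replies to the root as NP comments and all other replies as PD comments, while the path metrics follow discussion paths only up to the starter comment.

    def discussion_path(node, parent, root='R'):
        """Path from `node` up to its discussion starter (the ancestor that
        replies directly to the root); empty for NP comments."""
        path = []
        while node in parent and parent[node] != root:
            node = parent[node]
            path.append(node)
        return path

    def declarative_metrics(gold, pred, root='R'):
        """Acc_CTD (11), edge P/R/F1 (12), and strict/relaxed path metrics (13)-(14)."""
        comments = set(gold)
        gold_np = {c for c in comments if gold[c] == root}
        pred_np = {c for c in comments if pred.get(c) == root}
        acc_ctd = (len(gold_np & pred_np) +
                   len((comments - gold_np) & (comments - pred_np))) / len(comments)

        # Edge metrics over comment-comment relations only.
        gold_cc = {(c, p) for c, p in gold.items() if p != root}
        pred_cc = {(c, p) for c, p in pred.items() if p != root}
        tp = len(gold_cc & pred_cc)
        p_edge = tp / len(pred_cc) if pred_cc else 1.0
        r_edge = tp / len(gold_cc) if gold_cc else 1.0
        f_edge = 2 * p_edge * r_edge / (p_edge + r_edge) if p_edge + r_edge else 0.0

        # Path metrics over PD comments, with paths ending at the starter comment.
        gp = {c: discussion_path(c, gold, root) for c in comments}
        pp = {c: discussion_path(c, pred, root) for c in comments}
        pd_g = [c for c in comments if gold[c] != root]      # PD comments, ground truth
        pd_p = [c for c in comments if pred.get(c) != root]  # PD comments, prediction
        p_path = sum(gp[c] == pp[c] for c in pd_p) / len(pd_p) if pd_p else 1.0
        r_path = sum(gp[c] == pp[c] for c in pd_g) / len(pd_g) if pd_g else 1.0
        p_rel = (sum(len(set(gp[c]) & set(pp[c])) / len(pp[c]) for c in pd_p) / len(pd_p)
                 if pd_p else 1.0)
        r_rel = (sum(len(set(gp[c]) & set(pp[c])) / len(gp[c]) for c in pd_g) / len(pd_g)
                 if pd_g else 1.0)
        return acc_ctd, (p_edge, r_edge, f_edge), (p_path, r_path), (p_rel, r_rel)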

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3 the predicted thread of Figure 18 has been evaluated by the metrics from interrogative threads; the results show high values. Table 4 shows the results of evaluation by the metrics from declarative threads. The results show that the declarative metrics evaluate the predicted structure more appropriately.

There are two reasons which lead the declarative metrics to better evaluate the predicted structure: (1) the root in declarative threads has many children, so if a method connects all comments to the root, the interrogative metrics show good results; (2) the interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.
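Using the hypothetical parent-map encoding and the metric sketches given above, the Figure 18 example can be reproduced directly; the printed values match Tables 3 and 4 up to rounding.

    gold = {1: 'R', 2: 'R', 7: 'R', 9: 'R', 10: 'R', 11: 'R',
            3: 1, 4: 3, 5: 2, 6: 5, 8: 5}          # Figure 18(a), ground truth
    pred = {1: 'R', 2: 'R', 9: 'R', 10: 'R', 11: 'R',
            3: 1, 4: 1, 5: 2, 6: 5, 7: 2, 8: 7}    # Figure 18(b), prediction

    print(interrogative_metrics(gold, pred))  # ~ (0.73, 0.73, 0.92, 0.94, 0.93)
    print(declarative_metrics(gold, pred))    # ~ (0.91, (0.5, 0.6, 0.55), (0.5, 0.6), (0.75, 0.8))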

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline is the performance when we simply link all comments to their previous comment (we name it the Last baseline) [3, 4]; this method leads to good results for the RTS task in interrogative threads. The second baseline is to link all comments to the root (we name it the First baseline).
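As a reference, the two baselines can be written in a few lines (a sketch under the same parent-map assumption as above; `comments` stands for the list of comment ids in chronological order).

    def first_baseline(comments, root='R'):
        """Link every comment directly to the root article."""
        return {c: root for c in comments}

    def last_baseline(comments, root='R'):
        """Link every comment to the immediately preceding comment."""
        return {c: (comments[i - 1] if i > 0 else root) for i, c in enumerate(comments)}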

The only work that has been done on RTS in online news agencies has focused on the author's name [19]. Schuth et al.'s method used several features to find the commenter's name in online news articles. This method achieves good precision but low recall, because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method on interrogative platforms.

Seo et al.'s method has focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all their proposed features except the quoted text.

We have applied 5-fold cross-validation to minimize bias, reporting 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy, except in Alef, in which lots of replies are connected to the root.
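The ± values reported in Tables 5 and 6 can be read as 95% confidence half-widths over the five folds. The paper does not spell out the exact computation, so the following normal-approximation sketch is only one plausible reading, not the authors' stated procedure.

    import statistics

    def mean_ci95(fold_scores):
        """Mean and 95% confidence half-width over cross-validation folds
        (normal approximation; an assumption, not the paper's stated procedure)."""
        m = statistics.mean(fold_scores)
        half = 1.96 * statistics.stdev(fold_scores) / len(fold_scores) ** 0.5
        return m, half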

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects all comments to their previous comments, which causes all comments to appear in the path.


Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Acc_edge: reply relations = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}; detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}; Acc_edge = 8/11 ≈ 0.73.

Acc_path: |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| / (number of comments) = 8/11 ≈ 0.73.

P_R-path: [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→1→R, 7→2→R, 8→7→2→R)] / (number of comments) = [(1+1+1+1+1+1+1+1) + (1 + 1/2 + 2/3)] / 11 = 10.17/11 ≈ 0.92.

R_R-path: [(1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R) + (4→3→1→R, 7→R, 8→5→2→R)] / (number of comments) = [(1+1+1+1+1+1+1+1) + (2/3 + 1 + 2/3)] / 11 = 10.34/11 ≈ 0.94.

F1_R-path: 2 × 0.92 × 0.94 / (0.92 + 0.94) ≈ 0.93.


Figure 18: An example of two threads: (a) the ground-truth thread and (b) the predicted thread.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Acc_CTD: TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}; Acc_CTD = (5 + 5)/(5 + 5 + 1 + 0) ≈ 0.91.

P_edge, R_edge, F1_edge: TP = {3→1, 5→2, 6→5}, FN = {4→3, 8→5}, FP = {4→1, 7→2, 8→7}; P_edge = 3/(3 + 3) = 0.5; R_edge = 3/(3 + 2) = 0.6; F1_edge ≈ 0.55.

P_path, R_path, F1_path: P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5; R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6; F1_path ≈ 0.55.

P_R-path, R_R-path, F1_R-path: P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2)/6 = 4.5/6 = 0.75, over the predicted paths {3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}; R_R-path = (1 + 1/2 + 1 + 1 + 1/2)/5 = 4/5 = 0.8, over the ground-truth paths {3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}; F1_R-path = 2 × 0.75 × 0.8/(0.75 + 0.8) ≈ 0.77.

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef, because more comments are linked directly to the root. SLARTS and Seo et al.'s method usually cannot predict the full path in Thestandard and ENENews because, according to Figure 16, paths are very long and complex in these datasets.

As we mentioned earlier, the interrogative evaluation metrics are not appropriate for declarative threads, because based on these metrics First shows high performance although this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge SLARTS performs better than Seo et al.'s method. For P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations (like the First baseline).


Figure 19: Three real threads in which nodes are labeled with the commenter's name and the index of that commenter's comment (panels (a), (b), and (c); R denotes the root).

On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. In contrast, P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than those of Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets; note that F_edge is a strict metric in declarative threads.

SLARTS has better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets many of the defined features perform well. (The importance of the features in the ENENews and Russianblog datasets will be shown in Table 7 and explained in the next section.)


Table 5: Experimental results with interrogative evaluation metrics.

Dataset      | Method           | Acc_edge     | Acc_path     | P_R-path     | R_R-path     | F1_R-path
Thestandard  | SLARTS           | 0.4669±0.009 | 0.3059±0.013 | 0.6927±0.018 | 0.7450±0.010 | 0.7179±0.013
Thestandard  | Seo et al.'s [2] | 0.3941±0.005 | 0.3413±0.006 | 0.8303±0.009 | 0.6563±0.005 | 0.7331±0.003
Thestandard  | First            | 0.3630±0.010 | 0.3630±0.010 | 1.0000±0.000 | 0.5846±0.009 | 0.7379±0.007
Thestandard  | Last             | 0.1824±0.006 | 0.0495±0.004 | 0.2214±0.008 | 1.0000±0.000 | 0.3625±0.010
ENENews      | SLARTS           | 0.4837±0.018 | 0.3695±0.027 | 0.7625±0.026 | 0.7770±0.013 | 0.7697±0.017
ENENews      | Seo et al.'s [2] | 0.4661±0.017 | 0.4134±0.021 | 0.8720±0.011 | 0.7086±0.013 | 0.7818±0.011
ENENews      | First            | 0.4524±0.018 | 0.4524±0.018 | 1.0000±0.000 | 0.6634±0.016 | 0.7976±0.011
ENENews      | Last             | 0.2090±0.016 | 0.0596±0.003 | 0.2319±0.011 | 1.0000±0.000 | 0.3764±0.015
Courantblogs | SLARTS           | 0.6843±0.028 | 0.6453±0.033 | 0.9127±0.013 | 0.8619±0.014 | 0.8865±0.013
Courantblogs | Seo et al.'s [2] | 0.6700±0.040 | 0.6628±0.043 | 0.9667±0.014 | 0.8250±0.018 | 0.8902±0.015
Courantblogs | First            | 0.6490±0.036 | 0.6490±0.036 | 1.0000±0.000 | 0.8001±0.019 | 0.8889±0.012
Courantblogs | Last             | 0.2037±0.017 | 0.1082±0.008 | 0.3183±0.011 | 1.0000±0.000 | 0.4829±0.012
Russianblog  | SLARTS           | 0.7463±0.017 | 0.6904±0.017 | 0.8741±0.008 | 0.8787±0.006 | 0.8764±0.007
Russianblog  | Seo et al.'s [2] | 0.5446±0.010 | 0.5174±0.011 | 0.9293±0.003 | 0.7821±0.004 | 0.8494±0.003
Russianblog  | First            | 0.4968±0.006 | 0.4968±0.006 | 1.0000±0.000 | 0.7134±0.004 | 0.8327±0.003
Russianblog  | Last             | 0.3632±0.018 | 0.1549±0.012 | 0.3695±0.015 | 1.0000±0.000 | 0.5395±0.016
Alef         | SLARTS           | 0.7753±0.017 | 0.7641±0.018 | 0.9579±0.006 | 0.9033±0.008 | 0.9298±0.007
Alef         | Seo et al.'s [2] | 0.7802±0.019 | 0.7800±0.019 | 0.9950±0.003 | 0.8828±0.010 | 0.9355±0.006
Alef         | First            | 0.7853±0.018 | 0.7853±0.018 | 1.0000±0.000 | 0.8826±0.010 | 0.9376±0.006
Alef         | Last             | 0.0993±0.005 | 0.0822±0.005 | 0.2526±0.012 | 1.0000±0.000 | 0.4032±0.015

The bold style shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be 0.3632, which is higher than in the other datasets (Table 5); Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier. In other words, as the parent appears closer to its child, some features, such as frequent patterns and location prior, that are based on the position of comments in chronological order work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not mention an author's name, which makes the features perform poorly.

Since the SLARTS method has features which specifically focus on detecting comment-root relations (e.g., by adding candidate filtering), Acc_CTD of SLARTS is better than that of Seo et al.'s method in all datasets.


The best result is 0.91 for Russianblog, and the worst is 0.69 for ENENews. The root in the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, so the textual features perform poorly on detecting comment-root relations there.

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute p-values to specify the significance of the differences between SLARTS and Seo et al.'s method on the declarative metrics. Since the ranking SVM ranks candidates based on their score and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (p-value < 0.005) in all datasets.

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features. We use backward feature selection: to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
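This backward ablation protocol can be summarised as the following loop (a sketch; `train_and_eval` stands for an assumed function that trains the ranking model with the given feature set and returns the evaluation metrics as a dict, and is not part of the paper).

    def backward_ablation(all_features, train_and_eval):
        """Importance of a feature = metric with all features minus metric without it
        (positive values mean the feature helps, as reported in Table 7)."""
        full = train_and_eval(all_features)
        differences = {}
        for f in all_features:
            reduced = train_and_eval([g for g in all_features if g != f])
            differences[f] = {m: full[m] - reduced[m] for m in full}
        return differences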

Some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, which mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the length of comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low; to increase the performance of the textual features in Russian we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog, and recall in ENENews.

The frequent words feature brings a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful in all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog because the author's name in Russianblog is different from the other datasets: it is an e-mail address, and no one refers to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and has a good impact on Acc_CTD. According to P_edge, the best improvement is 0.16 for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and has a good impact on Acc_CTD; the maximum improvement for R_edge is 0.34 for Russianblog.

Candidate filtering rule 2 has small improvements on the precision and recall of the evaluation metrics in both datasets; the maximum improvement is gained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root, so the detection of comment-root relations is improved; it also improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and their numbers of posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A": visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations, while thread "A" shows a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not (a small code sketch after item (v) below shows how such statistics can be read off a thread).

(ii) Which comments are starters [36]? Starter comments play an important role in the conversations, because users read them and then read the other comments to better understand the conversations.

(iii) Which comments have participated in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics.

Dataset      | Method           | P_edge       | R_edge       | F_edge       | P_path       | R_path       | F_path       | P_R-path     | R_R-path     | F_R-path     | Acc_CTD
Thestandard  | SLARTS           | 0.3782±0.008 | 0.4155±0.011 | 0.3960±0.006 | 0.1565±0.007 | 0.1713±0.008 | 0.1636±0.007 | 0.3628±0.012 | 0.4166±0.016 | 0.3876±0.005 | 0.7525±0.005
Thestandard  | Seo et al.'s [2] | 0.3014±0.031 | 0.1414±0.015 | 0.1921±0.016 | 0.1544±0.025 | 0.0576±0.011 | 0.0835±0.013 | 0.3013±0.038 | 0.1319±0.018 | 0.1831±0.021 | 0.5831±0.007
Thestandard  | Difference       | 0.0778       | 0.2741       | 0.2039       | 0.0021       | 0.1137       | 0.0801       | 0.0615       | 0.2847       | 0.2045       | 0.1694
ENENews      | SLARTS           | 0.3797±0.038 | 0.3891±0.019 | 0.3839±0.024 | 0.1873±0.040 | 0.1855±0.026 | 0.1857±0.030 | 0.3768±0.048 | 0.3768±0.017 | 0.3759±0.025 | 0.6872±0.014
ENENews      | Seo et al.'s [2] | 0.4060±0.030 | 0.1713±0.016 | 0.2407±0.018 | 0.2378±0.034 | 0.0837±0.016 | 0.1235±0.021 | 0.3947±0.035 | 0.1523±0.015 | 0.2196±0.019 | 0.5954±0.009
ENENews      | Difference       | -0.0263      | 0.2178       | 0.1432       | -0.0505      | 0.1018       | 0.0622       | -0.0179      | 0.2245       | 0.1563       | 0.0918
Courantblogs | SLARTS           | 0.5236±0.030 | 0.3963±0.073 | 0.4495±0.052 | 0.4220±0.057 | 0.3110±0.058 | 0.3566±0.049 | 0.5325±0.049 | 0.3853±0.075 | 0.4452±0.057 | 0.7625±0.033
Courantblogs | Seo et al.'s [2] | 0.7320±0.090 | 0.2049±0.062 | 0.3189±0.082 | 0.6815±0.105 | 0.1896±0.070 | 0.2951±0.093 | 0.7309±0.094 | 0.2010±0.065 | 0.3140±0.086 | 0.7001±0.031
Courantblogs | Difference       | -0.2084      | 0.1914       | 0.1306       | -0.2595      | 0.1214       | 0.0615       | -0.1984      | 0.1843       | 0.1312       | 0.0624
Russianblog  | SLARTS           | 0.5940±0.028 | 0.5674±0.022 | 0.5803±0.023 | 0.4907±0.031 | 0.4664±0.020 | 0.4782±0.024 | 0.5878±0.028 | 0.5823±0.021 | 0.5849±0.020 | 0.9116±0.011
Russianblog  | Seo et al.'s [2] | 0.4705±0.020 | 0.3030±0.015 | 0.3685±0.016 | 0.3823±0.023 | 0.2534±0.013 | 0.3046±0.015 | 0.4682±0.020 | 0.2822±0.014 | 0.3521±0.015 | 0.5862±0.013
Russianblog  | Difference       | 0.1235       | 0.2644       | 0.2118       | 0.1084       | 0.213        | 0.1736       | 0.1196       | 0.3001       | 0.2328       | 0.3254
Alef         | SLARTS           | 0.6189±0.028 | 0.4131±0.040 | 0.4952±0.037 | 0.5639±0.027 | 0.3819±0.037 | 0.4551±0.034 | 0.6277±0.027 | 0.4069±0.043 | 0.4934±0.039 | 0.8044±0.013
Alef         | Seo et al.'s [2] | 0.8827±0.052 | 0.2637±0.051 | 0.4045±0.060 | 0.8797±0.053 | 0.2631±0.051 | 0.4034±0.060 | 0.8840±0.052 | 0.2635±0.051 | 0.4043±0.060 | 0.7846±0.018
Alef         | Difference       | -0.2638      | 0.1494       | 0.0907       | -0.3158      | 0.1188       | 0.0517       | -0.2563      | 0.1434       | 0.0891       | 0.0198

The bold style shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have started long conversations. Such users are known as hubs or starters in a thread, and their degree of importance can be compared according to their in-degree [39]. Many analyses can be performed on this structure, which is known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2)-H(2)-H(3)-I(2)-O(1) and F(2)-H(2)-I(1) show two conversations in thread "C".
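The sketch below, referenced in item (i), shows how such structural statistics could be read off a (predicted or ground-truth) parent map; the specific statistics chosen here are illustrative, not the paper's formal definitions.

    from collections import Counter

    def thread_stats(parent, root='R'):
        """Height, number of direct replies to the root, and the most replied-to
        comment (a rough hub/starter indicator) of a thread given as a parent map."""
        children = Counter(parent.values())            # in-degree of every node
        depth = {root: 0}
        def d(n):                                      # depth of a node (memoised)
            if n not in depth:
                depth[n] = 1 + d(parent[n])
            return depth[n]
        hubs = Counter({n: k for n, k in children.items() if n != root})
        return {'height': max(d(c) for c in parent),   # deep trees suggest a controversial root
                'root_replies': children[root],        # conversations started about the root
                'top_hub': hubs.most_common(1)}        # most replied-to comment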

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy in comparison with the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can be useful in interrogative threads as well.

In the future we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help understand the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in presence and absence of features.

Feature                    | Dataset     | P_edge   | R_edge   | P_path   | R_path  | P_R-path | R_R-path | Acc_CTD
Similarity                 | ENENews     | -0.00233 | 0.07415  | -0.00538 | 0.03289 | -0.01095 | 0.08996  | 0.04684
Similarity                 | Russianblog | 0.00874  | 0.00012  | 0.00963  | 0.00347 | 0.00695  | -0.01079 | 0.00355
Authors' language          | ENENews     | -0.01354 | 0.03246  | -0.01785 | 0.00693 | -0.02757 | 0.04052  | 0.00458
Authors' language          | Russianblog | -0.00229 | 0.01052  | -0.00179 | 0.00750 | -0.00590 | 0.00807  | -0.00182
Global similarity          | ENENews     | -0.00402 | 0.05604  | -0.01338 | 0.01581 | -0.01129 | 0.07086  | 0.02742
Global similarity          | Russianblog | 0.01090  | 0.01443  | 0.00582  | 0.00897 | 0.00368  | -0.00317 | -0.00087
Frequent words             | ENENews     | 0.00252  | 0.00931  | 0.00170  | 0.00424 | -0.00132 | 0.00973  | 0.00292
Frequent words             | Russianblog | 0.00008  | 0.00023  | 0.00016  | 0.00029 | 0.00034  | 0.00047  | -0.00011
Length ratio               | ENENews     | 0.04206  | -0.02984 | 0.04450  | 0.01700 | 0.06421  | -0.05336 | -0.00850
Length ratio               | Russianblog | 0.00651  | -0.00355 | 0.00920  | 0.00080 | 0.00784  | -0.00592 | 0.00235
Authors' name              | ENENews     | 0.03753  | 0.06392  | 0.01920  | 0.03285 | 0.02341  | 0.05998  | 0.01788
Authors' name              | Russianblog | 0.00000  | 0.00000  | 0.00000  | 0.00000 | 0.00000  | 0.00000  | 0.00000
Frequent pattern           | ENENews     | -0.02028 | 0.02930  | -0.02357 | 0.00300 | -0.03793 | 0.03410  | 0.00526
Frequent pattern           | Russianblog | 0.00458  | 0.06100  | 0.00830  | 0.05206 | -0.00545 | 0.06150  | 0.00404
Location prior             | ENENews     | 0.08778  | -0.03690 | 0.08556  | 0.03280 | 0.11792  | -0.07852 | 0.02925
Location prior             | Russianblog | 0.16091  | 0.09671  | 0.17766  | 0.13515 | 0.16699  | 0.08793  | 0.02414
Candidate filtering rule 1 | ENENews     | -0.05431 | 0.06138  | -0.04102 | 0.01296 | -0.06458 | 0.07545  | 0.06949
Candidate filtering rule 1 | Russianblog | -0.12506 | 0.34370  | -0.13916 | 0.27445 | -0.12683 | 0.37003  | 0.32546
Candidate filtering rule 2 | ENENews     | 0.00014  | -0.00004 | 0.00048  | 0.00041 | -0.00038 | -0.00035 | -0.00009
Candidate filtering rule 2 | Russianblog | 0.02575  | 0.02475  | 0.02462  | 0.02341 | 0.01746  | 0.00802  | 0.00000
Candidate filtering rule 3 | ENENews     | 0.00357  | -0.00121 | 0.00452  | 0.00254 | 0.00551  | -0.00161 | 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521  | -0.01978 | 0.09636  | 0.01702 | 0.09380  | -0.03825 | 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as Wordnet and named entity recognition (as they are not available with the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547-571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435-444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26-33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35-42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323-332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350-361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673-676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13-25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87-113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581-588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103-1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516-527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846-863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467-471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645-654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605-611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818-822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339-373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171-177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265-272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727-734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289-1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361-368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217-226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793-794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245-251, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 13: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

The Scientific World Journal 13

0005

01015

02025

03035

04045

05

0 1 2 3 4 5 6 7 8 9 10 11 12

Nor

mal

ized

com

men

ts co

unt

Time difference

AlefEnenewsRusssiablog

CourantblogsThestandard

Figure 13 Time difference between a node and corresponding parent per hour

0

05

1

15

2

25

3

35

4

0 2 4 6 8 10 12 14 16 18 20 22 24

Aver

age c

omm

ents

dept

h

Elapsed time after publishing the root (hour)

AlefEnenewsRusssianblog

CourantblogsThestandard

Figure 14 Depth of comments averaged on the elapsed time from publication of root

in precision and recall are always equal Instead we use thefollowing edge accuracy measure

Accuracy (Acc)edge

=

1003816100381610038161003816reply relations cap detected relations1003816100381610038161003816119873

(7)

where the tree is comprised of119873 comments and one rootThe secondmetric is path accuracy which was introduced

byWang et al [3]Thismetric has a global view and considerspaths from nodes to the root

Accuracy (Acc)path =sum|119862|

119894=1

1003816100381610038161003816path119866 (119894) = path119875(119894)1003816100381610038161003816

|119862| (8)

where path119866(119894) is the ground-truth structure for 119894th comment

and path119875(119894) is the predicted structure for it |119862| is the number

of comments in a thread If any irrelevant comment appearsin the path this metric considers it to be completely wrongSo it is very strict To relax this Wang et al introduced

a metric that computes the overlap between the paths inground truth and the predicted path

Precision (119875)119877-path =

sum|119862|

119894=1(1003816100381610038161003816path119866 (119894) cap path

119875(119894)1003816100381610038161003816 1003816100381610038161003816path119875 (119894)

1003816100381610038161003816)

|119862|

Recall (119877)119877-path =

sum|119862|

119894=1(1003816100381610038161003816path119866 (119894) cap path

119875(119894)1003816100381610038161003816 1003816100381610038161003816path119866 (119894)

1003816100381610038161003816)

|119862|

(9)

where |path119875(119894)| is the number of comments in the prediction

path of 119894th comment and |path119866(119894)| is the number of com-

ments in the ground-truth path of 119894th comment 1198651-score isthe harmonic mean of precision and recall

1198651-score = 2 times Precision times recallPrecision + recall

(10)

The above mentioned metrics are appropriate in inter-rogative threads As mentioned before the root of declar-ative threads is news articles or main contents which aredifferent from the root of interrogative threads This causesthe structure of threads and reply relations to be differentThere are two types of reply relations in declarative threads

14 The Scientific World Journal

0

001

002

003

004

005

006

1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91Nor

mal

ized

com

men

ts co

unt

Length of comments (words)

AlefEnenewsRussianblog

CourantblogsThestandard

Figure 15 Histogram of length of comments in words

0

01

02

03

04

05

06

07

08

1 2 3 4 5 6 7 8 9Nor

mal

ized

com

men

ts co

unt

Depth

AlefEnenewsRusssainblog

CourantblogsThestandard

Figure 16 Histogram of the depth of comments

(1) comment-root that is the relation between commentsand the root article and (2) comment-comment that is therelation between two comments one parent and one childThe comment-root relations show different views around themain article (root) The comment-comment relations showconversation among visitors which is a valuable source ofinformation In Figure 17 there are three comment-root andthree comment-comment relations When a user reads theroot and comments heshe can write hisher opinion orsentiment about the root by replying to root or participatingin a discussion by replying to usersrsquo comments

We believe that the evaluation metrics mentioned beforeare not enough to cover both types of reply relations dueto differences between interrogative and declarative threadsAn appropriate metric should be able to detect both typesof relations So we have to modify the evaluation metrics ordefine new metrics

We propose an evaluation metric Accuracy (Acc)CTDwhere CTD stands for Comments Type Detection It isdefined as the proportion of correctly detected commenttypes The comment-root type includes comments whichinitiate a new view about the root (NP) and the comment-

comment type includes comments which are part of adiscussion (PD)

Accuracy (Acc)CTD =TP + TN

TP + TN + FP + FN (11)

where TP is the number of correctly detected NP commentsTN is the number of correctly detected PD comments FP isthe number of incorrectly detected NP comments and FN isthe number of incorrectly detected PD comments

To evaluate accuracy of comment-comment relations thestandard precision recall and 119865-score measures are used

Precision (119875)edge =TP

TP + FP

Recall (119877)edge =TP

TP + FN

1198651-score (119865)edge = 2 timesPrecision times recallPrecision + recall

(12)

where TP is the number of correctly detected comment-comment relations FP is the number of incorrectly detectedcomment-comment relations and FN is the number of

The Scientific World Journal 15

C

C

Comment-root relation

Comment-comment relation

C

C C C Starter discussion commentStarter discussion comment

R

Figure 17 An illustration of a declarative thread

comment-comment relations in the ground truth that werenot predicted at all

The path accuracy metric mentioned earlier is a mod-ified version of 119875path and 119877path which is appropriate todeclarative platform This metric consider discussion pathsfrom each PD comment to the starter discussion commentbut not to root

Precision (119875)path

=

sum|119862|119901

119894=1

1003816100381610038161003816path119866 (119894) = path119875(119894)1003816100381610038161003816

|119862|119901

ow

1 if (|119862|119901 = 0)

Recall (119877)path

=

sum|119862|119877

119894=1

1003816100381610038161003816path119866 (119894) = path119875(119894)1003816100381610038161003816

|119862|119877

ow

1 if (|119862|119877= 0)

(13)

where |119862|119877is the number of PD comments in the ground-

truth thread and |119862|119901is the number of PD comments in the

predicted thread Path119866(119894) is the discussion path from 119894th

node to the discussion starter comment in the ground truthPath119875(119894) is discussion path from 119894th node to the discussion

starter comment in the predicted threadAlso the relaxed Precision (119875)

119877-path and Recall (119877)119877-path

are modified to be suitable for declarative platform

Precision (119875)119877-path

=

sum|119862|119901

119894=1(1003816100381610038161003816path119866 (119894) cap path

119875(119894)1003816100381610038161003816 1003816100381610038161003816path119875 (119894)

1003816100381610038161003816)

|119862|119901

ow

1 if (|119862|119901= 0)

Recall (119877)119877-path

=

sum|119862|119877

119894=1(10038161003816100381610038161003816path119866 (119894)

cap path119875(119894)

1003816100381610038161003816100381610038161003816100381610038161003816path119866(119894)

10038161003816100381610038161003816)

|119862|119877

ow

1 if (|119862|119877= 0)

(14)

Figure 18 shows two threads of which one is the groundtruth and the other one is the predicted thread In order tobetter understand the evaluation metrics we calculate themfor this example

In Table 3 the predicted thread in Figure 18 has beenevaluated by metrics from interrogative threads The resultsshow high values Table 4 shows the results of evaluationby metrics from declarative threads The results show thatdeclarative metrics have more appropriate results

There are two reasons which lead declarative metricsto better evaluate the predicted structure (1) The root indeclarative threads has many children So if a methodconnects all comments to the root interrogative metricsshow good results (2) Interrogative metrics cannot properlyevaluate comment-comment relations in declarative threadsHowever the declarative metrics can evaluate both types ofrelations

53 Experimental Results and Analysis We use several base-lines to compare the effectiveness of SLARTS The first base-line is the performance when we simply link all comments totheir previous comment (we name it Last baseline) [3 4]Thismethod leads to good results for RTS task in interrogativethreads The second baseline is to link all comments to theroot (we name it First baseline)

The only work that has been done on RTS has focused onthe authorrsquos name in online news agencies [19] Schuth et alrsquosmethod used several features to find the commenterrsquos name inonline news articles This method achieves a good precisionbut it has a low recall because many comments do not referto any authorrsquos name So we selected Seo et alrsquos method [2]which is a strong method in interrogative platforms

Seo et alrsquos method has focused on forums and emails andused the quoted text as one of theirmain featuresThis featureis not available in our datasets Therefore we use all theirproposed features except the quoted text

We have applied 5fold cross-validation to minimize biaswith 95 as confidence interval Table 5 shows the results ofour experiments based on interrogative evaluation metricsAccording to Accedge SLARTS reveals higher accuracy exceptin Alef in which lots of replies are connected to the root

According to 119875119877-path and 119877119877-path First and Last baselines

have the best performance respectively First connects allcomments to the root this way irrelevant comments do notappear in the path from the comment to the root and also theroot appears in all paths Last connects all comments to theirprevious comments This causes all comments to appear inthe path

16 The Scientific World Journal

Table3Ex

ampleo

fcalculatio

nof

interrogativem

etric

sfor

thee

xamples

hownin

Figure

18

Evaluatio

nmetric

Calculation

Accuracy

edge

Replyrelatio

ns=1rarr

1198772rarr

1198777rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

14rarr

35rarr

26rarr

58rarr

5

Detectedrelatio

ns==1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

14rarr

15rarr

26rarr

57rarr

28rarr

7

8 11asymp073

Acc p

ath

|(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)|

Num

berof

comments

=8 11asymp073

119875119877-path

(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)+(4rarr

1rarr

1198777rarr

2rarr

1198778rarr

7rarr

2rarr

119877)

Num

berof

comments

=(1+1+1+1+1+1+1+1)+(1+12+23)

11

=1017

11

asymp092

119877119877-path

(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)+(4rarr

3rarr

1rarr

1198777rarr

1198778rarr

5rarr

2rarr

119877)

Num

berof

comments

=(1+1+1+1+1+1+1+1)+(23+1+23)

11

=1034

11

asymp094

1198651-score119877-path

2lowast092lowast094

092+094

asymp093

The Scientific World Journal 17

R

9 10 11

3 5

8

21 7

4 6

(a)

R

9 10 11

3 7

8

21

4 5

6

(b)

Figure 18 An example of two threads (a) ground-truth thread and (b) the prediction thread

Table 4 Example of calculation of declarative metrics for the example shown in Figure 18

Evaluation metric Calculation

AccCTD

TP = 1 2 9 10 11TN = 3 4 5 6 8

FN = 7

FP =

5 + 5

5 + 5 + 1 + 0asymp 091

119875edge

119877edge

1198651edge

TP = 3 rarr 1 5 rarr 2 6 rarr 5 119875edge =3

3 + 3= 05

FN = 4 rarr 3 8 rarr 5 119877edge =3

3 + 2= 06

FP = 4 rarr 1 7 rarr 2 8 rarr 7 1198651 minus scoreedge asymp 055

119875path

119877path

1198651path

119875path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2 |=3

6= 05

119877path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2 |=3

5= 06

1198651-score119877-path asymp 055

119875119877-path

119877119877-path

1198651119877-path

119875119877-path =

3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2

| 3 4 5 6 7 8 |=1 + 1 + 1 + 1 + 0 + 12

6=45

6= 075

119877119877-path =

3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2

| 3 4 5 6 8 |=1 + 12 + 1 + 1 + 12

5=4

5= 08

1198651-score119877-path =

2 lowast 075 lowast 08

075 + 08asymp 077

According to Accpath First has better performance inThestandard ENENews and Alef because more commentsare linked directly to the root Usually SLARTS and Seo etalrsquos methods cannot predict the full path inThestandard andENENews because according to Figure 15 paths are verylong and complex in these datasets

As wementioned earlier interrogative evaluationmetricsare not appropriate for declarative threads because based onthese metrics First shows high performance although thisbaseline does not detect any comment-comment relations

Table 6 shows the results when we use declarative evaluationmetrics proposed in Section 52

According to Table 6 for 119865edge SLARTS performs betterthan Seo et alrsquos method However for119875edge Seo et alrsquos methodperforms better but its 119877edge is lower than SLARTS in all da-tasets

It is important to say that when the difference between119875edge and 119877edge is high and 119875edge is greater than 119877edge themethod most likely connects comments to the root and itdoes not appropriately detect comment-comment relations

18 The Scientific World Journal

J(1)M(1)

V(1)

H(1)

A(1)

R(1)

G(1)

N(1)D(1)

E(1)

C(1)

P(2)

U(1)

S(1)

T(2)

T(1)

M(2)

L(1)

Q(1)

C(2)

K(1)

F(1)L(2)

C(3)

P(1)

A(2)

B(1)O(1)

I(1)

H(2)

R

(a)

J(1)

M(1)

R

A(1)

N(4)

D(1)

G(1)

I(1)

B(1)H(1)

O(1)

C(1)

N(2)

G(2)O(2)

P(1)

E(1)

D(2)

F(1)

E(2)F(2)

D(3)

L(1) F(3) N(1)

D(4)

F(4)K(1)D(5)

D(6)

N(3)

(b)

1)

1)

1)

1)

1)

1)

1)

1)1)

1)

1)

1)

R

F(2)

K(3)

R(1)

K(2)

M(2)A(3)

Q(1)

N(2)

K(4)A(2)

J(2)

1)

H(2)

H(3)

I(2)

I(1)

P(1)

O(1)

(c)

Figure 19 Three real threads in which nodes are labeled according to commenterrsquos name and number of its comments

(like First baseline) On the other hand when 119877edge isgreater than 119875edge the method does not appropriately detectcomment-root relations (like Last baseline)

So Seo et alrsquos method most likely connects comments tothe root and this is not appropriately detecting comment-comment relations On the other hands 119875edge and 119877edge ofSLARTS are close

Several features in SLARTS such as Global similarityfrequent words Frequent patterns and authorsrsquo languagefocus on detection of comment-comment relations Thismakes the results of SLARTS better than Seo et alrsquos methodin declarative evaluation metrics

We already saw that the average length of threads inThestandard and ENENews is longer than the other datasets(Table 2) and their paths are much longer and more complexthan the other datasets (Figure 16) According to 119865edge theaccuracy of Thestandard and ENENews is less than otherdatasets Note that 119865edge is a strict metric in declarativethreads

SLARTS has better 119865edge than Seo et alrsquos method in alldatasetsThemaximumdifference occurs inThestandard andRussianblog In these datasets many of the defined featureshave goodperformance (Importance of features in ENENewsand Russianblog datasets will be shown in Table 7 and we willexplain more in the next section)

The Scientific World Journal 19

Table 5: Experimental results with interrogative evaluation metrics.

Dataset | Method | Acc_edge | Acc_path | P_R-path | R_R-path | F1_R-path
Thestandard | SLARTS | 0.4669 ± 0.009 | 0.3059 ± 0.013 | 0.6927 ± 0.018 | 0.7450 ± 0.010 | 0.7179 ± 0.013
Thestandard | Seo et al.'s [2] | 0.3941 ± 0.005 | 0.3413 ± 0.006 | 0.8303 ± 0.009 | 0.6563 ± 0.005 | 0.7331 ± 0.003
Thestandard | First | 0.3630 ± 0.010 | 0.3630 ± 0.010 | 1.0000 ± 0.000 | 0.5846 ± 0.009 | 0.7379 ± 0.007
Thestandard | Last | 0.1824 ± 0.006 | 0.0495 ± 0.004 | 0.2214 ± 0.008 | 1.0000 ± 0.000 | 0.3625 ± 0.010
ENENews | SLARTS | 0.4837 ± 0.018 | 0.3695 ± 0.027 | 0.7625 ± 0.026 | 0.7770 ± 0.013 | 0.7697 ± 0.017
ENENews | Seo et al.'s [2] | 0.4661 ± 0.017 | 0.4134 ± 0.021 | 0.8720 ± 0.011 | 0.7086 ± 0.013 | 0.7818 ± 0.011
ENENews | First | 0.4524 ± 0.018 | 0.4524 ± 0.018 | 1.0000 ± 0.000 | 0.6634 ± 0.016 | 0.7976 ± 0.011
ENENews | Last | 0.2090 ± 0.016 | 0.0596 ± 0.003 | 0.2319 ± 0.011 | 1.0000 ± 0.000 | 0.3764 ± 0.015
Courantblogs | SLARTS | 0.6843 ± 0.028 | 0.6453 ± 0.033 | 0.9127 ± 0.013 | 0.8619 ± 0.014 | 0.8865 ± 0.013
Courantblogs | Seo et al.'s [2] | 0.6700 ± 0.040 | 0.6628 ± 0.043 | 0.9667 ± 0.014 | 0.8250 ± 0.018 | 0.8902 ± 0.015
Courantblogs | First | 0.6490 ± 0.036 | 0.6490 ± 0.036 | 1.0000 ± 0.000 | 0.8001 ± 0.019 | 0.8889 ± 0.012
Courantblogs | Last | 0.2037 ± 0.017 | 0.1082 ± 0.008 | 0.3183 ± 0.011 | 1.0000 ± 0.000 | 0.4829 ± 0.012
Russianblog | SLARTS | 0.7463 ± 0.017 | 0.6904 ± 0.017 | 0.8741 ± 0.008 | 0.8787 ± 0.006 | 0.8764 ± 0.007
Russianblog | Seo et al.'s [2] | 0.5446 ± 0.010 | 0.5174 ± 0.011 | 0.9293 ± 0.003 | 0.7821 ± 0.004 | 0.8494 ± 0.003
Russianblog | First | 0.4968 ± 0.006 | 0.4968 ± 0.006 | 1.0000 ± 0.000 | 0.7134 ± 0.004 | 0.8327 ± 0.003
Russianblog | Last | 0.3632 ± 0.018 | 0.1549 ± 0.012 | 0.3695 ± 0.015 | 1.0000 ± 0.000 | 0.5395 ± 0.016
Alef | SLARTS | 0.7753 ± 0.017 | 0.7641 ± 0.018 | 0.9579 ± 0.006 | 0.9033 ± 0.008 | 0.9298 ± 0.007
Alef | Seo et al.'s [2] | 0.7802 ± 0.019 | 0.7800 ± 0.019 | 0.9950 ± 0.003 | 0.8828 ± 0.010 | 0.9355 ± 0.006
Alef | First | 0.7853 ± 0.018 | 0.7853 ± 0.018 | 1.0000 ± 0.000 | 0.8826 ± 0.010 | 0.9376 ± 0.006
Alef | Last | 0.0993 ± 0.005 | 0.0822 ± 0.005 | 0.2526 ± 0.012 | 1.0000 ± 0.000 | 0.4032 ± 0.015

The bold style in the original table shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be 0.3632, which is higher than in the other datasets (Table 5); Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. We also showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detecting the reply relation becomes easier. In other words, as the parent appears closer to its child, features such as frequent pattern and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not mention an author's name, which makes the features perform poorly.

Since the SLARTS method has features that specifically focus on detecting comment-root relations (e.g., by adding candidate filtering), Acc_CTD of SLARTS is better than that of Seo et al.'s method in all datasets. The best result is 0.91 for Russianblog, and the worst is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, which in turn makes the textual features perform poorly at detecting comment-root relations.
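As a reminder of what Acc_CTD measures, the short sketch below (an illustration, not the authors' implementation) computes it as the fraction of comments whose type, replying to the root versus replying to another comment, is predicted correctly. The parent-map format and root marker are the same illustrative assumptions used earlier.

```python
# Illustrative sketch of Acc_CTD (comment type detection accuracy): a comment's
# type is "comment-root" if its parent is the root, otherwise "comment-comment";
# Acc_CTD is the fraction of comments whose predicted type matches the gold type.
def acc_ctd(gold, pred, root="R"):
    correct = sum((gold[c] == root) == (pred[c] == root) for c in gold)
    return correct / len(gold) if gold else 1.0

gold = {1: "R", 2: "R", 3: 1, 4: 1, 5: 2}
pred = {1: "R", 2: "R", 3: 1, 4: "R", 5: 1}
print(acc_ctd(gold, pred))  # 4 of 5 comments get the correct type -> 0.8
```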

The confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This biases the ranking SVM towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We use P-values to assess the significance of the differences between SLARTS and Seo et al.'s method on the declarative metrics. Since the ranking SVM ranks candidates by their scores and selects the first candidate from the list, only the first candidate matters, so p@1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.
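The text does not spell out the exact statistical test at this point. Purely as an illustration of how such a per-fold comparison could be run, the sketch below applies a paired t-test (scipy.stats.ttest_rel) to hypothetical per-fold F_edge scores of the two methods; the fold values are invented for the example and are not the paper's numbers.

```python
# One possible way to test whether per-fold improvements are significant:
# a paired t-test over the scores of the two methods on the same
# cross-validation folds. The fold scores below are made up for illustration.
from scipy.stats import ttest_rel

slarts_f_edge = [0.39, 0.40, 0.39, 0.40, 0.40]   # hypothetical per-fold F_edge
seo_f_edge    = [0.19, 0.20, 0.19, 0.19, 0.20]   # hypothetical per-fold F_edge

stat, p_value = ttest_rel(slarts_f_edge, seo_f_edge)
print(f"t = {stat:.2f}, p = {p_value:.4f}")
```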

5.4. Evaluation of the Features

In this section, we evaluate the role of the introduced features using backward feature selection. That is, to measure the importance of a feature, we use all features except that one, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
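The leave-one-feature-out procedure just described can be summarised in a few lines. The sketch below assumes a hypothetical train_and_evaluate(feature_set) helper that runs the full cross-validation and returns the metric values; this helper is not part of the paper, and the feature names are the ones listed in Table 7.

```python
# Sketch of the backward (leave-one-feature-out) ablation used to build Table 7.
# `train_and_evaluate` is a hypothetical helper that trains the ranking model
# with the given feature set and returns a dict of metric values.
FEATURES = [
    "similarity", "authors_language", "global_similarity", "frequent_words",
    "length_ratio", "authors_name", "frequent_patterns", "location_prior",
    "candidate_filtering_1", "candidate_filtering_2", "candidate_filtering_3",
]

def ablation_table(train_and_evaluate, features=FEATURES):
    """Return, per feature, the metric change caused by removing that feature."""
    full = train_and_evaluate(set(features))            # all features present
    rows = {}
    for feat in features:
        reduced = train_and_evaluate(set(features) - {feat})
        # Positive difference (present minus absent) means the feature helps.
        rows[feat] = {metric: full[metric] - reduced[metric] for metric in full}
    return rows
```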

It can be seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, which tend to detect comment-root relations. Other features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15), and ENENews has larger threads and more complex paths (Figure 16 and Table 2). We therefore use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement from textual features is low; increasing the performance of textual features in Russian would require a stemmer, a WordNet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements from this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature yields a small improvement in Russianblog, like the similarity feature, and improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful for all evaluation metrics. Its value in ENENews is higher than in Russianblog because authors' names in Russianblog differ from the other datasets: the name is an e-mail address, and no one refers to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations and improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and also yields a good improvement in Acc_CTD. According to P_edge, the best improvement is 0.16091, obtained on Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and yields a good improvement in Acc_CTD. The maximum improvement for R_edge is 0.34, obtained on Russianblog.

Candidate filtering rule 2 produces small improvements in the precision and recall of the evaluation metrics in both datasets; the maximum improvement is obtained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, the detection of comment-root relations is improved. This rule also improves Acc_CTD.

6. Information Extraction by Visualizing Threads Structure

In this section, we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and the number of comments they have posted are shown as the node labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread that has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not (a small sketch of such measures follows this list).

(ii) Which comments are starters [36]? Starter comments play an important role in the conversations because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics.

Dataset | Method | P_edge | R_edge | F_edge | P_path | R_path | F_path | P_R-path | R_R-path | F_R-path | Acc_CTD
Thestandard | SLARTS | 0.3782 ± 0.008 | 0.4155 ± 0.011 | 0.3960 ± 0.006 | 0.1565 ± 0.007 | 0.1713 ± 0.008 | 0.1636 ± 0.007 | 0.3628 ± 0.012 | 0.4166 ± 0.016 | 0.3876 ± 0.005 | 0.7525 ± 0.005
Thestandard | Seo et al.'s [2] | 0.3014 ± 0.031 | 0.1414 ± 0.015 | 0.1921 ± 0.016 | 0.1544 ± 0.025 | 0.0576 ± 0.011 | 0.0835 ± 0.013 | 0.3013 ± 0.038 | 0.1319 ± 0.018 | 0.1831 ± 0.021 | 0.5831 ± 0.007
Thestandard | Difference | 0.0778 | 0.2741 | 0.2039 | 0.0021 | 0.1137 | 0.0801 | 0.0615 | 0.2847 | 0.2045 | 0.1694
ENENews | SLARTS | 0.3797 ± 0.038 | 0.3891 ± 0.019 | 0.3839 ± 0.024 | 0.1873 ± 0.040 | 0.1855 ± 0.026 | 0.1857 ± 0.030 | 0.3768 ± 0.048 | 0.3768 ± 0.017 | 0.3759 ± 0.025 | 0.6872 ± 0.014
ENENews | Seo et al.'s [2] | 0.4060 ± 0.030 | 0.1713 ± 0.016 | 0.2407 ± 0.018 | 0.2378 ± 0.034 | 0.0837 ± 0.016 | 0.1235 ± 0.021 | 0.3947 ± 0.035 | 0.1523 ± 0.015 | 0.2196 ± 0.019 | 0.5954 ± 0.009
ENENews | Difference | -0.0263 | 0.2178 | 0.1432 | -0.0505 | 0.1018 | 0.0622 | -0.0179 | 0.2245 | 0.1563 | 0.0918
Courantblogs | SLARTS | 0.5236 ± 0.030 | 0.3963 ± 0.073 | 0.4495 ± 0.052 | 0.4220 ± 0.057 | 0.3110 ± 0.058 | 0.3566 ± 0.049 | 0.5325 ± 0.049 | 0.3853 ± 0.075 | 0.4452 ± 0.057 | 0.7625 ± 0.033
Courantblogs | Seo et al.'s [2] | 0.7320 ± 0.090 | 0.2049 ± 0.062 | 0.3189 ± 0.082 | 0.6815 ± 0.105 | 0.1896 ± 0.070 | 0.2951 ± 0.093 | 0.7309 ± 0.094 | 0.2010 ± 0.065 | 0.3140 ± 0.086 | 0.7001 ± 0.031
Courantblogs | Difference | -0.2084 | 0.1914 | 0.1306 | -0.2595 | 0.1214 | 0.0615 | -0.1984 | 0.1843 | 0.1312 | 0.0624
Russianblog | SLARTS | 0.5940 ± 0.028 | 0.5674 ± 0.022 | 0.5803 ± 0.023 | 0.4907 ± 0.031 | 0.4664 ± 0.020 | 0.4782 ± 0.024 | 0.5878 ± 0.028 | 0.5823 ± 0.021 | 0.5849 ± 0.020 | 0.9116 ± 0.011
Russianblog | Seo et al.'s [2] | 0.4705 ± 0.020 | 0.3030 ± 0.015 | 0.3685 ± 0.016 | 0.3823 ± 0.023 | 0.2534 ± 0.013 | 0.3046 ± 0.015 | 0.4682 ± 0.020 | 0.2822 ± 0.014 | 0.3521 ± 0.015 | 0.5862 ± 0.013
Russianblog | Difference | 0.1235 | 0.2644 | 0.2118 | 0.1084 | 0.213 | 0.1736 | 0.1196 | 0.3001 | 0.2328 | 0.3254
Alef | SLARTS | 0.6189 ± 0.028 | 0.4131 ± 0.040 | 0.4952 ± 0.037 | 0.5639 ± 0.027 | 0.3819 ± 0.037 | 0.4551 ± 0.034 | 0.6277 ± 0.027 | 0.4069 ± 0.043 | 0.4934 ± 0.039 | 0.8044 ± 0.013
Alef | Seo et al.'s [2] | 0.8827 ± 0.052 | 0.2637 ± 0.051 | 0.4045 ± 0.060 | 0.8797 ± 0.053 | 0.2631 ± 0.051 | 0.4034 ± 0.060 | 0.8840 ± 0.052 | 0.2635 ± 0.051 | 0.4043 ± 0.060 | 0.7846 ± 0.018
Alef | Difference | -0.2638 | 0.1494 | 0.0907 | -0.3158 | 0.1188 | 0.0517 | -0.2563 | 0.1434 | 0.0891 | 0.0198

The bold style in the original table shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than other users. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have made a long conversation. These users are known as a hub or a starter in a thread, and their degree of importance can be compared according to their indegree [39], as in the sketch after this list. Many analyses can be performed on this structure, known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2) → H(2) → H(3) → I(2) → O(1) and F(2) → H(2) → I(1) show two conversations in thread "C".
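To illustrate points (i) and (iv) above, the following sketch computes the height, the width, and the per-user indegree of a reconstructed reply tree represented as a child-to-parent map. The representation and the convention that a label like "D(3)" belongs to user "D" are illustrative assumptions, and the indegree here is a simple reply count rather than the weighted centrality of [39].

```python
# Sketch of simple structural measures over a reconstructed thread:
# height and width of the tree (point i) and each user's indegree, i.e. the
# number of direct replies their comments received (point iv).
from collections import Counter, defaultdict

def depth(node, parent, root="R"):
    d = 0
    while node != root:
        node = parent[node]
        d += 1
    return d

def thread_measures(parent, root="R"):
    depths = [depth(c, parent, root) for c in parent]
    height = max(depths, default=0)                   # longest root-to-leaf path
    width = max(Counter(depths).values(), default=0)  # most comments at one depth
    indegree = defaultdict(int)                       # replies received per user
    for child, par in parent.items():
        if par != root:
            indegree[par.split("(")[0]] += 1          # "D(3)" -> user "D"
    return height, width, dict(indegree)

# Toy thread: D(1) replies to the root and receives two replies.
parent = {"D(1)": "R", "A(1)": "D(1)", "B(1)": "D(1)", "A(2)": "B(1)"}
print(thread_measures(parent))  # (3, 2, {'D': 2, 'B': 1})
```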

7. Conclusion and Future Work

In this paper, we proposed SLARTS, a method based on ranking SVM that predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that, because of these differences, many of the previously proposed features perform poorly on declarative threads. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can also be useful in interrogative threads.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the roles of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Feature | Dataset | P_edge | R_edge | P_path | R_path | P_R-path | R_R-path | Acc_CTD
Similarity | ENENews | -0.00233 | 0.07415 | -0.00538 | 0.03289 | -0.01095 | 0.08996 | 0.04684
Similarity | Russianblog | 0.00874 | 0.00012 | 0.00963 | 0.00347 | 0.00695 | -0.01079 | 0.00355
Authors' language | ENENews | -0.01354 | 0.03246 | -0.01785 | 0.00693 | -0.02757 | 0.04052 | 0.00458
Authors' language | Russianblog | -0.00229 | 0.01052 | -0.00179 | 0.00750 | -0.00590 | 0.00807 | -0.00182
Global similarity | ENENews | -0.00402 | 0.05604 | -0.01338 | 0.01581 | -0.01129 | 0.07086 | 0.02742
Global similarity | Russianblog | 0.01090 | 0.01443 | 0.00582 | 0.00897 | 0.00368 | -0.00317 | -0.00087
Frequent words | ENENews | 0.00252 | 0.00931 | 0.00170 | 0.00424 | -0.00132 | 0.00973 | 0.00292
Frequent words | Russianblog | 0.00008 | 0.00023 | 0.00016 | 0.00029 | 0.00034 | 0.00047 | -0.00011
Length ratio | ENENews | 0.04206 | -0.02984 | 0.04450 | 0.01700 | 0.06421 | -0.05336 | -0.00850
Length ratio | Russianblog | 0.00651 | -0.00355 | 0.00920 | 0.00080 | 0.00784 | -0.00592 | 0.00235
Authors' name | ENENews | 0.03753 | 0.06392 | 0.01920 | 0.03285 | 0.02341 | 0.05998 | 0.01788
Authors' name | Russianblog | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000
Frequent pattern | ENENews | -0.02028 | 0.02930 | -0.02357 | 0.00300 | -0.03793 | 0.03410 | 0.00526
Frequent pattern | Russianblog | 0.00458 | 0.06100 | 0.00830 | 0.05206 | -0.00545 | 0.06150 | 0.00404
Location prior | ENENews | 0.08778 | -0.03690 | 0.08556 | 0.03280 | 0.11792 | -0.07852 | 0.02925
Location prior | Russianblog | 0.16091 | 0.09671 | 0.17766 | 0.13515 | 0.16699 | 0.08793 | 0.02414
Candidate filtering rule 1 | ENENews | -0.05431 | 0.06138 | -0.04102 | 0.01296 | -0.06458 | 0.07545 | 0.06949
Candidate filtering rule 1 | Russianblog | -0.12506 | 0.34370 | -0.13916 | 0.27445 | -0.12683 | 0.37003 | 0.32546
Candidate filtering rule 2 | ENENews | 0.00014 | -0.00004 | 0.00048 | 0.00041 | -0.00038 | -0.00035 | -0.00009
Candidate filtering rule 2 | Russianblog | 0.02575 | 0.02475 | 0.02462 | 0.02341 | 0.01746 | 0.00802 | 0.00000
Candidate filtering rule 3 | ENENews | 0.00357 | -0.00121 | 0.00452 | 0.00254 | 0.00551 | -0.00161 | 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521 | -0.01978 | 0.09636 | 0.01702 | 0.09380 | -0.03825 | 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as WordNet and named entity recognition (as they are not available with the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 40th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.



[20] C Wang J Han Q Li X Li W P Lin and H Ji ldquoLearninghierarchical relationships among partially ordered objects withheterogeneous attributes and linksrdquo in Proceedings of the 12thSIAM International Conference on Data Mining pp 516ndash527Anaheim Calif USA 2012

[21] M Dehghani A Shakery M Asadpour and A KoushkestanildquoA learning approach for email conversation thread reconstruc-tionrdquo Journal of Information Science vol 39 no 6 pp 846ndash8632013

[22] A Balali A Rajabi S Ghassemi M Asadpour and H FailildquoContent diffusion prediction in social networksrdquo in Pro-ceedings of the 5th Conference on Information and KnowledgeTechnology (IKT rsquo13) pp 467ndash471 2013

[23] G Mishne and N Glance ldquoLeave a reply an analysis of weblogcommentsrdquo in Proceedings of the 3rd Annual Workshop onthe Weblogging ecosystem Aggregation Analysis and DynamicsEdinburgh UK 2006

[24] V Gomez A Kaltenbrunner and V Lopez ldquoStatistical analysisof the social network and discussion threads in Slashdotrdquo inProceedings of the 17th International Conference on World WideWeb (WWW rsquo08) pp 645ndash654 April 2008

[25] T Weninger X A Zhu and J Han ldquoAn exploration ofdiscussion threads in social news sites a case study of the redditcommunityrdquo 2013

[26] D Laniado R Tasso Y Volkovich and A KaltenbrunnerldquoWhen the Wikipedians talk network and tree structure of

Wikipedia discussion pagesrdquo in Proceedings of the 5th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo11) 2011

[27] A Grabowski ldquoHuman behavior in online social systemsrdquoTheEuropean Physical Journal B vol 69 no 4 pp 605ndash611 2009

[28] S N Kim L Wang and T Baldwin ldquoTagging and linkingweb forum postsrdquo in Proceedings of the 40th Conference onComputational Natural Language Learning Uppsala Sweden2010

[29] J Andreas S Rosenthal and K McKeown ldquoAnnotating agree-ment and disagreement in threaded discussionrdquo in Proceedingsof the 8th International Conference on Language Resources andComputation (LREC rsquo12) pp 818ndash822 Istanbul Turkey May2012

[30] A Stolcke K Ries N Coccaro et al ldquoDialogue actmodeling forautomatic tagging and recognition of conversational speechrdquoComputational Linguistics vol 26 no 3 pp 339ndash373 2000

[31] E N Forsyth Improving Automated Lexical and DiscourseAnalysis of Online Chat Dialog Naval Postgraduate SchoolMonterey Calif USA 2007

[32] D Feng E Shaw J Kim and E Hovy ldquoAn intelligentdiscussion-bot for answering student queries in threaded dis-cussionsrdquo in Proceedings of the 11th international Conference onIntelligent User Interfaces pp 171ndash177 February 2006

[33] F Hu T Ruan Z Shao and J Ding ldquoAutomatic web informa-tion extraction based on rulesrdquo in Proceedings of the Web Infor-mation System Engineering (WISE rsquo11) pp 265ndash272 Springer2011

[34] F Hu T Ruan and Z Shao ldquoComplete-thread extraction fromweb forumsrdquo inWebTechnologies andApplications pp 727ndash734Springer 2012

[35] D Feng J Kim E Shaw and E Hovy ldquoTowards modelingthreaded discussions using induced ontology knowledgerdquo inProceedings of the National Conference on Artificial Intelligencepp 1289ndash1294 July 2006

[36] G D Venolia and C Neustaedter ldquoUnderstanding sequenceand reply relationships within email conversations a mixed-model visualizationrdquo in Proceedings of the SIGCHI Conferenceon Human Factors in Computing Systems pp 361ndash368 April2003

[37] T Joachims ldquoTraining linear SVMs in linear timerdquo in Pro-ceedings of the 12th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 217ndash226 Philadel-phia Pa USA August 2006

[38] RM Fano ldquoTransmission of information a statistical theory ofcommunicationsrdquo American Journal of Physics vol 29 pp 793ndash794 1961

[39] T Opsahl F Agneessens and J Skvoretz ldquoNode centrality inweighted networks generalizing degree and shortest pathsrdquoSocial Networks vol 32 no 3 pp 245ndash251 2010

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 15: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the


Figure 17: An illustration of a declarative thread. The root R is the article; nodes labeled C are comments connected by comment-root and comment-comment relations, and the comments that reply directly to the root are the starter discussion comments.

comment-comment relations in the ground truth that were not predicted at all.

The path accuracy metric mentioned earlier is modified into versions of P_path and R_path that are appropriate for the declarative platform. These metrics consider the discussion path from each PD comment to the starter discussion comment, but not to the root:

$$
\text{Precision } (P)_{\text{path}} =
\begin{cases}
\dfrac{\sum_{i=1}^{|C|_P} \left|\,\mathrm{path}_G(i) = \mathrm{path}_P(i)\,\right|}{|C|_P}, & \text{if } |C|_P \neq 0,\\[1.5ex]
1, & \text{if } |C|_P = 0,
\end{cases}
$$

$$
\text{Recall } (R)_{\text{path}} =
\begin{cases}
\dfrac{\sum_{i=1}^{|C|_R} \left|\,\mathrm{path}_G(i) = \mathrm{path}_P(i)\,\right|}{|C|_R}, & \text{if } |C|_R \neq 0,\\[1.5ex]
1, & \text{if } |C|_R = 0,
\end{cases}
\tag{13}
$$

where |C|_R is the number of PD comments in the ground-truth thread and |C|_P is the number of PD comments in the predicted thread; path_G(i) is the discussion path from the i-th node to the discussion starter comment in the ground truth, and path_P(i) is the discussion path from the i-th node to the discussion starter comment in the predicted thread.

Also, the relaxed Precision (P)_R-path and Recall (R)_R-path are modified to be suitable for the declarative platform:

$$
\text{Precision } (P)_{R\text{-path}} =
\begin{cases}
\dfrac{\sum_{i=1}^{|C|_P} \bigl( \left|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)\right| / \left|\mathrm{path}_P(i)\right| \bigr)}{|C|_P}, & \text{if } |C|_P \neq 0,\\[1.5ex]
1, & \text{if } |C|_P = 0,
\end{cases}
$$

$$
\text{Recall } (R)_{R\text{-path}} =
\begin{cases}
\dfrac{\sum_{i=1}^{|C|_R} \bigl( \left|\mathrm{path}_G(i) \cap \mathrm{path}_P(i)\right| / \left|\mathrm{path}_G(i)\right| \bigr)}{|C|_R}, & \text{if } |C|_R \neq 0,\\[1.5ex]
1, & \text{if } |C|_R = 0,
\end{cases}
\tag{14}
$$

Figure 18 shows two threads, of which one is the ground truth and the other is the predicted thread. In order to better understand the evaluation metrics, we calculate them for this example.

In Table 3, the predicted thread of Figure 18 is evaluated with the metrics from interrogative threads; the results show high values. Table 4 shows the results of evaluation with the metrics from declarative threads. The results show that the declarative metrics judge this prediction more appropriately.

There are two reasons why declarative metrics evaluate the predicted structure better: (1) the root in declarative threads has many children, so if a method connects all comments to the root, interrogative metrics still show good results; (2) interrogative metrics cannot properly evaluate comment-comment relations in declarative threads, whereas the declarative metrics can evaluate both types of relations.
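To make the declarative path metrics concrete, the following is a minimal illustrative sketch (ours, not the authors' code) that evaluates a predicted parent map against a ground-truth parent map; the helper names and the dictionary encoding of threads are assumptions made only for this example.

```python
# Minimal sketch of the declarative path metrics; a thread is a dict
# mapping each comment id to its parent id, where "R" denotes the root.

def path_to_starter(node, parent):
    """Ancestors of `node` from its parent up to the starter discussion
    comment (the root itself is excluded)."""
    path, p = [], parent[node]
    while p != "R":
        path.append(p)
        p = parent[p]
    return path

def pd_comments(parent):
    """PD comments are the comments that do not reply directly to the root."""
    return [c for c in parent if parent[c] != "R"]

def path_precision_recall(gt, pred, relaxed=False):
    def score(nodes, denom_side):
        if not nodes:                       # the "1 if |C| = 0" case
            return 1.0
        total = 0.0
        for c in nodes:
            g, p = path_to_starter(c, gt), path_to_starter(c, pred)
            if relaxed:
                denom = p if denom_side == "pred" else g
                total += len(set(g) & set(p)) / len(denom)
            else:
                total += 1.0 if g == p else 0.0
        return total / len(nodes)
    return score(pd_comments(pred), "pred"), score(pd_comments(gt), "gt")

# Ground-truth and predicted threads of Figure 18:
gt   = {1: "R", 2: "R", 7: "R", 9: "R", 10: "R", 11: "R",
        3: 1, 4: 3, 5: 2, 6: 5, 8: 5}
pred = {1: "R", 2: "R", 9: "R", 10: "R", 11: "R",
        3: 1, 4: 1, 5: 2, 6: 5, 7: 2, 8: 7}
print(path_precision_recall(gt, pred))                # (0.5, 0.6), as in Table 4
print(path_precision_recall(gt, pred, relaxed=True))  # (0.75, 0.8), as in Table 4
```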

5.3. Experimental Results and Analysis. We use several baselines to compare the effectiveness of SLARTS. The first baseline simply links each comment to its previous comment (we name it the Last baseline) [3, 4]; this method leads to good results for the RTS task in interrogative threads. The second baseline links all comments to the root (we name it the First baseline). Both baselines are sketched below.
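A minimal sketch of the two baselines (illustrative only; comment ids are assumed to be in chronological order and "R" denotes the root article):

```python
def last_baseline(comment_ids):
    """Link every comment to the immediately preceding comment;
    the first comment has no predecessor and is linked to the root."""
    parents, previous = {}, "R"
    for c in comment_ids:
        parents[c] = previous
        previous = c
    return parents

def first_baseline(comment_ids):
    """Link every comment directly to the root article."""
    return {c: "R" for c in comment_ids}

print(last_baseline([1, 2, 3]))   # {1: 'R', 2: 1, 3: 2}
print(first_baseline([1, 2, 3]))  # {1: 'R', 2: 'R', 3: 'R'}
```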

The only work that has been done on RTS in this setting has focused on the author's name in online news agencies [19]. Schuth et al.'s method used several features to find the commenter's name in online news articles. This method achieves good precision, but it has low recall because many comments do not refer to any author's name. So we also selected Seo et al.'s method [2], which is a strong method on interrogative platforms.

Seo et al.'s method focused on forums and emails and used the quoted text as one of its main features. This feature is not available in our datasets; therefore, we use all their proposed features except the quoted text.

We applied 5-fold cross-validation to minimize bias, with 95% confidence intervals. Table 5 shows the results of our experiments based on the interrogative evaluation metrics. According to Acc_edge, SLARTS reveals higher accuracy except in Alef, in which lots of replies are connected to the root.
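The paper does not spell out how the confidence intervals are computed; purely as an illustration, a normal-approximation 95% interval over the five per-fold scores could be obtained as follows (the fold scores here are made up).

```python
import math

def mean_with_95ci(fold_scores):
    """Mean and half-width of a normal-approximation 95% confidence
    interval over per-fold scores (e.g., Acc_edge on each test fold)."""
    n = len(fold_scores)
    mean = sum(fold_scores) / n
    var = sum((s - mean) ** 2 for s in fold_scores) / (n - 1)
    return mean, 1.96 * math.sqrt(var / n)

print(mean_with_95ci([0.47, 0.46, 0.48, 0.47, 0.45]))  # e.g. (0.466, ~0.010)
```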

According to P_R-path and R_R-path, the First and Last baselines have the best performance, respectively. First connects all comments to the root; this way, irrelevant comments do not appear in the path from a comment to the root, and the root appears in all paths. Last connects each comment to its previous comment, which causes all comments to appear in the path.


Table 3: Example of calculation of interrogative metrics for the example shown in Figure 18.

Acc_edge: reply relations (ground truth) = {1→R, 2→R, 7→R, 9→R, 10→R, 11→R, 3→1, 4→3, 5→2, 6→5, 8→5}; detected relations = {1→R, 2→R, 9→R, 10→R, 11→R, 3→1, 4→1, 5→2, 6→5, 7→2, 8→7}; 8 of the 11 relations match, so Acc_edge = 8/11 ≈ 0.73.

Acc_path: correctly predicted paths = |{1→R, 2→R, 9→R, 10→R, 11→R, 3→1→R, 5→2→R, 6→5→2→R}| divided by the number of comments, so Acc_path = 8/11 ≈ 0.73.

P_R-path: the eight paths above are fully correct (contributing 1 each); the predicted paths 4→1→R, 7→2→R, and 8→7→2→R overlap their ground-truth counterparts by 1, 1/2, and 2/3, respectively; P_R-path = ((1+1+1+1+1+1+1+1) + (1 + 1/2 + 2/3))/11 = 10.17/11 ≈ 0.92.

R_R-path: the same eight paths contribute 1 each; the ground-truth paths 4→3→1→R, 7→R, and 8→5→2→R are covered by the prediction by 2/3, 1, and 2/3, respectively; R_R-path = ((1+1+1+1+1+1+1+1) + (2/3 + 1 + 2/3))/11 = 10.34/11 ≈ 0.94.

F1_R-path = (2 × 0.92 × 0.94)/(0.92 + 0.94) ≈ 0.93.


Figure 18: An example of two threads: (a) the ground-truth thread, in which the root R has children 1, 2, 7, 9, 10, and 11, with 3→1, 4→3, 5→2, 6→5, and 8→5; and (b) the predicted thread, in which R has children 1, 2, 9, 10, and 11, with 3→1, 4→1, 5→2, 6→5, 7→2, and 8→7.

Table 4: Example of calculation of declarative metrics for the example shown in Figure 18.

Acc_CTD: TP = {1, 2, 9, 10, 11}, TN = {3, 4, 5, 6, 8}, FN = {7}, FP = {}; Acc_CTD = (5 + 5)/(5 + 5 + 1 + 0) ≈ 0.91.

P_edge, R_edge, F1_edge: TP = {3→1, 5→2, 6→5}, FN = {4→3, 8→5}, FP = {4→1, 7→2, 8→7}; P_edge = 3/(3 + 3) = 0.5; R_edge = 3/(3 + 2) = 0.6; F1_edge ≈ 0.55.

P_path, R_path, F1_path: P_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→1, 5→2, 6→5→2, 7→2, 8→7→2}| = 3/6 = 0.5; R_path = |{3→1, 5→2, 6→5→2}| / |{3→1, 4→3→1, 5→2, 6→5→2, 8→5→2}| = 3/5 = 0.6; F1_path ≈ 0.55.

P_R-path, R_R-path, F1_R-path: over the predicted PD comments {3, 4, 5, 6, 7, 8}, P_R-path = (1 + 1 + 1 + 1 + 0 + 1/2)/6 = 4.5/6 = 0.75; over the ground-truth PD comments {3, 4, 5, 6, 8}, R_R-path = (1 + 1/2 + 1 + 1 + 1/2)/5 = 4/5 = 0.8; F1_R-path = (2 × 0.75 × 0.8)/(0.75 + 0.8) ≈ 0.77.
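The edge-level metrics and Acc_CTD of Table 4 can likewise be reproduced with a few lines; this is an illustrative sketch under the same dictionary encoding as the path-metric sketch above.

```python
def edge_metrics(gt, pred):
    """Precision, recall, and F1 over comment-comment edges only."""
    gt_edges = {(c, p) for c, p in gt.items() if p != "R"}
    pred_edges = {(c, p) for c, p in pred.items() if p != "R"}
    tp = len(gt_edges & pred_edges)
    precision = tp / len(pred_edges) if pred_edges else 1.0
    recall = tp / len(gt_edges) if gt_edges else 1.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def acc_ctd(gt, pred):
    """Accuracy of comment type detection: starter (replies to the root)
    versus PD comment (replies to another comment)."""
    correct = sum((gt[c] == "R") == (pred[c] == "R") for c in gt)
    return correct / len(gt)

gt   = {1: "R", 2: "R", 7: "R", 9: "R", 10: "R", 11: "R",
        3: 1, 4: 3, 5: 2, 6: 5, 8: 5}
pred = {1: "R", 2: "R", 9: "R", 10: "R", 11: "R",
        3: 1, 4: 1, 5: 2, 6: 5, 7: 2, 8: 7}
print(edge_metrics(gt, pred))  # (0.5, 0.6, ~0.55), matching Table 4
print(acc_ctd(gt, pred))       # ~0.91, matching Table 4
```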

According to Acc_path, First has better performance in Thestandard, ENENews, and Alef because more comments are linked directly to the root. Usually SLARTS and Seo et al.'s methods cannot predict the full path in Thestandard and ENENews because, according to Figure 15, paths are very long and complex in these datasets.

As we mentioned earlier, interrogative evaluation metrics are not appropriate for declarative threads, because based on these metrics First shows high performance although this baseline does not detect any comment-comment relations. Table 6 shows the results when we use the declarative evaluation metrics proposed in Section 5.2.

According to Table 6, for F_edge SLARTS performs better than Seo et al.'s method. For P_edge, Seo et al.'s method performs better, but its R_edge is lower than that of SLARTS in all datasets.

It is important to note that when the difference between P_edge and R_edge is high and P_edge is greater than R_edge, the method most likely connects comments to the root and does not appropriately detect comment-comment relations (like the First baseline).


Figure 19: Three real threads in which nodes are labeled according to the commenter's name and the number of his/her comments.

On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

Thus, Seo et al.'s method most likely connects comments to the root and does not appropriately detect comment-comment relations. In contrast, P_edge and R_edge of SLARTS are close.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex than in the other datasets (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets. Note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets many of the defined features perform well. (The importance of features in the ENENews and Russianblog datasets is shown in Table 7 and explained further in the next section.)


Table 5: Experimental results with interrogative evaluation metrics.

Dataset | Method | Acc_edge | Acc_path | P_R-path | R_R-path | F1_R-path
Thestandard | SLARTS | 0.4669±0.009 | 0.3059±0.013 | 0.6927±0.018 | 0.7450±0.010 | 0.7179±0.013
Thestandard | Seo et al.'s [2] | 0.3941±0.005 | 0.3413±0.006 | 0.8303±0.009 | 0.6563±0.005 | 0.7331±0.003
Thestandard | First | 0.3630±0.010 | 0.3630±0.010 | 1.0000±0.000 | 0.5846±0.009 | 0.7379±0.007
Thestandard | Last | 0.1824±0.006 | 0.0495±0.004 | 0.2214±0.008 | 1.0000±0.000 | 0.3625±0.010
ENENews | SLARTS | 0.4837±0.018 | 0.3695±0.027 | 0.7625±0.026 | 0.7770±0.013 | 0.7697±0.017
ENENews | Seo et al.'s [2] | 0.4661±0.017 | 0.4134±0.021 | 0.8720±0.011 | 0.7086±0.013 | 0.7818±0.011
ENENews | First | 0.4524±0.018 | 0.4524±0.018 | 1.0000±0.000 | 0.6634±0.016 | 0.7976±0.011
ENENews | Last | 0.2090±0.016 | 0.0596±0.003 | 0.2319±0.011 | 1.0000±0.000 | 0.3764±0.015
Courantblogs | SLARTS | 0.6843±0.028 | 0.6453±0.033 | 0.9127±0.013 | 0.8619±0.014 | 0.8865±0.013
Courantblogs | Seo et al.'s [2] | 0.6700±0.040 | 0.6628±0.043 | 0.9667±0.014 | 0.8250±0.018 | 0.8902±0.015
Courantblogs | First | 0.6490±0.036 | 0.6490±0.036 | 1.0000±0.000 | 0.8001±0.019 | 0.8889±0.012
Courantblogs | Last | 0.2037±0.017 | 0.1082±0.008 | 0.3183±0.011 | 1.0000±0.000 | 0.4829±0.012
Russianblog | SLARTS | 0.7463±0.017 | 0.6904±0.017 | 0.8741±0.008 | 0.8787±0.006 | 0.8764±0.007
Russianblog | Seo et al.'s [2] | 0.5446±0.010 | 0.5174±0.011 | 0.9293±0.003 | 0.7821±0.004 | 0.8494±0.003
Russianblog | First | 0.4968±0.006 | 0.4968±0.006 | 1.0000±0.000 | 0.7134±0.004 | 0.8327±0.003
Russianblog | Last | 0.3632±0.018 | 0.1549±0.012 | 0.3695±0.015 | 1.0000±0.000 | 0.5395±0.016
Alef | SLARTS | 0.7753±0.017 | 0.7641±0.018 | 0.9579±0.006 | 0.9033±0.008 | 0.9298±0.007
Alef | Seo et al.'s [2] | 0.7802±0.019 | 0.7800±0.019 | 0.9950±0.003 | 0.8828±0.010 | 0.9355±0.006
Alef | First | 0.7853±0.018 | 0.7853±0.018 | 1.0000±0.000 | 0.8826±0.010 | 0.9376±0.006
Alef | Last | 0.0993±0.005 | 0.0822±0.005 | 0.2526±0.012 | 1.0000±0.000 | 0.4032±0.015

The bold style (in the original layout) shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be 0.3632, which is more than in the other datasets (Table 5); Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detection of the reply relation becomes easier. In other words, as the parent appears closer to its child, some features, such as frequent pattern and location prior, that are based on the position of comments in chronological order work better.

The difference between F_R-path and F_path is about 20% in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not mention any author's name, which makes the features perform poorly.

Since the SLARTS method has features that specifically focus on detecting comment-root relations (e.g., by adding candidate filtering), Acc_CTD of SLARTS is better than Seo et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst result is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, and this makes the textual features perform poorly in detecting comment-root relations.

Confidence intervals of the Alef and Courantblogs datasets are higher than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We use the P value to assess the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their score and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P value < 0.005) in all datasets.
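The paper does not name the exact statistical test; purely as an illustration, per-thread scores of the two methods could be compared with a paired test such as the following (the choice of a paired t-test here is our assumption, not necessarily the authors').

```python
from scipy import stats

def paired_significance(slarts_scores, seo_scores, alpha=0.005):
    """slarts_scores[i] and seo_scores[i] are per-thread values of the same
    metric (e.g., F_edge) for the two methods on the i-th test thread."""
    t_stat, p_value = stats.ttest_rel(slarts_scores, seo_scores)
    return p_value, p_value < alpha
```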

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features. We use backward feature selection; that is, to measure the importance of a feature, we use all features except that feature, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
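A minimal sketch of this ablation loop; the `train_and_evaluate` helper is a placeholder for training the ranking model with a given feature set and returning the metric values, not a function from the paper.

```python
def feature_importance(all_features, train_and_evaluate):
    """For each feature, report metric(all features) - metric(all but that
    feature); positive values mean the feature helps (as in Table 7)."""
    baseline = train_and_evaluate(all_features)
    deltas = {}
    for f in all_features:
        reduced = [x for x in all_features if x != f]
        without_f = train_and_evaluate(reduced)
        deltas[f] = {m: baseline[m] - without_f[m] for m in baseline}
    return deltas
```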

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, which most likely tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features most likely tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is known as a morphologically rich language and the length of comments' text is very short (on average about 18 words, according to Table 2) in comparison with the other datasets, the improvement of textual features is low; to increase the performance of textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature yields a small improvement in Russianblog, like the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful in all evaluation metrics. The value of this feature in ENENews is higher than in Russianblog because the author's name in Russianblog is different from the other datasets: it is an email address, and no one would refer to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets. This feature has a good improvement on Acc_CTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since comments there are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets; it removes the root candidate accurately and has a good improvement on Acc_CTD. The maximum improvement for R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 has small improvements on the precision and recall of the evaluation metrics in both datasets; the maximum improvement is gained with Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus the detection of comment-root relations is improved. Also, this rule improves Acc_CTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3 except that the commenters and their numbers of posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows (a small sketch for computing such statistics is given after the list):

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. The thread "A" shows a common thread which has short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments have an important role in the conversations because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in this conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comment.


Table 6: Experimental results with declarative evaluation metrics.

Dataset | Method | P_edge | R_edge | F_edge | P_path | R_path | F_path | P_R-path | R_R-path | F_R-path | Acc_CTD
Thestandard | SLARTS | 0.3782±0.008 | 0.4155±0.011 | 0.3960±0.006 | 0.1565±0.007 | 0.1713±0.008 | 0.1636±0.007 | 0.3628±0.012 | 0.4166±0.016 | 0.3876±0.005 | 0.7525±0.005
Thestandard | Seo et al.'s [2] | 0.3014±0.031 | 0.1414±0.015 | 0.1921±0.016 | 0.1544±0.025 | 0.0576±0.011 | 0.0835±0.013 | 0.3013±0.038 | 0.1319±0.018 | 0.1831±0.021 | 0.5831±0.007
Thestandard | Difference | 0.0778 | 0.2741 | 0.2039 | 0.0021 | 0.1137 | 0.0801 | 0.0615 | 0.2847 | 0.2045 | 0.1694
ENENews | SLARTS | 0.3797±0.038 | 0.3891±0.019 | 0.3839±0.024 | 0.1873±0.040 | 0.1855±0.026 | 0.1857±0.030 | 0.3768±0.048 | 0.3768±0.017 | 0.3759±0.025 | 0.6872±0.014
ENENews | Seo et al.'s [2] | 0.4060±0.030 | 0.1713±0.016 | 0.2407±0.018 | 0.2378±0.034 | 0.0837±0.016 | 0.1235±0.021 | 0.3947±0.035 | 0.1523±0.015 | 0.2196±0.019 | 0.5954±0.009
ENENews | Difference | -0.0263 | 0.2178 | 0.1432 | -0.0505 | 0.1018 | 0.0622 | -0.0179 | 0.2245 | 0.1563 | 0.0918
Courantblogs | SLARTS | 0.5236±0.030 | 0.3963±0.073 | 0.4495±0.052 | 0.4220±0.057 | 0.3110±0.058 | 0.3566±0.049 | 0.5325±0.049 | 0.3853±0.075 | 0.4452±0.057 | 0.7625±0.033
Courantblogs | Seo et al.'s [2] | 0.7320±0.090 | 0.2049±0.062 | 0.3189±0.082 | 0.6815±0.105 | 0.1896±0.070 | 0.2951±0.093 | 0.7309±0.094 | 0.2010±0.065 | 0.3140±0.086 | 0.7001±0.031
Courantblogs | Difference | -0.2084 | 0.1914 | 0.1306 | -0.2595 | 0.1214 | 0.0615 | -0.1984 | 0.1843 | 0.1312 | 0.0624
Russianblog | SLARTS | 0.5940±0.028 | 0.5674±0.022 | 0.5803±0.023 | 0.4907±0.031 | 0.4664±0.020 | 0.4782±0.024 | 0.5878±0.028 | 0.5823±0.021 | 0.5849±0.020 | 0.9116±0.011
Russianblog | Seo et al.'s [2] | 0.4705±0.020 | 0.3030±0.015 | 0.3685±0.016 | 0.3823±0.023 | 0.2534±0.013 | 0.3046±0.015 | 0.4682±0.020 | 0.2822±0.014 | 0.3521±0.015 | 0.5862±0.013
Russianblog | Difference | 0.1235 | 0.2644 | 0.2118 | 0.1084 | 0.213 | 0.1736 | 0.1196 | 0.3001 | 0.2328 | 0.3254
Alef | SLARTS | 0.6189±0.028 | 0.4131±0.040 | 0.4952±0.037 | 0.5639±0.027 | 0.3819±0.037 | 0.4551±0.034 | 0.6277±0.027 | 0.4069±0.043 | 0.4934±0.039 | 0.8044±0.013
Alef | Seo et al.'s [2] | 0.8827±0.052 | 0.2637±0.051 | 0.4045±0.060 | 0.8797±0.053 | 0.2631±0.051 | 0.4034±0.060 | 0.8840±0.052 | 0.2635±0.051 | 0.4043±0.060 | 0.7846±0.018
Alef | Difference | -0.2638 | 0.1494 | 0.0907 | -0.3158 | 0.1188 | 0.0517 | -0.2563 | 0.1434 | 0.0891 | 0.0198

The bold style (in the original layout) shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than other users. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have made a long conversation. These users are known as a hub or a starter in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which are known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2), H(2), H(3), I(2), O(1) or F(2), H(2), I(1) show two conversations in thread "C".
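The statistics listed above can be read directly off a reconstructed parent map. The following is a small illustrative sketch (helper names are ours, not the paper's) that derives the number of conversations about the root, the tree height, and per-user reply counts.

```python
from collections import Counter

def thread_statistics(parent, author):
    """`parent` maps comment id -> parent id ("R" = root article);
    `author` maps comment id -> commenter name."""
    def depth(c):
        d = 1
        while parent[c] != "R":
            c = parent[c]
            d += 1
        return d
    starters = [c for c in parent if parent[c] == "R"]      # conversations about the root
    height = max((depth(c) for c in parent), default=0)     # tall/wide trees hint at controversy
    replies_received = Counter(author[p] for p in parent.values() if p != "R")
    return {
        "num_conversations": len(starters),
        "height": height,
        "replies_per_user": dict(replies_received),         # hubs have a high indegree
    }

# Tiny hypothetical thread: B replies to A, C replies to B, D replies to the root.
print(thread_statistics({1: "R", 2: 1, 3: 2, 4: "R"},
                        {1: "A", 2: "B", 3: "C", 4: "D"}))
# {'num_conversations': 2, 'height': 3, 'replies_per_user': {'A': 1, 'B': 1}}
```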

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM that predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy in comparison with the baselines on five datasets. Also, we showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, extract relations among users and can be useful in interrogative threads as well.

In the future we would like to use the proposed method on interrogative platforms such as forums. Also, we would like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of features.

Feature | Dataset | P_edge | R_edge | P_path | R_path | P_R-path | R_R-path | Acc_CTD
Similarity | ENENews | -0.00233 | 0.07415 | -0.00538 | 0.03289 | -0.01095 | 0.08996 | 0.04684
Similarity | Russianblog | 0.00874 | 0.00012 | 0.00963 | 0.00347 | 0.00695 | -0.01079 | 0.00355
Authors' language | ENENews | -0.01354 | 0.03246 | -0.01785 | 0.00693 | -0.02757 | 0.04052 | 0.00458
Authors' language | Russianblog | -0.00229 | 0.01052 | -0.00179 | 0.00750 | -0.00590 | 0.00807 | -0.00182
Global similarity | ENENews | -0.00402 | 0.05604 | -0.01338 | 0.01581 | -0.01129 | 0.07086 | 0.02742
Global similarity | Russianblog | 0.01090 | 0.01443 | 0.00582 | 0.00897 | 0.00368 | -0.00317 | -0.00087
Frequent words | ENENews | 0.00252 | 0.00931 | 0.00170 | 0.00424 | -0.00132 | 0.00973 | 0.00292
Frequent words | Russianblog | 0.00008 | 0.00023 | 0.00016 | 0.00029 | 0.00034 | 0.00047 | -0.00011
Length ratio | ENENews | 0.04206 | -0.02984 | 0.04450 | 0.01700 | 0.06421 | -0.05336 | -0.00850
Length ratio | Russianblog | 0.00651 | -0.00355 | 0.00920 | 0.00080 | 0.00784 | -0.00592 | 0.00235
Authors' name | ENENews | 0.03753 | 0.06392 | 0.01920 | 0.03285 | 0.02341 | 0.05998 | 0.01788
Authors' name | Russianblog | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000
Frequent pattern | ENENews | -0.02028 | 0.02930 | -0.02357 | 0.00300 | -0.03793 | 0.03410 | 0.00526
Frequent pattern | Russianblog | 0.00458 | 0.06100 | 0.00830 | 0.05206 | -0.00545 | 0.06150 | 0.00404
Location prior | ENENews | 0.08778 | -0.03690 | 0.08556 | 0.03280 | 0.11792 | -0.07852 | 0.02925
Location prior | Russianblog | 0.16091 | 0.09671 | 0.17766 | 0.13515 | 0.16699 | 0.08793 | 0.02414
Candidate filtering rule 1 | ENENews | -0.05431 | 0.06138 | -0.04102 | 0.01296 | -0.06458 | 0.07545 | 0.06949
Candidate filtering rule 1 | Russianblog | -0.12506 | 0.34370 | -0.13916 | 0.27445 | -0.12683 | 0.37003 | 0.32546
Candidate filtering rule 2 | ENENews | 0.00014 | -0.00004 | 0.00048 | 0.00041 | -0.00038 | -0.00035 | -0.00009
Candidate filtering rule 2 | Russianblog | 0.02575 | 0.02475 | 0.02462 | 0.02341 | 0.01746 | 0.00802 | 0.00000
Candidate filtering rule 3 | ENENews | 0.00357 | -0.00121 | 0.00452 | 0.00254 | 0.00551 | -0.00161 | 0.00161
Candidate filtering rule 3 | Russianblog | 0.08521 | -0.01978 | 0.09636 | 0.01702 | 0.09380 | -0.03825 | 0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text-processing tools such as Wordnet and named entity recognition (as they are not available at the same quality for all languages). In future work, we would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 16: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

16 The Scientific World Journal

Table3Ex

ampleo

fcalculatio

nof

interrogativem

etric

sfor

thee

xamples

hownin

Figure

18

Evaluatio

nmetric

Calculation

Accuracy

edge

Replyrelatio

ns=1rarr

1198772rarr

1198777rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

14rarr

35rarr

26rarr

58rarr

5

Detectedrelatio

ns==1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

14rarr

15rarr

26rarr

57rarr

28rarr

7

8 11asymp073

Acc p

ath

|(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)|

Num

berof

comments

=8 11asymp073

119875119877-path

(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)+(4rarr

1rarr

1198777rarr

2rarr

1198778rarr

7rarr

2rarr

119877)

Num

berof

comments

=(1+1+1+1+1+1+1+1)+(1+12+23)

11

=1017

11

asymp092

119877119877-path

(1rarr

1198772rarr

1198779rarr

11987710rarr

11987711rarr

1198773rarr

1rarr

1198775rarr

2rarr

1198776rarr

5rarr

2rarr

119877)+(4rarr

3rarr

1rarr

1198777rarr

1198778rarr

5rarr

2rarr

119877)

Num

berof

comments

=(1+1+1+1+1+1+1+1)+(23+1+23)

11

=1034

11

asymp094

1198651-score119877-path

2lowast092lowast094

092+094

asymp093

The Scientific World Journal 17

R

9 10 11

3 5

8

21 7

4 6

(a)

R

9 10 11

3 7

8

21

4 5

6

(b)

Figure 18 An example of two threads (a) ground-truth thread and (b) the prediction thread

Table 4 Example of calculation of declarative metrics for the example shown in Figure 18

Evaluation metric Calculation

AccCTD

TP = 1 2 9 10 11TN = 3 4 5 6 8

FN = 7

FP =

5 + 5

5 + 5 + 1 + 0asymp 091

119875edge

119877edge

1198651edge

TP = 3 rarr 1 5 rarr 2 6 rarr 5 119875edge =3

3 + 3= 05

FN = 4 rarr 3 8 rarr 5 119877edge =3

3 + 2= 06

FP = 4 rarr 1 7 rarr 2 8 rarr 7 1198651 minus scoreedge asymp 055

119875path

119877path

1198651path

119875path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2 |=3

6= 05

119877path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2 |=3

5= 06

1198651-score119877-path asymp 055

119875119877-path

119877119877-path

1198651119877-path

119875119877-path =

3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2

| 3 4 5 6 7 8 |=1 + 1 + 1 + 1 + 0 + 12

6=45

6= 075

119877119877-path =

3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2

| 3 4 5 6 8 |=1 + 12 + 1 + 1 + 12

5=4

5= 08

1198651-score119877-path =

2 lowast 075 lowast 08

075 + 08asymp 077

According to Accpath First has better performance inThestandard ENENews and Alef because more commentsare linked directly to the root Usually SLARTS and Seo etalrsquos methods cannot predict the full path inThestandard andENENews because according to Figure 15 paths are verylong and complex in these datasets

As wementioned earlier interrogative evaluationmetricsare not appropriate for declarative threads because based onthese metrics First shows high performance although thisbaseline does not detect any comment-comment relations

Table 6 shows the results when we use declarative evaluationmetrics proposed in Section 52

According to Table 6 for 119865edge SLARTS performs betterthan Seo et alrsquos method However for119875edge Seo et alrsquos methodperforms better but its 119877edge is lower than SLARTS in all da-tasets

It is important to say that when the difference between119875edge and 119877edge is high and 119875edge is greater than 119877edge themethod most likely connects comments to the root and itdoes not appropriately detect comment-comment relations

18 The Scientific World Journal

J(1)M(1)

V(1)

H(1)

A(1)

R(1)

G(1)

N(1)D(1)

E(1)

C(1)

P(2)

U(1)

S(1)

T(2)

T(1)

M(2)

L(1)

Q(1)

C(2)

K(1)

F(1)L(2)

C(3)

P(1)

A(2)

B(1)O(1)

I(1)

H(2)

R

(a)

J(1)

M(1)

R

A(1)

N(4)

D(1)

G(1)

I(1)

B(1)H(1)

O(1)

C(1)

N(2)

G(2)O(2)

P(1)

E(1)

D(2)

F(1)

E(2)F(2)

D(3)

L(1) F(3) N(1)

D(4)

F(4)K(1)D(5)

D(6)

N(3)

(b)

1)

1)

1)

1)

1)

1)

1)

1)1)

1)

1)

1)

R

F(2)

K(3)

R(1)

K(2)

M(2)A(3)

Q(1)

N(2)

K(4)A(2)

J(2)

1)

H(2)

H(3)

I(2)

I(1)

P(1)

O(1)

(c)

Figure 19 Three real threads in which nodes are labeled according to commenterrsquos name and number of its comments

(like First baseline) On the other hand when 119877edge isgreater than 119875edge the method does not appropriately detectcomment-root relations (like Last baseline)

So Seo et alrsquos method most likely connects comments tothe root and this is not appropriately detecting comment-comment relations On the other hands 119875edge and 119877edge ofSLARTS are close

Several features in SLARTS such as Global similarityfrequent words Frequent patterns and authorsrsquo languagefocus on detection of comment-comment relations Thismakes the results of SLARTS better than Seo et alrsquos methodin declarative evaluation metrics

We already saw that the average length of threads inThestandard and ENENews is longer than the other datasets(Table 2) and their paths are much longer and more complexthan the other datasets (Figure 16) According to 119865edge theaccuracy of Thestandard and ENENews is less than otherdatasets Note that 119865edge is a strict metric in declarativethreads

SLARTS has better 119865edge than Seo et alrsquos method in alldatasetsThemaximumdifference occurs inThestandard andRussianblog In these datasets many of the defined featureshave goodperformance (Importance of features in ENENewsand Russianblog datasets will be shown in Table 7 and we willexplain more in the next section)

The Scientific World Journal 19

Table 5 Experimental results with interrogative evaluation metrics

Datasets Methods Accedge Accpath 119875119877-path 119877

119877-path 1198651119877-path

Thestandard

SLARTS 04669plusmn0009

03059plusmn0013

06927plusmn0018

07450plusmn0010

07179plusmn0013

Seo et alrsquos [2] 03941plusmn0005

03413plusmn0006

08303plusmn0009

06563plusmn0005

07331plusmn0003

First 03630plusmn0010

03630plusmn0010

10000plusmn0000

05846plusmn0009

07379plusmn0007

Last 01824plusmn0006

00495plusmn0004

02214plusmn0008

10000plusmn0000

03625plusmn0010

ENENews

SLARTS 04837plusmn0018

03695plusmn0027

07625plusmn0026

07770plusmn0013

07697plusmn0017

Seo et alrsquos [2] 04661plusmn0017

04134plusmn0021

08720plusmn0011

07086plusmn0013

07818plusmn0011

First 04524plusmn0018

04524plusmn0018

10000plusmn0000

06634plusmn0016

07976plusmn0011

Last 02090plusmn0016

00596plusmn0003

02319plusmn0011

10000plusmn0000

03764plusmn0015

Courantblogs

SLARTS 06843plusmn0028

06453plusmn0033

09127plusmn0013

08619plusmn0014

08865plusmn0013

Seo et alrsquos [2] 06700plusmn0040

06628plusmn0043

09667plusmn0014

08250plusmn0018

08902plusmn0015

First 06490plusmn0036

06490plusmn0036

10000plusmn0000

08001plusmn0019

08889plusmn0012

Last 02037plusmn0017

01082plusmn0008

03183plusmn0011

10000plusmn0000

04829plusmn0012

Russianblog

SLARTS 07463plusmn0017

06904plusmn0017

08741plusmn0008

08787plusmn0006

08764plusmn0007

Seo et alrsquos [2] 05446plusmn0010

05174plusmn0011

09293plusmn0003

07821plusmn0004

08494plusmn0003

First 04968plusmn0006

04968plusmn0006

10000plusmn0000

07134plusmn0004

08327plusmn0003

Last 03632plusmn0018

01549plusmn0012

03695plusmn0015

10000plusmn0000

05395plusmn0016

Alef

SLARTS 07753plusmn0017

07641plusmn0018

09579plusmn0006

09033plusmn0008

09298plusmn0007

Seo et alrsquos [2] 07802plusmn0019

07800plusmn0019

09950plusmn0003

08828plusmn0010

09355plusmn0006

First 07853plusmn0018

07853plusmn0018

10000plusmn0000

08826plusmn0010

09376plusmn0006

Last 00993plusmn0005

00822plusmn0005

02526plusmn0012

10000plusmn0000

04032plusmn0015

The bold style shows the best result in each metric

As shown in Table 6 Russianblog has better resultsthan the other datasets in all metrics The main reason isthat its comments are not confirmed by a moderator Thiscauses Accedge of Last baseline in Russianblog to be equalto 03632 that is more than other datasets (Table 5) Accedgeof Last baseline has inverse relationship with complexity ofreplies Also we showed in Figure 13 that the Russianbloghas the lowest time difference between a comment and itsparent When time difference between a child and its parentdecreases detection of reply relationwould be easier In otherwords as the parent appears to be closer to its child somefeatures such as frequent pattern and location Prior that are

based on position of comments in chronological order workbetter

The difference between 119865119877-path and 119865path is about 20

in Thestandard and ENENews where threads are larger andpaths have more complex structures

The minimum difference between the results of SLARTSand Seo et alrsquos methods appears in Alef datasets In Alef manyrelations are comment root and many comments do not haveauthorrsquos name which make the features perform poorly

Since SLARTS method has features which speciallyfocused on detecting comment-root relations (eg by addingcandidate filtering) AccCTD of SLARTS is better than Seo

20 The Scientific World Journal

et al method in all datasets The best result is 091 forRussianblog and the worst result is 069 for ENENewsThe root of ENENews dataset is usually a tweet Accordingto Table 2 it makes the average length of the rootrsquos textto be shorter than the other datasets and this makes thetextual features perform poorly on detecting comment-rootrelations

Confidence intervals for the Alef and Courantblogs datasets are wider than for the other datasets because many of their relations are comment-root (Figure 16). This biases the ranking SVM towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute the P-value to assess the significance of the differences between the SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates by their scores and selects the first candidate from the list, only the first candidate matters, so p1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.
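
The text does not spell out the specific statistical test used here; as an illustration only, a paired test over per-thread p1 scores could be run as in the sketch below. The variable names and placeholder values are ours, and the choice of a paired t-test is an assumption, not the authors' stated procedure.

```python
# Hedged sketch: paired significance test over per-thread p1 scores.
# p1_slarts / p1_seo are hypothetical lists holding, for each test thread,
# the fraction of comments whose top-ranked candidate parent is correct.
from scipy import stats

p1_slarts = [0.71, 0.64, 0.80, 0.55, 0.73]   # placeholder values
p1_seo    = [0.52, 0.60, 0.66, 0.41, 0.58]   # placeholder values

t_stat, p_value = stats.ttest_rel(p1_slarts, p1_seo)
print(f"paired t-test: t = {t_stat:.3f}, p = {p_value:.5f}")
if p_value < 0.005:
    print("difference is significant at the 0.005 level")
```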

5.4. Evaluation of the Features. In this section, we evaluate the role of the introduced features using backward feature selection: to measure the importance of a feature, we use all features except that one, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence of a feature and in its absence is reported in Table 7.
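
In outline, this backward (leave-one-feature-out) procedure can be scripted as below. The feature names mirror those discussed in this section, while train_and_evaluate is only a stand-in for the paper's ranking-SVM training and evaluation pipeline.

```python
# Hedged sketch of the leave-one-feature-out ablation described in the text:
# retrain without each feature and report how much each metric drops.
ALL_FEATURES = [
    "similarity", "authors_language", "global_similarity", "frequent_words",
    "length_ratio", "authors_name", "frequent_patterns", "location_prior",
    "candidate_filtering_1", "candidate_filtering_2", "candidate_filtering_3",
]

def train_and_evaluate(features):
    """Placeholder: train the ranking SVM with `features` and evaluate it.
    Here it returns dummy numbers so the sketch runs end to end."""
    score = 0.5 + 0.02 * len(features)
    return {"P_edge": score, "R_edge": score, "AccCTD": score}

baseline = train_and_evaluate(ALL_FEATURES)
for feature in ALL_FEATURES:
    reduced = [f for f in ALL_FEATURES if f != feature]
    without = train_and_evaluate(reduced)
    # Positive difference => the feature helps that metric (as in Table 7).
    diffs = {m: round(baseline[m] - without[m], 5) for m in baseline}
    print(feature, diffs)
```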

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3; these mostly tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these mostly tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.
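
As a reminder of what precision and recall refer to in this discussion, the sketch below computes edge-level precision and recall over comment-comment relations only (comment-root links are scored separately), in the spirit of the paper's worked example for the declarative metrics. The parent maps and root id are illustrative, not taken from the datasets.

```python
# Hedged sketch of the edge-level declarative metrics: only edges whose
# parent is another comment (not the root) are counted, so a method that
# links everything to the root scores low recall here.
ROOT = 0  # id of the article (tree root)

def edge_precision_recall(gold_parent, pred_parent):
    """gold_parent / pred_parent map comment id -> parent id."""
    gold_cc = {(c, p) for c, p in gold_parent.items() if p != ROOT}
    pred_cc = {(c, p) for c, p in pred_parent.items() if p != ROOT}
    tp = len(gold_cc & pred_cc)
    precision = tp / len(pred_cc) if pred_cc else 0.0
    recall = tp / len(gold_cc) if gold_cc else 0.0
    return precision, recall

# Toy thread: ground-truth links vs. predicted links.
gold = {1: ROOT, 2: ROOT, 3: 1, 4: 3, 5: 2, 6: 5, 7: ROOT, 8: 5}
pred = {1: ROOT, 2: ROOT, 3: 1, 4: 1, 5: 2, 6: 5, 7: 2, 8: 7}
print(edge_precision_recall(gold, pred))  # -> (0.5, 0.6)
```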

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15), and ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the comments' text is very short (about 18 words on average, according to Table 2) compared with the other datasets, the improvement from textual features is low. To increase the performance of textual features in Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature has a rather good impact on AccCTD.
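
The exact similarity feature is defined earlier in the paper; as a rough, language-agnostic illustration of the kind of textual similarity involved, a TF-IDF cosine similarity between a comment and its candidate parents could be computed as below. scikit-learn is used here only for illustration, and the plain tokenization (no stemming or Wordnet) is exactly what limits such features on Russian text.

```python
# Hedged sketch: generic TF-IDF cosine similarity between a comment and its
# candidate parents; the paper's own similarity feature may differ in detail.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def similarity_to_candidates(comment_text, candidate_texts):
    vec = TfidfVectorizer()  # default tokenization, no stemming
    matrix = vec.fit_transform([comment_text] + candidate_texts)
    # Row 0 is the comment; the remaining rows are the candidate parents.
    return cosine_similarity(matrix[0], matrix[1:])[0]

scores = similarity_to_candidates(
    "the reactor readings look wrong",
    ["new readings from the reactor were posted", "nice weather today"],
)
print(scores)  # higher score for the textually closer candidate
```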

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements from this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature brings a small improvement in Russianblog, like the similarity feature. In ENENews, it improves the recall of the evaluation metrics by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful in all evaluation metrics. Its value in ENENews is higher than in Russianblog because authors' names in Russianblog differ from the other datasets: each is an e-mail address, and no one would refer to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. It improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and yields a good improvement in AccCTD. According to P_edge, the best improvement is 0.16091, obtained for Russianblog, since its comments are not moderated.
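
The precise definition of the location prior is given earlier in the paper. One plausible realization of such a position-based prior, shown here purely as an assumption-laden sketch, is to estimate from training threads how often the parent is the article root or sits k positions before its child in chronological order.

```python
# Hedged sketch of a position-based "location prior": estimate from training
# threads how often the true parent is the root or lies k positions before
# the child, then use the empirical distribution as a feature value.
from collections import Counter

def fit_location_prior(threads):
    """`threads` is a list of dicts mapping comment position (1-based,
    chronological) -> parent position, with 0 standing for the root."""
    counts = Counter()
    total = 0
    for parent_of in threads:
        for child_pos, parent_pos in parent_of.items():
            key = "root" if parent_pos == 0 else child_pos - parent_pos
            counts[key] += 1
            total += 1
    return {key: n / total for key, n in counts.items()}

prior = fit_location_prior([
    {1: 0, 2: 1, 3: 0, 4: 2},   # toy thread: 2 replies to 1, 4 replies to 2
    {1: 0, 2: 0, 3: 2},
])
print(prior)  # e.g. {'root': 0.57, 1: 0.29, 2: 0.14}
```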

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and yields a good improvement in AccCTD. The maximum improvement for R_edge is 0.34, obtained for Russianblog.

Candidate filtering rule 2 brings small improvements in the precision and recall of the evaluation metrics in both datasets; the maximum improvement is obtained for Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, detection of comment-root relations is improved. It also improves AccCTD.

6. Information Extraction by Visualizing Thread Structure

In this section, we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3 except that the commenters and the number of comments they have posted are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows (a short code sketch after this list illustrates how these measures can be computed from the predicted parent links):

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread with short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments play an important role in conversations because users read them and then read the other comments to better understand the conversations.

(iii) Which comments have participated in this conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics. Columns: P_edge | R_edge | F_edge | P_path | R_path | F_path | P_R-path | R_R-path | F_R-path | AccCTD.

Thestandard
SLARTS            0.3782±0.008  0.4155±0.011  0.3960±0.006  0.1565±0.007  0.1713±0.008  0.1636±0.007  0.3628±0.012  0.4166±0.016  0.3876±0.005  0.7525±0.005
Seo et al.'s [2]  0.3014±0.031  0.1414±0.015  0.1921±0.016  0.1544±0.025  0.0576±0.011  0.0835±0.013  0.3013±0.038  0.1319±0.018  0.1831±0.021  0.5831±0.007
Difference        0.0778        0.2741        0.2039        0.0021        0.1137        0.0801        0.0615        0.2847        0.2045        0.1694

ENENews
SLARTS            0.3797±0.038  0.3891±0.019  0.3839±0.024  0.1873±0.040  0.1855±0.026  0.1857±0.030  0.3768±0.048  0.3768±0.017  0.3759±0.025  0.6872±0.014
Seo et al.'s [2]  0.4060±0.030  0.1713±0.016  0.2407±0.018  0.2378±0.034  0.0837±0.016  0.1235±0.021  0.3947±0.035  0.1523±0.015  0.2196±0.019  0.5954±0.009
Difference        -0.0263       0.2178        0.1432        -0.0505       0.1018        0.0622        -0.0179       0.2245        0.1563        0.0918

Courantblogs
SLARTS            0.5236±0.030  0.3963±0.073  0.4495±0.052  0.4220±0.057  0.3110±0.058  0.3566±0.049  0.5325±0.049  0.3853±0.075  0.4452±0.057  0.7625±0.033
Seo et al.'s [2]  0.7320±0.090  0.2049±0.062  0.3189±0.082  0.6815±0.105  0.1896±0.070  0.2951±0.093  0.7309±0.094  0.2010±0.065  0.3140±0.086  0.7001±0.031
Difference        -0.2084       0.1914        0.1306        -0.2595       0.1214        0.0615        -0.1984       0.1843        0.1312        0.0624

Russianblog
SLARTS            0.5940±0.028  0.5674±0.022  0.5803±0.023  0.4907±0.031  0.4664±0.020  0.4782±0.024  0.5878±0.028  0.5823±0.021  0.5849±0.020  0.9116±0.011
Seo et al.'s [2]  0.4705±0.020  0.3030±0.015  0.3685±0.016  0.3823±0.023  0.2534±0.013  0.3046±0.015  0.4682±0.020  0.2822±0.014  0.3521±0.015  0.5862±0.013
Difference        0.1235        0.2644        0.2118        0.1084        0.2130        0.1736        0.1196        0.3001        0.2328        0.3254

Alef
SLARTS            0.6189±0.028  0.4131±0.040  0.4952±0.037  0.5639±0.027  0.3819±0.037  0.4551±0.034  0.6277±0.027  0.4069±0.043  0.4934±0.039  0.8044±0.013
Seo et al.'s [2]  0.8827±0.052  0.2637±0.051  0.4045±0.060  0.8797±0.053  0.2631±0.051  0.4034±0.060  0.8840±0.052  0.2635±0.051  0.4043±0.060  0.7846±0.018
Difference        -0.2638       0.1494        0.0907        -0.3158       0.1188        0.0517        -0.2563       0.1434        0.0891        0.0198

The bold style (in the original table) shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have built long conversations. Such users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, which is known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2), H(2), H(3), I(2), O(1) and F(2), H(2), I(1) show two conversations in thread "C".
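
The measures listed above (tree height and width, starter comments, replies received per user, and the number of conversations attached directly to the root) can all be derived from the predicted parent links. The following sketch shows one way to compute them; the example thread and author labels are illustrative and are not taken from Figure 19.

```python
# Hedged sketch: derive the visualization-oriented measures discussed above
# from a predicted reply structure given as comment id -> parent id
# (with the article root as parent id 0).
from collections import defaultdict

ROOT = 0
parent = {1: ROOT, 2: 1, 3: 1, 4: ROOT, 5: 4, 6: 5, 7: ROOT}
author = {1: "D", 2: "A", 3: "B", 4: "A", 5: "D", 6: "C", 7: "E"}

children = defaultdict(list)
for child, par in parent.items():
    children[par].append(child)

def height(node):
    """Length of the longest reply chain rooted at `node`."""
    return 1 + max((height(c) for c in children[node]), default=0)

# Height and width hint at how controversial the root article is (item i).
print("height:", height(ROOT) - 1)        # longest conversation depth
print("width:", len(children[ROOT]))      # conversations about the root (item v)
print("starters:", children[ROOT])        # comments that start conversations (item ii)

# In-degree per user: how many replies each user's comments attract (item iv).
indegree = defaultdict(int)
for child, par in parent.items():
    if par != ROOT:
        indegree[author[par]] += 1
print("replies received per user:", dict(indegree))
```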

7. Conclusion and Future Work

In this paper, we proposed SLARTS, a method based on ranking SVM that predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. Also, we showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can be useful in interrogative threads.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help in understanding the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature. Columns: P_edge | R_edge | P_path | R_path | P_R-path | R_R-path | AccCTD.

Similarity
  ENENews      -0.00233   0.07415  -0.00538   0.03289  -0.01095   0.08996   0.04684
  Russianblog   0.00874   0.00012   0.00963   0.00347   0.00695  -0.01079   0.00355

Authors' language
  ENENews      -0.01354   0.03246  -0.01785   0.00693  -0.02757   0.04052   0.00458
  Russianblog  -0.00229   0.01052  -0.00179   0.00750  -0.00590   0.00807  -0.00182

Global similarity
  ENENews      -0.00402   0.05604  -0.01338   0.01581  -0.01129   0.07086   0.02742
  Russianblog   0.01090   0.01443   0.00582   0.00897   0.00368  -0.00317  -0.00087

Frequent words
  ENENews       0.00252   0.00931   0.00170   0.00424  -0.00132   0.00973   0.00292
  Russianblog   0.00008   0.00023   0.00016   0.00029   0.00034   0.00047  -0.00011

Length ratio
  ENENews       0.04206  -0.02984   0.04450   0.01700   0.06421  -0.05336  -0.00850
  Russianblog   0.00651  -0.00355   0.00920   0.00080   0.00784  -0.00592   0.00235

Authors' name
  ENENews       0.03753   0.06392   0.01920   0.03285   0.02341   0.05998   0.01788
  Russianblog   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000

Frequent pattern
  ENENews      -0.02028   0.02930  -0.02357   0.00300  -0.03793   0.03410   0.00526
  Russianblog   0.00458   0.06100   0.00830   0.05206  -0.00545   0.06150   0.00404

Location prior
  ENENews       0.08778  -0.03690   0.08556   0.03280   0.11792  -0.07852   0.02925
  Russianblog   0.16091   0.09671   0.17766   0.13515   0.16699   0.08793   0.02414

Candidate filtering rule 1
  ENENews      -0.05431   0.06138  -0.04102   0.01296  -0.06458   0.07545   0.06949
  Russianblog  -0.12506   0.34370  -0.13916   0.27445  -0.12683   0.37003   0.32546

Candidate filtering rule 2
  ENENews       0.00014  -0.00004   0.00048   0.00041  -0.00038  -0.00035  -0.00009
  Russianblog   0.02575   0.02475   0.02462   0.02341   0.01746   0.00802   0.00000

Candidate filtering rule 3
  ENENews       0.00357  -0.00121   0.00452   0.00254   0.00551  -0.00161   0.00161
  Russianblog   0.08521  -0.01978   0.09636   0.01702   0.09380  -0.03825   0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text-processing tools such as Wordnet and named entity recognition (as they are not available in the same quality for all languages). In the future, we would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the Reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Computation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of the Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.


Page 17: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

The Scientific World Journal 17

R

9 10 11

3 5

8

21 7

4 6

(a)

R

9 10 11

3 7

8

21

4 5

6

(b)

Figure 18 An example of two threads (a) ground-truth thread and (b) the prediction thread

Table 4 Example of calculation of declarative metrics for the example shown in Figure 18

Evaluation metric Calculation

AccCTD

TP = 1 2 9 10 11TN = 3 4 5 6 8

FN = 7

FP =

5 + 5

5 + 5 + 1 + 0asymp 091

119875edge

119877edge

1198651edge

TP = 3 rarr 1 5 rarr 2 6 rarr 5 119875edge =3

3 + 3= 05

FN = 4 rarr 3 8 rarr 5 119877edge =3

3 + 2= 06

FP = 4 rarr 1 7 rarr 2 8 rarr 7 1198651 minus scoreedge asymp 055

119875path

119877path

1198651path

119875path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2 |=3

6= 05

119877path =| 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 |

| 3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2 |=3

5= 06

1198651-score119877-path asymp 055

119875119877-path

119877119877-path

1198651119877-path

119875119877-path =

3 rarr 1 4 rarr 1 5 rarr 2 6 rarr 5 rarr 2 7 rarr 2 8 rarr 7 rarr 2

| 3 4 5 6 7 8 |=1 + 1 + 1 + 1 + 0 + 12

6=45

6= 075

119877119877-path =

3 rarr 1 4 rarr 3 rarr 1 5 rarr 2 6 rarr 5 rarr 2 8 rarr 5 rarr 2

| 3 4 5 6 8 |=1 + 12 + 1 + 1 + 12

5=4

5= 08

1198651-score119877-path =

2 lowast 075 lowast 08

075 + 08asymp 077

According to Accpath First has better performance inThestandard ENENews and Alef because more commentsare linked directly to the root Usually SLARTS and Seo etalrsquos methods cannot predict the full path inThestandard andENENews because according to Figure 15 paths are verylong and complex in these datasets

As wementioned earlier interrogative evaluationmetricsare not appropriate for declarative threads because based onthese metrics First shows high performance although thisbaseline does not detect any comment-comment relations

Table 6 shows the results when we use declarative evaluationmetrics proposed in Section 52

According to Table 6 for 119865edge SLARTS performs betterthan Seo et alrsquos method However for119875edge Seo et alrsquos methodperforms better but its 119877edge is lower than SLARTS in all da-tasets

It is important to say that when the difference between119875edge and 119877edge is high and 119875edge is greater than 119877edge themethod most likely connects comments to the root and itdoes not appropriately detect comment-comment relations

18 The Scientific World Journal

J(1)M(1)

V(1)

H(1)

A(1)

R(1)

G(1)

N(1)D(1)

E(1)

C(1)

P(2)

U(1)

S(1)

T(2)

T(1)

M(2)

L(1)

Q(1)

C(2)

K(1)

F(1)L(2)

C(3)

P(1)

A(2)

B(1)O(1)

I(1)

H(2)

R

(a)

J(1)

M(1)

R

A(1)

N(4)

D(1)

G(1)

I(1)

B(1)H(1)

O(1)

C(1)

N(2)

G(2)O(2)

P(1)

E(1)

D(2)

F(1)

E(2)F(2)

D(3)

L(1) F(3) N(1)

D(4)

F(4)K(1)D(5)

D(6)

N(3)

(b)

1)

1)

1)

1)

1)

1)

1)

1)1)

1)

1)

1)

R

F(2)

K(3)

R(1)

K(2)

M(2)A(3)

Q(1)

N(2)

K(4)A(2)

J(2)

1)

H(2)

H(3)

I(2)

I(1)

P(1)

O(1)

(c)

Figure 19 Three real threads in which nodes are labeled according to commenterrsquos name and number of its comments

(like First baseline) On the other hand when 119877edge isgreater than 119875edge the method does not appropriately detectcomment-root relations (like Last baseline)

So Seo et alrsquos method most likely connects comments tothe root and this is not appropriately detecting comment-comment relations On the other hands 119875edge and 119877edge ofSLARTS are close

Several features in SLARTS such as Global similarityfrequent words Frequent patterns and authorsrsquo languagefocus on detection of comment-comment relations Thismakes the results of SLARTS better than Seo et alrsquos methodin declarative evaluation metrics

We already saw that the average length of threads inThestandard and ENENews is longer than the other datasets(Table 2) and their paths are much longer and more complexthan the other datasets (Figure 16) According to 119865edge theaccuracy of Thestandard and ENENews is less than otherdatasets Note that 119865edge is a strict metric in declarativethreads

SLARTS has better 119865edge than Seo et alrsquos method in alldatasetsThemaximumdifference occurs inThestandard andRussianblog In these datasets many of the defined featureshave goodperformance (Importance of features in ENENewsand Russianblog datasets will be shown in Table 7 and we willexplain more in the next section)

The Scientific World Journal 19

Table 5 Experimental results with interrogative evaluation metrics

Datasets Methods Accedge Accpath 119875119877-path 119877

119877-path 1198651119877-path

Thestandard

SLARTS 04669plusmn0009

03059plusmn0013

06927plusmn0018

07450plusmn0010

07179plusmn0013

Seo et alrsquos [2] 03941plusmn0005

03413plusmn0006

08303plusmn0009

06563plusmn0005

07331plusmn0003

First 03630plusmn0010

03630plusmn0010

10000plusmn0000

05846plusmn0009

07379plusmn0007

Last 01824plusmn0006

00495plusmn0004

02214plusmn0008

10000plusmn0000

03625plusmn0010

ENENews

SLARTS 04837plusmn0018

03695plusmn0027

07625plusmn0026

07770plusmn0013

07697plusmn0017

Seo et alrsquos [2] 04661plusmn0017

04134plusmn0021

08720plusmn0011

07086plusmn0013

07818plusmn0011

First 04524plusmn0018

04524plusmn0018

10000plusmn0000

06634plusmn0016

07976plusmn0011

Last 02090plusmn0016

00596plusmn0003

02319plusmn0011

10000plusmn0000

03764plusmn0015

Courantblogs

SLARTS 06843plusmn0028

06453plusmn0033

09127plusmn0013

08619plusmn0014

08865plusmn0013

Seo et alrsquos [2] 06700plusmn0040

06628plusmn0043

09667plusmn0014

08250plusmn0018

08902plusmn0015

First 06490plusmn0036

06490plusmn0036

10000plusmn0000

08001plusmn0019

08889plusmn0012

Last 02037plusmn0017

01082plusmn0008

03183plusmn0011

10000plusmn0000

04829plusmn0012

Russianblog

SLARTS 07463plusmn0017

06904plusmn0017

08741plusmn0008

08787plusmn0006

08764plusmn0007

Seo et alrsquos [2] 05446plusmn0010

05174plusmn0011

09293plusmn0003

07821plusmn0004

08494plusmn0003

First 04968plusmn0006

04968plusmn0006

10000plusmn0000

07134plusmn0004

08327plusmn0003

Last 03632plusmn0018

01549plusmn0012

03695plusmn0015

10000plusmn0000

05395plusmn0016

Alef

SLARTS 07753plusmn0017

07641plusmn0018

09579plusmn0006

09033plusmn0008

09298plusmn0007

Seo et alrsquos [2] 07802plusmn0019

07800plusmn0019

09950plusmn0003

08828plusmn0010

09355plusmn0006

First 07853plusmn0018

07853plusmn0018

10000plusmn0000

08826plusmn0010

09376plusmn0006

Last 00993plusmn0005

00822plusmn0005

02526plusmn0012

10000plusmn0000

04032plusmn0015

The bold style shows the best result in each metric

As shown in Table 6 Russianblog has better resultsthan the other datasets in all metrics The main reason isthat its comments are not confirmed by a moderator Thiscauses Accedge of Last baseline in Russianblog to be equalto 03632 that is more than other datasets (Table 5) Accedgeof Last baseline has inverse relationship with complexity ofreplies Also we showed in Figure 13 that the Russianbloghas the lowest time difference between a comment and itsparent When time difference between a child and its parentdecreases detection of reply relationwould be easier In otherwords as the parent appears to be closer to its child somefeatures such as frequent pattern and location Prior that are

based on position of comments in chronological order workbetter

The difference between 119865119877-path and 119865path is about 20

in Thestandard and ENENews where threads are larger andpaths have more complex structures

The minimum difference between the results of SLARTSand Seo et alrsquos methods appears in Alef datasets In Alef manyrelations are comment root and many comments do not haveauthorrsquos name which make the features perform poorly

Since SLARTS method has features which speciallyfocused on detecting comment-root relations (eg by addingcandidate filtering) AccCTD of SLARTS is better than Seo

20 The Scientific World Journal

et al method in all datasets The best result is 091 forRussianblog and the worst result is 069 for ENENewsThe root of ENENews dataset is usually a tweet Accordingto Table 2 it makes the average length of the rootrsquos textto be shorter than the other datasets and this makes thetextual features perform poorly on detecting comment-rootrelations

Confidence intervals of Alef andCourantblog datasets arehigher than the other datasets becausemany of their relationsare comment root (Figure 16) This makes ranking SVM tobe bias towards connecting comments to the root especiallywhen a thread includes very few comment-comment rela-tions

We compare 119875-value to specify the significance of differ-ences between SLARTS and Seo et alrsquosmethods on declarativemetrics Since ranking SVM ranks candidates based on theirscore and selects the first candidate from the list only thefirst candidate is important So p1 is computed The resultsindicate that all improvements are statistically significant (119875-value lt 0005) in all datasets

54 Evaluation of the Features In this section we evaluatethe role of the introduced features We use backward featureselection That means to measure the importance of a featurewe use all features except that feature repeat the experimentsin its absence and compare the results to the case whereall features are present The difference between the values ofmetrics in presence of a feature and its absence is reported inTable 7

It is seen that some features improve precision of themetrics for example location prior and candidate filteringrule 3 where they most likely tend to detect comment-rootrelations Some features improve recall of the metrics such asauthorsrsquo language global similarity candidate filtering rule1 and frequent patterns These features most likely tendto detect comment-comment relations Some features affectboth precision and recall for example authorsrsquo name

As stated earlier Russianblog has different behaviourin comparison with other datasets (Figures 12 13 and 15)Also ENENews has larger threads and more complex paths(Figure 16 and Table 2) So we use these datasets to evaluatefeatures in depth

Similarity feature improves recall of evaluation metricsSince Russian language is known as a morphologically richlanguage and the length of commentsrsquo text is very short(about average 18 words according to Table 2) in comparisonwith other datasets improvement of textual features is lowTo increase the performance of textual feature in Russianwe need a stemmer a Wordnet and a tokenizer Similarityfeature has rather a good impact on AccCTD

Authorsrsquo language feature improves recall of evaluationmetrics According to 119877edge the improvements of this featureare 003 and 001 in ENENews and Russianblog respectively

Global similarity feature improves both precision andrecall in Russianblog and recall in ENENews

The frequent words feature has a small improve-ment in Russianblog like similarity feature This feature

improves recall of the evaluation metrics in ENEnews about001

Length ratio feature improves precision in both datasetsHowever since longer comments in ENENews have manychildren this feature is more prominent there

Authorsrsquo name feature is useful in all evaluation metricsThe value of this feature in ENENews is more than Russian-blog because authorsrsquo name in Russianblog is different fromother datasets it is an email and no one would refer to it as aname

The frequent patterns feature focuses on detectingcomment-comment relations This feature improves recall ofevaluation metrics in both datasets

The location prior feature improves precision in bothdatasets This feature has a good improvement on AccCTDAccording to 119875edge the best improvement is 016091 forRussianblog since comments are not moderated

The candidate filtering rule 1 improves recall of evaluationmetrics in both datasets This feature removes the rootcandidate accurately This feature has a good improvementon AccCTD The maximum improvement for 119877edge is 034 forRussianblog

The candidate filtering rule 2 has small improvements onprecision and recall of evaluation metrics in both datasetsMaximum improvement is gained with Russianblog

Finally The candidate filtering rule 3 improves precisionof metrics in both datasets This feature removes all can-didates except the root Thus detection of comment-rootrelations is improved Also this feature improves AccCTD

6 Information Extraction byVisualizing Threads Structure

In this section we discuss the information which can beextracted from hierarchical structure of threads Figure 19 isthe same as Figure 3 except that the commenters and theirnumber of posted comments are shown as the nodesrsquo labelfor example ldquo119863(3)rdquo is the third comment sent by user ldquo119863rdquoVisualization of the structure reveals valuable information asfollowing

(i) Is the root article controversial The threads ldquo119861rdquo andldquo119862rdquo in Figure 19 aremore controversial than ldquo119860rdquo Visi-tors have expressed different opinions and sentimentsabout the root in ldquo119861rdquo and ldquo119862rdquo leading to formationof longer conversations The thread ldquo119860rdquo shows acommon thread which has short conversations aboutthe root The height and width of trees can help as ameasure to recognize whether a root is controversialor not

(ii) Which comments are starters [36] Starter commentshave important role in the conversations becauseusers read them and then read other comments tobetter understand the conversations

(iii) Which comments have participated in this conversa-tion Users can follow the conversations that seeminteresting to them without reading any unrelatedcomment

The Scientific World Journal 21

Table 6 Experimental results with declarative evaluation metrics

Datasets Method 119875edge 119877edge 119865edge 119875path 119877path 119865path 119875119877-path 119877

119877-path 119865119877-path AccCTD

ThestandardSLARTS 03782

plusmn000804155plusmn0011

03960plusmn0006

01565plusmn0007

01713plusmn0008

01636plusmn0007

03628plusmn0012

04166plusmn0016

03876plusmn0005

07525plusmn0005

Seo et alrsquos[2]

03014plusmn0031

01414plusmn0015

01921plusmn0016

01544plusmn0025

00576plusmn0011

00835plusmn0013

03013plusmn0038

01319plusmn0018

01831plusmn0021

05831plusmn0007

Difference 00778 02741 02039 00021 01137 00801 00615 02847 02045 01694

ENENewsSLARTS 03797

plusmn003803891plusmn0019

03839plusmn0024

01873plusmn0040

01855plusmn0026

01857plusmn0030

03768plusmn0048

03768plusmn0017

03759plusmn0025

06872plusmn0014

Seo et alrsquos[2]

04060plusmn0030

01713plusmn0016

02407plusmn0018

02378plusmn0034

00837plusmn0016

01235plusmn0021

03947plusmn0035

01523plusmn0015

02196plusmn0019

05954plusmn0009

Difference minus00263 02178 01432 minus00505 01018 00622 minus00179 02245 01563 00918

CourantblogsSLARTS 05236

plusmn003003963plusmn0073

04495plusmn0052

04220plusmn0057

03110plusmn0058

03566plusmn0049

05325plusmn0049

03853plusmn0075

04452plusmn0057

07625plusmn0033

Seo et alrsquos[2]

07320plusmn0090

02049plusmn0062

03189plusmn0082

06815plusmn0105

01896plusmn0070

02951plusmn0093

07309plusmn0094

02010plusmn0065

03140plusmn0086

07001plusmn0031

Difference minus02084 01914 01306 minus02595 01214 00615 minus01984 01843 01312 00624

RussianblogSLARTS 05940

plusmn002805674plusmn0022

05803plusmn0023

04907plusmn0031

04664plusmn0020

04782plusmn0024

05878plusmn0028

05823plusmn0021

05849plusmn0020

09116plusmn0011

Seo et alrsquos[2]

04705plusmn0020

03030plusmn0015

03685plusmn0016

03823plusmn0023

02534plusmn0013

03046plusmn0015

04682plusmn0020

02822plusmn0014

03521plusmn0015

05862plusmn0013

Difference 01235 02644 02118 01084 0213 01736 01196 03001 02328 03254

AlefSLARTS 06189

plusmn002804131plusmn0040

04952plusmn0037

05639plusmn0027

03819plusmn0037

04551plusmn0034

06277plusmn0027

04069plusmn0043

04934plusmn0039

08044plusmn0013

Seo et alrsquos[2]

08827plusmn0052

02637plusmn0051

04045plusmn0060

08797plusmn0053

02631plusmn0051

04034plusmn0060

08840plusmn0052

02635plusmn0051

04043plusmn0060

07846plusmn0018

Difference minus02638 01494 00907 minus03158 01188 00517 minus02563 01434 00891 00198The bold style shows the best result in each metric

(iv) Who plays an important role in a discussion Someusers playmore important roles in conversations thanother users For example users ldquo119863(1)rdquo in threadldquoBrdquo and ldquo119860(1)rdquo in thread ldquoCrdquo have made a longconversation These users are known as a hub or astarter in a thread and their degree of importance canbe compared according to their indegree [39] Manyanalyses can be performed on this structure which isknown as popularity features [12]

(v) How many conversations are about the root Thereare thirteen conversations which are replyingdirectly to the root in thread ldquoCrdquo for exampleF(2)H(2)H(3) I(2)O(1)orF(2)H(2) l(1) showtwo conversations in thread ldquo119862rdquo

7 Conclusion and Future Work

In this paper we proposed SLARTS a method based onranking SVMwhich predicts the tree-like structure of declar-ative threads for example blogs and online news agenciesWe emphasized on the differences between declarative andinterrogative threads and showed that many of the previouslyproposed features perform poorly on declarative threadsbecause of this Instead we defined a set of novel textual andnontextual features and used a ranking SVM algorithm tocombine these features

We detect two types of reply relations in declarativethreads comment-root relation and comment-commentrelation An appropriate method should be able to detectboth types of reply relations and an appropriatemetric shouldconsider both types of reply relations So in order to havefair judge on the quality of the predicted structures wemodified the evaluation metrics accordingly We also defineda novel metric that measures the accuracy of comment typedetection

The results of our experiments showed that accord-ing to declarative evaluation metrics our method showshigher accuracy in comparison with the baselines on fivedatasets Also we showed that all improvements in detectingcomment-comment relations are statistically significant in alldatasets

We believe that the defined features are not limited todeclarative threads Some features such as authorrsquos languageand frequent patterns extract relations among users can beuseful in interrogative threads

For future we would like to use the proposed methodon interrogative platforms such as forums Also we wouldlike to analyze the tree-like reply structure deeply and webelieve it can provide valuable information for example tohelp in understanding the role of users in discussions or tofind important and controversial comments among a largevolume of comments

22 The Scientific World Journal

Table 7 The difference between the evaluation metrics in presence and absence of features

Feature Dataset Evaluation metric119875edge 119877edge 119875Path 119877Path 119875

119877-path 119877119877-path AccCTD

Similarity ENENews minus000233 007415 minus000538 003289 minus001095 008996 004684Russianblog 000874 000012 000963 000347 000695 minus001079 000355

Authorsrsquo language ENENews minus001354 003246 minus001785 000693 minus002757 004052 000458Russianblog minus000229 001052 minus000179 000750 minus000590 000807 minus000182

Global similarity ENENews minus000402 005604 minus001338 001581 minus001129 007086 002742Russianblog 001090 001443 000582 000897 000368 minus000317 minus000087

Frequent words ENENews 000252 000931 000170 000424 minus000132 000973 000292Russianblog 000008 000023 000016 000029 000034 000047 minus000011

Length ratio ENENews 004206 minus002984 004450 001700 006421 minus005336 minus000850Russianblog 000651 minus000355 000920 000080 000784 minus000592 000235

Authorsrsquo name ENENews 003753 006392 001920 003285 002341 005998 001788Russianblog 000000 000000 000000 000000 000000 000000 000000

Frequent pattern ENENews minus002028 002930 minus002357 000300 minus003793 003410 000526Russianblog 000458 006100 000830 005206 minus000545 006150 000404

Location prior ENENews 008778 minus003690 008556 003280 011792 minus007852 002925Russianblog 016091 009671 017766 013515 016699 008793 002414

Candidate filtering rule 1 ENENews minus005431 006138 minus004102 001296 minus006458 007545 006949Russianblog minus012506 034370 minus013916 027445 minus012683 037003 032546

Candidate filtering rule 2 ENENews 000014 minus000004 000048 000041 minus000038 minus000035 minus000009Russianblog 002575 002475 002462 002341 001746 000802 000000

Candidate filtering rule 3 ENENews 000357 minus000121 000452 000254 000551 minus000161 000161Russianblog 008521 minus001978 009636 001702 009380 minus003825 007371

Since our goal is to provide a language-independentmodel to reconstruct the thread structure we did not usetext processing tools such as Wordnet and Named EntityRecognition (as they are not available in the same quality forall languages) We would like to focus on English and usethese tools to find out whether they can improve the accuracyof RTS method or not

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] Y Wu Q Ye L Li and J Xiao ldquoPower-law properties ofhuman view and reply behavior in online societyrdquoMathematicalProblems in Engineering vol 2012 Article ID 969087 7 pages2012

[2] J Seo W B Croft and D A Smith ldquoOnline community searchusing conversational structuresrdquo Information Retrieval vol 14no 6 pp 547ndash571 2011

[3] HWang CWang C X Zhai and J Han ldquoLearning online dis-cussion structures by conditional random fieldsrdquo in Proceedingsof the 34th International ACM SIGIR Conference on Researchand Development in Information Retrieval (SIGIR rsquo11) pp 435ndash444 July 2011

[4] E Aumayr J Chan and C Hayes ldquoReconstruction of threadedconversations in online discussion forumsrdquo inProceedings of the

5th International AAAI Conference onWeblogs and SocialMedia(ICWSM rsquo11) pp 26ndash33 2011

[5] D Shen Q Yang J-T Sun and Z Chen ldquoThread detactionin dynamic text message streamsrdquo in Proceedings of the 29thAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval pp 35ndash42 Seattle WashUSA August 2006

[6] M Dehghani M Asadpour and A Shakery ldquoAn evolutionary-based method for reconstructing conversation threads in emailcorporardquo in Proceedings of the IEEEACM International Con-ference on Advances in Social Networks Analysis and Mining(ASONAM rsquo12) 2012

[7] Y Jen-Yuan ldquoEmail thread reassembly using similarity match-ingrdquo in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS rsquo06) Muntain View Calif USA 2006

[8] S Gottipati D Lo and J Jiang ldquoFinding relevant answersin software forumsrdquo in Proceedings of the 26th IEEEACMInternational Conference on Automated Software Engineering(ASE rsquo11) pp 323ndash332 November 2011

[9] H Duan and C Zhai ldquoExploiting thread structures to improvesmoothing of language models for forum post retrievalrdquo inAdvances in Information Retrieval P Clough C Foley CGurrin et al Eds vol 6611 pp 350ndash361 Springer BerlinGermany 2011

[10] T Georgiou M Karvounis and Y Ioannidis ldquoExtractingtopics of debate between users on web discussion boardsrdquo inProceedings of the 1st International Conference forUndergraduateand Postgraduate Students in Research Poster Competition 2010

[11] Y C Wang M Joshi W Cohen and C P Rose ldquoRecoveringimplicit thread structure in newsgroup style conversationsrdquo

The Scientific World Journal 23

in Proceedings of the International Conference on Weblogs andSocial Media Seattle Wash USA 2008

[12] J Chan C Hayes and E M Daly ldquoDecomposing discussionforums and boards using user rolesrdquo in Proceedings of the 4thInternational AAAI Conference on Weblogs and Social Media2010

[13] Y-C Wang and C P Rose ldquoMaking conversational structureexplicit identification of initiation-response pairs within onlinediscussionsrdquo in Proceedings of the Human Language Technolo-gies Conference ofthe North American Chapter of the Associationfor Computational Linguistics pp 673ndash676 June 2010

[14] LWangM Lui S N Kim J Nivre and T Baldwin ldquoPredictingthread discourse structure over technical web forumsrdquo inProceedings of the Conference on Empirical Methods in NaturalLanguage Processing (EMNLP rsquo11) pp 13ndash25 Edinburgh UKJuly 2011

[15] A Balali H FailiMAsadpour andMDehghani ldquoa supervisedapproach for reconstructing thread structure in commentson blogs and online news agenciesrdquo in Proceedings of the14th International Conference on Intelligent Text Processing andComputational Linguistics Samos Greece 2013

[16] P Adams and C Martel ldquoConversational thread extraction andtopic detection in text-based chatrdquo in Semantic Computing pp87ndash113 John Wiley amp Sons 2010

[17] P H Adams and C H Martell ldquoTopic detection and extractionin chatrdquo in Proceedings of the 2nd Annual IEEE InternationalConference on Semantic Computing (ICSC rsquo08) pp 581ndash588August 2008

[18] C Lin J M Yang R Cai X J Wang W Wang and L ZhangldquoModeling semantics and structure of discussion threadsrdquo inProceedings of the 18th International Conference on World WideWeb pp 1103ndash1104 2009

[19] A Schuth MMarx andM d Rijke ldquoExtracting the discussionstructure in comments on news-articlesrdquo in Proceedings of the9th ACM International Workshop onWeb Information and DataManagement (WIDM rsquo07) Lisboa Portugal November 2007

[20] C Wang J Han Q Li X Li W P Lin and H Ji ldquoLearninghierarchical relationships among partially ordered objects withheterogeneous attributes and linksrdquo in Proceedings of the 12thSIAM International Conference on Data Mining pp 516ndash527Anaheim Calif USA 2012

[21] M Dehghani A Shakery M Asadpour and A KoushkestanildquoA learning approach for email conversation thread reconstruc-tionrdquo Journal of Information Science vol 39 no 6 pp 846ndash8632013

[22] A Balali A Rajabi S Ghassemi M Asadpour and H FailildquoContent diffusion prediction in social networksrdquo in Pro-ceedings of the 5th Conference on Information and KnowledgeTechnology (IKT rsquo13) pp 467ndash471 2013

[23] G Mishne and N Glance ldquoLeave a reply an analysis of weblogcommentsrdquo in Proceedings of the 3rd Annual Workshop onthe Weblogging ecosystem Aggregation Analysis and DynamicsEdinburgh UK 2006

[24] V Gomez A Kaltenbrunner and V Lopez ldquoStatistical analysisof the social network and discussion threads in Slashdotrdquo inProceedings of the 17th International Conference on World WideWeb (WWW rsquo08) pp 645ndash654 April 2008

[25] T Weninger X A Zhu and J Han ldquoAn exploration ofdiscussion threads in social news sites a case study of the redditcommunityrdquo 2013

[26] D Laniado R Tasso Y Volkovich and A KaltenbrunnerldquoWhen the Wikipedians talk network and tree structure of

Wikipedia discussion pagesrdquo in Proceedings of the 5th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo11) 2011

[27] A Grabowski ldquoHuman behavior in online social systemsrdquoTheEuropean Physical Journal B vol 69 no 4 pp 605ndash611 2009

[28] S N Kim L Wang and T Baldwin ldquoTagging and linkingweb forum postsrdquo in Proceedings of the 40th Conference onComputational Natural Language Learning Uppsala Sweden2010

[29] J Andreas S Rosenthal and K McKeown ldquoAnnotating agree-ment and disagreement in threaded discussionrdquo in Proceedingsof the 8th International Conference on Language Resources andComputation (LREC rsquo12) pp 818ndash822 Istanbul Turkey May2012

[30] A Stolcke K Ries N Coccaro et al ldquoDialogue actmodeling forautomatic tagging and recognition of conversational speechrdquoComputational Linguistics vol 26 no 3 pp 339ndash373 2000

[31] E N Forsyth Improving Automated Lexical and DiscourseAnalysis of Online Chat Dialog Naval Postgraduate SchoolMonterey Calif USA 2007

[32] D Feng E Shaw J Kim and E Hovy ldquoAn intelligentdiscussion-bot for answering student queries in threaded dis-cussionsrdquo in Proceedings of the 11th international Conference onIntelligent User Interfaces pp 171ndash177 February 2006

[33] F Hu T Ruan Z Shao and J Ding ldquoAutomatic web informa-tion extraction based on rulesrdquo in Proceedings of the Web Infor-mation System Engineering (WISE rsquo11) pp 265ndash272 Springer2011

[34] F Hu T Ruan and Z Shao ldquoComplete-thread extraction fromweb forumsrdquo inWebTechnologies andApplications pp 727ndash734Springer 2012

[35] D Feng J Kim E Shaw and E Hovy ldquoTowards modelingthreaded discussions using induced ontology knowledgerdquo inProceedings of the National Conference on Artificial Intelligencepp 1289ndash1294 July 2006

[36] G D Venolia and C Neustaedter ldquoUnderstanding sequenceand reply relationships within email conversations a mixed-model visualizationrdquo in Proceedings of the SIGCHI Conferenceon Human Factors in Computing Systems pp 361ndash368 April2003

[37] T Joachims ldquoTraining linear SVMs in linear timerdquo in Pro-ceedings of the 12th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 217ndash226 Philadel-phia Pa USA August 2006

[38] RM Fano ldquoTransmission of information a statistical theory ofcommunicationsrdquo American Journal of Physics vol 29 pp 793ndash794 1961

[39] T Opsahl F Agneessens and J Skvoretz ldquoNode centrality inweighted networks generalizing degree and shortest pathsrdquoSocial Networks vol 32 no 3 pp 245ndash251 2010

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 18: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

18 The Scientific World Journal

J(1)M(1)

V(1)

H(1)

A(1)

R(1)

G(1)

N(1)D(1)

E(1)

C(1)

P(2)

U(1)

S(1)

T(2)

T(1)

M(2)

L(1)

Q(1)

C(2)

K(1)

F(1)L(2)

C(3)

P(1)

A(2)

B(1)O(1)

I(1)

H(2)

R

(a)

J(1)

M(1)

R

A(1)

N(4)

D(1)

G(1)

I(1)

B(1)H(1)

O(1)

C(1)

N(2)

G(2)O(2)

P(1)

E(1)

D(2)

F(1)

E(2)F(2)

D(3)

L(1) F(3) N(1)

D(4)

F(4)K(1)D(5)

D(6)

N(3)

(b)

1)

1)

1)

1)

1)

1)

1)

1)1)

1)

1)

1)

R

F(2)

K(3)

R(1)

K(2)

M(2)A(3)

Q(1)

N(2)

K(4)A(2)

J(2)

1)

H(2)

H(3)

I(2)

I(1)

P(1)

O(1)

(c)

Figure 19 Three real threads in which nodes are labeled according to commenterrsquos name and number of its comments

(like the First baseline). On the other hand, when R_edge is greater than P_edge, the method does not appropriately detect comment-root relations (like the Last baseline).

So Seo et al.'s method most likely connects comments to the root and therefore does not appropriately detect comment-comment relations. In contrast, P_edge and R_edge of SLARTS are close to each other.

Several features in SLARTS, such as global similarity, frequent words, frequent patterns, and authors' language, focus on the detection of comment-comment relations. This makes the results of SLARTS better than those of Seo et al.'s method on the declarative evaluation metrics.

We already saw that the average length of threads in Thestandard and ENENews is longer than in the other datasets (Table 2) and that their paths are much longer and more complex than in the other datasets (Figure 16). According to F_edge, the accuracy on Thestandard and ENENews is lower than on the other datasets. Note that F_edge is a strict metric in declarative threads.

SLARTS has a better F_edge than Seo et al.'s method in all datasets. The maximum difference occurs in Thestandard and Russianblog; in these datasets many of the defined features perform well (the importance of the features in the ENENews and Russianblog datasets is shown in Table 7 and discussed in the next section).
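To make the edge-level metrics concrete, the following is a minimal sketch, not the authors' evaluation code, of how precision, recall, and F-measure over comment-comment relations (in the spirit of P_edge, R_edge, and F_edge) could be computed from a predicted and a gold parent assignment; the function name, the dictionary format, and the exact metric definitions are illustrative assumptions.

def edge_prf(gold_parent, pred_parent, root="ROOT"):
    # Toy edge precision/recall/F1 over comment-comment relations.
    # gold_parent / pred_parent map each comment id to the id of the node it
    # replies to (another comment or the root). Names and exact definitions
    # are illustrative assumptions, not the paper's implementation.
    gold_cc = {(c, p) for c, p in gold_parent.items() if p != root}
    pred_cc = {(c, p) for c, p in pred_parent.items() if p != root}
    correct = gold_cc & pred_cc
    precision = len(correct) / len(pred_cc) if pred_cc else 0.0
    recall = len(correct) / len(gold_cc) if gold_cc else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: comment c2 actually replies to c1, but the method links it to the
# root, so recall on comment-comment edges drops while precision stays high.
gold = {"c1": "ROOT", "c2": "c1", "c3": "c2"}
pred = {"c1": "ROOT", "c2": "ROOT", "c3": "c2"}
print(edge_prf(gold, pred))  # (1.0, 0.5, 0.666...)

A method biased towards the root, as described above, shows exactly this pattern: high P_edge but low R_edge.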


Table 5: Experimental results with interrogative evaluation metrics.

Dataset       Method            Acc_edge       Acc_path       P_R-path       R_R-path       F1_R-path
Thestandard   SLARTS            0.4669±0.009   0.3059±0.013   0.6927±0.018   0.7450±0.010   0.7179±0.013
              Seo et al.'s [2]  0.3941±0.005   0.3413±0.006   0.8303±0.009   0.6563±0.005   0.7331±0.003
              First             0.3630±0.010   0.3630±0.010   1.0000±0.000   0.5846±0.009   0.7379±0.007
              Last              0.1824±0.006   0.0495±0.004   0.2214±0.008   1.0000±0.000   0.3625±0.010
ENENews       SLARTS            0.4837±0.018   0.3695±0.027   0.7625±0.026   0.7770±0.013   0.7697±0.017
              Seo et al.'s [2]  0.4661±0.017   0.4134±0.021   0.8720±0.011   0.7086±0.013   0.7818±0.011
              First             0.4524±0.018   0.4524±0.018   1.0000±0.000   0.6634±0.016   0.7976±0.011
              Last              0.2090±0.016   0.0596±0.003   0.2319±0.011   1.0000±0.000   0.3764±0.015
Courantblogs  SLARTS            0.6843±0.028   0.6453±0.033   0.9127±0.013   0.8619±0.014   0.8865±0.013
              Seo et al.'s [2]  0.6700±0.040   0.6628±0.043   0.9667±0.014   0.8250±0.018   0.8902±0.015
              First             0.6490±0.036   0.6490±0.036   1.0000±0.000   0.8001±0.019   0.8889±0.012
              Last              0.2037±0.017   0.1082±0.008   0.3183±0.011   1.0000±0.000   0.4829±0.012
Russianblog   SLARTS            0.7463±0.017   0.6904±0.017   0.8741±0.008   0.8787±0.006   0.8764±0.007
              Seo et al.'s [2]  0.5446±0.010   0.5174±0.011   0.9293±0.003   0.7821±0.004   0.8494±0.003
              First             0.4968±0.006   0.4968±0.006   1.0000±0.000   0.7134±0.004   0.8327±0.003
              Last              0.3632±0.018   0.1549±0.012   0.3695±0.015   1.0000±0.000   0.5395±0.016
Alef          SLARTS            0.7753±0.017   0.7641±0.018   0.9579±0.006   0.9033±0.008   0.9298±0.007
              Seo et al.'s [2]  0.7802±0.019   0.7800±0.019   0.9950±0.003   0.8828±0.010   0.9355±0.006
              First             0.7853±0.018   0.7853±0.018   1.0000±0.000   0.8826±0.010   0.9376±0.006
              Last              0.0993±0.005   0.0822±0.005   0.2526±0.012   1.0000±0.000   0.4032±0.015

The bold style in the original shows the best result in each metric.

As shown in Table 6, Russianblog has better results than the other datasets in all metrics. The main reason is that its comments are not confirmed by a moderator. This causes Acc_edge of the Last baseline in Russianblog to be 0.3632, which is higher than in the other datasets (Table 5); Acc_edge of the Last baseline has an inverse relationship with the complexity of replies. Also, we showed in Figure 13 that Russianblog has the lowest time difference between a comment and its parent. When the time difference between a child and its parent decreases, detecting the reply relation becomes easier. In other words, as the parent appears closer to its child, features such as frequent pattern and location prior, which are based on the position of comments in chronological order, work better.

The difference between F_R-path and F_path is about 0.20 in Thestandard and ENENews, where threads are larger and paths have more complex structures.

The minimum difference between the results of SLARTS and Seo et al.'s method appears in the Alef dataset. In Alef, many relations are comment-root and many comments do not carry an author's name, which makes the features perform poorly.

Since the SLARTS method has features that specifically focus on detecting comment-root relations (e.g., by adding candidate filtering), Acc_CTD of SLARTS is better than that of Seo et al.'s method in all datasets. The best result is 0.91 for Russianblog and the worst is 0.69 for ENENews. The root of the ENENews dataset is usually a tweet; according to Table 2, this makes the average length of the root's text shorter than in the other datasets, which in turn makes the textual features perform poorly on detecting comment-root relations.

Confidence intervals of the Alef and Courantblogs datasets are wider than those of the other datasets because many of their relations are comment-root (Figure 16). This makes the ranking SVM biased towards connecting comments to the root, especially when a thread includes very few comment-comment relations.

We compute the P-value to assess the significance of the differences between SLARTS and Seo et al.'s methods on the declarative metrics. Since the ranking SVM ranks candidates based on their scores and selects the first candidate from the list, only the first candidate is important, so p1 is computed. The results indicate that all improvements are statistically significant (P-value < 0.005) in all datasets.
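The paper does not spell out which statistical test produced these P-values; purely as a hedged illustration, per-thread scores of the two methods could be compared with a paired test such as the following (SciPy assumed; the scores shown are made-up placeholders, not results from the paper).

from scipy import stats

# Hypothetical per-thread F_edge scores for the two methods on the same threads.
slarts_scores = [0.41, 0.38, 0.44, 0.39, 0.42]
seo_scores = [0.20, 0.18, 0.25, 0.17, 0.22]

# Paired test, since both methods are evaluated on the same set of threads.
t_stat, p_value = stats.ttest_rel(slarts_scores, seo_scores)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # significant if p < 0.005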

5.4. Evaluation of the Features. In this section we evaluate the role of the introduced features. We use backward feature selection; that is, to measure the importance of a feature, we use all features except that one, repeat the experiments in its absence, and compare the results to the case where all features are present. The difference between the values of the metrics in the presence and absence of each feature is reported in Table 7.
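A minimal sketch of this leave-one-feature-out (backward) evaluation loop is given below; train_and_evaluate is a hypothetical callback standing in for training the ranking SVM and computing the metrics, and the feature names merely echo those discussed in this section.

FEATURES = ["similarity", "authors_language", "global_similarity",
            "frequent_words", "length_ratio", "authors_name",
            "frequent_pattern", "location_prior",
            "filter_rule_1", "filter_rule_2", "filter_rule_3"]

def ablation_study(train_and_evaluate, features=FEATURES):
    # train_and_evaluate(feature_subset) is assumed to train the ranking model
    # with the given features and return a dict of metric values,
    # e.g. {"P_edge": ..., "R_edge": ..., "Acc_CTD": ...}.
    full = train_and_evaluate(features)
    differences = {}
    for feature in features:
        reduced = train_and_evaluate([f for f in features if f != feature])
        # Positive difference => the feature helps that metric (Table 7 convention).
        differences[feature] = {m: full[m] - reduced[m] for m in full}
    return differences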

It is seen that some features improve the precision of the metrics, for example, location prior and candidate filtering rule 3, because they most likely tend to detect comment-root relations. Some features improve the recall of the metrics, such as authors' language, global similarity, candidate filtering rule 1, and frequent patterns; these features most likely tend to detect comment-comment relations. Some features affect both precision and recall, for example, authors' name.

As stated earlier, Russianblog shows different behaviour from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). We therefore use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low. To increase the performance of the textual features in Russian we would need a stemmer, a WordNet, and a tokenizer. The similarity feature has a rather good impact on Acc_CTD.

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature brings a small improvement in Russianblog, similar to the similarity feature. This feature improves the recall of the evaluation metrics in ENENews by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful in all evaluation metrics. Its value in ENENews is higher than in Russianblog because the author's name in Russianblog differs from the other datasets: it is an e-mail address, and no one refers to it as a name.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and has a good improvement on Acc_CTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets. This rule removes the root candidate accurately and has a good improvement on Acc_CTD. The maximum improvement for R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 has small improvements on the precision and recall of the evaluation metrics in both datasets. The maximum improvement is gained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root; thus, detection of comment-root relations is improved. This rule also improves Acc_CTD.

6. Information Extraction by Visualizing Threads Structure

In this section we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3 except that the commenters and the number of comments they posted are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread with only short conversations about the root. The height and width of the trees can serve as a measure to recognize whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments play an important role in conversations because users read them and then read other comments to better understand the conversations.

(iii) Which comments have participated in this conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics.

Dataset       Method            P_edge        R_edge        F_edge        P_path        R_path        F_path        P_R-path      R_R-path      F_R-path      Acc_CTD
Thestandard   SLARTS            0.3782±0.008  0.4155±0.011  0.3960±0.006  0.1565±0.007  0.1713±0.008  0.1636±0.007  0.3628±0.012  0.4166±0.016  0.3876±0.005  0.7525±0.005
              Seo et al.'s [2]  0.3014±0.031  0.1414±0.015  0.1921±0.016  0.1544±0.025  0.0576±0.011  0.0835±0.013  0.3013±0.038  0.1319±0.018  0.1831±0.021  0.5831±0.007
              Difference        0.0778        0.2741        0.2039        0.0021        0.1137        0.0801        0.0615        0.2847        0.2045        0.1694
ENENews       SLARTS            0.3797±0.038  0.3891±0.019  0.3839±0.024  0.1873±0.040  0.1855±0.026  0.1857±0.030  0.3768±0.048  0.3768±0.017  0.3759±0.025  0.6872±0.014
              Seo et al.'s [2]  0.4060±0.030  0.1713±0.016  0.2407±0.018  0.2378±0.034  0.0837±0.016  0.1235±0.021  0.3947±0.035  0.1523±0.015  0.2196±0.019  0.5954±0.009
              Difference        -0.0263       0.2178        0.1432        -0.0505       0.1018        0.0622        -0.0179       0.2245        0.1563        0.0918
Courantblogs  SLARTS            0.5236±0.030  0.3963±0.073  0.4495±0.052  0.4220±0.057  0.3110±0.058  0.3566±0.049  0.5325±0.049  0.3853±0.075  0.4452±0.057  0.7625±0.033
              Seo et al.'s [2]  0.7320±0.090  0.2049±0.062  0.3189±0.082  0.6815±0.105  0.1896±0.070  0.2951±0.093  0.7309±0.094  0.2010±0.065  0.3140±0.086  0.7001±0.031
              Difference        -0.2084       0.1914        0.1306        -0.2595       0.1214        0.0615        -0.1984       0.1843        0.1312        0.0624
Russianblog   SLARTS            0.5940±0.028  0.5674±0.022  0.5803±0.023  0.4907±0.031  0.4664±0.020  0.4782±0.024  0.5878±0.028  0.5823±0.021  0.5849±0.020  0.9116±0.011
              Seo et al.'s [2]  0.4705±0.020  0.3030±0.015  0.3685±0.016  0.3823±0.023  0.2534±0.013  0.3046±0.015  0.4682±0.020  0.2822±0.014  0.3521±0.015  0.5862±0.013
              Difference        0.1235        0.2644        0.2118        0.1084        0.2130        0.1736        0.1196        0.3001        0.2328        0.3254
Alef          SLARTS            0.6189±0.028  0.4131±0.040  0.4952±0.037  0.5639±0.027  0.3819±0.037  0.4551±0.034  0.6277±0.027  0.4069±0.043  0.4934±0.039  0.8044±0.013
              Seo et al.'s [2]  0.8827±0.052  0.2637±0.051  0.4045±0.060  0.8797±0.053  0.2631±0.051  0.4034±0.060  0.8840±0.052  0.2635±0.051  0.4043±0.060  0.7846±0.018
              Difference        -0.2638       0.1494        0.0907        -0.3158       0.1188        0.0517        -0.2563       0.1434        0.0891        0.0198

The bold style in the original shows the best result in each metric.

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have made a long conversation. Such users are known as a hub or a starter in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2)→H(2)→H(3)→I(2)→O(1) and F(2)→H(2)→I(1) show two conversations in thread "C" (a minimal sketch of computing such thread measures follows this list).
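As a rough illustration of the measures mentioned in the list above (tree height and width, replies directly to the root, and a user's indegree), the sketch below derives them from a predicted parent map; the data layout and function name are assumptions for illustration, not part of the paper.

from collections import Counter, defaultdict

def thread_measures(parent, author, root="ROOT"):
    # parent: comment id -> id of the node it replies to (a comment or the root).
    # author: comment id -> commenter name.
    children = defaultdict(list)
    for comment, replied_to in parent.items():
        children[replied_to].append(comment)

    def height(node):
        kids = children.get(node, [])
        return 0 if not kids else 1 + max(height(k) for k in kids)

    # Indegree of a user: how many replies that user's comments received [39].
    indegree = Counter(author[p] for p in parent.values() if p != root)

    return {
        "height": height(root),                        # deep trees hint at controversy
        "width": max((len(v) for v in children.values()), default=0),
        "direct_conversations": len(children[root]),   # replies directly to the root
        "user_indegree": dict(indegree),
    }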

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM which predicts the tree-like structure of declarative threads, for example, blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: comment-root relations and comment-comment relations. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both types as well. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can also be useful in interrogative threads.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure more deeply; we believe it can provide valuable information, for example, to help understand the role of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Feature                     Dataset      P_edge    R_edge    P_path    R_path    P_R-path  R_R-path  Acc_CTD
Similarity                  ENENews      -0.00233  0.07415   -0.00538  0.03289   -0.01095  0.08996   0.04684
                            Russianblog  0.00874   0.00012   0.00963   0.00347   0.00695   -0.01079  0.00355
Authors' language           ENENews      -0.01354  0.03246   -0.01785  0.00693   -0.02757  0.04052   0.00458
                            Russianblog  -0.00229  0.01052   -0.00179  0.00750   -0.00590  0.00807   -0.00182
Global similarity           ENENews      -0.00402  0.05604   -0.01338  0.01581   -0.01129  0.07086   0.02742
                            Russianblog  0.01090   0.01443   0.00582   0.00897   0.00368   -0.00317  -0.00087
Frequent words              ENENews      0.00252   0.00931   0.00170   0.00424   -0.00132  0.00973   0.00292
                            Russianblog  0.00008   0.00023   0.00016   0.00029   0.00034   0.00047   -0.00011
Length ratio                ENENews      0.04206   -0.02984  0.04450   0.01700   0.06421   -0.05336  -0.00850
                            Russianblog  0.00651   -0.00355  0.00920   0.00080   0.00784   -0.00592  0.00235
Authors' name               ENENews      0.03753   0.06392   0.01920   0.03285   0.02341   0.05998   0.01788
                            Russianblog  0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000
Frequent pattern            ENENews      -0.02028  0.02930   -0.02357  0.00300   -0.03793  0.03410   0.00526
                            Russianblog  0.00458   0.06100   0.00830   0.05206   -0.00545  0.06150   0.00404
Location prior              ENENews      0.08778   -0.03690  0.08556   0.03280   0.11792   -0.07852  0.02925
                            Russianblog  0.16091   0.09671   0.17766   0.13515   0.16699   0.08793   0.02414
Candidate filtering rule 1  ENENews      -0.05431  0.06138   -0.04102  0.01296   -0.06458  0.07545   0.06949
                            Russianblog  -0.12506  0.34370   -0.13916  0.27445   -0.12683  0.37003   0.32546
Candidate filtering rule 2  ENENews      0.00014   -0.00004  0.00048   0.00041   -0.00038  -0.00035  -0.00009
                            Russianblog  0.02575   0.02475   0.02462   0.02341   0.01746   0.00802   0.00000
Candidate filtering rule 3  ENENews      0.00357   -0.00121  0.00452   0.00254   0.00551   -0.00161  0.00161
                            Russianblog  0.08521   -0.01978  0.09636   0.01702   0.09380   -0.03825  0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as WordNet and named entity recognition (as they are not available in the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method or not.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.
[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.
[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.
[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.
[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.
[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.
[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.
[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.
[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin, et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.
[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.
[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.
[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.
[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.
[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.
[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.
[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.
[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.
[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.
[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.
[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.
[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.
[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.
[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.
[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.
[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the reddit community," 2013.
[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.
[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 40th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.
[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.
[30] A. Stolcke, K. Ries, N. Coccaro, et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.
[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.
[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.
[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of the Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.
[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.
[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.
[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.
[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.
[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.
[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 19: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

The Scientific World Journal 19

Table 5 Experimental results with interrogative evaluation metrics

Datasets Methods Accedge Accpath 119875119877-path 119877

119877-path 1198651119877-path

Thestandard

SLARTS 04669plusmn0009

03059plusmn0013

06927plusmn0018

07450plusmn0010

07179plusmn0013

Seo et alrsquos [2] 03941plusmn0005

03413plusmn0006

08303plusmn0009

06563plusmn0005

07331plusmn0003

First 03630plusmn0010

03630plusmn0010

10000plusmn0000

05846plusmn0009

07379plusmn0007

Last 01824plusmn0006

00495plusmn0004

02214plusmn0008

10000plusmn0000

03625plusmn0010

ENENews

SLARTS 04837plusmn0018

03695plusmn0027

07625plusmn0026

07770plusmn0013

07697plusmn0017

Seo et alrsquos [2] 04661plusmn0017

04134plusmn0021

08720plusmn0011

07086plusmn0013

07818plusmn0011

First 04524plusmn0018

04524plusmn0018

10000plusmn0000

06634plusmn0016

07976plusmn0011

Last 02090plusmn0016

00596plusmn0003

02319plusmn0011

10000plusmn0000

03764plusmn0015

Courantblogs

SLARTS 06843plusmn0028

06453plusmn0033

09127plusmn0013

08619plusmn0014

08865plusmn0013

Seo et alrsquos [2] 06700plusmn0040

06628plusmn0043

09667plusmn0014

08250plusmn0018

08902plusmn0015

First 06490plusmn0036

06490plusmn0036

10000plusmn0000

08001plusmn0019

08889plusmn0012

Last 02037plusmn0017

01082plusmn0008

03183plusmn0011

10000plusmn0000

04829plusmn0012

Russianblog

SLARTS 07463plusmn0017

06904plusmn0017

08741plusmn0008

08787plusmn0006

08764plusmn0007

Seo et alrsquos [2] 05446plusmn0010

05174plusmn0011

09293plusmn0003

07821plusmn0004

08494plusmn0003

First 04968plusmn0006

04968plusmn0006

10000plusmn0000

07134plusmn0004

08327plusmn0003

Last 03632plusmn0018

01549plusmn0012

03695plusmn0015

10000plusmn0000

05395plusmn0016

Alef

SLARTS 07753plusmn0017

07641plusmn0018

09579plusmn0006

09033plusmn0008

09298plusmn0007

Seo et alrsquos [2] 07802plusmn0019

07800plusmn0019

09950plusmn0003

08828plusmn0010

09355plusmn0006

First 07853plusmn0018

07853plusmn0018

10000plusmn0000

08826plusmn0010

09376plusmn0006

Last 00993plusmn0005

00822plusmn0005

02526plusmn0012

10000plusmn0000

04032plusmn0015

The bold style shows the best result in each metric

As shown in Table 6 Russianblog has better resultsthan the other datasets in all metrics The main reason isthat its comments are not confirmed by a moderator Thiscauses Accedge of Last baseline in Russianblog to be equalto 03632 that is more than other datasets (Table 5) Accedgeof Last baseline has inverse relationship with complexity ofreplies Also we showed in Figure 13 that the Russianbloghas the lowest time difference between a comment and itsparent When time difference between a child and its parentdecreases detection of reply relationwould be easier In otherwords as the parent appears to be closer to its child somefeatures such as frequent pattern and location Prior that are

based on position of comments in chronological order workbetter

The difference between 119865119877-path and 119865path is about 20

in Thestandard and ENENews where threads are larger andpaths have more complex structures

The minimum difference between the results of SLARTSand Seo et alrsquos methods appears in Alef datasets In Alef manyrelations are comment root and many comments do not haveauthorrsquos name which make the features perform poorly

Since SLARTS method has features which speciallyfocused on detecting comment-root relations (eg by addingcandidate filtering) AccCTD of SLARTS is better than Seo

20 The Scientific World Journal

et al method in all datasets The best result is 091 forRussianblog and the worst result is 069 for ENENewsThe root of ENENews dataset is usually a tweet Accordingto Table 2 it makes the average length of the rootrsquos textto be shorter than the other datasets and this makes thetextual features perform poorly on detecting comment-rootrelations

Confidence intervals of Alef andCourantblog datasets arehigher than the other datasets becausemany of their relationsare comment root (Figure 16) This makes ranking SVM tobe bias towards connecting comments to the root especiallywhen a thread includes very few comment-comment rela-tions

We compare 119875-value to specify the significance of differ-ences between SLARTS and Seo et alrsquosmethods on declarativemetrics Since ranking SVM ranks candidates based on theirscore and selects the first candidate from the list only thefirst candidate is important So p1 is computed The resultsindicate that all improvements are statistically significant (119875-value lt 0005) in all datasets

54 Evaluation of the Features In this section we evaluatethe role of the introduced features We use backward featureselection That means to measure the importance of a featurewe use all features except that feature repeat the experimentsin its absence and compare the results to the case whereall features are present The difference between the values ofmetrics in presence of a feature and its absence is reported inTable 7

It is seen that some features improve precision of themetrics for example location prior and candidate filteringrule 3 where they most likely tend to detect comment-rootrelations Some features improve recall of the metrics such asauthorsrsquo language global similarity candidate filtering rule1 and frequent patterns These features most likely tendto detect comment-comment relations Some features affectboth precision and recall for example authorsrsquo name

As stated earlier Russianblog has different behaviourin comparison with other datasets (Figures 12 13 and 15)Also ENENews has larger threads and more complex paths(Figure 16 and Table 2) So we use these datasets to evaluatefeatures in depth

Similarity feature improves recall of evaluation metricsSince Russian language is known as a morphologically richlanguage and the length of commentsrsquo text is very short(about average 18 words according to Table 2) in comparisonwith other datasets improvement of textual features is lowTo increase the performance of textual feature in Russianwe need a stemmer a Wordnet and a tokenizer Similarityfeature has rather a good impact on AccCTD

Authorsrsquo language feature improves recall of evaluationmetrics According to 119877edge the improvements of this featureare 003 and 001 in ENENews and Russianblog respectively

Global similarity feature improves both precision andrecall in Russianblog and recall in ENENews

The frequent words feature has a small improve-ment in Russianblog like similarity feature This feature

improves recall of the evaluation metrics in ENEnews about001

Length ratio feature improves precision in both datasetsHowever since longer comments in ENENews have manychildren this feature is more prominent there

Authorsrsquo name feature is useful in all evaluation metricsThe value of this feature in ENENews is more than Russian-blog because authorsrsquo name in Russianblog is different fromother datasets it is an email and no one would refer to it as aname

The frequent patterns feature focuses on detectingcomment-comment relations This feature improves recall ofevaluation metrics in both datasets

The location prior feature improves precision in bothdatasets This feature has a good improvement on AccCTDAccording to 119875edge the best improvement is 016091 forRussianblog since comments are not moderated

The candidate filtering rule 1 improves recall of evaluationmetrics in both datasets This feature removes the rootcandidate accurately This feature has a good improvementon AccCTD The maximum improvement for 119877edge is 034 forRussianblog

The candidate filtering rule 2 has small improvements onprecision and recall of evaluation metrics in both datasetsMaximum improvement is gained with Russianblog

Finally The candidate filtering rule 3 improves precisionof metrics in both datasets This feature removes all can-didates except the root Thus detection of comment-rootrelations is improved Also this feature improves AccCTD

6 Information Extraction byVisualizing Threads Structure

In this section we discuss the information which can beextracted from hierarchical structure of threads Figure 19 isthe same as Figure 3 except that the commenters and theirnumber of posted comments are shown as the nodesrsquo labelfor example ldquo119863(3)rdquo is the third comment sent by user ldquo119863rdquoVisualization of the structure reveals valuable information asfollowing

(i) Is the root article controversial The threads ldquo119861rdquo andldquo119862rdquo in Figure 19 aremore controversial than ldquo119860rdquo Visi-tors have expressed different opinions and sentimentsabout the root in ldquo119861rdquo and ldquo119862rdquo leading to formationof longer conversations The thread ldquo119860rdquo shows acommon thread which has short conversations aboutthe root The height and width of trees can help as ameasure to recognize whether a root is controversialor not

(ii) Which comments are starters [36] Starter commentshave important role in the conversations becauseusers read them and then read other comments tobetter understand the conversations

(iii) Which comments have participated in this conversa-tion Users can follow the conversations that seeminteresting to them without reading any unrelatedcomment

The Scientific World Journal 21

Table 6 Experimental results with declarative evaluation metrics

Datasets Method 119875edge 119877edge 119865edge 119875path 119877path 119865path 119875119877-path 119877

119877-path 119865119877-path AccCTD

ThestandardSLARTS 03782

plusmn000804155plusmn0011

03960plusmn0006

01565plusmn0007

01713plusmn0008

01636plusmn0007

03628plusmn0012

04166plusmn0016

03876plusmn0005

07525plusmn0005

Seo et alrsquos[2]

03014plusmn0031

01414plusmn0015

01921plusmn0016

01544plusmn0025

00576plusmn0011

00835plusmn0013

03013plusmn0038

01319plusmn0018

01831plusmn0021

05831plusmn0007

Difference 00778 02741 02039 00021 01137 00801 00615 02847 02045 01694

ENENewsSLARTS 03797

plusmn003803891plusmn0019

03839plusmn0024

01873plusmn0040

01855plusmn0026

01857plusmn0030

03768plusmn0048

03768plusmn0017

03759plusmn0025

06872plusmn0014

Seo et alrsquos[2]

04060plusmn0030

01713plusmn0016

02407plusmn0018

02378plusmn0034

00837plusmn0016

01235plusmn0021

03947plusmn0035

01523plusmn0015

02196plusmn0019

05954plusmn0009

Difference minus00263 02178 01432 minus00505 01018 00622 minus00179 02245 01563 00918

CourantblogsSLARTS 05236

plusmn003003963plusmn0073

04495plusmn0052

04220plusmn0057

03110plusmn0058

03566plusmn0049

05325plusmn0049

03853plusmn0075

04452plusmn0057

07625plusmn0033

Seo et alrsquos[2]

07320plusmn0090

02049plusmn0062

03189plusmn0082

06815plusmn0105

01896plusmn0070

02951plusmn0093

07309plusmn0094

02010plusmn0065

03140plusmn0086

07001plusmn0031

Difference minus02084 01914 01306 minus02595 01214 00615 minus01984 01843 01312 00624

RussianblogSLARTS 05940

plusmn002805674plusmn0022

05803plusmn0023

04907plusmn0031

04664plusmn0020

04782plusmn0024

05878plusmn0028

05823plusmn0021

05849plusmn0020

09116plusmn0011

Seo et alrsquos[2]

04705plusmn0020

03030plusmn0015

03685plusmn0016

03823plusmn0023

02534plusmn0013

03046plusmn0015

04682plusmn0020

02822plusmn0014

03521plusmn0015

05862plusmn0013

Difference 01235 02644 02118 01084 0213 01736 01196 03001 02328 03254

AlefSLARTS 06189

plusmn002804131plusmn0040

04952plusmn0037

05639plusmn0027

03819plusmn0037

04551plusmn0034

06277plusmn0027

04069plusmn0043

04934plusmn0039

08044plusmn0013

Seo et alrsquos[2]

08827plusmn0052

02637plusmn0051

04045plusmn0060

08797plusmn0053

02631plusmn0051

04034plusmn0060

08840plusmn0052

02635plusmn0051

04043plusmn0060

07846plusmn0018

Difference minus02638 01494 00907 minus03158 01188 00517 minus02563 01434 00891 00198The bold style shows the best result in each metric

(iv) Who plays an important role in a discussion Someusers playmore important roles in conversations thanother users For example users ldquo119863(1)rdquo in threadldquoBrdquo and ldquo119860(1)rdquo in thread ldquoCrdquo have made a longconversation These users are known as a hub or astarter in a thread and their degree of importance canbe compared according to their indegree [39] Manyanalyses can be performed on this structure which isknown as popularity features [12]

(v) How many conversations are about the root Thereare thirteen conversations which are replyingdirectly to the root in thread ldquoCrdquo for exampleF(2)H(2)H(3) I(2)O(1)orF(2)H(2) l(1) showtwo conversations in thread ldquo119862rdquo

7 Conclusion and Future Work

In this paper we proposed SLARTS a method based onranking SVMwhich predicts the tree-like structure of declar-ative threads for example blogs and online news agenciesWe emphasized on the differences between declarative andinterrogative threads and showed that many of the previouslyproposed features perform poorly on declarative threadsbecause of this Instead we defined a set of novel textual andnontextual features and used a ranking SVM algorithm tocombine these features

We detect two types of reply relations in declarativethreads comment-root relation and comment-commentrelation An appropriate method should be able to detectboth types of reply relations and an appropriatemetric shouldconsider both types of reply relations So in order to havefair judge on the quality of the predicted structures wemodified the evaluation metrics accordingly We also defineda novel metric that measures the accuracy of comment typedetection

The results of our experiments showed that accord-ing to declarative evaluation metrics our method showshigher accuracy in comparison with the baselines on fivedatasets Also we showed that all improvements in detectingcomment-comment relations are statistically significant in alldatasets

We believe that the defined features are not limited todeclarative threads Some features such as authorrsquos languageand frequent patterns extract relations among users can beuseful in interrogative threads

For future we would like to use the proposed methodon interrogative platforms such as forums Also we wouldlike to analyze the tree-like reply structure deeply and webelieve it can provide valuable information for example tohelp in understanding the role of users in discussions or tofind important and controversial comments among a largevolume of comments

22 The Scientific World Journal

Table 7 The difference between the evaluation metrics in presence and absence of features

Feature Dataset Evaluation metric119875edge 119877edge 119875Path 119877Path 119875

119877-path 119877119877-path AccCTD

Similarity ENENews minus000233 007415 minus000538 003289 minus001095 008996 004684Russianblog 000874 000012 000963 000347 000695 minus001079 000355

Authorsrsquo language ENENews minus001354 003246 minus001785 000693 minus002757 004052 000458Russianblog minus000229 001052 minus000179 000750 minus000590 000807 minus000182

Global similarity ENENews minus000402 005604 minus001338 001581 minus001129 007086 002742Russianblog 001090 001443 000582 000897 000368 minus000317 minus000087

Frequent words ENENews 000252 000931 000170 000424 minus000132 000973 000292Russianblog 000008 000023 000016 000029 000034 000047 minus000011

Length ratio ENENews 004206 minus002984 004450 001700 006421 minus005336 minus000850Russianblog 000651 minus000355 000920 000080 000784 minus000592 000235

Authorsrsquo name ENENews 003753 006392 001920 003285 002341 005998 001788Russianblog 000000 000000 000000 000000 000000 000000 000000

Frequent pattern ENENews minus002028 002930 minus002357 000300 minus003793 003410 000526Russianblog 000458 006100 000830 005206 minus000545 006150 000404

Location prior ENENews 008778 minus003690 008556 003280 011792 minus007852 002925Russianblog 016091 009671 017766 013515 016699 008793 002414

Candidate filtering rule 1 ENENews minus005431 006138 minus004102 001296 minus006458 007545 006949Russianblog minus012506 034370 minus013916 027445 minus012683 037003 032546

Candidate filtering rule 2 ENENews 000014 minus000004 000048 000041 minus000038 minus000035 minus000009Russianblog 002575 002475 002462 002341 001746 000802 000000

Candidate filtering rule 3 ENENews 000357 minus000121 000452 000254 000551 minus000161 000161Russianblog 008521 minus001978 009636 001702 009380 minus003825 007371

Since our goal is to provide a language-independentmodel to reconstruct the thread structure we did not usetext processing tools such as Wordnet and Named EntityRecognition (as they are not available in the same quality forall languages) We would like to focus on English and usethese tools to find out whether they can improve the accuracyof RTS method or not

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] Y Wu Q Ye L Li and J Xiao ldquoPower-law properties ofhuman view and reply behavior in online societyrdquoMathematicalProblems in Engineering vol 2012 Article ID 969087 7 pages2012

[2] J Seo W B Croft and D A Smith ldquoOnline community searchusing conversational structuresrdquo Information Retrieval vol 14no 6 pp 547ndash571 2011

[3] HWang CWang C X Zhai and J Han ldquoLearning online dis-cussion structures by conditional random fieldsrdquo in Proceedingsof the 34th International ACM SIGIR Conference on Researchand Development in Information Retrieval (SIGIR rsquo11) pp 435ndash444 July 2011

[4] E Aumayr J Chan and C Hayes ldquoReconstruction of threadedconversations in online discussion forumsrdquo inProceedings of the

5th International AAAI Conference onWeblogs and SocialMedia(ICWSM rsquo11) pp 26ndash33 2011

[5] D Shen Q Yang J-T Sun and Z Chen ldquoThread detactionin dynamic text message streamsrdquo in Proceedings of the 29thAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval pp 35ndash42 Seattle WashUSA August 2006

[6] M Dehghani M Asadpour and A Shakery ldquoAn evolutionary-based method for reconstructing conversation threads in emailcorporardquo in Proceedings of the IEEEACM International Con-ference on Advances in Social Networks Analysis and Mining(ASONAM rsquo12) 2012

[7] Y Jen-Yuan ldquoEmail thread reassembly using similarity match-ingrdquo in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS rsquo06) Muntain View Calif USA 2006

[8] S Gottipati D Lo and J Jiang ldquoFinding relevant answersin software forumsrdquo in Proceedings of the 26th IEEEACMInternational Conference on Automated Software Engineering(ASE rsquo11) pp 323ndash332 November 2011

[9] H Duan and C Zhai ldquoExploiting thread structures to improvesmoothing of language models for forum post retrievalrdquo inAdvances in Information Retrieval P Clough C Foley CGurrin et al Eds vol 6611 pp 350ndash361 Springer BerlinGermany 2011

[10] T Georgiou M Karvounis and Y Ioannidis ldquoExtractingtopics of debate between users on web discussion boardsrdquo inProceedings of the 1st International Conference forUndergraduateand Postgraduate Students in Research Poster Competition 2010

[11] Y C Wang M Joshi W Cohen and C P Rose ldquoRecoveringimplicit thread structure in newsgroup style conversationsrdquo

The Scientific World Journal 23

in Proceedings of the International Conference on Weblogs andSocial Media Seattle Wash USA 2008

[12] J Chan C Hayes and E M Daly ldquoDecomposing discussionforums and boards using user rolesrdquo in Proceedings of the 4thInternational AAAI Conference on Weblogs and Social Media2010

[13] Y-C Wang and C P Rose ldquoMaking conversational structureexplicit identification of initiation-response pairs within onlinediscussionsrdquo in Proceedings of the Human Language Technolo-gies Conference ofthe North American Chapter of the Associationfor Computational Linguistics pp 673ndash676 June 2010

[14] LWangM Lui S N Kim J Nivre and T Baldwin ldquoPredictingthread discourse structure over technical web forumsrdquo inProceedings of the Conference on Empirical Methods in NaturalLanguage Processing (EMNLP rsquo11) pp 13ndash25 Edinburgh UKJuly 2011

[15] A Balali H FailiMAsadpour andMDehghani ldquoa supervisedapproach for reconstructing thread structure in commentson blogs and online news agenciesrdquo in Proceedings of the14th International Conference on Intelligent Text Processing andComputational Linguistics Samos Greece 2013

[16] P Adams and C Martel ldquoConversational thread extraction andtopic detection in text-based chatrdquo in Semantic Computing pp87ndash113 John Wiley amp Sons 2010

[17] P H Adams and C H Martell ldquoTopic detection and extractionin chatrdquo in Proceedings of the 2nd Annual IEEE InternationalConference on Semantic Computing (ICSC rsquo08) pp 581ndash588August 2008

[18] C Lin J M Yang R Cai X J Wang W Wang and L ZhangldquoModeling semantics and structure of discussion threadsrdquo inProceedings of the 18th International Conference on World WideWeb pp 1103ndash1104 2009

[19] A Schuth MMarx andM d Rijke ldquoExtracting the discussionstructure in comments on news-articlesrdquo in Proceedings of the9th ACM International Workshop onWeb Information and DataManagement (WIDM rsquo07) Lisboa Portugal November 2007

[20] C Wang J Han Q Li X Li W P Lin and H Ji ldquoLearninghierarchical relationships among partially ordered objects withheterogeneous attributes and linksrdquo in Proceedings of the 12thSIAM International Conference on Data Mining pp 516ndash527Anaheim Calif USA 2012

[21] M Dehghani A Shakery M Asadpour and A KoushkestanildquoA learning approach for email conversation thread reconstruc-tionrdquo Journal of Information Science vol 39 no 6 pp 846ndash8632013

[22] A Balali A Rajabi S Ghassemi M Asadpour and H FailildquoContent diffusion prediction in social networksrdquo in Pro-ceedings of the 5th Conference on Information and KnowledgeTechnology (IKT rsquo13) pp 467ndash471 2013

[23] G Mishne and N Glance ldquoLeave a reply an analysis of weblogcommentsrdquo in Proceedings of the 3rd Annual Workshop onthe Weblogging ecosystem Aggregation Analysis and DynamicsEdinburgh UK 2006

[24] V Gomez A Kaltenbrunner and V Lopez ldquoStatistical analysisof the social network and discussion threads in Slashdotrdquo inProceedings of the 17th International Conference on World WideWeb (WWW rsquo08) pp 645ndash654 April 2008

[25] T Weninger X A Zhu and J Han ldquoAn exploration ofdiscussion threads in social news sites a case study of the redditcommunityrdquo 2013

[26] D Laniado R Tasso Y Volkovich and A KaltenbrunnerldquoWhen the Wikipedians talk network and tree structure of

Wikipedia discussion pagesrdquo in Proceedings of the 5th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo11) 2011

[27] A Grabowski ldquoHuman behavior in online social systemsrdquoTheEuropean Physical Journal B vol 69 no 4 pp 605ndash611 2009

[28] S N Kim L Wang and T Baldwin ldquoTagging and linkingweb forum postsrdquo in Proceedings of the 40th Conference onComputational Natural Language Learning Uppsala Sweden2010

[29] J Andreas S Rosenthal and K McKeown ldquoAnnotating agree-ment and disagreement in threaded discussionrdquo in Proceedingsof the 8th International Conference on Language Resources andComputation (LREC rsquo12) pp 818ndash822 Istanbul Turkey May2012

[30] A Stolcke K Ries N Coccaro et al ldquoDialogue actmodeling forautomatic tagging and recognition of conversational speechrdquoComputational Linguistics vol 26 no 3 pp 339ndash373 2000

[31] E N Forsyth Improving Automated Lexical and DiscourseAnalysis of Online Chat Dialog Naval Postgraduate SchoolMonterey Calif USA 2007

[32] D Feng E Shaw J Kim and E Hovy ldquoAn intelligentdiscussion-bot for answering student queries in threaded dis-cussionsrdquo in Proceedings of the 11th international Conference onIntelligent User Interfaces pp 171ndash177 February 2006

[33] F Hu T Ruan Z Shao and J Ding ldquoAutomatic web informa-tion extraction based on rulesrdquo in Proceedings of the Web Infor-mation System Engineering (WISE rsquo11) pp 265ndash272 Springer2011

[34] F Hu T Ruan and Z Shao ldquoComplete-thread extraction fromweb forumsrdquo inWebTechnologies andApplications pp 727ndash734Springer 2012

[35] D Feng J Kim E Shaw and E Hovy ldquoTowards modelingthreaded discussions using induced ontology knowledgerdquo inProceedings of the National Conference on Artificial Intelligencepp 1289ndash1294 July 2006

[36] G D Venolia and C Neustaedter ldquoUnderstanding sequenceand reply relationships within email conversations a mixed-model visualizationrdquo in Proceedings of the SIGCHI Conferenceon Human Factors in Computing Systems pp 361ndash368 April2003

[37] T Joachims ldquoTraining linear SVMs in linear timerdquo in Pro-ceedings of the 12th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 217ndash226 Philadel-phia Pa USA August 2006

[38] RM Fano ldquoTransmission of information a statistical theory ofcommunicationsrdquo American Journal of Physics vol 29 pp 793ndash794 1961

[39] T Opsahl F Agneessens and J Skvoretz ldquoNode centrality inweighted networks generalizing degree and shortest pathsrdquoSocial Networks vol 32 no 3 pp 245ndash251 2010

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 20: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

20 The Scientific World Journal

et al method in all datasets The best result is 091 forRussianblog and the worst result is 069 for ENENewsThe root of ENENews dataset is usually a tweet Accordingto Table 2 it makes the average length of the rootrsquos textto be shorter than the other datasets and this makes thetextual features perform poorly on detecting comment-rootrelations

Confidence intervals of Alef andCourantblog datasets arehigher than the other datasets becausemany of their relationsare comment root (Figure 16) This makes ranking SVM tobe bias towards connecting comments to the root especiallywhen a thread includes very few comment-comment rela-tions

We compare 119875-value to specify the significance of differ-ences between SLARTS and Seo et alrsquosmethods on declarativemetrics Since ranking SVM ranks candidates based on theirscore and selects the first candidate from the list only thefirst candidate is important So p1 is computed The resultsindicate that all improvements are statistically significant (119875-value lt 0005) in all datasets

54 Evaluation of the Features In this section we evaluatethe role of the introduced features We use backward featureselection That means to measure the importance of a featurewe use all features except that feature repeat the experimentsin its absence and compare the results to the case whereall features are present The difference between the values ofmetrics in presence of a feature and its absence is reported inTable 7

It is seen that some features improve precision of themetrics for example location prior and candidate filteringrule 3 where they most likely tend to detect comment-rootrelations Some features improve recall of the metrics such asauthorsrsquo language global similarity candidate filtering rule1 and frequent patterns These features most likely tendto detect comment-comment relations Some features affectboth precision and recall for example authorsrsquo name

As stated earlier, Russianblog behaves differently from the other datasets (Figures 12, 13, and 15). Also, ENENews has larger threads and more complex paths (Figure 16 and Table 2). So we use these two datasets to evaluate the features in depth.

The similarity feature improves the recall of the evaluation metrics. Since Russian is a morphologically rich language and the comments' text is very short (about 18 words on average, according to Table 2) in comparison with the other datasets, the improvement of the textual features is low. To increase the performance of the textual features for Russian, we would need a stemmer, a Wordnet, and a tokenizer. The similarity feature also has a rather good impact on AccCTD.
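For illustration, one basic realization of such a pairwise textual similarity feature is a cosine similarity over token counts; the exact weighting scheme is not restated here, and the tokenize function below is a placeholder where a language-specific tokenizer and stemmer would be plugged in.

    # Illustrative textual similarity between a comment and a candidate parent.
    import math
    from collections import Counter

    def tokenize(text):
        # Placeholder; for a morphologically rich language such as Russian,
        # stemming and proper tokenization would be needed here.
        return text.lower().split()

    def cosine_similarity(text_a, text_b):
        a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
        dot = sum(a[t] * b[t] for t in set(a) & set(b))
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0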

The authors' language feature improves the recall of the evaluation metrics. According to R_edge, the improvements of this feature are 0.03 and 0.01 in ENENews and Russianblog, respectively.

The global similarity feature improves both precision and recall in Russianblog and recall in ENENews.

The frequent words feature yields a small improvement in Russianblog, like the similarity feature. In ENENews it improves the recall of the evaluation metrics by about 0.01.

The length ratio feature improves precision in both datasets. However, since longer comments in ENENews have many children, this feature is more prominent there.

The authors' name feature is useful in all evaluation metrics. Its value in ENENews is higher than in Russianblog because authors' names in Russianblog differ from the other datasets: they are e-mail addresses, and no one refers to them as names.

The frequent patterns feature focuses on detecting comment-comment relations. This feature improves the recall of the evaluation metrics in both datasets.

The location prior feature improves precision in both datasets and also yields a good improvement in AccCTD. According to P_edge, the best improvement is 0.16091, for Russianblog, since its comments are not moderated.

Candidate filtering rule 1 improves the recall of the evaluation metrics in both datasets, as it removes the root candidate accurately. It also yields a good improvement in AccCTD. The maximum improvement for R_edge is 0.34, for Russianblog.

Candidate filtering rule 2 yields small improvements in the precision and recall of the evaluation metrics in both datasets. The maximum improvement is obtained on Russianblog.

Finally, candidate filtering rule 3 improves the precision of the metrics in both datasets. This rule removes all candidates except the root, so the detection of comment-root relations is improved. It also improves AccCTD.

6. Information Extraction by Visualizing Thread Structure

In this section we discuss the information that can be extracted from the hierarchical structure of threads. Figure 19 is the same as Figure 3, except that the commenters and their numbers of posted comments are shown as the nodes' labels; for example, "D(3)" is the third comment sent by user "D". Visualization of the structure reveals valuable information, as follows.

(i) Is the root article controversial? The threads "B" and "C" in Figure 19 are more controversial than "A". Visitors have expressed different opinions and sentiments about the root in "B" and "C", leading to the formation of longer conversations. Thread "A" shows a common thread with short conversations about the root. The height and width of the trees can serve as a measure of whether a root is controversial or not.

(ii) Which comments are starters [36]? Starter comments play an important role in the conversations because users read them first and then read other comments to better understand the conversations.

(iii) Which comments participate in a given conversation? Users can follow the conversations that seem interesting to them without reading any unrelated comments.


Table 6: Experimental results with declarative evaluation metrics. In the original table, the best result for each metric is shown in bold.

Metric:          P_edge        R_edge        F_edge        P_path        R_path        F_path        P_R-path      R_R-path      F_R-path      AccCTD

TheStandard
  SLARTS         0.3782±0.008  0.4155±0.011  0.3960±0.006  0.1565±0.007  0.1713±0.008  0.1636±0.007  0.3628±0.012  0.4166±0.016  0.3876±0.005  0.7525±0.005
  Seo et al. [2] 0.3014±0.031  0.1414±0.015  0.1921±0.016  0.1544±0.025  0.0576±0.011  0.0835±0.013  0.3013±0.038  0.1319±0.018  0.1831±0.021  0.5831±0.007
  Difference     0.0778        0.2741        0.2039        0.0021        0.1137        0.0801        0.0615        0.2847        0.2045        0.1694

ENENews
  SLARTS         0.3797±0.038  0.3891±0.019  0.3839±0.024  0.1873±0.040  0.1855±0.026  0.1857±0.030  0.3768±0.048  0.3768±0.017  0.3759±0.025  0.6872±0.014
  Seo et al. [2] 0.4060±0.030  0.1713±0.016  0.2407±0.018  0.2378±0.034  0.0837±0.016  0.1235±0.021  0.3947±0.035  0.1523±0.015  0.2196±0.019  0.5954±0.009
  Difference     -0.0263       0.2178        0.1432        -0.0505       0.1018        0.0622        -0.0179       0.2245        0.1563        0.0918

Courantblogs
  SLARTS         0.5236±0.030  0.3963±0.073  0.4495±0.052  0.4220±0.057  0.3110±0.058  0.3566±0.049  0.5325±0.049  0.3853±0.075  0.4452±0.057  0.7625±0.033
  Seo et al. [2] 0.7320±0.090  0.2049±0.062  0.3189±0.082  0.6815±0.105  0.1896±0.070  0.2951±0.093  0.7309±0.094  0.2010±0.065  0.3140±0.086  0.7001±0.031
  Difference     -0.2084       0.1914        0.1306        -0.2595       0.1214        0.0615        -0.1984       0.1843        0.1312        0.0624

Russianblog
  SLARTS         0.5940±0.028  0.5674±0.022  0.5803±0.023  0.4907±0.031  0.4664±0.020  0.4782±0.024  0.5878±0.028  0.5823±0.021  0.5849±0.020  0.9116±0.011
  Seo et al. [2] 0.4705±0.020  0.3030±0.015  0.3685±0.016  0.3823±0.023  0.2534±0.013  0.3046±0.015  0.4682±0.020  0.2822±0.014  0.3521±0.015  0.5862±0.013
  Difference     0.1235        0.2644        0.2118        0.1084        0.2130        0.1736        0.1196        0.3001        0.2328        0.3254

Alef
  SLARTS         0.6189±0.028  0.4131±0.040  0.4952±0.037  0.5639±0.027  0.3819±0.037  0.4551±0.034  0.6277±0.027  0.4069±0.043  0.4934±0.039  0.8044±0.013
  Seo et al. [2] 0.8827±0.052  0.2637±0.051  0.4045±0.060  0.8797±0.053  0.2631±0.051  0.4034±0.060  0.8840±0.052  0.2635±0.051  0.4043±0.060  0.7846±0.018
  Difference     -0.2638       0.1494        0.0907        -0.3158       0.1188        0.0517        -0.2563       0.1434        0.0891        0.0198

(iv) Who plays an important role in a discussion? Some users play more important roles in conversations than others. For example, users "D(1)" in thread "B" and "A(1)" in thread "C" have started long conversations. Such users are known as hubs or starters in a thread, and their degree of importance can be compared according to their indegree [39]. Many analyses can be performed on this structure, known as popularity features [12].

(v) How many conversations are about the root? There are thirteen conversations replying directly to the root in thread "C"; for example, F(2)→H(2)→H(3)→I(2)→O(1) and F(2)→H(2)→I(1) show two conversations in thread "C".
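The measures listed above can be read directly off a predicted reply tree. The sketch below is only an illustration, not the authors' implementation; it assumes the tree is given as a mapping from each comment to its parent, with None standing for the root article, and an author mapping from comment identifiers to user names.

    # Sketch of extracting the measures discussed above from a reply tree.
    from collections import defaultdict

    def thread_measures(parent, author):
        """parent: {comment_id: parent_id or None}; author: {comment_id: user name}."""
        children = defaultdict(list)
        for node, par in parent.items():
            children[par].append(node)

        def depth(node):                       # longest chain of comments below `node`
            return 1 + max((depth(k) for k in children[node]), default=0)

        height = depth(None) - 1               # (i) tree height, a controversy cue
        starters = children[None]              # (ii) starter comments
        root_conversations = len(starters)     # (v) conversations replying to the root
        indegree = defaultdict(int)            # (iv) replies received per author, a hub cue
        for node, par in parent.items():
            if par is not None:
                indegree[author[par]] += 1
        return {"height": height, "starters": starters,
                "root_conversations": root_conversations, "indegree": dict(indegree)}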

7. Conclusion and Future Work

In this paper we proposed SLARTS, a method based on ranking SVM that predicts the tree-like structure of declarative threads, for example, in blogs and online news agencies. We emphasized the differences between declarative and interrogative threads and showed that many of the previously proposed features perform poorly on declarative threads because of these differences. Instead, we defined a set of novel textual and nontextual features and used a ranking SVM algorithm to combine them.

We detect two types of reply relations in declarative threads: the comment-root relation and the comment-comment relation. An appropriate method should be able to detect both types of reply relations, and an appropriate metric should consider both. So, in order to judge the quality of the predicted structures fairly, we modified the evaluation metrics accordingly. We also defined a novel metric that measures the accuracy of comment type detection.
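As an illustration of how such metrics can be computed (a sketch, not the exact definitions given earlier in the paper): assuming a method outputs predicted parent assignments, edge-level precision and recall can be computed over the predicted edge set, and the comment-type-detection accuracy can be taken as the fraction of comments whose relation type (comment-root versus comment-comment) is predicted correctly.

    # Sketch of edge-level metrics and a comment-type-detection accuracy.
    # gold_parent and predicted_parent map each comment to its parent; None denotes
    # the root article, so an edge to None is a comment-root relation.

    def edge_metrics(predicted_edges, gold_parent):
        """predicted_edges: set of (comment, parent) tuples."""
        gold_edges = set(gold_parent.items())
        correct = predicted_edges & gold_edges
        precision = len(correct) / len(predicted_edges) if predicted_edges else 0.0
        recall = len(correct) / len(gold_edges) if gold_edges else 0.0
        return precision, recall

    def acc_ctd(predicted_parent, gold_parent):
        same_type = sum((predicted_parent.get(c) is None) == (p is None)
                        for c, p in gold_parent.items())
        return same_type / len(gold_parent) if gold_parent else 0.0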

The results of our experiments showed that, according to the declarative evaluation metrics, our method achieves higher accuracy than the baselines on five datasets. We also showed that all improvements in detecting comment-comment relations are statistically significant in all datasets.

We believe that the defined features are not limited to declarative threads. Some features, such as authors' language and frequent patterns, which extract relations among users, can also be useful in interrogative threads.

In the future, we would like to apply the proposed method to interrogative platforms such as forums. We would also like to analyze the tree-like reply structure in depth; we believe it can provide valuable information, for example, to help in understanding the roles of users in discussions or to find important and controversial comments among a large volume of comments.


Table 7: The difference between the evaluation metrics in the presence and absence of each feature.

Feature                      Dataset       P_edge     R_edge     P_path     R_path     P_R-path   R_R-path   AccCTD
Similarity                   ENENews       -0.00233   0.07415    -0.00538   0.03289    -0.01095   0.08996    0.04684
                             Russianblog   0.00874    0.00012    0.00963    0.00347    0.00695    -0.01079   0.00355
Authors' language            ENENews       -0.01354   0.03246    -0.01785   0.00693    -0.02757   0.04052    0.00458
                             Russianblog   -0.00229   0.01052    -0.00179   0.00750    -0.00590   0.00807    -0.00182
Global similarity            ENENews       -0.00402   0.05604    -0.01338   0.01581    -0.01129   0.07086    0.02742
                             Russianblog   0.01090    0.01443    0.00582    0.00897    0.00368    -0.00317   -0.00087
Frequent words               ENENews       0.00252    0.00931    0.00170    0.00424    -0.00132   0.00973    0.00292
                             Russianblog   0.00008    0.00023    0.00016    0.00029    0.00034    0.00047    -0.00011
Length ratio                 ENENews       0.04206    -0.02984   0.04450    0.01700    0.06421    -0.05336   -0.00850
                             Russianblog   0.00651    -0.00355   0.00920    0.00080    0.00784    -0.00592   0.00235
Authors' name                ENENews       0.03753    0.06392    0.01920    0.03285    0.02341    0.05998    0.01788
                             Russianblog   0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000
Frequent pattern             ENENews       -0.02028   0.02930    -0.02357   0.00300    -0.03793   0.03410    0.00526
                             Russianblog   0.00458    0.06100    0.00830    0.05206    -0.00545   0.06150    0.00404
Location prior               ENENews       0.08778    -0.03690   0.08556    0.03280    0.11792    -0.07852   0.02925
                             Russianblog   0.16091    0.09671    0.17766    0.13515    0.16699    0.08793    0.02414
Candidate filtering rule 1   ENENews       -0.05431   0.06138    -0.04102   0.01296    -0.06458   0.07545    0.06949
                             Russianblog   -0.12506   0.34370    -0.13916   0.27445    -0.12683   0.37003    0.32546
Candidate filtering rule 2   ENENews       0.00014    -0.00004   0.00048    0.00041    -0.00038   -0.00035   -0.00009
                             Russianblog   0.02575    0.02475    0.02462    0.02341    0.01746    0.00802    0.00000
Candidate filtering rule 3   ENENews       0.00357    -0.00121   0.00452    0.00254    0.00551    -0.00161   0.00161
                             Russianblog   0.08521    -0.01978   0.09636    0.01702    0.09380    -0.03825   0.07371

Since our goal is to provide a language-independent model to reconstruct the thread structure, we did not use text processing tools such as Wordnet and named entity recognition (as they are not available in the same quality for all languages). We would like to focus on English and use these tools to find out whether they can improve the accuracy of the RTS method.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y. Wu, Q. Ye, L. Li, and J. Xiao, "Power-law properties of human view and reply behavior in online society," Mathematical Problems in Engineering, vol. 2012, Article ID 969087, 7 pages, 2012.

[2] J. Seo, W. B. Croft, and D. A. Smith, "Online community search using conversational structures," Information Retrieval, vol. 14, no. 6, pp. 547–571, 2011.

[3] H. Wang, C. Wang, C. X. Zhai, and J. Han, "Learning online discussion structures by conditional random fields," in Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11), pp. 435–444, July 2011.

[4] E. Aumayr, J. Chan, and C. Hayes, "Reconstruction of threaded conversations in online discussion forums," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 26–33, 2011.

[5] D. Shen, Q. Yang, J.-T. Sun, and Z. Chen, "Thread detection in dynamic text message streams," in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 35–42, Seattle, Wash, USA, August 2006.

[6] M. Dehghani, M. Asadpour, and A. Shakery, "An evolutionary-based method for reconstructing conversation threads in email corpora," in Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '12), 2012.

[7] Y. Jen-Yuan, "Email thread reassembly using similarity matching," in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS '06), Mountain View, Calif, USA, 2006.

[8] S. Gottipati, D. Lo, and J. Jiang, "Finding relevant answers in software forums," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), pp. 323–332, November 2011.

[9] H. Duan and C. Zhai, "Exploiting thread structures to improve smoothing of language models for forum post retrieval," in Advances in Information Retrieval, P. Clough, C. Foley, C. Gurrin et al., Eds., vol. 6611, pp. 350–361, Springer, Berlin, Germany, 2011.

[10] T. Georgiou, M. Karvounis, and Y. Ioannidis, "Extracting topics of debate between users on web discussion boards," in Proceedings of the 1st International Conference for Undergraduate and Postgraduate Students in Research Poster Competition, 2010.

[11] Y. C. Wang, M. Joshi, W. Cohen, and C. P. Rose, "Recovering implicit thread structure in newsgroup style conversations," in Proceedings of the International Conference on Weblogs and Social Media, Seattle, Wash, USA, 2008.

[12] J. Chan, C. Hayes, and E. M. Daly, "Decomposing discussion forums and boards using user roles," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 2010.

[13] Y.-C. Wang and C. P. Rose, "Making conversational structure explicit: identification of initiation-response pairs within online discussions," in Proceedings of the Human Language Technologies Conference of the North American Chapter of the Association for Computational Linguistics, pp. 673–676, June 2010.

[14] L. Wang, M. Lui, S. N. Kim, J. Nivre, and T. Baldwin, "Predicting thread discourse structure over technical web forums," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), pp. 13–25, Edinburgh, UK, July 2011.

[15] A. Balali, H. Faili, M. Asadpour, and M. Dehghani, "A supervised approach for reconstructing thread structure in comments on blogs and online news agencies," in Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece, 2013.

[16] P. Adams and C. Martel, "Conversational thread extraction and topic detection in text-based chat," in Semantic Computing, pp. 87–113, John Wiley & Sons, 2010.

[17] P. H. Adams and C. H. Martell, "Topic detection and extraction in chat," in Proceedings of the 2nd Annual IEEE International Conference on Semantic Computing (ICSC '08), pp. 581–588, August 2008.

[18] C. Lin, J. M. Yang, R. Cai, X. J. Wang, W. Wang, and L. Zhang, "Modeling semantics and structure of discussion threads," in Proceedings of the 18th International Conference on World Wide Web, pp. 1103–1104, 2009.

[19] A. Schuth, M. Marx, and M. de Rijke, "Extracting the discussion structure in comments on news-articles," in Proceedings of the 9th ACM International Workshop on Web Information and Data Management (WIDM '07), Lisboa, Portugal, November 2007.

[20] C. Wang, J. Han, Q. Li, X. Li, W. P. Lin, and H. Ji, "Learning hierarchical relationships among partially ordered objects with heterogeneous attributes and links," in Proceedings of the 12th SIAM International Conference on Data Mining, pp. 516–527, Anaheim, Calif, USA, 2012.

[21] M. Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A learning approach for email conversation thread reconstruction," Journal of Information Science, vol. 39, no. 6, pp. 846–863, 2013.

[22] A. Balali, A. Rajabi, S. Ghassemi, M. Asadpour, and H. Faili, "Content diffusion prediction in social networks," in Proceedings of the 5th Conference on Information and Knowledge Technology (IKT '13), pp. 467–471, 2013.

[23] G. Mishne and N. Glance, "Leave a reply: an analysis of weblog comments," in Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, Edinburgh, UK, 2006.

[24] V. Gomez, A. Kaltenbrunner, and V. Lopez, "Statistical analysis of the social network and discussion threads in Slashdot," in Proceedings of the 17th International Conference on World Wide Web (WWW '08), pp. 645–654, April 2008.

[25] T. Weninger, X. A. Zhu, and J. Han, "An exploration of discussion threads in social news sites: a case study of the reddit community," 2013.

[26] D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner, "When the Wikipedians talk: network and tree structure of Wikipedia discussion pages," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.

[27] A. Grabowski, "Human behavior in online social systems," The European Physical Journal B, vol. 69, no. 4, pp. 605–611, 2009.

[28] S. N. Kim, L. Wang, and T. Baldwin, "Tagging and linking web forum posts," in Proceedings of the 14th Conference on Computational Natural Language Learning, Uppsala, Sweden, 2010.

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of the Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 21: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

The Scientific World Journal 21

Table 6 Experimental results with declarative evaluation metrics

Datasets Method 119875edge 119877edge 119865edge 119875path 119877path 119865path 119875119877-path 119877

119877-path 119865119877-path AccCTD

ThestandardSLARTS 03782

plusmn000804155plusmn0011

03960plusmn0006

01565plusmn0007

01713plusmn0008

01636plusmn0007

03628plusmn0012

04166plusmn0016

03876plusmn0005

07525plusmn0005

Seo et alrsquos[2]

03014plusmn0031

01414plusmn0015

01921plusmn0016

01544plusmn0025

00576plusmn0011

00835plusmn0013

03013plusmn0038

01319plusmn0018

01831plusmn0021

05831plusmn0007

Difference 00778 02741 02039 00021 01137 00801 00615 02847 02045 01694

ENENewsSLARTS 03797

plusmn003803891plusmn0019

03839plusmn0024

01873plusmn0040

01855plusmn0026

01857plusmn0030

03768plusmn0048

03768plusmn0017

03759plusmn0025

06872plusmn0014

Seo et alrsquos[2]

04060plusmn0030

01713plusmn0016

02407plusmn0018

02378plusmn0034

00837plusmn0016

01235plusmn0021

03947plusmn0035

01523plusmn0015

02196plusmn0019

05954plusmn0009

Difference minus00263 02178 01432 minus00505 01018 00622 minus00179 02245 01563 00918

CourantblogsSLARTS 05236

plusmn003003963plusmn0073

04495plusmn0052

04220plusmn0057

03110plusmn0058

03566plusmn0049

05325plusmn0049

03853plusmn0075

04452plusmn0057

07625plusmn0033

Seo et alrsquos[2]

07320plusmn0090

02049plusmn0062

03189plusmn0082

06815plusmn0105

01896plusmn0070

02951plusmn0093

07309plusmn0094

02010plusmn0065

03140plusmn0086

07001plusmn0031

Difference minus02084 01914 01306 minus02595 01214 00615 minus01984 01843 01312 00624

RussianblogSLARTS 05940

plusmn002805674plusmn0022

05803plusmn0023

04907plusmn0031

04664plusmn0020

04782plusmn0024

05878plusmn0028

05823plusmn0021

05849plusmn0020

09116plusmn0011

Seo et alrsquos[2]

04705plusmn0020

03030plusmn0015

03685plusmn0016

03823plusmn0023

02534plusmn0013

03046plusmn0015

04682plusmn0020

02822plusmn0014

03521plusmn0015

05862plusmn0013

Difference 01235 02644 02118 01084 0213 01736 01196 03001 02328 03254

AlefSLARTS 06189

plusmn002804131plusmn0040

04952plusmn0037

05639plusmn0027

03819plusmn0037

04551plusmn0034

06277plusmn0027

04069plusmn0043

04934plusmn0039

08044plusmn0013

Seo et alrsquos[2]

08827plusmn0052

02637plusmn0051

04045plusmn0060

08797plusmn0053

02631plusmn0051

04034plusmn0060

08840plusmn0052

02635plusmn0051

04043plusmn0060

07846plusmn0018

Difference minus02638 01494 00907 minus03158 01188 00517 minus02563 01434 00891 00198The bold style shows the best result in each metric

(iv) Who plays an important role in a discussion Someusers playmore important roles in conversations thanother users For example users ldquo119863(1)rdquo in threadldquoBrdquo and ldquo119860(1)rdquo in thread ldquoCrdquo have made a longconversation These users are known as a hub or astarter in a thread and their degree of importance canbe compared according to their indegree [39] Manyanalyses can be performed on this structure which isknown as popularity features [12]

(v) How many conversations are about the root Thereare thirteen conversations which are replyingdirectly to the root in thread ldquoCrdquo for exampleF(2)H(2)H(3) I(2)O(1)orF(2)H(2) l(1) showtwo conversations in thread ldquo119862rdquo

7 Conclusion and Future Work

In this paper we proposed SLARTS a method based onranking SVMwhich predicts the tree-like structure of declar-ative threads for example blogs and online news agenciesWe emphasized on the differences between declarative andinterrogative threads and showed that many of the previouslyproposed features perform poorly on declarative threadsbecause of this Instead we defined a set of novel textual andnontextual features and used a ranking SVM algorithm tocombine these features

We detect two types of reply relations in declarativethreads comment-root relation and comment-commentrelation An appropriate method should be able to detectboth types of reply relations and an appropriatemetric shouldconsider both types of reply relations So in order to havefair judge on the quality of the predicted structures wemodified the evaluation metrics accordingly We also defineda novel metric that measures the accuracy of comment typedetection

The results of our experiments showed that accord-ing to declarative evaluation metrics our method showshigher accuracy in comparison with the baselines on fivedatasets Also we showed that all improvements in detectingcomment-comment relations are statistically significant in alldatasets

We believe that the defined features are not limited todeclarative threads Some features such as authorrsquos languageand frequent patterns extract relations among users can beuseful in interrogative threads

For future we would like to use the proposed methodon interrogative platforms such as forums Also we wouldlike to analyze the tree-like reply structure deeply and webelieve it can provide valuable information for example tohelp in understanding the role of users in discussions or tofind important and controversial comments among a largevolume of comments

22 The Scientific World Journal

Table 7 The difference between the evaluation metrics in presence and absence of features

Feature Dataset Evaluation metric119875edge 119877edge 119875Path 119877Path 119875

119877-path 119877119877-path AccCTD

Similarity ENENews minus000233 007415 minus000538 003289 minus001095 008996 004684Russianblog 000874 000012 000963 000347 000695 minus001079 000355

Authorsrsquo language ENENews minus001354 003246 minus001785 000693 minus002757 004052 000458Russianblog minus000229 001052 minus000179 000750 minus000590 000807 minus000182

Global similarity ENENews minus000402 005604 minus001338 001581 minus001129 007086 002742Russianblog 001090 001443 000582 000897 000368 minus000317 minus000087

Frequent words ENENews 000252 000931 000170 000424 minus000132 000973 000292Russianblog 000008 000023 000016 000029 000034 000047 minus000011

Length ratio ENENews 004206 minus002984 004450 001700 006421 minus005336 minus000850Russianblog 000651 minus000355 000920 000080 000784 minus000592 000235

Authorsrsquo name ENENews 003753 006392 001920 003285 002341 005998 001788Russianblog 000000 000000 000000 000000 000000 000000 000000

Frequent pattern ENENews minus002028 002930 minus002357 000300 minus003793 003410 000526Russianblog 000458 006100 000830 005206 minus000545 006150 000404

Location prior ENENews 008778 minus003690 008556 003280 011792 minus007852 002925Russianblog 016091 009671 017766 013515 016699 008793 002414

Candidate filtering rule 1 ENENews minus005431 006138 minus004102 001296 minus006458 007545 006949Russianblog minus012506 034370 minus013916 027445 minus012683 037003 032546

Candidate filtering rule 2 ENENews 000014 minus000004 000048 000041 minus000038 minus000035 minus000009Russianblog 002575 002475 002462 002341 001746 000802 000000

Candidate filtering rule 3 ENENews 000357 minus000121 000452 000254 000551 minus000161 000161Russianblog 008521 minus001978 009636 001702 009380 minus003825 007371

Since our goal is to provide a language-independentmodel to reconstruct the thread structure we did not usetext processing tools such as Wordnet and Named EntityRecognition (as they are not available in the same quality forall languages) We would like to focus on English and usethese tools to find out whether they can improve the accuracyof RTS method or not

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] Y Wu Q Ye L Li and J Xiao ldquoPower-law properties ofhuman view and reply behavior in online societyrdquoMathematicalProblems in Engineering vol 2012 Article ID 969087 7 pages2012

[2] J Seo W B Croft and D A Smith ldquoOnline community searchusing conversational structuresrdquo Information Retrieval vol 14no 6 pp 547ndash571 2011

[3] HWang CWang C X Zhai and J Han ldquoLearning online dis-cussion structures by conditional random fieldsrdquo in Proceedingsof the 34th International ACM SIGIR Conference on Researchand Development in Information Retrieval (SIGIR rsquo11) pp 435ndash444 July 2011

[4] E Aumayr J Chan and C Hayes ldquoReconstruction of threadedconversations in online discussion forumsrdquo inProceedings of the

5th International AAAI Conference onWeblogs and SocialMedia(ICWSM rsquo11) pp 26ndash33 2011

[5] D Shen Q Yang J-T Sun and Z Chen ldquoThread detactionin dynamic text message streamsrdquo in Proceedings of the 29thAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval pp 35ndash42 Seattle WashUSA August 2006

[6] M Dehghani M Asadpour and A Shakery ldquoAn evolutionary-based method for reconstructing conversation threads in emailcorporardquo in Proceedings of the IEEEACM International Con-ference on Advances in Social Networks Analysis and Mining(ASONAM rsquo12) 2012

[7] Y Jen-Yuan ldquoEmail thread reassembly using similarity match-ingrdquo in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS rsquo06) Muntain View Calif USA 2006

[8] S Gottipati D Lo and J Jiang ldquoFinding relevant answersin software forumsrdquo in Proceedings of the 26th IEEEACMInternational Conference on Automated Software Engineering(ASE rsquo11) pp 323ndash332 November 2011

[9] H Duan and C Zhai ldquoExploiting thread structures to improvesmoothing of language models for forum post retrievalrdquo inAdvances in Information Retrieval P Clough C Foley CGurrin et al Eds vol 6611 pp 350ndash361 Springer BerlinGermany 2011

[10] T Georgiou M Karvounis and Y Ioannidis ldquoExtractingtopics of debate between users on web discussion boardsrdquo inProceedings of the 1st International Conference forUndergraduateand Postgraduate Students in Research Poster Competition 2010

[11] Y C Wang M Joshi W Cohen and C P Rose ldquoRecoveringimplicit thread structure in newsgroup style conversationsrdquo

The Scientific World Journal 23

in Proceedings of the International Conference on Weblogs andSocial Media Seattle Wash USA 2008

[12] J Chan C Hayes and E M Daly ldquoDecomposing discussionforums and boards using user rolesrdquo in Proceedings of the 4thInternational AAAI Conference on Weblogs and Social Media2010

[13] Y-C Wang and C P Rose ldquoMaking conversational structureexplicit identification of initiation-response pairs within onlinediscussionsrdquo in Proceedings of the Human Language Technolo-gies Conference ofthe North American Chapter of the Associationfor Computational Linguistics pp 673ndash676 June 2010

[14] LWangM Lui S N Kim J Nivre and T Baldwin ldquoPredictingthread discourse structure over technical web forumsrdquo inProceedings of the Conference on Empirical Methods in NaturalLanguage Processing (EMNLP rsquo11) pp 13ndash25 Edinburgh UKJuly 2011

[15] A Balali H FailiMAsadpour andMDehghani ldquoa supervisedapproach for reconstructing thread structure in commentson blogs and online news agenciesrdquo in Proceedings of the14th International Conference on Intelligent Text Processing andComputational Linguistics Samos Greece 2013

[16] P Adams and C Martel ldquoConversational thread extraction andtopic detection in text-based chatrdquo in Semantic Computing pp87ndash113 John Wiley amp Sons 2010

[17] P H Adams and C H Martell ldquoTopic detection and extractionin chatrdquo in Proceedings of the 2nd Annual IEEE InternationalConference on Semantic Computing (ICSC rsquo08) pp 581ndash588August 2008

[18] C Lin J M Yang R Cai X J Wang W Wang and L ZhangldquoModeling semantics and structure of discussion threadsrdquo inProceedings of the 18th International Conference on World WideWeb pp 1103ndash1104 2009

[19] A Schuth MMarx andM d Rijke ldquoExtracting the discussionstructure in comments on news-articlesrdquo in Proceedings of the9th ACM International Workshop onWeb Information and DataManagement (WIDM rsquo07) Lisboa Portugal November 2007

[20] C Wang J Han Q Li X Li W P Lin and H Ji ldquoLearninghierarchical relationships among partially ordered objects withheterogeneous attributes and linksrdquo in Proceedings of the 12thSIAM International Conference on Data Mining pp 516ndash527Anaheim Calif USA 2012

[21] M Dehghani A Shakery M Asadpour and A KoushkestanildquoA learning approach for email conversation thread reconstruc-tionrdquo Journal of Information Science vol 39 no 6 pp 846ndash8632013

[22] A Balali A Rajabi S Ghassemi M Asadpour and H FailildquoContent diffusion prediction in social networksrdquo in Pro-ceedings of the 5th Conference on Information and KnowledgeTechnology (IKT rsquo13) pp 467ndash471 2013

[23] G Mishne and N Glance ldquoLeave a reply an analysis of weblogcommentsrdquo in Proceedings of the 3rd Annual Workshop onthe Weblogging ecosystem Aggregation Analysis and DynamicsEdinburgh UK 2006

[24] V Gomez A Kaltenbrunner and V Lopez ldquoStatistical analysisof the social network and discussion threads in Slashdotrdquo inProceedings of the 17th International Conference on World WideWeb (WWW rsquo08) pp 645ndash654 April 2008

[25] T Weninger X A Zhu and J Han ldquoAn exploration ofdiscussion threads in social news sites a case study of the redditcommunityrdquo 2013

[26] D Laniado R Tasso Y Volkovich and A KaltenbrunnerldquoWhen the Wikipedians talk network and tree structure of

Wikipedia discussion pagesrdquo in Proceedings of the 5th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo11) 2011

[27] A Grabowski ldquoHuman behavior in online social systemsrdquoTheEuropean Physical Journal B vol 69 no 4 pp 605ndash611 2009

[28] S N Kim L Wang and T Baldwin ldquoTagging and linkingweb forum postsrdquo in Proceedings of the 40th Conference onComputational Natural Language Learning Uppsala Sweden2010

[29] J Andreas S Rosenthal and K McKeown ldquoAnnotating agree-ment and disagreement in threaded discussionrdquo in Proceedingsof the 8th International Conference on Language Resources andComputation (LREC rsquo12) pp 818ndash822 Istanbul Turkey May2012

[30] A Stolcke K Ries N Coccaro et al ldquoDialogue actmodeling forautomatic tagging and recognition of conversational speechrdquoComputational Linguistics vol 26 no 3 pp 339ndash373 2000

[31] E N Forsyth Improving Automated Lexical and DiscourseAnalysis of Online Chat Dialog Naval Postgraduate SchoolMonterey Calif USA 2007

[32] D Feng E Shaw J Kim and E Hovy ldquoAn intelligentdiscussion-bot for answering student queries in threaded dis-cussionsrdquo in Proceedings of the 11th international Conference onIntelligent User Interfaces pp 171ndash177 February 2006

[33] F Hu T Ruan Z Shao and J Ding ldquoAutomatic web informa-tion extraction based on rulesrdquo in Proceedings of the Web Infor-mation System Engineering (WISE rsquo11) pp 265ndash272 Springer2011

[34] F Hu T Ruan and Z Shao ldquoComplete-thread extraction fromweb forumsrdquo inWebTechnologies andApplications pp 727ndash734Springer 2012

[35] D Feng J Kim E Shaw and E Hovy ldquoTowards modelingthreaded discussions using induced ontology knowledgerdquo inProceedings of the National Conference on Artificial Intelligencepp 1289ndash1294 July 2006

[36] G D Venolia and C Neustaedter ldquoUnderstanding sequenceand reply relationships within email conversations a mixed-model visualizationrdquo in Proceedings of the SIGCHI Conferenceon Human Factors in Computing Systems pp 361ndash368 April2003

[37] T Joachims ldquoTraining linear SVMs in linear timerdquo in Pro-ceedings of the 12th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 217ndash226 Philadel-phia Pa USA August 2006

[38] RM Fano ldquoTransmission of information a statistical theory ofcommunicationsrdquo American Journal of Physics vol 29 pp 793ndash794 1961

[39] T Opsahl F Agneessens and J Skvoretz ldquoNode centrality inweighted networks generalizing degree and shortest pathsrdquoSocial Networks vol 32 no 3 pp 245ndash251 2010

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 22: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

22 The Scientific World Journal

Table 7 The difference between the evaluation metrics in presence and absence of features

Feature Dataset Evaluation metric119875edge 119877edge 119875Path 119877Path 119875

119877-path 119877119877-path AccCTD

Similarity ENENews minus000233 007415 minus000538 003289 minus001095 008996 004684Russianblog 000874 000012 000963 000347 000695 minus001079 000355

Authorsrsquo language ENENews minus001354 003246 minus001785 000693 minus002757 004052 000458Russianblog minus000229 001052 minus000179 000750 minus000590 000807 minus000182

Global similarity ENENews minus000402 005604 minus001338 001581 minus001129 007086 002742Russianblog 001090 001443 000582 000897 000368 minus000317 minus000087

Frequent words ENENews 000252 000931 000170 000424 minus000132 000973 000292Russianblog 000008 000023 000016 000029 000034 000047 minus000011

Length ratio ENENews 004206 minus002984 004450 001700 006421 minus005336 minus000850Russianblog 000651 minus000355 000920 000080 000784 minus000592 000235

Authorsrsquo name ENENews 003753 006392 001920 003285 002341 005998 001788Russianblog 000000 000000 000000 000000 000000 000000 000000

Frequent pattern ENENews minus002028 002930 minus002357 000300 minus003793 003410 000526Russianblog 000458 006100 000830 005206 minus000545 006150 000404

Location prior ENENews 008778 minus003690 008556 003280 011792 minus007852 002925Russianblog 016091 009671 017766 013515 016699 008793 002414

Candidate filtering rule 1 ENENews minus005431 006138 minus004102 001296 minus006458 007545 006949Russianblog minus012506 034370 minus013916 027445 minus012683 037003 032546

Candidate filtering rule 2 ENENews 000014 minus000004 000048 000041 minus000038 minus000035 minus000009Russianblog 002575 002475 002462 002341 001746 000802 000000

Candidate filtering rule 3 ENENews 000357 minus000121 000452 000254 000551 minus000161 000161Russianblog 008521 minus001978 009636 001702 009380 minus003825 007371

Since our goal is to provide a language-independentmodel to reconstruct the thread structure we did not usetext processing tools such as Wordnet and Named EntityRecognition (as they are not available in the same quality forall languages) We would like to focus on English and usethese tools to find out whether they can improve the accuracyof RTS method or not

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] Y Wu Q Ye L Li and J Xiao ldquoPower-law properties ofhuman view and reply behavior in online societyrdquoMathematicalProblems in Engineering vol 2012 Article ID 969087 7 pages2012

[2] J Seo W B Croft and D A Smith ldquoOnline community searchusing conversational structuresrdquo Information Retrieval vol 14no 6 pp 547ndash571 2011

[3] HWang CWang C X Zhai and J Han ldquoLearning online dis-cussion structures by conditional random fieldsrdquo in Proceedingsof the 34th International ACM SIGIR Conference on Researchand Development in Information Retrieval (SIGIR rsquo11) pp 435ndash444 July 2011

[4] E Aumayr J Chan and C Hayes ldquoReconstruction of threadedconversations in online discussion forumsrdquo inProceedings of the

5th International AAAI Conference onWeblogs and SocialMedia(ICWSM rsquo11) pp 26ndash33 2011

[5] D Shen Q Yang J-T Sun and Z Chen ldquoThread detactionin dynamic text message streamsrdquo in Proceedings of the 29thAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval pp 35ndash42 Seattle WashUSA August 2006

[6] M Dehghani M Asadpour and A Shakery ldquoAn evolutionary-based method for reconstructing conversation threads in emailcorporardquo in Proceedings of the IEEEACM International Con-ference on Advances in Social Networks Analysis and Mining(ASONAM rsquo12) 2012

[7] Y Jen-Yuan ldquoEmail thread reassembly using similarity match-ingrdquo in Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS rsquo06) Muntain View Calif USA 2006

[8] S Gottipati D Lo and J Jiang ldquoFinding relevant answersin software forumsrdquo in Proceedings of the 26th IEEEACMInternational Conference on Automated Software Engineering(ASE rsquo11) pp 323ndash332 November 2011

[9] H Duan and C Zhai ldquoExploiting thread structures to improvesmoothing of language models for forum post retrievalrdquo inAdvances in Information Retrieval P Clough C Foley CGurrin et al Eds vol 6611 pp 350ndash361 Springer BerlinGermany 2011

[10] T Georgiou M Karvounis and Y Ioannidis ldquoExtractingtopics of debate between users on web discussion boardsrdquo inProceedings of the 1st International Conference forUndergraduateand Postgraduate Students in Research Poster Competition 2010

[11] Y C Wang M Joshi W Cohen and C P Rose ldquoRecoveringimplicit thread structure in newsgroup style conversationsrdquo

The Scientific World Journal 23

in Proceedings of the International Conference on Weblogs andSocial Media Seattle Wash USA 2008

[12] J Chan C Hayes and E M Daly ldquoDecomposing discussionforums and boards using user rolesrdquo in Proceedings of the 4thInternational AAAI Conference on Weblogs and Social Media2010

[13] Y-C Wang and C P Rose ldquoMaking conversational structureexplicit identification of initiation-response pairs within onlinediscussionsrdquo in Proceedings of the Human Language Technolo-gies Conference ofthe North American Chapter of the Associationfor Computational Linguistics pp 673ndash676 June 2010

[14] LWangM Lui S N Kim J Nivre and T Baldwin ldquoPredictingthread discourse structure over technical web forumsrdquo inProceedings of the Conference on Empirical Methods in NaturalLanguage Processing (EMNLP rsquo11) pp 13ndash25 Edinburgh UKJuly 2011

[15] A Balali H FailiMAsadpour andMDehghani ldquoa supervisedapproach for reconstructing thread structure in commentson blogs and online news agenciesrdquo in Proceedings of the14th International Conference on Intelligent Text Processing andComputational Linguistics Samos Greece 2013

[16] P Adams and C Martel ldquoConversational thread extraction andtopic detection in text-based chatrdquo in Semantic Computing pp87ndash113 John Wiley amp Sons 2010

[17] P H Adams and C H Martell ldquoTopic detection and extractionin chatrdquo in Proceedings of the 2nd Annual IEEE InternationalConference on Semantic Computing (ICSC rsquo08) pp 581ndash588August 2008

[18] C Lin J M Yang R Cai X J Wang W Wang and L ZhangldquoModeling semantics and structure of discussion threadsrdquo inProceedings of the 18th International Conference on World WideWeb pp 1103ndash1104 2009

[19] A Schuth MMarx andM d Rijke ldquoExtracting the discussionstructure in comments on news-articlesrdquo in Proceedings of the9th ACM International Workshop onWeb Information and DataManagement (WIDM rsquo07) Lisboa Portugal November 2007

[20] C Wang J Han Q Li X Li W P Lin and H Ji ldquoLearninghierarchical relationships among partially ordered objects withheterogeneous attributes and linksrdquo in Proceedings of the 12thSIAM International Conference on Data Mining pp 516ndash527Anaheim Calif USA 2012

[21] M Dehghani A Shakery M Asadpour and A KoushkestanildquoA learning approach for email conversation thread reconstruc-tionrdquo Journal of Information Science vol 39 no 6 pp 846ndash8632013

[22] A Balali A Rajabi S Ghassemi M Asadpour and H FailildquoContent diffusion prediction in social networksrdquo in Pro-ceedings of the 5th Conference on Information and KnowledgeTechnology (IKT rsquo13) pp 467ndash471 2013

[23] G Mishne and N Glance ldquoLeave a reply an analysis of weblogcommentsrdquo in Proceedings of the 3rd Annual Workshop onthe Weblogging ecosystem Aggregation Analysis and DynamicsEdinburgh UK 2006

[24] V Gomez A Kaltenbrunner and V Lopez ldquoStatistical analysisof the social network and discussion threads in Slashdotrdquo inProceedings of the 17th International Conference on World WideWeb (WWW rsquo08) pp 645ndash654 April 2008

[25] T Weninger X A Zhu and J Han ldquoAn exploration ofdiscussion threads in social news sites a case study of the redditcommunityrdquo 2013

[26] D Laniado R Tasso Y Volkovich and A KaltenbrunnerldquoWhen the Wikipedians talk network and tree structure of

Wikipedia discussion pagesrdquo in Proceedings of the 5th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo11) 2011

[27] A Grabowski ldquoHuman behavior in online social systemsrdquoTheEuropean Physical Journal B vol 69 no 4 pp 605ndash611 2009

[28] S N Kim L Wang and T Baldwin ldquoTagging and linkingweb forum postsrdquo in Proceedings of the 40th Conference onComputational Natural Language Learning Uppsala Sweden2010

[29] J Andreas S Rosenthal and K McKeown ldquoAnnotating agree-ment and disagreement in threaded discussionrdquo in Proceedingsof the 8th International Conference on Language Resources andComputation (LREC rsquo12) pp 818ndash822 Istanbul Turkey May2012

[30] A Stolcke K Ries N Coccaro et al ldquoDialogue actmodeling forautomatic tagging and recognition of conversational speechrdquoComputational Linguistics vol 26 no 3 pp 339ndash373 2000

[31] E N Forsyth Improving Automated Lexical and DiscourseAnalysis of Online Chat Dialog Naval Postgraduate SchoolMonterey Calif USA 2007

[32] D Feng E Shaw J Kim and E Hovy ldquoAn intelligentdiscussion-bot for answering student queries in threaded dis-cussionsrdquo in Proceedings of the 11th international Conference onIntelligent User Interfaces pp 171ndash177 February 2006

[33] F Hu T Ruan Z Shao and J Ding ldquoAutomatic web informa-tion extraction based on rulesrdquo in Proceedings of the Web Infor-mation System Engineering (WISE rsquo11) pp 265ndash272 Springer2011

[34] F Hu T Ruan and Z Shao ldquoComplete-thread extraction fromweb forumsrdquo inWebTechnologies andApplications pp 727ndash734Springer 2012

[35] D Feng J Kim E Shaw and E Hovy ldquoTowards modelingthreaded discussions using induced ontology knowledgerdquo inProceedings of the National Conference on Artificial Intelligencepp 1289ndash1294 July 2006

[36] G D Venolia and C Neustaedter ldquoUnderstanding sequenceand reply relationships within email conversations a mixed-model visualizationrdquo in Proceedings of the SIGCHI Conferenceon Human Factors in Computing Systems pp 361ndash368 April2003

[37] T Joachims ldquoTraining linear SVMs in linear timerdquo in Pro-ceedings of the 12th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 217ndash226 Philadel-phia Pa USA August 2006

[38] RM Fano ldquoTransmission of information a statistical theory ofcommunicationsrdquo American Journal of Physics vol 29 pp 793ndash794 1961

[39] T Opsahl F Agneessens and J Skvoretz ldquoNode centrality inweighted networks generalizing degree and shortest pathsrdquoSocial Networks vol 32 no 3 pp 245ndash251 2010

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 23: Research Article A Supervised Approach to Predict the …downloads.hindawi.com/journals/tswj/2014/479746.pdf · 2019-07-31 · Research Article A Supervised Approach to Predict the

The Scientific World Journal 23

in Proceedings of the International Conference on Weblogs andSocial Media Seattle Wash USA 2008

[12] J Chan C Hayes and E M Daly ldquoDecomposing discussionforums and boards using user rolesrdquo in Proceedings of the 4thInternational AAAI Conference on Weblogs and Social Media2010

[13] Y-C Wang and C P Rose ldquoMaking conversational structureexplicit identification of initiation-response pairs within onlinediscussionsrdquo in Proceedings of the Human Language Technolo-gies Conference ofthe North American Chapter of the Associationfor Computational Linguistics pp 673ndash676 June 2010

[14] LWangM Lui S N Kim J Nivre and T Baldwin ldquoPredictingthread discourse structure over technical web forumsrdquo inProceedings of the Conference on Empirical Methods in NaturalLanguage Processing (EMNLP rsquo11) pp 13ndash25 Edinburgh UKJuly 2011

[15] A Balali H FailiMAsadpour andMDehghani ldquoa supervisedapproach for reconstructing thread structure in commentson blogs and online news agenciesrdquo in Proceedings of the14th International Conference on Intelligent Text Processing andComputational Linguistics Samos Greece 2013

[16] P Adams and C Martel ldquoConversational thread extraction andtopic detection in text-based chatrdquo in Semantic Computing pp87ndash113 John Wiley amp Sons 2010

[17] P H Adams and C H Martell ldquoTopic detection and extractionin chatrdquo in Proceedings of the 2nd Annual IEEE InternationalConference on Semantic Computing (ICSC rsquo08) pp 581ndash588August 2008

[18] C Lin J M Yang R Cai X J Wang W Wang and L ZhangldquoModeling semantics and structure of discussion threadsrdquo inProceedings of the 18th International Conference on World WideWeb pp 1103ndash1104 2009

[19] A Schuth MMarx andM d Rijke ldquoExtracting the discussionstructure in comments on news-articlesrdquo in Proceedings of the9th ACM International Workshop onWeb Information and DataManagement (WIDM rsquo07) Lisboa Portugal November 2007

[20] C Wang J Han Q Li X Li W P Lin and H Ji ldquoLearninghierarchical relationships among partially ordered objects withheterogeneous attributes and linksrdquo in Proceedings of the 12thSIAM International Conference on Data Mining pp 516ndash527Anaheim Calif USA 2012

[21] M Dehghani A Shakery M Asadpour and A KoushkestanildquoA learning approach for email conversation thread reconstruc-tionrdquo Journal of Information Science vol 39 no 6 pp 846ndash8632013

[22] A Balali A Rajabi S Ghassemi M Asadpour and H FailildquoContent diffusion prediction in social networksrdquo in Pro-ceedings of the 5th Conference on Information and KnowledgeTechnology (IKT rsquo13) pp 467ndash471 2013

[23] G Mishne and N Glance ldquoLeave a reply an analysis of weblogcommentsrdquo in Proceedings of the 3rd Annual Workshop onthe Weblogging ecosystem Aggregation Analysis and DynamicsEdinburgh UK 2006

[24] V Gomez A Kaltenbrunner and V Lopez ldquoStatistical analysisof the social network and discussion threads in Slashdotrdquo inProceedings of the 17th International Conference on World WideWeb (WWW rsquo08) pp 645ndash654 April 2008

[25] T Weninger X A Zhu and J Han ldquoAn exploration ofdiscussion threads in social news sites a case study of the redditcommunityrdquo 2013

[26] D Laniado R Tasso Y Volkovich and A KaltenbrunnerldquoWhen the Wikipedians talk network and tree structure of

Wikipedia discussion pagesrdquo in Proceedings of the 5th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo11) 2011

[27] A Grabowski ldquoHuman behavior in online social systemsrdquoTheEuropean Physical Journal B vol 69 no 4 pp 605ndash611 2009

[28] S N Kim L Wang and T Baldwin ldquoTagging and linkingweb forum postsrdquo in Proceedings of the 40th Conference onComputational Natural Language Learning Uppsala Sweden2010

[29] J. Andreas, S. Rosenthal, and K. McKeown, "Annotating agreement and disagreement in threaded discussion," in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 818–822, Istanbul, Turkey, May 2012.

[30] A. Stolcke, K. Ries, N. Coccaro, et al., "Dialogue act modeling for automatic tagging and recognition of conversational speech," Computational Linguistics, vol. 26, no. 3, pp. 339–373, 2000.

[31] E. N. Forsyth, Improving Automated Lexical and Discourse Analysis of Online Chat Dialog, Naval Postgraduate School, Monterey, Calif, USA, 2007.

[32] D. Feng, E. Shaw, J. Kim, and E. Hovy, "An intelligent discussion-bot for answering student queries in threaded discussions," in Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 171–177, February 2006.

[33] F. Hu, T. Ruan, Z. Shao, and J. Ding, "Automatic web information extraction based on rules," in Proceedings of Web Information System Engineering (WISE '11), pp. 265–272, Springer, 2011.

[34] F. Hu, T. Ruan, and Z. Shao, "Complete-thread extraction from web forums," in Web Technologies and Applications, pp. 727–734, Springer, 2012.

[35] D. Feng, J. Kim, E. Shaw, and E. Hovy, "Towards modeling threaded discussions using induced ontology knowledge," in Proceedings of the National Conference on Artificial Intelligence, pp. 1289–1294, July 2006.

[36] G. D. Venolia and C. Neustaedter, "Understanding sequence and reply relationships within email conversations: a mixed-model visualization," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 361–368, April 2003.

[37] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226, Philadelphia, Pa, USA, August 2006.

[38] R. M. Fano, "Transmission of information: a statistical theory of communications," American Journal of Physics, vol. 29, pp. 793–794, 1961.

[39] T. Opsahl, F. Agneessens, and J. Skvoretz, "Node centrality in weighted networks: generalizing degree and shortest paths," Social Networks, vol. 32, no. 3, pp. 245–251, 2010.
