12
1 BullyNet: Unmasking Cyberbullies on Social Networks Aparna Sankaran, Hannah Johnson, Gaby G. Dagher, Min Long Abstract—One of the most harmful consequences of social media is the rise of cyberbullying, which tends to be more sinister than traditional bullying given that online records typically live on the internet for quite a long time and are hard to control. In this paper, we present a three-phase algorithm, called BullyNet, for detecting cyberbullies on Twitter social network. We exploit bullying tendencies by proposing a robust method for constructing a cyberbullying signed network. We analyze tweets to determine their relation to cyberbullying, while considering the context in which the tweets exist in order to optimize their bullying score. We also propose a centrality measure to detect cyberbullies from a cyberbullying signed network, and we show that it outperforms other existing measures. We experiment on a dataset of 5.6 million tweets and our results shows that the proposed approach can detect cyberbullies with high accuracy, while being scalable with respect to the number of tweets. Index Terms—Cyberbullying, Signed Networks, Social Media Mining. I. I NTRODUCTION The Internet has created never before seen opportunities for human interaction and socialization. In the past decade, social media, in particular, has had a popularity explosion. From MySpace to Facebook, Twitter, Flickr, and Instagram, people are connecting and interacting in a way that was previously impossible. The widespread usage of social media across people from all ages created a vast amount of data for several research topics, including recommender systems [1], link predictions [2], visualization, and analysis of social net- works [3]. While the growth of social media has created an excellent platform for communications and information sharing, it has also created a new platform for malicious activities such as spamming [4], trolling [5], and cyberbullying [6]. According to the Cyberbullying Research Center (CRC) [7], cyberbullying occurs when someone uses the technology to send messages to harass, mistreat or threaten a person or a group. Unlike tradi- tional bullying where aggression is a short and temporary face- to-face occurrence, cyberbullying contains hurtful messages which are present online for a long time. These messages can be accessed worldwide, and are often irrevocable. Laws about cyberbullying and how it is handled differ from one place to another. For example, in the United States, the majority of the states incorporate cyberbullying into their bullying laws, and cyberbullying is considered a criminal offense in most of them [8]. Popular social media platforms such as Facebook and Twitter are very vulnerable to cyberbullying due to the popularity of these social media sites and the anonymity that the internet offers to the perpetrators. Although strict laws exist to punish cyberbullying, there are very less tools available to effectively combat cyberbullying. Social media platforms provide users with the option to self-report abusive behavior and content in addition to providing tools to deal with bullying. For example, Twitter has features that include locking accounts for a brief period of time or banning the accounts when the behavior becomes unacceptable. The body of work produced by the research community with regards to cyberbullying in social networks also needs to be expanded to get better insights and help develop effective tools and techniques to tackle the issue. 1 2 3 4 -0.3 -0.3 -0.3 -0.4 -0.4 -0.4 0.8 0.8 0.8 Figure 1: An example of a signed network. To identify cyberbullies in social media, we first need to understand how social media can be modeled. The common way of modeling relationship in social psychology [9] is to represent it as a signed graph with positive edge corresponds to the good intent and negative edge corresponds to malicious intent between people. Using the signed graph, we model the Twitter social network as a signed network to represent users’ behavior [10] where nodes correspond to users, directed edges correspond to communications and/or relations between the users with assigned weight in the range [-1, 1], as illustrated in Figure 1. Definition 1: A signed social network (SSN) is a directed, weighted graph G =(V, E, W ), where V is the set of users and E V × V is the set of edges with an edge weight w W in the range of [-1,1]. Mining social media networks to determine cyberbullies im- poses several challenges and concerns. First, it is typically hard to accurately interpret user’s intentions and meanings in social media based merely on their messages (e.g. posts, tweets, com- ments), which are typically short, use slang languages, or may include multimedia contents such as pictures and videos. For example, Twitter limits its users’ messages to 140 characters, which could be a mix of text, slangs, emojis, and gifs. As a result, it is hard to determine the opinion expressed by a message correctly. For this we utilize sentiment analysis [11], [12] to determine whether the user’s attitude towards other users are positive, negative, or neutral. Second, bullying could

BullyNet: Unmasking Cyberbullies on Social Networks

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: BullyNet: Unmasking Cyberbullies on Social Networks

1

BullyNet:Unmasking Cyberbullies on Social Networks

Aparna Sankaran, Hannah Johnson, Gaby G. Dagher, Min Long

Abstract—One of the most harmful consequences of socialmedia is the rise of cyberbullying, which tends to be more sinisterthan traditional bullying given that online records typicallylive on the internet for quite a long time and are hard tocontrol. In this paper, we present a three-phase algorithm, calledBullyNet, for detecting cyberbullies on Twitter social network.We exploit bullying tendencies by proposing a robust method forconstructing a cyberbullying signed network. We analyze tweetsto determine their relation to cyberbullying, while consideringthe context in which the tweets exist in order to optimize theirbullying score. We also propose a centrality measure to detectcyberbullies from a cyberbullying signed network, and we showthat it outperforms other existing measures. We experiment ona dataset of 5.6 million tweets and our results shows that theproposed approach can detect cyberbullies with high accuracy,while being scalable with respect to the number of tweets.

Index Terms—Cyberbullying, Signed Networks, Social MediaMining.

I. INTRODUCTION

The Internet has created never before seen opportunitiesfor human interaction and socialization. In the past decade,social media, in particular, has had a popularity explosion.From MySpace to Facebook, Twitter, Flickr, and Instagram,people are connecting and interacting in a way that waspreviously impossible. The widespread usage of social mediaacross people from all ages created a vast amount of data forseveral research topics, including recommender systems [1],link predictions [2], visualization, and analysis of social net-works [3].

While the growth of social media has created an excellentplatform for communications and information sharing, it hasalso created a new platform for malicious activities such asspamming [4], trolling [5], and cyberbullying [6]. According tothe Cyberbullying Research Center (CRC) [7], cyberbullyingoccurs when someone uses the technology to send messages toharass, mistreat or threaten a person or a group. Unlike tradi-tional bullying where aggression is a short and temporary face-to-face occurrence, cyberbullying contains hurtful messageswhich are present online for a long time. These messages canbe accessed worldwide, and are often irrevocable. Laws aboutcyberbullying and how it is handled differ from one place toanother. For example, in the United States, the majority ofthe states incorporate cyberbullying into their bullying laws,and cyberbullying is considered a criminal offense in most ofthem [8]. Popular social media platforms such as Facebookand Twitter are very vulnerable to cyberbullying due to thepopularity of these social media sites and the anonymity thatthe internet offers to the perpetrators. Although strict laws existto punish cyberbullying, there are very less tools available

to effectively combat cyberbullying. Social media platformsprovide users with the option to self-report abusive behaviorand content in addition to providing tools to deal with bullying.For example, Twitter has features that include locking accountsfor a brief period of time or banning the accounts when thebehavior becomes unacceptable. The body of work producedby the research community with regards to cyberbullying insocial networks also needs to be expanded to get better insightsand help develop effective tools and techniques to tackle theissue.

1

2

3

4

-0.3 -0.3

-0.3

-0.4

-0.4

-0.40.8 0.8

0.8

Figure 1: An example of a signed network.

To identify cyberbullies in social media, we first need tounderstand how social media can be modeled. The commonway of modeling relationship in social psychology [9] is torepresent it as a signed graph with positive edge correspondsto the good intent and negative edge corresponds to maliciousintent between people. Using the signed graph, we model theTwitter social network as a signed network to represent users’behavior [10] where nodes correspond to users, directed edgescorrespond to communications and/or relations between theusers with assigned weight in the range [-1, 1], as illustratedin Figure 1.

Definition 1: A signed social network (SSN) is a directed,weighted graph G = (V,E, W ), where V is the set of usersand E ⊆ V × V is the set of edges with an edge weightw ∈W in the range of [-1,1].

Mining social media networks to determine cyberbullies im-poses several challenges and concerns. First, it is typically hardto accurately interpret user’s intentions and meanings in socialmedia based merely on their messages (e.g. posts, tweets, com-ments), which are typically short, use slang languages, or mayinclude multimedia contents such as pictures and videos. Forexample, Twitter limits its users’ messages to 140 characters,which could be a mix of text, slangs, emojis, and gifs. Asa result, it is hard to determine the opinion expressed by amessage correctly. For this we utilize sentiment analysis [11],[12] to determine whether the user’s attitude towards otherusers are positive, negative, or neutral. Second, bullying could

Gaby
Text Box
This is the preprint version. See IEEE for the final official version.
Page 2: BullyNet: Unmasking Cyberbullies on Social Networks

2

be hard to detect if the bully chooses to disguise it throughtechniques such as sarcasm or passive-aggression. In thissituation, a single text (message) cannot determine the user’sintention. So, we collect the entire conversation between twoor more users to identify the context in which the user attitudeexists. Third, the large size and dynamic and complex structureof social media networks makes it challenging to identifycyberbullies. For example, on Twitter, hundreds of millionsof tweets are sent every day on the social network platform.Inthis case we construct the social network as a graph and assignvalue based on the maliciousness of the user. Because thenetwork analysis reduce the complex relationship between theusers to the simple existence of nodes and edges [10] There areseveral works in the literature concerning detecting malicioususers from unsigned networks with positive edge weights,including community detection [13], node classification [14]and link prediction [2]. On the other hand, methods thatanalyze signed social networks are scarce [15].

In this paper, we study the problem of cyberbullying insocial media in an attempt to answer the following researchquestion: Can tweet contexts (conversations) help improve thedetection of cyberbullying in Twitter?. Our intuition is thateach tweet should be evaluated not only based on its contents,but also based on the context in which it exists. We call sucha context a conversation, which is a set of tweets betweentwo or more people exchanging information about a certainsubject. Thus, our solution consists of three parts. First, foreach conversation, a conversation graph is generated basedon the sentiment and bullying words in the tweets. Second,we compute the bullying score for each pair of users in aconversation graph, and then combine all graphs to createan SSN called bullying signed network (B). The inclusion ofnegative links can bring out information that would otherwisebe missed with only positive links [16]. Finally, we proposea centrality measure called attitude & merit (A&M ) to detectbullying users from the signed network B.Our main contributions are organized as follows:

1) Collected, preprocessed and labelled the Twitter dataset.2) Proposed a novel efficient algorithm for detecting cyber-

bullies on Twitter.a) Built conversation.b) Constructed Bullying Signed Network.c) Proposed Attitude and Merit Centrality.

3) Experimented on 5.6 million tweets collected over 6months. The results show that our approach can detectcyberbullies with high accuracy, while being scalable withrespect to the number of tweets.

II. RELATED WORKS

In this section, we review the literature on areas related tocyberbullying detection and signed social networks.

A. Cyberbullying Detection

There isn’t a lot of works in the literature that utilizessigned networks to detect cyberbullies. The papers [6], [17],are aimed at detecting trolls in a signed network. Wu et

al. [17] proposed a method for ranking nodes to identifytrolls without using a PageRank algorithm. Kumar et al. [6]proposed an iterative algorithm involving new declutteringoperations and various centrality measures to detect trolls.Unlike the proposed method in this paper, the authors begintheir process with an already created signed network.

A significant amount of work has been done over the pastdecade in the area of cyberbullying detection in general. Therehave been two broad methods in identifying bullies - oneaims to detect bullying messages (e.g. [18]–[21]), while theother approach is to detect the cyberbullies responsible for themessages (e.g. [22]–[25]).

The first method of determining bullying messages wasdone using a combination of text-based analytics and a mixof text and user features. Zhao et al. [18] proposed a textbased Embeddings-Enhanced Bag-of-Words (EBoW) modelthat utilizes a concatenation of bullying features, bag-of-words,and latent semantic features to obtain a final representation,which is then passed through a classifier to identify cyber-bullies. Xu et al. [21] used textual information to identifyemotions in bullying traces, as opposed to determining whetheror not a message was bullying. Singh et al. [19] proposed aprobabilistic socio-textual information fusion for cyberbully-ing detection. This fusion uses social network features derivedfrom a 1.5 ego network and textual features, such as density ofbad words and part-of-speech-tags. Hosseinmardi et al. [20]used images and text to detect cyberbullying incidents. Thetext and image features were gathered from media sessionscontaining images and the corresponding comments, whichwas then fed into various classifiers. Chen [25] proposed annovel method in identifying cyberbullies within a multi-modalcontext. To understand cyberbullying Kao et al. [26] proposeda framework by studying social role detection. By usingwords and comments, temporal characteristics, and socialinformation of a session as well as peer influence Cheng etal. [27], [28] proposed frameworks for detecting cyberbullies.

The second method was aimed at identifying the personbehind the cyberbullying incidents. Squicciarini et al. [22]used MySpace data to create a graph, which integrated user,textual, and network features. This graph was used to detectcyberbullies and predict the spreading of bullying behaviorthrough node classification. Galan-Garcıa et al. [23] usedsupervised machine learning to detect the real users behindtroll profiles on Twitter, and demonstrated the technique in areal case of cyberbullying. In a recent paper on aggression andbullying in Twitter, Chatzakou et al. [24] found cyberbulliesand aggressors using user, text, and network-based features.

From the above methods we determined that these ap-proaches focus on how offensive the content of the messageis and based on that they identify cyberbullies but does notconsider why the message was offensive i.e., the above papersdo not analyze the context of the entire conversation just thecontent of the message. Our approach utilizes the Bag-of-words with the text to identify curse words, use sentimentanalysis to determine the emotions or attitude of the senderand finally we analyze the entire context in which the senderand receiver communicates. These overlooked factors couldsignificantly or completely change the results of Cyberbullying

Page 3: BullyNet: Unmasking Cyberbullies on Social Networks

3

Detection.

B. Signed Social Networks

This section reviews the previous work done on signednetworks (e.g. [6], [10], [15], [17], [29]). The idea of signednetworks is not new but its application and analysis of themwere only developed in recent years. We extended its applica-tion to establish node classification in our model. Previously,in 2010, Leskovec et al. [10] reviewed the balance and statustheory and their relation to social media and proposed amodified status theory that better reflects patterns found insigned networks in social media. Tang et al. have done a broadsurvey of signed networks in social media [15] and proposeda new framework for node classification in signed socialnetworks [29]. The authors incorporated negative links in thesigned network and proposed an approach to mathematicallymodel both independent and dependent information from thelinks.

Over the last few years, a number of methods have beendesigned for signed network analysis with both positive andnegative links [30]–[33]. Most of these methods are basedon simple modifications of the PageRank or Eigenvectorcentrality that accounts for negative weights on the links.However, some of these measures do not consider, how theincoming edges of a node depends upon the outgoing edgesfrom the same node and vice versa i.e., interactions betweenincoming and outgoing links in a signed networks. Mishra etal. [34] employ this scenario and proposed bias and deservemeasures. The deserve of a node depends on the opinions ofother nodes whereas the trustworthiness of a node depends onhow a node gives a correct opinion about other nodes. Fromthe experiment in the Section VI-D, we can find the bias anddeserve measures is not effective for identifying bullies in thenetwork.

III. PROBLEM FORMULATION

In this section, the Twitter social network is represented as adirected, weighted graph G = (U,E) with U being the set ofusers (represented as nodes) and E being the set of tweets Tsent between the users(represented as edges). Each user u ∈ Uhas a set of features including an ID, the number of followers,the number of friends, and the number of the tweets that theysent.

Each tweet t ∈ T is associated with certain features: sourceID (SID), destination ID (DID), the date of creation, a userID (UID), a reply ID (RID), and mentions (MID). If thetweet includes mentions (i.e., if a given @username is includedin a tweet anywhere else but at the very start), then Twitterinterprets this as a mention and the user gets a notification thatsomeone has mentioned them.

i jt

Figure 2: An example of a tweet.

As shown in the figure 2, the notation eij represents atweet t directed edge from node (user) i to node (user) j.The existence of an edge eij denotes an interaction from node

i to node j which is t. Each tweet has a set of features, asshown in Table II. (SID) is assigned when a new tweet iscreated, in this case it is 101. (DID) is an ID to which thiscurrent tweet is in response to where the destination ID is3001. The (UID), (RID) and (MID) corresponds to IDs ofthat particular users/nodes. Finally, the text is the content ofthe tweet sent from the node i to node j.

From the above Twitter data, we extract conversations andbuild a directed weighted graph for each conversation C ={c1, c2, ...c|C|}. In our model, each ci is a set of two or moretweets between two or more users:

Definition 2: A conversation c is a set of time-ordered tweetsc = {t1, t2, ...t|c|} such that:

1) The first tweet t1 is the initiator tweet that starts theconversation, and can be one of the two following types:• DID(t1) = NULL, and eitherMID(t1) or RID(t1)

is not null.• DID(t1) 6= NULL, and ∀t ⊆ T : SID(t) 6=DID(t1).

2) All tweets in c satisfy the following:SID(ti) = DID(ti+1) : 1 ≤ i ≤ |c| − 1.

Our model will analyze the nodes and conversions and willoutput a list as results for detecting cyberbullies on the Twittersocial network:

L ={(u1, s1), (u2, s2), ..., (u|L|, s|L|)

},

where ui is a user (node) and si is a confidence value for thelikelihood of user ui being a bully.

IV. OUR SOLUTION: BULLYNET ALGORITHM

In this section, we first present an overview of the proposedthree-phase bully finding algorithm and elaborate the steps ineach phase.

The objective of our solution is to identify the bullies fromraw Twitter data based on the context as well as the contentsin which the tweets exist. Given a set of tweets T containingTwitter features such as user ID, reply ID etc, the proposedapproach consists of three algorithms: (i) Conversation GraphGeneration Algorithm, (ii) Bullying Signed Network Gen-eration Algorithm, (iii) Bully Finding Algorithm. The firstalgorithm constructs a directed weighted conversation graphGc by efficiently reconstructing the conversations from rawTwitter data while enabling a more accurate model of humaninteractions. The second algorithm constructs a bullying signednetwork B to analyze the behaviour of users in social media.The third algorithm consists of our proposed attitude and meritcentrality measures to identify bullies from B. Figure 3 showsthe process flow of BullyNet where the raw data is extractedfrom Twitter using Twitter API from which the conversationgraph is constructed for each conversation using algorithm 1.Then from the conversation graphs, a bullying signed networkis generated using algorithm 2. Finally, the bullies from Twitterare identified by applying algorithm 3.

A. Algorithm 1 - Conversation Graph Generation

The conversation graph generation algorithm 1, is con-structed from a set of tweets T to generate directed weighted

Page 4: BullyNet: Unmasking Cyberbullies on Social Networks

4

Table I: Comparative evaluation of main features in related approaches including our proposed approach

Approach Detect Attributes based on Signed Network Dataset

Cyberbullying Other Content Context User Network Yes No Twitter YouTube Slashdot InstagramMessage User

Zhao et al. [18] Xu et al. [21] Hosseinmardi et al., [20] Dadvar et al., [35] Dinakar et al., [36] Squicciarini et al. [22] Chen et al. [37] Galan-Garcıa et al. [23] Chatzakou et al. [24] Mishra & Bhattacharya [34] Kumar et. al [6] Wu et. al [17] Ortega et al. [38] Our proposed protocol

Table II: Tweet features

SID DID UID RID MID Text

101 3001 UserI UserJ UserX,UserY @UserX @UserY Lets meet at the central park

Twitter DataConversation

Graphs

BullyingSigned

Network

TwitterBullies

Twitter API Algorithm 1

Algorithm 2

Algorithm 3

Figure 3: Protocol Flowchart of BullyNet.

conversation graphs Gc for each conversation. The weightsbetween the nodes or users are determined by analyzing thesentiment behind the text of a tweet and examining for cursewords. We then provide a score based on the expression thetext represents. For each tweet ti in T , the conversationsare built by doing a binary search DID(ti) with the SIDof the remaining tweets. If a match is found as t′, then itis appended with ti to form a new conversation. If a binarysearch match is found with an already existing tweet in aconversation ci then, ti is appended to tweets in ci. The graphsare represented as Gc = (V,E, I) where V is the set of usersinvolved in the conversation, E is the set of edges representingthe tweets in the conversation, and each edge is assigned abullying indicator value I as the edge weight which is in therange of [−1,+1]. When Iij = −1, it indicates a negativeinteraction by i towards j, and when Iij = 1, indicatinga positive interaction. The bullying indicator for each tweetis computed as I = β ∗ SA + γ ∗ CS, based on sentimentanalysis SA (VADER) and cosine similarities CS with a list ofcommonly used insulting words. The factors of β and γ are 0.9and 0.1 respectively, which are determined by the experiment(see section VI-C).

Example 1: Figure 4 illustrates the conversation extractedfrom the set of tweets T = {t1, . . . , t7}. First, the tweets aresorted in descending order i.e., t7, t6, . . . , t1. Next, DID(t7)is searched with the SID of the remaining tweets (t6 throught1). A match is found in t3 and conversation c1 is formed. This

t1t2t3t4t5t6t7

SID

DID

107

102

106

103

105

104

103

101

102

101

104

-

101

-

c1 = {t1, t3, t7}, c2 = {t1, t4, t6}, c3 = {t2, t5}Output:

Figure 4: Matching tweets based on DID and SID to constructconversations. Given tweets {t1, . . . , t7}, the output is three conver-sations: c1, c2, and c3.

process is repeated for each tweet. The conversations c2 andc3 are created with tweets {t6, t4} and {t5, t2} respectively.Since DID(t4) and DID(t3) match with the SID(t1), thetweet t1 is appended with t4 and t3. So, the final conversationsare c1 = {t7, t3, t1}, c2 = {t6, t4, t1} and c3 = {t5, t2}. Thisprocess can be represented by Algorithm 1.

In the step 3 of Algorithm 1, a directed, weighted graphgci = (V,E) is constructed for every conversation ci wherenodes V , represented as the users, and the edges E, repre-sented as the tweets, are directed from one user to another ina conversation. For every edge e, an edge weight is calculatedas I = β∗SA+γ∗CS. This is known as the Bullying Indicatorwhich is in range of [−1,+1]. The sentiment analysis (SA)and cosine similarity (CS) is computed on the tweet (edge) toevaluate the emotion and behaviour of the user. The β and γare constants, which will be determined by the experiment(See section VI-C). In Step 4, the algorithm outputs theconversation graphs Gc.

Sentiment analysis (SA) is the process of analyzing thesentiment of a message based on the user’s opinion, attitude,and emotion towards an individual. Depending on the analysis,the polarity of the text is classified into positive, negativeor neutral. The sentiment reflects feeling or emotion whileemotion reflects attitude. There are different libraries or toolsavailable to determine the sentiment of the content whichincludes sarcasm, emoji, images etc. Some of the them are:VADER, TextBlob, Python NLTK etc. We use VADER (Va-lence Aware Dictionary and sEntiment Reasoner) [39], whichis a lexicon and rule-based sentiment analysis tool that isspecifically attuned to sentiments expressed in social media. Itperforms well with emojis, emoticons, slangs and acronyms in

Page 5: BullyNet: Unmasking Cyberbullies on Social Networks

5

Conversation Graph Generation AlgorithmInput: Set of tweets, T = {t1, . . . , tn}Output: Conversation graphs Gc = {gc1 , . . . , gcm}

1) Sort all tweets in T in reverse-chronological orderbased on date of creation.

2) For each tweet ti in T , where 1 ≤ i ≤ |T |:a) If ti does not belong to a conversation, then create

a new conversation c ∈ C and associate ti with c.b) If there is a tweet t′ ∈ {ti, ti+1, . . . , t|T |} whereDID(ti) = SID(t′) then associate t′ with all ti’sconversations.

3) For each conversation ci ∈ C:a) Construct a conversation graph gci ∈ Gc, where

users are represented as nodes and tweets as edges.b) For each edge e = (u, v) in gci :

i) Compute the sentiment of the tweet (SA).ii) Compute the cosine similarity of the tweet

with bullying bag of words (CS).iii) Calculate the bullying indicator Iti (weight) of

the edge as follows:Iuv = β ∗ SA+ γ ∗ CS

4) Return Gc

Algorithm 1: Conversation Graph Generation

a sentence. Cosine Similarity (CS) [40] measures the similaritybetween two vectors using their inner product. In the Twitterand some tweets may contain curse or insulting words thatare reasonable indications of the existence of bullying. Thus,we select a reference list of insulting words commonly usedin Twitter and some external linguistic resources for insultinganalysis seeds. This list contains words indicating curse ornegative emotions such as nigga, bitch, slut etc. and arecompared with individual tweets with cosine similarity tocompute a score. In this context, each tweet and insultingseeds are represented as vectors, where each vector has theword frequencies.

Example 2: Figure 5a shows a sample conversation oftweets. From algorithm 1, the conversation graphs are con-structed, as shown in Figure 5b. It contains two conversationgraphs shown with dashed blue edges and with solid red edges.The rounded number on the edges indicate the tweet orderof that particular conversation. Figure 5c and 5d representthe two conversation graphs as gc1 and gc2 with the bullyingindicator as the edge weight. With β and γ values as 0.9 and0.1 respectively which was determined experimentally (seesection VI-C)., the edge weight I31 i.e., the edge from P3to P1, is calculated as -0.23. Similarly, the score of the otheredges are calculated as shown in Figure 5c and 5d.

B. Algorithm 2 - Bullying Signed Network Generation

In many real-world social systems, the relation between twonodes can be represented as signed networks with positive andnegative links. Since this research focuses on identifying thebullying nodes in the network, the algorithm 2 is designed to

determine the final outgoing edge weight, wij for the users inthe conversation graphs Gc.

Bullying Signed Network Generation AlgorithmInput: Set of conversation graphs, Gc

Output: Bullying Signed Network B1) For each conversation graph gci in Gc:

a) For each set of edges with the same order, sortedascendingly, compute the bullying score of sourcenode u toward target node v for each edge e =(u, v) as follows:Suv = Iuv + α(Iuv − Svu).and then determine the average score of node ufor the same set of edges.

b) Compute the overall bullying score S of each nodein gci as follows:i) If the node is the root node, then:S =

∑S

1+2.2(n−1)

ii) Otherwise: S =∑

S2.2(n)

2) Construct the bullying signed network graph B bymerging all the conversation graphs together.

3) Return B.

Algorithm 2: Bullying Signed Network Generation

In Step 1a of the algorithm 2, for every conversation graphgci , a bullying score S is calculated based on how a node/userinteract with other nodes/users in the graph based on the tweetorder (sorted in ascending order) i.e., tweets are arrangedbased on the conversation. For an edge e = (u, v), thebullying score Suv ≡ Iuv if the edge towards v is not a replyfrom u. Otherwise, the bullying score Suv is calculated asIuv + α(Iuv − Svu) where α is a constant determined by theexperiment as 0.6. Here, α is used to calculate how muchpercent of the difference between the sender and receivershould be taken to determine the bullying score S. Iuv isthe Bullying Indicator between the nodes u to v and Svu isthe Bullying Score between the nodes v to u. The differencebetween Iuv and Suv is that, Iuv computes a score for thecontent on a tweet based on the sentiment analysis and cosinesimilarity, whereas Suv computes a score based on the entireconversation between u and v i.e., the context in which opinionof u towards v. If there are more than one edge for a user withthe same order, an average bullying score is computed for thesame set of orders after the bullying score is evaluated.

Example 3: Table III shows the bullying score calculationfor the conversation graph gc1 in Figure 5c. In order 1, thebullying score S21 = I21 = 0 since, the edge from P2 to P1is not a reply edge. The user in the parenthesis represents towhom the edge responds. In order 2, there are two edges fromP3 to P1 and P3 to P2 and the bullying score S31 = −61and S32 = −61 is the same as I31 and I32 respectively. Theorder 3 also has two edges, P2 to P3 and P2 to P1. Since theedge P2 to P3 is a reply to the edge P3 to P2, the bullyingscore is calculated as S23 = I23+α(I23−S32) = 0.99 where

Page 6: BullyNet: Unmasking Cyberbullies on Social Networks

6

Figure 5: Conversation Graph Generation. (a) Sample conversation of tweets. (b) Conversations graph. blue and red sub-graphs represent2 different conversations. (c)(d) Bullying indicators are added as edge weights to the conversation graphs.

α = 0.6 was determined by the experiment. Next, the averageof the score for the same order of the user is computed i.e.,order 2 of the user P3 is -0.61 and order 3 of the user P2is 0.69. Following a similar approach, the bully score S iscalculated in Table IV for the second conversation graph gc2in Figure 5d.

Table III: Bullying score table for gc1

Tweet # P1 P2 P3

1 - 0(P1)

2 - -0.61(P1,P2)

3 - 0.99 (P3)0.39 (P1)

Total - 0.69 -0.61

Table IV: Bullying score table for gc2

Tweet # P1 P4 P5 P3

1 - -0.18(P1,P3,P5)

2 - -0.56(P4)-0.42(P1,P3)

3 - 0.84(P5)0.32(P1,P3)

4 --0.60(P4)-0.16(P5)-0.26(P1)

Total - 0.4 -0.49 -0.34

In Step 1b, the bullying score which was computed in theprevious step for the users in every conversation graph gci1is normalized in [−1, 1]. The normalization is performed intwo ways i.e., for the user that initiated the conversation,known as root nodes, and the users that are involved in theconversation. For the first type of users, the normalization iscomputed as

∑S/(1 + 2.2(n − 1)) and for the second type

of users as∑S/2.2(n) where, n is the number of times the

user occurs in the order and the value 2.2 is computed using1 + (Maxdiff)(α) in which Maxdiff is the range i.e., 2.This normalized score of the users becomes the edge weightto the other users in gci .

In Step 2, the bullying signed network graph B is con-structed by merging all the conversation graphs Gc. If thereis more than one edge i.e., e = (u, v) then a single edgeweight is calculated by taking the difference between averageand standard deviation of all wuv . Step 4 outputs the bullyingsigned network graph B.

Example 4: Figure 6c illustrates the bullying signed net-work by merging the two normalized conversation graphs inFigure 6a and 6b. From Figure 6, it can be seen that thereare two different edges from the user P3 to P1 (-0.27 and -0.15). So, the difference between the average and the standarddeviation of the two edges are calculated as -0.15 which is thefinal edge weight of P3 to P1 in the bullying signed network.

C. Algorithm 3 - Bully Finding

This work, is to identify bullies from B using centralitymeasures. Since this paper is about social networks the im-portance is defined as the behaviour. Among several centrality

Page 7: BullyNet: Unmasking Cyberbullies on Social Networks

7

P1

P2

P3

0.21

0.21

-0.27

-0.27

(a)

P1

P5

P3

P4

-0.22

-0.22-0.22

-0.15

-0.15

-0.15

0.12

0.12

0.12

(b)

P1

P5

P4

P2

P3

0.21

0.21-0.15-0.15

-0.15

-0.150.12

0.12

0.12

-0.22

-0.22

-0.22

(c)

Figure 6: Normalized conversation graphs (a) gc1 (b) gc2 and (c) Bullying signed network.

measures, we consider Bias and Deserve (BAD) by Mishraand Bhattacharya [34] a state of art method, that handlessigned network because, their measure is computed on howthe outgoing edge from a node/user depends on the incomingedges from other nodes/users. However, BAD is modelled ona trust based network i.e., the users that have a propensity totrust/distrust other users. Also, the edge weight denotes thetrust score rather than the bullying score as in this research.

So, we proposed a centrality measure A&M Attitude andMerit, similar to that of BAD to identify bullies from ourproposed signed network B. Merit is a measure of the opinion(good or bad) that the other nodes have towards a particularnode and Attitude is a measure of the behaviour of a nodetowards the other node. However, in a given bullying singednetwork, the attitude or likes or dislike of a node towards othernodes in the network is not known. Therefore the expressionsto compute the Merit and Attitude metrics in a mutuallyrecursive manner.

Mn+1(j) =1

2|in(j)|∑

k∈in(j)

(wkj)(An(i)) (1)

An+1(i) =1

2|out(i)|∑

j∈out(i)

(wij +Xij) (2)

Xij =

{M(j) if (wij ×M(j)) > 0−M(j) otherwise

Let in(j) denote the set of all incoming edges to node jand out(i) denote the set of all outgoing edges from nodei. Normalization is done to maintain the value in the range of[-1, 1]. An auxiliary variable Xij is introduced to measurethe effect of the merit score of a node j on its incomingedge to node i. Since merit is about whether the node isconsidered good or bad, it is calculated to be the sum of allits incoming edges from other nodes. Likewise, since attitudeis about the particular node’s view of others, it is calculatedusing the outgoing edges of a node towards others and itscorresponding merit score in the network. Although we usetwo metrics similar to BAD, the calculation of the incomingand the outgoing edges of a node differs. Since Bias in BADis about how truly it rates other nodes, it is calculated bythe difference in the edge weight and the real trust of a node(deserve). The explanation of the proposed metric follows.

From the above expression, it can be seen that if theoutgoing edge weight from node i to node j has a positive

value and the merit score of node j is also positive, then theattitude of node i to j is calculated by the sum of both values.If the outgoing edge weight from node i to j is negative andthe merit score of node j is positive or vice-versa, then theattitude of node i to j is calculated by subtracting the meritscore from the edge weight, which means if a node has apositive edge weight towards a benign merit node then theattitude score increases. Similarly, holding a negative edgeweight towards a benign merit node decreases that node’sattitude score. However, if a node has a positive edge weighttowards a negative merit node, the attitude of a node decreases.

Table V: Example showing the values of the graph (Figure 6c)after each iteration. A denotes attitude and M denotes merit.

No. P1 P2 P3 P4 P5M A M A M A M A M A

0 -1 - -1 -1 -1 -1 -1 -1 -1 -11 0.02 - 0.01 0.11 -0.01 -0.13 0.09 0.06 0.01 -0.132 0.01 - 0.02 0.11 0.1 -0.11 0.01 0.06 0.0 -0.113 0.01 - 0.01 0.11 0.00 -0.11 0.01 0.06 0.0 -0.114 0.01 - 0.01 0.11 0.00 -0.11 0.01 0.06 0.0 -0.11

From Eq. 1 and Eq. 2, the attitude of a node depends on themerit of its neighbours and vice versa. A fixed-point iterationmethod is used to obtain the solution. The Merit and Attitudeof node i at iteration n are denoted by An(i) and Mn(i)respectively. The proposed algorithm 3 is designed to computemerit and attitude scores for each node in the network. Initially,we start with a Merit and Attitude score of −1 (i.e, the firstiteration) in step 1. In step 2a, the merit scores for eachnode are updated using the attitude scores from the previousiteration. In step 2b, the attitude scores are updated using thenewly updated Merit scores in the same iteration. Both Meritand Attitude scores are mutually recursive and are updateduntil both the scores converges in step 3. The scores of Meritand Attitude from the last iteration are the final scores. In thefinal step 4, all the nodes whose attitude score is less than zeroare added the list L along with the user’s attitude score.

Example 5: Table V demonstrates the value of Attitudeand Merit that are updated after each iteration by applyingalgorithm 3 to Figure 6c. The Attitude column of node P1is blank because there are no outgoing edges from P1. Thelast iteration shows the final attitude and merit score of thenodes. It can be seen that, node P3 and P5 are bullies witha confidence score of 0.11 and 0.11, respectively.

Page 8: BullyNet: Unmasking Cyberbullies on Social Networks

8

Bully Finding AlgorithmInput: Bullying Signed Network Gs = (V,E,W )Output: List of bullies and its attitude score L = [(u1, s1), (u2, s2), . . . , (u|L|, s|L|)]

1) Initialize M0(v) = −1 and A0(v) = −1, ∀v ∈ V .2) Set iteration index i = 1

a) For each v ∈ V compute merit scoreM i(v) = 1

2|in(v)|∑

u∈in(v)(wuv)(Ai−1(u)) where |in(v)| is the number of incoming edges to the node v

b) For each u ∈ V compute attitude scoreAi(u) = 1

2|out(u)|∑

v∈out(u)(wuv +Xuv) where |out(u)| is the number of outgoing edges from the node u

3) If there exist atleast one v ∈ V :M i(v) 6=M i−1(v) or Ai(v) 6= Ai−1(v)

a) Increase the iteration index i = i+ 1b) Repeat step 2a & 2b for each iteration

4) For each v ∈ V add the node and its corresponding attitude score value greater than 0 to the list L5) Return L

Algorithm 3: Bully Finding Algorithm (BFA)

V. ALGORITHM ANALYSIS

In this section, we show the proof of convergence of thecentrality measure and perform complexity analysis of ourproposed approach.

A. Convergence of Centrality Measure

We start the convergence proof by showing the differencebetween the attitude of a node at any iteration and the infiniteiteration is bounded which then leads to convergence byproving the error bound ε,<< 1.

After a certain iteration t, the attitude score of that iterationbecomes close to A∞. Since merit of a node can be expressedin terms of attitude of other nodes, this implies that meritvalues exhibit similar properties.

Proposition 5.1: Attitude and Merit (A&M) of a node atany iteration n and the infinite iteration is bounded by inverseexponential function of n.

Proof : We prove this in Appendix A

B. Complexity Analysis

Proposition 5.2: The overall complexity of our proposedapproach in the average case is O(k × l + log n)n.

Proof : We can determine the time complexity of theproposed approach in three phases: constructing conversationgraph, constructing bullying signed network and bully finding.

Constructing conversation graphs phase. In the construct-ing conversation phase, the runtime complexity is the timetaken to construct m conversations from n tweets and then togenerate graphs from the constructed conversations.

Initial sorting of tweets uses merge sort which takesa computational time of O(n log n). The conversationis constructed by doing a binary search on the DIDand SID of the conversation tweet and the currenttweet respectively, leading to m conversations with acomputational time of O(n log n). The cost for generatinggraph from the conversations is O(m). Therefore the averagecomputational cost to construct conversation graphs is

O(n log n+ n log n+m) = O(n log n+m)

Constructing bullying signed network phase. In theconstructing bullying signed network phase, we traversethough each conversation graph where the bullying scoreis calculated for each node with respect to the edges withsame order. For each conversation graph m, the maximumnumber of nodes in the worst case is k. Therefore the totalcomputational cost is O(n.k +m.k)

Bully finding phase. In the bully finding phase, the runtimeis the time taken to detect the bullying users using attitudeand merit centrality. For each l number of iteration, A&Mcentrality touches each edge atmost twice. Therefore theaverage case in detecting bullies in each iteration is O(2n.k)and for the given l iteration it is O(n.k.l)

Therefore, the overall complexity of our proposed approachin the average case is:O((k.l+log n)n+k.m) = O(k.l+log n)n since m, k << n.

VI. EXPERIMENTAL EVALUATION

In this section, we evaluate the performance of the proposedalgorithms. First, we present the data used in our evaluation,second we discuss the implementation details, and the way weprocess it to build ground truth. Finally, we present the experi-mental results which include determining the coefficients α, βand γ, utility and scalability.

A. Dataset

In this paper, we rely on Twitter’s Streaming API, whichprovides free access to 1% of all tweets. The API returnseach tweet in a JSON format, with the content of the tweet,metadata (e.g., creation time, source ID, destination ID, re-ply/retweet, etc.) as well as information about the poster (e.g.,username, followers, friends). To prevent our own bias, wefirst randomly chose 5000 interconnected users and collectedall the tweets in JSON format totaling 5.6M within a six

Page 9: BullyNet: Unmasking Cyberbullies on Social Networks

9

month time-frame between May and October 2017. We thenextracted features like username, text, replyname, mentionsand network based features like source ID, destination IDfrom the Twitter JSON. There were about 2% of the tweetswhich were in languages other than English. When examiningthe users, about 90% of their geographical location were inUSA, 6% of the users’ location were in United Kingdom andremaining 4% were from Ecuador, Japan and China.

B. Implementation and Setup

We implemented our algorithm in Java, and our experimentswere conducted on a machine equipped with an Intel(R)Core(TM) i7-8550U CPU @ 2.00GHz processor and 16.0 GBRAM, running Windows 10 64-bit operating system.

We employed Amazon Mechanical Turk (MTurk) workersto respond to an online survey that we developed. We provided2700 surveys with each survey consisting of 10 conversations.Each survey was assigned to three workers to classify thebullying behavior of the users in the conversations according tofour predefined labels (strongly positive, likely positive, likelynegative and strongly negative) to avoid biased interpretationof bullies. Overall, the workers rated 27000 conversationscontaining 1700 users, which were extracted from the set ofraw Twitter data by using algorithm 1. The MTurk UI enablesrequesters to create and publish surveys (HITs) in a batchwhen processing many HITs of the same type thus savingtime. For our study, we created a csv file that contained 2700HITs. MTurk automatically created a separate HIT for eachset of conversation in the csv file as shown in Figure 7. Theresults to rate each users involved in the set of conversationswere obtained from the workers. A significant share of theparticipants for the survey came from USA, Canada, Europeand India. There was not a marked variation in the ratingprovided by the workers. There were about 7978 stronglynegative, 47426 likely negative, 56704 likely positive and23762 strongly positive user interactions. Some of these usersappear in few conversations therefore we collect these ratingsbased on the users and number of workers and compute usinga metric to identify 569 users as bullies. Finally, the results arenormalized to form the ground truth. We analyze and computethe ground truth in a metric which results in bullying andnon-bullying users. Having the computed results as groundtruth, we evaluated the performance measure by experimentingthe proposed algorithms results w.r.t. the number of usersincreasing linearly from 500 to 1700.

C. Determining optimal values for coefficients α, β and γ

Recall that Iuv = β × SA + γ × CS in algorithm 1 andSuv = Iuv + α(Iuv − Svu) in algorithm 2. To determine thecoefficient β and γ for bullying indicator I and α for thebullying score S, we generate input tweets of varying lengthand performed experiment for different values of α, β and γ.That is, with 5.7 million tweets dataset we did experiment forthe tweets ranging from 1M, 2M, 3M, 4M and 5M for differentα, β and γ values. After experimenting with different values,we found that the coefficient values of β ≥ 0.6, γ ≤ 0.4 andα ≤ 0.6 to provide the greatest accuracy. The accuracy was

measured with β ≥ 0.6 and γ ≤ 0.4 for every α ≤ 0.6 withrespect to the ground truth, using the F1 Measure [41].

Figure 8 depicts the optimal values for the coefficients α, βand γ with respect to the β and γ values, which are set from60 to 90 and 40 to 10 respectively. We use three differentα values for every bullying indicator coefficients β and γ,which varies from 0.4 to 0.6. In our approach, we observethat the F1 measure increases linearly when the coefficientsβ increases and γ decreases. We also observe that when weincrease the α value, the F1 measure increases in all the casesindicating that the sentiment analysis has more impact on thebullying indicator than the cosine similarity. This is because,sentiment analysis analyzes not only the text but also emojis,emoticons and, determining the cosine similarity alone hurtsthe performance. Hence, we take advantage of both SentimentAnalysis and Cosine. Similarly, the response to a tweet has adirect effect on the bullying score.

D. Utility

We briefly introduce our evaluation metrics that will be usedto determine the accuracy of our approach.• AccuracyCM [42]

The accuracy measure is the ratio of the number of bullyusers detected to the total number of bullies. It does notperform well with imbalanced data sets.

AccuracyCM =# of detected bullies

total number of bullies

• Precision and Recall [43]Precision and Recall are evaluation metrics used in binaryclassification tasks. Precision is the measure of exactnessand recall is the measure of completeness. They aredefined as follows:

Precision =# of true bullies detected

total number of detected users

Recall =# of true bullies detected

total number of true bullies

In simple terms, high precision means that an algorithmreturned substantially more bully users, while high recallmeans that an algorithm returned most of the bullies.

• F1 Measure [41]F1 Measure is the Harmonic Mean between precisionand recall. The range for F1 is [0, 1]. It measures howmany bullies are identified correctly and how robust it is.Mathematically, it can be expressed as :

F1 = 2 × Precision×RecallPrecision+Recall

F1 Measure attempts to find a balance between precisionand recall. The greater the F1 Measure, the better is theperformance of our approach.

To determine the accuracy of our proposed centrality mea-sure, Attitude and Merit, we compare all the evaluation metricsdiscussed above with respect to the number of users increasinglinearly from 500 to 1700 users. Figure 9 illustrates the utilityvalues of the metrics (accuracyCM, precision, recall and F1

Page 10: BullyNet: Unmasking Cyberbullies on Social Networks

10

Figure 7: Sample user interface of Amazon Mechanical Turk survey (positive - appropriate behavior, negative - inappropriate behavior).

63.4

63.6

63.8

64

64.2

64.4

64.6

64.8

65

65.2

65.4

β=60 γ=40 β=70 γ=30 β=80 γ=20 β=90 γ=10

F1 M

eas

ure

(%

)

Sentiment and Cosine Co-efficients

α=0.4 α=0.5 α=0.6

Figure 8: Determining optimal values for coefficients α, β and γ.

Measure) with respect to the number of users generated fromthe algorithm 2 as the input.

For the number of users ranging from 500 to 1700, weobserved that the AccuracyCM metric ranged between 70.8%to 73.6% and can be biased in the case of unbalanced datasets,however it produces better results when false positives (is anerror in bullies detection in which a detection result improperlyindicates that a user is bully, when in reality the user isnot a bully) and false negatives (is an error in which a testdetection improperly indicates that a user is not bully, whenin reality the user is a bully) are almost even. In this caseof uneven distribution of data, we measure the accuracy withF1 Measure, which ranges from 77.5% to 79.4% while theprecision and recall center around 81% and 76% respectively.Therefore from the Figure 9 it can be seen that, the precisionoutperform other metrics i.e., higher the precision means ouralgorithm identifies more bullies precisely among the totalnumber of users. The percentages mentioned above for all themetrics remained almost consistent even with the increase inthe number of users.

Next, we compare the performance of our proposed cen-trality measure Attitude and Merit with the research workdone by Mishra and Bhattacharya [34] - Bias and Deservewhich is explain in the section IV-C. We compare the F1-score in term of accuracy achieved with respect to the numberof users generated from the algorithm 2 as the input. Figure 10elucidates the comparison of the centrality measures w.r.t. thenumber of users increasing linearly from 500 to 1700 users.

In our approach, we observed that A&M has an accuracyof about 80%. Also, our centrality measures outperform BAD

Figure 9: Utility with respect to the number of users

in all the cases i.e., number of users. As the number of usersincreased from 500 to 1700, the accuracy of Bias and Deservedecreased from 65% to 60%, whereas the proposed centralitymeasures Attitude and Merit stays consistent. There can bemultiple reasons behind it. First of all, the bias score of a nodewith highly positive bias decreases when it has an outgoingedge with positive weight whereas in A&M , the Attitudescore increases when a positive node has an outgoing edgewith positive weight. Next, when calculating the deserve fora node, the bias value is taken in range of [0, 1] whereas inA&M , merit is calculated with the attitude value in the rangeof [−1, 1]. Furthermore, BAD does not perform well when anode has fewer outgoing and incoming edges. Nevertheless itis still outperformed by the A&M centrality.

Figure 10: Comparative evaluation of the proposed centrality mea-sure Attitude and Merit with Bias and Deserve

We also compare the accuracy of our BullyNet algorithmwith Chatzakou et al. [24], Zhao et al. [18], and Singh et

Page 11: BullyNet: Unmasking Cyberbullies on Social Networks

11

al. [19]. As shown in Table VI, the F1 score of BullyNetoutperforms all the other methods. However, Precision andRecall of BullyNet are outperformed by [19] and [18],respectively.

Table VI: Performance comparisons of different methods.

Precision Recall F1 Score

Chatzakou et al. [24] 75 53 79Zhao et al. [18] 76.8 79.4 78.0Singh et al. [19] 82 53 64

BullyNet 81.3 77.6 79.4

E. ScalabilityWe measure the scalability of BullyNet with respect to

the number of tweets and observe the run-times of our threealgorithms: conversation graphs generation, bullying signednetwork generation and bully finding with optimal values forcoefficients α, β and γ set at 0.6, 0.9 and 0.1 respectively.

We observed that running a dataset with 1M records takes upto 8 min for the BullyNet algorithm and the runtime increaseslinearly as the record size increases linearly from 1M to 5M.Figure 11 depicts the runtime for the records size from 1M to5M for each dataset. We also observed that the most dominantalgorithm of our experiment is conversation graphs generationwhich took the majority of run time i.e approximately 70% oftotal execution time of the three algorithms. This is due to thefact that the conversation graphs have to calculate sentimentanalysis and cosine similarity for each tweet and then calculatethe corresponding bullying indicator I as the edge weight foreach conversation graph.

We observed that there is a linear increase in total run-time with an increase in number of tweets. However, wealso observed that the bullying signed network generationalgorithm (algorithm 2) runtime didn’t grow linearly with theincrease in records, rather it tends to remain constant. Thisis because, there are k number of nodes in m conversationgraphs. Therefore, to calculate the bullying score for eachgraphs it takes O(k) and does not affect the run-time withthe growth in number of tweets. We can observe that similarto the first algorithm, the runtime of the third algorithm alsoincreases linearly with record size. The variation is attributedto the increase in the number of users in each tweet resultingin corresponding increase in computation time for centralitymeasures.

Figure 11: Scalability with respect to the number of tweets

VII. CONCLUSION AND FUTURE WORK

Although the digital revolution and the rise of social me-dia enabled great advances in communication platforms andsocial interactions, a wider proliferation of harmful behaviorknown as bullying has also emerged. This paper presents anovel framework of BullyNet to identify bully users fromthe Twitter social network. We performed extensive researchon mining signed networks for better understanding of therelationships between users in social media, to build a signednetwork (SN) based on bullying tendencies. We observed thatby constructing conversations based on the context as wellas content, we could effectively identify the emotions andthe behavior behind bullying. In our experimental study, theevaluation of our proposed centrality measures to detect bulliesfrom signed network, we achieved around 80% accuracy with81% precision in identifying bullies for various cases.

There are still several open questions deserving further in-vestigation. First, our approach focuses on extracting emotionsand behavior from texts and emojis in tweets. However, itwould be interesting to investigate images and videos, giventhat many users use them to bully others. Second, it does notdistinguish between bully and aggressive users. Devising newalgorithms or techniques to distinguish bullies from aggressorswould prove critical in better identification of cyberbullies.Another topic of interest would be to study the relationshipbetween conversation graph dynamics and geographic locationand how these dynamics are affected by the geographic dis-persion of the users? Are the proximity increase the bullyingbehaviour?

APPENDIX ACONVERGENCE OF CENTRALITY MEASURE

Proposition 5.1 : Attitude and Merit (A&M) of a node atany iteration n and the infinite iteration is bounded by inverseexponential function of n.

Proof : By using mathematical induction we prove theconvergence of attitude. Given its definition, the attitude scoreA∞(i) and At+1(i) can be written as,

A∞(i) = | 1

2|out(i)|∑

j∈out(i)

{wij ±1

2|in(j)|∑

k∈in(j)

(wkj×

A∞(k))}|

An+1(i) =∣∣ 1

2|out(i)|∑

j∈out(i)

{wij ±

1

2|in(j)|∑

k∈in(j)

(wkj×

An(k))}∣∣

Base case: For n=1, we have

=∣∣ 12|out(i)|

∑j∈out(i)

{wij ± 1

2|in(j)|∑

k∈in(j) wkj(A∞(k))

−A0(k)}∣∣

≤ 12|out(i)|

∑j∈out(i)

{|wij | ± 1

2|in(j)|∑

k∈in(j) |wkj ||(A∞(k))−A0(k)|

}[∵ |x.y| ≤ |x||y| & |wij | and |wkj | ≤ 1]

Page 12: BullyNet: Unmasking Cyberbullies on Social Networks

12

≤ 12|out(i)|

∑j∈out(i)

{1

2|in(j)|∑

k∈in(j) |(A∞(k))−A0(k)|}

≤ 12|out(i)|

∑j∈out(i)

{1

2|in(j)|∑

k∈in(j) 2}

Since A(k) ∈ [−1,+1], we have |A∞(k)−A0(k)| ≤ 2

≤ 12|out(i)|

∑j∈out(i)

{1

2|in(j)|2|in(j)|}

= 12

Induction step : We assume the bound to be true for An(i)so, by the hypothesis |A∞(i)−An(i)| ≤ 1

2n+2 . In the (n+1)th

iteration,

|A∞(i)−An(i)|=∣∣ 12|out(i)|

∑j∈out(i)

{wij ± 1

2|in(j)|∑

k∈in(j) wkj(A∞(k))

−An(k)}∣∣

≤ 12|out(i)|

∑j∈out(i)

{1

2|in(j)|∑

k∈in(j) |(A∞(k))−An(k)|}

≤ 12|out(i)|

∑j∈out(i)

{1

2|in(j)|∑

k∈in(j)12n

}= 1

2n+2

Therefore the error is bounded by an inverse exponentialfunction. Thus, we conclude that a convergence has beenachieved in determining the measures ’attitude’ and ’merit’.�

REFERENCES

[1] J. Tang, C. Aggarwal, and H. Liu, “Recommendations in signed socialnetworks,” in Proceedings of the International Conference on WWW,2016, pp. 31–40.

[2] D. Liben-Nowell and J. Kleinberg, “The link-prediction problem forsocial networks,” Proceedings of the ASIS&T, vol. 58, no. 7, pp. 1019–1031, 2007.

[3] U. Brandes and D. Wagner, “Analysis and visualization of socialnetworks,” in Graph drawing software, 2004, pp. 321–340.

[4] X. Hu, J. Tang, H. Gao, and H. Liu, “Social spammer detection withsentiment information,” In Proceedings of IEEE ICDM, pp. 180––189,2014.

[5] E. E. Buckels, P. D. Trapnell, and D. L. Paulhus, Trolls just want tohave fun, 2014, pp. 67:97–102.

[6] S. Kumar, F. Spezzano, and V. Subrahmanian, “Accurately detectingtrolls in slashdot zoo via decluttering,” in Proceedings of IEEE/ACMASONAM, 2014, pp. 188–195.

[7] J. W. Patchin and S. Hinduja, “2016 cyberbullying data,” 2017.[8] C. R. Center, “https://cyberbullying.org/bullying-laws.”[9] D. Cartwright and F. Harary, “Structural balance: a generalization of

heider’s theory.” Psychological review, vol. 63, no. 5, p. 277, 1956.[10] J. Leskovec, D. Huttenlocher, and J. Kleinberg, “Signed networks in

social media,” in Proceedings of the SIGCHI CHI, 2010, pp. 1361–1370.[11] R. Plutchik, “A general psychoevolutionary theory of emotion,” in

Theories of emotion, 1980, pp. 3–33.[12] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms

and applications: A survey,” Proceedings of the Ain Shams engineeringjournal, vol. 5, no. 4, pp. 1093–1113, 2014.

[13] L. Tang and H. Liu, “Community detection and mining in social media,”Synthesis lectures on data mining and knowledge discovery, vol. 2, no. 1,pp. 1–137, 2010.

[14] S. Bhagat, G. Cormode, and S. Muthukrishnan, “Node classification insocial networks,” in Social network data analytics, 2011, pp. 115–148.

[15] J. Tang, Y. Chang, C. Aggarwal, and H. Liu, “A survey of signed networkmining in social media,” In Proceedings of the ACM Comput. Surv.,no. 3, pp. 42:1–42:37, 2016.

[16] J. Kunegis, J. Preusse, and F. Schwagereit, “What is the added valueof negative links in online social networks?” in Proceedings of theInternational Conference on WWW, 2013, pp. 727–736.

[17] Z. Wu, C. C. Aggarwal, and J. Sun, “The troll-trust model for ranking insigned networks,” in Proceedings of the ACM International Conferenceon WSDM, 2016, pp. 447–456.

[18] R. Zhao, A. Zhou, and K. Mao, “Automatic detection of cyberbullyingon social networks based on bullying features,” in Proceedings of theICDCN, 2016.

[19] V. K. Singh, Q. Huang, and P. K. Atrey, “Cyberbullying detection usingprobabilistic socio-textual information fusion,” In Proceedings of theIEEE/ACM ASONAM, pp. 884—-887, 2016.

[20] H. Hosseinmardi, S. A. Mattson, R. I. Rafiq, R. Han, Q. Lv, andS. Mishra, “Detection of cyberbullying incidents on the instagram socialnetwork,” In Proceedings of the CoRR, 2015.

[21] J.-M. Xu, X. Zhu, and A. Bellmore, “Fast learning for sentiment analysison bullying,” in Proceedings of the First International WISDOM, 2012,pp. 10:1–10:6.

[22] A. C. Squicciarini, S. M. Rajtmajer, Y. Liu, and C. H. Griffin, “Iden-tification and characterization of cyberbullying dynamics in an onlinesocial network,” in Proceedings of the IEEE/ACM ASONAM, 2015, pp.280–285.

[23] P. Galan-Garcia, J. De La Puerta, C. Gomez, I. Santos, and P. Bringas,“Supervised machine learning for the detection of troll profiles in twittersocial network: Application to a real case of cyberbullying,” vol. 24, pp.42–53, 2014.

[24] D. Chatzakou, N. Kourtellis, J. Blackburn, E. De Cristofaro, G. Stringh-ini, and A. Vakali, “Mean birds: Detecting aggression and bullying ontwitter,” in Proceedings of the ACM on WebSci, 2017, pp. 13–22.

[25] L. Cheng, J. Li, Y. N. Silva, D. L. Hall, and H. Liu, “Xbully:Cyberbullying detection within a multi-modal context,” in Proceedingsof the Twelfth ACM International Conference on Web Search and DataMining, 2019, pp. 339–347.

[26] H.-T. Kao, S. Yan, D. Huang, N. Bartley, H. Hosseinmardi, andE. Ferrara, “Understanding cyberbullying on instagram and ask. fm viasocial role detection,” in Companion Proceedings of The 2019 WorldWide Web Conference, 2019, pp. 183–188.

[27] L. Cheng, R. Guo, Y. Silva, D. Hall, and H. Liu, “Hierarchical attentionnetworks for cyberbullying detection on the instagram social network,”in Proceedings of the 2019 SIAM International Conference on DataMining. SIAM, 2019, pp. 235–243.

[28] L. Cheng, J. Li, Y. N. Silva, D. L. Hall, and H. Liu, “Pi-bully:Personalized cyberbullying detection with peer influence.” in IJCAI,2019, pp. 5829–5835.

[29] J. Tang, C. Aggarwal, and H. Liu, “Node classification in signed socialnetworks.” in Proceedings of the SIAM ICDM, 2016, pp. 54–62.

[30] S. Brin and L. Page, “The anatomy of a large-scale hypertextual websearch engine,” Computer Networks, vol. 30, no. 1-7, pp. 107–117, 1998.

[31] M. Shahriari and M. Jalili, “Ranking nodes in signed social networks,”Social network analysis and mining, vol. 4, no. 1, p. 172, 2014.

[32] C. d. Kerchove and P. V. Dooren, “The pagetrust algorithm: How to rankweb pages when negative links are allowed?” in Proceedings of the SIAMInternational Conference on Data Mining, 2008, pp. 346–352.

[33] P. Bonacich and P. Lloyd, “Calculating status with negative relations,”Social networks, vol. 26, no. 4, pp. 331–338, 2004.

[34] A. Mishra and A. Bhattacharya, “Finding the bias and prestige of nodesin networks based on trust scores,” In Proceedings of the internationalconference on WWW, 2011.

[35] M. Dadvar, D. Trieschnigg, R. Ordelman, and F. de Jong, “Improvingcyberbullying detection with user context,” in Proceedings of the Euro-pean Conference on IR, 2013, pp. 693–696.

[36] K. Dinakar, R. Reichart, and H. Lieberman, “Modeling the detection oftextual cyberbullying,” in Proceedings of the international AAAI WSM,2011.

[37] Y. Chen, Y. Zhou, S. Zhu, and H. Xu, “Detecting offensive language insocial media to protect adolescent online safety,” in Proceedings of thePASSAT and SCSM, 2012, pp. 71–80.

[38] L. F. Ortega, J. A. Troyano, F. L. Cruz, C. G. Vallejo, and F. EnrıQuez,“Propagation of trust and distrust for the detection of trolls in a socialnetwork,” Computer Networks, vol. 56, no. 12, pp. 2884–2895, 2012.

[39] C. J. Hutto and E. Gilbert, “Vader: A parsimonious rule-based modelfor sentiment analysis of social media text,” in Proceedings of theinternational AAAI WSM, 2014.

[40] J. Han, J. Pei, and M. Kamber, Data mining: concepts and techniques,2011.

[41] Y. Sasaki et al., “The truth of the f-measure,” Teach Tutor mater, pp.1–5, 2007.

[42] C. E. Metz, “Basic principles of roc analysis,” in Seminars in nuclearmedicine, 1978, pp. 283–298.

[43] J. W. Perry, K. Allen, and M. M. Berry, “Machine literature searchingx. machine language; factors underlying its design and development,”American Documentation (pre-1986), p. 242, 1955.