

© 2018 IJRAR November 2018, Volume 5, Issue 4 www.ijrar.org (E-ISSN 2348-1269, P- ISSN 2349-5138)

IJRAR1BHP126 International Journal of Research and Analytical Reviews (IJRAR) www.ijrar.org 698

Social Networking Attacks Detection Using Machine Learning Approaches

Neha1, Sagar Pande2, Nikhil Karale3

1M.Tech (CSE), LPU, Phagwara, Punjab.

2Assistant Professor (CSE), Lovely Professional University, Punjab, India

3DRGIT&R, Amravati

Abstract

With the improvement of social network websites, protecting private information online has become an important and serious research topic. Because of the growth and user-friendliness of the web, the number of social networking and social community users has risen substantially. Nonetheless, as social networking sites are used globally, a lack of awareness of privacy and protection on online social networks (OSNs) and social media is increasingly apparent. The confidentiality, protection and privacy of social networking sites should be reviewed from various perspectives. Recent studies have shown that online social networking users disclose personal details such as phone numbers and email addresses. In this paper, various attacks that are possible on social networking sites and can be detected using machine learning approaches are discussed.

Keywords

OSN, Machine Learning, Deep Learning, Attacks, DDoS, XSS, ELM, SVM, K-means, KNN, BEC.

I. INTRODUCTION

Social networks are used by people all around the world to interact socially with peers and peer groups. Network development and the emergence of intelligent mobile devices have made communicating with others more accessible and simpler than ever before. However, this convenience leads to new types of network security problems, including personal information leakage and even national security problems that put our lives at risk. Network security covers a number of aspects, including policies and practices for preventing and detecting harmful node activity. [1]

Social network protection is more important than ever given the volume of information collected and exchanged online. [2] The world has undoubtedly become more connected. This is a good thing in most situations. However, all these connections also give people and businesses unprecedented access to information, and when hackers get interested, the consequences can be severe. [2] Numerous attacks are possible on social networks. In this paper, those attacks are discussed along with machine learning techniques for detecting them.

Machine Learning (ML) has gained broad attention in many applications and areas of research, particularly in cyber security. As hardware and computing resources become more accessible, ML approaches can be used to analyse the large datasets available and identify attacks. The hundreds of ML algorithms and methods are generally categorized into supervised and unsupervised learning. Supervised learning covers classification, where an input is assigned to a class, and regression, where an input is mapped to a continuous output.


Unsupervised learning is mainly carried out by clustering and is used for exploratory analysis and to reduce dimensionality. In cyber security, both kinds of method can be used in near real time to examine malware and to overcome the shortcomings of conventional security strategies. [46]

II. ATTACKS ON SOCIAL NETWORKS

A. Phishing Attacks

Phishing schemes use social media to make users reveal personal data (such as financial details, passwords or company details). [2] There are many types of phishing on social media:

1) C2 Infrastructures: Abuse of short URLs is nothing new, but it is becoming more popular on Twitter for phishing attacks. Threat actors use a mixture of URL shorteners and Twitter to conceal malicious links, and some threat actors, including pentesters, even host their C2 infrastructure on the network. [3]

2) Impersonation: Social engineering plays an important role in the success of a phishing attack. It is easy to impersonate a person or an associated brand and, by posing as someone with some kind of authority, to manipulate people into taking certain actions. This does not include satirical accounts, which are typically labelled as such and do not affect users adversely. One of the most common examples is a threat actor who replies to a celebrity's Tweet and, posing as that person, promises to give away free Bitcoins. Hint: they're not. [3]

3) Credential Theft and Propagation: Threat actors not only deliver phishing attacks through social media but also manipulate users into signing in on bogus landing pages, which in effect hands over their credentials. Once this happens, the account can be taken over by the threat actor, who can then try to get further users to hand over passwords, or behave more like a BEC attack and request a wire transfer. [3]

4) Data Dumps: It is not uncommon for dumps of breached data to make the rounds on the internet. They can be found on dump sites and blogs, and are even sold on the dark web or elsewhere. [3]

5) Data Gathering: What was your first pet's name? Wasn't it Fluffy? A message you shared on social media 10 years ago may contain exactly the kind of information used to reset passwords. What else about your life, beyond the basics, is out there?

A threat actor can also gather such details and then use them to build a sophisticated, personalized attack. [3]

B. Malware

Malware is short for malicious software. It is a generic term for intrusive software designed to gain access to a computer and view personal information on it. The structure of an OSN and the connections between its users make a malware attack easier on social networks than on other online services. In the worst case, malware obtains users' credentials and sends messages in their name. Koobface malware, for example, was spread through OSNs such as Twitter, LinkedIn and Facebook [4]. It was used to collect login credentials and to make the infected computer part of a botnet [5]. An OSN serves various purposes, for example publicity and entertainment. However, it also exposes its users to harmful activities. Fraud and malware distribution are illegal activities in which attackers use a URL to execute malicious code on an OSN user's device [6].

C. Spam Attack

Spam messages are unwanted messages, which on OSNs arrive as an update or a direct message. Spam on OSNs is riskier than traditional email spam because users spend a lot of time on OSNs. Spam generally contains advertisements or malicious links that lead to phishing or malware sites. It is usually generated by spam tools or fake accounts. A fake profile is normally created in the name of a famous individual in order to spread spam [7]. Spam messages are typically sent from compromised accounts and spam bots [8]; most spam, however, spreads from compromised accounts [9,10]. Spam-filtering methods identify a harmful text or URL in a message and filter it out before it is delivered to the target system [11,12].

D. XSS - Cross-Site Scripting

Cross-Site Scripting (XSS) is one of the most common and severe vulnerabilities affecting web-based applications [13]. An XSS exploit allows the attacker to run malicious code in the targeted user's web browser, leading to compromised data, stolen cookies, and stolen credit card numbers and passwords. An attacker can also use XSS to build an XSS worm, which can spread rapidly across an OSN by exploiting the structure of the social network [14].

E. Clickjacking

Clickjacking tricks a user into clicking on an element that is invisible or disguised as another element of a web page. It can lead users to unintentionally download malware, visit malicious web pages, provide passwords or sensitive information, transfer money or buy items online. With a clickjacking attack, an attacker can manipulate OSN users into posting spam on their timeline and unknowingly sending requests. A clickjacking attack can also allow attackers to record the user's actions using the hardware of the user's computer, such as the microphone and the camera [15].

F. De-Anonymization Attacks

De-anonymization is a data-mining technique in which anonymized data is cross-referenced with known, publicly available data sources in order to re-identify an individual. OSNs offer convenient ways to share information, search for content and find contacts. Because the data shared on OSNs is publicly accessible by default, it is an easy target for de-anonymization attacks [16]. Existing online services use pseudonyms to protect privacy when making data available to the public. Nevertheless, an individual can be re-identified by several de-anonymization methods [4].
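As a rough illustration of the cross-referencing described above, the following sketch joins a pseudonymized record set with a list of public profiles on shared attributes. The field names (zip, birth_year, gender), the choice of these fields as matching keys and all records are hypothetical; none of them come from the cited works.

```python
# Hypothetical pseudonymized records: names are replaced, attributes are kept.
pseudonymized = [
    {"pseudonym": "u1", "zip": "144411", "birth_year": 1990, "gender": "F"},
    {"pseudonym": "u2", "zip": "444601", "birth_year": 1985, "gender": "M"},
]

# Hypothetical public OSN profiles that expose the same attributes with a name.
public_profiles = [
    {"name": "Alice", "zip": "144411", "birth_year": 1990, "gender": "F"},
    {"name": "Bob", "zip": "444601", "birth_year": 1985, "gender": "M"},
    {"name": "Carol", "zip": "444601", "birth_year": 1992, "gender": "F"},
]

MATCH_KEYS = ("zip", "birth_year", "gender")  # assumed linking attributes


def link_records(anonymous, public, keys=MATCH_KEYS):
    """Map a pseudonym to a name when exactly one public profile matches on all keys."""
    matches = {}
    for record in anonymous:
        signature = tuple(record[k] for k in keys)
        candidates = [p["name"] for p in public
                      if tuple(p[k] for k in keys) == signature]
        if len(candidates) == 1:  # a unique match re-identifies the user
            matches[record["pseudonym"]] = candidates[0]
    return matches


reidentified = link_records(pseudonymized, public_profiles)
print(reidentified)
```

The sketch only reports unique matches; a pseudonym that matches several public profiles stays ambiguous, which is why coarser published attributes make this kind of linking harder.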

G. Fake Profiles Attack

The fake profile attack is common on most social networks. In this type of attack, an attacker creates a profile with fake credentials and sends messages to real users. Once a friend request has been accepted, the profile sends spam. Fake profiles are usually automated or semi-automated and mimic a human. Their purpose is to harvest and distribute OSN users' personally identifiable data that is only available to friends. Fake profiles are also a problem for the OSN service provider, because they consume its bandwidth [17].

H. Inference Attacks

Inference attacks on social networks are used to predict important and confidential details that a user may not want to disclose to the public, such as age, ethnicity, or religious or political affiliation. These details or attributes should remain private, but attackers can apply data mining techniques to published OSN data to predict a user's private information. Inference attacks may be carried out using machine learning algorithms, for example by combining publicly available social network data with the network topology and users' content. With a mutual-friends-based attack, the common neighbours of any two users can be discovered [18]. In Reference [19], an inference attack was proposed to infer a user's attributes from their publicly available attributes. The method was tested on Facebook to infer different user attributes, such as educational background, interests and location information.
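The common-neighbour idea mentioned above can be sketched in a few lines. The friendship graph below is invented for illustration, and the function is a minimal sketch of the set intersection involved, not the attack procedure of [18].

```python
# Hypothetical friendship graph: user -> set of friends (undirected).
friends = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol", "erin"},
    "carol": {"alice", "bob"},
    "dave": {"alice"},
    "erin": {"bob"},
}


def common_neighbours(graph, u, v):
    """Return the users that appear in both u's and v's friend lists."""
    return graph.get(u, set()) & graph.get(v, set())


mutual = common_neighbours(friends, "alice", "bob")
print(mutual)
```

Even when both friend lists are hidden, a service that displays mutual friends effectively answers this query for any pair of users, which is what makes the topology useful to an attacker.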

I. Information Leakage

Social media is all about openly sharing information and interacting with peers. Some users freely share their personal data, such as health information [20]. Unfortunately, some of them disclose too much information about products, projects, organisations or other private matters. Sharing such sensitive and private information can have negative consequences for OSN users. For example, an insurance company can use OSN data to identify risky clients [21].

J. Leakage of Location

Location leakage is a kind of information leakage. Many users now access social media networks from mobile devices, usually through smartphone applications. Because mobile devices are used for online access, location leakage is a growing privacy concern. Mobile devices make it easy for users to share information about where they are [22]. Attackers can therefore use location details exposed on social networking sites to harm users. [4]

K. Cyber Bullying

Cyber bullying is an offence in which offenders harass a target through e-mails and text messages. OSN users also reveal location-based details through their pictures. An adversary may use content-based techniques to collect this information, exploit it later and carry out a harmful attack [24].

L. Sybil Attack

In this type of attack, the attacker uses multiple fake identities to gain influence (see Fig. 1). Sybil attacks are related to the way in which users sign up or create identities. Given the nature of OSNs and the openness of the system, large-scale Sybil attacks are considered practically feasible [23].


Fig. 1: Sybil Attack [23]

M. DDoS Attack

A DDoS attack is an attack launched from many sources that overwhelms the target's resources and prevents it from serving legitimate users. Attackers can exploit OSN features that anyone can use to launch heavy DDoS attacks. For example, users can include <img> tags in Facebook notes to carry out this attack. Whenever this tag is used, Facebook fetches the image from the external server and caches it. If a user references the same image many times, each time with different dynamic parameters, Facebook's servers are forced to fetch the same file many times in a single page view. [24][44][45]

III. MACHINE LEARNING APPROACHES

A. Extreme Learning Machine

The ELM algorithm is used to train feed-forward neural networks with one or more layers of hidden nodes. The parameters of the hidden nodes are assigned randomly, and the algorithm analytically computes the corresponding output weights. Its creators claim that the algorithm delivers good generalization performance and can train neural networks thousands of times faster than traditional learning algorithms [25].

Fig. 2: ELM Algorithm [43]
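The analytic step can be written compactly for the single-hidden-layer case, which is assumed here. The symbols N, L, g, w_i, b_i, \beta, H and T are notation introduced for this sketch and do not appear in the text above.

```latex
% N training samples (x_j, t_j); L hidden nodes with activation function g.
% The hidden parameters (w_i, b_i) are drawn at random and are never trained.
\[
  \sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \qquad j = 1, \dots, N
\]
% In matrix form, with H_{ji} = g(w_i \cdot x_j + b_i) and T the stacked targets:
\[
  H \beta = T
\]
% The output weights are the minimum-norm least-squares solution,
% computed with the Moore-Penrose pseudoinverse of H:
\[
  \hat{\beta} = H^{\dagger} T
\]
```

Because only \beta is solved for, and in closed form, no iterative weight updates are needed, which is where the claimed speed advantage comes from.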

B. Random Forest

The Random Forest algorithm is a supervised machine learning algorithm that combines several decision trees for classification and regression tasks [26]. Random Forest is an ensemble learning algorithm, since it is built on the idea of several trees voting by majority. The output of the algorithm is determined by aggregating the outputs of all the trees, and is interpreted as the class prediction. Recent studies have examined the use of RF for detecting security attacks, in particular in spam filtering, malware detection, injection attacks, etc. [27,28].

Fig. 3: Random Forest Algorithm [42]
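The majority-voting idea can be sketched with a deliberately reduced forest. As a simplifying assumption, each tree here is a one-split stump trained on a bootstrap sample of a single numeric feature; a full Random Forest grows deeper trees and also samples features at random. The feature (links per message) and the data are hypothetical.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-split tree on (value, label) pairs and return a predict function."""
    majority = Counter(label for _, label in samples).most_common(1)[0][0]
    values = sorted({value for value, _ in samples})
    best = None  # (errors, threshold, left_label, right_label)
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in samples if v <= threshold]
        right = [l for v, l in samples if v > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(l != left_label for l in left)
                  + sum(l != right_label for l in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # the sample holds a single distinct value
        return lambda value: majority
    _, threshold, left_label, right_label = best
    return lambda value: left_label if value <= threshold else right_label


def train_forest(samples, n_trees=15, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(samples) for _ in samples])
            for _ in range(n_trees)]


def predict(forest, value):
    """Every tree votes; the most common label wins."""
    votes = Counter(tree(value) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (links per message, label).
data = [(1, "ham"), (2, "ham"), (3, "ham"), (4, "ham"), (5, "ham"),
        (11, "spam"), (12, "spam"), (13, "spam"), (14, "spam"), (15, "spam")]
forest = train_forest(data)
print(predict(forest, 2), predict(forest, 14))
```

Individual stumps may be poor if their bootstrap sample is unlucky, but the vote across trees smooths out such cases.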

C. Support Vector Machine

The SVM is a supervised learning model used for classification and regression analysis [32]. It is popular because it achieves high accuracy with little computing power and complexity. In computer security, SVM is also used for intrusion detection [31]. One-class SVM has, for example, been used to analyse data using a new kernel function [33] and for the classification of internet traffic [34]. The main goal of SVM is to find the optimal hyperplane that correctly separates the data points of different classes (Fig. 5). The dimension of the hyperplane is one less than the number of input features (e.g. the hyperplane is two-dimensional when dealing with three features) [35]. Data points on one side of the hyperplane are assigned to one class, while data points on the other side (green and purple, as in Fig. 5) are assigned to another class. On either side, the distance between the hyperplane and the nearest data point of each class (the margin) is a measure of how confident the algorithm is in its classification decision. The wider the margin, the more confident we can be that SVM makes the correct decision. [35]


Fig. 4: SVM [35]

Fig. 5: SVM Hyperplane [35]
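One simple way to find a separating hyperplane with a wide margin is sub-gradient descent on the regularised hinge loss, sketched below for the linear case. This training procedure, the hyperparameters and the toy points are assumptions made for illustration; the cited works may use different solvers and kernels.

```python
import math


def train_linear_svm(points, labels, lr=0.1, lam=0.01, epochs=100):
    """Sub-gradient descent on the regularised hinge loss; labels are +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:  # point is safely classified: only apply regularisation
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def classify(w, b, x):
    """Return the side of the hyperplane w.x + b = 0 on which x lies."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable two-feature data.
points = [(2, 2), (3, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
margin_width = 2 / math.sqrt(sum(wi * wi for wi in w))  # distance between the two margin lines
print(classify(w, b, (4, 4)), classify(w, b, (-4, -4)), margin_width)
```

The regularisation term shrinks w, and since the margin width is 2/||w||, a smaller w corresponds to the wider gap the text describes.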

D. Gradient Boosting

Gradient boosting is a classification and regression method that builds its prediction model as a weighted ensemble of weak prediction models, typically decision trees [29]. It has three components: a loss function, a weak learner and an additive model, which adds weak learners to minimize the loss function. The basic idea of gradient boosting is to repeatedly exploit the patterns in the residuals in order to strengthen a model with weak predictions [30]. Once a stage is reached at which the residuals no longer show any pattern that can be modelled, the modelling of residuals is stopped (otherwise it could lead to overfitting). Mathematically, this amounts to minimizing the loss function so that the test loss reaches its minimum. [31]
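The residual-fitting loop can be sketched for the squared-error loss with one-split regression stumps as weak learners. The fixed number of stages, the learning rate and the data are assumptions chosen for illustration; a practical implementation would stop based on a validation loss, as the text suggests.

```python
def fit_stump(xs, residuals):
    """Best single-split regressor on 1-D inputs under squared error."""
    best = None  # (sse, threshold, left_mean, right_mean)
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def gradient_boost(xs, ys, n_stages=20, learning_rate=0.5):
    """Additive model: start from the mean, then repeatedly fit stumps to residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps, losses = [], [mse(ys, preds)]
    for _ in range(n_stages):
        # For squared error, the residuals are the negative gradient (up to a constant).
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        losses.append(mse(ys, preds))
    return base, stumps, losses


# Hypothetical data with a step between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
base, stumps, losses = gradient_boost(xs, ys)
print(losses[0], losses[-1])
```

The recorded training loss never increases from one stage to the next, which shows each weak learner removing part of the remaining residual pattern.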

E. Logistic Regression

Logistic regression is a statistical model that, in its basic form, uses a logistic function to model a binary dependent variable, although many more complex extensions exist. In regression analysis, logistic regression (or logit regression) estimates the parameters of a logistic model (a form of binary regression). Logistic regression can be used to recognize whether or not network traffic is harmful [36].
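A minimal sketch of fitting the logistic model with batch gradient descent on a single feature follows. The feature values are hypothetical standardised scores (negative for benign traffic, positive for harmful traffic), and the learning rate and epoch count are arbitrary choices for this example.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=1000):
    """Batch gradient descent on the average log-loss for one feature."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical standardised traffic feature; label 1 means harmful.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, ys)
p_benign = sigmoid(w * -1.2 + b)
p_harmful = sigmoid(w * 1.2 + b)
print(round(p_benign, 3), round(p_harmful, 3))
```

The model outputs a probability, so a threshold (0.5 here) turns it into the harmful or not-harmful decision.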


F. Linear Regression

Linear regression is a supervised machine learning algorithm that performs a regression task. Regression predicts a target value based on independent variables. It is mainly used to find the relationship between variables and for forecasting. Regression models differ in the kind of relationship they assume between the dependent and independent variables and in the number of independent variables used. [36]
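For a single independent variable, the least-squares line has a closed form, sketched below. The data points are made up and lie exactly on y = 2x + 1 so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept
print(slope, intercept, prediction)
```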

G. Naive Bayes

Naive Bayes is a classification technique based on Bayes' theorem with an assumption of independence among predictors. Simply put, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. For example, a fruit may be considered an apple if it is red, round and about 3 cm in diameter. Even if these features depend on each other or on the existence of other features, the Naive Bayes classifier treats each of them as contributing independently to the probability that the fruit is an apple [36].

Fig. 6: Gradient Boosting Algorithm [30]

Fig. 7: Logistic Regression [43]

A Naive Bayes model is simple to build and particularly useful for very large data sets. Along with its simplicity, Naive Bayes is known to outperform even highly sophisticated classification methods [36].
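The independence assumption can be sketched with a small word-count (multinomial) Naive Bayes spam filter. The training messages are invented, and Laplace (add-one) smoothing is an assumption added here so that unseen word/class pairs do not produce zero probabilities.

```python
import math
from collections import Counter


def train_naive_bayes(documents):
    """documents: list of (text, label). Returns priors, word counts and vocabulary."""
    class_counts = Counter(label for _, label in documents)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in documents:
        word_counts[label].update(text.lower().split())
    vocabulary = set().union(*word_counts.values())
    priors = {label: count / len(documents)
              for label, count in class_counts.items()}
    return priors, word_counts, vocabulary


def classify(text, priors, word_counts, vocabulary):
    """Pick the class with the highest log-posterior, treating words as independent."""
    scores = {}
    for label, prior in priors.items():
        total = sum(word_counts[label].values())
        score = math.log(prior)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                # Laplace smoothing: add one to every word count.
                score += math.log((word_counts[label][word] + 1)
                                  / (total + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training messages.
training = [
    ("free bitcoin offer", "spam"),
    ("win free prize now", "spam"),
    ("team meeting tomorrow", "ham"),
    ("lunch with the team", "ham"),
]
model = train_naive_bayes(training)
print(classify("free bitcoin now", *model), classify("team meeting tomorrow", *model))
```

Each word contributes its own log-probability term to the score, which is exactly the "independent contribution" described in the apple example.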


H. Decision Tree

The decision tree is a supervised learning algorithm used primarily for classification problems. It works well with both categorical and continuous dependent variables. In this method, the population is split into two or more homogeneous groups, based on the most significant attributes/independent variables, so that the groups are as distinct as possible [36].

Fig. 8: Linear Regression [41]

Fig. 9: Naive Bayes Theorem [36]

Fig. 10: Decision Tree [40]
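How a tree chooses the attribute value to split on can be sketched with an impurity measure. Gini impurity is assumed here as the splitting criterion (the text above does not name one), and the feature values and labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0 for a pure group, larger for mixed groups."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return the threshold with the lowest weighted impurity, and that impurity."""
    best_threshold, best_score = None, float("inf")
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical feature (requests per second) with labels.
values = [1, 2, 8, 9]
labels = ["benign", "benign", "malicious", "malicious"]

print(gini(labels), best_split(values, labels))
```

A full tree applies the same search recursively to each resulting group until the groups are pure or a depth limit is reached.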

I. kNN (k-Nearest Neighbors)

kNN can be used for both classification and regression problems. However, it is most widely used for classification problems in industry. K nearest neighbours is a simple algorithm that stores all available cases and classifies new cases by a majority vote of their K neighbours. The case is assigned to the class most common among its K nearest neighbours, as measured by a distance function. [36] The Euclidean, Manhattan, Minkowski and Hamming distances can be used as distance functions. The first three are used for continuous variables and the fourth (Hamming) for categorical variables. If K = 1, the case is simply assigned to the class of its nearest neighbour. Choosing K can be a challenge when designing a kNN model. [36]


Fig. 11: kNN [39]
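The four distance functions and the majority vote can be sketched as follows. The training points, labels and the default K = 3 are assumptions for illustration only.

```python
from collections import Counter


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def minkowski(a, b, p=3):
    """Generalises Manhattan (p=1) and Euclidean (p=2)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def hamming(a, b):
    """Number of positions at which two categorical vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_classify(train, query, k=3, distance=euclidean):
    """Assign the label most common among the k training cases nearest to query."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training cases.
train = [((1, 1), "normal"), ((1, 2), "normal"), ((2, 1), "normal"),
         ((8, 8), "attack"), ((9, 8), "attack"), ((8, 9), "attack")]

print(knn_classify(train, (2, 2)), knn_classify(train, (7, 8)))
```

Passing `distance=hamming` would apply the same vote to categorical feature vectors.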

J. K-Means

The K-means algorithm is an iterative algorithm that tries to partition the dataset into K predefined, distinct, non-overlapping subgroups (clusters), with each data point belonging to only one cluster. It tries to make the data points within a cluster as similar as possible while keeping the clusters as different (far apart) as possible. Data points are assigned to a cluster such that the sum of the squared distances between the data points and the cluster's centroid (the arithmetic mean of all data points belonging to that cluster) is at a minimum. The less variation there is within a cluster, the more homogeneous the data points in that cluster are.

Fig. 12: K-Means [38]
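The assign-then-update iteration can be sketched as below. The points and the choice of initial centroids (two of the data points, passed in explicitly) are assumptions; in practice the initial centroids are usually picked at random.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members)
                                           for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def inertia(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))


# Hypothetical points forming two well-separated groups.
points = [(1, 1), (1.5, 2), (1, 0), (8, 8), (9, 9), (8, 9)]
labels, centroids = kmeans(points, [points[0], points[3]])
print(labels, centroids, inertia(points, labels, centroids))
```

The `inertia` value is the quantity the algorithm drives down; it is the within-cluster sum of squared distances mentioned in the text.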

IV. CONCLUSION

Although advances in the relevant technology allow attackers to cause far more damaging security violations on social network websites, people themselves are the main factor in security and privacy problems. The information gathered in this discussion should give researchers and practitioners a clear and concise picture of why security and privacy continue to be an issue. Different directions for solving these problems on social networking sites using machine learning have also been discussed.

REFERENCES

[1] H. Bansal and M. Misra, "Sybil Detection in Online Social Networks (OSNs)," 2016 IEEE 6th International Conference on Advanced Computing (IACC), Bhimavaram, 2016, pp. 569-576.

[2] Christina Newberry, "8 Social Media Security Tips to Mitigate Risks", https://blog.hootsuite.com/social-media-security-for-business/


[3] Elliot Volkman, "Why Social Media is Increasingly Abused for Phishing Attacks", https://info.phishlabs.com/blog/how-social-media-is-abused-forphishing-attacks/

[4] Ali, S.; Islam, N.; Rauf, A.; Din, I.U.; Guizani, M.; Rodrigues, J.J.P.C. Privacy and Security Issues in Online Social Networks. Future Internet 2018, 10, 114.

[5] Baltazar, J.; Costoya, J.; Flores, R. The Real Face of Koobface: The Largest Web 2.0 Botnet Explained. Trend Micro Threat Research. 2009. Available online: https://www.trendmicro.de/cloud-content/us/pdfs/security-intelligence/white-papers/wpthe-real-face-of-koobface.pdf.

[6] Alghamdi, B.; Watson, J.; Xu, Y. Toward detecting malicious links in online social networks through user behavior. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence Workshops, Omaha, NE, USA, 13–16 October 2016; pp. 5–8.

[7] Fire, M.; Katz, G.; Elovici, Y. Strangers intrusion detection-detecting spammers and fake profiles in social networks based on topology anomalies. Human J. 2012, 1, 26–39.

[8] Egele, M.; Stringhini, G.; Kruegel, C.; Vigna, G. Towards detecting compromised accounts on social networks. IEEE Trans. Dependable Secure Comput. 2017, 14, 447–460.

[9] Grier, C.; Thomas, K.; Paxson, V.; Zhang, M. @spam: The underground on 140 characters or less. In Proceedings of the 17th ACM conference on Computer and Communications Security, Chicago, IL, USA, 4–8 October 2010; pp. 27–37.

[10] Gao, H.; Hu, J.; Wilson, C.; Li, Z.; Chen, Y.; Zhao, B.Y. Detecting and characterizing social spam campaigns. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement, Melbourne, Australia, 1–3 November 2010; pp. 35–47.

[11] Thomas, K.; Grier, C.; Ma, J.; Paxson, V.; Song, D. Design and evaluation of a real-time URL spam filtering service. In Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, USA, 22–25 May 2011; pp. 447–462.

[12] Gao, H.; Chen, Y.; Lee, K.; Palsetia, D.; Choudhary, A.N. Towards Online Spam Filtering in Social Networks. In Proceedings of the 19th Annual Network and Distributed System Security Symposium, San Diego, CA, USA, 5–8 February 2012; pp. 1–16.

[13] Gupta, S.; Gupta, B.B. Cross-Site Scripting (XSS) attacks and defense mechanisms: Classification and state-of-the-art. Int. J. Syst. Assur. Eng. Manag. 2017, 8, 512–530.

[14] Faghani, M.R.; Nguyen, U.T. A study of XSS worm propagation and detection mechanisms in online social networks. IEEE Trans. Inf. Forensics Secur. 2013, 8, 1815–1826.

[15] Lundeen, R.; Ou, J.; Rhodes, T. New Ways I'm Going to Hack Your Web App. Black Hat Abu Dhabi. Available online: https://www.blackhat.com/html/bhad-11/bh-ad-11-archives.html#Lundeen.

[16] Ding, X.; Zhang, L.; Wan, Z.; Gu, M. A brief survey on de-anonymization attacks in online social networks. In Proceedings of the IEEE International Conference on Computational Aspects of Social Networks (CASoN 2010), Taiyuan, China, 26–28 September 2010; pp. 611–615.


[17] Wani, M.A.; Jabin, S.; Ahmad, N. A sneak into the Devil’s Colony-Fake Profiles in Online Social

Networks. Available online: https://arxiv.org/ftp/arxiv/ papers/1705/1705.09929.pdf.

[18] Heatherly, R.; Kantarcioglu, M.; Thuraisingham, B. Preventing private information inference attacks on

social networks. IEEE Trans. Knowl. Data Eng. 2013, 25, 1849–1862.

[19] Viswanath, B.; Bashir, M.A.; Crovella, M.; Guha, S.; Gummadi, K.P.; Krishnamurthy, B.; Mislove, A.

Towards Detecting Anomalous User Behavior in Online Social Networks. In Proceedings of the USENIX

Security Symposium, San Diego, CA, USA, 20–22 August 2014; pp. 223–238.

[20] Torabi, S.; Beznosov, K. Privacy Aspects of Health Related Information Sharing in Online Social

Networks. In Proceedings of the 2013 USENIX Conference on Safety, Security, Privacy and Interoperability

of Health Information Technologies, Washington, DC, USA, 12 August 2013; p. 3.

[21] Scism, L.; Maremont, M. Insurers Test Data Profiles to Identify Risky Clients. The Wall Street Journal, 19

November 2010.

[22] Humphreys, L. Mobile social networks and social practice: A case study of Dodgeball. J. Comput.-Mediat.

Commun. 2007, 13, 341–360.

[23] Alqatawna J., Madain A., Al-Zoubi A.M., Al-Sayyed R. (2017) Online Social Networks Security: Threats,

Attacks, and Future Directions. In: Taha N., Al-Sayyed R., Alqatawna J., Rodan A. (eds) Social Media

Shaping e-Publishing and Academia. Springer, Cham

[25] Guang-Bin Huang, Qin-Yu Zhu, and Chee-Kheong Siew. “Extreme learning machine: Theory and

applications”. In: Neurocomputing 70 (2006), pp. 489–501.

[26] Tin Kam Ho. “Random Decision Forests”. In: Proceedings of the 3rd International Conference on

Document Analysis and Recognition (1995), pp. 278–282.

[27] Saakshi Kapoor, Vishal Gupta, and Rohit Kumar. “An Obfuscated Attack Detection Approach for

Collaborative Recommender Systems”. In: Journal of Computing and Information Technology 26 (2018),

pp. 45–56.

[28] Zeinab Khorshidpour, Sattar Hashemi, and Ali Hamzeh. “Evaluation of random forest classifier in security

domain”. In: Applied Intelligence 47 (2017), pp. 558–569.

[24] Aditya Khamparia, Sagar Pande, Deepak Gupta, Ashish Khanna, Arun Kumar Sangaiah, Multi-level

framework for anomaly detection in social networking, Library Hi Tech

[29] P. Grover. “Gradient Boosting from scratch”. Medium (2017).

[30] Sagar Dhanraj Pande, Aditya Khamparia, A Review on Detection of DDOS Attack Using Machine

Learning and Deep Learning Techniques, Think India Journal, Vol. 22, pp. 2035–2043.

[31] Sagar Pande, Ajay B Gadicha, Prevention Mechanism on DDOS Attacks by using Multilevel Filtering of

Distributed Firewalls, International Journal on Recent and Innovation Trends in Computing and

Communication, Vol. 3, pp. 1005–1008.


[32] Cynthia Wagner, Jerome Francois, Thomas Engel, et al. “Machine learning approach for ip-flow record

anomaly detection”. In: International Conference on Research in Networking. Springer. 2011, pp. 28–39

[33] Ruixi Yuan et al. “An SVM-based machine learning method for accurate internet traffic classification”. In:

Information Systems Frontiers 12.2 (2010), pp. 149–156

[34] Pier Paolo Ippolito, “SVM: Feature Selection and Kernels”, https://towardsdatascience.com/svm-feature-selection-and-kernels-840781cc1a6c

[35] Sunil Ray, ”Commonly used Machine Learning Algorithms (with Python and R Codes)”,

https://www.analyticsvidhya.com/blog/2017/09/common-machine-learning-algorithms/

[36] Logistic regression, https://en.wikipedia.org/wiki/Logistic_regression

[37] SuperDataScience Team, ”Self Organizing Maps (SOM’s) - K-Means Clustering (Refresher)”,

https://www.superdatascience.com/blogs/self-organizing-maps-soms-k-means-clustering-refresher

[42] NTU, ”Extreme Learning Machines (ELM): Filling the Gap between Frank Rosenblatt’s Dream and John

von Neumann’s Puzzle”, https://www.ntu.edu.sg/home/egbhuang/

[43] Packtpub, ”Logistic regression model – building and training”,

https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781788399906/5/ch05lvl1sec43/logistic-regression-model-building-and-training

[39] Jake Hoare, ”What is a Decision Tree?”, https://www.displayr.com/what-is-a-decision-tree/

[40] GeeksForGeeks, ”ML — Linear Regression”, https://www.geeksforgeeks.org/ml-linear-regression/

[41] Vaibhav Kumar, ”Random forests and decision trees from scratch in python”,

https://towardsdatascience.com/random-forests-and-decision-trees-from-scratch-in-python-3e4fa5ae4249

[38] KRAJ EDUCATION, ”Understanding KNN(K-nearest neighbor) with example”,