18
Noname manuscript No. (will be inserted by the editor) Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective Iqbal H. Sarker 1,2* Received: date / Accepted: date Abstract Deep learning (DL), which is originated from an artificial neural network (ANN), is one of the major technologies of today’s smart cybersecurity systems or policies to function in an intelligent manner. Popular deep learning techniques, such as Multi-layer Percep- tron (MLP), Convolutional Neural Network (CNN or ConvNet), Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM), Self-organizing Map (SOM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), Deep Belief Networks (DBN), Generative Adversarial Network (GAN), Deep Transfer Learning (DTL or Deep TL), Deep Reinforcement Learning (DRL or Deep RL), or their ensembles and hybrid approaches can be used to intelligently tackle the diverse cybersecurity issues. In this paper, we aim to present a comprehensive overview from the perspective of these neural networks and deep learning techniques according to today’s diverse needs. We also discuss the applicability of these techniques in various cybersecurity tasks such as intrusion detection, identification of malware or botnets, phishing, predict- ing cyber-attacks, e.g. denial of service (DoS), fraud detection or cyber-anomalies, etc. Finally, we highlight several research issues and future directions within the scope of our study in the field. Overall, the ultimate goal of this paper is to serve as a reference point and guidelines for the academia and professionals in the cy- ber industries, especially from the deep learning point of view. 1 Swinburne University of Technology, Melbourne, VIC 3122, Australia. 2 Dept. of Computer Science and Engineering, Chittagong University of Engineering and Technology, Bangladesh. * Corresponding: [email protected] (Iqbal H. Sarker) Keywords Cybersecurity; deep learning; artificial neural network; artificial intelligence; cyber-attacks; cybersecurity analytics; cyber threat intelligence. 1 Introduction Due to the increasing popularity of Internet-of-Things (IoT) [71], and today’s dependency on digitalization, various security incidents or attacks have grown rapidly in recent years. Malicious activities, malware or ran- somware attack [82], zero-day attack [26], cryptographic attack, unauthorized access [118], denial of service (DoS) [118], data breaches [114], phishing or social engineering [44], or various attacks on IoT Devices etc. are common nowadays. These types of security incidents or cyber- crime can affect organizations and individuals, cause disruptions, as well as devastating financial losses. For example, a data breach costs 8.19 million USD for the United States [14] according to the IBM report, and the total annual cost of cybercrime to the global economy is 400 billion USD [35]. Cybercrimes are growing at an ex- ponential rate that brings an alarming message for the cybersecurity professionals and researchers [108]. Thus, the security management tools having the capability of detecting and preventing such incidents in a timely and intelligent way is urgently needed, on which the over- all national security of the business, government, and individual citizens of a country depends. In this paper, we focus on smart cybersecurity systems or policies to function in an intelligent manner for security manage- ment. Typically, cybersecurity is characterized as a col- lection of technologies and processes designed to pro- tect computers, networks, programs, and data against malicious activities, attacks, harm, or unauthorized ac- Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1 © 2021 by the author(s). Distributed under a Creative Commons CC BY license.

Deep Cybersecurity: A Comprehensive Overview from Neural

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Noname manuscript No.(will be inserted by the editor)

Deep Cybersecurity: A Comprehensive Overview from NeuralNetwork and Deep Learning Perspective

Iqbal H. Sarker1,2∗

Received: date / Accepted: date

Abstract Deep learning (DL), which is originated from

an artificial neural network (ANN), is one of the major

technologies of today’s smart cybersecurity systems or

policies to function in an intelligent manner. Popular

deep learning techniques, such as Multi-layer Percep-

tron (MLP), Convolutional Neural Network (CNN or

ConvNet), Recurrent Neural Network (RNN) or Long

Short-Term Memory (LSTM), Self-organizing Map (SOM),

Auto-Encoder (AE), Restricted Boltzmann Machine (RBM),

Deep Belief Networks (DBN), Generative Adversarial

Network (GAN), Deep Transfer Learning (DTL or Deep

TL), Deep Reinforcement Learning (DRL or Deep RL),

or their ensembles and hybrid approaches can be used to

intelligently tackle the diverse cybersecurity issues. In

this paper, we aim to present a comprehensive overview

from the perspective of these neural networks and deeplearning techniques according to today’s diverse needs.

We also discuss the applicability of these techniques in

various cybersecurity tasks such as intrusion detection,

identification of malware or botnets, phishing, predict-

ing cyber-attacks, e.g. denial of service (DoS), fraud

detection or cyber-anomalies, etc. Finally, we highlight

several research issues and future directions within the

scope of our study in the field. Overall, the ultimate

goal of this paper is to serve as a reference point and

guidelines for the academia and professionals in the cy-

ber industries, especially from the deep learning point

of view.

1Swinburne University of Technology, Melbourne, VIC 3122,Australia.2Dept. of Computer Science and Engineering, ChittagongUniversity of Engineering and Technology, Bangladesh.

∗Corresponding: [email protected] (Iqbal H. Sarker)

Keywords Cybersecurity; deep learning; artificial

neural network; artificial intelligence; cyber-attacks;

cybersecurity analytics; cyber threat intelligence.

1 Introduction

Due to the increasing popularity of Internet-of-Things

(IoT) [71], and today’s dependency on digitalization,

various security incidents or attacks have grown rapidly

in recent years. Malicious activities, malware or ran-

somware attack [82], zero-day attack [26], cryptographic

attack, unauthorized access [118], denial of service (DoS)

[118], data breaches [114], phishing or social engineering

[44], or various attacks on IoT Devices etc. are common

nowadays. These types of security incidents or cyber-crime can affect organizations and individuals, cause

disruptions, as well as devastating financial losses. For

example, a data breach costs 8.19 million USD for the

United States [14] according to the IBM report, and the

total annual cost of cybercrime to the global economy is

400 billion USD [35]. Cybercrimes are growing at an ex-

ponential rate that brings an alarming message for the

cybersecurity professionals and researchers [108]. Thus,

the security management tools having the capability of

detecting and preventing such incidents in a timely and

intelligent way is urgently needed, on which the over-

all national security of the business, government, and

individual citizens of a country depends. In this paper,

we focus on smart cybersecurity systems or policies to

function in an intelligent manner for security manage-

ment.

Typically, cybersecurity is characterized as a col-

lection of technologies and processes designed to pro-

tect computers, networks, programs, and data against

malicious activities, attacks, harm, or unauthorized ac-

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

© 2021 by the author(s). Distributed under a Creative Commons CC BY license.

2 Sarker et al.

Fig. 1: The worldwide popularity score of “Cyber security” and “Deep learning” in a range of 0 (min) to 100 (max) over timewhere x-axis represents the timestamp information and y-axis represents the corresponding popularity score.

cess [25]. According to today’s numerous needs, conven-

tional well-known security solutions such as antivirus,

firewalls, user authentication, encryption etc. may not

be effective [28] [86] [121] [123]. The key issue with these

systems is that they are normally operated by a few se-

curity analysts, where data management is carried out

in an ad-hoc manner and can therefore not work intel-

ligently according to the needs [37] [110]. On the other

hand, in the sense of computing that seeks to operate

in an intelligent manner for cybersecurity management,

data-driven learning techniques have evolved rapidly in

recent years.

Deep learning (DL) is considered as a part of ma-

chine learning (ML) as well as artificial intelligence (AI),

which is originated from an artificial neural network

(ANN) and one of the major technologies of the Fourth

Industrial Revolution (Industry 4.0) [108] [117]. The

worldwide popularity of “Cyber security” and “Deep

learning” is increasing day-by-day, which is shown in

Figure 1. The popularity trend in Figure 1 is shown

based on the data collected from Google Trends over

the last five years [22]. In this paper, we take into ac-

count ten popular neural network and deep learning

techniques including supervised, semi-supervised, un-

supervised, and reinforcement learning in the context

of cybersecurity. These are (i) Multi-layer Perceptron

(MLP), (ii) Convolutional Neural Network (CNN or

ConvNet), (iii) Recurrent Neural Network (RNN) or

Long Short-Term Memory (LSTM), (iv) Self-organizing

Map (SOM), (v) Auto Encoder (AE), (vi) Restricted

Boltzmann Machine (RBM), (vii) Deep Belief Networks

(DBN), (viii) Generative Adversarial Network (GAN),

(ix) Deep Transfer Learning (DTL or Deep TL), and

(x) Deep Reinforcement Learning (DRL or Deep RL).

These deep neural network learning techniques or their

ensembles and hybrid approaches can be used to intel-

ligently solve different cybersecurity issues, such as in-

trusion detection, identification of malware or botnets,

phishing, predicting cyber-attacks, e.g. denial of service

(DoS), fraud detection, or cyber-anomalies. Deep learn-

ing has its benefits to build the security models due to

its better accuracy, especially learning from large quan-

tities of security datasets [134]. The contribution of this

paper is summarized as follows:

– This study concentrates on the knowledge of artifi-

cial neural network (ANN) and deep learning (DL)

techniques, a part of artificial intelligence (AI), to

function in a timely, automated, and intelligent man-

ner in the context of cybersecurity, which are con-

sidered as the major technologies of the Fourth In-

dustrial Revolution (Industry 4.0).

– We discuss various popular neural network and deep

learning techniques including supervised, unsuper-

vised, and reinforcement learning in the context of

cybersecurity, as well as the applicability of these

techniques in various cybersecurity tasks.

– Finally, we highlight several research issues and fu-

ture directions within the scope of our study for

future development and research in the domain of

cybersecurity.

This paper is organized as follows. Section 2 pro-

vides a brief overview of cybersecurity data. In section

3, we discuss various artificial neural networks and deep

learning methods and their applicability within the area

of cybersecurity. Several research issues and potential

solutions based on our study are highlighted in Section

4. Finally, we conclude this paper in Section 5.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 3

2 Understanding Cybersecurity Data

The data-driven model based on ANN and DL meth-

ods is usually based on data availability [96]. Usually,

datasets reflect a series of data records consisting of

many attributes or characteristics and relevant informa-

tion from which the data-driven cybersecurity model is

originated. In the field of cybersecurity, many datasets

exist, including intrusion analysis, malware analysis,

and spam analysis, which are used for different pur-

poses. In our earlier paper “cybersecurity data science”,

Sarker et al. [108], we have summarized various secu-

rity datasets that are obtained from different sources.

In the following, several such datasets, including their

different characteristics and attacks, are summarized to

discuss the applicability of security modeling based on

ANN and DL, according to the objective stated in this

paper.

To build an intrusion detection system dataset DARPA

(Defence Advanced Research Project Agency) made the

earliest attempt in 1998 [74]. Under the leadership of

DARPA and AFRL/SNHS, the datasets are compiled

and released by the MIT Lincoln Laboratory’s Cyber

Infrastructure and Technology Division (formerly the

DARPA Intrusion Detection Assessment Group) for the

evaluation of computer network intrusion detection sys-

tems. The KDD Cup 99 dataset containing network

traffic records that include more than forty feature at-

tributes and one class identifier, is one of the most com-

monly used datasets for intrusion detection. [16]. The

dataset contains different types of attacks that fall into

four families: DoS, R2L, U2R, and PROB, as well as

normal data. A refined version of this dataset is known

as the NSL-KDD dataset containing similar features

[122], where duplicate records are excluded from both

the training and test results. As an example of security

data, in Table 1, we have shown the features of intru-

sion detection datasets including the features and their

various types such as integer, float, or nominal for a

deeper understanding of security data [102]. Effectively

processing these features according to the requirements,

building target ANN and DL model, and eventually the

decision analysis, could play a significant role to pro-

vide intelligent cybersecurity services that are discussed

briefly in Section 3.

Another dataset the ISCX [5] was created at the

Canadian Institute for Cybersecurity. To describe at-

tack and distribution strategies in a network context,

the definition of profiles was used. To create accurate

profiles of attacks and other events to test intrusion de-

tection systems, several real traces were analyzed. A

new dataset, CSE-CIC-IDS2018 dataset [8], collected

by the Canadian Cyber Security Institute, was recently

created at the same institution, based on a user profile

that tracks network events and activity. The MAWI

[56] dataset is a collection of research institutions and

academic institutions used by the Japanese network to

calculate the global internet situation across a wide re-

gion. To track new traffic, the dataset is updated daily.

For DDoS intrusion detection, some scholars use this

data set [56]. The types of attacks found in it are vari-

able since MAWI is real data traffic. The ADFA data

set is a set of host-level intrusion detection system data

sets issued by [133] by the Australian Security Academy

(ADFA), which is commonly used in the testing of prod-

ucts for intrusion detection. It includes five types of

attacks, including Hydra-FTP, Hydra-SSH, Add Con-

sumer, Java-MeterPerter, Webshell, and two types of

regular attacks, such as Training and Validation.

The CAIDA’07 [4] dataset represents anonymized

traces of one-hour DDoS attack traffic collected on Au-

gust 04, 2007. The one-hour traffic will be broken down

into 5-minute files. The assault consists primarily of

SYN, ICMP, and HTTP flood traffic. As most of the

legitimate content was removed after collecting the traf-

fic, this dataset is more biased towards DDoS attacks.

The CAIDA’08 [3] dataset is the valid and attack traces

tracked by Equinix (Chicago and San Jose data cen-

ters). On March 19, 2008 and July 17, 2008, respec-

tively, traces were taken in Chicago and San Jose. The

ISOT’10 dataset is a mixture of malicious and non-

malicious datasets generated at the University of Victo-

ria [15] by research in Information Security and Object

Technology (ISOT). Honeynet [13] gathered decentral-

ized botnet data for malicious traffic, and the Ericsson

Research Laboratory and Lawrence Berkeley National

Lab retrieved non-malicious traffic. ISCX’12 reflects the

traffic from a physical test environment in the real world

that produces network traffic while containing central-

ized botnets. A botnet traffic registered at the Univer-

sity of CTU, Czech Republic, in 2011, known as the [9]

CTU-13 dataset. As a source of benign domain names,

the Alexa Top Sites [1] dataset is commonly used as

one can get as many as one million domain names. OS-

INT [2] and DGArchive [10] are the malicious domain

names. The UNSW-NB15 dataset [87] was established

in 2015 at the University of New South Wales. It has

49 characteristics and a total of almost 257,700 docu-

ments covering nine different kinds of modern attacks.

A systematic approach to generate benchmark datasets

for intrusion detection has been presented in [116].

In recent years, a well-organized market involving

large amounts of money has become the malware in-

dustry. Top apps in the Google Play Store [12] are

the most common source of normal knowledge in mal-

ware experiments. Although these applications are not

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

4 Sarker et al.

Table 1: An example of features of an intrusion detection dataset [102]

Feature Name Value Type Feature Name Value Type

dst host srv count Integer same srv rate Floatflag Nominal dst host same srv rate Float

srv serror rate Float dst host srv serror rate Floatdst host serror rate Float count Integer

protocol type Nominal logged in Integerdst host same src port rate Float dst host srv diff host rate Float

rerror rate Float src bytes Integerdst host srv rerror rate Float service Nominal

srv rerror rate Float dst host rerror rate Floatdst host count Integer dst host diff srv rate Float

srv count Integer wrong fragment Integerserror rate Float num compromised Integer

srv diff host rate Float dst bytes Integerhot Integer diff srv rate Float

duration Integer is guest login Integerroot shell Integer land Integerurgent Integer num failed logins Integer

su attempted Integer num root Integernum file creations Integer num shells Integernum access files Integer num outbound cmds Integer

is host login Integer - -

guaranteed to be malware-free, because of the combina-

tion of Google vetting and the ubiquity of the applica-

tions, they are more likely to be malware-free. Besides,

they are also vetted by using the VirusTotal service

[21]. Malware is stored in many datasets. The Genome

Project dataset [142], for example, consists of 2123 apps,

1260 of which are malicious covering 49 separate fami-

lies of malware. This is similar to the Virus Share [20]

and VirusTotal [21] datasets. Another wide dataset con-

taining 22,500 malicious and 22,500 benign raw files is

the Comodo dataset [6]. The Contagio[7] dataset con-

tains 250 malicious files and is slightly smaller than theothers. The DREBIN Dataset [65] is a highly imbal-

anced dataset containing 120,000 Android apps, 5000

of which are malicious. For the Kaggle competition,

the Microsoft [18] dataset comprises 10,868 hexadeci-

mal and assembly representation binary malware files

named from nine different malware families. There are

some correlations in the datasets containing malicious

data and the Google Play Store data, according to the

statistical details in [31] listed above. In addition, there

was a broad synthetic dataset called the Computer Emer-

gency Readiness Team (CERT) Insider Threat Dataset

v6.2 [73] [41] for insider threat identification. This dataset

includes 516-day device logs containing over 130 million

incidents, approximately 400 of which are malicious.

Due to privacy issues, email datasets are hard to ob-

tain because they are extremely difficult to access. Some

common e-mail corporations, however include Enron-

Spam[11], SpamAssassin[19], and LingSpam[17]. Bot-

IoT is a recent [63] dataset that includes valid and

simulated IoT network traffic along with various types

of forensic network analytics attacks in the Internet of

Things region.

To examine the different trends of security incidents

or malicious behavior, the above-discussed datasets could

be used to construct a data-driven security model based

on artificial neural networks and deep learning tech-

niques. In Section 3, we discuss and review various ANN

and DL methods by taking into account their applica-

bility in various cybersecurity tasks.

3 ANN and Deep Learning in Cybersecurity

Deep learning (DL) is typically considered as a part of

a broader family of machine learning methods as well

as artificial intelligence (AI), which is originated from

Artificial Neural Network (ANN) [108]. The main ad-

vantage of deep learning over traditional machine learn-

ing methods is its better performance in several cases,

particularly learning from large amounts of security

datasets [134]. In the following, we discuss ten pop-

ular neural network and deep learning techniques in-

cluding supervised, semi-supervised, unsupervised, and

reinforcement learning in the context of cybersecurity.

These neural networks and deep learning techniques or

their ensembles and hybrid security models can be used

to intelligently tackle different cybersecurity issues in-

cluding intrusion detection, malware analysis, security

threat analysis, predicting cyber-attacks or anomalies,

etc.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 5

3.1 Multi-layer Perceptron (MLP)

Multi-layer Perceptron (MLP), a class of feedforward

artificial neural network (ANN), is a supervised learn-

ing algorithm [92]. It is also considered as the base

architecture of deep learning or deep neural networks

(DNN). A typical MLP is a fully connected network,

consisting of an input layer that receives the input data,

an output layer to make a decision or prediction about

the input signal, and one or more hidden layers between

these two [101], which are considered as the true com-

putational engine of the network, shown in Figure 2.

Fig. 2: An example of a feed-forward artificial neural network(ANN) with multiple hidden layers to detect cyber-anomaliesor attacks.

Since MLPs are fully linked, each node in one layer

connects at a certain weight to each node in the next

layer. Several activation functions such as ReLU (Rec-

tified Linear Unit), Tanh, Sigmoid, Softmax [92] are

used that determine the output of a network. These ac-tivation functions also known as transfer functions in-

troducing non-linear properties in the network to learn

complex functional mappings from data. MLP utilizes

a supervised learning technique called “Backpropaga-

tion” [45] for training, which is the most “fundamental

building block” in a neural network and widely used al-

gorithm for training feedforward neural networks. The

ultimate objective of the backpropagation algorithm

is to optimize the network weights to accurately map

the inputs to the target outputs. Various optimization

techniques such as Stochastic Gradient Descent (SGD),

Limited memory BFGS (L-BFGS), Adaptive Moment

Estimation (Adam) [92] are used during the training

process. Such neural networks can be used to solve

various issues in the domain of cybersecurity. For in-

stance, building an intrusion detection model [36], mal-

ware analysis [57], security threat analysis [50], detect-

ing malicious botnet traffic [54] as well as for building

trustworthy IoT systems [39] MLP based network are

used. MLP is sensitive to feature scaling and needs a

range of hyperparameters such as the number of hid-

den layers, neurons and iterations to be tuned, which

may lead the model computationally expensive to solve

a complex security model. However, MLP has the ad-

vantage of learning non-linear models even in real-time

or on-line learning using partial fit [92].

3.2 Convolutional Neural Network (CNN or ConvNet)

The convolutional neural network (CNN or ConvNet)

[68] is a deep learning network architecture that learns

directly from data, without the need for manual feature

extraction. A typical CNN consists of an input layer,

convolutional layers, pooling layers, fully connected lay-

ers, and an output layer, as shown in Figure 3. Thus,

the CNN improves the architecture of the typical ANN,

which is also considered as regularized versions of multi-

layer perceptrons. Each of the layer in CNN considers

optimized parameters for significant outcome as well

as to reduce the complexity. CNN also uses a ‘dropout’

[40] that can handle the issue of over-fitting, which may

cause in a typical network.

Fig. 3: An example of a convolutional neural network (CNN orConvNet) including multiple convolution and pooling layers.

Convolutional neural networks are specifically de-

signed to deal with the variability of 2D shapes [68].

In terms of application areas, CNNs are broadly used

in image and video recognition, medical image analysis,

recommender systems, image classification, image seg-

mentation, natural language processing, financial time

series, etc. Although CNNs are most commonly applied

to analyzing visual imagery, these networks can also be

used in the domain of cybersecurity. For instance, CNN

based deep learning model is used for intrusion detec-

tion, e.g., denial-of-service (DoS) attacks, in IoT Net-

works [119], to detect malware [136], android malware

detection [83] etc. Besides, a phishing detection model

has been presented in [132] based on convolutional neu-

ral networks. A multi-CNN fusion-based model can be

used for intrusion detection [72] in the area. Although

CNN has a greater computational burden, it has the ad-

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

6 Sarker et al.

vantage of automatically detecting the important fea-

tures without any human supervision, and thus CNN is

considered to be more powerful than typical ANN. Sev-

eral advanced CNN based deep learning models, such

as AlexNet [64], Xception [32], Inception [120], visual

geometry group (VGG) [46], ResNet [47], etc., or other

lightweight architecture of the model can be used to

minimize the issues depending on the problem domain

and data characteristics.

3.3 Long Short-Term Memory Recurrent Neural

Network (LSTM-RNN)

Recurrent Neural Network (RNN) [42] is another type

of artificial neural network, which is capable to pro-

cess a sequence of inputs in deep learning and retain

its state while processing the next sequence of inputs.

In the recurrent layer, all RNNs have feedback loops,

which allow them to retain information in ’memory’

over time. Long short-term memory (LSTM) networks

are a type of RNN that, in addition to standard units,

uses special units which can deal with the problem of

the vanishing gradient. LSTM units have a ‘memory

cell’ that can store data for long periods in memory.

Figure 4 shows an example of a long short-term mem-

ory (LSTM) cell, where the ‘Forget Gate’, ‘Input Gate’,

and ‘Output Gate’ work cooperatively to control the in-

formation flow in an LSTM unit [55]. For instance, the

‘Forget Gate’ decides what information will be memo-

rized from the previous state cell and to remove the in-

formation that is no longer useful, the ‘Input Gate’ de-

termines which information should enter the cell state,

and finally the ‘Output Gate’ decides and controls the

outputs.

Fig. 4: Basic structure of a long short-term memory (LSTM)unit.

LSTM networks are well-suited for learning and an-

alyzing sequential data, such as classifying, process-

ing, and making predictions based on time-series data,

which differentiates it from other conventional networks.

Thus, LSTM is commonly applied in the area of time-

series prediction, time-series anomaly detection, natu-

ral language processing, question answering chatbots,

machine translation, speech recognition, etc. As a large

amount of security sequential data such as network traf-

fic flows, time-dependent malicious activities, etc. are

generated these days, an LSTM model can also be ap-

plicable in the domain of cybersecurity. Several LSTM

model-based security solutions such as intrusion de-

tection [58], to detect and classify the malicious apps

[126], phishing detection [24], time-based botnet detec-

tion [124] have been studied in the area. Although the

main advantage of a recurrent network over a tradi-

tional network is the capability of modeling the se-

quence of data, it may require a lot of resources and

time to get trained. Thus, considering the above-mentioned

advantage, an effective LSTM-RNN network can im-

prove the security models to detect the security threats,

particularly, where the behavior patterns of the threats

exhibit temporal dynamic behavior.

3.4 Self-Organizing Map (SOM)

Self-organizing map (SOM) or Kohonen Map [61] is a

type of artificial neural network that follows an unsu-

pervised learning approach. It uses a competitive learn-

ing algorithm to train its network, in which nodes are

competing for the right to respond to a subset of input

data. It learns the shape of a dataset by continuously

moving its neurons nearer to the data points. Unlike

other artificial neural networks using error-correction

learning such as backpropagation with gradient descent

[45], SOMs implement competitive learning, a neigh-

borhood function to preserve the topological proper-

ties of the input space. SOM is generally used for clus-

tering [125] and mapping high-dimensional dataset as

low-dimensional (typically two-dimensional) discretized

pattern, which allows to reduce complex problems for

easy interpretation, and thus it is known as dimension-

ality reduction algorithm. As shown in Figure 5, a Ko-

honen Network or SOM consists of two layers of pro-

cessing units called an input layer and an output layer.

When an input pattern is fed to the network, the units

in the output layer compete with each other, and the

winning output unit is usually the one whose incom-

ing connection weights are closest to the input pattern,

such as measuring through Euclidean distance [45].

SOM has been widely used in, for instance, pattern

recognition, health or medical diagnosis, recognition of

anomalies, virus or worm attack detection [62] [94]. Sev-

eral researchers have used SOM for different purposes

in the domain of cybersecurity. For instance, in [66], the

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 7

Fig. 5: The self-organizing map (SOM) architecture.

authors present a self-organizing map and its modeling

for discovering malignant network traffic. To identify

the hierarchical relations within the modern real-world

datasets with mixed attributes - numerical and categor-

ical, authors in [79] take into account the Growing Hi-

erarchical Self-Organizing Map (GHSOM) and Spark-

GHSOM algorithm in their analysis. The authors have

shown in [67] that SOMs have a high potential as a

data analytics tool on unknown traffic, where they can

recognize the botnet and normal flows with high con-

fidence of approximately 99%. SOMs are also used in

[77] as a visual data mining technique while analyzing

computer user behavior, security incidents, and fraud.

The main advantage of using a SOM is that the data

is easily interpreted and understood. Thus, SOMs can

play a significant role to build a data-driven effective

security model depending on the characteristics of the

data.

3.5 Auto-Encoder (AE)

An auto-encoder (AE) [42] is a type of artificial neural

network used in an unsupervised way to learn efficient

data codes. The goal of an AE is to learn a represen-

tation for a data set, typically by training the network

to ignore the ‘noise’ signal for dimensionality reduction.

An auto-encoder consists of three components: encoder,

code, and decoder as shown in Figure 6. The encoder

compresses the input and generates the data, and the

decoder then uses this code to reconstruct the input.

One primary benefit of the AE is that during propaga-

tion, this model can continuously extract useful features

and filter the useless information [76]. A single-layered

AE with a linear activation function is very similar to

principal component analysis (PCA) [103], which is also

used to decrease the dimensionality of large data sets.

The auto-encoder is widely used for unsupervised

learning tasks, e.g., dimension reduction, feature ex-

Fig. 6: A structure of an auto-encoder (AE) with the compo-nents.

traction, efficient coding, and generative modeling [42]

[140]. In the domain of cybersecurity, the deep AE can

be used to build an effective security model. The reason

is that the AE-based feature learning model in cyberse-

curity typically uses the minimum number of security

features compared to other state-of-the-art algorithms.

The resulting rich and tiny latent representation of the

security features makes the model more effective and

efficient, even in small devices such as smartphones,

known as the Internet of Things (IoT) devices [106].

For example, the authors [139] present an AE-based

feature learning model for cybersecurity applications,

where they have demonstrated the model efficacy for

malware classification and detection of network-based

anomalies. An anomaly-based insider threat detection

model using deep AE has been presented in [75]. In

[128], the authors present a CNN-based android mal-

ware detection model, where they use deep AE as a

pre-training tool to minimize the time of training. To

enhance the intrusion detection method the authors in

[135] use a stacked sparse auto-encoder. Thus, the AE-

based model in the domain of cybersecurity can be use-

ful due to its capability to capture the main features of

data.

3.6 Restricted Boltzmann Machine (RBM)

Boltzmann machines [84] are stochastic and generative

neural networks with only two types of nodes — vis-

ible nodes which we can and do measure, and hidden

nodes which we cannot or do not measure. It is an un-

supervised deep learning model in which every node is

connected to every other node, which helps us under-

stand abnormalities by learning about the working of

the system in normal conditions. Restricted Boltzmann

Boltzmann Machines (RBMs) [80] are a special class of

Boltzmann Machines and are limited in terms of con-

nections between the visible layer and the hidden layer,

i.e. only connections between the hidden and the visible

layer of variables, but not between two variables of the

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

8 Sarker et al.

same layer [84]. This restriction enables training algo-

rithms to be more efficient than what is available for the

general class of Boltzmann machines, particularly the

gradient-based contrastive divergence algorithm [49].

The Figure 7 shows an illustration of an RBM con-

sisting of m visible units V = (v1, ..., vm) representing

observable data and n hidden units H = (h1, ..., hn)

capturing dependencies between variables observed.

Fig. 7: A graphical representation of a restricted Boltzmannmachine (RBM) with m visible and n hidden nodes.

The RBM algorithm plays an important role in di-

mensionality reduction, classification, regression, col-

laborative filtering, feature learning, topic modeling,

and many more in the era of machine learning and deep

learning. In the domain of cybersecurity, the RBM can

be used to build an effective security model. For exam-

ple, the authors in [34] present network anomaly detec-

tion with the restricted Boltzmann machine. In their

approach, they investigate the efficacy of the model to

combine the expressive power of generative models with

the ability to infer part of its information from incom-

plete training data with good classification accuracy.

To increase the accuracy of DoS attack detection, the

authors in [53] present a deep learning method based

on a restricted Boltzmann machine. In [111], the au-

thors present an approach for the improvement of net-

work intrusion detection accuracy by using RBM that

composes new data by removing the noises and outliers

from the input data. Overall, the Restricted Boltzmann

Machine can automatically recognize patterns in data

and build probabilistic or stochastic models that incor-

porate randomness in the approach, which is used for

feature selection and feature extraction, as well as to

form a deep belief network.

3.7 Deep Belief Networks (DBN)

A deep belief network (DBN) [48] is a generative graph-

ical model or a probabilistic generative model consists

of stacked Boltzmann restricted machines (RBMs), dis-

cussed earlier. As shown in Figure 8, it is a type of deep

neural network (DNN) with multiple RBMs and a back-

propagation (BP) [45] neural network. DBN can cap-

ture a hierarchical representation of input data based

on its deep structure. A two-phase training can be con-

ducted sequentially by: i) pre-training, unsupervised

layer-wise learning of stacked RBM, where the layers

act as feature detectors through probabilistic recon-

structing its inputs, i.e., training with the contrastive

divergence [49] technique, and ii) fine-tuning, super-

vised learning with a classifier, e.g., BP neural network.

DBN’s main concept is to initialize the feed-forward

neural networks with unlabeled data with unsupervised

pre-training and then fine-tune the network using la-

beled data. DBNs can be seen as a composition of sim-

ple, unsupervised networks such as Boltzmann restricted

machines (RBMs) or auto-encoders, where each sub-

hidden network’s layer serves as the next visible layer

[129].

Fig. 8: Schematic structure of a deep belief network (DBN)with several layers.

In the area of cybersecurity, DBN can be used in

a large number of high-dimensional data applications.

For instance, the authors in [98] used the DBN model

as a feature reduction method to build an effective cy-

bersecurity model, e.g., intrusion detection scheme. In

[93], an intrusion detection model based on a deep belief

network has been presented. Their experimental find-

ings on NSL-KDD datasets show that there are better

classification results than SVM in the DBN-based in-

trusion detection model, and the time of model estab-

lishment is also shorter, which significantly improves

the speed of intrusion detection. The authors present

an optimization technique for intrusion detection clas-

sification model based on a deep belief network in [129],

where they find higher detection speed and accuracy of

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 9

detection. Overall, the DBN security model can play

a significant role, due to its strong capability of fea-

ture extraction and classification in a large number of

high-dimensional data applications in the area of cy-

bersecurity.

3.8 Generative Adversarial Network (GAN)

A generative adversarial network (GAN) is a class of

machine learning frameworks designed by Ian Goodfel-

low [43], which is considered as one of the most interest-

ing ideas in the area. Generative adversarial networks

consist of an overall structure composed of two neu-

ral networks, a generator G and a discriminator D, as

shown in Figure 9, where the generator and discrimina-

tor are trained to compete with each other. The role of

the generator is to generate new data with characteris-

tics close to the actual data input. On the other hand,

the discriminator is trained to estimate the probability

of a future sample coming from the actual data rather

than being provided by the generator.

Fig. 9: Schematic structure of a generative adversarial net-work (GAN).

GANs are used widely in natural image synthesis,

medical image analysis, bioinformatics, data augmen-

tation tasks, video generation, voice generation, etc. It

is also useful in the domain of cybersecurity. Hackers

may use an adversarial attack to access and manip-

ulate user data in the modern world, so it is neces-

sary to implement advanced security measures to avoid

leakage and misuse of sensitive information. GAN can

therefore be trained to recognize such cases of fraud

and make deep learning models more robust. Several

works have been done in the domain of cybersecurity.

The authors of [59], for instance, present a transferred

generative adversarial network (tGAN) for automatic

zero-day attack classification and detection, which is

the best performer compared to traditional machine

learning algorithms. The authors present a zero-day

malware detection strategy in [60] using deep auto-

encoders-based transmitted generative adversarial net-

works, which generates fake malware and learns to dis-

tinguish it from real malware. They achieve 95.74% av-

erage classification accuracy in their experimental study.

In [138], a system based on generative adversarial net-

works to increase botnet detection models (Bot-GAN)

was presented, which improves detection efficiency and

decreases the false positive rate. A new GAN-based

adversarial-example attack method was implemented in

[69], which outperforms the state-of-the-art method by

247.68%. In [85], the authors explore Generative Ad-

versarial Networks (GANs) to improve the training and

ultimately performance of cyber attack detection sys-

tems by balancing data sets with the generated data.

The model generates data that closely mimics the dis-

tribution of data from various types of attacks and is

used to balance previously unbalanced databases, which

is a viable solution for designing cyber-attack intrusion

detection systems. It is useful not only for unsupervised

learning but also for semi-supervised learning, fully su-

pervised learning, and reinforcement learning, depend-

ing on the task, as the main objective of GANs is to

learn from a collection of training data and generate

new data with the same characteristics as the training

data.

3.9 Deep Transfer Learning (DTL or Deep TL)

In machine and deep learning, transfer learning is an

important method for solving the fundamental prob-

lem of inadequate training data. Thus, it eliminates the

need to train AI models, because it allows training neu-

ral networks with relatively small amounts of data [130].

In the field of data science, it is currently very com-

mon since most real-world problems generally do nothave millions of tagged data points to train such com-

plex models. It uses pre-trained models learned from a

source domain and uses these models, shown in Figure

10, for tasks in the target domain. Transfer learning

can be classified under three sub-settings [90] based on

various circumstances between the source and target

domains and tasks:

– Inductive transfer learning - In this setting, the tar-

get task varies from the source task. Several ap-

proaches such as instance transfer, feature represen-

tation transfer, parameter transfer, and relational

knowledge transfer are relevant to this.

– Transductive transfer learning - In this setting, the

source and target tasks are the same, while the source

and target domains are different. Several approaches

such as instance transfer and feature representation

transfer are relevant to this.

– Unsupervised transfer learning - It is similar to in-

ductive transfer learning mentioned above, where

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

10 Sarker et al.

the target task is different from the source task but

related to each other. It is typically studied in the

context of the feature representation transfer case.

Fig. 10: Learning process of transfer learning.

Deep transfer learning is applicable in various appli-

cation areas such as natural language processing (NLP),

sentiment classification, computer vision, image classi-

fication, speech recognition, medical imaging and spam

filtering, etc. In the domain of cybersecurity, it also

plays an important role due to its various advantages

in modeling like saving training time, improving the ac-

curacy of output, and the need for lesser training data.

For instance, the authors in [131] present a ConvNet

model using transfer learning for network intrusion de-

tection. In [89], the authors propose a signature gen-

eration method based on deep feature transfer learn-

ing that dramatically reduces signature generation and

distribution time. A higher classification accuracy of

99.5% has been achieved in [88]. The authors addressed

transfer learning for the identification of unknown net-

work attacks in [141], where they present a feature-

based transfer learning approach using a linear trans-

formation. A semi-supervised transfer learning model

for malware detection is discussed in [38], where the

transfer variable has improved the byte classifier accu-

racy from 94.72% to 96.90%. The authors present the

classification of malicious software in [95], using deep

neural network resnet-50 transfer learning. Their ex-

perimental findings on a sample indicate the efficacy of

98.62 percent accuracy in classifying malware groups. In

[127] the authors present deep transfer learning for IoT

attack detection with significant accuracy compared to

the baseline deep learning technique. Overall, the trans-

fer learning system significantly accelerates the training

of very deep neural networks while retaining high ef-

ficiency in the field of cybersecurity, even on smaller

datasets. Thus, instead of training the neural network

from scratch, cybersecurity professionals can take into

account a pre-trained, open-source deep learning model

and finetune it for their purpose.

3.10 Deep Reinforcement Learning (DRL or deep RL)

Deep reinforcement learning (DRL or deep RL) [30] is

a category of machine learning and AI, where intelli-

gent machines can learn from their actions similar to

the way humans learn from experience. It incorporates

reinforcement learning (RL) algorithms like Q-learning

and deep learning, e.g., neural network learning, as de-

fined below.

– Reinforcement learning (RL) - is the task of learning

how agents in an environment can take sequences of

actions to maximize cumulative rewards. RL consid-

ers the issue of learning to make decisions by trial

and error by a computational agent.

– Deep learning - is a form of machine learning that

uses multiple layers to progressively extract higher-

level features from the raw input, and make intelli-

gent decisions through neural network learning.

Deep RL thus incorporates deep learning models,

e.g. Deep Neural Network (DNN), based on the Markov

Decision Process (MDP) principle [78], as policy and/or

value function approximators. An MDP is “a tuple S,A, T,R,

where S is a set of states, A is a set of actions, T is a

mapping defining the transition probabilities from ev-

ery state-action pair to every possible new state, and R

is a reward function which associates a real value (re-

ward) to every state-action pair”. Figure 11 provides an

example of a deep RL schematic structure. The learn-

ing system aims to allow the agent to learn to produce

an optimized series of actions that maximize the total

amount of rewards.

Fig. 11: Schematic structure of deep reinforcement learning(DRL or deep RL).

Deep RL can be used in the domain of cyberse-

curity. For instance, the authors in [78] demonstrate

that deep RL models using Deep Q-Network (DQN),

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 11

Table 2: A summary of artificial neural network (ANN) and deep learning (DL) networks highlighting the key points.

ANN and DL Techniques Descriptive Key Points

Multi-layer Perceptron(MLP)

– supervised learning algorithm– a feed-forward fully connected artificial neural network– computationally expensive to solve a complex problem

Convolutional Neural Net-work (CNN, or ConvNet)

– regularized version of multi-layer perceptrons– can automatically learn or detect the key features from data– typically deal with the variability of 2D shapes, e.g., image.

Long Short-Term MemoryRecurrent Neural Network(LSTM-RNN)

– well-suited for learning and analyzing the sequential data– preferred for NLP tasks, speech processing, and making predictions

based on time-series data

Self-Organizing Map (SOM)

– follows an unsupervised learning approach– a dimensionality reduction algorithm used for clustering and mapping

high-dimensional dataset as low-dimensional– use competitive learning rather than backpropagation

Auto-Encoder (AE)

– an unsupervised learning algorithm that learns a representation ofthe inputs and is deterministic

– to significantly reduce the noise in the input data– used typically for dimensionality reduction, very similar to PCA.

Restricted Boltzmann Ma-chine (RBM)

– an unsupervised learning algorithm that learns the statistical distri-bution and is probabilistic or stochastic

– used for feature selection and feature extraction– constitute the building blocks of deep-belief networks

Deep Belief Networks (DBN)

– a probabilistic generative model with multiple RBMs– the ability to encode richer and higher order network structures and

can work in either an unsupervised or a supervised setting– can be used in a large number of high-dimensional data applications

Generative Adversarial Net-work (GAN)

– a form of generative model typically used for unsupervised learning– generate new, synthetic instances of data with characteristics close

to the actual data input– to make the deep learning models more robust

Deep Transfer Learning(DTL or Deep TL)

– to solve the basic problem of insufficient training data– use the pre-trained model and knowledge is transferred from one

model to another– various advantages in modeling like saving training time, improving

the accuracy of output, and the need for lesser training data

Deep Reinforcement Learn-ing (DRL or Deep RL)

– follow the way how humans learn from experience– combines reinforcement learning (RL) algorithms like Q-learning and

deep learning– can be used to solve very complex problems that cannot be solved by

conventional techniques

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

12 Sarker et al.

Table 3: A summary of cybersecurity tasks based on artificial neural network (ANN) and deep learning (DL) techniques.

Used Techniques Cybersecurity Tasks References

Multi-layer perceptron (MLP)

intrusion detection,malware analysis,

detecting botnet traffic,security threat analysis

Florencio et al. [36],Karbab et al. [57],Javed et al. [54],Hodo et al. [50]

Convolutional Neural Network(CNN, or ConvNet)

intrusion detection,malware detection,phishing detection,

malicious user detection

Susilo et al. [119], Li et al. [72]Yan et al. [136], Mclaughlin et al. [83],Xiao et al. [132], Adebowale et al. [24],

Hong et al. [51]

Long Short-Term MemoryRecurrent Neural Network

(LSTM-RNN)

intrusion detection,malicious activity detection,

phishing detection,time-based botnet detection,

authentication modeling

Kim et al. [58],Vinayakumar et al. [126],

Adebowale et al. [24], Li et al. [70],Tran et al. [124], Shi et al. [115],

Abuhamad et al. [23]

Self-organizing Map (SOM)discovering malignant network traffic,

modern botnets analysis,distributed clustering

Langin et al. [66],Le et al. [67]

, Malondkar et al. [79]

Auto Encoder (AE)

feature learning model,insider threat detection,

malware detection,intrusion detection system

Yousefi et al. [139],Liu et al. [75],

Wang et al. [128],Yan et al. [135]

Restricted Boltzmann Machine (RBM)network anomaly detection,

DoS attack detection,intrusion detection

Fiore et al. [34],Imamverdiyev et al. [53],

Mayuranathan et al. [81], Alom et al. [27]

Deep Belief Networks (DBN)

intrusion detection systemand optimization,

phishing detection,malware detection

Salama et al. [98], Qu et al. [93],Wei et al. [129],

Yi et al. [137], Arshey et al. [29],Saif et al. [97], Hou et al. [52]

Generative Adversarial Network (GAN)zero-day malware detection,

botnet detection,intrusion detection systems

Kim et al. [60], Li et al. [69],Yin et al. [138],

Merino et al. [85]

Deep Transfer Learning(DTL or Deep TL)

intrusion detection system,detecting unknown network attacks,

malware detection,malicious software classification

Wu et al. [131],Zhao et al. [141],Gao et al. [38],

Rezende et al. [95]

Deep Reinforcement Learning(DRL or deep RL)

intrusion detection system,malware detection,

Security and Privacy

Lopez et al. [78], Sethi et al. [112],Fang et al. [33],

Shakeel et al. [113]

and Double Deep Q-Network (DDQN) give significant

intrusion detection results comparing with traditional

machine learning models. Similarly, a deep RL based

adaptive intrusion detection framework based on Deep-

Q-Network (DQN) for cloud infrastructure has been

presented in [112], where they experimentally reported

higher accuracy and low false-positive rates to detect

and identify new and complex attacks.

Based on our study above, we have summarized the

key points of each neural network and deep learning

technique in Table 2. In Table 3, we have also summa-

rized several cybersecurity applications based on these

techniques. Moreover, the hybrid network model, e.g.,

the ensemble of networks, can be used to build an effec-

tive model considering their combined advantages. For

instance, an LSTM network with the combination of

CNN can also be used for detecting cyber-attacks, such

as for malware detection [136], to detect and mitigate

phishing and Botnet attack across multiple IoT devices

[91]. Thus, we can conclude that various artificial neural

network and deep learning techniques discussed above,

and their variants, or modified approaches can play a

significant role to meet the current needs within the

context of cybersecurity.

4 Challenges and Research Directions

Our study on ANN and DL-based security analytics

opens several research issues in the area of cybersecu-

rity. Thus, in this section, we summarize and discuss

the challenges faced and the potential research oppor-

tunities and future directions to make the networks and

systems secured, automated, and intelligent.

In general, the effectiveness and the efficiency of

an ANN and DL-based security solution depend on

the nature and characteristics of the security data, and

the performance of the learning algorithms. To collect

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 13

the security data in the domain of cybersecurity is not

straight forward. The current cyberspace enables the

production of a huge amount of data with very high fre-

quency from different domains. Thus, to collect useful

data for the target applications, e.g., security in smart

city applications, and their management is important to

further analysis. Therefore, a more in-depth investiga-

tion of data collection methods is needed while working

on cybersecurity data. The historical security data, dis-

cussed in Section 2 may contain many ambiguous val-

ues, missing values, outliers, and meaningless data. The

ANN and DL algorithms including supervised, unsuper-

vised, and reinforcement learning, discussed in Section

3 highly impact on data quality, and availability for

training, and consequently on the security model. Thus,

to accurately clean and pre-process the diverse secu-

rity data collected from diverse sources is a challenging

task. Therefore, existing pre-processing methods or to

propose new data preparation techniques are required

to effectively use the learning algorithms in the domain

of cybersecurity.

To analyze the data and extract insights, there ex-

ists many neural networks and deep learning algorithms

for building a security model, discussed briefly in Sec-

tion 3. Thus, selecting a proper learning algorithm that

is suitable for the target application in the context of

cybersecurity is challenging. The reason is that the out-

come of different ANN and DL learning algorithms may

vary depending on the data characteristics [109]. We

have also summarized several key points of these tech-

niques in Table 2. Selecting a wrong learning algorithm

would result in producing unexpected outcomes that

may lead to loss of effort, as well as the model’s effec-

tiveness and accuracy. In terms of model building, the

techniques discussed in Section 3 can directly be used

to solve many security issues. However, the hybrid net-

work model, e.g., the ensemble of networks, or mod-

ifying with an improvement, designing new methods,

combining with machine learning techniques [100] [109]

according to the target outcome could be a potential

future work in the area.

Similarly, the irrelevant security data and features

may lead to garbage processing as well as incorrect re-

sults, which is also an important issue in the area. If the

security data is bad, such as non-representative, poor-

quality, irrelevant features, or insufficient quantity for

training, then the deep learning security models may

become useless or will produce lower accuracy. Thus

relevant and quality security data is important for bet-

ter outcome. In addition to the security features, the

broader contextual information [99] [105] [107] such as

temporal context, spatial context, or the relationship or

dependency among the events or network connections,

users behavioral activities, etc. might help to build an

adaptive system. The concept of recent pattern-based

analysis, i.e., recency [104] and designing corresponding

learning technique in cybersecurity solutions could also

be effective depending on the problem domain. Overall,

we can conclude that the success of a data-driven se-

curity solution depends on both the quality of the data

and the performance of the learning algorithms.

5 Concluding Remarks

In this paper, we have conducted a comprehensive overview

of cybersecurity from the perspective of artificial neural

networks and deep learning methods. We have also re-

viewed the recent studies in each category of the neural

networks to make the position of this paper. Thus, ac-

cording to our goal, we have briefly discussed how vari-

ous types of neural networks and deep learning methods

can be used for cybersecurity solutions in various con-

ditions. A successful security model must possess the

relevant deep learning modeling depending on the data

characteristics. The sophisticated learning algorithms

then need to be trained through the collected security

data and knowledge related to the target application

before the system can assist with intelligent decision

making.

Finally, we have summarized and discussed the chal-

lenges faced and the potential research opportunities

and future directions in the area. Therefore, to enhance

the security with time and growing popularity, the chal-

lenges that are identified create promising research op-

portunities in the field which must be addressed with

effective solutions. Overall, we believe that our study on

neural networks and deep learning-based security ana-

lytics opens a promising direction and can be used as a

reference guide for potential research and applications

for both the academia and industry professionals in the

domain of cybersecurity.

Compliance with ethical standards

Conflict of interest The author declares no conflict

of interest.

References

1. Alexa top sites. available online:https://aws.amazon.com/alexa-top-sites/ (accessedon 20 october 2019).

2. Bambenek consulting—master feeds. available online:http://osint.bambenekconsulting.com/feeds/ (accessedon 20 october 2019).

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

14 Sarker et al.

3. Caida anonymized internet traces 2008 dataset.http://www. caida.org/data/passive/passive-2008-dataset.xml/ (accessed on 20 october 2019).

4. Caida ddos attack 2007 dataset.http://www.caida.org/data/ passive/ddos-20070804-dataset.xml/ (accessed on 20 october 2019).

5. Canadian institute of cybersecurity, univer-sity of new brunswick, iscx dataset, urlhttp://www.unb.ca/cic/datasets/index.html/ (ac-cessed on 20 october 2019).

6. Comodo. available online:https://www.comodo.com/home/internet-security/updates/vdp/database. php (accessed on20 october 2019).

7. Contagio. available online:http://contagiodump.blogspot.com/ (accessed on20 october 2019).

8. Cse-cic-ids2018 [online]. available:https://www.unb.ca/cic/ datasets/ids-2018.html/(accessed on 20 october 2019).

9. The ctu-13 dataset. available online:https://stratosphereips.org/category/datasets-ctu13(accessed on 20 october 2019).

10. Dgarchive. available online:https://dgarchive.caad.fkie.fraunhofer.de/site/ (ac-cessed on 20 october 2019).

11. Enronspam. available online: https://labs-repos.iit.demokritos.gr/skel/i-config/downloads/enron-spam/ (accessed on 20 october 2019).

12. Google play store. available online:https://play.google.com/store/ (accessed on 20 oc-tober 2019).

13. The honeynet project.http://www.honeynet.org/chapters/france/ (accessedon 20 october 2019).

14. Ibm security report, https://www.ibm.com/security/data-breach (accessed on 20 october 2019).

15. Isot botnet dataset. https://www.uvic.ca/engineering/ece/isot/datasets/index.php/ (accessed on 20 october 2019).

16. Kdd cup 99. available online:http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html(accessed on 20 october 2019).

17. Lingspam. available online:https://labs-repos.iit.demokritos.gr/skel/i-config/downloads/lingspampublic.tar.gz/ (accessedon 20 october 2019).

18. Microsoft malware classification (big 2015). availableonline: http://arxiv.org/abs/1802.10135/ (accessed on20 october 2019).

19. Spamassassin. available online:http://www.spamassassin.org/publiccorpus/ (accessedon 20 october 2019).

20. Virusshare. available online: http://virusshare.com/(accessed on 20 october 2019).

21. Virustotal. available online: https://virustotal.com/(accessed on 20 october 2019).

22. Google trends. In https://trends.google.com/trends/,2021.

23. Mohammed Abuhamad, Tamer Abuhmed, David Mo-haisen, and Dae Hun Nyang. Autosen: Deep learning-based implicit continuous authentication using smart-phone sensors. IEEE Internet of Things Journal, 2020.

24. Moruf Akin Adebowale, Khin T Lwin, and Mo-hammed Alamgir Hossain. Intelligent phishing detec-tion scheme using deep learning algorithms. Journal ofEnterprise Information Management, 2020.

25. Steven Aftergood. Cybersecurity: The cold war online.Nature, 547(7661):30, 2017.

26. Mamoun Alazab, Sitalakshmi Venkatraman, Paul Wat-ters, Moutaz Alazab, et al. Zero-day malware detectionbased on supervised learning algorithms of api call sig-natures. 2010.

27. Md Zahangir Alom and Tarek M Taha. Networkintrusion detection for cyber security using unsuper-vised deep learning approaches. In 2017 IEEE Na-tional Aerospace and Electronics Conference (NAE-CON), pages 63–69. IEEE, 2017.

28. Shahid Anwar, Jasni Mohamad Zain, Mohamad FadliZolkipli, Zakira Inayat, Suleman Khan, Bokolo Anthony,and Victor Chang. From intrusion detection to an intru-sion response system: fundamentals, requirements, andfuture directions. Algorithms, 10(2):39, 2017.

29. M Arshey and Angel Viji KS. An optimization-baseddeep belief network for the detection of phishing e-mails.Data Technologies and Applications, 2020.

30. Kai Arulkumaran, Marc Peter Deisenroth, MilesBrundage, and Anil Anthony Bharath. Deep reinforce-ment learning: A brief survey. IEEE Signal ProcessingMagazine, 34(6):26–38, 2017.

31. Daniel S Berman, Anna L Buczak, Jeffrey S Chavis, andCherita L Corbett. A survey of deep learning methodsfor cyber security. Information, 10(4):122, 2019.

32. Francois Chollet. Xception: Deep learning with depth-wise separable convolutions. In Proceedings of the IEEEconference on computer vision and pattern recognition,pages 1251–1258, 2017.

33. Zhiyang Fang, Junfeng Wang, Jiaxuan Geng, and XuanKan. Feature selection for malware detection based onreinforcement learning. IEEE Access, 7:176177–176187,2019.

34. Ugo Fiore, Francesco Palmieri, Aniello Castiglione, andAlfredo De Santis. Network anomaly detection withthe restricted boltzmann machine. Neurocomputing,122:13–23, 2013.

35. Eric A Fischer. Cybersecurity issues and challenges: Inbrief, 2014.

36. Felipe De Almeida Florencio, Edward David MorenoOrdonez, Hendrik Teixeira Macedo, Ricardo Jose PaivaDe Britto Salgueiro, Filipe Barreto Do Nascimento, andFlavio Arthur Oliveira Santos. Intrusion detection viamlp neural network using an arduino embedded sys-tem. In 2018 VIII Brazilian Symposium on ComputingSystems Engineering (SBESC), pages 190–195. IEEE,2018.

37. Farhad Foroughi and Peter Luksch. Data sciencemethodology for cybersecurity projects. arXiv preprintarXiv:1803.04219, 2018.

38. Xianwei Gao, Changzhen Hu, Chun Shan, Baoxu Liu,Zequn Niu, and Hui Xie. Malware classification for thecloud via semi-supervised transfer learning. Journalof Information Security and Applications, 55:102661,2020.

39. Ivan Garcıa-Magarino, Rajarajan Muttukrishnan, andJaime Lloret. Human-centric ai for trustworthy iot sys-tems with explainable multilayer perceptrons. IEEE Ac-cess, 7:125562–125574, 2019.

40. Aurelien Geron. Hands-On Machine Learning withScikit-Learn, Keras, and TensorFlow: Concepts, Tools,and Techniques to Build Intelligent Systems. O’ReillyMedia, 2019.

41. Joshua Glasser and Brian Lindauer. Bridging the gap:A pragmatic approach to generating insider threat data.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 15

In 2013 IEEE Security and Privacy Workshops, pages98–104. IEEE, 2013.

42. Ian Goodfellow, Yoshua Bengio, Aaron Courville, andYoshua Bengio. Deep learning, volume 1. MIT pressCambridge, 2016.

43. Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza,Bing Xu, David Warde-Farley, Sherjil Ozair, AaronCourville, and Yoshua Bengio. Generative adversarialnets. In Advances in neural information processing sys-tems, pages 2672–2680, 2014.

44. Brij B Gupta, Aakanksha Tewari, Ankit Kumar Jain,and Dharma P Agrawal. Fighting against phishing at-tacks: state of the art and future challenges. NeuralComputing and Applications, 28(12):3629–3654, 2017.

45. Jiawei Han, Jian Pei, and Micheline Kamber. Data min-ing: concepts and techniques. Elsevier, 2011.

46. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and JianSun. Spatial pyramid pooling in deep convolutionalnetworks for visual recognition. IEEE transactions onpattern analysis and machine intelligence, 37(9):1904–1916, 2015.

47. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and JianSun. Deep residual learning for image recognition. InProceedings of the IEEE conference on computer visionand pattern recognition, pages 770–778, 2016.

48. Geoffrey E Hinton. Deep belief networks. Scholarpedia,4(5):5947, 2009.

49. Geoffrey E Hinton, Simon Osindero, and Yee-Whye Teh.A fast learning algorithm for deep belief nets. Neuralcomputation, 18(7):1527–1554, 2006.

50. Elike Hodo, Xavier Bellekens, Andrew Hamilton, Pierre-Louis Dubouilh, Ephraim Iorkyase, Christos Tachtatzis,and Robert Atkinson. Threat analysis of iot networksusing artificial neural network intrusion detection sys-tem. In 2016 International Symposium on Networks,Computers and Communications (ISNCC), pages 1–6.IEEE, 2016.

51. Taekeun Hong, Chang Choi, and Juhyun Shin. Cnn-based malicious user detection in social networks. Con-currency and Computation: Practice and Experience,30(2):e4163, 2018.

52. Shifu Hou, Aaron Saas, Yanfang Ye, and Lifei Chen.Droiddelver: An android malware detection system us-ing deep belief network based on api call blocks. In In-ternational Conference on Web-Age Information Man-agement, pages 54–66. Springer, 2016.

53. Yadigar Imamverdiyev and Fargana Abdullayeva. Deeplearning method for denial of service attack detectionbased on restricted boltzmann machine. Big Data,6(2):159–169, 2018.

54. Yousra Javed and Navid Rajabi. Multi-layer percep-tron artificial neural network based iot botnet trafficclassification. In Proceedings of the Future TechnologiesConference, pages 973–984. Springer, 2019.

55. Changhui Jiang, Yuwei Chen, Shuai Chen, Yuming Bo,Wei Li, Wenxin Tian, and Jun Guo. A mixed deeprecurrent neural network for mems gyroscope noise sup-pressing. Electronics, 8(2):181, 2019.

56. Xuyang Jing, Zheng Yan, Xueqin Jiang, and WitoldPedrycz. Network traffic fusion and analysis againstddos flooding attacks with a novel reversible sketch. In-formation Fusion, 51:100–113, 2019.

57. ElMouatez Billah Karbab, Mourad Debbabi, Abde-louahid Derhab, and Djedjiga Mouheb. Maldozer: Au-tomatic framework for android malware detection usingdeep learning. Digital Investigation, 24:S48–S59, 2018.

58. Jihyun Kim, Jaehyun Kim, Huong Le Thi Thu, andHowon Kim. Long short term memory recurrent neu-ral network classifier for intrusion detection. In 2016International Conference on Platform Technology andService (PlatCon), pages 1–5. IEEE, 2016.

59. Jin-Young Kim, Seok-Jun Bu, and Sung-Bae Cho. Mal-ware detection using deep transferred generative adver-sarial networks. In International Conference on NeuralInformation Processing, pages 556–564. Springer, 2017.

60. Jin-Young Kim, Seok-Jun Bu, and Sung-Bae Cho. Zero-day malware detection using transferred generative ad-versarial networks based on deep autoencoders. Infor-mation Sciences, 460:83–102, 2018.

61. Teuvo Kohonen. The self-organizing map. Proceedingsof the IEEE, 78(9):1464–1480, 1990.

62. Teuvo Kohonen. Essentials of the self-organizing map.Neural networks, 37:52–65, 2013.

63. Nickolaos Koroniotis, Nour Moustafa, Elena Sitnikova,and Benjamin Turnbull. Towards the development ofrealistic botnet dataset in the internet of things for net-work forensic analytics: Bot-iot dataset. Future Gener-ation Computer Systems, 100:779–796, 2019.

64. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton.Imagenet classification with deep convolutional neuralnetworks. In Advances in neural information processingsystems, pages 1097–1105, 2012.

65. Rajesh Kumar, Zhang Xiaosong, Riaz Ullah Khan, JayKumar, and Ijaz Ahad. Effective and explainable de-tection of android malware based on machine learningalgorithms. In Proceedings of the 2018 InternationalConference on Computing and Artificial Intelligence,pages 35–40. ACM, 2018.

66. Chet Langin, Hongbo Zhou, Shahram Rahimi, BidyutGupta, Mehdi Zargham, and Mohammad R Sayeh. Aself-organizing map and its modeling for discoveringmalignant network traffic. In 2009 IEEE symposiumon computational intelligence in Cyber Security, pages122–129. IEEE, 2009.

67. Duc C Le, A Nur Zincir-Heywood, and Malcolm I Hey-wood. Data analytics on network traffic flows for botnetbehaviour detection. In 2016 IEEE symposium serieson computational intelligence (SSCI), pages 1–7. IEEE,2016.

68. Yann LeCun, Leon Bottou, Yoshua Bengio, and PatrickHaffner. Gradient-based learning applied to documentrecognition. Proceedings of the IEEE, 86(11):2278–2324,1998.

69. Heng Li, ShiYao Zhou, Wei Yuan, Jiahuan Li, andHenry Leung. Adversarial-example attacks toward an-droid malware detection system. IEEE Systems Jour-nal, 14(1):653–656, 2019.

70. Qi Li, Mingyu Cheng, Junfeng Wang, and Bowen Sun.Lstm based phishing detection for big email data. IEEETransactions on Big Data, 2020.

71. Shancang Li, Li Da Xu, and Shanshan Zhao. The inter-net of things: a survey. Information Systems Frontiers,17(2):243–259, 2015.

72. Yanmiao Li, Yingying Xu, Zhi Liu, Haixia Hou, YushuoZheng, Yang Xin, Yuefeng Zhao, and Lizhen Cui. Ro-bust detection for network intrusion of industrial iotbased on multi-cnn fusion. Measurement, 154:107450,2020.

73. Brian Lindauer, Joshua Glasser, Mitch Rosen, Kurt CWallnau, and L ExactData. Generating test data forinsider threat detectors. JoWUA, 5(2):80–94, 2014.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

16 Sarker et al.

74. Richard P Lippmann, David J Fried, Isaac Graf,Joshua W Haines, Kristopher R Kendall, David Mc-Clung, Dan Weber, Seth E Webster, Dan Wyschogrod,Robert K Cunningham, et al. Evaluating intrusion de-tection systems: The 1998 darpa off-line intrusion de-tection evaluation. In Proceedings DARPA InformationSurvivability Conference and Exposition. DISCEX’00,volume 2, pages 12–26. IEEE, 2000.

75. Liu Liu, Olivier De Vel, Chao Chen, Jun Zhang, andYang Xiang. Anomaly-based insider threat detectionusing deep autoencoders. In 2018 IEEE InternationalConference on Data Mining Workshops (ICDMW),pages 39–48. IEEE, 2018.

76. Weibo Liu, Zidong Wang, Xiaohui Liu, Nianyin Zeng,Yurong Liu, and Fuad E Alsaadi. A survey of deep neu-ral network architectures and their applications. Neu-rocomputing, 234:11–26, 2017.

77. Alberto Uruena Lopez, Fernando Mateo, Julio Navıo-Marco, Jose Marıa Martınez-Martınez, Juan Gomez-Sanchıs, Joan Vila-Frances, and Antonio Jose Serrano-Lopez. Analysis of computer user behavior, security in-cidents and fraud using self-organizing maps. Comput-ers & Security, 83:38–51, 2019.

78. Manuel Lopez-Martin, Belen Carro, and AntonioSanchez-Esguevillas. Application of deep reinforcementlearning to intrusion detection for supervised problems.Expert Systems with Applications, 141:112963, 2020.

79. Ameya Malondkar, Roberto Corizzo, Iluju Kiringa,Michelangelo Ceci, and Nathalie Japkowicz. Spark-ghsom: Growing hierarchical self-organizing map forlarge scale mixed attribute datasets. Information Sci-ences, 496:572–591, 2019.

80. Benjamin Marlin, Kevin Swersky, Bo Chen, and NandoFreitas. Inductive principles for restricted boltzmannmachine learning. In Proceedings of the thirteenth inter-national conference on artificial intelligence and statis-tics, pages 509–516. JMLR Workshop and ConferenceProceedings, 2010.

81. M Mayuranathan, M Murugan, and V Dhanakoti. Bestfeatures based intrusion detection system by rbm modelfor detecting ddos in cloud environment. Journal ofAmbient Intelligence and Humanized Computing, pages1–11, 2019.

82. Timothy McIntosh, Julian Jang-Jaccard, Paul Wat-ters, and Teo Susnjak. The inadequacy of entropy-based ransomware detection. In International Confer-ence on Neural Information Processing, pages 181–189.Springer, 2019.

83. Niall McLaughlin, Jesus Martinez del Rincon, Boo-Joong Kang, Suleiman Yerima, Paul Miller, Sakir Sezer,Yeganeh Safaei, Erik Trickel, Ziming Zhao, AdamDoupe, et al. Deep android malware detection. In Pro-ceedings of the Seventh ACM on Conference on Dataand Application Security and Privacy, pages 301–308,2017.

84. Roland Memisevic and Geoffrey E Hinton. Learn-ing to represent spatial transformations with factoredhigher-order boltzmann machines. Neural computation,22(6):1473–1492, 2010.

85. Tim Merino, Matt Stillwell, Mark Steele, Max Coplan,Jon Patton, Alexander Stoyanov, and Lin Deng. Ex-pansion of cyber attack data from unbalanced datasetsusing generative adversarial networks. In InternationalConference on Software Engineering Research, Man-agement and Applications, pages 131–145. Springer,2019.

86. Sara Mohammadi, Hamid Mirvaziri, MostafaGhazizadeh-Ahsaee, and Hadis Karimipour. Cy-ber intrusion detection by combined feature selectionalgorithm. Journal of information security andapplications, 44:80–88, 2019.

87. Nour Moustafa and Jill Slay. Unsw-nb15: a comprehen-sive data set for network intrusion detection systems(unsw-nb15 network data set). In 2015 military com-munications and information systems conference (Mil-CIS), pages 1–6. IEEE, 2015.

88. Daniel Nahmias, Aviad Cohen, Nir Nissim, and YuvalElovici. Trustsign: Trusted malware signature genera-tion in private clouds using deep feature transfer learn-ing. In 2019 International Joint Conference on NeuralNetworks (IJCNN), pages 1–8. IEEE, 2019.

89. Daniel Nahmias, Aviad Cohen, Nir Nissim, and Yu-val Elovici. Deep feature transfer learning for trustedand automated malware signature generation in pri-vate cloud environments. Neural Networks, 124:243–257, 2020.

90. Sinno Jialin Pan and Qiang Yang. A survey on transferlearning. IEEE Transactions on knowledge and dataengineering, 22(10):1345–1359, 2009.

91. Gonzalo De La Torre Parra, Paul Rad, Kim-Kwang Ray-mond Choo, and Nicole Beebe. Detecting internet ofthings attacks using distributed deep learning. Journalof Network and Computer Applications, page 102662,2020.

92. Fabian Pedregosa, Gael Varoquaux, Alexandre Gram-fort, Vincent Michel, Bertrand Thirion, Olivier Grisel,Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vin-cent Dubourg, et al. Scikit-learn: Machine learningin python. the Journal of machine Learning research,12:2825–2830, 2011.

93. Feng Qu, Jitao Zhang, Zetian Shao, and Shuzhuang Qi.An intrusion detection model based on deep belief net-work. In Proceedings of the 2017 VI International Con-ference on Network, Communication and Computing,pages 97–101, 2017.

94. Xiaofei Qu, Lin Yang, Kai Guo, Linru Ma, Meng Sun,Mingxing Ke, and Mu Li. A survey on the developmentof self-organizing maps for unsupervised intrusion de-tection. Mobile Networks and Applications, pages 1–22,2019.

95. Edmar Rezende, Guilherme Ruppert, Tiago Carvalho,Fabio Ramos, and Paulo De Geus. Malicious soft-ware classification using transfer learning of resnet-50deep neural network. In 2017 16th IEEE Interna-tional Conference on Machine Learning and Applica-tions (ICMLA), pages 1011–1014. IEEE, 2017.

96. Aya Rizk and Ahmed Elragal. Data science: developingtheoretical contributions in information systems via textanalytics. Journal of Big Data, 7(1):1–26, 2020.

97. Dina Saif, SM El-Gokhy, and E Sallam. Deep be-lief networks-based framework for malware detectionin android systems. Alexandria engineering journal,57(4):4049–4057, 2018.

98. Mostafa A Salama, Heba F Eid, Rabie A Ramadan,Ashraf Darwish, and Aboul Ella Hassanien. Hybrid in-telligent intrusion detection scheme. In Soft comput-ing in industrial applications, pages 293–303. Springer,2011.

99. Iqbal H Sarker. Context-aware rule learning from smart-phone data: survey, challenges and future directions.Journal of Big Data, 6(1):95, 2019.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective 17

100. Iqbal H Sarker. A machine learning based robust pre-diction model for real-life mobile phone data. Internetof Things, 5:180–193, 2019.

101. Iqbal H Sarker. Ai-driven cybersecurity: An overview,security intelligence modeling and research directions.2021.

102. Iqbal H Sarker, Yoosef B Abushark, Fawaz Alsolami,and Asif Irshad Khan. Intrudtree: A machine learningbased cyber security intrusion detection model. Sym-metry, 12(5):754, 2020.

103. Iqbal H Sarker, Yoosef B Abushark, and Asif IrshadKhan. Contextpca: Predicting context-aware smart-phone apps usage based on machine learning techniques.Symmetry, 12(4):499, 2020.

104. Iqbal H Sarker, Alan Colman, and Jun Han. Recen-cyminer: mining recency-based personalized behaviorfrom contextual smartphone data. Journal of Big Data,6(1):1–21, 2019.

105. Iqbal H Sarker, Alan Colman, Muhammad Ashad Kabir,and Jun Han. Individualized time-series segmentationfor mining mobile phone user behavior. The ComputerJournal, 61(3):349–368, 2018.

106. Iqbal H Sarker, Mohammed Moshiul Hoque, Md KafilUddin, and Tawfeeq Alsanoosy. Mobile data science andintelligent apps: Concepts, ai-based modeling and re-search directions. Mobile Networks and Applications,pages 1–19, 2020.

107. Iqbal H Sarker and ASM Kayes. Abc-ruleminer:User behavioral rule-based machine learning method forcontext-aware intelligent services. Journal of Networkand Computer Applications, 168:102762, 2020.

108. Iqbal H Sarker, ASM Kayes, Shahriar Badsha, HamedAlqahtani, Paul Watters, and Alex Ng. Cybersecuritydata science: an overview from machine learning per-spective. Journal of Big Data, 7(1):1–29, 2020.

109. Iqbal H Sarker, ASM Kayes, and Paul Watters. Ef-fectiveness analysis of machine learning classificationmodels for predicting personalized context-aware smart-phone usage. Journal of Big Data, 6(1):57, 2019.

110. Joshua Saxe and Hillary Sanders. Malware data science:Attack detection and attribution. 2018.

111. Sanghyun Seo, Seongchul Park, and Juntae Kim. Im-provement of network intrusion detection accuracy byusing restricted boltzmann machine. In 2016 8th In-ternational Conference on Computational Intelligenceand Communication Networks (CICN), pages 413–417.IEEE, 2016.

112. Kamalakanta Sethi, Rahul Kumar, Nishant Prajapati,and Padmalochan Bera. Deep reinforcement learningbased intrusion detection system for cloud infrastruc-ture. In 2020 International Conference on COMmu-nication Systems & NETworkS (COMSNETS), pages1–6. IEEE, 2020.

113. P Mohamed Shakeel, S Baskar, VR Sarma Dhulipala,Sukumar Mishra, and Mustafa Musa Jaber. Maintainingsecurity and privacy in health care system using learn-ing based deep-q-networks. Journal of medical systems,42(10):186, 2018.

114. Abraham Shaw. Data breach: from notification to pre-vention using pci dss. Colum. JL & Soc. Probs., 43:517,2009.

115. Wan-Chen Shi and Hung-Min Sun. Deepbot: a time-based botnet detection with deep learning. SOFTCOMPUTING, 2020.

116. Ali Shiravi, Hadi Shiravi, Mahbod Tavallaee, and Ali AGhorbani. Toward developing a systematic approach

to generate benchmark datasets for intrusion detection.computers & security, 31(3):357–374, 2012.

117. Beata Slusarczyk. Industry 4.0: Are we ready? PolishJournal of Management Studies, 17, 2018.

118. Nan Sun, Jun Zhang, Paul Rimba, Shang Gao, Leo YuZhang, and Yang Xiang. Data-driven cybersecurity inci-dent prediction: A survey. IEEE Communications Sur-veys & Tutorials, 21(2):1744–1772, 2018.

119. Bambang Susilo and Riri Fitri Sari. Intrusion detectionin iot networks using deep learning algorithm. Informa-tion, 11(5):279, 2020.

120. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Ser-manet, Scott Reed, Dragomir Anguelov, Dumitru Er-han, Vincent Vanhoucke, and Andrew Rabinovich. Go-ing deeper with convolutions. In Proceedings of theIEEE conference on computer vision and pattern recog-nition, pages 1–9, 2015.

121. Juan E Tapiador, Agustin Orfila, Arturo Ribagorda,and Benjamin Ramos. Key-recovery attacks on kids,a keyed anomaly detection system. IEEE Transactionson Dependable and Secure Computing, 12(3):312–325,2013.

122. Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali AGhorbani. A detailed analysis of the kdd cup 99 dataset. In 2009 IEEE Symposium on Computational In-telligence for Security and Defense Applications, pages1–6. IEEE, 2009.

123. Mahbod Tavallaee, Natalia Stakhanova, and Ali Ak-bar Ghorbani. Toward credible evaluation of anomaly-based intrusion-detection methods. IEEE Transactionson Systems, Man, and Cybernetics, Part C (Applica-tions and Reviews), 40(5):516–524, 2010.

124. Duc Tran, Hieu Mac, Van Tong, Hai Anh Tran, andLinh Giang Nguyen. A lstm based framework for han-dling multiclass imbalance in dga botnet detection. Neu-rocomputing, 275:2401–2413, 2018.

125. Juha Vesanto and Esa Alhoniemi. Clustering of theself-organizing map. IEEE Transactions on neural net-works, 11(3):586–600, 2000.

126. R Vinayakumar, KP Soman, and Prabaharan Poor-nachandran. Deep android malware detection and clas-sification. In 2017 International conference on ad-vances in computing, communications and informatics(ICACCI), pages 1677–1683. IEEE, 2017.

127. Ly Vu, Quang Uy Nguyen, Diep N Nguyen, Dinh ThaiHoang, and Eryk Dutkiewicz. Deep transfer learningfor iot attack detection. IEEE Access, 8:107335–107344,2020.

128. Wei Wang, Mengxue Zhao, and Jigang Wang. Effectiveandroid malware detection with a hybrid model basedon deep autoencoder and convolutional neural network.Journal of Ambient Intelligence and Humanized Com-puting, 10(8):3035–3043, 2019.

129. Peng Wei, Yufeng Li, Zhen Zhang, Tao Hu, Ziyong Li,and Diyang Liu. An optimization method for intrusiondetection classification model based on deep belief net-work. IEEE Access, 7:87593–87605, 2019.

130. Karl Weiss, Taghi M Khoshgoftaar, and DingDingWang. A survey of transfer learning. Journal of Bigdata, 3(1):9, 2016.

131. Peilun Wu, Hui Guo, and Richard Buckland. A transferlearning approach for network intrusion detection. In2019 IEEE 4th International Conference on Big DataAnalytics (ICBDA), pages 281–285. IEEE, 2019.

132. Xi Xiao, Dianyan Zhang, Guangwu Hu, Yong Jiang, andShutao Xia. Cnn-mhsa: A convolutional neural network

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1

18 Sarker et al.

and multi-head self-attention combined approach for de-tecting phishing websites. Neural Networks, 2020.

133. Miao Xie, Jiankun Hu, Xinghuo Yu, and ElizabethChang. Evaluating host-based anomaly detection sys-tems: Application of the frequency-based algorithms toadfa-ld. In International Conference on Network andSystem Security, pages 542–549. Springer, 2015.

134. Yang Xin, Lingshuang Kong, Zhi Liu, Yuling Chen, Yan-miao Li, Hongliang Zhu, Mingcheng Gao, Haixia Hou,and Chunhua Wang. Machine learning and deep learn-ing methods for cybersecurity. IEEE Access, 6:35365–35381, 2018.

135. Binghao Yan and Guodong Han. Effective feature ex-traction via stacked sparse autoencoder to improve in-trusion detection system. IEEE Access, 6:41238–41248,2018.

136. Jinpei Yan, Yong Qi, and Qifan Rao. Detecting malwarewith an ensemble method based on deep neural network.Security and Communication Networks, 2018, 2018.

137. Ping Yi, Yuxiang Guan, Futai Zou, Yao Yao, Wei Wang,and Ting Zhu. Web phishing detection using a deeplearning framework. Wireless Communications and Mo-bile Computing, 2018, 2018.

138. Chuanlong Yin, Yuefei Zhu, Shengli Liu, Jinlong Fei,and Hetong Zhang. An enhancing framework for botnetdetection using generative adversarial networks. In 2018International Conference on Artificial Intelligence andBig Data (ICAIBD), pages 228–234. IEEE, 2018.

139. Mahmood Yousefi-Azar, Vijay Varadharajan, LenHamey, and Uday Tupakula. Autoencoder-based featurelearning for cyber security applications. In 2017 Inter-national joint conference on neural networks (IJCNN),pages 3854–3861. IEEE, 2017.

140. Guijuan Zhang, Yang Liu, and Xiaoning Jin. A surveyof autoencoder-based recommender systems. Frontiersof Computer Science, 14(2):430–450, 2020.

141. Juan Zhao, Sachin Shetty, Jan Wei Pan, CharlesKamhoua, and Kevin Kwiat. Transfer learning for de-tecting unknown network attacks. EURASIP Journalon Information Security, 2019(1):1, 2019.

142. Yajin Zhou and Xuxian Jiang. Dissecting androidmalware: Characterization and evolution. In 2012IEEE symposium on security and privacy, pages 95–109. IEEE, 2012.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 February 2021 doi:10.20944/preprints202102.0340.v1