International Journal of Information Technology Vol. 24 No. 1 2018
Abstract
Background: Blockchain and deep learning, touted as two revolutionary technologies, have gained
great success in their respective fields. Quite a lot of efforts have been made to develop various
algorithms and applications. However, few studies have considered how to combine these two
technologies and improve their respective performance.
Methods: In this paper, we propose to address this issue by answering two questions:
1. How does deep learning benefit blockchain?
2. How does blockchain benefit deep learning?
Results: For the former, we systematically decompose the blockchain system into five layers,
including a network layer, a consensus layer, a data model layer, an execution layer, and an
application layer, from bottom to top. In each layer, we introduce the major challenge, and the
corresponding deep learning-based approach. For the latter, we discuss the benefits of blockchain,
including anonymity, decentralization, security and immutability, which can be utilized to protect
data privacy and coordinate distributed control in deep learning applications. Then we devise a
unified blockchain based deep learning framework by decoupling the data plane and control plane.
Conclusions: Our work provides a general and holistic perspective to integrate these two emerging
technologies and consolidates and links the broad body of these two technologies while distilling the
key features.
The Tango Between Blockchain and Deep Learning: An
Outlook
Han Hu1, Yong Luo2, and Yonggang Wen2
1: Beijing Institute of Technology, China; 2: Nanyang
Technological University, Singapore
[email protected]; {yluo,ygwen}@ntu.edu.sg
Han Hu, Yong Luo, and Yonggang Wen
Keywords: Blockchain, Artificial Intelligence, Data Scientist, Distributed Model Training.
I. Introduction
As two of the hottest technologies in recent years, blockchain and deep learning have achieved great
success, and attracted huge enthusiasm from both industry and academia. Essentially, the blockchain
is an append-only data structure hosted by a set of distributed un-trusted nodes. The structure
maintains transaction records of the whole system. The nodes verify these records by a distributed
consensus protocol and ensure the consistency and credibility. Based on the blockchain technology,
a series of digital currencies, such as Bitcoin [1] and Ethereum [2], have gained amazing commercial
profits. Built on deep neural networks, deep learning methods [3] utilize a cascade of neurons for
representation learning and feature extraction. In many applications, such as computer vision and
sports, deep learning-based approaches have now surpassed human level capability.
These two technologies are still far from mature due to their inherent challenges. For the former,
blockchain needs improvement in the following aspects: 1) Scalability: as the number of
transactions grows day by day, the records stored in the blockchain system become huge, which
requires nodes with more powerful capabilities. In addition, the transaction speed and verification
time lag far behind those of existing payment systems. For instance, Bitcoin handles at most 7
transactions per second, while VISA can handle 1,700 transactions per second [4]; 2) Security: the
blockchain system is designed to operate over a P2P network, which is vulnerable to network
attacks, such as denial of service attacks and Sybil attacks; 3) Privacy: even though the blockchain
system allows users to generate many addresses to prevent information leakage, recent studies show
that privacy can still be compromised to different degrees. For example, we can map a blockchain
address to the real IP address from which the transaction was made, and we can link different
blockchain addresses belonging to the same user [5]; and 4) Energy Consumption:
due to the computational complexity of verifying blocks (i.e., block mining), the blockchain system
consumes enormous amounts of electricity. For example, the electricity consumption of the Bitcoin
system surpassed that of 178 countries in 2017 [6]. For the latter, deep learning is now being
applied to more data-sensitive applications with higher performance requirements. It should tackle
the following issues: 1) Privacy protection: owners of sensitive data, such as hospitals and finance
companies, commonly rely on third-party artificial intelligence companies for value mining, owing to
the expertise and complexity that data analytics involves. As deep learning pays little attention to
protecting privacy, inappropriate usage of these sensitive data may incur the risk of privacy leakage; and 2) Distributed
coordination: in order to improve the accuracy, deep learning models are becoming more and more
complex, which leads to a heavy burden in model training. To tackle this issue, the prevalent
frameworks employ the distributed architecture, such as multiple GPUs or even multiple clusters, to
accelerate the training speed. A reliable and effective coordination strategy is desired in distributed
model training.
Huge efforts have been devoted to solving the aforementioned challenges and improving the
performance of blockchain and deep learning separately. Tschorsch et al. [7] presented a
comprehensive survey on the Bitcoin system, and explored the design space on security threats,
privacy protection and consensus protocol. Croman et al. [4] analyzed the fundamental bottlenecks
that limit Bitcoin and proposed re-parameterization of the block size and data sharding to improve
scalability in terms of higher throughput and lower latency. Khalilov et al. [5] analyzed
anonymity and privacy in Bitcoin and described several improvement methods for these two issues.
Proof of work (PoW) [1], adopted by Bitcoin and Ethereum, requires a complicated computational
process in the authentication and thus consumes a huge amount of electricity. Some other
power-saving consensus protocols have been devised in recent years, such as Proof of stake (PoS) [8]
and Practical Byzantine fault tolerance (PBFT) [9]. These efforts improve the blockchain system from
different angles, but are not related to deep learning. On the other hand, deep learning is not specifically
designed for protecting data privacy. It mainly relies on protecting data in cloud-based services,
including making databases more secure [10], hiding trained models [11], and enforcing access control
on sensitive data [12]. Most of these methods are based on a centralized architecture, which
presents security and robustness vulnerabilities. To preserve privacy in model training, various
distributed machine learning algorithms have been proposed, such as EXPLORE [13] and distributed
autonomous online learning [14]. All these machine learning algorithms, which update the
models in either a batch or an online fashion, rely on a centralized network architecture that may
suffer from security risks such as a single point of failure.
A fundamental issue is “can we integrate these two techniques together to bridge the aforementioned
performance gap?” In particular, we aim to answer two essential questions:
1. How does deep learning benefit blockchain?
2. How does blockchain benefit deep learning?
In response to the first question, we systematically decompose the blockchain system into five
layers, including a network layer, a consensus layer, a data model layer, an execution layer, and an
application layer, from bottom to top. In each layer, we introduce the major challenge, and the
corresponding deep learning based approach. In particular, the network layer runs over the P2P
network, which is vulnerable to distributed network attacks, such as denial of service. We briefly
review deep learning based network attack detection. The consensus layer utilizes a distributed
protocol (e.g., proof-of-work) to make all the nodes agree on the blockchain content, which is closely
related to block mining and consumes a huge amount of power. We present deep reinforcement
learning based workload scheduling and cooling control for electricity saving. The data model layer
specifies how data is organized and stored in the whole system, such as transactions, blocks and
the blockchain. We focus on scalability in this layer and discuss a graph based data sharding
strategy to tackle this issue. The application layer builds a variety of applications and services over
the underlying blockchain system, such as Bitcoin and Ethereum. We introduce the privacy leakage
challenge and present graph-based privacy analysis.
In response to the second question, we first discuss the benefits of blockchain, including anonymity,
decentralization, security and immutability. These benefits can be utilized to protect data privacy and
coordinate distributed control in deep learning applications. Owing to the inherent limitations of
blockchain, we cannot directly integrate blockchain into deep learning applications. Therefore, we
then devise a unified blockchain based deep learning framework. It consists of three layers: a
blockchain layer, an AI entity layer, and a resource layer. The core idea is to decouple the data plane
and control plane. The control plane is based on the blockchain; its purpose is to store the control
messages among AI entities and provide consensus. The data plane is built on the mature cloud
computing paradigm. It hosts the actual data and provides storage and computational capacity.
Finally, we present two case studies based on the framework: data privacy management and
distributed model training.
The rest of this paper is structured as follows. Section II introduces the basic knowledge of
blockchain and deep learning. Section III presents the challenges of blockchain, and several deep
learning based methods to tackle them. Section IV analyzes the challenges of deep learning and
devises a unified framework to tackle them. Finally, Section V concludes the whole paper.
II. A Brief Introduction to Blockchain and AI
In this section, we give a brief introduction to blockchain and AI, with an emphasis on
machine learning algorithms.
A. Blockchain
We present the major building blocks of blockchain protocols, including blockchain network,
transaction, Merkle trees, and block mining process.
Underlay Network. Blockchain systems run over a P2P network. The participants of a blockchain
system are similar to the nodes in the P2P network. In contrast to a typical P2P network, the
blockchain system utilizes the underlay network to broadcast data among all the connected nodes,
using a flooding protocol. The main advantage of using a P2P network is the agile movement of
data, which lets all nodes achieve consensus.
Transaction. We define a transaction as the transfer of an amount of digital currency ownership
rights from the wallet of the buyer to the wallet of the seller, in exchange for a product or service.
The wallets utilize elliptic curve digital signatures to handle the transfer of ownership rights and
ensure that unauthorized spending of the cryptocurrency is infeasible. Each wallet randomly
generates a private key Pr which is used to derive its corresponding public key Pub that is shared
among all users. The Pub is used to generate the address of the wallet needed to make payments to it
while the Pr is used to generate a digital signature corresponding to the Pub in order to claim
payments made to the wallet and use them in later transactions.
Block & Blockchain. Validated transactions are grouped into blocks which are then mined and
stored in the blockchain. A single block can contain multiple transactions up to the block size limit.
Merkle trees, sometimes referred to as hash trees, are utilized to cluster multiple transactions in one
block. The blockchain is a public ledger that stores all previous transactions since the creation of
Bitcoin. As new transactions are processed, the blockchain is extended. Blocks are linked back-to-
back, with each one referencing its previous block to form the complete blockchain.
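As a minimal sketch of the two structures just described, the following Python code folds a list of transactions into a Merkle root and links toy blocks through their hashes. It assumes double SHA-256 and duplication of the last hash on odd-sized levels, as in Bitcoin; the block layout, the helper names (`sha256d`, `merkle_root`, `make_block`) and the sample transactions are illustrative assumptions, and real block headers carry more fields.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin applies to transactions and block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Fold a non-empty list of serialized transactions into a single root hash."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [sha256d(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def make_block(prev_hash: bytes, transactions: list[bytes]) -> dict:
    """Build a toy block whose hash covers the previous hash and the Merkle root."""
    root = merkle_root(transactions)
    return {"prev_hash": prev_hash, "merkle_root": root, "hash": sha256d(prev_hash + root)}


# Hypothetical transactions; a real system would serialize signed transfers.
genesis = make_block(b"\x00" * 32, [b"coinbase"])
block1 = make_block(genesis["hash"], [b"alice->bob:1", b"bob->carol:2", b"carol->dave:3"])
```

Because each block hash covers the previous block hash, changing any earlier transaction changes every later hash, which is what makes the ledger append-only in practice.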
Block Mining. Block mining refers to the transaction verification. After this stage, the valid
transaction is added to the blockchain, and becomes visible to all users. The transaction claimers can
use the transaction as the input to other transactions whenever desired.
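To make the cost of block mining concrete, the sketch below implements a simplified hash puzzle in the style of proof-of-work: the miner searches for a nonce such that the hash of the block header falls below a target. The header bytes, the 8-byte nonce encoding and the `difficulty_bits` parameter are assumptions chosen for illustration; they do not reproduce the Bitcoin header format.

```python
import hashlib


def puzzle_hash(header: bytes, nonce: int) -> int:
    """Interpret SHA-256(header || nonce) as a 256-bit integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution takes a single hash evaluation."""
    return puzzle_hash(header, nonce) < (1 << (256 - difficulty_bits))


def mine(header: bytes, difficulty_bits: int) -> int:
    """Finding a solution takes about 2**difficulty_bits evaluations on average."""
    nonce = 0
    while not verify(header, nonce, difficulty_bits):
        nonce += 1
    return nonce


# Hypothetical header; 12 bits keeps the search to a few thousand hashes.
nonce = mine(b"toy block header", difficulty_bits=12)
```

The asymmetry between `mine` and `verify` is the point of the design: producing a block is expensive, while any node can check it cheaply.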
B. Machine Learning
Machine learning is about learning from data and making predictions and/or decisions, which can be
categorized into supervised, unsupervised, and reinforcement learning. Supervised learning is built
on labeled data. Classification and regression are two types of supervised learning problems, with
categorical and numerical outputs respectively. Unsupervised learning extracts information from
unlabeled data. Typical unsupervised learning problems include clustering, density estimation, and
dimension reduction. Reinforcement learning handles sequential decision problems with
evaluative feedback, but no supervised signals. Supervised and unsupervised learning are usually
one-shot with instant reward, while reinforcement learning is sequential, considering long-term
accumulative reward.
Deep learning, built on deep neural networks, is a particular machine learning scheme. It can be
utilized for supervised or unsupervised learning, and can also be integrated with reinforcement
learning, usually as a function approximator.
III. AI Benefits Blockchain
In this section, we discuss the challenges facing blockchain and how to utilize AI to tackle
these challenges by enumerating several cases, including power management in block mining,
scalability and privacy analysis.
A. Layered Architecture and Challenges of Blockchain
As an emerging technology, blockchain is far from mature and faces multiple challenges and
problems. We highlight four major issues: scalability, privacy, power consumption, and
network security, as shown in Figure 1.
Scalability. As the number of transactions grows linearly or even exponentially, blockchain faces a
serious storage scalability issue. The size of the chain is not bounded but grows indefinitely as time
passes. In August 2017, the Bitcoin blockchain was about 120 GB, while it was only 75 GB in August
2016. When joining the Bitcoin network, a bootstrapping node has to download and verify the entire
120 GB blockchain. This creates a high barrier for low-resource devices.
Privacy. Although blockchain utilizes several schemes to provide the anonymity and safety it claims,
researchers can still infer certain user profiles using the blockchain transaction history or network
information. In particular, we can link user pseudonyms to IP addresses even when users are behind
network address translation or firewalls. We can also cluster different Bitcoin addresses into the
same entity (e.g., user) even though users may generate multiple public keys for their transactions.
Power consumption. Blockchain relies on solving complex mathematical puzzles to create new
blocks, which consumes enormous amounts of electricity. According to Digiconomist’s Bitcoin Energy
Consumption Index, as of 25 April 2018, Bitcoin's estimated annual electricity
consumption stands at 62.73 TWh. That is equivalent to 0.28% of global electricity
consumption, and ranks 41st in the world in terms of country electricity consumption (i.e., surpassing
178 countries). More seriously, this value will gradually increase in the coming years.
Network security. Blockchain is designed to operate over a P2P network, which is vulnerable to
distributed network attacks. Typical network attacks include denial of service (DoS), Sybil
attacks, eclipse attacks, and routing attacks.
Figure 1 Layered architecture and major challenges of blockchain systems
B. Opportunities from AI
Recent advances in AI, especially deep neural networks and deep reinforcement learning, provide
powerful tools to solve these issues.
Deep neural networks, built on multi-layer neural networks, have powerful representation learning
and pattern recognition capabilities, and outperform traditional machine learning algorithms. They
replace feature extraction and (un)supervised feature learning in the traditional methods with a
unified neural architecture, and learn more latent features via a huge number of connected neurons to
gain better performance. They have already demonstrated great success in computer vision, natural
language processing, and speech recognition.
Deep reinforcement learning integrates reinforcement learning and deep neural networks. In
particular, deep neural networks are utilized to approximate the value function, policy or model. Owing
to representation learning and end-to-end learning through gradient descent, deep reinforcement
learning does not rely on domain knowledge, and can combat the exponential challenges of the curse
of dimensionality via the hierarchical composition of factors in data. It has achieved remarkable
success in robotics, games and natural language processing.
C. AI Based Privacy Analysis
From network measurements and transaction history, we can conduct four types of
privacy analysis:
• Mapping between blockchain address and real entity: we may use the blockchain address to
infer the (sur)name of the corresponding entity, and vice versa.
• Mapping from blockchain address to IP address: we may use the blockchain address to infer
the possible IP addresses where the transactions are generated.
• Linking blockchain addresses: as one user may have more than one blockchain address, we
may link blockchain addresses belonging to the same user together.
• Mapping from blockchain address to geo-location: we may use the blockchain address to
infer the physical location of a user or transaction.
Currently, there are four types of methods to analyze blockchain privacy: transacting,
off-network crawling, physical network analysis, and transaction historical data analysis. For the
transacting methods, as a buyer must know the blockchain address of the seller to make a payment,
one can act as a buyer and learn other entities' mapping information, i.e., blockchain address and
entity name. For the off-network crawling methods, we can crawl the mapping relationship from
external off-network sources. For instance, donation websites publish their blockchain addresses
to prevent service abuse. For the physical network analysis, we can utilize physical network
information to infer the IP address where a transaction is generated. For example, one can set up a super
client connected to every node in the blockchain network. As transaction propagation relies on
peers to relay, it can be assumed that the first node to inform the super client of a transaction is the
source. For the transaction historical data analysis, we can download the whole blockchain, and build
the transaction graph or blockchain address graph for analysis. Blockchain privacy analysis is still in
its infancy, and these studies are far from mature.
In fact, blockchain privacy analysis can be cast as a semi-supervised learning problem over a
directed graph. In particular, we can utilize the design principles of blockchain to construct some
label information, including:
• In a multi-input transaction, a user performs a payment using more than one address. These
addresses belong to the same user.
• In a multi-output transaction, there will be a newly generated blockchain address. This address
belongs to the buyer.
Then we can build the graph for either transactions or blockchain addresses. Taking the blockchain
address graph as an example, we can use Figure 2 to represent the transaction flow. Each circle
represents a blockchain address. The edge between two circles represents a transaction, e.g., node
1 pays some bitcoins to node 2. According to the aforementioned information, we can infer that some
blockchain addresses belong to the same user. In Figure 2, node 1 and node 2 belong to the same
user. Based on this graph, we can utilize social graph analysis to analyze blockchain privacy.
For instance, the payments of certain blockchain users may exhibit some patterns. We can evaluate
the behaviors of blockchain addresses, and cluster them together. The blockchain addresses in the
same cluster may correspond to a user or a geo-location.
Figure 2 Graph analysis for blockchain privacy
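The multi-input labeling rule can be sketched with a union-find structure: every pair of addresses that appear together as inputs of one transaction is merged into the same cluster. This is only one way to realize the rule, not necessarily the analysis behind Figure 2; the transaction format (dictionaries with `inputs` and `outputs`) and the address names are hypothetical, and the multi-output rule is left out because identifying the newly generated address requires the full history.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure over hashable items, created lazily on first use."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            self.parent[item] = self.parent[self.parent[item]]  # path halving
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of the same transaction."""
    uf = UnionFind()
    for tx in transactions:
        inputs = tx["inputs"]
        for address in inputs + tx["outputs"]:
            uf.find(address)  # register every address, even unlinked ones
        for other in inputs[1:]:
            uf.union(inputs[0], other)
    clusters = defaultdict(set)
    for address in list(uf.parent):
        clusters[uf.find(address)].add(address)
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical transaction history.
transactions = [
    {"inputs": ["A1", "A2"], "outputs": ["B1"]},
    {"inputs": ["A2", "A3"], "outputs": ["C1"]},
    {"inputs": ["B1"], "outputs": ["C1", "B2"]},
]
clusters = cluster_addresses(transactions)
```

Here `A1`, `A2` and `A3` end up in one cluster because `A2` co-occurs with each of the other two, while the remaining addresses stay in singleton clusters. Such clusters could serve as the label information for the semi-supervised graph analysis described above.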
D. AI Based Power Management
To gain an advantage over competitors, blockchain miners commonly utilize large-scale
computation clusters with powerful computation capabilities to verify blocks faster, at the cost of a
large amount of electricity consumption. The electricity consumption is primarily attributed to
servers and cooling equipment. In particular, the cooling system alone may account for up to 30% on
average, while the servers consume around 56% of total electricity consumption. There exists great
potential to optimize the electricity consumption of both the cooling system and the server
system.
Most existing works on optimizing the electricity consumption of the server system and cooling system
utilize model based methods to adjust the workload distribution and cooling configuration. In
particular, they model the workload arrival, server system utilization, and temperature distribution,
schedule more workloads to servers with low utilization to reduce power, and adjust the airflow rate
to decrease the indoor temperature. These methods may suffer from the following issues: 1) Some
key factors (such as CPU utilization, CPU power consumption, rack inlet temperature, etc.) are
approximated by linear functions, which are sometimes inadequate or inaccurate for capturing the
intricacies of the various interacting processes of a data center; 2) The complexity of the
optimization problem and its local optima cannot be addressed well; and 3) The dynamics of a data
center cannot be well captured by a mathematical model.
Recent breakthroughs in deep reinforcement learning provide a promising technique for optimizing
electricity consumption in block mining. Under such a framework, we only need to define the system
state space (e.g., workload arrival), action space (e.g., airflow adjustment), and reward function (e.g.,
overall electricity consumption), build the real environment or testbed to train the model, and utilize
proper algorithms, such as deep Q-network (DQN), deep deterministic policy gradient (DDPG), or
asynchronous advantage actor-critic (A3C), to find the optimal solution [15]. In comparison with the
model based approaches, this method has several advantages: 1) It does not rely on accurate system
models (e.g., queueing models) and thereby enhances the applicability in complex environments with
random behaviors. For instance, the thermal diffusion process in the cluster is difficult to model,
and is approximated as a linear function in previous works; 2) It is able to deal with highly
time-variant environments. For example, the transaction generation and server utilization are highly
dynamic.
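The state, action and reward formulation can be illustrated with a deliberately small example. The sketch below uses tabular Q-learning rather than DQN, DDPG or A3C, because the toy state space is small enough for a table; the learning loop has the same shape. The environment dynamics in `step`, the five temperature levels, the three airflow settings and all cost constants are invented for illustration and do not model a real mining cluster.

```python
import random

TEMPS = range(5)     # discretized temperature levels, 0 = cool, 4 = hot
AIRFLOWS = range(3)  # airflow settings, 0 = low, 2 = high


def step(temp, airflow, rng):
    """Hypothetical cluster dynamics: workload heats the room, airflow cools it."""
    heat = rng.choice([0, 1, 2])                    # random workload arrival
    next_temp = min(max(temp + heat - airflow, 0), 4)
    fan_cost = float(airflow)                       # electricity spent on cooling
    overheat_cost = 5.0 if next_temp == 4 else 0.0  # penalty for running hot
    return next_temp, -(fan_cost + overheat_cost)


def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning; returns the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0 for _ in AIRFLOWS] for _ in TEMPS]
    for _ in range(episodes):
        temp = rng.choice(list(TEMPS))
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(list(AIRFLOWS))               # explore
            else:
                action = max(AIRFLOWS, key=lambda a: q[temp][a])  # exploit
            next_temp, reward = step(temp, action, rng)
            target = reward + gamma * max(q[next_temp])
            q[temp][action] += alpha * (target - q[temp][action])
            temp = next_temp
    return q


q_table = train()
# Greedy airflow setting for each temperature level under the learned values.
policy = [max(AIRFLOWS, key=lambda a: q_table[t][a]) for t in TEMPS]
```

No model of the thermal process is written down anywhere in `train`; the agent only observes states and rewards, which is the property that makes this family of methods attractive when the dynamics are hard to model.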
E. AI Based Network Security Detection
We present major network attacks that can compromise the blockchain network, including DoS,
Sybil attacks, eclipse attacks, and routing attacks. DoS attacks flood the network with bogus traffic
to disrupt legitimate services and nodes connected to the blockchain network. Sybil attacks create
multiple pseudonymous identities from a single node, monopolize connections to nodes, and control
the data propagating to them. Eclipse attacks monopolize all the outbound and inbound connections of
a node and isolate the victim node from the rest of the blockchain network. Routing attacks intercept
the messages transmitted over the network and tamper with them. These attacks seriously affect the
regular operation of the blockchain network. For instance, DoS attacks on a mining pool can eliminate the
pool from the mining competition, and give an advantage to other miners. Sybil attacks, eclipse attacks
and routing attacks can mislead honest nodes with false messages, and cause subsequent
harm, such as double spending.
Many researchers have proposed innovative approaches to network attack detection, which can be
categorized into rule based approaches and machine learning based approaches. The rule based
approaches set up rules according to previous network attack behaviors or personal experience.
If the incoming network traffic matches certain rules, then the traffic is simply classified as a type of
attack. The machine learning based approaches rely on special datasets, such as
(ab)normal network traffic datasets. Using the dataset, they manually extract network attack related
features, such as average packets per flow, average bytes per flow, and growth of different ports, and
then take advantage of machine learning algorithms, such as SVMs, random forests, genetic algorithms
and artificial neural networks, to train a classifier. The resulting classifier is used to determine
whether new traffic is a network attack. These methods rely on manually determined rules or
features, and their learning capacity is limited. When the network environment is complicated, their
learning efficiency decreases further.
At its core, network attack detection is a classification problem, which can be solved well by
multi-layer neural network based algorithms. Taking the widely used NSL-KDD dataset as an
example, each sample consists of multiple fields, such as protocol type, source bytes, destination
bytes, flag, etc. We can roughly divide them into two types: categorical features and continuous
features. For example, "Protocol type" can be TCP or UDP, which is a categorical feature;
"Source bytes" can be 15K or 21.3K, which is a continuous feature. For the input layer, we
apply one-hot encoding to each type of data, and feed the result into the input layer. As different
categorical variables have diverse dimensions, the importance of variables with lower
dimension will be weakened. We can devise an embedding layer, which transforms different features
into proper dimensions independently. All the outputs of the embedding layer are fully connected
to one dense layer. The output layer gives the classification result. In comparison with the traditional
methods, the DNN based algorithms can learn high-level, abstract representations of attack behaviors,
and can discover complex network attack rules.
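The layer structure described above can be sketched as an untrained forward pass in plain Python. Each feature group is projected to the same embedding size so that low-dimensional variables are not drowned out, the embeddings are concatenated into one dense layer, and a softmax output gives class probabilities. The vocabularies, layer sizes, random weights, the log scaling of the byte counts and the dictionary keys are all assumptions made for illustration; a usable detector would train these weights on labeled traffic.

```python
import math
import random

rng = random.Random(0)


def one_hot(value, vocabulary):
    return [1.0 if value == item else 0.0 for item in vocabulary]


def random_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a rows x cols matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical, shortened vocabularies for two categorical fields.
PROTOCOLS = ["tcp", "udp", "icmp"]
FLAGS = ["SF", "S0", "REJ", "RSTR"]
EMBED_DIM, HIDDEN_DIM, NUM_CLASSES = 2, 4, 2  # classes: normal, attack

# One embedding matrix per feature group, all mapping to EMBED_DIM outputs.
embed_protocol = random_matrix(EMBED_DIM, len(PROTOCOLS))
embed_flag = random_matrix(EMBED_DIM, len(FLAGS))
embed_numeric = random_matrix(EMBED_DIM, 2)  # source bytes, destination bytes
dense = random_matrix(HIDDEN_DIM, 3 * EMBED_DIM)
output = random_matrix(NUM_CLASSES, HIDDEN_DIM)


def forward(sample):
    """Untrained forward pass: embed each feature group, concatenate, classify."""
    numeric = [math.log1p(sample["src_bytes"]), math.log1p(sample["dst_bytes"])]
    embedded = (
        matvec(embed_protocol, one_hot(sample["protocol"], PROTOCOLS))
        + matvec(embed_flag, one_hot(sample["flag"], FLAGS))
        + matvec(embed_numeric, numeric)
    )
    hidden = [max(0.0, h) for h in matvec(dense, embedded)]  # ReLU dense layer
    return softmax(matvec(output, hidden))


probs = forward({"protocol": "tcp", "flag": "SF", "src_bytes": 15000, "dst_bytes": 21300})
```

Giving every feature group the same embedding width is the design choice that addresses the imbalance noted above: a three-valued protocol field and a much wider categorical field contribute the same number of inputs to the dense layer.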
IV. Blockchain Benefits AI
In this section, we discuss the challenges facing AI, and design a unified blockchain based
framework to solve them. In addition, we detail three cases, including data security management,
distributed model training, and model integration, using the proposed framework.
A. Challenges of AI
Although an enormous amount of effort has been devoted to AI development, AI still lacks
countermeasures for data privacy management and reliable distributed model training.
AI is now being applied to more and more specific domains with stringent requirements on data and
model privacy, such as finance and healthcare services. These services commonly collect, store and
use huge amounts of data containing personally identifiable information or other sensitive
information. They should protect individuals' privacy preferences and their personally identifiable
information while using these data. However, because of the expertise and complexity that AI involves,
they tend to rely on third-party AI companies. Inappropriate data disclosure may incur the risk of
privacy leakage. Furthermore, AI models trained on these sensitive data should be operated properly,
as attackers may infer some statistics, such as membership and demographics, from these models.
To achieve better performance, many AI algorithms rely on computation and data collaboration,
which demands distributed control frameworks. First, due to the high computational burden, many
AI algorithms utilize multiple data centers to aggregate enough computational capacity for model
training. In particular, these frameworks split the original dataset into multiple subsets stored
in different data centers, and each data center only needs to process a subset. The resulting parameter
updates of each data center are synchronized among them. Second, cross-institutional model
training can take advantage of the diversity of multiple datasets, and improve the generalization
ability. For instance, a model that predicts risk of re-admission for a particular set of patients will be
more generalizable if developed with data from multiple institutions. A reliable distributed control
protocol is needed to coordinate model training among multiple participants.
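The split-then-synchronize pattern can be sketched with a one-parameter linear model. Each simulated data center computes a gradient on its own subset, and the synchronization step averages those gradients, weighted by subset size, before every update. The model, the dataset, the number of workers and the learning rate are illustrative assumptions; the sketch runs in a single process and ignores communication, failures and the coordination protocol itself.

```python
def shard(data, num_workers):
    """Split the dataset into num_workers disjoint subsets, one per data center."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(weight, subset):
    """Gradient of the mean squared error of y = weight * x on one subset."""
    return sum(2 * (weight * x - y) * x for x, y in subset) / len(subset)


def train(data, num_workers=3, learning_rate=0.05, steps=100):
    subsets = shard(data, num_workers)
    weight = 0.0
    for _ in range(steps):
        # Each data center only touches its own subset.
        gradients = [local_gradient(weight, subset) for subset in subsets]
        # Synchronization step: average the gradients, weighted by subset size.
        total = sum(g * len(s) for g, s in zip(gradients, subsets))
        weight -= learning_rate * total / len(data)
    return weight


# Hypothetical dataset generated from y = 3x.
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0)]
learned_weight = train(data)
```

Because the size-weighted average of the local gradients equals the gradient over the full dataset, the distributed run follows the same trajectory as centralized training while no data center ever sees another center's samples.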
B. A Unified Blockchain based Framework
Although the salient features of blockchain, including its decentralized architecture and immutable
audit trail, are suitable for tackling the aforementioned challenges, there still exist several inherent
limitations:
• Limited block size: The block size is on the order of kilobytes and cannot hold much data.
Furthermore, all nodes participating in the blockchain network need to maintain a full copy of
the blockchain, limiting the total size of the blockchain.
• Slow transaction rate: The transaction rate is capped by the transaction propagation rate and
block verification rate. New transactions can take several minutes to be accepted.
These two drawbacks indicate that we cannot directly utilize blockchain to store sensitive data and
coordinate model training.
Figure 3 Blockchain-based distributed AI platform.
To bridge the gap, we devise a three-layer architecture (AIChain) based on the emerging blockchain
technique and cloud computing paradigm, as shown in Figure 3. The key design principle is that we
decouple the data plane and control plane. The control plane is based on blockchain and the data
plane is built on the cloud computing paradigm. In particular, our framework consists of three layers:
a blockchain layer, an AI entity layer, and a resource layer. The blockchain layer is at the
bottom and serves as the control plane. Its main purpose is to store AI operation sequences
encapsulated in blocks and provide consensus among AI operators. The AI entity layer is located in
the middle, and comprises AI agents and an agent database. Each AI agent corresponds to an
institution that needs AI services. The agent database stores the status information and coordinates
different AI agents. The top-most layer is the resource layer, which is built on different cloud
computing platforms. It hosts the actual data for different AI agents and provides computational
capacity (e.g., GPU/CPU). Each cloud platform has a corresponding portal to store metadata and
verify the entrance of other agents.
AIChain enables inter-operability among AI agents, and has the following advantages. First,
AIChain is built on mature infrastructure. With the support of the blockchain backbone (e.g., Bitcoin
or Ethereum), AIChain can leverage all existing data storage infrastructure. Second, AIChain
maintains modularity. Each agent remains modular, and has control over how its data are
accessed. Third, AIChain protects data privacy while preserving inter-operability. AIChain inherits the
features of blockchain to secure data by exchanging zero sensitive raw data. In addition, different AI
agents can coordinate to train AI models.
C. Blockchain Based Data Privacy Management
Data privacy management is a prerequisite for collecting and utilizing sensitive data. For such data,
we need to maintain the confidentiality of personal and sensitive information and control data access
according to the data owners' rules. Nowadays, many organizations that host sensitive data, such as
banks, hospitals, and social service providers, rely on third-party companies for storage (e.g.,
public cloud) or data analysis (e.g., Amazon AI), so the problem of privacy leakage can be amplified
by inappropriate usage.
Encryption and access control list (ACL) based methods are the mainstream techniques for addressing
the data privacy issue. Encryption methods allow computation on and processing of encrypted data and
return encrypted results. ACL methods let data owners define a set of rules that state
who can access a specific set of data, and when. These methods have several drawbacks concerning
efficiency, ownership, and lifecycle management. In particular, due to the high computational burden
of cryptographic algorithms, encryption methods are inefficient and difficult to scale to large
applications. For the ACL methods, the key questions are who owns the data and who can modify it
along the lifecycle of the data. However, a systematic answer is infeasible for most
applications, given the evolution of services and the need for user convenience.
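As a concrete illustration of an ACL rule that states who can access which data and when, consider the following minimal Python sketch. The `AclRule` fields and the `is_allowed` helper are hypothetical names introduced here, not part of any particular ACL system.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AclRule:
    user: str             # who
    dataset: str          # which set of data
    not_before: datetime  # when: start of the permitted window
    not_after: datetime   # when: end of the permitted window


def is_allowed(rules, user, dataset, at):
    """Return True if any rule grants `user` access to `dataset` at time `at`."""
    return any(
        rule.user == user
        and rule.dataset == dataset
        and rule.not_before <= at <= rule.not_after
        for rule in rules
    )
```

Even in this small form, the ownership and lifecycle questions are visible: nothing in the rule list itself says who is allowed to add, change, or retire a rule.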
Figure 4 Data privacy management using smart contract.
A possible way to tackle these challenges is to use smart contracts to define the ACL for data
privacy management, as shown in Figure 4. The key idea is to let users register sensitive
data and define the access policies through smart contracts. When other users want to access the
data, they ask the underlying blockchain network, rather than the data owner, to execute the
associated smart contract and thereby acquire an access token. With the access token, visitors can
access the corresponding data via the resource portal. The typical workflow in Figure 4
consists of five primary steps. First, data owner A stores the sensitive
data in its own data storage. When it wants to disclose the data to other visitors (e.g., agent B),
it registers the data and the access policy (e.g., who can access the data) using a smart contract.
Second, when agent B requests access, it first sends a request to the blockchain backbone.
Third, the blockchain backbone executes A's smart contract to determine whether agent B is
authorized. If B is authorized, the system issues a transaction and grants agent B the access right in the form
of a token. Fourth, the access token is returned to agent B. Finally, agent B can
access the desired data via agent A's portal using the access token.
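The five steps can be sketched as a small in-memory simulation in Python. This is a sketch under stated assumptions rather than an actual smart contract: `BlockchainBackbone` and `ResourcePortal` are hypothetical stand-ins, the access policy is reduced to a set of allowed agents, and token generation with `secrets.token_hex` is an illustrative choice.

```python
import secrets


class BlockchainBackbone:
    """Toy stand-in for the blockchain backbone (no real consensus or ledger)."""

    def __init__(self):
        self.contracts = {}     # dataset id -> owner and set of allowed agents
        self.tokens = {}        # issued token -> (agent, dataset id)
        self.transactions = []  # append-only record of granted requests

    def register(self, owner, dataset, allowed_agents):
        # Step 1: the owner registers the dataset and its access policy.
        self.contracts[dataset] = {"owner": owner, "allowed": set(allowed_agents)}

    def request_access(self, agent, dataset):
        # Steps 2-4: the backbone, not the owner, evaluates the policy and
        # returns a token only if the requesting agent is allowed.
        contract = self.contracts.get(dataset)
        if contract is None or agent not in contract["allowed"]:
            return None
        token = secrets.token_hex(16)
        self.tokens[token] = (agent, dataset)
        self.transactions.append({"agent": agent, "dataset": dataset})
        return token

    def is_valid(self, token, agent, dataset):
        return self.tokens.get(token) == (agent, dataset)


class ResourcePortal:
    """Owner-side portal: serves data only against a token the backbone issued."""

    def __init__(self, backbone):
        self.backbone = backbone
        self.storage = {}

    def fetch(self, agent, dataset, token):
        # Step 5: the visitor presents the token to the owner's portal.
        if not self.backbone.is_valid(token, agent, dataset):
            raise PermissionError(f"{agent} holds no valid token for {dataset}")
        return self.storage[dataset]
```

In this sketch the data never passes through the backbone: the backbone decides and records, and the portal serves. Each granted request leaves an entry in `transactions`, which is what makes later auditing possible.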
In comparison with the prevalent methods, the proposed method offers much better flexibility and
traceability. Because a smart contract can be updated or a new contract established easily, the
smart contract based ACL method is programmable while guaranteeing the owner full control of the
data. Furthermore, each successful access request is recorded in the blockchain backbone in the form
of a transaction. Thus, we can accurately track who accessed the data and when.
D. Blockchain Based Distributed Model Training
Distributed model training is an effective way to meet the huge computation
requirements of complex AI algorithms and massive data. It can simultaneously utilize
computation resources and datasets from multiple parties (e.g., cloud platforms or institutions) to
accelerate training and improve model accuracy.
Existing efforts focus on data parallelism algorithms or distributed algorithms to solve this problem.
Specifically, the parameter server based algorithm relies on a centralized server to coordinate the
GPU servers. In each training epoch, all participating GPU servers send their parameter updates to
the parameter server, which then disseminates the aggregated updates to all the GPU servers.
Similarly, distributed machine learning relies on a centralized network architecture to conduct
batch or online model training. Essentially, all these algorithms adopt a centralized architecture.
They face several challenges, including leader selection, single point of failure, and participant
join/leave. First, the central node must be stable and reliable. The key issue is how to select
this participant and who selects it. Second, if the central server is shut down or attacked, the whole network
stops working. Third, if any participant joins or leaves the network, even for a short period, the
training process is disrupted, and the central server needs to handle the recovery.
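One round of the centralized scheme can be sketched in a few lines of Python. The function name is hypothetical, parameters are plain lists of floats, and simple averaging is assumed as the aggregation rule, since the text does not specify one.

```python
def parameter_server_round(global_params, worker_updates):
    """One synchronous round: average the workers' updates and apply them.

    Averaging is an assumed aggregation rule used for illustration.
    """
    if not worker_updates:
        raise ValueError("the parameter server received no updates")
    count = len(worker_updates)
    # zip(*worker_updates) groups the updates coordinate by coordinate.
    averaged = [sum(column) / count for column in zip(*worker_updates)]
    return [p + u for p, u in zip(global_params, averaged)]
```

Every worker depends on this single function running on a single node, which is exactly where the leader selection, single point of failure, and join/leave problems arise.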
Figure 5 Distributed model training using smart contract.
A possible method is to use smart contracts to organize participants and coordinate the
training process (e.g., parameter updates, synchronization), as shown in Figure 5. The workflow
consists of two phases: initialization and training. In the initialization phase, when
one agent (e.g., agent A) initiates a distributed collaborative training task, it registers a training
contract in the blockchain backbone, along with the reward for successful training. Other agents can
join the training process by accepting the training contract. When all the participants (e.g.,
agents A-C) have accepted the contract, they create a coordinator contract, which defines the number
of training epochs, the access control policy among participant pairs, the parameter update path, etc. After the
initialization succeeds, all the participants begin the training process with the help of the
coordinator contract. At each training epoch, the coordinator contract notifies all the agents to
start training. After receiving the notification, each agent invokes its corresponding computation
resources to run the training process. The resulting parameter updates for this epoch are stored in
the directory specified during the negotiation. The coordinator contract regularly queries these
directories to see whether all the participants have completed the model training for this round.
When all the parameter updates have been calculated successfully, the coordinator contract notifies
all the agents to exchange parameter updates. Once the exchange completes, the training for this
epoch is finished. The process continues until the specified number of epochs is reached.
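The two phases can be sketched as a small Python simulation. This is illustrative only: `TrainingContract` and `CoordinatorContract` are hypothetical names, the directories are in-memory dictionaries, each agent's training is a caller-supplied function, and combining the exchanged updates by plain averaging is an assumption, since the text only says that updates are exchanged.

```python
class TrainingContract:
    """Initialization phase: the initiator posts a task, others accept it."""

    def __init__(self, initiator, reward, expected_agents):
        self.reward = reward
        self.expected = set(expected_agents)
        self.accepted = {initiator}

    def accept(self, agent):
        if agent in self.expected:
            self.accepted.add(agent)

    def all_accepted(self):
        return self.accepted == self.expected


class CoordinatorContract:
    """Training phase: notify agents, poll their directories, trigger exchange."""

    def __init__(self, training_contract, epochs):
        if not training_contract.all_accepted():
            raise RuntimeError("not every participant has accepted the contract")
        self.participants = sorted(training_contract.accepted)
        self.epochs = epochs
        # One directory per participant: epoch number -> parameter update.
        self.directories = {name: {} for name in self.participants}

    def epoch_complete(self, epoch):
        return all(epoch in self.directories[n] for n in self.participants)

    def run(self, local_train, params):
        for epoch in range(self.epochs):
            # Notification: each agent trains locally and writes its update.
            for name in self.participants:
                self.directories[name][epoch] = local_train[name](params)
            # Polling: proceed only when every directory holds this epoch's update.
            if not self.epoch_complete(epoch):
                raise RuntimeError(f"epoch {epoch} incomplete")
            # Exchange: combine the updates (plain averaging is an assumption).
            updates = [self.directories[n][epoch] for n in self.participants]
            mean = [sum(column) / len(updates) for column in zip(*updates)]
            params = [p + m for p, m in zip(params, mean)]
        return params
```

The coordinator here holds no model parameters of its own between rounds; it only checks completion and triggers the exchange, which mirrors the control-plane role the contract plays in the workflow.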
The proposed method addresses the drawbacks of the traditional methods. In particular, the
smart contract based method adopts a decentralized architecture to coordinate the distributed
model training, so there is no single point of failure and no leader selection problem. Besides, each
agent can join and leave flexibly without interrupting the training. In addition to these benefits,
the parameter updates are recorded in the form of transactions, which helps ensure reliable
communication among agents.
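To illustrate why recording updates as chained transactions makes them tamper-evident, the following Python sketch keeps a hash-chained list of update records. The record layout and helper names are assumptions for illustration; a real blockchain adds consensus, signatures, and block structure on top of this idea.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(obj):
    # Deterministic SHA-256 over a JSON encoding with sorted keys.
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def append_update(ledger, agent, epoch, update):
    """Append a record that commits to the update and to the previous record."""
    body = {
        "agent": agent,
        "epoch": epoch,
        "update_digest": _digest(update),
        "prev": ledger[-1]["hash"] if ledger else GENESIS,
    }
    ledger.append({**body, "hash": _digest(body)})


def verify(ledger):
    """Recompute every hash and check that each record links to the one before."""
    prev = GENESIS
    for entry in ledger:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if entry["prev"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Changing any recorded field after the fact breaks the recomputed hash of that record, so `verify` returns False and the alteration is detected.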
V. Summary and Future Direction
In this study, we discussed the integration of blockchain and deep learning. For blockchain, we
explored the benefits of deep learning by decomposing the blockchain system and identifying the
key challenges, including network attacks, power consumption, scalability, and privacy protection.
We presented different deep learning based methods to tackle them. For deep learning, we
discussed the benefits of blockchain, including anonymity, decentralization, security, and
immutability. We utilized these salient features to protect data privacy and coordinate distributed
control in deep learning applications. We devised a unified blockchain based deep learning
framework and presented two case studies based on the framework.
References
[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008.
[2] G. Wood, “Ethereum: A secure decentralised generalised transaction ledger,” Ethereum
project yellow paper, vol. 151, pp. 1–32, 2014.
[3] I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio, Deep learning. MIT press
Cambridge, 2016, vol. 1.
[4] K. Croman, C. Decker, I. Eyal, A. E. Gencer, A. Juels, A. Kosba, A. Miller, P. Saxena, E.
Shi, E. G. Sirer et al., “On scaling decentralized blockchains,” in International Conference
on Financial Cryptography and Data Security. Springer, 2016, pp. 106–125.
[5] M. C. K. Khalilov and A. Levi, “A survey on anonymity and privacy in bitcoin-like digital
cash systems,” IEEE Communications Surveys & Tutorials, 2018.
[6] Digiconomist, “Bitcoin energy consumption index.” [Online]. Available:
https://digiconomist.net/bitcoin-energy-consumption
[7] F. Tschorsch and B. Scheuermann, “Bitcoin and beyond: A technical survey on
decentralized digital currencies,” IEEE Communications Surveys & Tutorials, vol. 18, no. 3,
pp. 2084–2123, 2016.
[8] V. Zamfir, “Introducing casper the friendly ghost,” Ethereum Blog URL: https://blog.
ethereum. org/2015/08/01/introducing-casper-friendly-ghost, 2015.
[9] M. Castro, B. Liskov et al., “Practical byzantine fault tolerance,” in OSDI, vol. 99, 1999, pp.
173–186.
[10] A. Cuzzocrea, C. Mastroianni, and G. M. Grasso, “Private databases on the cloud: Models,
issues and research perspectives,” in Big Data (Big Data), 2016 IEEE International
Conference on. IEEE, 2016, pp. 3656–3661.
Han Hu, Yong Luo, and Yonggang Wen
[11] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang,
“Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC
Conference on Computer and Communications Security. ACM, 2016, pp. 308–318.
[12] K. Xu, T. Cao, S. Shah, C. Maung, and H. Schweitzer, “Cleaning the null space: A privacy
mechanism for predictors.” in AAAI, 2017, pp. 2789–2795.
[13] S.Wang, X. Jiang, Y. Wu, L. Cui, S. Cheng, and L. Ohno-Machado, “Expectation
propagation logistic regression (explorer): distributed privacy preserving online model
learning,” Journal of biomedical informatics, vol. 46, no. 3, pp. 480–496, 2013.
[14] F. Yan, S. Sundaram, S. Vishwanathan, and Y. Qi, “Distributed autonomous online
learning: Regrets and intrinsic privacy-preserving properties,” IEEE Transactions on
Knowledge and Data Engineering, vol. 25, no. 11, pp. 2483–2493, 2013.
[15] Y. Li, Y. Wen, K. Guan, and D. Tao, “Transforming cooling optimization for green data
center via deep reinforcement learning,” CoRR, vol. abs/1709.05077, 2017. [Online].
Available: http://arxiv.org/ abs/1709.05077
Han Hu is an Associate Professor with School of Information and
Electronics at Beijing Institute of Technology, China. He received
the B.S. and Ph.D. degrees from the University of Science and
Technology of China (USTC) in 2007 and 2012, respectively. He
joined National University of Singapore and Nanyang
Technological University, Singapore as a research fellow in 2012
and in 2014 respectively. His research interests include multimedia
networking and data analytics. He has published 40+ papers in top
journals (e.g., IEEE JSAC, IEEE TCSVT, and IEEE TMM) and
prestigious conferences (e.g., Infocom, and ACM MM), and
received three best paper awards, including 2015 IEEE Multimedia
Best Paper Award, 2015 Chinacom Best Paper Award, and 2013
IEEE Globecom Best Paper Award. His work on multimedia
networking and data center networking has been awarded the 2013
ASEAN ICT Awards (Gold Medal) and 2015 Datacentre Dynamics
Awards.
Yong Luo received the B.E. degree in Computer Science from the
Northwestern Polytechnical University, Xian, China, in 2009, and
the D.Sc. degree in the School of Electronics Engineering and
Computer Science, Peking University, Beijing, China, in 2014. He
is currently a Research Fellow with the School of Computer
Science and Engineering, Nanyang Technological University. His
research interests are primarily on machine learning and data
mining with applications to visual information understanding and
analysis. He has authored several scientific articles at top venues
including IEEE T-PAMI, T-NNLS, IEEE T-IP, IEEE T-KDE,
IEEE T-MM, IJCAI and AAAI. He received the IEEE Globecom
2016 Best Paper Award, and was nominated for the IJCAI 2017
Distinguished Best Paper Award. Dr. Luo is the corresponding
author of this work.
Yonggang Wen (S’99-M’08-SM’14) is an Associate Professor
and Director of Innovation Lab with School of Computer Science
and Engineering (SCSE) at Nanyang Technological University
(NTU), Singapore. He also serves as the Acting Director of
Nanyang Technopreneurship Centre at NTU. He received his
PhD degree in Electrical Engineering and Computer Science
(minor in Western Literature) from Massachusetts Institute of
Technology (MIT), Cambridge, USA, in 2007. Previously he has
worked in Cisco to lead product development in content delivery
network, which had a revenue impact of 3 Billion US dollars
globally. Dr. Wen has worked extensively in learning-based
system prototyping and performance optimization for large-scale
networked computer systems. His work in Multi-Screen Cloud
Social TV has been featured by global media (more than 1600
news articles from over 29 countries) and received 2013 ASEAN
ICT Awards (Gold Medal). His work on Cloud3DView, as the
only academia entry, has won 2016 ASEAN ICT Awards (Gold
Medal) and Datacentre Dynamics Awards 2015 - APAC
(Oscar award of data centre industry). He is a co-recipient of
2015 IEEE Multimedia Best Paper Award, and a co-recipient of
Best Paper Awards at 2016 IEEE Globecom, 2016 IEEE Infocom
MuSIC Workshop, 2015 EAI/ICST Chinacom, 2014 IEEE
WCSP, 2013 IEEE Globecom and 2012 IEEE EUC. He received
2016 IEEE ComSoc MMTC Distinguished Leadership Award.