1 Aurora: a probabilistic algorithm for distributed

1

Aurora: a probabilistic algorithm for distributedledgers enabling trustless synchronization and

transaction inclusion verificationFederico Matteo Bencic, Member, IEEE, Ivana Podnar Zarko, Member, IEEE,

Abstract—A new node joining a blockchain network first synchronizes with the network to verify ledger state by downloading the entireledger history. We present Aurora, a probabilistic algorithm that identifies honest nodes for transient or persistent communication in thepresence of malicious nodes in a blockchain network, or ceases operation if it is unable to do so. The algorithm allows a node joiningthe network to make an informed decision about its next synchronization step or to verify that a transaction is contained in a validledger block without downloading the entire ledger or even the header chain. The algorithm constructs a Directed Acyclic Graph on thenetwork topology to select a subset of nodes including a predefined number of honest nodes with a given probability. It is evaluated ona Bitcoin-like network topology using an open-source blockchain simulator. We investigate algorithm performance and analyze itscommunication complexity. Our results show that the algorithm facilitates trustless interactions of resource-constrained nodes with ablockchain network containing malicious nodes to enable a leaner initial blockchain download or an efficient and trustless transactioninclusion verification. Moreover, the algorithm can be implemented without any changes to the existing consensus protocol.

Index Terms—blockchain, scalability, decentralization, trustless, light client

F

1 INTRODUCTION

D ISTRIBUTED LEDGER Technology (DLT) is a revolu-tionary technology for digitizing assets [1] and has

become widely known through one of its specializations,the blockchain. Blockchain is a distributed ledger basedon a peer-to-peer (P2P) network to manage transactionsfor multiple entities in a verifiable and traceable manner.Data is written within blocks of blockchain transactions, anddoes not require a centralized entity, but rather the entireP2P network, to maintain the blocks. The main features ofblockchain are immutability, transparency and fault toler-ance [2].

Users running blockchain nodes can choose betweentwo types of nodes: full and light. Full nodes downloadand verify the entire ledger which contains all transactionssince the ledger creation. Consequently, they operate in atrustless manner, but also have more stringent hardwarerequirements. In contrast, light nodes only need to down-load part of the ledger (e.g., the header chain) and thereforeconsume less processing power, network bandwidth andmemory compared to full nodes. However, light nodesdepend on full nodes — full nodes provide light nodes withthe metadata required for their operation.

Regardless of the node type and due to the immutabilityrequirement, the size of the ledger is immense as the num-ber of transactions continuously increases. This leads to asignificant load during node synchronization. For example,in January 2020, the Ethereum blockchain had about 250GB in non-archived mode [3]. Bitcoin’s header chain had50 MB in April 2020, while Ethereum’s header chain had 5GB [4] and is growing about 1 GB per year [5]. Since thesynchronization process is memory and time consuming,a new node may either be unable to download the ledgeror unwilling to wait too long for the process to complete.For example, nodes in the Internet of Things (IoT) domain

are often resource-constrained devices with limited memory,processing power, and energy, which further exacerbatesthe problem. As a result, users often opt for third-partyexplorers instead of running their own nodes.

In addition to high resource consumption, the process ofnode synchronization (i.e., bootstrapping) becomes a pointof centralization, as it depends in part on a set of well knownnodes that are assumed to be both available and honest.Today’s client implementations rely on a list of well knownaddresses [6] that may not always be available or may evenexhibit Byzantine behavior. One could work around theissue of unavailable seeds by manually adding bootstrappeers, but manually added nodes may also exhibit maliciousbehavior, making a new node vulnerable to various typesof attacks, from denial of service to Eclipse attacks [7], [8].The consequences of a malicious node being a first contactnode can vary for a victim. They range from a waste oftime and resources, in the best case, to a state where, in theworst case, the victim is unaware of the existence of a longerchain because no honest node is available to advertise it.Note that if a new node joins the network through a setof partitioned nodes (which may or may not be malicious),the new node could synchronize with a ledger that is in astate of extended fork, and such a state can lead to a double-spending attack. For simplicity, hereinafter we do not distin-guish between partitioned and malicious Byzantine nodes,as our algorithm is applicable in both scenarios.

The Aurora algorithm, originally proposed in [9], is aconsensus-agnostic probabilistic algorithm designed to en-able a new node to avoid the adverse influence of Byzantineor malicious nodes in ledger networks by detecting a prede-fined number of honest nodes with a high and controllableprobability. The identified nodes are used thereafter forpersistent or transient communication of the new node with

arX

iv:2

108.

0827

2v1

[cs

.CR

] 9

Aug

202

1

2

the network. If the predefined number of honest nodescannot be identified, the algorithm stops. Assuming thatan honest node reveals to a new node what is acceptedas canonical truth by the majority of the voting power, asopposed to a malicious node that tries to subvert the newnode, the algorithm saves the node’s resources by efficientlydiscovering a subset of network nodes containing at least 1or 1 + |M | honest nodes, where M is the set of maliciousnetwork nodes. Such subsets of network nodes can be usedfor trustless ledger synchronization, or transaction inclusionverification without the need to download the entire ledgeror even the header chain.

While the algorithm can be used by both full and lightnodes, its low memory footprint and stochastic behaviormake it particularly suitable for resource-constrained de-vices, where it offers the greatest benefit. Although poten-tially applicable in other domains, the algorithm offers par-ticular benefits in the domain of public and permissionlessDLT solutions, as it reinforces the decentralized and trustlessethos on which the technology is based.

This work builds on our previous work which intro-duced the initial version of the Aurora algorithm [9]. Themain contributions of the paper and the most notable dis-tinctions compared to our previous work are the following:

1) Redefinition of the algorithm to provide an accurateprobabilistic output and communication complexityof the algorithm. The algorithm identifies a subset ofnetwork nodes containing a predefined number ofhonest nodes with high probability as a function ofthe assumed number of malicious network nodes.

2) A comprehensive evaluation of the redefined al-gorithm is performed by simulation on a realis-tic network topology using open-source simulationtools in a Bitcoin-like network environment. Thesimulation results confirm our analytical findings.

3) Two concrete application examples of the algorithm:

a) In a scenario where a new node joins adistributed ledger network, the algorithmassists the node during the synchronizationprocess with a blockchain network with ma-licious nodes.

b) The algorithm facilitates trustless and effi-cient verification of transaction inclusion ina block with a predefined correctness proba-bility and significantly reduces the resourceconsumption of a device executing the veri-fication procedure.

Although our work is relevant to solutions for peer-to-peer (P2P) networks which detect malicious actors orcounter their direct influence, our work differs from such so-lutions in that it does not detect malicious nodes, but ratheridentifies node sets containing honest nodes. Moreover, ourwork has a well-defined application domain, namely DLTnetworks. Comparison with related work is discussed inmore detail in Section 2.

The paper is organized as follows: Section 2 comparesour solution with other existing solutions. Section 3 intro-duces necessary terms and definitions, and enumerates theassumptions under which our algorithm works. Section 4

discusses how relevant variables affect the size of sets thatcontain a predefined number of honest nodes. Section 5presents the Aurora algorithm and introduces two relevantapplications of the algorithm, while Section 6 investigates itscommunication complexity. Section 7 explains the method-ology used for algorithm verification, the experiments per-formed, and their results. Section 8 discusses possible chal-lenges and future work, while Section 9 concludes the paperwith a glossary appended at the end of the paper.

2 RELATED WORK

We begin by taking a closer look at light-client solutions,highlighting potential synergies and contrasting differencesto our algorithm. Next, since our work is the most valu-able when adversarial influences are present in both P2Pand DLT networks, we continue with a brief overview ofrelevant works.

Light client implementations fall broadly into the fol-lowing three categories: remote clients, simplified paymentverification clients, and trustless clients.

Remote clients rely on a centralized solution for com-munication with a DLT network. An example is MetaMask,an Ethereum wallet that communicates with the Ethereumledger via the Infura Gateway System [10]. Remote clientswhich are not completely centralized also exist, for examplethe SlockIt INCUBED client1. This client uses multiple gate-ways to communicate with a ledger network. The gatewayshave a stake in the network that can be reduced if their mis-behavior is discovered. Such clients sacrifice the integrity ofthe data by trusting a third party in exchange for a verysmall computational resource requirement to interact with aDLT network. They differ greatly from our solution becauseour solution is completely decentralized.

Simplified Payment Verification clients (SPVs) are stan-dard distributed ledger clients that synchronize with thenetwork’s header chain and request the rest of the blockinformation as needed. Examples are Electrum2 for Bitcoin,or Geth (in light mode) for Ethereum. There are SPVsthat, similar to our solution, recognize the added value ofmultiple sources of truth and can even be used in conjunc-tion with our solution. For example, Electrum maintainsconnections to approximately 10 servers and subscribes toblock header notifications to all of them to detect forks andpartitions. Here we see an emerging synergy: The Auroraalgorithm can be used to identify 10 servers which arehonest with high probability. The Tendermint [11] lightclient acquires prerequisites for connecting to the networkby means outside of Tendermint3 (e.g., social consensus),and thus it can apply our solution to identify these prereq-uisites. SPVs differ from remote clients since they maintaintheir trustlessness to some extent by validating metadataagainst the header chain in their possession. However, sincethey are served by full nodes, they rely on full nodeswhich are assumed to be available and honest. Moreover,the applicability of SPVs to resource-constrained devices isdebatable — even the header chain may be too large for

1. https://consensys.net/diligence/audits/2019/09/slock.it-incubed3/

2. https://electrum.readthedocs.io/en/latest/faq.html3. https://docs.tendermint.com/master/spec/p2p/node.html

3

some devices. Our solution differs greatly from SPVs sinceit does not require the header chain to work.

Trustless clients attempt to maintain the trustless mech-anisms of SPVs while keeping their size sublinear or con-stant compared to ledger size or even header chain size. Assuch, they are sometimes referred to as super light clients.A client running the Aurora algorithm would partially fallinto this category — when used for transaction inclusionverification, the client does not need to download the headerchain (or part of the header chain). A proposal from [12]uses a cryptography accumulator and generates a chainsummary in the form of a block attribute. Then, the clientmust randomly choose a slice of the network nodes, andif the majority of nodes is compromised by the attacker,the protocol is not secure. Consequently, our algorithm canbe used in conjunction with the above — the slice canbe the output of our algorithm. Vault [13] is a solutionbuilt on a proof-of-stake consensus protocol proposed byAlgorand [14] and introduces a fast bootstrapping methodrelying on the presence of stamping certificates, while oursolution does not require any additional data structures tooperate.

We continue by highlighting two solutions: Fly-Client [15] as an example of a solution that would benefitsignificantly from our algorithm, and BlockQuick [16] as anexample of a solution that addresses the potential existenceof Eclipse attacks, but does so in a fundamentally differentway than our solution.

FlyClient is a light client proposal for Proof of Workblockchains [15]. The client only needs to store a logarithmicnumber of block headers to provide strong mechanisms forensuring the validity of those block headers by using tech-niques such as probabilistic sampling, MRRs4, and the Fiat-Shamir heuristic. As FlyClient requires nodes to maintainMRRs, FlyClient cannot be used on the currently runningBitcoin and Ethereum networks without forks. Also, theclient must be connected to at least one honest node, whichmeans that FlyClient cannot protect a node against Eclipseattacks, unlike the Aurora client. Thus, our client does notcompete with this solution, but can work in synergy with it.The one honest node that FlyClient needs to operate can befound with our solution.

BlockQuick is based on a consensus-based reputationscheme [16]. A BlockQuick client maintains the ConsensusReputation Table, which contains miner addresses with thehighest reputation shares generated based on the latest 100block headers. The reputation share of a miner is equal to thenumber of blocks mined in the last 100 blocks. When newblocks are broadcasted, the client validates the miner’s cryp-tographic signature against the data from the ConsensusReputation Table. A new block is considered valid only if theblock receives a consensus share score > 50%. Thus, it is notjust a matter of choosing the longest chain with the highestdifficulty, but accepting the block from miners with a highconsensus share. The client is resistant to both MITM andEclipse attacks. Similar to FlyClient, the requirements forrunning BlockQuick clients with the current Ethereum andBitcoin networks are not met. According to [16], each minershould be reachable at the address specified in the block,

4. Merkle Mountain Ranges

each block header must contain an address and the publicID of the block miner and each block must contain a proofof inclusion of all previous block headers. The result is that,unlike our solution, a BlockQuick client cannot be deployedon currently running Ethereum or Bitcoin networks withouta hard fork.

Relevant solutions in the P2P domain are anomaly de-tection methods and sibyl group inference solutions (e.g.,SybilGuard [17]). However, since reputation measurementsare not available in existing blockchain solutions, such ap-proaches are not applicable in this context. Anomaly de-tection techniques [18], flow-based and graph-based meth-ods [19], [20] as well as network traffic reliant solutions(e.g., [21], [22]) differ from our solution in two key aspects:The first and most obvious is the application domain — ourwork is DLT-specific. Second, we do not deal with or inferthe identity of malicious clusters, but focus on identifyinghonest nodes for bootstrapping and transaction inclusionverification.

As for the adversarial influence in DLT networks, pre-vious studies, e.g., [6], [7], [23], [24], [25], [26], [27], haveconfirmed that prominent DLT solutions such as Bitcoin [28]and Ethereum [23] are vulnerable to Denial of Service,Eclipse Attacks, BGP highjacks, Man-in-the-Middle (MITM)attacks, network partitioning, etc. The consequences of ad-versarial influences are non-trivial and range from trans-action dropping to double spending. Although the relatedwork proposes and implements several countermeasures,it differs from our work in two key ways. First, we donot directly counteract malicious clusters, but focus ondetecting honest nodes for persistent or transient commu-nication, thus circumventing the malicious influence ratherthan countering it or dealing with the consequences. Second,our algorithm does not require special data structures orprotocol modifications to be integrated with deployed DLTsolutions.

3 PRELIMINARIES

Let us consider a network S with M malicious nodes. Weassume the following:

Assumption 1. A new node a ∈ S joins the network bycontacting a single first-contact node fc and is unawareof any other network node (e.g., bootstrap peers are notavailable).

Assumption 2. The first-contact node fc may be malicious,and may collude with other malicious peers to subvert nodea.

Assumption 3. A subset of nodes from S, denoted by Γ,contains nodes discoverable by node a via fc.

By discoverable nodes we refer to those nodes that node a hasencountered or become aware of their existence when itsfirst-contact node is fc.

Assumption 4. There is a set of malicious nodes, M ⊂ S,the size of M is κ and the relation κ < |Γ| holds. The valueof κ is known a priori or can be inferred (the initializationof κ is discussed in the following sections).

4

Assumption 5. The adversary always holds a minority ofthe voting power.

Assumption 6. The number of malicious nodes in thenetwork cannot increase during algorithm execution.

Under these assumptions, node a performs a process ofnetwork exploration which we name gathering, as the pro-cess of collecting discoverable nodes initiated from node fcto create Γ. Each node queried during a gathering exposesa set of its neighboring nodes to node a, which a uses toexpand Γ. We continue by providing definitions based onthe previously stated assumptions.

Definition 1 (Deterministic Honest Set, ∆h). For a given Γ,a set ∆h is a subset of Γ which contains at least h honestnodes.

Corollary 1.1. For any given Γ, there exists a deterministichonest set with at least 1 honest node, ∆1.

Proof. Iff the relation κ < |Γ| holds as per Assumption 4,Γ must contain at least one honest node. Consequently, ∆1

can always be constructed.

Definition 2 (Probabilistic Honest Set, Πh). For a given Γ,a set Πh is a subset of Γ which contains at least h honestnodes with probability ρ. The probability ρ is derived froman underlying hypergeometric distribution.

To reduce problem complexity, we introduce justifiableconstraints on h relevant to the DLT domain. First, we con-sider those ∆h and Πh that contain at least one honest node,which guarantees that the truth (e.g., the longest blockchain)can eventually be received by node a. Second, we considerthose ∆h and Πh that contain a majority of honest nodes,which guarantees that an honest answer can be derivedby majority voting without requiring a resource-intensiveprocess of ledger/header chain analysis. With respect tothe above constraints on h, we define the correspondingdeterministic sets that will be used as a naive baseline asfollows:

Definition 3 (Deterministic Safe Set, ∆s ≡ ∆1). For a givenΓ containing κ malicious nodes, where |Γ| > κ, ∆s is adeterministic honest set which contains at least one honestnode and its size is at least κ+ 1.

We call this set safe set since node a can obtain at least onehonest answer in the worst case by querying all membersof ∆s. By analyzing the obtained ledgers and verifyingtheir transactions since genesis, the node can detect possiblediscrepancies in the ledger state as reported by the queriednodes, but also identify a correct ledger.

Definition 4 (Deterministic Progress Set, ∆p ≡ ∆b|∆|/2c+1).For a given Γ containing κ malicious nodes, where |Γ| > 2κ,∆p is a deterministic honest set which contains a majorityof honest nodes and its size is at least 2κ+ 1.

We call this set progress set because node a can obtain amajority of honest answers in the worst case by queryingall members of the set and perform an action based on themajority vote about the ledger state reported by the queriednodes.

The deterministic baselines, both safe and progress sets,offer a naive solution to the problem of identifying a truthful

node: A node can trivially query multiple network nodesunder the assumption that there are at most κ maliciousnodes, and compare the obtained answers in search ofthe truth. However, this method is very inefficient, as weshow in Section 4. Let us now consider definitions of thecorresponding sets with probabilistic bounds.

Definition 5 (Probabilistic Safe Set, Πs ≡ Π1). For a givenΓ containing κ malicious nodes, where |Γ| > κ, Πs is aprobabilistic honest set which contains at least one honestnode with probability ρ.

Definition 6 (Probabilistic Progress Set, Πp ≡ Πb|Π|/2c+1).For a given Γ containing κ malicious nodes, where |Γ| > 2κ,Πp is a probabilistic honest set which contains a majority ofhonest nodes with probability ρ.

Probabilistic sets are generated during an iterative pro-cess when node a in each step 1) contacts a network node,2) asks for a set of its neighbors, and then 3) selects a next-draw node to repeat the process. For this reason, we definethe following:

Definition 7 (Set of discoverable network nodes in step i,Γi). Γi is a subset of network nodes which node a haslearned about while performing a gathering up to step i,where |Γi| ≤ |Γi+1|.

In other words, Γi is filled by nodes appearing in neigh-bor sets (i.e., peer lists) reported by contacted nodes, andthis set contains candidates for probabilistic honest sets.Πh is generated from a finite population Γi by samplingrandom nodes without replacement. The sampling processcan be modeled by the hypergeometric distribution [29], aseach element selected from Γi can be classified as a failureor a success. In our context, a success denotes the selectionof an honest node, while a failure occurs when a maliciousnode is selected. In summary, the distribution models theprobabilities associated with the number of successes in ahypergeometric experiment, and is defined by

X ∼ Hypergeometric(N,K, n), (1)

where N is the population size, K is the number of suc-cesses in the population, and n is the sample size.

The underlying probability mass function is then givenby

P (X = x) =

(Kk

)(N−Kn−k

)(Nn

) = f(N,K, n, k), (2)

where k is the number of successes in the sample.We begin the construction of a probabilistic set by a

gathering initiated at node fc to explore the network un-til |Γi| > κ, and then start performing hypergeometricexperiments for a given Γi (the population). It contains|Γi|−κ honest nodes (successes), and there exists a subset ofsize |Πh| (the sample) which contains h honest nodes withprobability ρ or more. More formally, the hypergeometricdistribution can be stated as

X ∼ Hypergeometric(|Γi|, |Γi| − κ, |Πh|), (3)

where we evaluate the following predicate

p(X ≥ h) ≥ ρ. (4)

5

If the predicate is evaluated as true, we have constructeda probabilistic set of size |Πh| containing at least h honestnodes with at least probability ρ. If the predicate is evalu-ated as false, we continue exploring the network.

The pseudocode defining the construction of a proba-bilistic set is given in Alg. 1 which requires three inputparameters: the correctness probability ρ, the first contactnode fc, and κ, the initialization of which is covered in Sec-tion 5.3.

Algorithm 1: Πh constructionInput : ρ — Probability guaranteeInput : |Πh|— Desired probabilistic set sizeInput : κ — Desired malicious node

tolerance1 Gather at least κ + 1 nodes in Γ;2 constructed← false;3 do4 X ∼ Hypergeometric(|Γ|, |Γ| − κ, |Πh|);5 if p(X ≥ h) ≥ ρ then6 constructed← true;7 else8 Add more nodes to Γ;9 end

10 while !constructed;

Now that we understand how Πh is constructed, wecan analyze why using probabilistic sets is advantageouscompared to using their deterministic counterparts.

We perform an experiment to compare the deterministicand probabilistic honest sets which are the most relevant tobe used in the DLT domain, namely safe sets and progresssets. For four different population sizes |Γ|, we graduallyincrease the number of malicious nodes κ to compare ∆s

with Πs, and ∆p with Πp. The results presented in Fig. 1lead us to the following conclusions:

• The sizes of probabilistic sets Πs and Πp are muchsmaller compared to the sizes of ∆s and ∆p;

• The size of deterministic sets is, as expected, lin-early dependant on the number of malicious networknodes, while the size of probabilistic sets showssublinear growth. For certain parameters (e.g., whenκ is reasonably small), the probabilistic set sizes aresignificantly smaller compared to their deterministiccounterparts. The observed difference is significantfor larger populations, when a probabilistic set sizecan be several orders of magnitude smaller com-pared to its deterministic counterpart.

Thus, we can conclude that probabilistic honest setsbring significant benefits for solving the problem of iden-tifying honest nodes within subsets of network nodes, espe-cially for large networks.

4 PROBABILISTIC SET SIZE

Since node a uses members of Πh (i.e., Πs or Πp) fortransient or persistent communication, reducing |Πh| willdiminish the communication complexity of our algorithm.As shown in the previous section, the size of Πh can be

� ��

��

��

� ��

��

��

� ��

��

��

��

� �N ��N ��N ��N�

�N

��N

��N

��N

_b�V�_ _n�V�_ _b�S�_ _n�S�_

_a_� �� _a_� ��

_a_� �� _a_� ��

�

6HW�VL]HV

Fig. 1. Deterministic vs. probabilistic set sizes with an increase of κfor four different population sizes. When population size increases, weobserve a greater difference between a probabilistic set size and itsdeterministic counterpart.

reduced by increasing the number of successes in the popu-lation. In this section, we examine the rate at which |Πh| isreduced with respect to the growth of |Γ|, and compare |Πh|to sublinear functions of κ in the context of the current sizeof the Bitcoin IPV4 network.

An experiment was conducted to mimic algorithm exe-cution that creates an increasing set of discoverable networknodes: We varied the number of discoverable nodes |Γ|from 2 to 20480, which is the maximum number of totalnodes that a Bitcoin client can store [30]. For each |Γ|, κ wasset to |Γ| − 1 and gradually decreased in increments of 1.During each experimental run, the ratio between |Γ| and κwas marked when the following predicates (i.e., conditions)were met:

1) |Πs| ≤√κ.

2) |Πs| ≤ lnκ.3) |Πp| ≤

√κ.

4) |Πp| ≤ lnκ.

��

� � � � � � � ��

� � � � � � � ��N

�

�

��

��

��

��

��

_´�V�_��¥É _´�V�_��ORJ�É� _´�S�_��¥É _´�S�_��ORJ�É�

_+_

_+_�É

Fig. 2. Ratio |Γ|κ

when the four predicates limiting the probabilistic setsizes to

√κ and lnκ have been satisfied with an increase of |Γ|.

6

Results are depicted in Fig. 2. By examining the ratiosindicating potential reduction of probabilistic set sizes withrespect to malicious node tolerance, we can conclude thatas the size of the population increases, the ratio between |Γ|and κ required to reduce the probabilistic set sizes decreases.Thus, it is easier to satisfy the four predicates limiting |Πh|for larger populations, i.e., for larger sets of discoverablenodes.

Table 1 shows the values extracted from Fig. 2 for|Γ| = 6356, which is the number of active Bitcoin IPV4nodes discovered and reported in [31]. This number is usedas the default network size in all the following experiments.

TABLE 1ps: probabilistic set type, f : upper bound for probabilistic set size as a

function of κ, f(κ): f evaluated at respective κ, P : predicate, |Π|:respective probabilistic set size, |∆|: respective deterministic set size

|Γ| 6356

ps Πs Πp

f sqrt(κ) ln(κ) sqrt(κ) ln(κ)

P |ps| ≤ f(κ) |ps| ≤ f(κ) |ps| ≤ f(κ) |ps| ≤ f(κ)

|Γ| − κ 549 3985 4615 6053

κ 5807 2371 1741 303

|Γ|/κ 1.09454 2.68073 3.65078 20.9769

f(κ) 76.20367 7.77107 41.72529 5.71373

|∆| 5808 2372 3483 607

|Π| 76 7 41 5

|∆||Π| 76.421052 338.85714 84.951219 121.4

p 0.9990005 0.9990004 0.9990073 0.9990014

For example, by examining the first column in Table 1,we see that to satisfy the predicate |Πs| ≤ sqrt(κ), even upto 5807 of 6356 network nodes can be malicious, while thesize of Πs is less than or equal to

√5807 = 76.20367, which

corresponds to 76, and the correct execution is guaranteedwith p = 0.9990005. If we would want to deterministicallyfind at least one honest node within the same network,|∆s| = κ + 1 = 5808. In other words, 76.421052 timesmore nodes need to be queried for a deterministic answerin comparison to a probabilistic answer.

From a pragmatic point of view, the first three predicateslimiting probabilistic set sizes with respect to κ (resultsdisplayed in the first three columns of Table 1) can besatisfied relatively easily, while the last condition has littlevalue for a real-world scenario due to its extremely high|Γ|/κ ratio. Further reduction of the probabilistic set sizerequires an even larger |Γ|/κ ratio and is considered im-practical. Therefore, we decided to use the square root asa sublinear function that bounds the probabilistic set sizewith respect to malicious node tolerance and we definethe following constraint which will be of great use in theupcoming complexity analysis:

Constraint 1. |Πh| is bounded by√κ, i.e., |Πh| ≤

√κ.

Note that so far we have relied on the existence of theset Γ without elaborating on how this set is constructed,as our analysis has been independent of any particular

DLT solution or underlying network topology. However,both the topology and specifics of DLT solutions need tobe considered when defining the Aurora algorithm, as weexplain in the following section.

5 THE AURORA ALGORITHM AND ITS APPLICA-TIONS

The Aurora algorithm is used to construct probabilisticsets from a list of discoverable nodes. When performingthe algorithm, node a contacts other network nodes: Inaccordance with Assumption 1 it does not have access totrusted nodes and contacts a random one. As previouslystated, starting from node fc, node a explores the networkin a series of steps, where a single step is defined as:

Definition 8 (Draw). A draw is a step performed by node aduring network exploration which consists of two messages:a ping message from node a to a network node requesting itspeer list, and the corresponding pong message containing aresponse denoted as ℵ. A draw at step i expands Γi−1 withthe unique nodes found in ℵi, i.e., Γi ← Γi−1 ∪ ℵi

A network node can receive multiple ping queries andis assumed to be stateful [32], which means it retains stateabout which addresses of neighboring peers are revealedto node a together with a timestamp. Consequently, thenetwork node will not reveal nodes that have already beenrevealed to node a in a previous pong message. A networknode is no longer queried by node a if it responds withan empty peer list or fails to respond. Once a draw issuccessful, node a terminates a connection to the networknode since resources should be released as soon as possible.Draws are executed in steps until a halting condition issatisfied.

Definition 9 (Halting condition). Halting condition is apredicate that defines specific conditions which indicate tonode a to terminate further draws in the network.

Node a checks, after each draw, whether the haltingcondition is satisfied and consequently terminates furtherdraws. Hereafter, unless otherwise specified, the default halt-ing condition is defined as a step when all discoverable nodesstop responding to ping requests of node a (i.e., there are noavailable nodes left to query).

We define a sequence of draw steps as follows:

Definition 10 (Gathering). Gathering is the process of net-work exploration which consists of a sequence of drawsexecuted in steps i = 1 . . . d, where a halting condition ismet in step d. We assume it is executed in a directed andacyclic manner.

We assume that a network can be represented as a graphin which nodes are represented as vertices and connectionsbetween nodes are represented as edges, where edges havean associated direction. In such a graph, a gathering can beassociated with a Directed Acyclic Graph (DAG) consistingof nodes contacted during a gathering. After a node re-sponds with an empty peer list or does not respond at all, asubsequent node is selected from Γ uniformly at random. Asnode a performs the gathering, it also overlooks the creation

7

of the associated DAG to ensure that cycle formation isavoided.

Although at first glance a gathering is similar to theprocess of recursively scraping the entire network, it issomething quite different. The purpose of a gathering isto collect a minimal number of nodes within a minimalperiod of time to construct probabilistic honest sets. Theprobabilistic sets are then used by node a to choose an op-timal bootstrap candidate and honest nodes for transactioninclusion verification.

The Aurora algorithm, as defined in Alg. 2, requiresthe following input parameters: a first contact node fc, adesired malicious node tolerance κ, the probability ρ underwhich it guarantees correct execution, a halting conditionhc, the maximum acceptable size of the probabilistic set|Πh|, and the number of honest nodes h which the proba-bilistic honest set should contain. The last parameter conse-quently determines which of the probabilistic sets is created:Πs or Πp. The algorithm performs a do-while loop whicheither constructs a Πh (line 7) or throws an exception (line13). Ping and pong messages are exchanged between nodea and a randomly sampled responding node from Γ (lines9−11). Unique nodes gathered from ping responses are usedto expand Γ (line 12). After each ping and pong exchange,the algorithm checks weather the halting condition has beensatisfied in line 13.

Algorithm 2: Aurora algorithmInput : ρ — Probability guaranteeInput : fc — First contact node identifierInput : hc — Halting condition

encapsulationInput : |Πh|— Maximum Πh sizeInput : h — [1, b|Πh|/2c+ 1]Input : κ — Desired malicious node

tolerance1 nxtDraw ← fc;2 Γ← ∅;3 do4 X ∼ Hypergeometric(|Γ|, |Γ| − κ, |Πh|);5 if p(X ≥ h) ≥ ρ then6 Πh ← Sample without replacement |Πh|

nodes from Γ;7 return Πh

8 else9 nxtDraw ← random responding node in Γ;

10 Send ping to nxtDraw;11 ℵ ← peers in pong message from nxtDraw;12 Add ℵ to Γ;13 if

hc.isHaltingConditionReached(nxtDraw,ℵ)then

14 throw15 end16 end17 while True;

The halting condition used in line 12 is an injected de-pendency that implements a generic method which returnstrue if a gathering should halt, or false otherwise. It uses as

input parameters nxtDraw, which is the next node queriedfor its peer list, and the corresponding peer list ℵ.

A halting condition may be defined as follows:

1) halt if the average number of newly-discoverednodes in a draw falls below a defined threshold, or

2) halt if a predefined maximal number of gatheringsteps has been reached.

Examples of halting conditions and their influence onalgorithm performance are investigated in Section 7.

5.1 Choosing a bootstrap candidate

For the purposed of identifying an honest bootstrap candi-date, node a should use the probabilistic safe set generatedas the output of the Aurora algorithm since it contains atleast 1 honest node with high probability. Here, we assumethat nodes in the network advertise the state of the ledger(e.g., the highest total difficulty represents the latest ledgerin Ethereum), and node a will attempt to synchronize itsledger with other nodes using the order starting from thelatest to the oldest advertised state. To understand why Πs

is adequate for this purpose, let us analyze the behaviorof an adversary. First, if the adversary advertises an olderledger state compared to honest nodes (i.e., a lower chainhead), node a will synchronize with another honest memberof Πs that reveals the latest ledger state (i.e., the highestchain head). Second, if the adversary advertises a newerledger state compared to an honest node (i.e., a highestchain head), then by Assumption 5 the ledger offered bythe adversary must be tampered with. In this scenario, Πs

will eventually provide safety with respect to blockchainsynchronization. In other words, node a could choose amalicious node from Πs and start synchronizing, but it willeventually detect that fraudulent data is provided by themalicious node by examining the ledger. Since there is aguarantee with probability ρ that an honest node is availableamong all nodes in Πs, the node will start synchronizingwith the available honest node.

Alg. 3 relies on the existence of a Πs constructedby Alg. 2 and allows node a to choose a candidate nodefor synchronization. The candidate is the node that providesthe most recent ledger according to the consensus protocol,e.g., advertises the longest chain, highest difficulty, etc. Foreach such candidate, node a attempts to synchronize withits ledger. If it turns out that the ledger has been tamperedwith, node a chooses the next candidate. The algorithmstops when node a downloads and verifies an entire ledgerwithout detecting any tampering.

5.2 Transaction inclusion verification

For the purpose of transaction inclusion verification, nodea should use the probabilistic progress set Πp to verifywhether a transaction with identifier txid is containedwithin a block with identifier blkid. Here we make furtherassumptions:

Assumption 7. The data contained in a block is part of anintegrity-validating structure, and a node can verify thatthe transaction is indeed contained in that structure. It is

8

Algorithm 3: Πs use for synchronizationInput : Πs — Probabilistic Safe Set

1 sync← false;2 while sync = false do3 candidate← null;4 for node in Πs do5 candidate← node offering latest ledger state6 end7 Attempt to sync with candidate;8 if Attempt successful then9 sync← true;

10 else11 Remove candidate from Πs;12 end13 end

assumed that all necessary additional metadata is availableto the node.5

To explain the usefulness of a progress set Πp for trans-action inclusion verification, we compare it with Πs. Whenrelying on an instance of Πs, node a cannot assume that thechosen bootstrap candidate is honest, since ledgers need tobe downloaded and verified. However, node a can compareanswers from all members of Πp and conclude that theanswer provided by the majority is true with probabilityρ.

When Assumption 7 holds, Alg. 4 either checks whethera transaction with ID txid is included in a block blkid withprobability ρ by checking with every member of Πp if theprovided Merkle proof is valid with respect to the availableMerkle root mroot, and thus chooses the majority answer.

5.3 Initialization of parameters and use cases

To use Alg. 2, at least six parameters are required. Asmentioned earlier, it is recommended to set the correctnessprobability ρ to 0.999, with a note that the impact of ρ onthe size of probability sets is part of our future work. Thefirst contact node fc can be found in any way a user seesfit, e.g., via chat, forums, etc. We define the default haltingcondition hc as the moment when there are no more nodesto ask for peer lists, although we strongly suggest changingthis to a condition that better fits the underlying networktopology and the resources of a ledger client (an exampleis given in Section 7.3). The maximum size of Πh can beset in accordance with Constraint 1, i.e., b

√κc. The number

of honest nodes h in Πh, i.e. choosing between Πs and Πp,depends on the use case, as discussed further in this section.

Initializing κ is nontrivial: if a user makes a correct guessand sets κ = M , the algorithm executes optimally. If theuser sets κ < M , then correct algorithm execution cannot beguaranteed, while if κ > M , correct execution is guaranteedwith probability ρ at the cost of higher communicationcomplexity. Thus, this parameter should in general be anoverestimation of the envisioned number of malicious nodesin the network.

5. Such structures are present in prominent DLT solutions in the formof Merkle trees and Merkle proofs.

Algorithm 4: Πp use for txid inclusion verificationInput : mroot — String, merkle rootInput : txid — String, transaction IDInput : blkid — String, block IDInput : Πp — Probabilistic Progress Set

1 majority ← b|Πp|/2c+ 1;2 included← 0;3 notIncluded← 0;4 for node in Πp do5 Request Merkle proof from node;6 Verify Merkle proof for txid in blkid using mroot;7 if node offers valid Merkle proof then8 included← included+ 1;9 else

10 notIncluded← notIncluded+ 1;11 end12 if included = majority then13 Declare transaction included;14 return;15 else if notIncluded = majority then16 Declare transaction not included;17 return;18 end19 end

Since it is excessive to expect that a number of maliciousnodes in the network can be guessed correctly, an alternativecan be offered instead of κ which uses another parameter,the maximum execution time in seconds tmax specifyinghow long a user is willing to wait for the algorithm tocomplete. Within a period of tmax, a node running thealgorithm collects nodes. After the specified timeout occurs,using the values from Fig. 2, the user is presented with atable such as Table 1 to choose an adequate probabilistic setbased on the underlying use case.

To use Alg. 3, a user must obtain the transaction ID txid,the block ID blkid, and the corresponding Merkle root mroot

from an entity issuing the transaction. For example, in ause case scenario when a merchant ships products after abuyer has paid for the goods, the buyer sends the aboveparameters to the merchant as proof of payment.

The Aurora algorithm has a variety of potential usecases. For example, if a user is willing to download theledger and time is not critical6. The vital parameter for thisuse case is tolerance to κ malicious nodes, and thus the usermay decide to construct a Πs that guarantees at least onehonest node while specifying an overestimated κ.

Similarly, if consumer-grade hardware is used to checkthe state of a transaction, but it cannot store the entireledger, time is again not critical and a user wants a trustlessverification without relying on a third-party service. For thisuse case, the user will opt to use the Πp to query the state ofa transaction.

Finally, if a user applies a resource-constrained device(e.g., a mobile phone) which cannot store the entire ledgerand is unwilling to download the header chain, but the user

6. The user is willing to wait for an extended period of time to collectunique nodes during a gathering. The constructed Πh does not need tosatisfy Constraint 1

9

still wants to verify that a transaction is contained in a ledgerand is interested in doing so in an efficient manner, the usershould construct a Πp that follows constraints adequate forthis use case, for example Constraint 1. Since in this paperwe focus on applying the Aurora algorithm on resource-constrained devices, this use case is discussed in more detailin Section 7.5.

6 COMMUNICATION COMPLEXITY

Let us examine an upper bound on the number of messagesexchanged by node a running the Aurora algorithm as afunction of the desired malicious nodes tolerance κ.

The communication complexity of the algorithm can bedecomposed into two main parts:

1) the number of messages exchanged during a gath-ering, and

2) the number of messages exchanged during commu-nication with members of Πh.

The total communication complexity can be expressedas a sum of these values, which we can derive from thefollowing three equations.

First, a gathering can be described as a linear functionwhere the domain is the number of steps performed duringa gathering d, the co-domain is the number discoverablenodes discovered during the gathering |Γ|, and the slope Yis a random variable. Thus, we write

|Γ| = f(d) = Y ∗ d. (5)

The average slope Z is a random variable and is theaverage of the sample means7 that can be calculated forany network or topology via Monte Carlo simulation to ap-proximate a normal distribution. By applying the previousstatement to Eq. (5), we obtain the following

|Γ| = f(d) = Z ∗ d. (6)

Second, as observed in Fig. 2, the ratio ω = |Γ|κ when |Πs|

and |Πp| begin behaving like√κ is a function of κ and can

be precomputed8. Thus, the total unique number of nodes|Γ| can also be expressed as:

|Γ| = f(κ) = ω ∗ κ (7)

By joining Eq. (6) and Eq. (7) we get

ω ∗ κ = Z ∗ d =⇒ d =ω

Z∗ κ (8)

Since there are d steps in a gathering and each stepinvolves ping and pong messages, the number of messagesexchanged during a gathering can be expressed as

f(κ, Z) = 2 ∗⌈ω ∗ κZ

⌉(9)

Third, given Constraint 1, the number of requests (pings)to be sent from the node executing the algorithm to themembers of the constructed Πh can be expressed as follows:

7. Central Limit Theorem (CLT)8. E.g, for Πp, ω ranges from 55.22 to 1.99 for all relevant values of

the domain.

f(κ) = |Πh| = b√κc (10)

In other words, the maximum number of messagesexchanged when communicating with members of Πh isexchanged when all members Πh must be queried. In total,|Πh| ping and |Πh| pongs messages are exchanged, whichcan be expressed as follows:

f(κ) = 2 ∗ b√κc (11)

Consequently, the total number of messages exchangedbefore the algorithm terminates can be expressed as the sumof Eq. (9) and Eq. (11):

f(κ, Z) = 2 ∗⌈ω ∗ κZ

⌉+ 2 ∗ b

√κc (12)

where ω can be read from Fig. 2 and Z can be set equalto the threshold on the average number of newly discoverednodes per draw (see Section 7.3).

7 METHODOLOGY AND EXPERIMENTS

Having established a theoretical basis for the Aurora algo-rithm performance, we verify our findings using simula-tions. In this section, we explain the applied methodologyand present the results of five distinct experiments per-formed using the available open-source tools.

The network topology was modelled in accordance withprevious work [31], [33], i.e., by modelling a highly con-nected core network and a lightly connected edge. Theactive Bitcoin IPV4 network slice defined in [31] was chosenas the basis for the experiments, i.e., all experiments use aBitcoin-like network with a total of 6356 nodes.

An adversary was modelled to advertise a spoofed totaldifficulty in the network that is higher than the canonicalchain difficulty. When asked about transaction inclusion, theadversary always lies. When an adversary replies with apeer list, it reveals only malicious peers. All malicious nodesknow about each other (the assumption that a maliciousclique is fully connected is supported by [34]). Finally, asingle group of colluding malicious peers was modelled inthe network.

The Simblock [35] simulator was applied to define anetwork topology and extended to enable nodes to sendping messages and respond with pong messages, as definedby the Bitcoin protocol [7], [30], [31], [32], [33], where aping message maps to a GETADDR message and a pongrequest message to an ADDR message. The simulator wasredesigned to rely on the Java streaming API to reduceits memory requirements and allow parallel execution. Allother parameters are inherited from the Simblock codebase.Mining did not occur during a simulation run (i.e., the chainheader was not modified). The Algorithms 2, 3 and 4 wereredesigned to match the underlying discrete event-drivenenvironment.

7.1 Experiment 1: Algorithm effectiveness when in-creasing the percentage of malicious network nodesThe goal of this experiment is to evaluate the effectivenessof the algorithm when gradually increasing the percentage

10

of malicious network nodes, while the node running thealgorithm is assumed to have unlimited time and resources.No upper bound is placed on the probabilistic set sizes (i.e.,Constraint 1 has been abolished).

The effectiveness of the algorithm is measured by run-ning a Monte Carlo simulation of the network while grad-ually increasing the number of malicious nodes from 0%to 100%. Malicious nodes are selected uniformly at randomfrom the existing nodes in the topology and marked as mali-cious. First contact nodes fc are also selected uniformly andrandomly from malicious nodes to ensure a 99% confidencelevel and a 1% margin of error. We use the default haltingcondition for each gathering, i.e., the node executing thealgorithm halts the gathering if there are no more nodes toask for peer lists. The ultimate goal of the node executingthe algorithm is to synchronize with an honest peer. Thedesired malicious node tolerance κ is always set equal tothe maximum value M .

An outcome of a simulation run is classified as halt ifthe algorithm cannot guarantee protection against a desirednumber of malicious nodes and the node decides to abortoperation. In contrast, an outcome is classified as progressedwhen the algorithm does not halt, i.e., it has succeeded inmaking a decision about a ledger synchronization candi-date, which may be either honest or malicious.

An outcome is marked as success if the node has chosento halt or has progressed to synchronize with an honestnode. Otherwise it is marked as failure since the nodeexecuting the algorithm decided to synchronize with amalicious node.

The results are displayed in Fig. 3. The algorithm be-haves as expected, i.e., it maintains a high success rate re-gardless of the underlying network topology or entry point,and identifies an honest node for persistent communicationin 99.9% of cases. The algorithm consistently halts when thenumber of malicious nodes in the network exceeds 50% inaccordance with Assumption 5: It halts only when the firstcontact is malicious and the gathering could not collect adesired number of nodes.

Fig. 3. Outcomes of the algorithm with an increase of the percentage ofmalicious network nodes.

The experiment highlights the idea of eventual safety

with respect to ledger synchronization. As mentioned ear-lier, the algorithm provides eventual safety with respect toblockchain synchronization as long as the first contact is notmalicious. That is, if the first contact is honest, the algorithmensures the presence of at least one honest node (see trace Atleast one honest in Fig. 3) and node a will eventually detectthat fraudulent data is provided by malicious nodes andstart synchronization with the available honest node.

7.2 Experiment 2: Number of discoverable nodes as afunction of the number of drawsThis experiment investigates how the number of gatheringsteps d relates to the number of discoverable nodes, givenan underlying network topology and without the presenceof malicious nodes.

A total of 1000 experiments were conducted where nodea enters a network using a randomly-selected first contactnode fc. Node a has unlimited time and resources, and thehalting condition remains the default one.

Fig. 4 shows the average rate at which Γi grows as newdraws are performed, i.e. the node discovery rate for theexecuted 1000 runs is depicted as a function of the numberof gathering steps. The shaded region around the averagerepresents uncertainty and is bounded by one standarddeviation. The results show that the node discovery rate isirregular and depends on the underlying topology and entrypoint. That is, if a node encounters a high-degree node early,the network expansion is faster and vice versa.

Fig. 4. The average and standard deviation of the node discovery ratewith an increase of the number of gathering steps during 1000 experi-ments.

The results lead us to two conclusions. First, the numberof newly discovered nodes at the beginning of a gatheringshows a linear dependence on the number of gatheringsteps. As the gathering progresses, the expansion rate slowsdown, which means that new nodes are harder to find.Second, after a certain point in time, the continuation ofa gathering makes little sense, as very few, if any, newnodes are discovered. From a pragmatic point of view, itis unreasonable to discover the entire network before thealgorithm terminates. The communication complexity of thegathering can be reduced by introducing a more complex

11

halting condition which we investigate in the followingexperiment.

7.3 Experiment 3: Average number of new nodes dis-covered in a draw

After studying the algorithm effectiveness when node a hasunlimited resources, we introduce additional constraints totest the algorithm in a more realistic scenario. This exper-iment examines whether a custom halting condition basedon the average number of newly discovered nodes can beconstructed to reduce the communication complexity of thegathering in terms of the number of gathering steps, with-out significantly affecting the number of nodes discoveredduring a gathering.

Node a enters a network using a randomly-selected firstcontact node fc and monitors the average number of newlydiscovered nodes in a draw (at least 10 draws are neededbefore an average is calculated). If the average number ofnewly discovered nodes in a draw falls below a threshold,the node halts. The threshold varies from 1 to 500, and foreach threshold, a total of 1000 experiments is performed,based on which the average and standard deviation of thepercentage of the network discovered is calculated.

Fig. 5 shows the average of the percentage of the networkdiscovered with an increase of the threshold for the averagenumber of newly discovered nodes. The shaded regionaround the average represents uncertainty and is boundedby one standard deviation. The experiment confirms theexpected behavior, i.e., larger thresholds are not reliable interms of percentage of network discovered, while smallerthresholds provide a more consistent result, regardless ofthe number of nodes known by the first contact node.

Fig. 5. The average and the uncertainty of the percentage of the networknodes seen at the end of a gathering with an increase of the thresholdfor the average number of newly discovered nodes in a draw.

Since Fig. 5 shows that when introducing a minimalthreshold on the average number of newly discovered nodesin a draw results with a high percentage of the networknodes discovered, we continue by examining a setup whenthe threshold is set to 15 (i.e., the node running the al-gorithm must discover on average at least 15 new nodes

��

��

��

��

��

��

3HUFHQW

$PRXQW

Fig. 6. Percentage of network nodes seen when the threshold on theaverage number of newly discovered nodes in a draw is set to 15.

after each draw). When such a threshold is chosen, thenode discovers on average 98.00% of network nodes witha standard deviation of 0.911%, as shown in Fig. 6. Thisparticular threshold is used in the following experiments,as we consider the result beneficial for a real world usagescenario leading to a reliable result in terms of the percent-age of network nodes discovered. Note that the thresholdchosen in the scope of this experiments is not applicableto any network. The method for obtaining the threshold isgeneric and should be repeated for a new network topology.

7.4 Experiment 4: The upper bound on the number ofgenerated messagesThis experiment tests the validity of the theoretical resultsfor communication complexity expressed by Eq. (12) in areal network with malicious nodes. Hence the experimentinvestigates the number of messages generated during eachsimulation run.

We ran two simulation scenarios and for each scenario1000 experiments were performed where node a enters thenetwork using a randomly-selected first contact node fc.The threshold on the average number of newly discoverednodes per draw is set to 15 in both scenarios9. In each exper-iment, the node running the algorithm attempts to constructand query a probabilistic progress set Πp. The construc-tion and use of a probabilistic safe set Πs is intentionallyomitted here, as it is much more difficult to construct a Πp,making Πp more suitable for this experiment, which can beviewed as a stress test. Nevertheless, Eq. (12) can be appliedregardless of the probabilistic set type (safe of progress).During the experiment we count the number of messagesexchanged before the algorithm terminates, either becauseit managed to progress or because it halted.

In the first scenario, we set κ = M = 1272. In thissetting, the probabilistic set Πp can be constructed and Con-straint 1 can be met quite easily. Eq. (12) gives us an upperbound of 728 exchanged messages. In the second scenario,

9. Note that this value has the same semantics as Z, which is usedin Eq. (12).

12

we set κ = M = 1614. Compared to the first setting, this oneis more demanding: a node must collect at least 6000 totalnodes in a gathering in order to satisfy Constraint 1. Eq. (12)gives us an upper bound of 880 exchanged messages.

The results are shown in Fig. 7. In all simulation runs,the actual number of generated messages is smaller then orequal to the corresponding analytic bound, which stronglysuggests that Eq. (12) is valid.

� ��

É ��

É ��

É ��ERXQG

É ��ERXQG

1XPEHU�RI�H[FKDQJHG�PHVVDJHV

2FFXUUHQFH

Fig. 7. Comparing the actual number of generating messages with theanalytical bound on communication complexity. Markers (dots) markone or more gatherings that ended with the corresponding number ofexchanged messages.

7.5 Experiment 5: Transaction inclusion verification forlight clientsThis experiment investigates the effectiveness of the algo-rithm on a Bitcoin-like network containing a total of 6356nodes, with a gradual increase in the total number of ma-licious nodes, given that the node running the algorithmdoes not have unlimited time and resources. Thus, thisexperiment examines a realistic usage scenario in which, forinstance, a user has only a resource-constrained device (e.g.,a mobile phone), but still wants to verify that a transaction iscontained in a ledger and is interested in doing so efficientlyand trustworthy.

The simulation setup and methodology is the same asin Section 7.1, with two key differences. First, Constraint 1must be satisfied — if the node executing the algorithm isunable to respect the given constraint, it will not construct aprobabilistic set and will consequently halt. Second, the halt-ing condition is changed to take into account the thresholdfor the average number of unique nodes per draw, whichis set to 15, but only if 10 or more draws were previouslymade.

The results shown in Fig. 8 confirm that the algorithm ex-ecutes within the expected success rate with good progresswhile the number of reachable nodes which are malicious isbelow one quarter. Similarly to Fig. 3, the algorithm behavesas expected and identifies an honest node for transientor persistent communication in 99.9% of the trials, whilehalting consistently when less than a quarter of nodes inΓ are malicious. Thus, we conclude that the algorithm

is adequate for resource-constrained devices when similarnetwork conditions can be expected.

� ��

�

��

��

��

��

�

)DLOXUH�UDWH +DOWLQJ�UDWH 3URJUHVV�UDWH 6XFFHVV�UDWH

3HUFHQWDJH�RI�PDOLFLRXV�QRGHV�LQ�WKH�QHWZRUN��0

3HUFHQWDJH�RI�DOO�UXQV

Fig. 8. Outcomes of the algorithm in a realistic setup when Constraint 1is satisfied and the average message size threshold is set to 15, whileincreasing the malicious number of network nodes.

After verifying our work on a Bitcoin-like network, wedraw three main conclusions. First, the algorithm behavesas expected, i.e., it manages to download a correct ledger,infer a state of a transaction, or halt in ρ percent of thecases. Second, since the number of detected nodes in pongmessages decreases as the number of steps increases, itis useful to reduce the number of steps executed in agathering. This can be done by halting the gathering whenthe average number of newly discovered nodes in a drawfalls below a threshold. Third, the algorithm can be appliedon a resource-constrained device in a realistic Bitcoin-likenetwork when about a quarter of the network nodes aremalicious. Comparing the deterministic and probabilisticvariants, the probabilistic approach generates one to twoorders of magnitude fewer messages when inferring thestate of the ledger.

8 CHALLENGES AND FUTURE WORK

The most pressing challenge in a production environmentconcerns the responses to ping requests where networknodes are expected to respond with their peer list. Currentlyin Bitcoin, only publicly reachable nodes with a free slotsend such replies [36]. We can approach the problem in threedifferent ways. First, as part of a temporary solution, if anode does not respond because a free slot is unavailable,we can filter out that node and contact another node. Thiswill generate additional steps for the algorithm. The secondsolution requires a change in the node protocol so that nodesrespond to ping messages regardless of the number of freeslots. Such a change would not affect the underlying con-sensus protocol and can be presented as an opt-in feature.However, such a change makes nodes vulnerable to denial-of-service attacks. If an attack is detected, the affected nodemay simply decide not to respond to ping requests for awhile. Third, as a persistent solution, it is likely that morepublicly reachable nodes will emerge as DLT becomes more

13

widespread and mature. As a result, more peers will beavailable (e.g., each BTC node can connect to 8 outgoingpeers and up to 117 incoming peers [33], [36]), to partiallymitigate the problem.

Furthermore, one may argue that the gathering processis resource intensive and time consuming. However, itprovides trustless and decentralized capabilities which areotherwise unavailable in today’s DLT networks. The cachingof received peer lists and their reuse can help to mitigate theproblem, while a gathering can also be performed while thedevice is idle or charging. Sets can also be shared by socialconsensus (e.g., by a trusted person, friend, family member,etc.) and then used for future queries.

Finally, there is the question of incentive. Why woulda node act as a serving node for other nodes runningthe Aurora protocol? The information about neighboringpeers and the ledger head is already shared voluntarilyby nodes, however with a limit on the maximum numberof concurrent node connections. Furthermore, responsesregarding the inclusion of transactions can be classifiedas simple requests comparable to peer list responses, sowe assume that a node would be willing to share suchinformation without additional incentives. If at any pointour assumption proves to be wrong, the solution can beadapted to provide incentives for full nodes to collaborate,e.g., in the form of micro-payments for each transactioninclusion request.

As our future work, some of the open questions havealready been mentioned, namely, the impact of ρ on the sizeof probabilistic sets and the use case, and the study of a time-based halting condition. We also want to study the impact ofmining on the execution of the algorithm. We could compen-sate for new blocks generated during the execution of thealgorithm by querying for blocks beyond the ledger headand then checking if there is a common ancestor. In addition,the algorithm would benefit from parallel execution of thegathering, the effects of which can be studied in a controlledenvironment using a prominent distributed ledger clientmodified to run the Aurora algorithm.

9 CONCLUSION

The Aurora algorithm presented in this paper allows a newnode entering a DLT network that contains |M | maliciousnodes to discover sets of nodes that have a high probabilityof containing at least 1 or 1 + |M | honest nodes. Such setscan then be used by a new node to synchronize with thelatest ledger or to determine the state of ledger transaction.

The algorithm can be divided into two main parts. First,the algorithm constructs a Directed Acyclic Graph over anetwork topology for the purpose of node discovery usingresponse messages from contacted nodes conveying lists oftheir neighbors. Second, the algorithm selects a subset ofthe collected nodes to be tolerant to κ malicious nodes withprobability ρ by relying on the hypergeometric distribution.

In this work, we study a scenario where bootstrap nodesare unavailable or malicious, i.e., their behavior is hostileand they conspire against a victim that enters the network.The goal of such an adversary can range from denial ofservice (i.e., wasting the victim’s resources) to withholding

the canonical truth (i.e., the longest chain) from the vic-tim. In such an environment, we enable decentralized andtrustless ledger synchronization and attempt to reduce thevictim’s resource consumption by providing a probabilisticalgorithm that detects an honest peer with probability ρfor subsequent interaction. Specifically, the algorithm eitherallows ledger synchronization to continue if it determinesthat an honest bootstrap candidate has been found, or abortsthe process if no such guarantee can be made.

The second application of the Aurora algorithm allowsa device, under the same conditions and with the sameguarantees as for trustless ledger synchronization, to verifythat a transaction has been included in the ledger withouthaving to download the entire ledger or ledger header, or totrust a central authority.

We compare the Aurora algorithm with existing solu-tions, provide the pseudocode of the algorithm, explainthe prerequisites for running the algorithm, provide ananalytical expression for communication complexity, andhighlight two prominent applications of the algorithm. Weevaluate the algorithm using Monte Carlo simulations on aBitcoin IPV4 network slice. We used an open-source and dis-crete event-driven blockchain simulator to run experimentswhich confirmed that the experimental results are consistentwith our analytical findings.

Our experimental results show that the Aurora algo-rithm ensures decentralized and trustless DLT interactionof nodes running on both consumer-grade hardware andresource-constrained devices. In a realistic scenario whereup to a quarter of the nodes in the network are maliciousand actively attempt to subvert a new node joining thenetwork, the algorithm is able to correctly synchronizethe ledger state, infer the state of a transaction, or other-wise halt with high probability guarantees. Comparing ourprobabilistic algorithm with its deterministic variant, theprobabilistic approach is much more efficient and generatesone to two orders of magnitude fewer messages.

Our solution thus lowers the barrier to entry that DLTimposes on consumer-grade hardware and improves net-work decentralization, while allowing changes to be im-plemented in existing solutions without changing to theconsensus protocol.

ACKNOWLEDGMENTS

This work has been supported in part by Croatian Sci-ence Foundation under the project IP-2019-04-1986 (IoT4us:Human-centric smart services in interoperable and decen-tralized IoT environments).

REFERENCES

[1] J. Ghosh, “The blockchain: Opportunities for research in informa-tion systems and information technology,” 2019.

[2] X. Zheng, Y. Zhu, and X. Si, “A survey on challenges and pro-gresses in blockchain technologies: A performance and securityperspective,” Applied Sciences, vol. 9, no. 22, p. 4731, 2019.

[3] A. Gabizon, K. Gurkan, P. Jovanovic, G. Konstantopoulos,A. Oines, M. Olszewski, M. Straka, and E. Tromer, “Plumo: To-wards scalable interoperable blockchains using ultra light valida-tion systems,” 2020.

[4] A. Zamyatin, Z. Avarikioti, D. Perez, and W. J. Knottenbelt, “Tx-chain: Efficient cryptocurrency light clients via contingent trans-action aggregation,” in Data Privacy Management, Cryptocurrenciesand Blockchain Technology. Springer, 2020, pp. 269–286.

14

[5] Y. Lu, Q. Tang, and G. Wang, “Generic superlight client forpermissionless blockchains,” in European Symposium on Researchin Computer Security. Springer, 2020, pp. 713–733.

[6] I. Homoliak, S. Venugopalan, Q. Hum, and P. Szalachowski, “Asecurity reference architecture for blockchains,” in 2019 IEEEInternational Conference on Blockchain (Blockchain). IEEE, 2019, pp.390–397.

[7] E. Heilman, A. Kendler, A. Zohar, and S. Goldberg, “Eclipseattacks on bitcoin’s peer-to-peer network,” in 24th {USENIX}Security Symposium ({USENIX} Security 15), 2015, pp. 129–144.

[8] M. Saad, J. Spaulding, L. Njilla, C. Kamhoua, S. Shetty, D. Nyang,and D. Mohaisen, “Exploring the attack surface of blockchain: Acomprehensive survey,” IEEE Communications Surveys & Tutorials,vol. 22, no. 3, pp. 1977–2008, 2020.

[9] F. M. Bencic, A. Hrga, and I. P. Zarko, “Aurora: a robust and trust-less verification and synchronization algorithm for distributedledgers,” in 2019 IEEE International Conference on Blockchain(Blockchain). IEEE, 2019, pp. 332–338.

[10] W.-M. Lee, “Beginning ethereum smart contracts programming,”With Examples in Python, Solidity and JavaScript, 2019.

[11] J. Kwon, “Tendermint: Consensus without mining,” Draft v. 0.6,fall, vol. 1, no. 11, 2014.

[12] L. Xu, L. Chen, Z. Gao, S. Xu, and W. Shi, “Efficient pub-lic blockchain client for lightweight users,” arXiv preprintarXiv:1811.04900, 2018.

[13] D. Leung, A. Suhl, Y. Gilad, and N. Zeldovich, “Vault: Fastbootstrapping for the algorand cryptocurrency.” in NDSS, 2019.

[14] J. Chen and S. Micali, “Algorand,” arXiv preprint arXiv:1607.01341,2016.

[15] B. Bunz, L. Kiffer, L. Luu, and M. Zamani, “Flyclient: Super-lightclients for cryptocurrencies,” in 2020 IEEE Symposium on Securityand Privacy (SP). IEEE, 2020, pp. 928–946.

[16] D. Letz, “Blockquick: Super-light client protocol for blockchainvalidation on constrained devices.” IACR Cryptol. ePrint Arch., vol.2019, p. 579, 2019.

[17] H. Yu, M. Kaminsky, P. B. Gibbons, and A. Flaxman, “Sybilguard:defending against sybil attacks via social networks,” in Proceedingsof the 2006 conference on Applications, technologies, architectures, andprotocols for computer communications, 2006, pp. 267–278.

[18] Q. Ding, N. Katenka, P. Barford, E. Kolaczyk, and M. Crovella,“Intrusion as (anti) social communication: characterization anddetection,” in Proceedings of the 18th ACM SIGKDD internationalconference on Knowledge discovery and data mining, 2012, pp. 886–894.

[19] H.-H. Chen and C. L. Giles, “Ascos: an asymmetric networkstructure context similarity measure,” in 2013 IEEE/ACM Interna-tional Conference on Advances in Social Networks Analysis and Mining(ASONAM 2013). IEEE, 2013, pp. 442–449.

[20] S. Chowdhury, M. Khanzadeh, R. Akula, F. Zhang, S. Zhang,H. Medal, M. Marufuzzaman, and L. Bian, “Botnet detection usinggraph-based feature clustering,” Journal of Big Data, vol. 4, no. 1,pp. 1–23, 2017.

[21] J. Zhang, Y. Xiang, Y. Wang, W. Zhou, Y. Xiang, and Y. Guan, “Net-work traffic classification using correlation information,” IEEETransactions on Parallel and Distributed systems, vol. 24, no. 1, pp.104–117, 2012.

[22] J. Zhang, C. Chen, Y. Xiang, W. Zhou, and A. V. Vasilakos, “Aneffective network traffic classification method with unknown flowdetection,” IEEE Transactions on Network and Service Management,vol. 10, no. 2, pp. 133–147, 2013.

[23] K. Wust and A. Gervais, “Ethereum eclipse attacks,” ETH Zurich,Tech. Rep., 2016.

[24] Y. Marcus, E. Heilman, and S. Goldberg, “Low-resource eclipseattacks on ethereum’s peer-to-peer network.” IACR Cryptol. ePrintArch., vol. 2018, p. 236, 2018.

[25] G. Xu, B. Guo, C. Su, X. Zheng, K. Liang, D. S. Wong, and H. Wang,“Am i eclipsed? a smart detector of eclipse attacks for ethereum,”Computers & Security, vol. 88, p. 101604, 2020.

[26] M. Apostolaki, A. Zohar, and L. Vanbever, “Hijacking bitcoin:Routing attacks on cryptocurrencies,” in 2017 IEEE Symposium onSecurity and Privacy (SP). IEEE, 2017, pp. 375–392.

[27] M. Al-Bassam, A. Sonnino, and V. Buterin, “Fraud anddata availability proofs: Maximising light client security andscaling blockchains with dishonest majorities,” arXiv preprintarXiv:1809.09044, 2018.

[28] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,”Manubot, Tech. Rep., 2019.

[29] A. Rivadulla, “Mathematical statistics and metastatistical analy-sis,” Erkenntnis, vol. 34, no. 2, pp. 211–236, 1991.

[30] A. Biryukov, D. Khovratovich, and I. Pustogarov, “Deanonymisa-tion of clients in bitcoin p2p network,” in Proceedings of the 2014ACM SIGSAC Conference on Computer and Communications Security,2014, pp. 15–29.

[31] V. Deshpande, H. Badis, and L. George, “Btcmap: Mapping bitcoinpeer-to-peer network topology,” in 2018 IFIP/IEEE InternationalConference on Performance Evaluation and Modeling in Wired andWireless Networks (PEMWN). IEEE, 2018, pp. 1–6.

[32] A. Miller, J. Litton, A. Pachulski, N. Gupta, D. Levin, N. Spring,and B. Bhattacharjee, “Discovering bitcoin’s public topology andinfluential nodes,” et al, 2015.

[33] J. Misic, V. B. Misic, X. Chang, S. G. Motlagh, and M. Z. Ali, “Mod-eling of bitcoin’s blockchain delivery network,” IEEE Transactionson Network Science and Engineering, vol. 7, no. 3, pp. 1368–1381,2019.

[34] D. Dagon, G. Gu, C. P. Lee, and W. Lee, “A taxonomy of botnetstructures,” in Twenty-Third Annual Computer Security ApplicationsConference (ACSAC 2007). IEEE, 2007, pp. 325–339.

[35] Y. Aoki, K. Otsuki, T. Kaneko, R. Banno, and K. Shudo, “Simblock:A blockchain network simulator,” in IEEE INFOCOM 2019-IEEEConference on Computer Communications Workshops (INFOCOM WK-SHPS). IEEE, 2019, pp. 325–329.

[36] L. Wang and I. Pustogarov, “Towards better understanding ofbitcoin unreachable peers,” arXiv preprint arXiv:1709.06837, 2017.

Federico Matteo Bencic received the M.Sc.degree in information and communication tech-nology from the University of Zagreb, Faculty ofElectrical Engineering and Computing in 2017,where he is currently pursuing the Ph.D. degreewith the Department of Telecommunications. Heis currently an Assistant with the Department ofTelecommunications, Faculty of Electrical Engi-neering and Computing, University of Zagreb.His interests include applications of DLT in theIoT area.

Ivana Podnar Zarko is Full Professor at theUniversity of Zagreb, Faculty of Electrical Engi-neering and Computing, Croatia (UNIZG-FER)where she teaches distributed information sys-tems. She received her B.Sc., M.Sc. and Ph.D.degrees in electrical engineering from UNIZG-FER, in 1996, 1999 and 2004, respectively. Sheis affiliated with the Department of Telecommu-nications at UNIZG-FER from 1997. She was aguest researcher and research associate at theTechnical University of Vienna, Austria, and a

postdoctoral researcher at the Swiss Federal Institute of Technology inLausanne (EPFL), Switzerland. She was promoted to Full Professor inDecember 2017. She has participated in a number of research projectsfunded by national sources and EU funds, and is currently leadingthe UNIZG-FER Internet of Things Laboratory. Ivana Podnar Zarko isthe Technical Manager of the H2020 project symbIoTe: Symbiosis ofsmart objects across IoT environments, and is currently participatingin the Centre of Research Excellence for Data Science and AdvancedCooperative Systems, which is the first national center of excellence inthe field of technical sciences in Croatia. She has co-authored morethan 60 scientific journal and conference papers in the area of large-scale distributed systems, IoT, and Big data processing, and has as aprogram committee member for a number of international conferencesand workshops. Prof. Ivana Podnar Zarko is a member of IEEE and wasthe Chapter Chair of IEEE Communications Society, Croatia Chapter(2011— 2014). She has received the award for engineering excellencefrom the IEEE Croatia Section in 2013.

http://arxiv.org/abs/1811.04900




15

GLOSSARY

K Number of success in the populationM Number of malicious nodes in the networkN Population size parameter for the construction of a

hypergeometric distributionS A set of all network nodesX Random variableY Random variable denoting the number of unique nodes

found throughout a single gatheringZ Random variable denoting the average number of unique

nodes found throughout a single gathering, Z ∼N (µ, σ2)

∆p Deterministic Progress Set∆s Deterministic Safe Set∆ A shorthand for ∆h when h is implied from context or

not relevantΓ Subset of network nodes encountered in a gatheringΠp Probabilistic Progress SetΠs Probabilistic Safe SetΠ A shorthand for Πh when h is implied from context or

not relevantℵ Response to a peer list request, a of node identifiers∆h Deterministic Honest SetΠh Probabilistic Honest SetΓi Subset of network nodes encountered up to step iκ Desired malicious node tolerance, expressed as an abso-

lute numberω Measured ratio between |Γ| and κ when the correspond-

ing probabilistic sets size starts to behave as a sub-linear function of κ

ρ Probability guarantee on probabilistic sets correctnessa A new node entering a distributed ledger networkblkid A block identifierd Number of draws made in a gatheringfc First contact node identifierhc Halting condition encapsulationk Number of success in the samplemroot A Merkle rootnxtDraw Identifier of a next draw in a gatheringn The sample sizetmax Maximum execution time in secondstxid A transaction identifierx Random variable occurrence|Γ| The number of unique nodes discovered during a

gathering, also the population size used for hy-pergeometric set construction

Deterministic Progress Set For a given Γ containing κmalicious nodes, where |Γ| > 2κ, ∆p is a deter-ministic honest set which contains a majority ofhonest nodes and its size is at least 2κ+ 1

Deterministic Safe Set For a given Γ containing κmaliciousnodes, where |Γ| > κ, ∆s is a deterministic honestset which contains at least one honest node and itssize is at least κ+ 1

draw A step performed during network exploration

gathering The process of exploring the network

Probabilistic Progress Set For a given Γ containing κ mali-cious nodes, where |Γ| > 2κ, Πp is a probabilistic

honest set which contains a majority of honestnodes with probability ρ

Probabilistic Safe Set For a given Γ containing κ maliciousnodes, where |Γ| > κ, Πs is a probabilistic honestset which contains at least one honest node withprobability ρ

Documents

1 Aurora: a probabilistic algorithm for distributed