Enabling Verifiable and Dynamic Ranked Search Over ...ieee-trustcom.org/faculty/~csgjwang/papers/QinLiu-IEEETSC2019.pdf · Enabling Veriﬁable and Dynamic Ranked Search Over Outsourced

1939-1374 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSC.2019.2922177, IEEETransactions on Services Computing

1

Enabling Verifiable and Dynamic Ranked SearchOver Outsourced Data

Qin Liu, IEEE Member, Yue Tian, IEEE Student Member, Jie Wu, IEEE Fellow, Tao Peng, IEEE Member,and Guojun Wang, IEEE Member

Abstract—Cloud computing as a promising computing paradigm is increasingly utilized as potential hosts for users’ massive dataset.Since the cloud service provider (CSP) is outside the users’ trusted domain, existing research suggests encrypting sensitive databefore outsourcing and adopting Searchable Symmetric Encryption (SSE) to facilitate keyword-based searches over the ciphertexts.However, it remains a challenging task to design an effective SSE scheme that simultaneously supports sublinear search time, efficientupdate and verification, and on-demand information retrieval. To address this, we propose a Verifiable Dynamic Encryption withRanked Search (VDERS) scheme that allows a user to perform top-K searches on a dynamic document collection and verify thecorrectness of the search results in a secure and efficient way. Specifically, we first provide a basic construction, VDERS0, where aranked inverted index and a verifiable matrix are constructed to enable verifiable document insertion in top-K searches. Then, anadvanced construction, VDERS⋆, is devised to further support document deletion with a reduced communication cost. Extensiveexperiments on real datasets demonstrate the efficiency and effectiveness of our VDERS scheme.

Index Terms—searchable symmetric encryption, verifiability, dynamic, top-K searches

F

1 INTRODUCTION

As a promising computing paradigm, cloud computinghas drawn great attention from both research and industrycommunities. Because of the benefits of low costs, flexibility,and scalability, it has become a prevalent trend for users tooutsource their massive datasets to clouds and delegate acloud service provider (CSP) to manage data storage andoffer query services. Due to security and privacy concerns,existing research suggests encrypting data before outsourc-ing [1]. However, data encryption makes keyword-basedsearches over ciphertexts a challenging problem. This iseven harder for efficient top-K searches in a dynamic andmalicious cloud environment [2].

Let us consider the following scenario. Alice outsourcesarchived emails to the cloud, where each email is indexedby the sender’s name and ranked in descending order ofthe receipt date. For example, for a set of emails indexedby keyword Bob, the email received on April 2 has a higherrank than the email received on April 1. To keep keywordand document contents secret, Alice uploads them in theencrypted forms to the cloud. There could be hundreds ofdocuments matching a specific keyword, and the consumed

• Qin Liu and Yue Tian are with the College of Computer Science andElectronic Engineering, Hunan University, Changsha, Hunan Province,P. R. China, 410082. E-mail: [email protected]; [email protected]

• Jie Wu is with the Department of Computer and InformationSciences, Temple University, Philadelphia, PA 19122, USA. E-mail:[email protected]

• Tao Peng and Guojun Wang are with the School of Computer Scienceand Cyber Engineering, Guangzhou University, Guangzhou, GuangdongProvince, P.R. China, 510006. E-mail: pengtao [email protected];[email protected]

costs will be extensive if all the matched documents arereturned to and decrypted by the user. Therefore, Alice maywant to perform a top-K search to retrieve the most recentemails. Moreover, Alice may want to store only the emailsreceived in the last three months for monetary saving. Forexample, when entering May, Alice will delete all emailsreceived before February.

In the above application scenario, the adopted encryp-tion scheme should meet the following requirements: (1)Ranked search. The user is allowed to perform a top-K searchto retrieve the best-matched documents. (2) Dynamic. Theuser is able to update (add and delete) documents storedin the cloud. (3) Verifiability. The malicious CSP may deleteencrypted documents not commonly used to save memoryspace, or it may forge the search results to deceive the user.Even if the CSP is honest, a virus or worm may tamper withencrypted documents. Therefore, the user should have theability to verify the correctness of the search results. (4) Ef-ficiency. The user can efficiently perform searches, updates,and verifications on a set of encrypted documents.

Although Searchable Symmetric Encryption (SSE) allowsa user to retrieve desired documents in a privacy-preservingway, existing SSE schemes only partially address the aboverequirements. To simultaneously satisfy all these properties,this paper proposes a Verifiable Dynamic Encryption withRanked Search (VDERS) scheme that allows the user toperform updates and top-K searches on ciphertexts in averifiable and efficient way. Our main idea is to constructa verifiable matrix to record the ranking information andencode it with RSA accumulator [3]. Furthermore, a rankedinverted index is built from a collection of documents tofacilitate efficient top-K searches and updates. Specifically,we first provide a basic construction, denoted by VDERS0,which enables verifiable document insertion operations.Then, we provide an advanced construction, denoted by



2

VDERS⋆, which not only can support efficient deletion op-erations, but also can reduce communication costs withoutoutsourcing the verifiable matrix. Our main contributionsare summarized as follows:

• We propose a VDERS scheme to achieve dynamicand ranked searches in a cloud environment in anefficient and verifiable way.

• Two constructions are provided to achieve efficienttop-K searches with support for verifiable updates.

• We theoretically analyze the security and perfor-mance of our scheme and conduct extensive experi-ments on real datasets to validate its effectiveness.

Paper organization. We introduce related work in Sec-tion 2 and provide the preliminaries in Section 3. After theoverview of this work in Section 4, we provide our basic andadvanced VDERS constructions in Section 5 and Section 6,respectively. We evaluate the proposed scheme in Section 7.Finally, we conclude the paper in Section 8.

2 RELATED WORK

Our work focuses on verifiable ranked queries on dynamicencrypted data. SSE that allows an untrusted server to per-form keyword-based searches on ciphertexts can partiallyaddress our requirements.

SSE has been widely researched since it was first pro-posed by Song et al. [4]. As a seminal work in SSE, Curt-mola et al. [5] provided a rigorous security definition andconstructed two schemes, SSE-1 and SSE-2, based on aninverted index. Compared to SSE-1, SSE-2 is more secure,and it has been proven to be secure against adaptive cho-sen keyword attacks (CKA2). SSE-1 is secure against non-adaptive chosen-keyword attacks (CKA1), but yields anoptimal (sublinear) search time O(r), where r is the numberof documents that contain the query keyword. However,neither SSE-1 nor SSE-2 has properties of dynamism, verifia-bility, and ranked search.

Dynamic SSE (DSSE) allows a user to update the en-crypted outsourced data in an efficient and secure way.Kamara et al. [6] constructed a DSSE scheme based on an ex-tended inverted index. Their subsequent work [7] extendedit to a parallel search setting by using a red-black tree. Cashet al. [8] constructed a DSSE scheme optimized for super-large datasets, but their scheme supports only efficient docu-ment insertion. Naveed et al. [9] put forward a DSSE schemein which a server worked as a blind storage to decreaseleakage at the cost of multiple rounds of interaction. Theabove schemes are proven to be CKA2-secure, but leakkeyword information about the newly added documents.With this leakage, the attackers can reveal the content of apast query by injecting new documents in a dataset [10]. Tomitigate such an attack, Stefanov et al. [11] proposed the firstDSSE scheme with forward privacy, but their scheme suffersfrom inefficiency. Since then, forward privacy has been themain motivation of recent DSSE schemes [12], [13].

Verifiable SSE (VSSE) allows a user to verify the correct-ness of search results. Kurosawa et al. [14] constructed a U-niversally Composable (UC)-secure VSSE scheme, in whicha user can detect any malicious server’s cheating behavior.While UC-security is stronger than CKA2-security, their con-struction requires a linear search time. Soleimanian et al. [15]presented a public VSSE scheme, which delegates a third

TABLE 1Comparison with previous work

Rank Dynamic Verifiability SublinearRefs. [6]–[8] X XRefs. [14], [15] XRefs. [16]–[18] X XRef. [19] X X XRefs. [20], [24]–[26] X XRef. [22] X XRef. [23] X X XOur VDERS X X X X

party to accomplish the verification, but fails to supportdynamic operations. The subsequent work of Kurosawaet.al [16] extends VSSE to a dynamic environment. Theirscheme employs RSA accumulator [3] to generate constant-size digests/proofs, but the verification cost on the clientside grows linearly with the total number of documents. Toenable verifiable conjunctive keyword search over dynamicencrypted data, Sun et al. [17] exploited the bilinear-mapaccumulator technique to construct an accumulation tree.Jiang et al. [18] proposed a VDSSE scheme that also utilizedan accumulator tree to verify results of boolean queries. Theaccumulator tree structure is more efficient than RSA ac-cumulator in verification, but consumes more computationtime for updating tree structure. Zhu et al. [19] proposeda generic VSSE scheme in a multi-user setting, where theverifiable design can provide result verification for any SSEschemes and support data updates. However, the above VSSEschemes return all search results and therefore, may be unsuitablefor an environment where a lot of documents match a user’s querybut the user is only interested in best-match documents.

Ranked SSE (RSSE) allows a user to rank the search re-sults based on different evaluation functions. Cao et al. [20]proposed a dynamic multi-keyword RSSE scheme whichutilized the secure KNN technique [21] to rank resultsaccording to the number of matched query keywords. Themain limitation of their scheme is inefficiency, where thesearch time increases linearly with the number of docu-ments. Inspired by their work, many multi-keyword VRSSEschemes [22]–[26] were put forward. Sun et al. [22] proposeda verifiable RSSE scheme which ranked search results basedon cosine similarity while improving search efficiency byan MDB tree structure. Chen et al. [23] proposed a RSSEscheme based on a hierarchical clustering index. Theirscheme supports update and verification, but requires theclient to re-encrypt search indexes once a document isadded into a dataset. Zhang et al. [24] developed a RSSEscheme in a multi-owner model, where the search timegrew linearly with the number of queried keywords. Fuet al. [25] leveraged a user interest model to assign weightto query keywords according to the user’s search history.Guo et al. [26] put forward a dynamic multi-phrase SSEscheme that allowed the user to rank results locally basedon TF-IDF scores. However, it is hard for existing RSSE schemesto simultaneously support sublinear search time and efficientupdates and verification. The comparison results between ourwork and previous SSE schemes are shown in Table 1.

3 PRELIMINARY3.1 System Model

The system consists of three different parties: the CSP, thedata owner, and the data user, as illustrated in Fig. 1. The



3

Fig. 1. System model. Communication channels are secured with secu-rity protocols such as SSL/SSH.

CSP maintains cloud platforms that pool hard and softresources to provide data storage and query services.

The data owner first creates ciphertexts c for a documentcollection d. Given keywords w extracted from d, she thenbuilds a secure index I for fast searches, and generates alocal evidence Ψ and remote auxiliary information Φ forverifiable searches. After uploading (c, I,Φ) to the cloud,she can perform updates on ciphertexts with an updatetoken T∗ and retrieve documents on demand with a searchtoken TW in a verifiable way. On receiving the search resultsand a search proof (RW ,Π) from the CSP, the data ownerrecovers document contents after verifying the correctnessof the search results. The data owner can also delegate thesearch/update/verification ability to authorized data users.In this paper, we do not differentiate between the dataowner and the data user, and refer to them as users.3.2 Adversary Model

We assume that the users are fully trusted. The CSP isthe potential attacker and is assumed to be honest butcurious [1]. That is, the CSP would correctly execute theprespecified protocol, but still attempt to learn extra infor-mation about the stored data and the received message.

As defined in [5], access pattern refers to the outcome ofsearch results, i.e., which documents have been returned;search pattern refers to whether two searches have beenperformed for the same keyword. As a tradeoff betweensecurity and efficiency, existing SSE schemes resort to theweakened security guarantee for efficiency concerns. Thatis, they will reveal the access pattern and the search patternbut nothing else during the search process. Like existing SSEschemes, such information is also available to the CSP inour scheme. Furthermore, in a top-K search, only K highestranked documents will be returned. Therefore, our schemewill leak information about document ranks besides accesspattern and search pattern. It is worth noticing that the leak-age of ranking information is inevitable in top-K searches.For example, the adversary first issues a top-1 search andthen a top-2 search for keyword W , thereby it can knowthat the document returned in the first round has the highestrank and the new document returned in the second roundis ranked second. If the adversary issues top-1, 2, . . . , top-K continuously, then the rank of each document will beexposed by comparing search results.

Our scheme mainly aims to preserve the following pri-vacy properties:• Confidentiality. The CSP knows nothing about the

document/keyword contents expect search pattern, accesspattern, and document ranks.• Verifiability. The CSP cannot forge a search result or

falsify the outsourced ciphertexts.

3.3 RSA accumulatorRSA accumulator [3] is applied to verify the correctness ofthe search results in our scheme. It provides a constant-sizedigest for an arbitrarily large set of inputs and a constant-size witness for any element in the set such that it can beused to verify the (non-)membership of the element in thisset. Let κ ∈ N be a security parameter, and let p = 2p′ + 1and q = 2q′ + 1 be two large primes where p′, q′ are primessuch that |pq| > 3κ. Let F = {f : {0, 1}3κ → {0, 1}κ}be a family of two-universal hash functions, let N = pqand φ(N) = (p − 1)(q − 1), and let G be a cyclic group ofsize (p − 1)(q − 1)/4 where g is a generator of G. For a setof elements E = {y1, . . . , yn} with yi ∈ {0, 1}κ, the RSAaccumulator works as follows:• For each yi ∈ E, we choose a random prime xi,

denoted by P(yi), such that yi = f(xi). The accumulatoris calculated as Acc(E) = g

∏ni=1 P(yi) mod N.

• For any subset E′ ⊆ E, a witness π = g∏

yi∈E−E′ P(yi)

mod N is produced.• The subset test is carried out by checking Acc(E) =

π∏

yi∈E′ P(yi) mod N.Proposition 1. [29] For any y ∈ {0, 1}κ, we can computea prime x ∈ {0, 1}3κ such that f(x) = y with overwhelmingprobability from the set of inverses f−1(y) by sampling O(κ2)times.Strong RSA assumption. [30] Given N = pq and a randomelement y ∈ ZN, it is hard to find x and e > 1 such that y = xe

mod N.Proposition 2. Given N, g, f , and E = {y1, . . . , yn}, it is hardto find y ∈ E and π such that πP(y) = Acc(E) mod N underthe strong RSA assumption.

Given Proposition 1, a prime xi can be computed effi-ciently for any yi ∈ E. The security of RSA accumulator isbased on strong RSA assumption and Proposition 2.

4 SCHEME OVERVIEW

4.1 NotationsLet κ ∈ N be a secure parameter of the whole system, andlet 0 be a string of 0s with length κ. For t ∈ N, notation [t]is used to denote the set of integers {1, . . . , t}. The set of allbinary strings of length t is denoted by {0, 1}t and the set offinite binary strings by {0, 1}∗. Given a sequence of elementsv, its i-th element is denoted by v[i] or Vi and its totalnumber of elements by |v|. Given a matrix M, the elementin its i-th row and j-th column is denoted by M[i][j]. If Sis a set then |S| refers to its cardinality. If s is a string then||s|| refers to its bit length. The concatenation of t stringss1, . . . , st is denoted by ⟨s1, . . . , st⟩.

The data set consists of a sequence of n documents d =(D1, . . . , Dn), where the j-th document Dj , associated withidentifier j, contains a sequence of keywords wj for j ∈ [n].The ciphertext collection is denoted by c = (C1, . . . , Cn),where Cj is the ciphertext of document Dj for j ∈ [n]. Theuniversal distinct keywords extracted from d are denoted byw = w1∪. . .∪wn = (W1, . . . ,Wm). The i-th keywordWi ∈w, associated with identifier i, is contained in a sequence ofdocuments dWi , for i ∈ [m]. We assume that the identifierassociated with each document/keyword is independent ofits contents, and thus is allowed to be exposed to the CSP.Let ξ be a function outputting the identifier of a document or



4

TABLE 2Summary of Notations

d A sequence of n documents (D1, . . . , Dn)w A sequence of m keywords (W1, . . . ,Wm)c A sequence of n ciphertexts (C1, . . . , Cn)dW A sequence of documents containing keyword Wwj A sequence of keywords contained in document Dj

I A ranked inverted index (Ts,As)Ψ The local evidence (ψC , ψI)Φ The remote auxiliary informationΠ The search proof (πC , πI)TW A search token generated for keyword WT∗ An update token generated for document DidW A sequence of document identifiers (ID1, . . . , ID|dW |)

where IDj is rank-j document’s identifier for keyword WsW A sequence of scores (ε1, . . . , ε|dW |) where

εj is rank-j document’s score for keyword WcW A sequence of ciphertexts (CID1 , . . . , CID|dW | ) where

CIDjis rank-j document’s ciphertext for keyword W

RW The search results for search token TW

a keyword. For documentD ∈ d and keywordW ∈ w, theiridentifiers are denoted by ξ(D) and ξ(W ), respectively. Themost relevant notations in our scheme are shown in Table 2.4.2 Ranked Inverted IndexIn the inverted index [5], a list of nodes pointing to docu-ments containing keyword W ∈ w are randomly stored ina search array As and the pointer to the head node is storedin a search table Ts. The overwhelming advantage of aninverted index is its search efficiency, i.e., the complexityof searching keyword W is O(|dW |), which is not onlysub-linear, but also optimal. As an extension, our rankedinverted index I is composed of a search table Ts and asearch array As. Let−1 denote a (log#As+1)-length stringof 1s. For each keywordW ∈ w, a ranked linked list LW thatchains a list of nodes in As is defined as follows:Definition 1 (Ranked linked list). LW links |dW | n-odes (N1, . . . ,N|dW |) and each node is defined as Nj =⟨idj , εj , addrs(Nj+1)⟩, where idj is identifier of the j-th doc-ument in dW , εj is the score 1 of the j-th document in dW forkeyword W 2, and addrs(Nj+1) is the address of node Nj+1 ina search array As. In the special case, addrs(N|dW |+1) = −1.

As for data structures, the search array As is an arraywith As[i] denoting the value stored at location i of As andAs[i]← v denoting storing v at location i of As. The searchtable Ts is a dictionary that stores key-value pairs. If a pair(k, v) exists in Ts, then v is the value associated with key kin Ts. Furthermore, Ts[k] ← v denotes storing the value vunder key k in Ts. Let #As and #Ts denote the size of As

and Ts, respectively. Similar to [6], we set #Ts = m + 1,where the first m entries correspond to keywords in w andthe last entry points to an unused entry in As; and we set#As = ||c||/8 + z to protect statistical information about d,where ||c|| is the bit length of the ciphertext collection andz ∈ N is the number of unused entries in As.4.3 Verifiable MatrixLet H : {0, 1}∗ → {0, 1}κ be a collision-free hash function,let σ be a pseudo-random permutation (PRP) on [m], and

1. For the single keyword search, we choose the term frequency (TF)as the relevance score, which is calculated by TF = DW

|D| , where DW

denotes the frequency of the keyword W appearing in the document Dand |D| denotes the number of words in D.

2. The role of scores is to order documents in top-K searches. There-fore, there is no need to use authentic scores for comparison, and allscores will be preprocessed by order preserving functions like [27], [28].

let S : {0, 1}κ × {0, 1}∗ → {0, 1}κ be a pseudo-randomfunction (PRF) with key k4. The verifiable matrix V is anm× n matrix defined as follows:

V[i][j] =

{H(IDj−1, IDj , IDj+1)⊕ Sk4(W ), j ∈ [|dW |]γij , otherwise,

(1)where i = σ(ξ(W )), IDj denotes the identifier of the rank-j document for keyword W , and γij is a random stringof length κ stored at V[i][j]. Therefore, V[i][j] records theinformation of the rank-(j − 1), rank-j, and rank-(j + 1)documents for keyword W . In the special case, we haveID0 = ID|dW |+1 = 0.

4.4 Scheme DefinitionThe VDERS scheme consists of the following algorithms:• (params, SK) ← GenKey(1κ): It takes a security pa-

rameter κ as input and outputs public parameters paramsand a secret key SK .• (I, c) ← Encryption(SK,d,w): It takes a secret key

SK , a sequence of documents d, and a sequence of key-words w as inputs, and outputs a ranked inverted index Iand a sequence of ciphertexts c.• (Φ,Ψ) ← AccGen(params, SK,d,w, c): It takes pub-

lic parameters params, a secret key SK , a sequence ofdocuments d, a sequence of keywords w, and a sequenceof ciphertexts c as inputs. It outputs remote auxiliary infor-mation Φ and a local evidence Ψ.• TW ← SrcToken(SK,W ): It takes a secret key SK and

a keyword W ∈ w as inputs to generate a search token TW .• RW ← Search(I, TW ): It takes a search token TW and

an index I as inputs to output search results RW .•Π← GenProof(params, c,Φ, TW ,RW ): It takes public

parameters params, a sequence of ciphertexts c, remoteauxiliary information Φ, a search token TW , and searchresults RW as inputs to generate a search proof Π.• {0, 1} ← Verify(params, SK,Ψ, TW ,RW ,Π): It takes

public parameters params, a secret key SK , a local evidenceΨ, a search token TW , search resultsRW , and a search proofΠ as inputs. It outputs 1 if verification is successful, andoutputs 0 otherwise. If the output is 1, for each ciphertext inRW , it takes a secret key SK as input to recover plaintext.• (T∗,Ψ′) ← UpdToken(params, SK,D,Ψ): It takes

public parameters params, a secret key SK , a document D,and a local evidence Ψ as inputs to generate an update tokenT∗ and a new local evidence Ψ′, where ∗ ∈ {add,del}.• (I′, c′,Φ′) ← Update(I, c,Φ, T∗): It takes a ranked

inverted index I, a ciphertext collection c, remote auxil-iary information Φ, and an update token T∗ as inputs. Itgenerates a new ranked inverted index I′, a new ciphertextcollection c′, and new remote auxiliary information Φ′.Definition 2 (Correctness of VDERS). VDERS is correct if forall κ ∈ N, for all keys generated by the GenKey algorithm, forall (I, c) output by the Encryption algorithm, and for all updates,the Search algorithm always returns the correct search results.

4.5 Security DefinitionConfidentiality. We follow the approach in [6] that utilizesleakage functions to capture what is being revealed byciphertexts and tokens. Let Adv be a stateful adversary,let Sim be a stateful simulator, and let L1(d) ,L2(d,W ),



5

L3(d, D), and L4(d, D) be stateful leakage algorithms. Weconsider the following probabilistic experiments.•RealAdv(κ): the challenger runs GenKey(1κ) to gener-

ate a secret key SK. Adv outputs d and receives (I, c) fromthe challenger so that (I, c) ← Encryption(SK,d,w). Advmakes a polynomial number of adaptive queries, and foreach query, it receives from the challenger either a search to-ken TW such that TW ← SrcToken(SK,W ) or an update to-ken T∗ such that (T∗,Ψ′)← UdpToken(params, SK,D,Ψ).Finally, Adv returns a bit b that is output by the experiment.• IdealAdv,Sim(κ): Adv outputs d. Given L1(d), Sim

generates and sends a pair (I, c) to Adv. Adv makes a poly-nomial number of adaptive queries. Given either L2(d,W ),L3(d, D), or L4(d, D), Sim returns an appropriate tokenand in the case of an add operation, a ciphtertext C. Finally,Adv returns a bit b that is output by the experiment.

Definition 3 (Confidentiality of VDERS). VDERS is confiden-tial if for all probabilistic polynomial-time (PPT) adversaries Adv,there exists a PPT simulator Sim such that |Pr[RealAdv(κ) =1]− Pr[IdealAdv,Sim(κ) = 1]| is negligible.

Intrinsically, the confidentiality of VDERS is equivalentto the CKA2 security of [6], in which Adv’s views given bySim are indistinguishable from those in the real world.

Verifiability. It describes the fact that no adversary candeceive the user to accept incorrect search results. As in [16],we say that the adversary Adv wins if given (d,w, I) andsearch queries Adv can return (RW ,Π) for some query suchthat RW = RW and the user accepts the result.

Definition 4 (Verifiability of VDERS). VDERS is verifiable iffor any PPT adversary Adv, Pr(Adv wins) is negligible for any(d,w, I) and any query.

5 BASIC VDERS SCHEME

5.1 Main Idea

The foundational technologies in our VDERS scheme areRSA accumulator (defined in Section 3.3), the ranked invert-ed index (defined in Section 4.2), and the verifiable matrix(defined in Section 4.3). Our main idea is first building aranked inverted index to achieve sublinear search time intop-K queries, and then verifying the correctness of searchresults based on RSA accumulator and the verifiable matrix.

Specially, the search index is constructed as a rankedinverted index. Unlike the inverted index [5] in which eachnode contains only information about document identifierand the location of the next node in the search array, ourranked inverted index incorporates ranking information ineach node to enable top-K queries in cloud computing.

Furthermore, inspired by the work in [16], our VDERSscheme also employs RSA accumulator for dynamic veri-fication. The main difference from their work is that ourscheme constructs a verifiable matrix V to record the docu-ment ranking information for each keyword, and calculatesthe remote auxiliary information Φ and a local evidence Ψby encoding the elements of V into RSA accumulator.

5.2 Our Construction

Let F = {f : {0, 1}3κ → {0, 1}κ} be a family of two-universal hash functions and let SKE = (Gen,Enc,Dec)be a symmetric-key encryption (SKE) scheme, where Gen isa key generation algorithm, Enc is an encryption algorithm,

Algorithm 1 VDERS0.EncryptionInput: Secret key SK, documents d, and keywords wOutPut: Ranked inverted index I and ciphertexts c

1: Initialize the search table Ts with an empty dictionaryof size m + 1 and initialize the search array As with anempty array of size ||c||/8 + z

2: for each keyword W ∈ w do3: Create a ranked linked list LW = (N1, . . . ,N|dW |) by

randomly choosing |dW | unused entries in As whereNi = ⟨idi, εi, addrs(Ni+1)⟩ is defined in Definition 1

4: for i ∈ [|dW |] do5: Choose a κ-bit random string ri and set

As[addrs(Ni)]← (Ni ⊕H(Pk3(W ), ri), ri)

6: Encrypt the address of the head node N1 and setTs[Fk1(W )]← addrs(N1)⊕Gk2(W )

7: Create a free list Lfree = (F1, . . . ,Fz) by random-ly choosing z unused entries in As where Fi =⟨0,0, addrs(Fi−1)⟩ and addrs(F0) = −1

8: for i = z to 1 do9: Set As[addrs(Fi)]← (Fi,0)

10: Store the address of the tail node Fz in Ts by settingTs[free]← addrs(Fz)

11: Fill the remaining entries of As with random strings12: Set I = (Ts,As)13: for j ∈ [n] do14: Set Cj = SKE.Enc(ke, Dj) and c = c ∪ Cj

15: return (I, c)

and Dec is a decryption algorithm. Let H : {0, 1}∗ → {0, 1}∗be a random oracle and let H : {0, 1}∗ → {0, 1}κ be acollision-free hash function. To compute H(x1, . . . , xn), wefirst transform each input xi into a bit string xi and then takethe concatenation of ⟨x1, . . . , xn⟩ as the inputs of H . Let F :{0, 1}κ×{0, 1}∗ → {0, 1}κ, G : {0, 1}κ×{0, 1}∗ → {0, 1}∗,P : {0, 1}κ×{0, 1}∗ → {0, 1}κ, and S : {0, 1}κ×{0, 1}∗ →{0, 1}κ be PRFs. VDERS0 is constructed as follows:• (params, SK) ← GenKey(1κ): Given (N = pq, g) as

shown in Section 3.3, the user randomly chooses a hashfunction f ∈ F and four κ-bit strings k1, k2, k3, k4 as keysof PRFs3. Then, she runs algorithm SKE.Gen to generate thekey ke of SKE, chooses a PRP σ on [m], and sets the publicparameters as params = (N, g, f) and the secret key asSK = (p, q, σ, ke, k1, k2, k3, k4).• (I, c) ← Encyption(SK,d,w): Assume that −1 de-

notes a (log#As + 1)-length string of 1s and that freedenotes a word not in w. The user runs Alg. 1 to builda ranked inverted index I = (Ts,As) and a sequence ofciphertexts c = (C1, . . . , Cn).

In Alg. 1, the user first builds a ranked inverted index I asfollows: After initializing the search table Ts and the searcharray As (Line 1), for each keyword W ∈ w, she creates anencrypted list LW of nodes storing at random locations ofAs, and stores the encrypted address of the head node in Ts

(Lines 2-6). Then, she creates an unencrypted free list Lfree

by choosing z unused entries at random in As and storesthe address of the tail node in Ts (Lines 7-10). Next, she fills

3. A prime x is chosen randomly from f−1. A secret key ka ∈ {0, 1}κshared between the user and the CSP is used as the randomness whencomputing x, so that they obtain the same outputs from f−1.



6

Algorithm 2 VDERS0.SearchInput: Ranked inverted index I and search token TWOutPut: Search results RW

1: Parse TW as (τ1, τ2, τ3, τ4, τ5)2: if τ1 is not in Ts then3: return ∅4: Compute Ts[τ1]⊕ τ2 and recover addrs(N1) the address

of the head node N1 of LW in As

5: repeat6: Obtain the encrypted node Ni stored at As[addrs(Ni)]

and parse Ni as (vi, ri)7: Compute vi ⊕ H(τ3, ri) and recover node Ni =

⟨idi, εi, addrs(Ni+1)⟩8: until addrs(Ni+1) = −19: Sort documents according to their scores

10: Let IDj and εj be the identifier and score of the rank-jdocument for W , respectively

11: if τ4 = ∗ then12: Set idK+1

W = (ID1, . . . , IDK+1), sK+1W =

(ε1, . . . , εK+1), and cKW = (CID1 , . . . , CIDK)

13: Set RW ← (idK+1W , sK+1

W , cKW )14: else15: Set idW = (ID1, . . . , ID|dW |), sW = (ε1, . . . , ε|dW |),

and cW = (CID1 , . . . , CID|dW |)16: Set RW ← (idW , sW , cW )17: return RW

the remaining entries of As with random strings to hindstatistical information about the document collection (Lines11-12). Finally, she generates a sequence of ciphertexts c andreturns (I, c) as the outputs of Alg. 1 (Lines 13-15).• (Φ,Ψ) ← AccGen(params, SK,d,w, c): The user

constructs a verifiable matrix V as defined in Section 4.3,and sets the remote auxiliary information as Φ = V. LetP(y) be a random prime x such that f(x) = y. She calculatesa local evidence Ψ = (ψC , ψI) with Eq. 2 and Eq. 3.

ψC = g∏n

i=1 P(H(i,H(Ci))) mod N, (2)

ψI = g∏m

i=1

∏nj=1 P(H(i,V[i][j])) mod N. (3)

• TW ← SrcToken(SK,W ): To retrieve the top-Kdocuments, the user generates a search token TW =(Fk1(W ), Gk2(W ), Pk3(W ),K, σ(ξ(W ))) for keyword W ∈w. If the user wants to retrieve all documents containingkeyword W , K in TW is replaced by notation ∗.• RW ← Search(I, TW ): For top-K queries, the CSP run-

s Alg. 2 to output search results RW = (idK+1W , sK+1

W , cKW ),where idj

W , sjW , and cjW denote the first j identifiers, scores,and ciphertexts in idW , sW , and cW , respectively. For nor-mal queries, search results RW are set as (idW , sW , cW ).

In Alg. 2, the CSP parses TW as (τ1, τ2, τ3, τ4, τ5), andreturns ∅ if τ1 is not in Ts (Lines 1-3). Otherwise, the CSPfirst recovers the address of head node of LW in As andrecovers all nodes of LW in sequence (Lines 4-8). Then, itsorts documents according to their scores and returns iden-tifiers, scores, and ciphertexts of the matched documents asthe search results RW (Lines 9-17).• Π ← GenProof(params, c,Φ, TW ,RW ): For top-K

queries, the CSP parses TW as (τ1, τ2, τ3, τ4, τ5), and setssearch proof Π = (πC , πI) with Eq. 4 and Eq. 5.

πC = g∏

Ci∈c,Ci ∈cKW

P(H(i,H(Ci)))mod N, (4)

πI = g∏m

i=1,i=τ5ϕi·

∏nj=K+1 P(H(τ5,V[τ5][j])) mod N, (5)

where ϕi =∏n

j=1 P(H(i,V[i][j])). For normal queries,parameter K in both Eq. 4 and Eq. 5 is replaced by |dW |.• {0, 1} ← Verify(params, SK,Ψ, TW ,RW ,Π): Let IDj

be the identifier of the rank-j document containing keywordW . To verify a top-K query, for i ∈ [K], the user computesxi with Eq. 6 and checks whether Eq. 7 holds.

xi ← P(H(IDi, H(CIDi))). (6)

ψC = (πC)∏K

i=1 xi mod N. (7)

Next, she parses TW as (τ1, τ2, τ3, τ4, τ5), and reconstructsthe first K elements V[τ5][1], . . . ,V[τ5][K] at the τ5-rowof the verifiable matrix V with Eq. 1. For j ∈ [K], shecomputes yj with Eq. 8 and checks whether Eq. 9 holds.

yj ← P(H(τ5,V[τ5][j])). (8)

ψI = (πI)∏K

j=1 yj mod N. (9)

If Eq. 7 and Eq. 9 hold, she outputs 1, and 0 otherwise.For normal queries, parameter K in both Eq. 7 and Eq. 9is replaced by |dW |. If the output is 1, for each ciphertextCj ∈ RW , the user runs SKE.Dec to recover document Dj .• (T∗,Ψ′) ← UpdToken(params, SK,D,Ψ): To delete

document D with ξ(D) = j, the user sets the updatetoken as T∗ = Tdel = (j,delete, C ′), where C ′ is outputof SKE.Enc(ke,0). Then, she retrieves Cj from the CSP andgenerates a new local evidence as Ψ′ = (ψ′

C , ψ′I), where

ψ′I = ψI and ψ′

C is computed with Eq. 10.

ψ′C = (ψC)

P(H(j,H(C′)))P(H(j,H(Cj)))

mod φ(N)mod N. (10)

To add document D with ξ(D) = n+1, the user sets theupdate token as T∗ = Tadd = ((n + 1, add, Cn+1), ζ, τv, τa),where Cn+1 is the output of SKE.Enc(ke, Dn+1), ζ =(γ1, . . . , γm) are a sequence of κ-bit random strings, τv =(β1, . . . , β|wn+1|) and τa = (λ1, . . . , λ|wn+1|). For the i-thkeyword W ∈ wn+1, she computes βi and λi as follows:

1) She computes the scores εi of document Dn+1, choos-es a random string ri of κ bits, and computes λi =(λi[1], λi[2], λi[3], λi[4], λi[5]) as:

λi[1] = Fk1(W ), λi[2] = Gk2(W ), λi[3] = ri,

λi[4] = ⟨n+ 1, εi,0⟩ ⊕ H(Pk3(W ), λi[3]),

λi[5] = σ(ξ(W )).

(11)

2) She first performs a normal query to retrieve alldocuments containing keyword W and verifies the cor-rectness of the search results RW . Then, she determinesthe rank of Dn+1, denoted by J, by comparing its s-core, εi, with those of documents in RW , and sets βi =(βi[1], βi[2], βi[3], βi[4], βi[5], βi[6]) as:

βi[1] = H(IDJ−2, IDJ−1, n+ 1)⊕ Sk4(W ),

βi[2] = H(IDJ−1, n+ 1, IDJ)⊕ Sk4(W ),

βi[3] = H(n+ 1, IDJ, IDJ+1)⊕ Sk4(W ),

βi[4] = H(IDJ−2, IDJ−1, IDJ)⊕ Sk4(W ),

βi[5] = H(IDJ−1, IDJ, IDJ+1)⊕ Sk4(W ),

βi[6] = J.

(12)



7

Algorithm 3 VDERS0.UpdateInput: The original ranked inverted index I, the original

ciphertexts c, the original remote auxiliary informationΦ, and the update token T∗

OutPut: The updated versions of ranked inverted index,ciphertexts, and remote auxiliary information (I′, c′,Φ′)

1: Set I′ = I, c′ = c, and Φ′ = Φ (i.e., V′ = V)2: if T∗ = Tdel = (j,delete, C ′) then3: Mark Cj as delete and replace Cj ∈ c′ with C ′

4: else5: T∗ = Tadd = ((n+ 1, add, Cn+1), ζ, τv, τa)6: Set c′ = c′ ∪ Cn+1

7: for the i-the keyword W ∈ wn+1 do8: Compute ϖ ← Ts[free] and obtain the last free

location ϖ in As

9: Compute ((0,0, ϖ−1),0) ← As[ϖ] and obtain thesecond last free location ϖ−1 in As

10: Update the free list to point to the second last freelocation ϖ−1 in As by setting Ts[free]← ϖ−1

11: Recover the address of the first node N1 of LW inAs by computing α1 = Ts(λi[1])⊕ λi[2]

12: Store a new node at location ϖ as the first node ofLW by setting As[ϖ]← (λi[4]⊕ ⟨0,0, α1⟩), λi[3])

13: Store the pointer to the first node of LW in Ts bysetting Ts[λi[1]]← ϖ ⊕ λi[2]

14: Add ζ as the last column of V′

15: for the i-th keyword W ∈ wn+1 do16: Set J = βi[6] and τ = λi[5]17: if J < n then18: for j = n to J + 1 do19: Set V′[τ ][j + 1]=V′[τ ][j]20: if J = 1 then21: Set V′[τ ][J] = βi[2] and V′[τ ][J + 1] = βi[3]22: else23: if J = |dW |+ 1 then24: Set V′[τ ][J− 1] = βi[1] and V′[τ ][J] = βi[2]25: else26: Set V′[τ ][J − 1] = βi[1], V′[τ ][J] = βi[2], and

V′[τ ][J + 1] = βi[3]27: return (I′, c′,Φ′)

where IDj is the identifier of the rank-j document forkeyword W before update, and ID|dW |+1 = ID0 = 0. Inthe special case, if J = 1, we set βi[1] = βi[4] = ⊥; ifJ = |dW |+ 1, we set βi[3] = βi[5] = ⊥. Then, she generatesa new local evidence Ψ′ = (ψ′

C , ψ′I) as follows:

ψ′C = (ψC)

P(H(n+1,H(Cn+1))) mod N. (13)

For i ∈ [|wn+1|], she computes yi with Eq. 14:

yi =

∏3j=1 P(H(λi[5], βi[j]))∏5j=4 P(H(λi[5], βi[j]))

mod φ(N), (14)

Note that if βi[j] = ⊥, P(H(λi[5], βi[j])) will be set to 1.Then, she updates ψI to ψ′

I with Eq. 15:

ψ′I = (ψI)

∏i∈χ P(H(i,γi))

∏|wn+1|i=1 yi mod N, (15)

where χ = [m]/{λ1[5], . . . , λ|wn+1|[5]} is a set of elementsand γi ∈ ζ is a κ-bit random string.• (I′, c′,Φ′) ← Update(I, c,Φ, T∗): The CSP runs algo-

rithm Alg. 3 to generate the updated versions of ranked

Fig. 3. The ranking information of documents.

Fig. 4. Samples of verifiable matrices.

inverted index, ciphertext collection, and remote auxiliaryinformation (I′, c′,Φ′). In Alg. 3, for the deletion operation,the CSP keeps I and V unchanged, and updates c byreplacing ciphertext Cj with C ′ (Lines 1-3).

For the addition operation, the CSP first generates c′ byadding Cn+1 into c (Lines 4-6). To generate I′ = (T′

s,A′s),

it executes lines 7-13 of Alg. 3 as follows: For each keywordW ∈ wn+1, it stores the new node at the last free location ϖin As and sets it as the first node of LW , and then it updatesTs[λi[1]] to point to ϖ and updates Ts[free] to point tothe second last free location ϖ−1 in As. Finally, it generatesΦ′ = V′ and returns (I′, c′,Φ′) (Lines 14-27).

5.3 Illustration ExampleThe correctness of our VDERS scheme can be sketched inthe following example.

(Initial phase) Given a sequence of documents d =(D1, D2, D3, D4) and a sequence of keywords w =(W1,W2,W3), document ranks for each keyword are shownin Fig 3-(a). Given (params, SK), the user encrypts eachdocument Di ∈ d with SKE to generate ciphertext collectionc = (C1, . . . , C4), and constructs the verifiable matrix Vand the ranked inverted index I as shown in Fig. 4-(a) andFig. 2-(a), respectively.

She then sets the remote auxiliary information as Φ = Vand calculates the local evidence Ψ = (ψC , ψI) as:

ψC = g∏4

i=1 P(H(i,H(Ci))) mod N,

ψI = g∏3

i=1

∏4j=1 P(H(i,V[i][j])) mod N.

Finally, she sends (params, I, c,Φ) to the CSP, and keeps(SK,Ψ) locally.

(Search phase) To retrieve the top-1 document contain-ing keyword W2, the user sends the search token TW =(Fk1(W2), Gk2(W2), Pk3(W2), 1, 3) to the CSP.

The CSP locates Ts[Fk1(W2)] = Ts[3] and recovers theaddress of the first node in As by computing 4 = Ts[3] ⊕Gk2(W2). It then computes As[4]⊕H(Pk3(W2), r1) to recov-er the first node N1 = ⟨4, 120, 3⟩, where 4 is the identifierof the first document, 120 is the score, and 3 is the addressof the next node in As. In the same way, it recovers thesecond node N2 = ⟨3, 110,−1⟩, where 3 is the identifier ofthe second document, 110 is file score, and −1 indicates D3

is the last document containing keyword W2. The CSP sortsdocuments according to their scores, and sets the searchresults as RW = (id2

W , s2W , c1W ) = ((4, 3), (120, 110), C4).



8

Fig. 2. Samples of ranked inverted indexes.

Next, it calculates the search proof Π = (πC , πI) as:πC = g

∏i=1,2,3 P(H(i,H(Ci))) mod N,

πI = g∏

i=1,2 ϕi·∏

j=2,3,4 P(H(3,V[3][j])) mod N,

where ϕi =∏4

j=1 P(H(i,V[i][j])). The messages returnedto the user are set as (RW ,Π).

Once the user receives the messages returned, she com-putes x = P(H(4,H(C4))) and checks if ψC = πx

C mod N.She then reconstructs V[3][1] = H(0, 4, 3)⊕ Sk4(W2), com-putes y = P(H(3,V[3][1])), and checks if ψI = (πI)

y

mod N. Note that, the following equations hold only whenthe search results are correct:

πxC mod N

= g∏

i=1,2,3 P(H(i,H(Ci)))·P(H(4,H(C4)))

= g∏4

i=1 P(H(i,H(Ci))) = ψC

(πI)y mod N

= g

∏i=1,2

4∏j=1

P(H(i,V[i][j]))·P(H(3,V[3][1]))·∏

j=2,3,4P(H(3,V[3][j]))

= g

4∏j=1

P(H(3,V[3][j]))∏

i=1,2

4∏j=1

P(H(i,V[i][j]))

= g

3∏i=1

4∏j=1

P(H(i,V[i][j]))

= ψI

If all these checks succeed, she can recover document D2

from ciphertext C2 ∈ RW .(Update phase) As shown in Fig. 3-(b), documentD5, the

rank-2 document for keyword W2, is added. The user gen-erates the update token T∗ = Tadd = ((5, add, C5), ζ, τv, τa)as follows: she first generates the ciphertext C5 of documentD5 with SKE, and sets ζ = (γ1, γ2, γ3), where γi is arandom string of κ bits. Then, she sets τv = β whereβ[1] = H(0, 4, 5) ⊕ Sk4(W2), β[2] = H(4, 5, 3) ⊕ Sk4(W2),β[3] = H(5, 3,0) ⊕ Sk4(W2), β[4] = H(0, 4, 3) ⊕ Sk4(W2),β[5] = H(4, 3,0) ⊕ Sk4(W2), and β[6] = 2. Further-more, she chooses a random string r3 of κ bits and setsτa = λ where λ[1] = 3, λ[2] = Gk2(W2), λ[3] = r3,λ[4] = ⟨5, 115,0⟩ ⊕ H(Pk3(W2), λ[3]), and λ[5] = σ(2) = 3.

Next, she computes y as:

y =

∏3j=1 P(H(λ[5], β[j]))∏5j=4 P(H(λ[5], β[j]))

mod φ(N)

=

∏3j=1 P(H(3,V′[3][j]))∏2j=1 P(H(3,V[3][j]))

mod φ(N),

Finally, she generates the new local evidence Ψ′ = (ψ′C , ψ

′I)

by setting ψ′C = (ψC)

P(H(5,H(C5))) mod N, and ψ′I =

ψP(H(1,γ1))·P(H(2,γ2))·yI mod N.

Given T∗, the CSP generates the updated verifiablematrix V′ and the updated ranked inverted indexes I′ as

shown in Fig. 4-(b) and Fig. 2-(b), respectively. We leave theconstruction of document deletion to the readers.

5.4 Security ProofWe first provide the definitions of the leakage functions. Thedefinitions below closely follow those in [6] except that rankinformation of each document is also exposed.• L1(d) = (#As, {ξ(W )}W∈w, {ξ(D)}D∈d, {||D||}D∈d).

#As is the size of search array, ξ is the identifier functionoutputting the identifier of a document D or keyword W ,and ||D|| is the bit length of document D.• L2(d,W ) = (ξ(W ),ACCP(W )). ACCP(W ) is the

access pattern defined as the sequence of document identi-fiers idW = (ID1, . . . , ID|dW |) where IDj is the identifier ofthe rank-j document for keyword W .• L3(d, D) = (ξ(D), {ξ(W ),apprs(W )}W∈w, ||D||)

where apprs(W ) is a bit set to 1 if W appears in at leastone document in d and to 0 otherwise.• L4(d, D) = (ξ(D)).

Theorem 1. If SKE is secure under adaptive chosen-plaintextattacks (adaptive CPA-secure), H is a collision-free hash function,and F , G, P , and S are pseudo-random, then VDERS0 satisfiesconfidentiality and verifiability under the strong RSA assump-tion in the random oracle model.

A proof is given in the appendix.

6 ADVANCED VDERS SCHEME

6.1 Main IdeaOur basic VDERS construction allows top-K queries in adynamic and verifiable way, but falls short in the followingaspects: (1) Inefficiency in deletion operation. It flags thedeleted document but keeps corresponding nodes in thesearch array As. Therefore, it has no real meaning to deletea document and limits the number of the documents thatcan be added. (2) Inefficiency in verification. It requiresto transmit the verifiable matrix V to the CSP for the gen-eration of search proofs. The complexity of V is O(nmκ),which may be unacceptable for a large-scale dataset.

To release the space of a deleted document, we define adeletion table Tj for each document Dj ∈ d as follows:

Definition 5 (Deletion table). Tj is a dictionary of size |wj |.For each keyword W ∈ wj , Tj [W ] stores addrs(N), where N isthe node in the ranked linked list LW corresponding to documentDj , and addrs(N) is N’s location in the search array As.

Tj stores the addresses in As for nodes that should bereleased if document Dj is removed. Let N−1 and N+1



9

Algorithm 4 VDERS⋆.EncryptionInput: Secret key SK, documents d, and keywords wOutPut: Ranked inverted index I and ciphertexts c⋆

1: Execute lines 1-12 of Alg. 1 and generate I = (Ts,As)2: for each document Dj ∈ d do3: Initialize Tj with an empty dictionary of size |wj |4: for each keyword W ∈ wj do5: Locate node N corresponding to Dj in LW

6: Set Tj [W ]← addrs(N)7: Set Tj ← SKE.Enc(ke,Tj)8: Set Cj ← SKE.Enc(ke, Dj) and c⋆ = c⋆ ∪ (Cj , Tj)9: return (I, c⋆)

denote nodes previous to and following N in LW , respec-tively. Given Tj , node N can be deleted from LW for eachkeyword W ∈ wj by storing N as the last node of the freelist Lfree and modifying the pointer of node N−1 to pointto node N+1. To reduce the communication cost, we encodethe verifiable matrix V into accumulated values so that theCSP can generate search proofs correctly without V.

6.2 Our ConstructionThe main differences between VDERS0 and VDERS⋆ lie inthe following algorithms.• (I, c⋆) ← Encryption(SK,d,w): As shown in Alg. 4,

the user first executes lines 1-12 of Alg. 1 and generatesthe ranked inverted index I = (Ts,As). For each documentDj ∈ d, she creates a deletion table Tj as defined in Defi-nition 5, and separately encrypts Tj and Dj with algorithmSKE.Enc. The ciphertext collection c⋆ is set as (Cj , Tj)

nj=1.

• (Φ⋆,Ψ⋆) ← AccGen(params, SK,d,w, c⋆): The userconstructs an m × n verifiable matrix V as defined inSection 4.3. For each keyword W ∈ w, she calculatesi = σ(ξ(W )) and ϕi =

∏|dW |j=1 P(H(i,V[i][j])) mod φ(N),

and sets remote auxiliary information as Φ⋆ = (ϕ1, . . . , ϕm).Note that the primary role of V is to record document rank-ing information. Since such information has been encodedinto Φ⋆, V can be removed from local storage. Finally, shesets a local evidence as Ψ⋆ = (ψ⋆

C , ψ⋆I ) where ψ⋆

C = ψC iscalculated with Eq. 2 and ψ⋆

I is calculated with Eq. 16.

ψ⋆I = g

∏mi=1 ϕi mod N. (16)

• R⋆W ← Search(I, TW ): The CSP executes lines 1-10 of

Alg. 2. Let cKW be the first K ciphertexts in cW . For top-Kqueries, it setsR⋆

W = (idW , sW , cKW ). For normal queries, itsets R⋆

W = (idW , sW , cW ).• Π⋆ ← GenProof(params, c⋆,Φ⋆, TW ,R⋆

W ): The CSPparses TW as (τ1, τ2, τ3, τ4, τ5) and sets the proof as Π⋆ =(π⋆

C , π⋆I ), where π⋆

C = πC is computed with Eq. 4 and π⋆I is

computed based on Φ⋆ = (ϕ1, . . . , ϕm):

π⋆I = g

∏mi=1,i=τ5

ϕi mod N. (17)

• {0, 1} ← Verify(params, SK,Ψ⋆, TW ,R⋆W ,Π⋆): To

verify a top-K query, for i ∈ [K], the user computes xi withEq. 6 and checks whether Eq. 7 holds. She parses TW as(τ1, τ2, τ3, τ4, τ5), and reconstructs V[τ5][1], . . . ,V[τ5][|dW |]with Eq. 1. For j ∈ [|dW |], she computes yj with Eq. 18 andchecks whether Eq. 19 holds:

yj ← P(H(τ5,V[τ5][j])). (18)

ψ⋆I = (π⋆

I )∏|dW |

j=1 yj mod N. (19)

If Eq. 7 and Eq. 19 hold, she outputs 1, and 0 otherwise. Fornormal queries, parameter K in Eq. 7 is replaced by |dW |. Ifthe output is 1, for each ciphertext Cj ∈ R⋆

W , the user runsSKE.Dec to recover the document Dj .• (T ⋆

∗ ,Ψ⋆′)← UpdToken(params, SK,D,Ψ⋆): To delete

document Dj with ξ(D) = j, the user first retrieves(Cj , Tj) from the CSP, and recovers Tj . Then, she sets theupdate token as T∗ = Tdel = ((j,delete), τv, τa), whereτv = (ν1, . . . , ν|wj |) and τa = (λ1 . . . , λ|wj |). For the i-thkeyword W ∈ wj , she computes νi and λi as follows:

1) She performs normal queries to retrieve all doc-uments containing keyword W . For node N ∈ LW

that corresponds to document Dj , the CSP will return(addrs(N−1), addrs(N+1)) together with the search resultsR⋆

W , where N−1 and N+1 are the nodes previous to andfollowing N, and addrs(N−1) and addrs(N+1) are their lo-cations in As, respectively. Then, she computes addrs(N)←Tj [W ], and sets λi = (λi[1], λi[2], λi[3], λi[4], λi[5]) as:

λi[1] = Fk1(W ), λi[2] = addrs(N−1),

λi[3] = addrs(N), λi[4] = addrs(N+1), λi[5] = σ(ξ(W )).

2) After verifying the correctness ofR⋆W , she reconstructs

V[λi[5]][1], . . . ,V[λi[5]][|dW |] with Eq. 1, and calculatesϕλi[5] =

∏|dW |j=1 P(H(λi[5],V[λi[5]][j])) mod φ(N). Then,

she determines the rank of Dj , denoted by J, and setsβi = (βi[1], βi[2], βi[3], βi[4], βi[5], βi[6]) as:

βi[1] = H(IDJ−2, IDJ−1, IDJ+1)⊕ Sk4(W ),

βi[2] = H(IDJ−1, IDJ+1, IDJ+2)⊕ Sk4(W ),

βi[3] = H(IDJ−2, IDJ−1, IDJ)⊕ Sk4(W ),

βi[4] = H(IDJ−1, IDJ, IDJ+1)⊕ Sk4(W ),

βi[5] = H(IDJ, IDJ+1, IDJ+2)⊕ Sk4(W ),

βi[6] = J,

where IDj is the identifier of the rank-j document forkeyword W before update, and ID|dW |+1 = ID0 = 0. In thespecial case, if J = 1, we set βi[1] = βi[3] = ⊥; if J = |dW |,we set βi[2] = βi[5] = ⊥. Then, she calculates νi = ϕλi[5]×ximod φ(N), where xi is calculated as follows:

xi =


mod φ(N). (20)

Next, she generates the new local evidence as Ψ⋆′=

(ψ⋆′

C , ψ⋆′

I ) with Eq. 21 and Eq. 22, respectively.

ψ⋆′

C = (ψ⋆C)

1P(H(j,H(Cj)))

mod φ(N)mod N. (21)

ψ⋆′

I = (ψ⋆I )

∏|wj |i=1 xi mod N. (22)

To add document Dn+1 with ξ(D) = n + 1, theuser sets the update token as T ⋆

∗ = Tadd = ((n +1, add, Cn+1, Tn+1), τv, τa), where Cn+1 is output ofSKE.Enc(ke, Dn+1), Tn+1 is output of SKE.Enc(ke,Tn+1)

4,τv = (ν1, . . . , ν|wn+1|) and τa = (λ1, . . . , λ|wn+1|). For the i-th keyword W ∈ wn+1, she calculates νi and λi as follows:

1) She computes λi = (λi[1], . . . , λi[5]) with Eq. 11.2) As deletion operations, she reconstructs V[λi[5]][1],

. . . ,V[λi[5]][|dW |] with Eq. 1, and calculates ϕλi[5] =

4. To construct Tn+1, the user can either ask the CSP for the locationsof free nodes or locally store the locations of all free nodes.



10∏|dW |j=1 P(H(λi[5],V[λi[5]][j])) mod φ(N). Then, she cal-

culates xi with Eq. 23 and sets νi = ϕλi[5] × xi mod φ(N):

xi =


mod φ(N), (23)

where βi = (βi[1], . . . , βi[6]) is calculated with Eq. 12. Next,she generates the new local evidence as Ψ⋆′

= (ψ⋆′

C , ψ⋆′

I )with Eq. 13 and Eq. 22, respectively. In Eq. 22, where |wj | isreplaced by |wn+1|.• (I′, c⋆

′,Φ⋆′

) ← Update(I, c⋆,Φ⋆, T ⋆∗ ): If the update

token T ⋆∗ = Tdel = ((j,delete), τv, τa), the CSP gener-

ates c⋆′

by removing (Tj , Cj) from c⋆. It generates Φ⋆′

by replacing ϕλi[5] with νi for i ∈ [|wj |]. To generateI′ = (T′

s,A′s), for the i-th keyword W ∈ wj , it parses

λi = (λi[1], λi[2], λi[3], λi[4], λi[5]) and performs as follows:

1) It computes ϖ ← Ts[free] and finds the last freelocation ϖ in As.

2) It updates the free list to point to node N by settingTs[free]← λi[3].

3) It releases the location of N by setting As[λi[3]] ←(⟨0,0, ϖ⟩,0).

4) If λi[2] = −1, let As[λi[2]] = (v−1, r−1). It up-dates N−1’s next pointer by setting As[λi[2]] ←((0,0, λi[3]⊕ λi[4]⊕ v−1), r−1).

5) If N is the first node of LW , it sets Ts[λi[1]]← λi[4].

If T∗ = Tadd = ((n + 1, add, Cn+1, Tn+1), τv, τa), it gen-erates the updated ranked inverted index I′ in the sameway as VDERS0.Update. Next, it generates c⋆

′by adding

(Cn+1, Tn+1) into c⋆ and generates Φ⋆′by replacing ϕλi[5]

with νi for i ∈ [|wn+1|].Discussion. In both VDERS0 and VDERS⋆, a user needs

to search a set of keywords before generating an update to-ken. This will incur additional costs in the update phase. Tosolve this problem, we can adopt the lazy update mechanis-m [23], which allows to postpone an update operation untila search operation happens. Specifically, to delete documentDj in VDERS⋆, the user sends Tdel = (j,del) to the CSP,which will send back Tj and remove (Cj , Tj). Once the usersearches any keyword W ∈ wj , she can obtain related in-formation for generating (λi, νi, xi). With those information,the CSP can update (I,Ψ,Φ) accordingly. After searching allkeywords in wj , she completes all components of the dele-tion token. Similarly, to add document Dn+1 in VDERS⋆,the user keeps the document identifier and related score lo-cally and sends Tadd = ((n+1, add, Cn+1, Tn+1), τa)) to theCSP, which updates c⋆ and I. Once searched any keywordin wn+1, she can obtain related information for modifyingΨ and Φ accordingly. After searching all keywords in wn+1,she completes all components of the add token.

6.3 Security ProofWe assume that the size of deletion table Tj is the sameas that of document Dj . Therefore, VDERS⋆ has the sameleakages as VDERS0 except that more information is leakedin the deletion operation. We provide the definition forL4(d, D) as follows:• L4(d, D) = (ξ(D), {addrs(N−1), addrs(N), addrs(N+1),

ξ(W )}W∈wξ(D)) where (addrs(N−1), addrs(N), addrs(N+1))

is the information leaked by the deletion token.

TABLE 3Comparison of computation cost

Server Initial Search Add DeleteBASELINE1 O(1) O(n ·m) O(1) O(n)

VDERS0 O(1) O(n ·m) O(1) O(1)VDERS⋆ O(1) O(n+m) O(1) O(1)User Initial Search Add DeleteBASELINE1 O(n ·m) O(n+ |dW |) O(m) O(1)

VDERS0 O(n ·m) O(K) O(m) O(1)VDERS⋆ O(n+m· O(K + |dW |) O(|wj |· O(|wj |·

|dW |max) |dW |max) |dW |max)

Encryption AccGen(a) Communication cost

0

2000

4000

6000

8000

Siz

e(M

B)

VEDRS*

VEDRS0

BASELINE1BASELINE2

Encryption AccGen(b) Execution time

0

200

400

600

800

1000

Tim

e(s)

VEDRS*

VEDRS0

BASELINE1BASELINE2

Fig. 5. Comparison in initial phase.

Theorem 2. If SKE is adaptive CPA-secure, H is a collision-free hash function, and F , G, P , and S are pseudo-random,then VDERS⋆ satisfies confidentiality and verifiability under thestrong RSA assumption in the random oracle model.

A proof is given in the appendix.

7 EVALUATION

7.1 Performance AnalysisWe compare the performance of VDERS0, VDERS⋆, andthe two schemes proposed in [16] and [6] (denoted byBASELINE1 and BASELINE2, respectively) in terms of com-putational and communication complexities. For computa-tion cost, we only consider the most expensive operations,i.e., the calculation of RSA accumulators, and we analyzethe incurred cost from the server side and the user side,respectively. Note that BASELINE2 is not verifiable, andthus, its cost is 0; it will not be listed. Given a collectionof n documents with m keywords, the comparison resultsare shown in Table 3, where K is the parameter for a top-K search, |dW | is the number of documents containingkeyword W , and |wj | is the total number of keywordscontained in document Dj .

For communication cost, all schemes encrypt documentswith SKE, and the sizes of the ciphertexts are the same. Fur-thermore, RSA accumulator generates a constant-size searchproof. Therefore, we only consider the sizes of a secureindex, a verifiable matrix, and a search/update token. Thecomparison results are shown in Table 4, where Ξ =

∑|dW |

is the total number of keyword-document pairs.

TABLE 4Comparison of communication cost

BASELINE1 BASELINE2 VDERS0 VDERS⋆

Initial O(n ·m) O((Ξ +m) · κ) O((Ξ + n ·m) · κ) O((Ξ +m) · κ)Search O(n) O(κ) O(κ) O(κ)Add O(m) O(|wj | · κ) O((m+ |wj |)) · κ O(|wj | · κ)

Delete O(1) O(κ) O(1) O(|wj | · κ)

7.2 Parameter SettingExperiments were conducted on a local machine runningthe Microsoft Windows 10 Ultimate operating system withan Inter Core i5 CPU running at 3.4GHz and 8GB memory.



11

0.5 1 1.5 2 2.5 3n 104

0

0.1

0.2

0.3

0.4

0.5T

ime(

ms)

VEDRS*

VEDRS0

BASELINE1BASELINE2

(a) Search.

0.5 1 1.5 2 2.5 3n 104

0

2

4

6

Tim

e(m

s)

105

VEDRS*

VEDRS0

BASELINE1

(b) GenProof.

10 20 30 40 50top-K

0

0.1

0.2

0.3

0.4

Tim

e(m

s)

VEDRS*

VEDRS0

BASELINE1BASELINE2

(c) Search.

10 20 30 40 50top-K

0

2

4

6

8

Tim

e(m

s)

105

VEDRS*

VEDRS0

BASELINE1

(d) GenProof.

Fig. 6. Execution time in the search phase (server side). Given fixed K = 20, the execution time of (a) and (b) changes over the number ofdocuments n; Given fixed n = 30, 000, the execution time of (c) and (d) changes over the number of returned K documents.

0.5 1 1.5 2 2.5 3n 104

02468

101214161820

Tim

e(s)

VEDRS*

VEDRS0

BASELINE1

(a) Verify.

10 20 30 40 50top-K

5

10

15

Tim

e(s)

VEDRS*

VEDRS0

BASELINE1

(b) Verify.

Fig. 7. Execution time in the search phase (user side). Given fixed K =20, the time of (a) changes over the number of documents n; Givenfixed n = 30, 000, the time of (b) changes over parameter K.

10 20 30 40|w

j|

0

100

200

300

Siz

e(K

B)

VEDRS*

VEDRS0

BASELINE1BASELINE2

(a) Communication cost (add).

10 20 30 40|w

j|

1

1.5

2

2.5

3

3.5

4

Siz

e(K

B)

VEDRS*

BASELINE2

(b) Communication cost (delete).

Fig. 8. Communication cost in the update phase. Given fixed n =30, 000, the communication costs of (a) and (b) change over the numberof keywords in a document |wj |.

10 20 30 40|w

j|

0

1000

2000

3000

Tim

e(m

s)

VEDRS*

VEDRS0

BASELINE1BASELINE2

(a) UpdToken (add).

10 20 30 40|w

j|

0

1

2

3

4

5

6

Tim

e(m

s)

VEDRS*

VEDRS0

BASELINE1BASELINE2

(b) Update (add).

10 20 30 40|w

j|

0

500

1000

1500

2000

2500

Tim

e(m

s)

VEDRS*

BASELINE2

(c) UpdToken (delete).

10 20 30 40|w

j|

0

0.5

1

1.5

2

2.5

3

Tim

e(m

s)

VEDRS*

BASELINE2

(d) Update (delete).

Fig. 9. Execution time in the update phase. Given fixed n = 30, 000, the execution time of (a)-(d) changes over the number of keywords in adocument |wj |. (a) and (c) show the computation time at the user side, and (b) and (d) show the computation time at the server side.

The programs were implemented in Java and compiledusing Eclipse 4.6.0. The cryptographic algorithms were im-plemented with JPBC library.

To validate the effectiveness and efficiency of the VDERSscheme in practice, we conducted a performance evaluationon a real e-mail data set, the Enron Email Data Set5. Thisdata set has 30,000 plaintext e-mails with a total size of about71.2MB. The average size of each document is 2.43KB. Wechose 1 ∼ 3 keywords for each document after ranking themby frequency of occurrence, and got 7,922 keywords finally.

7.3 Experiment ResultsWe compared our VDERS constructions with BASELINE1and BASELINE2 in terms of the communication cost and theexecution time in the initial phase, search phase, and updatephase. Since BASELINE2 is unverifiable, it lacks algorithmsAccGen, GenProof, and Verify. Furthermore, SKE is em-ployed to encrypt document contents in all schemes, andthe computation and communication costs of generating

5. http://www.cs.cmu.edu/∼./enron/

a search token are very small. Therefore, we left out thetime of encrypting documents in algorithm Encrption, andomitted the comparison of algorithm SrcToken.

Fig. 5 illustrates the comparison results in the initialphase. For the Encryption algorithm, BASELINE1 encryptsa string of length n for every keyword, and thus incursthe most computation and communication costs. VDERS0

and VDERS⋆ encrypt a ranked inverted index in the sameway. Since VDERS⋆ needs to encrypt a deletion table Ti

in addition, and thus it incurs more computation andcommunication costs compared with VDERS0. BASELINE2encrypts an inverted index which includes a deletion tableand a deletion array in addition, and thus its costs exceedVDERS0. For the AccGen algorithm, VDERS0 is the mostexpensive in both computation and communication costsbecause of the generation and transmission of a verifiablematrix (m× n× κ bits in size); VDERS⋆ incurs the minimalcomputation cost since the number of RSA accumulator cal-culations is reduced to |dW | for each keyword; BASELINE1incurs the minimal communication cost since there is no



12

need to transmit auxiliary information to the CSP. Further-more, the communication cost of VDERS⋆ in the AccGenalgorithm is also largely reduced since it only needs totransmit m integers to the CSP.

Fig. 6 shows the execution time at the server side in thesearch phase. In Fig. 6-(a), as the number of documentsn grows, BASELINE1 needs to decrypt longer strings andreturn more documents, rendering the execution time togrow linearly; The search time in our VDERS construction-s and BASELINE2 is impacted by the number of docu-ments matching a keyword, and thus increases smoothly.In Fig. 6-(b), the execution time of algorithm GenProofin both VDERS0 and BASELINE is mainly impacted bythe number of documents n. However, in our advancedGenProof algorithm, the computation of Eq. 17 is based onΦ⋆, and the execution time increases slightly as n increases.For example, the execution time in VDERS0 and VDERS⋆

increases from 68s to 582s, and from 4s to 18s, respectively,as n increases from 5,000 to 30,000. From Fig. 6-(c)(d),we know that the execution time of algorithm Search isslightly impacted by parameter K , but the execution time ofalgorithm GenProof decreases smoothly as K increases. Forexample, as K increases from 10 to 50, the execution timeof algorithm Search in VDERS0 fluctuates around 0.08ms,and the execution time of algorithm GenProof in VDERS⋆

decreases from 20s to 12s.Fig. 7 shows the execution time at the user side in the

search phase. From this figure, we know that our VDER-S constructions have better performance compared withBASELINE1. To verify the search results, the number ofRSA accumulator calculations in BASELINE1, VDERS0, andVDERS⋆ is mainly impacted by |dW |+n, K, and |dW |+K ,respectively. Therefore, the execution time of BASELINE1in Fig. 7-(a) grows linearly as n increases, but it remainsunchanged in Fig. 7-(b) as K increases. On the contrary, theexecution time of VDERS⋆ in Fig. 7-(a) changes slightly andVDERS0 just fluctuates around a fixed value as n increases,but they both grow linearly in Fig. 7-(b) as K increases.

Fig. 8 shows the communication costs in the updatephase. In terms of adding a document, VDERS0 incursthe largest communication cost because it needs to uploadm extra random numbers to the CSP. BASELINE1 incursthe minimal cost because it only needs to upload m bits.From Fig. 8-(a), we know that the cost of BASELINE2 growslinearly, the costs of our VDERS constructions grow slightly,but the cost of BASELINE1 remains unchanged, as |wj |increases. For example, as |wj | increases from 10 to 40,the cost of BASELINE1 remains as 8KB, but the costs ofBASELINE2, VDERS0, and VDERS⋆, increase from 43KB to170KB, from 259KB to 269KB, and from 6KB to 12KB, respec-tively. In terms of deleting a document, the communicationcosts of BASELINE1 and VDERS⋆ are negligibly small, andwe only consider VDERS⋆ and BASELINE2. From Fig. 8-(b), we know that the cost of BASELINE2 is independent of|wj |, but the cost of VDERS⋆ badly grows with |wj |.

Fig. 9 shows the computation time in the update phase.In Fig. 9-(a), BASELINE1 needs to compute m times of RSAaccumulators, and thus it incurs the maximum executiontime. In Fig. 9-(b), VDERS0 incurs the maximum executiontime since it needs to update both the ranked inverted indexand the verifiable matrix. In terms of deleting a document,

we still only consider VDERS⋆ and BASELINE2. In Fig. 9-(c), the time of BASELINE2 is independent of |wj |, but thetime of VDERS⋆ grows with |wj |. In Fig. 9-(d), both thetime of VDERS⋆ and BASELINE2 grows as |wj | increases.Furthermore, BASELINE2 with a more complicated indexstructure incurs more execution time than VDERS⋆.

8 CONCLUSION

In this paper, we design a VDERS scheme to simultane-ously support sublinear search time, efficient update andverification, and on-demand information retrieval in cloudcomputing environment. Experiment results demonstratethat our scheme is effective for verifying the correctnessof top-K searches on a dynamic document collection. Aspart of our future work, we will try to design a forwardsecure VDERS scheme, where the update phase does notleak keyword information about a newly added document.

APPENDIX APROOF OF CONFIDENTIALITY

The confidentiality proof of VDERS⋆ is similar to that ofVDERS0. The main differences from VDERS0 are markedas the boxed code in VDERS⋆.

Proof: We describe a PPT simulator Sim such thatfor all PPT adversaries Adv, the outputs of RealAdv(κ)and IdealAdv,Sim(κ) are computationally indistinguishable.Consider Sim that adaptively simulates an encrypted indexI = (As,Ts), a set of ciphertexts c, and t ∈ N tokensT1, . . . , Tt as follows:

(Setting up internal data structures) Given L1, Simsets up its internal data structures (iTs,T′

s, iAs,A′s, RO),

where iTs and T′s are empty dictionaries of size m+1, iAs

and A′s are empty arrays of size ||c||/8 + z, and RO is an

empty dictionary. Let γs be a bijection mapping search keysin iTs to those in T′

s. Then, Sim generates k′e by runningSKE.Gen(1κ) and outputs (N = pq, g, f). It chooses arandom κ-bit search key σ, and for i ∈ [m], it chooses κ-bit keys ki randomly. It will send params′ = (N, g, f) toAdv and keep SK ′ = (k′e, p, q, σ, {ki}i∈[m]) secret.

(Simulating As) It generates an empty array As of size||c||/8+z and fills ||c||/8 random entries of As with randomstrings of the form ⟨N, r⟩ where ||N|| = log#As + 1 + 2 · κand ||r|| = κ. It then generates a copy A′

s of As and marksall the empty entries in As as free in iAs.

(Simulating Ts) It generates a dictionary Ts of sizem+1and stores a (log#As + 1)-bit random string in each entryof Ts under key σ. It then generates a copy T′

s of Ts.(Simulating c) It simulates ciphertexts Ci = SKE.Enc

(k′e,0||Di||) and Ti = SKE.Enc(k′e,0

||Di||) , for i ∈ [n].(Simulating search token) Given L2, Sim simulates

search token TW as follows:Case 1: ξ(W ) has not appeared in any previous leakage. Sim

chooses |dW | unused and non-free cells in iAs at randomand marks them with ξ(W ). It then chains these entries as alist by setting the i-th cell Ni as ⟨idi, εi, addrs(Ni+1)⟩, whereidi is the identifier of the i-th document in ACCP(W ),εi is the score of the i-th document for keyword W , andaddrs(Ni+1) points to the cell that holds idi+1. If i = |dW |,addrs(Ni+1) = ⊥. Let α1 be the address of list head. It setsiTs[ξ(W )]← α1.



13

Case 2: ξ(W ) has appeared in previous leakage. IfiTs[ξ(W )] = α1 = ⊥ where α1 is a location in iAs thatholds the head of a list marked with ξ(W ), this means thatkeyword W has been searched for in the past and appearsin certain documents. In this case, Sim searches iAs for theentry marked with ξ(W ), and augments the list to length|dW | as Case 1. If iTs[ξ(W )] = ⊥, this means that keywordW has been searched for in the past but is not in anydocument. In this case, Sim sets α1 = ⊥.

It returns TW = (γs(ξ(W )),T′s[γs(ξ(W ))]⊕ α1, kξ(W )).

(Simulating the add token) Given L3, Sim simulates anadd token Tadd as follows:

For a new document D, Sim computes Cξ(D) =

SKE.Enc(k′e,0||D||) and Tξ(D) = SKE.Enc(k′e,0

||D||) . Forall W ∈ wξ(D), if iTs[ξ(W )] = ⊥ and apprs(W ) = 1, itchooses an unused and non-free location α1 in iAs, marksit with ξ(W ), and corrects its internal data structures bystoring ⟨ξ(D), ε,⊥⟩ in α1 and setting iTs[ξ(W )] ← α1,where ε is the score of document D for keyword W .

It then returns the token Tadd = ((ξ(D), add, Cξ(D)), ζ,τv, τa), where ζ = (γ1, . . . , γm) is a sequence of κ-bit random strings, τv = (β1, . . . , β|wξ(D)|) and τa =(λ1, . . . , λ|wξ(D)|). For i ∈ [|wξ(D)|], βi[1], . . . , βi[6] are set toκ-bit random strings, and λi = (γs(ξ(W )),T′

s[γs(ξ(W ))])⊕iTs[ξ(W )], ri, ui, li) where ri is a κ-bit random string, ui isa (2κ + log#As + 1)-bit random string, and li ∈ [m] isdetermined by W and key σ.

It then returns the token Tadd = (ξ(D), add, Cξ(D),Tξ(D)), τv, τa)) where τv = (ν1, . . . , ν|wξ(D)|) and τa isdefined in the same way as above. For i ∈ [|wξ(D)|], νiis set to a random integer sampling from [φ(N)].

Let ϖ and ϖ− be the locations of the last and second-to-last nodes in the free list of iAs. For the i-th keyword W ∈wξ(D), Sim updates its internal data structures as follows:

1) It sets A′s[ϖ] ← (ui ⊕ ⟨0,0, α1⟩, ri), where α1 is the

first element in iTs[ξ(W )].2) It stores ⟨ξ(D), ε, α1⟩ at location ϖ in iAs;3) It sets iTs[ξ(W )]← ϖ and marksϖ in iAs with ξ(W ).(Simulating delete token) Given L4, Sim returns Tdel =

(ξ(D),del, C ′ξ(D)) where C ′

ξ(D) = SKE.Enc(k′e,0).

Given L4, Sim returns Tdel = ((ξ(D),del), τv, τa),where τv = (ν1, . . . , ν|wξ(D)|) and τa =(λ1, . . . , λ|wξ(D)|). For i ∈ [|wξ(D)|], νi is set to arandom integer sampling from [φ(N)], and λi =(γs(ξ(W )), addrs(N−1), addrs(N), addrs(N+1), li),where li ∈ [m] is determined by W and σ.

Next, Sim updates its internal data structures asfollows: For i ∈ |wξ(D)|, it releases corresponding cellin iAs[addrs], and updates its precursor node to pointto its successor node. If the freed node is the head of alist, then it updates the relevant pointer in iTs to pointto that node’s successor node. Then, it merges the newlyfreed nodes in iAs into the free list.

(Answering H queries) given query (k, r), the simulatorchecks if k has been associated with some word identifierξ(W ), i.e., if k = kξ(W ) for some ξ(W ). If not, it returns a

random (2κ+log#As+1)-bit string v and setsRO[⟨k, r⟩] =v to stay consistent in future queries.

If k = kξ(W ), it finds all entries in iAs marked with ξ(W )and checks if any of their corresponding cells in A′

s storesthe randomness r. If such a cell does exist, the simulatorreturns v ⊕ iAs[l], where l is the location in A′

s of that celland v is such that (v, r) = A′

s[l]. If not, it returns and storesa random value v in RO.

Since SKE is adaptive CPA-secure, the ciphertexts areindistinguishable. Furthermore, because of the pseudo-randomness of the PRFs, the simulated indexes and tokensare indistinguishable from the real ones. Therefore, ourVDERS scheme satisfies confidentiality.

APPENDIX BPROOF OF VERIFIABILITY

We next show that the probability for the adversary Advto win the game of VDERS0 is negligible. Note that theverifiability of VDERS⋆ can be proven in the same way.

Proof: For an add/delete token, the user receivesnothing from the CSP (= Adv). Therefore, she alwaysupdates Ψ = (ψC , ψI) correctly. In the search phase, theuser sends TW to the CSP, which returns (RW ,Π).

First suppose that Adv returns (RW ,Π) correctly. Theuser outputs 1 and accepts the search results. Next supposethat Adv returns an invalid (RW ,Π) = (RW ,Π). We willshow that Eq. 7 and Eq. 9 hold with negligible probability.

(Case 1) RW = RW and Π = Π. In this case, the usercomputes {xi}i∈[K] (Eq. 6) and {yj}j∈[K] (Eq. 8) correctly.Neither Eq. 7 nor Eq. 9 holds because (πC , πI) = (πC , πI).

(Case 2) RW = RW . If the user does not compute{yj}j∈[K] correctly, then from Proposition 2 we can seethat Eq. 9 does not hold. Otherwise, the user reconstructsthe verifiable matrix V correctly. This means that thereexist some (i, Ci) ∈ RW and (i, Ci) ∈ RW such thatCi = Ci. For such i, H(i,H(Ci)) = H(i,H(Ci)) due tothe collision-resistant property of hash function H . GivenP(H(i,H(Ci))) = P(H(i,H(Ci))), Proposition 2 showsthat Eq. 7 does not hold. In both cases, the user outputs 0and rejects the search results. Therefore, our VDERS schemesatisfies verifiability.

ACKNOWLEDGMENTS

This work was supported in part by NSFC grants 61632009and 61872133; and NSF grants CNS 1757533, CNS1629746,CNS 1564128, and IIP 1439672.

REFERENCES

[1] G. S. Poh, J.-J. Chin, W.-C. Yau, K.-K. R. Choo, and M. S. Mohamad,“Searchable symmetric encryption: designs and challenges,” ACMComputing Surveys (CSUR), 2017.

[2] Q. Liu, X. Nie, X. Liu, T. Peng, and J. Wu, “Verifiable ranked searchover dynamic encrypted data in cloud computing,” in Proc. ofIWQoS, 2017.

[3] J. Camenisch and A. Lysyanskaya, “Dynamic accumulators andapplication to efficient revocation of anonymous credentials,” inProc. of CRYPTO, 2002.

[4] D. X. Song, D. Wagner, and A. Perrig, “Practical techniques forsearches on encrypted data,” in Proc. of IEEE S&P, 2000.

[5] R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchablesymmetric encryption: improved definitions and efficient con-structions,” in Proc. of the ACM CCS, 2006.

[6] S. Kamara, C. Papamanthou, and T. Roeder, “Dynamic searchablesymmetric encryption,” in Proc. of ACM CCS, 2012.



14

[7] S. Kamara and C. Papamanthou, “Parallel and dynamic searchablesymmetric encryption,” in Proc. of FC, 2013.

[8] D. Cash, J. Jaeger, S. Jarecki, C. Jutla, H. Krawczyk, M. Rosu,and M. Steiner, “Dynamic searchable encryption in very-largedatabases: data structures and implementation,” in Proc. of NDSS,2014.

[9] M. Naveed, M. Prabhakaran, and C. A. Gunter, “Dynamic search-able encryption via blind storage,” in Proc. of S&P, 2014.

[10] Y. Zhang, J. Katz, and C. Papamanthou, “All your queries arebelong to us: The power of file-injection attacks on searchableencryption,” in Proc. of USENIX Security Symposium, 2016.

[11] E. Stefanov, C. Papamanthou, and E. Shi, “Practical dynamicsearchable encryption with small leakage,” in Proc. of NDSS, 2014.

[12] R. Bost, Σoϕoς : Forward secure searchable encryption, in Proc. ofCCS, 2016.

[13] X. Song, C. Dong, D. Yuan, Q. Xu and M. Zhao, “Forward privatesearchable symmetric encryption with optimized I/O efficiency,”in IEEE Transactions on Dependable and Secure Computing, 2018.

[14] K. Kurosawa and Y. Ohtaki, “Uc-secure searchable symmetricencryption,” in Proc. of FC, 2012.

[15] Soleimanian, Azam,and S. Khazaei, “Publicly verifiable searchablesymmetric encryption based on efficient cryptographic compo-nents,” in Designs, Codes and Cryptography, 2019.

[16] K. Kurosawa and Y. Ohtaki, “How to update documents verifiablyin searchable symmetric encryption,” in Proc. of CNS, 2013.

[17] W. Sun, X. Liu, W. Lou, Y. T. Hou, and H. Li, “Catch you if you lieto me: efficient verifiable conjunctive keyword search over largedynamic encrypted cloud data,” in Proc. of IEEE INFOCOM, 2015.

[18] S. Jiang, X. Zhu, L. Guo, and J. Liu, “Publicly verifiable booleanquery over outsourced encrypted data,” in IEEE Transactions onCloud Computing, 2017.

[19] J. Zhu, Q. Li, C. Wang, X. Yuan, Q. Wang and K. Ren, “Enablinggeneric, verifiable, and secure data search in cloud services,” inIEEE Transactions on Parallel and Distributed Systems, 2018.

[20] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, “Privacy-preservingmulti-keyword ranked search over encrypted cloud data,” IEEETransactions on Parallel and Distributed Systems, 2014.

[21] W. K. Wong, D. W.-l. Cheung, B. Kao, and N. Mamoulis, “SecurekNN computation on encrypted databases,” in Proc, of ACMSIGMOD, 2009.

[22] W. Sun, B. Wang, N. Cao, M. Li, W. Lou, Y. T. Hou, and H. Li,“Verifiable privacy-preserving multi-keyword text search in thecloud supporting similarity-based ranking,” IEEE Transactions onParallel and Distributed Systems, 2014.

[23] C. Chen, X. Zhu, P. Shen, J. Hu, S. Guo, Z. Tari, and A. Y. Zomaya,“An efficient privacy-preserving ranked keyword search method,”IEEE Transactions on Parallel and Distributed Systems, 2016.

[24] W. Zhang, Y. Lin, S. Xiao, J. Wu, and S. Zhou, “Privacy preservingranked multi-keyword search for multiple data owners in cloudcomputing,” IEEE Transactions on Computers, 2016.

[25] Z. Fu, K. Ren, J. Shu, X. Sun, and F. Hua,“Enabling personalizedsearch over encrypted outsourced data with efficiency improve-ment,” IEEE transactions on parallel and distributed systems, 2016.

[26] C. Guo, X. Chen, Y. Jie, Z. Fu, M. Li, and B. Feng, “Dynamicmulti-phrase ranked search over encrypted data with symmetricsearchable encryption,” in IEEE Transactions on Services Computing,2017.

[27] F. Kerschbaum, “Frequency-hiding order-preserving encryption,”in Proc. of CCS, 2015.

[28] D. S. Roche, D. Apon, S. G. Choi, and A. Yerukhimovich, “Pope:Partial order preserving encoding,” in Proc. of CCS, 2016.

[29] R. Gennaro, S. Halevi, and T. Rabin, “Secure hash-and-sign sig-natures without the random oracle,” in Stern J. (eds) Advances inCryptology, 1999.

[30] N. Baric, B. Pfitzmann, “Collision-free accumulators and fail-stopsignatrue schemes without trees,” in Fumy W. (eds) Advances inCryptology, 1997.

Qin Liu received her B.Sc. in Computer Sci-ence in 2004 from Hunan Normal University,China, received her M.Sc. in Computer Sciencein 2007, and received her Ph.D. in ComputerScience in 2012 from Central South Universi-ty, China. She has been a Visiting Student atTemple University, USA. Her research interestsinclude security and privacy issues in cloud com-puting. Now, she is an Associate Professor inthe College of Computer Science and ElectronicEngineering at Hunan University, China.

Yue Tian received her B.Sc. in E-Commercein 2017 from Nanchang University, Nanchang,Jiangxi Province, China. Now, she is a postgrad-uate student majoring in Computer Science atthe College of Computer Science and ElectronicEngineering, Hunan University, Changsha, Hu-nan Province, China. Her research interests in-clude security and privacy issues in cloud com-puting. She is a Student Member of IEEE.

Jie Wu is the Chair and a Laura H. Carnell Pro-fessor in the Department of Computer and Infor-mation Sciences at Temple University, Philadel-phia, PA, USA. Prior to joining Temple Univer-sity, he was a Program Director at the NationalScience Foundation and a Distinguished Pro-fessor at Florida Atlantic University. His currentresearch interests include mobile computing andwireless networks, routing protocols, cloud andgreen computing, network trust and security, andsocial network applications. Dr. Wu has regularly

published in scholarly journals, conference proceedings, and books. Heserves on several editorial boards, including IEEE TRANSACTIONSON SERVICE COMPUTING, and Journal of Parallel and DistributedComputing. Dr. Wu was general co-chair/chair for IEEE MASS 2006,IEEE IPDPS 2008, IEEE ICDCS 2013, and ACM MobiHoc 2014, aswell as program co-chair for IEEE INFOCOM 2011 and CCF CNCC2013. He was an IEEE Computer Society Distinguished Visitor, ACMDistinguished Speaker, and chair for the IEEE Technical Committee onDistributed Processing (TCDP). Dr. Wu is a CCF Distinguished Speakerand a Fellow of the IEEE. He is the recipient of the 2011 China ComputerFederation (CCF) Overseas Outstanding Achievement Award.

Tao Peng received the B.Sc. in Computer Sci-ence from Xiangtan University China, in 2004,the M.Sc. in Circuits and Systems from HunanNormal University, China, in 2007, and the Ph.D.in Computer Science from Central South Univer-sity, China, in 2017. Now, she is an AssistantProfessor of School of Computer Science andCyber Engineering, Guangzhou University, Chi-na. Her research interests include network andinformation security issues.

Guojun Wang received his B.S. in Geophysics,in 1992, M.S. in Computer Science, in 1996,and Ph.D. in Computer Science, in 2002, fromCentral South University, China. He is the PearlRiver Scholar Professor of Guangdong Provinceat School of Computer Science and Cyber En-gineering, Guangzhou University, China. He hasbeen an Adjunct Professor at Temple Universi-ty, USA; a Visiting Scholar at Florida AtlanticUniversity, USA; a Visiting Researcher at theUniversity of Aizu, Japan; and a Research Fellow

at the Hong Kong Polytechnic University. His research interests includenetwork and information security, Internet of things, and cloud comput-ing. He is a distinguished member of CCF, and a member of IEEE, ACM,and IEICE.

Documents

Enabling Verifiable and Dynamic Ranked Search Over ...ieee-trustcom.org/faculty/~csgjwang/papers/QinLiu-IEEETSC2019.pdf · Enabling Veriﬁable and Dynamic Ranked Search Over Outsourced