Secure Distributed De-duplication System with Improved Reliability in Cloud Computing


International Journal of Advance Foundation and Research in Computer (IJAFRC)

Volume 3, Issue 1, January 2016. ISSN 2348-4853, Impact Factor 1.317

© 2016, IJAFRC All Rights Reserved. www.ijfarc.org

Secure Distributed De-duplication System with Improved Reliability in Cloud Computing

Mr. Sagar G Khengat, Mr. Swapnil S Belorkar, Mr. Alok Y Shukla, Prof. Nilesh B Madke

Dept. of Computer Engineering, ISB & M School Of Technology, Nande Village, Tal. Mulashi, Pune, Savitribai Phule Pune University.

[email protected], [email protected], [email protected], [email protected]

ABSTRACT

De-duplication is a technique for eliminating duplicate copies of data, and it has been widely used in cloud storage to reduce storage space and upload bandwidth. However, only one copy of each file is stored in the cloud, even if such a file is owned by a huge number of users. As a result, a de-duplication system improves storage utilization while reducing reliability. Furthermore, the challenge of privacy for sensitive data also arises when it is outsourced by users to the cloud. Aiming to address these security challenges, this paper makes the first attempt to formalize the notion of a distributed reliable de-duplication system. We propose new distributed de-duplication systems with higher reliability, in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous de-duplication systems. Security analysis demonstrates that our systems are secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement the proposed systems and demonstrate that the incurred overhead is very limited in realistic environments.

Index Terms: cloud security, de-duplication, distributed system, proof of ownership, file-level, block-level

    I. INTRODUCTION

In this setting, it is desirable to audit whether the storage service provider (SSP) meets its contractual obligations. SSPs have many motivations to fail these obligations: for example, an SSP may try to conceal data loss incidents in order to preserve its reputation, or it may discard data that is seldom accessed so that it can resell the same storage. Remote data checking (RDC) allows an auditor to challenge a server to provide a proof of data possession, in order to validate that the server holds the data that was originally stored by a client; we say that an RDC scheme seeks to provide a data possession guarantee. Archival storage presents exceptional performance demands: because file data is large and stored at remote sites, accessing an entire file is expensive in I/O costs to the storage server and in transmitting the file over a network. Reading an entire archive, even periodically, greatly limits the scalability of networked stores. Moreover, the I/O incurred to establish data possession interferes with the on-demand bandwidth needed to store and retrieve data.

We conclude that clients should be able to verify that a server has retained file data without retrieving the data from the server and without having the server access the entire file. A scheme for auditing remote data should be both lightweight and robust. Lightweight means that it does not unduly burden the SSP; this covers both the overhead (i.e., computation and I/O) at the SSP and the communication between the SSP and the auditor. This goal can be achieved by relying on spot checking, in which the auditor randomly samples small portions of the data and checks their integrity, thus minimizing the I/O at the SSP.


Spot checking lets the client detect whether a sizable fraction of the data stored at the server has been corrupted, but it cannot detect corruption of small parts of the data (e.g., 1 byte). Robust means that the auditing scheme incorporates mechanisms for mitigating arbitrary amounts of data corruption. Protecting against large corruptions ensures that the SSP has committed the contracted storage resources: little space can be reclaimed undetectably, making it unattractive to delete data to save on storage costs or to sell the same storage multiple times. Protecting against small corruptions protects the data itself, not just the storage resource. Much data has value well beyond its storage costs, making attacks that corrupt small amounts of data practical; for example, modifying a single bit may destroy an encrypted file or invalidate authentication information.
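To make the spot-checking idea concrete, the following Python sketch shows an auditor verifying a random sample of blocks against per-block MAC tags. The names (`BLOCK_SIZE`, `make_block_tags`, `spot_check`) and the use of HMAC-SHA256 tags are illustrative assumptions, not the PDP construction discussed later in this paper.

```python
import hashlib
import hmac
import random

BLOCK_SIZE = 4096  # illustrative block size, not taken from the paper

def make_block_tags(data: bytes, key: bytes) -> list[bytes]:
    """Client side: compute a keyed MAC tag for each block before upload."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [hmac.new(key, b, hashlib.sha256).digest() for b in blocks]

def spot_check(stored_blocks: list[bytes], tags: list[bytes],
               key: bytes, samples: int = 10) -> bool:
    """Auditor side: re-verify only a random sample of blocks
    instead of reading the whole file."""
    indices = random.sample(range(len(stored_blocks)),
                            min(samples, len(stored_blocks)))
    for i in indices:
        expected = hmac.new(key, stored_blocks[i], hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tags[i]):
            return False  # a sampled block was corrupted or replaced
    return True
```

Because only `samples` blocks are touched per challenge, the I/O at the server stays small; the flip side is that a corruption confined to a single unsampled byte goes unnoticed, which is exactly the gap that robust auditing closes with error-correcting codes.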

    II. PROPOSED SYSTEM

Input: Though the de-duplication technique saves storage space for cloud storage service providers, it reduces the reliability of the system. Data reliability is a critical issue in a de-duplication storage system, because there is only one copy of each file stored on the server, shared by all of its owners. If such a shared file or chunk is lost, a disproportionately large amount of data becomes inaccessible, because all of the files that share this chunk become unavailable. If the value of a chunk is measured by the amount of file data that would be lost were that single chunk lost, then the amount of user data lost when a chunk in the storage system is corrupted grows with the commonality of the chunk; for example, if a chunk shared by a thousand files is corrupted, every one of those files becomes unreadable. Thus, how to guarantee high data reliability in a de-duplication system is a critical problem. Most previous de-duplication systems have only been considered in a single-server setting. However, users and applications increasingly expect de-duplication and cloud storage systems to provide high reliability, especially in archival storage, where data are critical and should be preserved over long time periods. This requires that de-duplication storage systems provide reliability comparable to other highly available systems.

Output: The user logs in and uploads files to the cloud together with their de-duplication tags.
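As a rough illustration of this tag-driven upload flow, the following Python sketch shows a toy server-side index that stores a file body only the first time its tag is seen and merely records ownership afterwards. The class and method names are hypothetical, and a real deployment in this paper's setting would operate on shares spread over multiple servers rather than a single dictionary.

```python
import hashlib

class DedupStore:
    """Toy single-node index illustrating tag-based de-duplication."""
    def __init__(self):
        self.blobs = {}    # tag -> stored data
        self.owners = {}   # tag -> set of user ids

    @staticmethod
    def tag(data: bytes) -> str:
        # File-level tag: hash of the whole file.
        # Block-level de-duplication would tag each chunk instead.
        return hashlib.sha256(data).hexdigest()

    def upload(self, user: str, data: bytes) -> str:
        t = self.tag(data)
        if t not in self.blobs:      # first copy: actually store the data
            self.blobs[t] = data
        self.owners.setdefault(t, set()).add(user)  # later copies: record ownership only
        return t

store = DedupStore()
t1 = store.upload("alice", b"report-2016.pdf contents")
t2 = store.upload("bob", b"report-2016.pdf contents")
assert t1 == t2 and len(store.blobs) == 1  # one physical copy, two owners
```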

III. ADVANTAGES OF PROPOSED SYSTEM

1. Higher reliability: the data chunks are distributed across multiple cloud servers.

2. The security requirements of data confidentiality and tag consistency are achieved by introducing a deterministic secret sharing scheme in distributed storage systems.

    IV. LITERATURE SURVEY

1. Reclaiming Space from Duplicate Files in a Serverless Distributed File System:

The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes 1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys, and 2) SALAD, a Self-Arranging, Lossy, Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.
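The core trick of convergent encryption is easy to state in code: derive the key from the message itself, so identical plaintexts encrypt to identical ciphertexts regardless of who encrypts them. The sketch below is a minimal illustration assuming the third-party `cryptography` package for AES-GCM; the fixed nonce, which would be a bug in ordinary encryption, is deliberate here because determinism is the whole point and each key is bound to a single message.

```python
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

def convergent_encrypt(message: bytes) -> tuple[bytes, bytes, bytes]:
    key = hashlib.sha256(message).digest()     # K = H(M): anyone holding M derives the same key
    nonce = b"\x00" * 12                       # fixed nonce: acceptable only because each key encrypts one message
    ciphertext = AESGCM(key).encrypt(nonce, message, None)
    tag = hashlib.sha256(ciphertext).digest()  # de-duplication tag T = H(C)
    return key, ciphertext, tag

def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return AESGCM(key).decrypt(b"\x00" * 12, ciphertext, None)

# Two independent users produce byte-identical ciphertexts,
# so the server can de-duplicate without reading any plaintext.
k1, c1, t1 = convergent_encrypt(b"shared quarterly report")
k2, c2, t2 = convergent_encrypt(b"shared quarterly report")
assert c1 == c2 and t1 == t2
```

The price of this determinism is that anyone who can guess a file can confirm the guess by encrypting it, which is the brute-force weakness the DupLESS item below addresses.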


2. DupLESS: Server-Aided Encryption for De-duplicated Storage:

Cloud storage service providers such as Dropbox, Mozy, and others perform de-duplication to save space by only storing one copy of each file uploaded. Should clients conventionally encrypt their files, however, savings are lost. Message-locked encryption (the most prominent manifestation of which is convergent encryption) resolves this tension. However, it is inherently subject to brute-force attacks that can recover files falling into a known set. We propose an architecture that provides secure de-duplicated storage resisting brute-force attacks, and realize it in a system called DupLESS. In DupLESS, clients encrypt under message-based keys obtained from a key server via an oblivious PRF protocol. It enables clients to store encrypted data with an existing service, have the service perform de-duplication on their behalf, and yet achieves strong confidentiality guarantees. We show that encryption for de-duplicated storage can achieve performance and space savings close to that of using the storage service with plaintext data.
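The following Python fragment sketches the server-aided key derivation idea in its simplest form: the message key is a keyed PRF output that cannot be computed without the key server. It deliberately drops the oblivious (blinded) part of DupLESS's actual RSA-OPRF protocol, so unlike real DupLESS the key server here learns the message hash; every name in the fragment is hypothetical.

```python
import hashlib
import hmac

SERVER_SECRET = b"long-term secret held only by the key server"  # hypothetical

def key_server_derive(message_hash: bytes) -> bytes:
    # Key-server side: PRF(secret, H(M)). In DupLESS this exchange is blinded
    # so the server never sees H(M); the blinding is omitted here for brevity.
    return hmac.new(SERVER_SECRET, message_hash, hashlib.sha256).digest()

def client_message_key(message: bytes) -> bytes:
    # Client side: hash the message, then ask the key server for the key.
    return key_server_derive(hashlib.sha256(message).digest())
```

Because an offline attacker cannot evaluate the PRF without `SERVER_SECRET`, dictionary attacks on predictable files must go through the key server, which can rate-limit or refuse suspicious query volumes.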

3. Message-Locked Encryption and Secure De-duplication:

We formalize a new cryptographic primitive, Message-Locked Encryption (MLE), where the key under which encryption and decryption are performed is itself derived from the message. MLE provides a way to achieve secure de-duplication (space-efficient secure outsourced storage), a goal currently targeted by numerous cloud-storage providers. We provide definitions both for privacy and for a form of integrity that we call tag consistency. Based on this foundation, we make both practical and theoretical contributions. On the practical side, we provide ROM security analyses of a natural family of MLE schemes that includes deployed schemes. On the theoretical side the challenge is standard-model solutions, and we make connections with deterministic encryption, hash functions secure on correlated inputs, and the sample-then-extract paradigm to deliver schemes under different assumptions and for different classes of message sources. Our work shows that MLE is a primitive of both practical and theoretical interest.
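Tag consistency has a very small code footprint, which the sketch below illustrates under the common convention (an assumption here, not a quotation of the MLE paper's schemes) that the tag is a hash of the ciphertext: the server recomputes the tag before accepting an upload, so a malicious client cannot register a bogus ciphertext under the tag of someone else's file.

```python
import hashlib

def make_tag(ciphertext: bytes) -> bytes:
    # T = H(C): any party can recompute the tag without seeing the plaintext.
    return hashlib.sha256(ciphertext).digest()

def server_accept_upload(ciphertext: bytes, claimed_tag: bytes) -> bool:
    # Without this check, a duplicate-faking attack could poison the single
    # stored copy: later owners of the real file would fetch the wrong data.
    return make_tag(ciphertext) == claimed_tag
```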

4. CDStore: Toward Reliable, Secure, and Cost-Efficient Cloud Storage via Convergent Dispersal:

We present CDStore, which disperses users' backup data across multiple clouds and provides a unified multi-cloud storage solution with reliability, security, and cost-efficiency guarantees. CDStore builds on an augmented secret sharing scheme called convergent dispersal, which supports de-duplication by using deterministic content-derived hashes as inputs to secret sharing. We present the design of CDStore and, in particular, describe how it combines convergent dispersal with two-stage de-duplication to achieve both bandwidth and storage savings and to be robust against side-channel attacks. We evaluate the performance of our CDStore prototype using real-world workloads on LAN and commercial cloud test beds. Our cost analysis also demonstrates that CDStore achieves a monetary cost saving of 70% over a baseline cloud storage solution using state-of-the-art secret sharing.
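To show what "deterministic content-derived hashes as inputs to secret sharing" can look like, here is a minimal Shamir-style (k, n) sharing over a prime field in which the polynomial coefficients are hash-derived from the secret instead of random, so every owner of the same data generates byte-identical shares that servers can de-duplicate. This is an illustrative toy, not CDStore's actual ramp-scheme construction; the field size and helper names are assumptions.

```python
import hashlib

P = 2**127 - 1  # a Mersenne prime; illustrative field choice

def _coeff(secret: int, i: int) -> int:
    # Deterministic coefficient derived from the secret itself, so all owners
    # of the same data produce identical shares (the "convergent" property).
    h = hashlib.sha256(secret.to_bytes(16, "big") + bytes([i])).digest()
    return int.from_bytes(h, "big") % P

def make_shares(secret: int, n: int, k: int) -> list[tuple[int, int]]:
    # (k, n) sharing: a degree-(k-1) polynomial with f(0) = secret,
    # evaluated at x = 1..n; any k evaluations reconstruct the secret.
    coeffs = [secret % P] + [_coeff(secret, i) for i in range(1, k)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of f(x) mod P
            y = (y * x + c) % P
        shares.append((x, y))
    return shares

def reconstruct(shares: list[tuple[int, int]]) -> int:
    # Lagrange interpolation at x = 0 over the field GF(P).
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(123456789, n=5, k=3)  # spread over 5 servers, any 3 recover
assert reconstruct(shares[:3]) == 123456789
assert reconstruct(shares[2:]) == 123456789
```

Like convergent encryption, deriving the "randomness" from the data trades information-theoretic secrecy for de-duplicability: predictable plaintexts remain guessable, which is why CDStore pairs the idea with further defenses.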

5. Secure De-duplication and Data Security with Efficient and Reliable CEKM:

Secure de-duplication is a technique for eliminating duplicate copies of stored data while providing security for them. De-duplication has been a well-known technique for reducing storage space and upload bandwidth in cloud storage, and convergent encryption has been extensively adopted to make de-duplication secure; a critical issue in making convergent encryption practical is to efficiently and reliably manage a huge number of convergent keys. The basic idea in this paper is that we can eliminate duplicate copies of stored data and limit the damage of stolen data if we decrease the value of that stolen information to the attacker.


This paper makes the first attempt to formally address the problem of achieving efficient and reliable key management in secure de-duplication. We first introduce a baseline approach in which each user holds an independent master key for encrypting the convergent keys and outsourcing them. However, such a baseline key management scheme generates an enormous number of keys as the number of users grows and requires users to dedicatedly protect their master keys. To this end, we propose Dekey together with user-behavior profiling and decoy technology. Dekey is a new construction in which users do not need to manage any keys on their own; instead, the convergent key shares are securely distributed across multiple servers to resist insider attackers. As a proof of concept, we implement Dekey using the Ramp secret sharing scheme and demonstrate that Dekey incurs limited overhead in realistic environments.

V. ARCHITECTURE OF PROPOSED SYSTEM

The following figure presents the design of the proposed system, which performs secure data de-duplication with improved reliability.

    Figure 1. Architecture of Proposed System.

    VI. CONCLUSION

We focused on the problem of auditing whether an untrusted server stores a client's data. We presented a model for provable data possession (PDP), in which it is desirable to minimize the file block accesses, the computation on the server, and the client-server communication. Our solutions for PDP fit this model: they incur a low (or even constant) overhead at the server and require a small, constant amount of communication per challenge. Key components of our schemes are the support for spot checking, which ensures that the schemes stay lightweight, and the homomorphic verifiable tags, which allow verifying data possession without access to the actual data file. We also define the notion of robust auditing, which integrates remote data checking (RDC) with forward error-correcting codes to mitigate arbitrarily small file corruptions, and we propose a generic transformation for adding robustness to any spot-checking-based RDC scheme.


Experiments demonstrate that our schemes make it practical to verify possession of large data sets. Previous schemes that do not allow sampling are not practical when PDP is used to prove possession of a large amount of data, as they impose a significant I/O and computational burden on the server.

    VII. FUTURE SCOPE

The distributed de-duplication systems improve the reliability of data. Four constructions were proposed to support file-level and block-level data de-duplication. Our de-duplication systems use the Ramp secret sharing scheme, and we demonstrated that it incurs small encoding/decoding overhead compared to the network transmission overhead in regular upload/download operations.

    VIII. REFERENCES

[1] J. Li, X. Chen, S. Tang, Y. Xiang, M. M. Hassan, and A. Alelaiwi, "Secure distributed deduplication systems with improved reliability," IEEE, 2015.

[2] J. Gantz and D. Reinsel, "The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far east," http://www.emc.com/collateral/analyst-reports/idc-the-digital-universe-in-2020.pdf, Dec. 2014.

[3] M. O. Rabin, "Fingerprinting by random polynomials," Center for Research in Computing Technology, Harvard University, Tech. Rep. TR-CSE-03-01.

[4] J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer, "Reclaiming space from duplicate files in a serverless distributed file system," in ICDCS, 2002, pp. 617–624.

[5] M. Bellare, S. Keelveedhi, and T. Ristenpart, "Message-locked encryption and secure deduplication," in EUROCRYPT, 2013, pp. 296–312.

[6] G. R. Blakley and C. Meadows, "Security of ramp schemes," in Advances in Cryptology: Proceedings of CRYPTO 84, ser. Lecture Notes in Computer Science, vol. 196, G. R. Blakley and D. Chaum, Eds. Berlin/Heidelberg: Springer-Verlag, 1985, pp. 242–268.

[7] M. O. Rabin, "Efficient dispersal of information for security, load balancing, and fault tolerance," Journal of the ACM, vol. 36, no. 2, pp. 335–348, Apr. 1989.

[8] A. Shamir, "How to share a secret," Communications of the ACM, vol. 22, no. 11, pp. 612–613, 1979.

[9] J. Li, X. Chen, M. Li, J. Li, P. Lee, and W. Lou, "Secure deduplication with efficient and reliable convergent key management," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 6, pp. 1615–1625, 2014.