International Journal of Advance Foundation and Research in Computer (IJAFRC)
Volume 3, Issue 1, January - 2016. ISSN 2348 4853, Impact Factor 1.317
47 | 2016, IJAFRC All Rights Reserved www.ijfarc.org
Secure Distributed De-duplication System with Improved
Reliability in Cloud Computing
Mr. Sagar G Khengat, Mr. Swapnil S Belorkar, Mr. Alok Y Shukla, Prof. Nilesh B Madke
Dept. of Computer Engineering,
ISB & M School Of Technology Nande Village, Tal.Mulashi, Pune, Savitribai Phule Pune University.
[email protected], [email protected], [email protected],
ABSTRACT
De-duplication is a technique for eliminating duplicate copies of data, and it has been widely used in cloud storage to reduce storage space and upload bandwidth. However, only one copy of each file is stored in the cloud, regardless of how many users own that file. As a result, de-duplication improves storage utilization while reducing reliability. Moreover, the challenge of protecting sensitive data also arises when users outsource it to the cloud. Aiming to address these security challenges, this paper makes the first attempt to formalize the notion of a distributed reliable de-duplication system. We propose new distributed systems with higher reliability in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous systems. Security analysis demonstrates that our systems are secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement the proposed systems and show that the incurred overhead is very limited in realistic environments.
Index Terms: cloud security, de-duplication, distributed system, proof of ownership, file-level, block-level
I. INTRODUCTION
In this setting, it is desirable to audit whether the storage service provider (SSP) meets its contractual obligations. SSPs have many incentives to fail these obligations; for instance, an SSP may try to hide data-loss incidents to preserve its reputation, or it may discard data that is rarely accessed so that it can resell the same storage. Remote data checking (RDC) allows an auditor to challenge a server to provide a proof of data possession, in order to validate that the server still holds the data originally stored by a client. We say that an RDC scheme seeks to provide a data-possession guarantee. Archival storage presents unique performance demands: given that file data is large and stored at remote sites, accessing an entire file is expensive in I/O costs at the storage server and in transmitting the file over a network. Reading an entire archive, even periodically, greatly limits the capacity of network stores. Furthermore, the I/O incurred to establish data possession interferes with the on-demand bandwidth needed to store and retrieve data.
We conclude that clients should be able to verify that a server retains file data without retrieving the data from the server and without requiring the server to access the entire file. A scheme for auditing remote data should be both lightweight and robust. Lightweight means that it does not unduly burden the SSP; this includes both overhead (i.e., computation and I/O) at the SSP and communication between the SSP and the auditor. This goal can be achieved by relying on spot checking, in which the auditor randomly samples small portions of the data and checks their
integrity, thus minimizing the I/O at the SSP. Spot checking allows the client to detect whether a fraction of the data stored at the server has been corrupted, but it cannot detect corruption of small parts of the data (e.g., 1 byte). Robust means that the auditing scheme incorporates mechanisms for mitigating arbitrary amounts of corruption. Protecting against large corruptions ensures that the SSP has committed the contracted storage resources: little space can be reclaimed undetectably, making it unattractive to delete data to save on storage costs or to sell the same storage multiple times. Protecting against small corruptions protects the data itself, not just the storage resource. Much data has value well beyond its storage costs, making attacks that corrupt small amounts of data practical. For example, modifying a single bit may destroy an encrypted file or invalidate authentication information.
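The spot-checking idea above can be sketched as follows. This is a simplified model, not the actual RDC protocol of the cited work; the block size, challenge count, and function names are our own assumptions. An auditor precomputes hashes of randomly sampled blocks and later challenges the server to return exactly those blocks:

```python
import hashlib
import random

BLOCK_SIZE = 64  # illustrative block size in bytes

def split_blocks(data: bytes) -> list[bytes]:
    """Split data into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def precompute_challenges(data: bytes, num_challenges: int, seed: int = 0):
    """Auditor samples random block indices and records their hashes."""
    blocks = split_blocks(data)
    rng = random.Random(seed)
    indices = [rng.randrange(len(blocks)) for _ in range(num_challenges)]
    return [(i, hashlib.sha256(blocks[i]).hexdigest()) for i in indices]

def spot_check(server_data: bytes, challenges) -> bool:
    """Auditor verifies only the challenged blocks, not the whole file."""
    blocks = split_blocks(server_data)
    return all(
        i < len(blocks) and hashlib.sha256(blocks[i]).hexdigest() == h
        for i, h in challenges
    )
```

Because only a handful of blocks are read per audit, the I/O at the server stays small; the trade-off, as noted above, is that corruption of a single unsampled byte goes undetected.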
II. PROPOSED SYSTEM
Input: Though de-duplication can save storage space for cloud storage service providers, it reduces the reliability of the system. Data reliability is a critical issue in a de-duplication storage system because only one copy of each file is stored in the server, shared by all of its owners. If such a shared file/chunk is lost, a disproportionately large amount of data becomes inaccessible because every file that shares this chunk becomes unavailable. If the value of a chunk is measured by the amount of file data that would be lost were that single chunk lost, then the amount of user data lost when a chunk in the storage system is corrupted grows with the commonality of the chunk. Thus, guaranteeing high data reliability in a de-duplication system is a critical problem. Most previous de-duplication systems have only been considered in a single-server setting. However, many de-duplication and cloud storage systems are relied upon by users and applications for higher reliability, especially in archival storage, where data are critical and must be preserved over long time periods. This requires that de-duplication storage systems provide reliability comparable to other highly available systems.
Output: The user logs in and uploads files to the cloud along with their tags.
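The upload flow above can be sketched as a minimal file-level de-duplication check. This is a toy model of our own: the tag is a plain SHA-256 of the file, and an in-memory dictionary stands in for the cloud servers:

```python
import hashlib

class DedupStore:
    """Toy cloud store keeping one physical copy per distinct file."""

    def __init__(self):
        self.blobs = {}    # tag -> file content (stored once)
        self.owners = {}   # tag -> set of users owning the file

    def upload(self, user: str, data: bytes) -> str:
        tag = hashlib.sha256(data).hexdigest()
        if tag not in self.blobs:      # first upload: store the content
            self.blobs[tag] = data
        self.owners.setdefault(tag, set()).add(user)  # later uploads only add an owner
        return tag

    def download(self, user: str, tag: str) -> bytes:
        if user not in self.owners.get(tag, set()):
            raise PermissionError("user does not own this file")
        return self.blobs[tag]
```

When two users upload the same file, the second upload stores nothing new, which is exactly the storage saving, and the single physical copy, that makes reliability so critical in the discussion above.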
III. ADVANTAGES OF PROPOSED SYSTEM
1. Higher reliability in which the data chunks are distributed across multiple cloud servers.
2. Security requirements of data confidentiality and tag consistency are also achieved by introducing a
deterministic secret sharing scheme in distributed storage systems.
IV. LITERATURE SURVEY
1. Reclaiming Space from Duplicate Files in a Serverless Distributed File System:
The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication and make it available for controlled file replication. Our mechanism includes 1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys, and 2) SALAD, a Self-Arranging, Lossy, Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.
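Convergent encryption as described above can be sketched as follows. This is a simplified illustration only: the key is the SHA-256 of the plaintext and the cipher is a toy XOR keystream, not a production-grade scheme:

```python
import hashlib

def convergent_key(plaintext: bytes) -> bytes:
    """Derive the encryption key deterministically from the content itself."""
    return hashlib.sha256(plaintext).digest()

def _keystream(key: bytes, length: int) -> bytes:
    """Toy counter-mode keystream built from SHA-256 (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(plaintext: bytes) -> bytes:
    key = convergent_key(plaintext)
    stream = _keystream(key, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, stream))

def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    stream = _keystream(key, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))
```

Two users who encrypt the same file independently produce identical ciphertexts, so the server can coalesce them even though the users never shared a key; in Farsite each user then protects the convergent key itself under their own master key.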
2. DupLESS: Server-Aided Encryption for De-duplicated Storage:
Cloud storage service providers such as Dropbox, Mozy, and others perform de-duplication to save space by storing only one copy of each file uploaded. Should clients conventionally encrypt their files, however, the savings are lost. Message-locked encryption (the most prominent manifestation of which is convergent encryption) resolves this tension. However, it is inherently subject to brute-force attacks that can recover files falling into a known set. We propose an architecture that provides secure de-duplicated storage resisting brute-force attacks, and realize it in a system called DupLESS. In DupLESS, clients encrypt under message-based keys obtained from a key server via an oblivious PRF protocol. It enables clients to store encrypted data with an existing service, have the service perform de-duplication on their behalf, and yet achieve strong confidentiality guarantees. We show that encryption for de-duplicated storage can achieve performance and space savings close to that of using the storage service with plaintext data.
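The server-aided idea can be sketched as below. Note that this simplification drops the oblivious part of DupLESS's OPRF protocol (here the key server sees the message hash, which the real protocol hides via blinding) and uses stdlib HMAC; it only illustrates why a key-server secret blocks offline brute force:

```python
import hashlib
import hmac

class KeyServer:
    """Holds a secret; derives per-message keys and can rate-limit clients."""

    def __init__(self, secret: bytes):
        self._secret = secret

    def derive_key(self, msg_hash: bytes) -> bytes:
        # Without self._secret an attacker cannot recompute keys offline,
        # so guessing candidate files requires online queries that the
        # key server can rate-limit or refuse.
        return hmac.new(self._secret, msg_hash, hashlib.sha256).digest()

def client_key(server: KeyServer, plaintext: bytes) -> bytes:
    """Client hashes its file and asks the key server for the message key."""
    return server.derive_key(hashlib.sha256(plaintext).digest())
```

Two clients talking to the same key server still derive the same key for the same file, so de-duplication over the resulting ciphertexts is preserved.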
3. Message-Locked Encryption and Secure De-duplication:
We formalize a new cryptographic primitive, Message-Locked Encryption (MLE), where the key under which encryption and decryption are performed is itself derived from the message. MLE provides a way to achieve secure de-duplication (space-efficient secure outsourced storage), a goal currently targeted by numerous cloud-storage providers. We provide definitions both for privacy and for a form of integrity that we call tag consistency. Based on this foundation, we make both practical and theoretical contributions. On the practical side, we provide ROM security analyses of a natural family of MLE schemes that includes deployed schemes. On the theoretical side, the challenge is standard-model solutions, and we make connections with deterministic encryption, hash functions secure on correlated inputs, and the sample-then-extract paradigm to deliver schemes under different assumptions and for different classes of message sources. Our work shows that MLE is a primitive of both practical and theoretical interest.
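Tag consistency can be illustrated with a toy check of our own devising: if the de-duplication tag is derived from the ciphertext, any party can verify that a stored ciphertext actually matches the tag it was filed under, without holding any key:

```python
import hashlib
import hmac

def make_tag(ciphertext: bytes) -> str:
    """Derive the de-duplication tag from the ciphertext, not the plaintext."""
    return hashlib.sha256(ciphertext).hexdigest()

def check_tag(tag: str, ciphertext: bytes) -> bool:
    """Anyone can verify a (tag, ciphertext) pair without any secret key."""
    return hmac.compare_digest(make_tag(ciphertext), tag)
```

The property this guards against is duplicate faking: if a malicious client could register ciphertext C under the tag of a different file, later uploaders de-duplicated against that tag would silently retrieve the wrong data; running `check_tag` at upload time rejects such mismatched pairs.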
4. CDStore: Toward Reliable, Secure, and Cost-Efficient Cloud Storage via Convergent Dispersal:
We present CDStore, which disperses users' backup data across multiple clouds and provides a unified multi-cloud storage solution with reliability, security, and cost-efficiency guarantees. CDStore builds on an augmented secret sharing scheme called convergent dispersal, which supports de-duplication by using deterministic content-derived hashes as inputs to secret sharing. We present the design of CDStore and, in particular, describe how it combines convergent dispersal with two-stage de-duplication to achieve both bandwidth and storage savings and to be robust against side-channel attacks. We evaluate the performance of our CDStore prototype using real-world workloads on LAN and commercial cloud testbeds. Our cost analysis also demonstrates that CDStore achieves a monetary cost saving of 70% over a baseline cloud storage solution using state-of-the-art secret sharing.
5. Secure De-duplication and Data Security with Efficient and Reliable CEKM:
Secure de-duplication is a technique for eliminating duplicate copies of stored data while providing security for them. De-duplication has been a well-known technique for reducing storage space and upload bandwidth in cloud storage. For this purpose, convergent encryption has been widely adopted for secure de-duplication; a critical issue in making convergent encryption practical is to efficiently and reliably manage a huge number of convergent keys. The basic idea in this paper is that we can eliminate duplicate copies of stored data and limit the damage of stolen data if we decrease the value
of that stolen information to the attacker. This paper makes the first attempt to formally address the problem of achieving efficient and reliable key management in secure de-duplication. We first introduce a baseline approach in which each user holds an independent master key for encrypting the convergent keys and outsourcing them. However, such a baseline key management scheme generates an enormous number of keys as the number of users grows, and requires users to dedicatedly protect their master keys. To this end, we propose Dekey, together with user-behavior profiling and decoy technology. Dekey is a new construction in which users do not need to manage any keys on their own but instead securely distribute the convergent key shares across multiple servers, guarding against insider attackers. As a proof of concept, we implement Dekey using the Ramp secret sharing scheme and demonstrate that Dekey incurs limited overhead in realistic environments.
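Distributing key shares across multiple servers, as Dekey does with Ramp secret sharing, can be illustrated with a plain Shamir-style (k, n) threshold scheme. This is a simplified stand-in: Ramp schemes trade some secrecy for smaller shares, while the sketch below is the classic full-threshold variant over a prime field:

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime large enough for short secrets

def split_secret(secret: int, k: int, n: int, rng=None):
    """Split `secret` into n shares such that any k of them reconstruct it."""
    assert 0 <= secret < PRIME and 1 <= k <= n
    rng = rng or random.SystemRandom()
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]

    def eval_poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, eval_poly(x)) for x in range(1, n + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With a 3-of-5 split, each share can sit on a different server; any three servers suffice to recover the key, so two server failures are tolerated, while fewer than three colluding insiders learn nothing about the secret.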
V. ARCHITECTURE OF PROPOSED SYSTEM
The following figure shows the design of the proposed system, which provides secure data de-duplication with improved reliability.
Figure 1. Architecture of Proposed System.
VI. CONCLUSION
We focused on the problem of auditing whether an untrusted server stores a client's data. We presented a model for provable data possession (PDP), in which it is desirable to minimize file block accesses, computation on the server, and client-server communication. Our solutions for PDP fit this model: they incur a low (or even constant) overhead at the server and require a small, constant amount of communication per challenge. Key components of our schemes are the support for spot checking, which ensures that the schemes remain lightweight, and homomorphic verifiable tags, which allow verifying data possession without having access to the actual data file. We also define the notion of robust auditing, which integrates remote data checking (RDC) with forward error-correcting codes to mitigate arbitrarily small file corruptions, and propose a generic transformation for adding robustness to any spot-
checking-based RDC scheme. Experiments demonstrate that our schemes make it practical to verify possession of large data sets. Previous schemes that do not allow sampling are not practical when PDP is used to prove possession of large amounts of data, as they impose a significant I/O and computational burden on the server.
VII. FUTURE SCOPE
The distributed de-duplication systems can be extended to further improve the reliability of data. Four constructions were proposed to support file-level and block-level data de-duplication. Our de-duplication systems use the Ramp secret sharing scheme, and we demonstrated that it incurs small encoding/decoding overhead compared to the network transmission overhead in regular upload/download operations.
VIII. REFERENCES
[1] J. Li, X. Chen, S. Tang, Y. Xiang, M. M. Hassan, and A. Alelaiwi, "Secure Distributed De-duplication Systems with Improved Reliability," IEEE, 2015.
[2] J. Gantz and D. Reinsel, "The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far east," http://www.emc.com/collateral/analyst-reports/idcthe-digital-universe-in-2020.pdf, Dec. 2014.
[3] M. O. Rabin, "Fingerprinting by random polynomials," Center for Research in Computing Technology, Harvard University, Tech. Rep. TR-CSE-03-01.
[4] J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer, "Reclaiming space from duplicate files in a serverless distributed file system," in ICDCS, 2002, pp. 617-624.
[5] "Message-locked encryption and secure de-duplication," in EUROCRYPT, 2013, pp. 296-312.
[6] G. R. Blakley and C. Meadows, "Security of ramp schemes," in Advances in Cryptology: Proceedings of CRYPTO 84, ser. Lecture Notes in Computer Science, G. R. Blakley and D. Chaum, Eds., Springer-Verlag, Berlin/Heidelberg, 1985, vol. 196, pp. 242-268.
[7] M. O. Rabin, "Efficient dispersal of information for security, load balancing, and fault tolerance," Journal of the ACM, vol. 36, no. 2, pp. 335-348, Apr. 1989.
[8] A. Shamir, "How to share a secret," Communications of the ACM, vol. 22, no. 11, pp. 612-613, 1979.
[9] J. Li, X. Chen, M. Li, J. Li, P. Lee, and W. Lou, "Secure de-duplication with efficient and reliable convergent key management," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 6, pp. 1615-1625, 2014.