225
SECURE CLOUD STORAGE MODEL TO PRESERVE CONFIDENTIALITY AND INTEGRITY SARFRAZ NAWAZ BROHI A thesis submitted in fulfilment of the requirements for the award of the degree of Doctor of Software Engineering Advanced Informatics School Universiti Teknologi Malaysia JANUARY 2015

Secure Cloud Storage Model to Preserve Confidentiality and Integrity.pdf

Embed Size (px)

Citation preview

  • i

    SECURE CLOUD STORAGE MODEL TO PRESERVE CONFIDENTIALITY

    AND INTEGRITY

    SARFRAZ NAWAZ BROHI

    A thesis submitted in fulfilment of the

    requirements for the award of the degree of

    Doctor of Software Engineering

    Advanced Informatics School

    Universiti Teknologi Malaysia

    JANUARY 2015

  • iii

    To

    my supportive parents,

    and

    beloved siblings

  • iv

    ACKNOWLEDGEMENT

    First of All, I thank ALLAH (SWT), the God Almighty, for granting me the

    health, knowledge, strength, ability, and patience to accomplish this research, and for

    blessing me with sympathetic and supportive supervisors as well as family members.

    I am glad to express tremendous gratitude to my supervisor Dr Suriayati

    Chuprat for her compassionate character, knowledge sharing, ideas and continuous

    support from the first until the last day of this study. Her sincere behaviour and

    constructive feedback enabled me to achieve significant research milestones within

    the required time-frame.

    I would also like to thank my external supervisor Dr Jamalul-lail Ab Manan

    for enriching me with innovative ideas and skills by sharing his expertise and

    knowledge in the field of cloud computing security. Due to his unlimited support for

    reviewing, improving and evaluating my research, I was able to publish several high

    quality research papers.

    At various stages during this study, I faced several undesirable challenges

    which overburdened me with mental and physical stress. However, this never

    stopped me from progressing further due to encouraging, moral as well as financial

    support from my father Dr Muhammad Nawaz Brohi. I am extremely thankful to

    him for his understanding, kindness, believe, and trust on me.

    I also wish to express deepest appreciation to my mother for her prayers

    regarding my success during this entire study. I will always remember my late

    grandmother in prayers. This research would have never been possible without her

    wishes for my success.

  • v

    ABSTRACT

    Cloud Service Providers (CSPs) offer remotely located cloud storage services

    to business organizations which include cost-effective advantages. From an

    industrial perspective, Amazon Simple Storage Service (S3) and Google Cloud

    Storage (GCS) are the leading cloud storage services. These storages are secured

    using the latest data security approaches such as cryptography algorithms, data

    auditing processes, and strict access control policies. However, organizations where

    confidentiality of information is a significant act, they are not assertive to adopt these

    services due to emerging data confidentiality and integrity concerns. Malicious

    attackers have violated the cloud storages to steal, view, manipulate, and tamper

    clients data. The researchers have attempted to overcome these shortcomings by

    designing and developing various security models. These solutions incorporate

    limitations and require enhancements as well as improvements before they can be

    widely accepted by CSPs to guarantee secure cloud storage services. In order to

    solve the stated problem, this research developed an improved security solution

    namely Secure Cloud Storage Model (SCSM) which consists of Multi-factor

    authentication and authorization process using Role-Based Access Control (RBAC)

    with Complex Random Security Code Generator (CRSCG), Partial homomorphic

    cryptography using Rivest, Shamir and Adleman (RSA) algorithm, Trusted Third

    Party (TTP) services including Key Management (KM) approach and data auditing

    process, Implementation of 256-bit Secure Socket Layer (SSL), and Service Level

    Agreement (SLA). SCSM was implemented using Java Enterprise Edition with

    glassfish server and deployed on a cloud computing infrastructure. The model was

    evaluated using extended euclidean algorithm, system security analysis, key

    management recommendations, web-based testing tool, security scanner, and survey.

    The survey results presented that 83.33% of the respondents agreed for SCSM to be

    widely accepted by CSPs to offer secured cloud storage services. The aggregate

    evaluation results proved that SCSM is successful in preserving data confidentiality

    and integrity at remotely located cloud storages.

  • vi

    ABSTRAK

    Penyedia perkhidmatan awan (CSP) menawarkan servis storan awan secara

    jauh yang memberi kelebihan kos yang efektif. Mengikut perspektif industri,

    Amazon Simple Storage Service (S3) dan Google Cloud Storage (GCS) merupakan

    peneraju utama servis storan awan. Storan ini adalah selamat kerana mereka

    menggunakan pendekatan keselamatan data yang terkini seperti algoritma

    kriptografi, proses pengauditan data serta polisi kawalan capaian yang ketat. Walau

    bagaimanapun, bagi organisasi yang mengutamakan kerahsiaan maklumat, mereka

    tidak tertarik untuk menggunakan servis tersebut kerana bimbang akan kerahsiaan

    dan integriti data. Penyerang yang berniat jahat telah mencabuli storan awan dengan

    mencuri, melihat, memanipulasi dan mengganggu data pelanggan. Para penyelidik

    telah mencuba menangani masalah-masalah ini dengan mereka bentuk dan

    membangunkan pelbagai model keselamatan. Penyelesaian yang telah dibangunkan

    ini masih mempunyai had tertentu dan memerlukan penambahbaikan sebelum ianya

    diterima secara meluas oleh CSP demi menjamin keselamatan servis tersebut. Untuk

    menyelesaikan masalah yang dinyatakan, penyelidikan ini telah membangunkan

    penyelesaian keselamatan yang telah ditambahbaik dan ianya dinamakan Secure

    Cloud Storage Model (SCSM). Model ini terdiri daripada pengesahan pelbagai-

    faktor, proses kebenaran menggunakan Role-Based Access Control (RBAC) dengan

    Complex Random Security Code Generator (CRSCG), kriptografi homomorphic

    separa menggunakan algoritma Rivest, Shamir and Adleman (RSA), servis-servis

    Trusted Third Party (TTP) iaitu pendekatan pengurusan kunci (KM) dan proses

    pengauditan data, perlaksanaan Secure Socket Layer (SSL) 256-bit, dan Service

    Level Agreement (SLA). SCSM dibangunkan menggunakan Java Enterprise Edition

    dengan pelayan Glassfish dan dilaksanakan pada infrastruktur pengkomputeran

    awan. Model ini kemudiannya dinilai menggunakan algoritma Extended Euclidean,

    analisis keselamatan sistem, cadangan-cadangan pengurusan kunci, alatan ujian

    berasaskan sesawang, pengimbas keselamatan serta kajian. Hasil kajian

    menunjukkan 83.33% responden bersetuju SCSM boleh diterima secara meluas oleh

    CSP yang menawarkan servis storan awan yang selamat. Keputusan penilaian

    membuktikan SCSM berjaya dalam memelihara kerahsiaan data dan integriti pada

    storan awan jarak jauh.

  • vii

    TABLE OF CONTENTS

    CHAPTER TITLE PAGE

    DECLARATION ii

    ACKNOWLEDGEMENT iv

    ABSTRACT v

    ABSTRAK vi

    TABLE OF CONTENTS vii

    LIST OF TABLES xiii

    LIST OF FIGURES xiv

    LIST OF ABBREVIATIONS xvii

    LIST OF SYMBOLS xx

    LIST OF APPENDICES xxi

    1 INTRODUCTION 1

    1.1 Overview 1

    1.2 Problem Background 2

    1.3 Problem Statement 6

    1.4 Research Objectives 7

    1.5 Scope of Research 9

    1.6 Significance of Research 10

    1.7 Contribution of Research 10

    1.8 Thesis Organization 11

    1.9 Summary 13

    2 LITERATURE REVIEW 14

    2.1 Introduction 14

  • viii

    2.2 Cloud Deployment Models 16

    2.2.1 Public Cloud 16

    2.2.2 Private Cloud 17

    2.2.3 Hybrid Cloud 18

    2.2.4 Community Cloud 19

    2.3 Cloud Service Delivery Models 20

    2.3.1 Software as a Service 20

    2.3.2 Platform as a Service 21

    2.3.3 Infrastructure as a Service 21

    2.4 Cloud Storage Services 22

    2.5 Cloud Storage Data Security Concerns 23

    2.5.1 Data Confidentiality 23

    2.5.2 Data Integrity 24

    2.6 Data Protection Mechanisms for Cloud Storages 25

    2.6.1 Cryptography and Key Management 25

    2.6.2 Trusted Computing 26

    2.6.3 Access Control Mechanisms 27

    2.6.4 Service Level Agreement 27

    2.6.5 Data Auditing Services 28

    2.7 Industry Based Implementations of Cloud Storage

    Services 29

    2.7.1 Amazon Simple Storage Service 29

    2.7.2 Google Cloud Storage 33

    2.8 Limitations of Industry Implemented Cloud Storage

    Services 37

    2.8.1 Vulnerable Key Management Approach 39

    2.8.2 Inadequate Cryptographic Support 40

    2.8.3 Exclusion of Security Assurance in Service

    Level Agreements 40

    2.8.4 Untrustworthy Data Integrity Verification

    Services 41

    2.9 Confidentiality and Integrity Preserving

    Cloud Storage Models 42

  • ix

    2.9.1 Secure Cloud Storage Integrator for

    Enterprises 43

    2.9.2 Data Confidentiality and Integrity

    Verification Using User Authenticator

    Scheme in Cloud 45

    2.9.3 Secure Storage Services in Cloud 47

    2.9.4 Data Confidentiality in Storage-Intensive

    Cloud Applications 49

    2.9.5 Cloud Storage Integrity Checking Using

    Encryption Algorithm 51

    2.10 Critical Analysis on Related Work Solutions 52

    2.11 Contribution and Road Map of Research 56

    2.12 Summary 59

    3 RESEARCH METHODOLOGY 60

    3.1 Introduction 60

    3.2 Research Methodology 62

    3.2.1 Literature Review 62

    3.2.2 Analysis 64

    3.2.3 Design 65

    3.2.4 Implementation 66

    3.2.5 Evaluation 67

    3.3 Research Activities and Outcomes 68

    3.4 Summary 71

    4 SECURE CLOUD STORAGE MODEL 72

    4.1 Introduction 72

    4.2 Building Blocks of SCSM 73

    4.3 Description and Architecture of SCSM 74

    4.3.1 Roles and Responsibilities 76

    4.4 Components of SCSM 77

    4.4.1 Multi-factor Authentication and Authorization

    Process 78

  • x

    4.4.1.1 Role Based Access Control 79

    4.4.1.2 Complex Random Security

    Code Generator 81

    4.4.2 Partial Homomorphic Cryptography 82

    4.4.3 256-bit Secure Socket Layer 86

    4.4.4 Service Level Agreement 87

    4.4.5 Trusted Third Party Services 96

    4.4.5.1 Key Management Approach 96

    4.4.5.2 Data Auditing Process 98

    4.5 Process of SCSM 101

    4.6 Summary 103

    5 IMPLEMENTATION OF THE SECURE CLOUD

    STORAGE MODEL 104

    5.1 Introduction 104

    5.2 Software Development Process of SCSM 106

    5.3 Systematic Workflow of SCSM 112

    5.3.1 Data Transfer and Retrieval 113

    5.3.2 Encrypted Data Processing 115

    5.3.3 Verification Metadata Generation

    and Secure Transfer of Parameters 117

    5.3.4 Data Integrity Verification 118

    5.3.5 Data Recovery 122

    5.3.6 Private Key Retrieval and Data

    Downloading 123

    5.4 Deployment of SCSM 125

    5.5 Summary 127

    6 EVALUATION AND RESULTS 128

    6.1 Introduction 128

    6.2 Evaluation Strategy of Research 129

    6.3 Evaluation and Results of SCSM Components 130

    6.3.1 Qualys Web-based Evaluation Methodology 131

  • xi

    6.3.1.1 SSL Certificate Inspection 131

    6.3.1.2 Server Configuration Inspection 133

    6.3.1.3 Final Score and Grade Assignment 138

    6.3.2 Mathematical Evaluation 140

    6.3.3 Compliance Evaluation 144

    6.3.4 Security Analysis 146

    6.3.5 Survey Based Evaluation 147

    6.3.5.1 Structure of Survey 148

    6.3.5.2 Survey Analysis for Multi-factor

    Authentication and Authorization

    Process 150

    6.3.5.3 Survey Analysis for Service

    Level Agreement 152

    6.4 Evaluation of SCSM using Survey and Skipfish 156

    6.5 Benchmarking of SCSM with Industry and Academia

    Best Practices 161

    6.5.1 Secure and Flexible Partial Homomorphic

    Cryptography 165

    6.5.2 Security and Privacy Guaranteeing Service

    Level Agreement 167

    6.5.3 Trusted, Secure and Efficient Data Auditing

    Service 168

    6.5.4 Trusted and Secure Key Management

    Approach 170

    6.5.5 Extremely Secure Multi-factor Authentication

    and Authorization Process 171

    6.6 Summary 173

    7 CONCLUSION AND FUTURE WORK 174

    7.1 Introduction 174

    7.2 Contributions and Significance 175

    7.3 Potential Applications of SCSM 178

    7.4 Limitations and Future Directions of Research 179

  • xii

    7.4.1 Fully Homomorphic Encryption 179

    7.4.2 Heterogeneous Data 180

    7.4.3 Performance 180

    7.4.4 Multi-user Computing Environment 181

    7.5 Summary 181

    REFERENCES 182

    Appendices A - C 197 - 201

  • xiii

    LIST OF TABLES

    TABLE NO. TITLE PAGE

    1.1 Analysis of Research Problem Area 4

    3.1 Research Activities and Outcomes 69

    4.1 Service Level Agreement 89

    6.1 Protocol Support Rating Guide 134

    6.2 Key Exchange Rating Guide 135

    6.3 Cipher Strength Rating Guide 137

    6.4 Evaluation Criteria 138

    6.5 Letter Grading Translation 139

    6.6 Keys of Alice and Bob 141

    6.7 Key Management Compliance and Auditing 144

    6.8 Participation of the Industry Experts in Survey 149

    6.9 Analysis of Multi-factor Authentication and

    Authorization Process 151

    6.10 Analysis of Service Level Agreement 155

    6.11 Analysis of SCSM 157

    6.12 SCSM Benchmarking with Industry and Academia

    Implemented Solutions 163

  • xiv

    LIST OF FIGURES

    FIGURE NO.

    TITLE

    PAGE

    1.1 Survey for Research Problem Area 4

    1.2 Thesis Organization 12

    2.1 Server Side Encryption 30

    2.2 Encryption with Clients Key 31

    2.3 Client Side Encryption 32

    2.4 Data Migration Process 34

    2.5 Authentication Process 36

    2.6 Limitations of Amazon S3 and GCS 38

    2.7 Cloud Storage Integrator 44

    2.8 Preserving Data Confidentiality 45

    2.9 Data Integrity Verification 46

    2.10 Data Updating 46

    2.11 TrustStore Hybrid Cloud Service 48

    2.12 Key Management and Data Confidentiality 50

    2.13 Cloud Storage Security using Broker 51

    2.14 Academia Implemented Cloud Storage Models 53

    2.15 Research Road Map 58

    3.1 Research Methodology 61

    4.1 Architecture of SCSM 74

    4.2 Components of SCSM 78

    4.3 RBAC Privileges 80

    4.4 Access Logs Report 100

    4.5 Process of SCSM 102

    5.1 HTTP based Authentication 106

    5.2 Role Mapping 107

  • xv

    5.3 Roles and Security Annotations 108

    5.4 RSA Partial Homomorphic Cryptography 109

    5.5 Metadata Generation 110

    5.6 Metadata Verification 110

    5.7 Sound Steganography 111

    5.8 Operations of SCSM 113

    5.9 Encryption Process 114

    5.10 Decryption Process 115

    5.11 Data Processing 116

    5.12 VMD Generation and Transfer Process 117

    5.13 VMD Decoding Process 119

    5.14 Data Auditing Process 119

    5.15 Auditing Report 120

    5.16 Data Integrity Violation 121

    5.17 Auditing Report After Violation 121

    5.18 Data Recovery Process 122

    5.19 Auditing Report after Data Recovery Process 123

    5.20 Private Key Decoding Process 124

    5.21 Data Retrieval Process 124

    5.22 Module based Deployment Using Glassfish Server 126

    6.1 Evaluation Strategy 130

    6.2 Implemented SSL Certificate Details 132

    6.3 SSL Certificate Inspection 133

    6.4 Protocol Support 135

    6.5 Key Exchange 136

    6.6 Cipher Strength 138

    6.7 SSL Evaluation Results 140

    6.8 Results for Multi-factor Authentication and

    Authorization Process 151

    6.9 Results for SLA 154

    6.10 Results for SCSM 158

    6.11 Skipfish Security Scanning Report 159

    6.12 Skipfish Interactive Report 160

  • xvi

    6.13 Performance Analysis of Encryption Process 166

    6.14 Performance Analysis of Decryption Process 166

    6.15 Performance Analysis of Data Integrity Verification

    Process 169

    6.16 Security Experiment on CRSCG 172

    7.1 Contributions, Publications and Certificates 177

  • xvii

    LIST OF ABBREVIATIONS

    ACL - Access Control List

    ACM - Access Control Mechanism

    ACP - Access Control Policy

    AES - Advanced Encryption Standard

    API - Application Programming Interface

    AWS - Amazon Web Services

    CA - Clients Admin

    CAT - Computer Associates Technologies

    CentOS - Community Enterprise Operating System

    CRC - Cyclic Redundancy Check

    CRSCG - Complex Random Security Code Generator

    CSA - Cloud Security Alliance

    CSP - Cloud Service Provider

    CSPA - Cloud Service Providers Admin

    CSSP - Cloud Storage Service Provider

    DAC - Discretionary Access Control

    DBAN - Dariks Boot and Nuke

    DSA - Digital Signature Algorithm

    ECC - Elliptic Curve Cryptography

    EJBs - Enterprise Java Beans

    FHE - Fully Homomorphic Encryption

    GCS - Google Cloud Storage

    GFIS - German Federal Office of Information Security

    HIPAA - Health Insurance Portability and Accountability Act

    HMAC - Keyed-Hash Message Authentication Code

    HTML - Hypertext Markup Language

    HTTPS - Hypertext Transfer Protocol Secure

  • xviii

    IaaS - Infrastructure as a Service

    IM - Integrity Management

    JSF - Java Server Faces

    JSP - Java Server Pages

    KM - Key Management

    MAC - Mandatory Access Control

    MITM - Man-in-the-Middle

    NAS - Network Attached Storage

    NIST - National Institute of Standards and Technology

    NSA - National Security Agency

    OS - Operating System

    PaaS - Platform as a Service

    PCI - Payment Card Industry

    PCIDSS - Payment Card Industry Data Security Standard

    RBAC - Role-based Access Control

    RSA - Rivest, Shamir and Adleman

    S3 - Simple Storage Service

    SaaS - Software as a Service

    SCSM - Secure Cloud Storage Model

    SDK - Software Development Kit

    SDLC - Software Development Life Cycle

    SE - Software Engineering

    SHA - Secure Hash Algorithm

    SLA - Service Level Agreement

    SMBs - Small and Medium Businesses

    SMS - Short Message Service

    SQL - Structured Query Language

    SSE - Server Side Encryption

    SSE-C - Server Side Encryption with Customer-Provided Key

    SSL - Secure Socket Layer

    SSO - Single Sign-On

    TCG - Trusted Computing Group

    TDEA - Triple Data Encryption Algorithm

  • xix

    TED - Trusted Extension Device

    TLS - Transport Layer Security

    TPM - Trusted Platform Module

    TTP - Trusted Third Party

    TTPA - Trusted Third Partys Admin

    TVD - Trusted Virtual Domain

    UML - Unified Modelling Language

    VF - Virtual Firewall

    VM - Virtual Machine

    VMD - Verification Metadata

    VPC - Virtual Private Cloud

    VPS - Virtual Private Server

    vTPM - Virtual Trusted Platform Module

    XHTML - Extensible Hypertext Markup Language

    XML - Extensible Markup Language

    XSS - Cross-site Scripting

  • xx

    LIST OF SYMBOLS

    | - Such That

    d - Private Key Exponent

    e - Public Key Exponent

    n - Modulus for Private and Public Key

    (n) - Phi Eulers Function

    R - Random Factor

  • xxi

    LIST OF APPENDICES

    APPENDIX

    TITLE

    PAGE

    A Papers published during the authors

    candidature

    197

    B Certificates obtained during the authors

    candidature

    200

    C Survey design and delivery 201

  • 1

    CHAPTER 1

    INTRODUCTION

    1.1 Overview

    Cloud computing is an innovative method of delivering computing resources

    (Tripathi and Mishra, 2011). It facilitates the clients to execute their enterprise

    applications and store data at third party owned servers. The cloud offers various

    service delivery models such as Software as a Service (SaaS), Platform as a Service

    (PaaS) and Infrastructure as a Service (IaaS), which are acquired by the clients

    according to their requirements (Bouayad et al., 2012). IaaS is further categorized in

    three major facilities which include compute, network, and storage.

    This research mainly focuses on storage sub-offering of IaaS, which is

    provided to clients by well-known Cloud Service Providers (CSPs) such as Amazon,

    and Google (Ghosh and Ghosh, 2012). This service facilitates the organizations to

    obtain dynamic, redundant and scalable, remotely located data storage services that

    can be easily scaled-up or down to avoid costly burdens of an under or over-utilized

    storage capacity (Jiang et al., 2013). Cloud storage services have been very useful

    for Small and Medium Businesses (SMBs) that lack capital budget to implement and

    maintain personalized storage infrastructure (Sun and Sha-sha, 2011; Deyan and

    Hong, 2012).

  • 2

    However, nowadays cloud storage is becoming a business interest for all size

    organizations that are requiring resilient data availability, business continuity, and

    disaster recovery solutions. For cloud storage clients, critical data are maintained

    and backed-up by the CSP at multiple geographically distributed locations (Zhang

    and Zhang, 2011).

    The remainder of this chapter is organized in eight sections. Section 1.2,

    describes the problem background. Section 1.3, represents the problem statement.

    The objectives, scopes, significance, and contribution of research are described in

    Sections 1.4, 1.5, 1.6 and 1.7, respectively. Section 1.8, illustrates and describes

    anatomy of the entire thesis. Section 1.9, represents the summary of this chapter.

    1.2 Problem Background

    The organizations that are required to follow well-defined data security

    standards such as the Health Insurance Portability and Accountability Act (HIPAA)

    and Payment Card Industry Data Security Standard (PCIDSS), do not trust the

    existing security techniques as well as policies offered by the CSPs (Hofmann and

    Woods, 2010; Bamiah et al., 2012; Shucheng et al., 2010). Due to lack of control on

    their confidential data while it is stored at cloud storages, clients are concerned that

    malicious users might gain illegal access to their sensitive records (Taeho et al.,

    2013).

    This research focuses on solving two major issues which are emerging

    concerns for organizations dealing with confidential data not to adopt cloud storage

    services, these include data confidentiality and integrity breaches (Syam and

    Subramanian, 2011; Gansen et al., 2010). The term data confidentiality refers to the

    concept that only authorized parties or systems have the ability to access protected

  • 3

    information. The threat of data compromise increases in the cloud environment due

    to augmented number of parties, devices and applications involved, which leads to an

    increase in the amount of access points.

    Data integrity means data can only be modified by the authorized parties. The

    concept of data integrity refers to protection of data from unauthorized deletion,

    modification or fabrication (Zissis and Lekkas, 2012). In order to further analyze the

    research problem, this research also conducted a survey from industry and academia

    based information security analysts, data auditors, cloud computing researchers,

    developers, architects and security specialists. The detailed structure of the survey is

    described in Chapter 6. The following question was mentioned in the survey to

    determine the validity and impact of the problem background of this research.

    Question: Organizations dealing with confidential data are reluctant to use

    remotely located third party cloud storage services due to emerging data

    confidentiality and integrity concerns.

    The response scale was based on three options, i.e. Agree, Neutral and

    Disagree. The survey response obtained for the research problem area, as shown in

    Table 1.1 and Figure 1.1, justifies the necessity of formulating a solution for the

    research problem, whereby 83.33% of respondents agreed that the organizations are

    reluctant to adopt cloud storage services due to emerging data confidentiality and

    integrity concerns.

  • 4

    Table 1.1: Analysis of Research Problem Area

    Answer Choices Response Rate Academia Industry Total

    Agree 83.33% 14 11 25

    Neutral 16.67% 2 3 5

    Disagree 0% 0 0 0

    Figure 1.1: Survey for Research Problem Area

  • 5

    Past studies proved that confidentiality and integrity of data stored at cloud

    computing storage is breached by external or internal attacks (Ling et al., 2011).

    External attacks are issued by outside hackers who steal clients confidential records.

    These attacks may take place by wicked IT personnel from the competitors of CSP or

    the client. The intention of these attacks is to damage the brand reputation of CSP or

    to violate the clients files. In order to defend against these attacks, CSPs normally

    secure their physical and virtual infrastructure using various tools and techniques for

    protecting clients data and their systems. However, existing solutions are not

    adequate enough to achieve the desired target (Rocha and Correia, 2011). It is also

    identified that internal employees of CSP may become malicious as well (Catteddu

    and Hogben, 2009).

    Internal attacks are placed by malicious insiders such as disgruntled

    employees of a CSP. They intentionally exceed their privileged accesses in a

    negative manner to affect the data confidentiality and integrity (Duncan et al., 2012).

    In contrast to an external hacker, malicious insiders can attack the computing

    infrastructure with relatively easy manner and less knowledge of hacking, since they

    have a detailed description of the underlying infrastructure. Without using a

    complete trustworthy solution for defending against insider attacks, malicious

    insiders can easily obtain the passwords, cryptographic keys, files and gain access to

    clients records (Rocha et al., 2011). When clients data confidentiality has been

    breached, they would never have knowledge of the unauthorized access mostly due

    to lack of control over their data and lack of transparency in the CSPs security

    practices as well as policies.

    The breach of data confidentiality and integrity creates a barrier of trust

    among clients and CSPs. Clients need to ensure that CSP will always provide the

    agreed level of service and security to protect their confidential data. Trust is

    impacted when CSPs do not meet the negotiated agreements, for example,

    implementing insufficient security techniques, storing data at invalid locations which

    are not permitted by the legal law or not complying with the standards such as

    HIPAA or PCIDSS (Khan and Malluhi, 2010). The trust issues are normally

  • 6

    mitigated by signing a legal Service Level Agreement (SLA) and granting adequate

    control to the clients on their confidential data (Xiaoyong and Junping, 2013).

    However, the existing SLAs are non-negotiable and fixed from the CSPs for every

    client which may be either an ordinary home user or a banking sector. These SLAs

    are not able to accommodate specific requirements of the organizations who are

    seeking to leverage cloud storage services for storing confidential data (Asha, 2012).

    1.3 Problem Statement

    As discussed in the problem background that cloud storages are vulnerable to

    external and internal attacks which have impacted the clients trust towards CSPs for

    shifting their confidential data at third party cloud storages. Existing network

    security solutions are not able to overcome cloud storage data confidentiality and

    integrity violating threats (Nirmala et al., 2013). Considering these issues, the

    problem statement of research is mentioned as follows:

    How to develop a secure cloud storage model that preserves data

    confidentiality and integrity as well as ensures the delivery of trusted services to the

    clients by considering their data security policies?

    Several research questions can be extracted from the problem statement,

    which are mentioned as follows:

    i. What are the existing security models that have been designed, developed

    or proposed by the industry and academia researchers to overcome data

    confidentiality and integrity concerns for using cloud storage services?

  • 7

    ii. What are the limitations of existing industry and academia implemented

    cloud storage models that raise confidentiality and integrity issues which

    prevent organizations dealing with sensitive data from adopting cloud

    storage services?

    iii. How to design a model that preserves data confidentiality and integrity at

    cloud storages as well as ensures the delivery of trusted services to the

    clients?

    iv. How to develop a model that enables the clients to store and process their

    data at cloud storages with consistent data integrity, confidentiality and

    trust?

    v. How to verify that the implemented cloud storage model is successful in

    preserving the confidentiality and integrity of sensitive data, and ensuring

    the delivery of trusted services to the clients?

    1.4 Research Objectives

    The aim of this research is to develop a security model that overcomes the

    data confidentiality and integrity concerns for using cloud storage services as well as

    for ensuring the delivery of trusted services to the clients by considering their data

    security policies. The targeted aim will be achieved by completing the following

    research objectives:

  • 8

    i. To investigate and obtain in-depth understanding of existing security

    models that have been proposed by the industry and academia researchers

    to overcome data confidentiality and integrity concerns for using cloud

    storage services.

    ii. To critically analyze as well as explain the limitations or gaps which have

    been identified in the existing industry and academia implemented secure

    cloud storage models.

    iii. To design an improved and enhanced secure cloud storage model which

    preserves data confidentiality and integrity, as well as ensures the delivery

    of trusted services to the clients by considering their data security

    policies.

    iv. To implement and deploy a web-based prototype on a cloud computing

    infrastructure which facilitates the clients to store and process their data at

    cloud storages with consistent data confidentiality, integrity and trust

    assurance.

    v. To evaluate the developed cloud storage model in order to ensure that it

    overcomes or mitigates the data confidentiality and integrity concerns,

    and gains the trust of organizations dealing with sensitive data to adopt

    cloud storage services.

  • 9

    1.5 Scope of Research

    Cloud reference architecture consists of three service delivery (SaaS, PaaS,

    and IaaS) and four deployment models (Public, Private, Hybrid, and Community)

    (Mell and Grance, 2011). Since cloud computing is a vast area of research, this

    study only focuses on IaaS. Furthermore, IaaS providers offer compute, network and

    storage services to the clients. This research considers security of a cloud storage

    that resides at data center of a CSP. Security has several perspectives when it comes

    to research and development. This research considers confidentiality and integrity

    parameters of security as the major problems to be solved. This research assumed

    that breach of data confidentiality and integrity will impact the clients trust for using

    cloud storage services. In order to achieve clients trust, data confidentiality and

    integrity must be protected, and CSP must always ensure the delivery of trusted

    cloud storage services to the clients. Therefore, in this thesis, trust do not refers to

    the concept of trusted computing.

    However, this research assumed that users may be required to use trusted

    platforms for using cloud storage services. For example, Trusted Extension Device

    (TED) and Trusted Platform Module (TPM) can be used by the clients to protect

    their devices. In a cloud computing environment, system performance is also

    considered as a significant factor, but SCSM was designed and developed mainly by

    considering the security requirements of the organizations dealing with highly

    confidential data. We believe that the identified research problem was not possible

    to be solved just by providing encryption and data auditing approaches. Therefore,

    our research scope focuses on providing a complete secure process that is comprised

    of a set of five components which include Multi-factor authentication and

    authorization process using Role-Based Access Control (RBAC) with Complex

    Random Security Code Generator (CRSCG), Partial homomorphic cryptography,

    Trusted Third Party (TTP) services including Key Management (KM) approach and

    data auditing process, implementation of 256-bit Secure Socket Layer (SSL) and

    SLA. This research also focuses on the deployment of the research contribution

  • 10

    Secure Cloud Storage Model (SCSM) on a cloud computing infrastructure in order to

    obtain authentic evaluation results.

    1.6 Significance of Research

    When objectives of the research are successfully accomplished, the

    development of SCSM can be considered as one of the valuable contributions in the

    field of cloud computing security, since it will overcome the existing data

    confidentiality and integrity concerns by providing trusted and secure cloud storage

    services to the clients. Contribution of this research will be beneficial for both, client

    organizations and CSPs. Clients will adopt cost-effective storage solutions in order

    to store their confidential data for high availability, accessibility, secure backup and

    recovery. Alternatively, CSPs will adopt this solution to overcome the limitations of

    their existing cloud storage services and to gain clients trust. This research expects

    that adoption of cloud storage service will rapidly increase with the successful

    implementation and deployment of SCSM at the industry level.

    1.7 Contribution of Research

    The advent of cloud computing brought up enormous challenges for the

    software engineers to design as well as develop secure cloud applications, platforms,

    and infrastructures that deal with the storage of mission critical data. In the domain

    of Software Engineering (SE), information security engineers apply security

    principles at each stage of the Software Development Life Cycle (SDLC) from

    requirements analysis until development and deployment phases. They are also

    responsible to analyze and test the security of their developed cloud based solutions

  • 11

    (Zingham and Saqib, 2013). This research adopted a SE approach by designing,

    developing, deploying and analyzing the requirements of secure cloud storages.

    Therefore, this research contributed in the field of SE by completing those

    requirements which actually fall under the responsibilities of information security

    engineers for developing secure cloud storage services. The final contribution

    produced by this research as software will introduce a novel SE approach to develop

    complex confidentiality and integrity preserved cloud storage systems.

    1.8 Thesis Organization

    This thesis explores an emerging area of cloud security research focusing on

    data confidentiality and integrity concerns for using cloud storage services. The

    complete research is organized in seven chapters. Figure 1.2, shows the flow of

    thesis organization. Chapter 1 represents the significance of this research mainly by

    clarifying the research problem area, scope, contributions and objectives. An in-

    depth analysis of existing literature is provided in Chapter 2, which covers cloud

    security techniques and models provided by various researchers to solve the existing

    cloud storage security problems. Chapter 2 also covers the critical analysis on the

    limitations and strengths of industry and academia implemented contributions.

    Chapter 3 describes the entire research methodology used systematically for

    accomplishing each research objective. Description and design of the SCSM are

    provided in Chapter 4. Each component of SCSM is discussed with technical as well

    as theoretical details. SCSM is designed using architecture, use-case and sequence

    diagram, in-addition to the construction of an effective SLA. Chapter 5 describes the

    development details of SCSM implementation as a web-based prototype. Entire

    system workflow is described using user interface snapshots. System deployment

    details at the real cloud computing infrastructure are also described in Chapter 5.

    The evaluation process and results for the entire process as well as the each

    component of SCSM are described in Chapter 6. The applications of SCSM, overall

  • 12

    research conclusion, limitations and future direction, are critically discussed and

    justified in Chapter 7.

    Figure 1.2: Thesis Organization

    CHAPTER 2

    Literature Review

    CHAPTER 3

    Research

    Methodology

    CHAPTER 7

    Conclusion and Future

    Work

    CHAPTER 1

    Introduction

    CHAPTER 6

    Evaluation and

    Results

    CHAPTER 4

    Secure Cloud Storage Model

    CHAPTER 5

    Implementation of the Secure

    Cloud Storage Model

  • 13

    1.9 Summary

    Cloud storage service is sub-category of IaaS which is provided to

    organizations for storing large amounts of data with unlimited capacity, broad

    accessibility, resilient availability, disaster recovery, and cost-effectiveness features.

    However, organizations dealing with confidential data are reluctant to adopt remotely

    located cloud storage services due to emerging data confidentiality and integrity

    concerns which have created a barrier of trust among the CSPs and clients. In order

    to overcome the mentioned problem, this research aims to provide an improved as

    well as enhanced solution for designing as well as developing confidentiality and

    integrity preserved secure model to use cloud storage services by accomplishing the

    research objectives. The successful implementation as well as deployment of SCSM

    at the industry level will assist CSPs to adopt this solution for offering secure and

    trusted cloud storage services to the business organizations.

  • 14

    CHAPTER 2

    LITERATURE REVIEW

    2.1 Introduction

    National Institute of Standards and Technology (NIST) defines cloud

    computing as A model for enabling ubiquitous, convenient, on-demand network

    access to a shared pool of configurable computing resources e.g., networks, servers,

    storage, applications, and services, that can be rapidly provisioned and released

    with minimal management effort or service provider interaction (Mell and Grance,

    2011). A cloud model is composed of five essential characteristics defined as

    follows:

    i. On-demand Self-service: Consumers can automatically provision the

    computing capabilities such as server time and network storage

    without human interaction.

    ii. Broad Network Access: Services are provided to a large community

    of users over the internet for ubiquitous and pervasive access via web-

    browser.

  • 15

    iii. Resource Pooling: CSPs pool their resources to server multiple

    clients by developing a multi-tenant architecture using virtualization

    tools and technologies.

    iv. Rapid Elasticity: Clients can easily scale up and down their service

    capabilities by requesting the CSP without being engaged in physical

    efforts.

    v. Measured Service: Usage of resources such as storage, bandwidth,

    and processing, are automatically monitored, controlled, reported and

    optimized to provide efficient services. (Mell and Grance, 2011)

    The remainder of this chapter is organized in eleven sections. Section 2.2,

    describes the cloud deployment models. Section 2.3, describes the cloud service

    delivery models. Section 2.4, describes the cloud storage services, its concepts,

    advantages and the adopting organizations. Data security concerns for using cloud

    storage services, and the protection mechanisms used to overcome those concerns are

    described in Sections 2.5 and 2.6, respectively. Section 2.7, describes the leading

    industry implemented cloud storage services. Section 2.8, identifies and analyses the

    limitations or vulnerabilities of industry implemented cloud storage services. Section

    2.9, presents the related work of various researchers who designed or developed

    security models to overcome confidentiality and integrity concerns for using cloud

    storage services. Section 2.10, critically analyses the reviewed research

    contributions and determines their limitations. Section 2.11, describes the strengths

    of related work and complete roadmap of this research. Section 2.12, presents the

    summary of this chapter.

  • 16

    2.2 Cloud Deployment Models

    Cloud services are mainly provided via Private, Public, Hybrid, and

    Community cloud deployment models. Adoption of these models depends on the

    security, privacy, performance, flexibility and scalability requirements of an

    organization (Keung and Kwok, 2012; Soni et al., 2013). However, if one model

    does not well-suit the demands of an organization, they can also mix-up among

    multiple models which is considered as a hybrid cloud deployment model (Savu,

    2011). Cloud deployment models are described in the following sub-sections.

    2.2.1 Public Cloud

    Public cloud infrastructure is managed and operated by the CSP. It is offered

    to a wide range of registered users, and it is configured as well as settled at on-

    premises of the CSP. Resources such as storage, application or servers are provided

    to clients without requiring them to build, maintain or monitor a personalized IT

    infrastructure (Wang et al., 2013). They are also not required to pay for the software

    licensing or the hardware purchasing cost. Using this deployment model, everything

    (SaaS, PaaS and IaaS) is provided from the CSPs such as Amazon, Google and

    Microsoft (Ang et al., 2011). Clients can easily access the public cloud services over

    the internet using a web-browser. They are only required to pay according to the

    usage of their services, and resources are highly flexible, which can be easily scaled

    up and down at any time with minimal interaction, hence it saves organizations from

    paying for the complexities of managing an over or under-utilized computing

    infrastructure. Users of public cloud also enjoy the benefits of resilient availability.

    For instance, Amazon guarantees 99.99% service uptime (Amazon, 2013). However,

    using public cloud, clients do not gain direct control over their data or services and

    there is no transparency of cloud security standards, techniques, policies or

    procedures for the client to trust the public cloud services (Huaqun, 2013) (Astrova

    et al., 2012).

  • 17

    2.2.2 Private Cloud

    A private cloud infrastructure is managed and operated by the client

    organization, CSP or in certain cases both. Unlike public, the private cloud belongs

    exclusively to a single client (Stipic and Bronzin, 2012). A business organization

    such as education, banking or healthcare sector, can use their existing in-house

    infrastructure to build a cost-effective private cloud using the supporting software

    stack which includes Eucalyptus or OpenStack (Loewen et al., 2013) (Baun and

    Kunze, 2009). Alternatively, if an organization does not own an on-premises

    infrastructure or they lack supporting IT staff to build a private cloud, they can also

    acquire it from the CSP. Normally the private cloud offered by CSPs is the Virtual

    Private Cloud (VPC).

    A VPC is an on-demand configurable pool of shared computing resources

    allocated within a public cloud environment. However, it is isolated from other parts

    of a public cloud infrastructure using encrypted communication channels, Access

    Control Mechanisms (ACMs), Trusted Virtual Domain (TVD), and Virtual Firewall

    (VF) (Mishra et al., 2013). Using a private cloud, clients have a greater degree of

    control on their data and services since data reside at on-premises of clients

    organization and resources are controlled as well operated by internal IT staff, but

    private cloud is less scalable compared to public or VPC. Alternatively, a VPC is

    secure compared to a public cloud and client has minimal control over their acquired

    space, but since it is off-premises cloud service for the client organizations, they still

    feel lack of direct control over their personal data and computing resources (Dillon et

    al., 2010).

  • 18

    2.2.3 Hybrid Cloud

    Hybrid cloud is the combination of multiple distinct cloud models (Private,

    Public or Community) to create a customized solution based on the requirements of

    an organization (Mell and Grance, 2011). This provides an opportunity for the

    organizations to store sensitive information and mission critical processes in a private

    cloud, and non-critical information as well as processes in a public cloud or to use

    different cloud models for backup and disaster recovery (Yen-Hung et al., 2013). A

    hybrid cloud solution is also useful, when an organization has started a new business

    and they require certain systems to serve their customers, but they do not possess an

    existing IT infrastructure as well as technical IT staff to deploy, monitor and manage

    their systems.

    They might decide to handle these tasks to a third party CSP by leveraging

    public cloud services without investing in building an in-house infrastructure, as a

    result, they will gain a cost-effective IT solution and the opportunity to focus on their

    core business. However, with the passage of time, as their customers growth rate

    increases, they will decide to initiate their own IT department and build on-premises

    private cloud as a long term business strategy. The organization will host their

    internal systems at on-premises infrastructure but still continue to use SaaS and

    storage from CSP, in this case their computing infrastructure will be considered as a

    hybrid cloud (Marston et al., 2011). The security of a hybrid cloud depends on the

    type of cloud models which are used as mixed and the approach of adoption such as

    on or off-premises.

  • 19

    2.2.4 Community Cloud

    A community cloud infrastructure is provisioned for exclusive use by a

    specific group of organizations. Members of community cloud may belong to a

    combination of hospitals, banks or universities that have a common mission, security

    requirements, policy, and compliance considerations (CSA, 2011). It may be

    managed and operated by one or more entities in the community, a third party CSP

    or combination of them, and it may exists on or off-premises (Jadeja and Modi,

    2012). In community cloud, organizations have ubiquitous access to shared

    information. For example, using a healthcare community cloud, one hospital can

    access the shared records of patients from another hospital.

    A community cloud is smaller than public but larger than a private cloud.

    Since it is constructed by several organizations collaborating together to achieve

    economies of scale (Gall et al., 2013; Sattiraju et al., 2013). Using this model,

    organizations have complete control over their management responsibilities as they

    can control, which other organizations are permitted to join the community

    infrastructure. Organizations also have enough control over their resource and they

    trust the security standards used, especially when it is located at an on-premises

    datacentre. Organizations are also able to manage their data sharing policies.

    However, allocation of cost, responsibilities, governance, security and controlling

    multiple user access points are the main challenges to be faced while acquiring a

    community cloud (Sathiyapriya et al., 2013).

  • 20

    2.3 Cloud Service Delivery Models

    CSPs offer variety of services such as remotely located business processing

    applications, integration and development tools, flexible and scalable storages to the

    clients. These services are obtained by the clients using SaaS, PaaS, and IaaS, which

    are delivered via private, public, hybrid, and community deployment models with

    minimal interaction as well as management responsibilities. Cloud service delivery

    models are described in the following sub-sections.

    2.3.1 Software as a Service

    SaaS refers to web-based applications such as word-processing, human

    resource and customer relationship management systems, running on the CSPs

    infrastructure. These applications are accessed using a web-browser over the internet

    (Junjie et al., 2009). Users of SaaS are charged according to pay-as-you-go billing

    model and they do not manage or control the underlying cloud infrastructure besides

    having limited permissions for application configuration settings. SaaS applications

    are gaining rapid popularity and they are provided by the well-known companies

    such as Microsoft and Salesforce. SaaS bear advantages for clients and providers,

    the key benefits for providers include rapid deployment, better user adoption, and

    reduced support needs. For customers, key benefits of SaaS include lower IT cost

    and faster access to new technology, functionality and upgrades (Pang Xiong and Li,

    2013). However, due to clients lack of control on their enterprise applications,

    security and privacy are the top concerns preventing firms from adopting SaaS (Yu-

    Hui, 2011).

  • 21

    2.3.2 Platform as a Service

    PaaS facilitates the clients to rapidly develop, deploy and test applications on

    a cloud infrastructure using Application Programming Interfaces (APIs), libraries,

    and tools, provided by the CSPs without buying or maintaining the underlying

    infrastructure (Zeng and Xu, 2010). PaaS clients have full control over their

    deployed applications and adequate control over the application hosting platforms.

    PaaS tools can play an important role in software development stages. Nowadays,

    there is a variety of PaaS tools offered by the CSPs. These include Cloud Foundry,

    Azure, OpenShift, Google App Engine, AppFog, Cloudify and Heroku etc. PaaS

    offerings provide plenty of features that will appeal to any developer working in a

    cloud environment. However, there are some challenges faced in adoption of PaaS

    mainly due to the required learning curve, since developers are not ready to use these

    new tools beside the traditional ones they are already familiar with (Cohen, 2013).

    2.3.3 Infrastructure as a Service

    IaaS is a method of delivering resources such as servers, storage, network and

    operating systems (OSs) as on-demand services (Xing et al., 2012). IaaS clients are

    not required to build private datacentres, maintain servers, storage or networks, also

    they do not manage or control the underlying cloud infrastructure but they have

    control over their OSs, storage, deployed applications and possibly limited control of

    selected networking components e.g., host firewalls (Mell and Grance, 2011; CSA,

    2011). IaaS is offered from companies such as Amazon, Rackspace, HP and IBM.

    Business organizations can achieve high availability, reliability and disaster recovery

    solutions by leveraging IaaS. However, security policies, governance and lack of

    awareness about the physical location of data remain the main barriers in adopting

    IaaS (Gibson et al., 2012).

  • 22

    2.4 Cloud Storage Services

    Cloud storage services are offered via public cloud deployment models and

    they do not require setup, configuration or installation of a personalized IT

    infrastructure at the client organizations. These services are remotely acquired from

    a third party CSP such as Amazon, and Google at any time with on-demand

    unlimited capacity. For example, Amazon Simple Storage Service (S3) offers data

    storage that is easily scalable without physical interaction which can shrink and grow

    as per clients requirements and it does not require capital investment. Users are

    normally charged according to pay-per-use billing model (Mazhelis et al., 2012).

    Beside the cost-effectiveness, ease-of-use and accessibility, cloud storage services

    also involve resilient availability and disaster recovery solutions for the organizations

    (Ullrich et al., 2012). Typically, CSPs utilize cost-effective redundant storage

    hardware which overcomes the issues of interrupted service during a planned or

    accidental outage, i.e. the scheduled maintenance or upgrades. Amazons claim for

    providing 99.99% uptime is visible proof of resilient cloud storage service

    availability (Amazon, 2013).

    Organizations that are not dealing with sensitive data, are rapidly adopting

    cloud storage services for disaster recovery solutions. Under some unwanted

    circumstances such as natural disasters, they can easily recover their data from the

    backup storage without loss or damage to the information. Normally, CSPs create

    redundant backup for the clients data at various backup zones by considering the

    local data protection law. Due to these backup zones, clients data will be always

    protected and returned as intact. In case of a disaster, IT staff from a CSP can restore

    the data back to cloud from local storage, which may reside at geographically located

    backup zones (Javaraiah, 2011). However, compared to traditional storage methods,

    cloud storages possess new challenges in data security, reliability, and management.

    CSPs use numbers of heterogeneous storage devices which work together to provide

    data storage and business functions. While using cloud storage services, clients are

    not aware about the details of security controls used to protect their data (Zhang and

    Zhang, 2011).

  • 23

    Cloud storages are yet not being adopted by the organizations dealing with the

    confidential data due to emerging data confidentiality and integrity problems (Chirag

    et al., 2012). In order to enhance adoption of cloud storage services by all type

    business organizations, there is tremendous opportunity for this research to formulate

    a valuable contribution to overcome data confidentiality and integrity concerns.

    2.5 Cloud Storage Data Security Concerns

    Data stored in third party cloud storages might not be secure due to the

    absence of ensuring confidentiality and integrity preserved services. Although, CSPs

    are offering cost-effective cloud storage services, since it is a remotely located

    facility, clients cannot trust the CSP or feel satisfied that their data are always secure

    at the cloud. Hence, organizations dealing with confidential data are not willing to

    adopt cloud storage services until certain guarantees are achieved (Karumanchi,

    2010). Information confidentiality and integrity concerns for using remotely located

    public cloud storage services are described in the following sub-sections.

    2.5.1 Data Confidentiality

    In a cloud computing environment, confidentiality is a key concern when data

    are of sensitive nature that includes information related to health, political opinions,

    religious and personal beliefs. The organizations such as banking, healthcare and

    Payment Card Industry (PCI) have strict data confidentiality requirements, so they

    will have numerous anxieties for storing their sensitive records at a remotely located

    third party owned cloud storages (Hofmann and Woods, 2010).

  • 24

    Since by acquiring and using cloud storage services, clients lose direct control

    over their data, it will open new opportunities for illegal authorities to access their

    personal records. Confidentiality of data can be breached in-transit or while it is

    stored at the cloud storage. When clients send data to the cloud, it may be attacked

    by Man-in-the-Middle (MITM) or personal records can be viewed by the CSP which

    will lead to a breach of confidentiality. Encryption is recommended as the

    fundamental approach to preserve data confidentiality while using cloud storages

    (Karumanchi, 2010). However, implementation of improper security procedures

    such as ineffective KM approaches may result into a vulnerable and insecure cipher.

    Data of a cloud user must be accessed only by the authorized parties as specified in

    the SLA. Alternatively, CSP and other involved organizations such as TTP must not

    exceed their privileges to view or distribute confidential data of clients to

    unauthorized parties.

    2.5.2 Data Integrity

    Although data confidentiality at cloud storage can be preserved by

    implementing encryption with effective KM approaches, however, these techniques

    cannot guarantee that the integrity of data will always remain intact. Data integrity

    can be breached when data are modified by an external hacker or a malicious insider

    even when it is in cipher format. In order to defend data integrity violations, security

    approaches such as ACMs and data auditing services should be implemented. Data

    owners must hire a TTP to monitor their data on cloud and to ensure that their

    sensitive records are protected from integrity violations (Stamou et al., 2012). TTP

    can verify the integrity of clients data using applications which are based on the

    implementations of Digital Signature Algorithm (DSA). CSPs can also play an

    integral role in sustaining a safe computing environment for storing clients

    confidential data by enforcing ACMs which are specified from the data owners

    considering their data security policies (Cong et al., 2013). Clients trust on cloud is

  • 25

    breached when the SLA, terms or conditions are not followed or accomplished

    (Zissis and Lekkas, 2012).

    2.6 Data Protection Mechanisms for Cloud Storages

    In order to overcome the issues of breaching data confidentiality and integrity

    in a cloud computing environment, researchers have designed and developed security

    models using various data protection mechanisms which include cryptography and

    KM, trusted computing, ACMs, SLA, and data auditing services. The significances

    of these data protection mechanisms to secure cloud storage services are described in

    the following sub-sections.

    2.6.1 Cryptography and Key Management

    The protection of data against the loss and theft is a shared responsibility of

    cloud customer and CSP. Nowadays, encryption is one of the strongly recommended

    techniques specified in cloud SLAs (Jansen and Grance, 2011). However, simply

    encryption is not enough to secure the data. There must be proper KM practices to

    ensure safe and legal access of encryption keys. Keys must be protected with the

    same significance as the data itself and they should be accessed only by limited and

    authorized personalities. Proper procedures must be followed, if encryption keys are

    lost or stolen (CSA, 2011).

  • 26

    Triple Data Encryption Algorithm (TDEA) and Advanced Encryption

    Standard (AES), are vastly implemented symmetric algorithms used for protecting

    data at cloud storages. These algorithms use a single secret key for performing

    encryption and decryption processes. From asymmetric cryptography, Rivest,

    Shamir and Adleman (RSA) cryptography, and Elliptic Curve Cryptography (ECC)

    are mostly used encryption techniques to protect data at cloud. Unlike symmetric,

    these methods use two different keys, a public key for encryption and a private key

    for decryption (Jing-Jang et al., 2011). If encryption practices are followed

    accurately, data will be protected from illegal access or theft by malicious employees

    of a CSP and external adversaries.

    2.6.2 Trusted Computing

    Trusted computing is a term that refers to technology and proposals for

    resolving computer security problems through hardware augmentations and related

    software amendments. Several well-known hardware manufacturers and software

    companies, jointly known as the Trusted Computing Group (TCG), are working for

    trusted computing and they have come up with significant developments to enhance

    computing security. TCG developed a set of hardware and software technologies to

    enable the construction of trusted platforms (Karumanchi, 2010). Trusted computing

    is a promising technology to mitigate the novel security challenges existing in cloud

    computing infrastructures using TCG products. The use of approaches such as TPM,

    and Virtual Trusted Platform Module (vTPM) can significantly improve the security

    of cloud services. For example, TPM can be used by the CSPs to secure cloud

    storage servers. In order to protect client Virtual Machines (VMs) that are residing at

    a remote cloud platform, each VM can be associated with a vTPM instance which

    emulates the TPM functionality to extend the chain of trust from physical TPM.

    Similarly, TED (a portable device containing the functionality of TPM) which was

    introduced by Nepal et al., (2007), can be used by the clients for remote platform

    attestation and integrity verification tasks.

  • 27

    2.6.3 Access Control Mechanisms

    Access control is the process of limiting system access to only authorized

    people, programs, processes or other system components. ACMs are responsible for

    protecting cloud storage by limiting, denying or restricting access to a system or an

    entity according to the well-defined security policies (Afoulk et al., 2012). Most

    common ACMs used in a cloud computing environment include Mandatory Access

    Control (MAC), Discretionary Access Control (DAC), and RBAC (Wei et al., 2012).

    All these techniques are known as identity based ACMs, as user subjects and

    resources objects are identified by unique names. Identification may be done directly

    or through roles assigned to subjects. ACMs guarantee integrity and confidentiality

    of the resources. ACMs must be implemented by the CSP or third party in

    association with the cloud user (Khan, 2012).

    2.6.4 Service Level Agreement

    SLA is a formal document which contains the terms and condition for using

    cloud services. It documents common understandings about priorities,

    responsibilities, and guarantees. The main objective of SLA is to reduce key areas of

    potential conflicts and to identify their resolution before they get materialized. Each

    CSP offers different SLA structure, service offerings, negotiation opportunities and

    performance levels. SLA can be used to select a CSP on the basis of data protection,

    continuity, and cost. In order to avoid unwanted situations, client and the CSP

    should involve a TTP who can monitor the provided service and undertake necessary

    steps independently by considering the specified terms and conditions in cases of

    service violation (Stamou et al., 2012). TTP will ensure the delivery of required

    service from CSP using metrics such as throughput, response, availability, reliability,

    and data security controls (Ghosh and Ghosh, 2012). However, an SLA should not

  • 28

    only validate and penalized the CSP but it must also involve the possible penalties

    for malicious client activities (Kandukuri et al., 2009).

    2.6.5 Data Auditing Services

    When clients transfer their confidential data to cloud storages, they lose

    physical possession over it, which raises serious concerns of data integrity protection,

    and makes it a challenging task. While using cloud storage services, users should be

    able to access and process their data in the similar manner as they do on their

    personalized system without worrying about the requirements of verifying its

    integrity. Hence, facilitating public auditability for cloud storage is a vital task to

    ensure protection of data residing at third party storages. CSPs should negotiate with

    clients to allow a third party auditor for checking the integrity of outsourced data

    (Nithiavathy, 2013). However, the auditing process should not bring new

    vulnerabilities towards users data confidentiality. The process of data verification is

    very significant while using cloud storages, since there are possible chances of

    threats from external hackers and malicious insiders such as disgruntled employees

    from a CSP. They may illegally or intentionally delete or modify the clients

    records. The third party auditor can periodically conduct the auditing services on

    behalf of clients to ensure that their data are stored with the maintained integrity. In

    cases of integrity violation, auditor should report to client and take appropriate steps

    as specified in the SLA (Cong et al., 2013).

  • 29

    2.7 Industry Based Implementations of Cloud Storage Services

    The demand for adopting cloud storage services is rapidly increasing due to its

    advantages such as cost-effectiveness, scalability, backup, and disaster recovery.

    However, the clients dealing with mission critical data are reluctant to move their

    sensitive records at external third-party owned infrastructures due to data security

    concerns (Taeho et al., 2013; Syam and Subramanian, 2011; Gansen et al., 2010).

    Numerous industries are developing security models for cloud storages to preserve

    data confidentiality and integrity. This research focused on reviewing and analyzing

    the security of cloud storage solutions developed by Amazon and Google since they

    are the leading cloud storage providers, and they are quite transparent when it comes

    to the discussion of their security mechanisms and approaches used for protecting

    clients data at cloud storages (Eric, 2013). The cloud security mechanisms as well

    as approaches used by Amazon and Google are described in following sub-sections.

    2.7.1 Amazon Simple Storage Service

    Amazon S3 is scalable, reliable and low-latency data storage infrastructure. It

    supports a simple web services interface which is used to store and retrieve unlimited

    data ubiquitously and pervasively. Users of S3 are required to follow pay-per-use

    billing model, and the provided service is efficiently flexible. Amazon claims to

    consider S3 as a highly durable storage infrastructure designed for mission critical

    and primary data storage (Amazon, 2014). S3 supports user authentication process

    for controlling access to confidential data. Considering the data security approaches

    of S3, only data owners have access to their personal resources. The clients can use

    ACMs such as bucket policies and Access Control List (ACL) to selectively grant

    permissions to the users and groups. Amazon further strengthens the authentication

    process for their customers by adding an extra layer of security to the system. The

    users are required to provide a six digit single-use code in addition to their standard

  • 30

    username and password credentials before access is granted to their services and

    resources. Customers retrieve this code from an authentication device which they

    keep in their physical possession. This process is called multi-factor authentication

    because two different factors are checked before the access is granted (Amazon,

    2011).

    Figure 2.1: Server Side Encryption

    (Jeff, 2011)

    Furthermore, data is protected in-transit during upload and download

    operations, as well as at rest. Clients can protect their data in transit using 256-bit

    SSL, and data at rest can be protected using Server Side Encryption (SSE) or the

    Server Side Encryption with Customer-Provided Key (SSE-C). The users of S3 can

  • 31

    use third party libraries to encrypt the data before storing it to the cloud storage.

    Selecting an encryption method is based on preferences and requirements of the

    clients (Amazon, 2014). Both SSE and SSE-C are used to encrypt data at rest, and

    they are based on using 256-bit AES algorithm. S3 encrypts the data when it is

    written to the disk and decrypts it when a client submits a request to access it, as

    shown in Figure 2.1.

    Using SSE, a client will not realize any difference in accessing the encrypted

    data as long as S3 authenticates the client and validate his/her access privileges.

    Whereas, using SSE-C, client provides the encryption key to the server as a part of

    the request. S3 performs encryption when data is written to the disk, and performs

    decryption, when client accesses the objects. Therefore, clients are not required to

    maintain any source code to encrypt or decrypt data manually. The only task they

    require is to manage the encryption keys. When a client uploads an object, S3 uses

    the provided encryption key to apply AES-256 encryption on data as shown in Figure

    2.2.

    Figure 2.2: Encryption with Clients Key

    (Jeff, 2014)

  • 32

    S3 does not store the clients provided encryption key. Instead, it stores a

    randomly salted Keyed-Hash Message Authentication Code (HMAC) value of the

    encryption key in order to validate the future requests. The salted HMAC value

    cannot be used to derive value of the encryption key or to decrypt the contents of an

    encrypted object, therefore if client loses the encryption key, the cipher data will be

    useless. Unlike SSE, using S3, clients can use third party libraries or Amazon

    Software Development Kit (SDK) to encrypt their data using client side encryption

    before sending it to the cloud storage. Clients can apply any encryption algorithm to

    their data. The private keys of the client and unencrypted data are never sent to S3.

    Therefore, it is mandatory for the clients to safely manage their keys. For client side

    encryption, the Amazon Web Services (AWS) SDK uses a process called envelope

    encryption. Using this technique, client provides the encryption key to S3 encryption

    module, and S3 performs the entire encryption process, as shown in Figure 2.3.

    Figure 2.3: Client Side Encryption

    (Jeff, 2011a)

  • 33

    Initially, during the encryption process, the S3 encryption module generates a

    one-time-use 256-AES symmetric key known as the envelope key, S3 uses this key

    to encrypt clients data, and then encrypts the envelope key using clients private key

    (Amazon, 2014). Client uploads the encrypted envelope key along with the

    encrypted data to S3. During the retrieving and decrypting process, S3 retrieves

    clients encrypted data from the server along with the encrypted envelope key. S3

    then decrypts the envelope key using clients private encryption key, and finally it

    decrypts clients data using the envelope key. In this process, if encryption keys are

    lost, clients will not be able to decrypt their data. Beside these encryption and

    decryption approaches, S3 uses integrity verification techniques to verify the

    correctness of data stored at the cloud. It uses a combination of checksums and

    Cyclic Redundancy Checks (CRCs) to detect data corruption. S3 performs

    checksums on data at rest and repairs any corruption using redundant data.

    Moreover, S3 server also calculates checksums on the entire network traffic to detect

    corruption of data packets when storing or retrieving data. The data protection

    approaches of Amazon are also in compliance with its associated SLA which mainly

    specifies the service commitments regarding durability and availability of service

    (Amazon, 2014).

    2.7.2 Google Cloud Storage

    The data storage process can be time-consuming and costly since it includes

    maintaining data servers, storage disks, firewalls, backup copies and disaster

    recovery provisions. Google Cloud Storage (GSC) reduces these burdens on

    individuals as well as organization, allowing them to store, retrieve, share, and

    analyze their data without worrying about maintenance, hardware and firmware

    upgrades. It is massively scalable, users can store and process terabytes of data to

    support big data scenarios required by scientific, financial analysis and media

    applications or they can store small amounts of data required for light business

    websites. GCS is elastic, so users can design applications for a large global

  • 34

    audience, and scale those applications as desired. Users are required to pay only for

    what and when they use. It supports a simple programming interface which enables

    the developers to take advantage of Googles own reliable and fast network

    infrastructure to perform data operations in a secure and cost-effective manner

    (Google, 2012). The GCS users can move their data from Amazon S3 to the google

    data storage through migration pipeline as shown in Figure 2.4.

    Figure 2.4: Data Migration Process

    (Google, 2013)

  • 35

    Beside these advantages of GCS, it also involves strict security policies and

    approaches to protect confidential data of the users from external as well as internal

    threats such as disgruntled employees. Google requires the use of a unique user

    identification number for each employee. It is used to identify each persons activity

    on Googles network, including any access to data of employees or customers.

    During the hiring process, an employee is assigned an identification number by

    Human Resources (HR) system and granted with default set of privileges. At the end

    of a persons employment, his/her accounts access to Googles network is disabled

    from the HR system. Google makes widespread use of two-factor authentication

    mechanisms, such as certificates and one-time password generators. Two-factor

    authentication is required for accessing the production environments and resources

    through Googles Single Sign-On (SSO) system. Access rights and levels are based

    on an employees job function and role using the least-privileges concept (Google,

    2012a).

    GCS also uses ACLs to manage access to objects and buckets. ACLs are the

    mechanisms that customers use to share objects with other users and allow them to

    access their buckets as well as objects. An ACL consists of one or more entries,

    where each entry grants permissions to a scope. Permissions define the actions that

    can be performed against an object or bucket for example read or write tasks,

    whereas the scope defines who the permission applies to i.e. a specific user or group

    of users. When a user requests access to an object or bucket, the GCS system reads

    the ACL on the object or bucket and determines whether to allow or reject the

    request. If ACL grants the user permission for the requested operation, the task is

    performed. Users are able to share data with authorized colleagues and partners

    through ACLs and they can control the access to their confidential records. If ACL

    does not grant the user permission for the requested operation, the request will fail

    and a forbidden error or access denied message will be returned (Google, 2012a).

    GCS users are also able to grant access of their objects to the users who do not have

    GCS accounts as these types of users can access the objects by authenticating via

    simple Google accounts. For example, in order to permit a user Jane to download an

    object from a bucket, the object owner is first required to grant Jane read permission

  • 36

    for that object and then to provide her the resource link. When Jane opens the link in

    her browser she will be automatically prompted to sign-in to her Google account.

    After she is authenticated, and her browser has acquired a cookie with an

    encapsulated identity token, she will be redirected to the object in GCS repository.

    GCS then verifies that Jane is allowed to read the object, and then object is

    downloaded to Janes computer (Google, 2014). This process of authorization is

    shown in Figure 2.5.

    Figure 2.5: Authentication Process

    (Google, 2014)

    GCS is based on the use of Hypertext Transfer Protocol Secure (HTTPS) by

    configuring 256-bit SSL to establish a secure communication channel. Information

    sent via HTTPS is encrypted from the time it leaves GCS until it is received by the

    recipients computer. GCS automatically encrypts all data before it is written to disk,

  • 37

    at no additional charge. There is no setup or configuration required, and no need to

    modify the way customers access their services. Data is automatically and

    transparently decrypted when read by an authorized user. If customers require

    encryption for their data, this functionality frees them from the hassle and risk of

    managing personal encryption and decryption keys. GCS manages the cryptographic

    keys on behalf of customers using the same hardened key management systems that

    Google uses for their own encrypted data, including strict key access controls and

    auditing. Each cloud storage objects data and metadata is encrypted using the 128-

    bit AES, and each encryption key is encrypted with a regularly rotated set of master

    keys. Users of GCS are also able encrypt their data themselves prior to sending it for

    storage, in this case users will be responsible to manage their own encryption and

    decryption keys (Ferreira, 2013). GCS provides a CRC header that allows the clients

    to verify the integrity of object contents. For non-composite objects, GCS also

    provides a message digest header in order to allow the clients to verify integrity of

    the objects, but for composite objects only CRC is available. Integrity checks are

    automatically performed on all uploads and downloads. Likewise S3, the service

    provided to the GCS customers will be operational and available by considering the

    compliance with SLA. If Google does not comply with the requirements of SLA,

    and if customers meet their obligations as specified in SLA, customers will be

    eligible to receive service credit. Google also ensures that retired disks containing

    customers old information are subjected to data destruction process prior to leaving

    the premises (Google, 2012a).

    2.8 Limitations of Industry Implemented Cloud Storage Services

    Although cloud storage services have several advantages for business

    organizations as well individuals, but after conducting in-depth literature review, this

    research identified that well-known cloud storages such as S3 and GCS have certain

    vulnerabilities or limitations in terms of KM approach, cryptographic support, SLA,

  • 38

    and data integrity verification services. The summarized information of industry

    implemented cloud storage services and their limitations is provided in Figure 2.6.

    Figure 2.6: Limitations of Amazon S3 and GCS

    The shortcomings of existing cloud storages can raise information

    confidentiality and integrity concerns, therefore these services are not trusted by the

    organizations to store mission critical data such as healthcare, banking or

    government related records. The common vulnerabilities of S3 and GCS are

    described in following sub-sections.

  • 39

    2.8.1 Vulnerable Key Management Approach

    GCS and Amazon S3 automatically encrypt data before it is written to disk

    using SSE process without additional charges. There is no setup or configuration

    required and no need to modify the way clients access the service. Data are

    automatically and transparently decrypted when read by an authorized client without

    requiring the users to undertake the burden of managing their private keys.

    Although, CSPs protect the cryptographic keys of clients using the same hardened

    KM systems that they use for their own encrypted data including strict key access

    controls and auditing. However, this process of data and keys protection requires the

    clients to undoubtedly trust the CSPs because the keys are managed by them. In

    certain cases organizations can be concerned that some government authorities such

    as National Security Agency (NSA) can obtain their keys from the CSP to decrypt

    and illegally access or view their sensitive records (Ferreira, 2013).

    In order to overcome clients concerns, alternative approach supported by

    CSPs is to recommend the clients to encrypt their data before sending it to the cloud

    storage or to perform SSE-C without storing their keys at the server. In this

    approach, there are also implications in terms of KM, since clients are responsible to

    manage and protect their data encryption and decryption keys from compromise to

    an adversary, unintentional deletion or loss, whereas clients are not eager to

    undertake such a responsibility as it requires managing hardware and software which

    will add to the cost of overall service The clients must be facilitated with an

    effective and secure KM approach by involving a TTP and without being anxious

    about maintaining a secure key storage service to protect their data confidentiality

    and integrity.

  • 40

    2.8.2 Inadequate Cryptographic Support

    The data stored at S3 or GCS, are encrypted using AES algorithm. S3

    supports AES-256 bit, whereas GCS supports AES-128 bit. At present AES-256 bit

    is proven secure but there is criticism on GCS using AES-128 bit because it has been

    already proven as not strong enough to protect confidential data, and it can be

    cracked by todays technology in a reasonable amount of time (Ferreira, 2013).

    Beside the security concerns, encryption techniques of S3 and GCS have limitations

    in terms of functionality and usage. For example when data is encrypted at the GCS

    or S3, it cannot be processed to perform computations unless it is decrypted to its

    original format, so each time users are required download their data to perform these

    tasks. In other words, in existing cloud storage services, there is no support for FHE

    which can enable the users to perform live computations on their data while their

    privacy remains preserved (Murali et al., 2013). FHE is a good basis to enhance the

    security measures of un-trusted systems or applications that stores and manipulates

    sensitive data. Although, FHE is implemented but it is not proven efficient for

    practical use in challenging cloud computing services such as S3 and GCS (Kui et

    al., 2012; Wang et al., 2013; Stefania et al., 2012).

    2.8.3 Exclusion of Security Assura