
Controlling Data in the Cloud: Outsourcing Computation without Outsourcing Control

Richard Chow, Philippe Golle, Markus Jakobsson, Ryusuke Masuoka, Jesus Molina, Elaine Shi, Jessica Staddon

PARC (formerly Xerox PARC) & Fujitsu Laboratories of America

CCSW 2009: The ACM Cloud Computing Security Workshop, held in conjunction with the 16th ACM Conference on Computer and Communications Security (CCS), Nov. 2009.

by Mike Hsiao, 2010.03.05

2

CCSW 2009: The ACM Cloud Computing Security Workshop

• The security problems caused by Virtualization
  – secure cloud resource virtualization mechanisms
  – sandboxing and VM-based enforcements
  – network security (DOS, IDS etc.) mechanisms for cloud contexts
  – scalability of security in global-size clouds
• The security problems of Data/Computation Outsourcing
  – secure data management outsourcing (e.g., database as a service)
  – practical privacy and integrity mechanisms for outsourcing
  – secure computation outsourcing
• Trust and Privacy
  – trust and policy management in clouds
  – secure identity management mechanisms
  – trusted computing technology and clouds
  – binary analysis of software for remote attestation and cloud protection
  – remote attestation mechanisms in clouds
• Threat/Attack based on Cloud environment
  – foundations of cloud-centric threat models
  – new cloud-aware web service security paradigms and mechanisms
  – security for emerging cloud programming models
• Political and Economics issues
  – cloud-centric regulatory compliance issues and mechanisms
  – business and security risk models and clouds
  – energy/cost/efficiency of security in clouds
  – cost and usability models and their interaction with security in clouds

3

USENIX Hot Topics in Cloud Computing (HotCloud)

• Workshop Sessions
  – Cloud Platforms and Architectures
  – Elastic Clouds and Resource Management
  – Storage Cloud and Appliances
  – Map Reduce and Cloud Applications

4

Cloud Computing [*]

• Cloud Computing, the long-held dream of computing as a utility, has the potential to transform a large part of the IT industry,
  – making software even more attractive as a service
  – and shaping the way IT hardware is designed and purchased.
• Developers no longer
  – require large capital outlays to deploy a new, innovative Internet service
  – need to be concerned about over- and under-provisioning
• Cloud Computing refers to both
  – the applications delivered as services over the Internet, and
  – the hardware and systems software in the datacenters that provide those services.

[*] “Above the Clouds: A Berkeley View of Cloud Computing,” UCB Technical Report, Feb. 2009.

5

Cloud Computing [*] (cont’d)

• The services themselves have long been referred to as Software as a Service (SaaS).
• The datacenter hardware and software is what we will call a Cloud.
• When a Cloud is made available in a pay-as-you-go manner to the general public, we call it a Public Cloud.
• The service being sold is Utility Computing.
• We use the term Private Cloud to refer to internal datacenters of a business or other organization, not made available to the general public.
• Cloud Computing is the sum of SaaS and Utility Computing.
• People can be users or providers of SaaS, or users or providers of Utility Computing.
  – SaaS providers = Cloud users


6

Outline

• Basics of Cloud Computing
  – Cloud Hype?
• Fear of the Cloud
• New Problems
• New Directions

7

Cloud Hype

• Larry Ellison (Co-founder and CEO, Oracle Corporation)
  – What the Hell is cloud computing?
• Richard Stallman (President of the Free Software Foundation)
  – It’s stupidity. It’s worse than stupidity: it’s a marketing hype campaign.
• Bruce Schneier (computer security specialist)
  – Cloud computing is nothing new.
• Ron Rivest (‘R’SA, ‘R’C2-RC6, MD2-MD6)
  – [Cloud computing will become a] focal point in our work in security. I’m optimistic …
• UC Berkeley RAD Lab
  – Cloud Computing is likely to have the same impact on software that foundries have had on the hardware industry.
  – Developers would be wise to design their next generation of systems to be deployed into Cloud Computing.

8

Fear of the Cloud

More in 2008:
• Bringing services back in-house may be difficult
• Regulatory requirements prohibit cloud
• Not enough major suppliers

9

Fear of the Cloud (cont’d)

10

Taxonomy of Fear

• Traditional Security
  – Computer and network attacks will be made possible or easier by moving to the cloud.
  – Cloud providers argue that their security measures and processes are more mature and tested.
  – It may be easier to enforce security via contracts with online service providers than via internal controls.
• Availability
• Third-party data control
  – There is a potential lack of control and transparency when a third party holds the data.
  – Build a private cloud?

11

Traditional Security

• TS1. VM-level attacks.
  – vulnerability in hypervisor, VM
    • VMware, Xen, Virtual PC, Virtual Server
  – “Using Memory Errors to Attack a Virtual Machine,” in S&P 2003.
• TS2. Cloud provider vulnerability.
  – platform-level: XSS, SQL injection, Google Docs.
  – IBM’s Rational AppScan tool: scans for vulnerabilities in web services as a cloud security service.
• TS3. Phishing cloud provider (employees).
  – Salesforce.com: one of its employees was tricked into divulging a corporate password

[Figure: cloud stack. Boxes: Host Hardware, Hypervisor, VM, VM, Admin. Layers: Physical-infrastructure, Virtual-infrastructure, Software-platform, App-component, Web-based Software. TS1 and TS2 are marked on the stack.]

12

Traditional Security (cont’d)

• TS4. Expanded network attack surface.
  – Need to protect infrastructure used to connect and interact with the cloud
  – Difficulty: Cloud is outside the firewall in many cases.
  – Example

[Figure: example attack between a Submitting Host and an Execution Host. Labels: Shadow Process, User Job, Evil Job, Lurker, system calls, fork, attach?]

13

Traditional Security (cont’d)

• TS5. Authentication and Authorization.
  – How to extend existing enterprise framework to cloud resources?
• TS6. Forensics in the cloud.
  – Standard practice
    • Seize equipment, perform detailed analysis on the media, and
  – Differences
    • Scale of the cloud
    • Rate of overwriting data

14

Availability

• A1. Uptime.
  – Often-voiced concern: cloud providers argue their downtime compares well with cloud users’ own data centers.
  – SAP’s CEO: you cannot run in the cloud because the cloud would collapse… Don’t believe that any utility company is going to run its billing for 50 million consumers in the cloud.
• A2. Single point of failure.
• A3. Assurance of computational integrity.
  – Is a cloud provider faithfully running a hosted application and giving valid results?

15

Third-party data control

• BL1. Due diligence.
  – If served a subpoena or other legal action, can a cloud user compel the cloud provider to respond in the required time-frame?
  – How can a cloud user be guaranteed that data has been deleted by the cloud provider?
• BL2. Auditability.
  – It may be very difficult to satisfy auditors that your data is properly isolated and cannot be viewed by other customers.
  – Many standards were not written with the cloud in mind.
    • SAS 70, SOX, HIPAA, …
  – Certain regulations require data and operations to remain in certain geographic locations. (Amazon EC2.)

16

Third-party data control (cont’d)

• BL3. Contractual obligations.
  – One problem with using another company's infrastructure is that there might be surprising legal implications.
  – EC2 10.4 Non-assertion.
• BL4. Cloud Provider Espionage.
  – This is the worry of theft of company proprietary information by the cloud provider.
  – Google Gmail and Google Apps
  – But…

17

Third-party data control (cont’d)

• BL5. Data Lock-in.
  – How does a cloud user avoid lock-in to a particular cloud-computing vendor?
    • The data might itself be locked in a proprietary format.
  – There is also the problem of the cloud user having no control over frequent changes in cloud-based services.
• BL6. Transitive nature.
  – Another possible concern is that the contracted cloud provider might itself use subcontractors, over whom the cloud user has even less control, and who also must be trusted.

18

New Problems

• Cheap data and data analysis
  – Enormous data sets can be monetized by applications such as advertising (Google)
    • and by attackers as well
  – Clear impact on privacy
  – Intense pressure on companies to anonymize their data
    • Fear of bad publicity from data breach
    • Fear of subpoenas
  – Anonymizing data and retaining utility is difficult

19

New Problems (cont’d)

• Cost-effective defense of availability
  – The damages are not only related to the losses of productivity, but extend to losses due to the degraded trust in the infrastructure, and potentially costly backup measures.
  – The cloud computing model encourages single points of failure. It is therefore important to develop methods for sustained availability (in the context of attack), and for recovery from attack.

20

New Problems (cont’d)

• Increased authentication demands.
  – Authentication needs will increase.
    • Personal, financial, medical data will be hosted in the cloud
    • Software applications hosted in the cloud require access control
  – Need for higher-assurance authentication
    • Limits of password authentication
      – Passwords are weak.
  – Need for authentication from mobile devices with no/limited user involvement
• Mash-up authorization.
  – Facebook.

21

New Directions

• Information-centric security
  – In order for enterprises to extend control to data in the cloud, we propose shifting from protecting data from the outside (system and applications which use the data) to protecting data from within.
  – Data needs to be self-describing and defending, regardless of its environment.
  – Data needs to be encrypted and packaged with a usage policy.
  – When accessed, data should consult its policy and attempt to re-create a secure environment using virtualization and reveal itself only if the environment is verified as trustworthy (using Trusted Computing).
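The idea of data that travels with its own usage policy can be illustrated with a small sketch. This is only an assumption-laden toy, not the design from the paper: the names `DataPackage`, `seal`, `may_release` and the policy field `trusted_environments` are hypothetical, the encryption step itself is not shown (the `ciphertext` field stands in for already-encrypted data), and the "environment measurement" is just a string where a real system would use a Trusted Computing attestation.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPackage:
    ciphertext: bytes   # stands in for data already encrypted by the owner
    policy: dict        # usage policy that travels with the data
    policy_tag: bytes   # HMAC binding the policy to the ciphertext


def _binding(policy: dict, ciphertext: bytes) -> bytes:
    # Canonical byte string covering both the policy and the data.
    return json.dumps(policy, sort_keys=True).encode() + ciphertext


def seal(ciphertext: bytes, policy: dict, owner_key: bytes) -> DataPackage:
    """Owner side: attach the policy and a tag that detects policy tampering."""
    tag = hmac.new(owner_key, _binding(policy, ciphertext), hashlib.sha256).digest()
    return DataPackage(ciphertext, policy, tag)


def may_release(pkg: DataPackage, env_measurement: str, owner_key: bytes) -> bool:
    """Release decision: the policy must be intact and the environment trusted."""
    expected = hmac.new(owner_key, _binding(pkg.policy, pkg.ciphertext),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, pkg.policy_tag):
        return False  # policy or data was modified after sealing
    return env_measurement in pkg.policy.get("trusted_environments", [])
```

In this sketch the release check needs the owner's key, so it would have to run inside a component the owner already trusts; how that component is established is exactly what the Trusted Computing bullet refers to and is outside the sketch.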

22

New Directions (cont’d)

• High-Assurance Remote Server Attestation
  – Data owners wish to audit how their data is being handled at the cloud, and ensure that their data is not being abused or leaked, or at least have an unalterable audit trail when it does happen.
    • Currently, only manual audit procedures exist
  – Trusted Computing: a trusted monitor installed at the cloud server that can monitor or audit the operations of the cloud server.
    • To ensure integrity of the monitor, Trusted Computing also allows secure bootstrapping of this monitor to run beside (and securely isolated from) the operating system and applications.
    • Signed software.
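One common way to approximate an "unalterable audit trail" is a hash-chained log, in which every entry commits to the one before it. The sketch below is an illustration under that assumption and is not taken from the paper; the function names `append_entry` and `verify_log` and the entry layout are hypothetical. It makes tampering detectable rather than impossible, and it only helps if the latest digest is held by a party the cloud provider cannot rewrite (for example the data owner or a trusted monitor).

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder digest that precedes the first entry


def _digest(prev: str, event: str) -> str:
    # Each digest covers the previous digest, chaining the entries together.
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(log: list, event: str) -> None:
    """Append an event whose digest depends on the whole history so far."""
    prev = log[-1]["digest"] if log else GENESIS
    log.append({"prev": prev, "event": event, "digest": _digest(prev, event)})


def verify_log(log: list) -> bool:
    """Recompute the chain; any edited, removed, or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["digest"] != _digest(prev, entry["event"]):
            return False
        prev = entry["digest"]
    return True
```

Truncating the tail of the log is not detected by `verify_log` alone, which is why the most recent digest has to be stored outside the provider's control.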

23

New Directions (cont’d)

• Applied Cryptography
  – Old approach: Give cloud only encrypted data
  – Issue: Limits use of data, e.g., searching
  – State-of-art cryptography offers new tools
    • Searchable encryption: respond to query without knowing query or data
    • Proofs of retrievability: prove all data correctly stored
    • Fully homomorphic encryption scheme: analysis of encrypted data without decrypting
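To make the searchable-encryption bullet concrete, here is a deliberately simplified sketch of a keyword index built from keyed hashes. It is an assumption for illustration only and not a scheme from the paper: the names `token`, `build_index`, and `server_search` are hypothetical, document contents are assumed to be encrypted separately, and the deterministic tokens leak which queries repeat and which documents match, which real searchable-encryption schemes work to limit.

```python
import hashlib
import hmac


def token(key: bytes, keyword: str) -> bytes:
    """Client side: turn a keyword into an opaque search token."""
    return hmac.new(key, keyword.lower().encode(), hashlib.sha256).digest()


def build_index(key: bytes, docs: dict) -> dict:
    """Client side: map tokens to document ids; docs is doc_id -> keywords."""
    index = {}
    for doc_id, keywords in docs.items():
        for kw in set(keywords):
            index.setdefault(token(key, kw), []).append(doc_id)
    return index


def server_search(index: dict, query_token: bytes) -> list:
    """Server side: match a token without learning the underlying keyword."""
    return sorted(index.get(query_token, []))
```

The server stores only the index and sees only tokens; without the client's key it cannot compute the token for a keyword of its own choosing.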

24

New Directions (cont’d)

• Implicit Authentication
  – Vision: authenticate users implicitly based on observed behavior
    • Location and co-location
    • Application usage
    • Biometric measurement
    • Contextual data

[Figure: Past Behavior → Learning Algorithm → User Model; User Model and Recent Behavior → Scoring Algorithm → Score]
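The learning/scoring pipeline above can be sketched in a few lines. This is a minimal illustration under strong assumptions and not the authors' algorithm: behavior is reduced to a list of observed events (hypothetical `(kind, value)` tuples such as location or application usage), the "user model" is simply the relative frequency of each event in past behavior, and the score is the average familiarity of recent events.

```python
from collections import Counter


def learn_user_model(past_events: list) -> dict:
    """Learning step: relative frequency of each event in past behavior."""
    counts = Counter(past_events)
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}


def score(model: dict, recent_events: list) -> float:
    """Scoring step: mean familiarity of recent events, between 0.0 and 1.0."""
    if not recent_events:
        return 0.0  # no evidence, so no implicit trust
    return sum(model.get(e, 0.0) for e in recent_events) / len(recent_events)


# Example: a user who is usually at the office and sometimes uses mail.
past = [("location", "office")] * 3 + [("app", "mail")]
model = learn_user_model(past)
familiar = score(model, [("location", "office")])
unfamiliar = score(model, [("location", "unknown")])
```

A deployment would compare the score against a threshold and fall back to explicit authentication when it is too low; the threshold and the choice of features are left open here.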

25

Conclusion

• Cloud fears largely stem from the perceived loss of control of sensitive data.
  – Current control measures do not adequately address cloud computing’s third-party data storage and processing needs.
• Trusted Computing and Applied Cryptographic Techniques!

26

Comments

• Problems and ideas are seen and discussed, but implementations are not.
• Define threat models in cloud environment.
• Privacy is the issue of greatest concern.
• Virtualization & forensics/auditing.

27

Google MapReduce

Examples:
• grep
• sort
• large-scale machine learning problems
• clustering problems for Google News
• extracting data for reports
• processing satellite imagery data
• language model processing
• graph computations

MapReduce is a patented software framework introduced by Google to support distributed computing on large data sets (terabytes) on clusters of computers.
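The programming model behind the grep example can be shown with a single-process sketch. This is only an illustration of the map, group-by-key, and reduce phases and does not use Google's MapReduce API; the names `map_grep`, `reduce_collect`, and `run_mapreduce` are hypothetical, and a real framework would distribute these phases across a cluster.

```python
from collections import defaultdict


def map_grep(filename: str, text: str, pattern: str):
    """Map: emit (filename, (line number, line)) for every matching line."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        if pattern in line:
            yield filename, (lineno, line)


def reduce_collect(key: str, values: list):
    """Reduce: gather all matches for one file, ordered by line number."""
    return key, sorted(values)


def run_mapreduce(inputs: dict, mapper, reducer) -> dict:
    """Run map over every input, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for name, text in inputs.items():
        for key, value in mapper(name, text):
            groups[key].append(value)
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


inputs = {"a.txt": "cloud one\nno match\ncloud two", "b.txt": "nothing here"}
matches = run_mapreduce(inputs, lambda n, t: map_grep(n, t, "cloud"), reduce_collect)
```

Because each map call touches only one input and each reduce call only one key, both phases can run in parallel on separate machines, which is what makes the model suit large data sets.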