Databricks Enterprise Security Guide


Databricks is committed to building a platform where data scientists, data engineers, and data analysts can trust that their data is secure. By implementing industry-wide best practices and building upon the many security-related features provided by AWS, Databricks addresses the most commonly required security controls, highlighted in this document.

This document describes Databricks’ deployment architecture in detail, illustrating how security is addressed throughout.

Contents

• Deployment Model
• Compliance Program
• Defense in Depth
  • Customer Data
    • Databricks Access to Customer Environment
    • Employee Access
    • Data Governance
    • Data Flow & Encryption
    • Customer Credentials Management
    • Backups
  • Application
    • Authentication and Authorization - End User Access Control
    • Role-based Access Controls (ACL)
    • Change Management & Secure Coding
  • Host
    • Hardening Standards
    • Vulnerability Management
  • Network Security
    • Network Isolation
    • Spark Cluster Network Isolation
    • VPC Isolation of Customer's Service in Databricks Account
    • Security Groups & Network ACLs
    • No Public IPs
    • Monitoring
  • Physical Security
    • Infrastructure
    • Office
  • Logging and Monitoring
  • Policies & Procedures


Deployment Model

The Databricks Enterprise offering is a single-tenant deployment.

• Data plane – Spark clusters are deployed in a customer AWS account. Customer datasets are stored in customer-owned and customer-managed storage (e.g., AWS S3, RDBMS, NoSQL).

• Control plane – Runs in the Databricks account, in a VPC dedicated to a single customer.

Key properties of this deployment model include:

• Zero Maintenance
• Single-Tenant VPC Isolation of Control Plane
• Secured Internal Communication
• Secured Access and Authorization
• Encrypted Customer State
• Isolated AWS Accounts
• Apache Spark Cluster Network Isolation
• Smarter Cost Controls

[Figure: Deployment architecture. Users access the web frontend (Home, Workspace, Notebooks, Tables, Jobs) over TLS. The control plane (Central Services, Databricks admin via VPN gateway) runs in a dedicated single-customer VPC in the Databricks account (SOC 2 Type 2, 3/17) with controlled, audited access. Spark clusters run in the customer's VPC and reach customer data sources through the customer's choice of connectivity; cross-account access uses an IAM role for API access. Customer state is encrypted with customer-controlled KMS keys, with end-to-end encryption and integrity protection.]


Compliance Program

Databricks engages with an independent CPA firm to perform annual and semi-annual audits. We currently hold:

• A SOC 2 Type 2 attestation. The SOC 2 report covers the design and operating effectiveness of controls that meet the trust criteria for security, availability, and confidentiality.

• An attestation of HIPAA compliance.

Additionally, Databricks engages an independent third-party organization, NCC Group (formerly iSEC Partners), to conduct annual code reviews and penetration tests.


Defense in Depth

Databricks follows the Defense in Depth approach in order to address security as a whole. This comprehensive strategy spans technology, policies, and procedures, and promotes a security-first culture. Databricks' Defense in Depth covers Customer Data, Application, Host, Network, Physical, Logging and Monitoring, and Policies, Procedures, and Awareness.

[Figure: Defense in Depth layers — Customer Data, Application, Host, Network Security, Physical Security, Logging and Monitoring, Policies and Procedures.]


Customer Data

CUSTOMER DATASETS
Databricks is built to work with a customer's existing data. It does not provide a persistent storage layer in and of itself; instead, it is designed to leverage Spark's excellent support for various preexisting data sources and data formats, and provides additional optimizations where applicable.

Databricks customers most often utilize AWS Simple Storage Service (S3), but can also access a number of other sources (e.g., RDBMS, NoSQL, CSV uploads). A wide range of data formats is supported, including CSV, Parquet, JSON, and Hadoop formats (e.g., SequenceFile, Avro). All sources and formats are accessible using whatever client authentication mechanisms are required for the given source.

CUSTOMER METADATA
Customer metadata, including customer queries, the outputs of those queries, and web user accounts, is stored in Databricks' AWS RDS and encrypted with AWS KMS. Databricks gives customers the option to use their own encryption keys (AWS KMS) to secure data at rest.

SECURED INTERFACES TO SPARK CLUSTERS
Spark clusters are ultimately responsible for accessing and processing data in the Databricks environment, and access to Spark clusters occurs primarily through the web frontend interface. Access to frontend services requires authenticated identities and is encrypted with SSL. Commands are pushed from the frontend to the Spark cluster through an SSL-encrypted connection that uses certificate-based authentication.

VPC PEERING TO ADDITIONAL CUSTOMER VPC
Network access from the Databricks Spark clusters to any additional customer data sources can be conveniently enabled through VPC peering between the Spark clusters' VPC and the external VPC. In lieu of VPC peering, standard network routing or VPN configurations can be used.
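To make the peering option concrete, here is a minimal boto3 sketch of the three steps involved: requesting a peering connection between the Spark clusters' VPC and the external VPC, accepting it from the peer side, and adding a route across it. The region, IDs, and CIDR block are placeholder assumptions, not values from this guide.

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")  # placeholder region

# Request a peering connection from the Spark clusters' VPC to the VPC
# hosting the additional data source.
peering = ec2.create_vpc_peering_connection(
    VpcId="vpc-sparkclusters0000",     # placeholder: Spark clusters' VPC
    PeerVpcId="vpc-datasource0000",    # placeholder: external data source VPC
)
pcx_id = peering["VpcPeeringConnection"]["VpcPeeringConnectionId"]

# The owner of the peer VPC accepts the request.
ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx_id)

# Each side then adds a route so traffic destined for the other CIDR uses the link.
ec2.create_route(
    RouteTableId="rtb-00000000",          # placeholder route table
    DestinationCidrBlock="10.99.0.0/16",  # placeholder: peer VPC CIDR
    VpcPeeringConnectionId=pcx_id,
)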

Databricks Access to Customer Environment

PROGRAMMATIC
Privileged Databricks services have the ability to monitor and update customer deployments.

Our monitoring agent has the ability to make metadata-only black-box checks against the customer environment, such as listing clusters or jobs, to ensure that the respective services are healthy and returning valid data. Additionally, we make EC2 describe calls to verify the health of the AWS resources.

Our update agent has the ability to provision new EC2 instances in the customer environment and to request that existing instances pull new artifacts from the Databricks artifact repository and self-update.


Employee Access
Databricks has developed a proprietary system, Genie, for requesting, approving, revoking, and logging access to customer data. As a general practice, Databricks employees do not access customer data unless specifically requested by a customer (e.g., to troubleshoot). Such requests should be documented in a Zendesk ticket and include consent for Databricks to access the customer's environment. Following receipt of a Zendesk ticket, a Databricks engineer reviews the reported issue and, if needed, submits a request to Genie for access to the customer environment in order to address it. Genie, upon successful validation of the ticket number and customer consent, approves the engineer's access to the customer environment. Such access is approved for a specified period of time, after which the access permission is automatically revoked.

Genie can approve access only for a limited group of engineers, which is reviewed and revalidated quarterly. All access to a customer environment by Databricks personnel, including any actions taken, is logged and available for customers to review as part of the Databricks service audit logs.

Data Governance
Customer data is stored in Amazon S3, and Databricks designates the physical region in which an individual customer's data and servers are located. Data replication for Amazon S3 objects is performed within the regional cluster where the data is stored; data is not replicated to data center clusters in other regions. For example, by default, Databricks customers in the EU have their cloud data, logs, databases, and cluster management hosted in the AWS data centers in the EU, and that data is not transferred to data centers outside the EU.

Data Flow & Encryption
This section details the data flow: where a user's data enters Databricks, how it moves through the system, and where it is stored, with the particular goal of ensuring that data is always encrypted in transit and at rest.


CUSTOMER DATA ENTERS DATABRICKS THROUGH TWO MECHANISMS:

1. Data sources that are accessed through Databricks
2. User-entered data (typically credentials)

The data flow below illustrates (i) the Databricks-owned instances for Databricks Services and (ii) customer-owned Worker instances on which the customer-owned Container Processes and Databricks-owned Data Daemon reside.

Lines indicate where data is in transit and disks indicate where data lies at rest. Orange is input to the system (customer data) and green is Databricks-owned, where customer data initially does not reside.

1. Customer data stored in customer-owned data sources (e.g., S3, Redshift, RDS) is read directly by the container; the customer is responsible for using encrypted connections. Databricks-provided defaults always use encryption for S3 access, and the Data Daemon (which always uses the S3 Root Bucket) always uses HTTPS to talk to S3. (A configuration sketch follows this list.)

2. Data input by the customer to Databricks services (or secrets which may give access to customer data) always uses HTTPS, either through a browser session or through our API, which requires TLS 1.1 or 1.2. a) For AWS-related calls, we recommend that customers use IAM roles.

3. Communication between the Databricks Service (Control Plane) and Container Process (Data Plane) occurs over an RPC mechanism which uses TLS 1.2 and client/server mutual authentication.

4. Communication between the Container Process and Data Daemon is not encrypted, but the two are co-located on the same physical instance, and iptables rules prevent other containers from observing the traffic.

5. Spark will transfer data between executors in order to perform distributed operations. This data is not encrypted and travels between physical instances within the same VPC.

6. Databricks Services, the Data Daemon, and the Container Process write logs to their local EBS volumes. Encryption depends on the configuration of the EBS volume (see item 7 below).

[Figure: Data flow. Databricks Services (control plane) connect over TLS to the Container Process and Data Daemon on customer-owned Worker instances. Numbered flows (1)–(10) link customer data sources (1), customer input (2), the control/data plane RPC channel (3), the intra-instance Container Process/Data Daemon channel (4), inter-executor Spark traffic (5), local EBS volumes for logs and caching (6/7), RDS (8), the Root Bucket (9), and the S3/Kinesis log pipeline (10), as described in the numbered steps.]


7. The Container Process and Data Daemon additionally write customer data to their local EBS volumes for the sake of caching. The encryption story is the same as in 6. a) Local disks are used for logs and data caching. When Amazon launches a new instance, the bootstrap disk can either be a copy of a local disk image stored in S3 or an EBS volume snapshot. Our AMIs are based on EBS volumes. The bootstrap EBS volume snapshot may be encrypted with KMS, but then the AMI cannot be shared directly with other accounts. As a result of this stipulation, our current solution for encrypted EBS volumes is a bit nuanced:

i) Instances running in our account (Databricks Services) use an encrypted EBS volume and, as a result, are encrypted using KMS.

ii) Instances running in the customer account do not use an encrypted EBS volume on boot; instead, we request additional data EBS volumes encrypted with KMS and put all container data on these disks.

8. The Databricks Services and, in some configurations, the Container Process share an RDS instance in which they store user-input data (including access keys) as well as results of customer queries. The instance uses a per-customer KMS key to encrypt its EBS volume and backups. The database is also backed up to S3, where it is likewise KMS-encrypted using the same key.

9. Databricks Services and the Data Daemon store certain data (namely, mount point metadata) in the Databricks Root bucket which may contain customer data. Customer-input secret keys are encrypted with SSE-S3.

10. Log data is uploaded to the Databricks Log Pipeline via Kinesis. Logs at rest are encrypted with AWS KMS and logs in flight are encrypted with TLS 1.2.
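As a concrete illustration of item 1 above (the sketch referenced there), the following PySpark snippet reads customer data from S3 over HTTPS and requests server-side encryption on writes. The bucket, paths, and KMS key ARN are placeholders, and the settings shown are standard Hadoop/S3A options rather than Databricks-specific configuration from this guide.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("encrypted-s3-access").getOrCreate()
hconf = spark.sparkContext._jsc.hadoopConfiguration()

# Force TLS to S3 and request SSE-KMS for anything written back.
hconf.set("fs.s3a.connection.ssl.enabled", "true")
hconf.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS")
hconf.set("fs.s3a.server-side-encryption.key",
          "arn:aws:kms:us-west-2:123456789012:key/placeholder")  # placeholder key

df = spark.read.parquet("s3a://customer-bucket/events/")   # placeholder path
df.write.parquet("s3a://customer-bucket/events_clean/")    # written with SSE-KMS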



Customer Credentials Management
Data input by the customer to Databricks services (or secrets which may give access to customer data) always uses HTTPS, either through a browser session or through our API, which requires TLS 1.1 or 1.2.

Customer AWS credentials are stored with client-side encryption in a private and secure S3 bucket. The key used to encrypt the credentials is itself stored encrypted in a separate private and secure S3 bucket. The stored credentials are accessed only by our automated deployment process; no Databricks personnel have direct access to them.
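The pattern described here is classic envelope encryption. The sketch below shows one way it could look with boto3 and the cryptography library; the bucket names, KMS key alias, and credential value are placeholder assumptions, not details of the actual Databricks deployment process.

import base64
import boto3
from cryptography.fernet import Fernet

kms = boto3.client("kms")
s3 = boto3.client("s3")

# 1. Ask KMS for a data key: we receive a plaintext copy plus a wrapped
#    (encrypted) copy that only KMS can unwrap.
dk = kms.generate_data_key(KeyId="alias/creds-key-placeholder", KeySpec="AES_256")

# 2. Encrypt the credential client-side; the plaintext key never leaves memory.
fernet_key = base64.urlsafe_b64encode(dk["Plaintext"])
ciphertext = Fernet(fernet_key).encrypt(b"placeholder-aws-secret")

# 3. Store the ciphertext and the wrapped key in separate private buckets.
s3.put_object(Bucket="creds-bucket-placeholder", Key="cust-1/credential",
              Body=ciphertext)
s3.put_object(Bucket="keys-bucket-placeholder", Key="cust-1/wrapped-key",
              Body=dk["CiphertextBlob"])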

Backups
Databricks performs automated scheduled backups of metadata and systems every 24 hours. The backups are stored in AWS RDS with access restricted to authorized employees. Backup and recovery procedures are tested annually.


Application

Authentication and Authorization - End User Access Control

SSO
Databricks provides Single Sign-On (SSO) to enable a customer to authenticate its employees using the customer's identity provider. As long as the identity provider supports the SAML 2.0 protocol (e.g., Okta, Google for Work, OneLogin, Ping Identity, Microsoft Windows Active Directory), a customer can use Databricks SSO to integrate with its identity provider and sign in.

Databricks provides several ways to control access to both data and clusters inside of Databricks.

Role-based Access Controls (ACL)

CLUSTERS

IAM ROLES
An IAM role is an AWS identity with permission policies that determine what the identity can and cannot do in AWS. IAM roles allow you to access your data from Databricks clusters without having to embed your AWS keys in notebooks.
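As a hedged sketch of this pattern, the boto3 snippet below creates an EC2-assumable role whose inline policy grants read-only access to a data bucket; a cluster launched with this role can then read the bucket without any embedded keys. The role name, bucket, and policy scope are illustrative assumptions.

import json
import boto3

iam = boto3.client("iam")

# Trust policy: EC2 instances (the Spark workers) may assume this role.
trust = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}
iam.create_role(RoleName="databricks-data-access-placeholder",
                AssumeRolePolicyDocument=json.dumps(trust))

# Inline policy: read-only access to the data bucket.
iam.put_role_policy(
    RoleName="databricks-data-access-placeholder",
    PolicyName="read-data-bucket",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::data-bucket-placeholder",
                         "arn:aws:s3:::data-bucket-placeholder/*"],
        }],
    }),
)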

CLUSTER ACL
There are two configurable types of permissions for Cluster Access Control:

• Individual Cluster Permissions - This controls a user’s ability to attach notebooks to a cluster, as well as to restart/resize/terminate clusters.

• Cluster Creation Permissions - This controls a user’s ability to create clusters.

WORKSPACE ACL
Workspace ACL provides control over who can view, edit, and run notebooks.

You can assign five permission levels to notebooks and folders: No Permissions, Read (view cells, comment), Run (run commands, attach/detach notebooks), Edit (edit cells), and Manage (change permissions).
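For illustration, a permission level like these can also be applied programmatically. The sketch below uses the Databricks REST Permissions API via Python requests; the workspace URL, token, notebook ID, and user are placeholders, and the endpoint path and level name (CAN_RUN) should be treated as assumptions about the current public API rather than details from this guide.

import requests

host = "https://example.cloud.databricks.com"  # placeholder workspace URL
token = "dapi-placeholder"                     # placeholder API token
notebook_id = "12345"                          # placeholder notebook ID

resp = requests.patch(
    f"{host}/api/2.0/permissions/notebooks/{notebook_id}",
    headers={"Authorization": f"Bearer {token}"},
    json={"access_control_list": [
        {"user_name": "analyst@example.com", "permission_level": "CAN_RUN"},
    ]},
)
resp.raise_for_status()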

NOTEBOOKS ACL
All notebooks within a folder inherit the permission settings of that folder. For example, if you give a user Run permission on a folder, that user has Run permission on all notebooks in that folder.

LIBRARY AND JOBS
All users can view libraries. To control who can attach libraries to clusters, use Cluster Access Control. A user can only create jobs from notebooks on which they have Read permission, and users can view a notebook job's run result only if they have Read permission on that job's notebook. If a user deletes a notebook, only admins can view its runs.


Change Management & Secure Coding
Databricks has a formal change management process in place. All changes must be authorized, tested, approved, and documented.

Databricks has implemented a secure development lifecycle (SDL) to ensure that security best practices are an integral part of development. The SDL covers formal design reviews by the security team, threat modeling, automated and manual peer code review, as well as penetration testing by a leading security firm. Additionally, all developers receive secure coding training as part of their onboarding.


Host
Databricks has formal host hardening and vulnerability management processes in place.

Hardening Standards
All hosts run the latest version of the Ubuntu operating system and are hardened according to Center for Internet Security (CIS) benchmarks. In summary, the hardening standards cover the following:

• Changing all vendor-supplied defaults and eliminating unnecessary default accounts.
• Enabling only the services, protocols, daemons, etc. required for the function of the system.
• Implementing additional security features for any required services.
• Configuring system security parameters to prevent misuse.
• Removing all unnecessary functionality, such as scripts, drivers, features, subsystems, file systems, and unnecessary web servers.

Vulnerability Management

PATCHING UPDATES
All hosts are patched periodically for security updates and critical fixes. All patches are authorized, tested, and approved in accordance with the Databricks change management process.

Zero-day exploits are patched as soon as possible after testing.

SCANNING
All hosts are scanned periodically for vulnerabilities with Nessus. All security vulnerabilities are investigated by the security team and remediated according to Databricks' security incident remediation SLA:

• Critical – Immediately
• High – Within five days
• Medium – Within 60 days
• Low – Based on business requirements


Network Security

Network Isolation
Databricks is deployed in a customer AWS account. We recommend that a customer use a separate AWS account for the Databricks deployment, because the IAM role required to run the service could theoretically affect other services within the account.

Spark Cluster Network Isolation
The Spark deployments are firewalled by default and isolated from each other. Access to these clusters is limited to the Databricks frontend by default, but can also be opened up by adding an Elastic IP address (Databricks provides sample notebooks for performing this operation).

VPC Isolation of Customer's Service in Databricks Account
Databricks operates and maintains the web frontend and cluster management resources on behalf of the customer, and isolates those resources from other customer deployments by deploying them within a dedicated VPC. The VPC uses dynamic IPs in the range 10.54.0.0/16.

Security Groups & Network ACLs
A Databricks deployment utilizes multiple AWS security groups to control and protect ingress and egress traffic. External-facing resources, such as the Databricks web portal instance, use a security group that exposes only port 443, which allows users to log in; logins to the web portal over port 443 are secured by SSL encryption. No other ports are exposed externally on the Databricks webapp instance. Other instances, such as the Databricks cluster manager instance and the Spark workers, do not expose any external-facing ports; the AWS security groups attached to these instances allow only internal traffic between instances.
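A customer can verify this posture from their own account. The boto3 sketch below (with a placeholder security group ID) checks that a group's ingress rules expose nothing beyond HTTPS on port 443.

import boto3

ec2 = boto3.client("ec2")
sg = ec2.describe_security_groups(GroupIds=["sg-0000000000000000"])  # placeholder
for rule in sg["SecurityGroups"][0]["IpPermissions"]:
    # Expect a single HTTPS rule; anything else warrants investigation.
    assert rule.get("FromPort") == 443 and rule.get("ToPort") == 443, \
        f"unexpected ingress rule: {rule}"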

In addition to security groups, a Databricks deployment utilizes network ACLs to control inbound and outbound traffic at the subnet level.

No Public IPs
The Databricks customer success team can enable a feature flag that turns off public IPs on the workers, and can also whitelist the IP addresses that are allowed to access the Databricks web portal.

Monitoring
All network activity is logged and monitored. Databricks leverages AWS VPC flow logs to capture information about the IP traffic going to and from network interfaces, as well as AWS CloudTrail logs to capture all API calls made by a Databricks AWS account.

The log data is retained for a minimum of 365 days and access to the logs is restricted to prevent tampering.
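For reference, enabling this kind of VPC flow logging looks roughly like the boto3 call below; the VPC ID, log group, and delivery role are placeholder assumptions.

import boto3

ec2 = boto3.client("ec2")
ec2.create_flow_logs(
    ResourceIds=["vpc-0000000000000000"],      # placeholder VPC
    ResourceType="VPC",
    TrafficType="ALL",                         # capture accepted and rejected traffic
    LogGroupName="vpc-flow-logs-placeholder",  # placeholder CloudWatch Logs group
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-placeholder",
)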


Physical Security

Infrastructure
Databricks is hosted on AWS. AWS data centers are frequently audited and comply with a comprehensive set of frameworks, including ISO 27001, SOC 1, SOC 2, SOC 3, and PCI DSS. AWS data centers are in undisclosed locations and have stringent physical access controls in place, including biometric access controls, twenty-four-hour armed guards, and video surveillance, to ensure that no unauthorized access is permitted.

Office
Databricks implements physical controls in its office, including badge readers, a staffed reception desk, visitor sign-in, and a clean desk policy.

Logging and Monitoring
Databricks provides comprehensive end-to-end audit logs of activities performed by users on the platform, allowing enterprises to monitor detailed usage patterns of Databricks as the business requires. The audit logs cover Accounts, Notebooks, Clusters, DBFS, Genie, Jobs, SQL Permissions, Customer SSH Access, and Tables.

Once enabled for your account, Databricks automatically starts shipping the audit logs in human-readable format to the configured delivery location every 24 hours. The logs become available within 72 hours of activation.

Databricks encrypts audit logs using Amazon S3 server-side encryption.
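Once the JSON audit logs are delivered, they can be explored from a notebook. The sketch below uses a placeholder S3 path; the serviceName and actionName field names follow the documented audit-log schema but should be treated as assumptions here.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("audit-log-review").getOrCreate()
logs = spark.read.json("s3a://audit-bucket-placeholder/audit-logs/")  # placeholder
(logs.groupBy("serviceName", "actionName")   # summarize who did what, by service
     .count()
     .orderBy("count", ascending=False)
     .show(20, truncate=False))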

Policies & Procedures
Databricks has implemented a number of policies and procedures aimed at enforcing security best practices. The policy and procedure documents are accessible to all employees, reviewed and updated at least annually, and communicated to all employees upon hire and periodically thereafter.

The suite of security policies includes the following:

• Data Classification – Defines levels of data sensitivity (public, private, sensitive, confidential, secret) and describes acceptable methods for storage, access, and sharing.

• Access Management – Describes procedures for provisioning and deprovisioning access, periodic access reviews, and password and MFA requirements.

• Acceptable Use – Describes acceptable and unacceptable use as well as enforcement.
• Security Training – Outlines types of security training per function (engineering vs. general), frequency, and delivery methods.
• Incident Response – Describes the incident response process, responsibilities, and SLAs.


• Risk Management – Describes the risk management methodology and frequency of assessment.
• Threat Modeling – Describes the threat modeling methodology and tools.
• Performance Monitoring – Defines system performance KPIs and describes the escalation process.
• Hardening Standards – Describes system hardening standards and processes.

Databricks has a dedicated security team focused on product security, corporate security, security operations, and privacy, risk, and compliance.

Secure Your Enterprise Workload Today

Hundreds of organizations have deployed the Databricks virtual analytics platform to improve the productivity of their data teams, power their production Spark applications, and securely democratize data access. Databricks is available in Amazon Web Services globally, including the AWS GovCloud (US) region.

Contact Databricks for a personalized demo, or register to try Databricks for free.