17 June 2019 © MARKLOGIC CORPORATION
Data Security and MarkLogic Security ALL the Things
JASON HUNTER, SE Director
RANGAN DORESWAMY, Product Manager - Security
Data security is bigger than this little box
Source: Momentum Partners
Most organizations focus on network security
NETWORK SECURITY: PROTECT THE PERIMETER, THE “CRUNCHY OUTSIDE”
DATA SECURITY: PROTECT THE DATA IN THE “SQUISHY MIDDLE”
Challenges with data security
Data management is complex
Data gets scattered across silos
Policies spread in multiple places
Data models always changing
Multiple tools must work together
CEO/CFO/CIO
Compliance Officer
Developer
DBA
Secure by default
Fine-grained, role-based security
Advanced Encryption
Data anonymization and redaction
Improved data security across the integration lifecycle
MarkLogic: The most secure NoSQL database
DATA SECURITY & DATA SHARING
Better security leads to more sharing with less risk
Secure everywhere
Safe in any cloud: Deploy confidently and avoid vendor lock-in
Safe data sharing: Control exactly who sees what data
Cloud neutral
Hybrid cloud and on-premises
Advanced encryption
Granular access control
Anonymization and redaction
Curated and governed data
Controlling access to information
Access to documents
Who is the user?
What should the user see or do?
Security principle: Authorization
AUTHORIZATION
AUTHENTICATION
AUDITING
Utilize Roles, Compartments, and Privileges to control access
Privileges – what actions you can execute (e.g. reboot the box, insert docs with a given URI prefix)
Permissions – what capabilities you have with data (e.g. whether you can see or update a doc or element)
RBAC – Role Based Access Control
- Users are assigned to “roles” in the database
- The roles of a user control what they can do, see, edit, etc.
- Roles can inherit from each other for easier management
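The role inheritance described above can be sketched in plain JavaScript. This is only an illustration of the idea, not MarkLogic's implementation: the role names, the `roleParents` table, and both function names are hypothetical.

```javascript
// Hypothetical role hierarchy: each role lists the roles it inherits from.
const roleParents = {
  "compliance": ["call-center"],
  "call-center": ["reader"],
  "reader": [],
};

// Expand a user's directly assigned roles into the full set of roles,
// following inheritance links.
function effectiveRoles(assigned, parents) {
  const seen = new Set();
  const stack = [...assigned];
  while (stack.length > 0) {
    const role = stack.pop();
    if (seen.has(role)) continue; // already expanded; also guards against cycles
    seen.add(role);
    for (const parent of parents[role] || []) stack.push(parent);
  }
  return seen;
}

// A document permission pairs a role with a capability. Access is granted
// when any permission for the capability names a role the user holds.
function canAccess(userRoles, docPermissions, capability) {
  const roles = effectiveRoles(userRoles, roleParents);
  return docPermissions.some(
    (p) => p.capability === capability && roles.has(p.role)
  );
}
```

In this sketch a user assigned only "compliance" also holds "call-center" and "reader", so permissions only need to be granted at the lowest role that should have them.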
Security 101
Document security model
ROLE SECURITY MODEL
Roles (hierarchical)
User universe: Users and Groups
Privileges: Actions
APPLIED TO DOCUMENTS
Permissions (role, capability): …<role1, read><role2, node-update>…
Capabilities: Read, update, insert, execute
VISIBILITY
Must first have permissions to see the document, then if you have permission to see “secret”, you see all.
{"Customer_ID": 1001,"Fname": "Paul","Lname": "Jackson",
},
…}
Role Based Access Control (RBAC)
Document: Insurance Policy
Permissions: Compliance (read, update), CallCenter (read)
User roles shown: Compliance, CallCenter
RBAC – Compartment Security*
Document: US Insurance Policy
Permissions: CallCenter (read) AND Country: US (read)
Document: UK Insurance Policy
Permissions: CallCenter (read) AND Country: UK (read)
User roles shown: CallCenter, Country: US, Country: UK
* Part of the Advanced Security option
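The AND behavior of compartment security on this slide can be sketched as follows. This is a simplified model of what the slide shows, not MarkLogic's exact evaluation rules: the role names, the `compartmentOf` table, and `canRead` are all hypothetical, and the sketch assumes a user needs one matching role from every group of read permissions on the document.

```javascript
// Hypothetical mapping from role name to compartment. Roles missing from the
// map are treated as ordinary roles and grouped under "(none)".
const compartmentOf = { "country-us": "country", "country-uk": "country" };

// Group the document's read permissions by compartment, then require the
// user to hold at least one role from every group (AND across groups).
function canRead(userRoles, permissions) {
  const held = new Set(userRoles);
  const groups = new Map();
  for (const p of permissions) {
    if (p.capability !== "read") continue;
    const key = compartmentOf[p.role] || "(none)";
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(p.role);
  }
  if (groups.size === 0) return false; // no read permission at all
  for (const roles of groups.values()) {
    if (!roles.some((r) => held.has(r))) return false;
  }
  return true;
}

const usPolicy = [
  { role: "call-center", capability: "read" },
  { role: "country-us", capability: "read" },
];
const ukPolicy = [
  { role: "call-center", capability: "read" },
  { role: "country-uk", capability: "read" },
];
```

A call center user who holds "country-us" can read `usPolicy` but not `ukPolicy`, which is the separation the slide illustrates.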
Access inside documents
What sensitive information has to be protected?
How do you enable authorized search only?
Role Based, Element Level Security
Granular control on information visibility
{"Customer_ID": 1001,"Fname": "Paul","Lname": "Jackson","Phone": "415-555-1212","SSN": "123-45-6789","Addr": "123 Avenue ","City": "Someville","State": "CA","Zip": 94111
}
You can control access to the level of XML elements or JSON properties
Protection is based on element names, element attributes, property names, and values – Called Protected Paths
Out-of-the-box, in-database solution
Real-time control enforced at the data layer for: search, queries, and updates
Integrate data while preserving privacy
Each Protected Path is associated with roles and permissions
- sec:protect-path("//ssn", ("hr_role", "read"))
- sec:protect-path("/root/reg[fn:matches(@access, 'USA')]", ("USA_role", "read"))
- sec:protect-path("/root/data[@cls='ts']", ("ts_role", "update"))
* Function signatures simplified for illustration
Only a user from HR can see the SSN
Only a Top Secret person can update data classified as Top Secret
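To make the effect of a protected path concrete, here is a minimal plain-JavaScript sketch of hiding a property from users who lack the required role. It is an illustration only: real protected paths are XPath-based and enforced inside the database, while this sketch matches flat property names, and `protectedProps` and `visibleView` are hypothetical names.

```javascript
// Hypothetical protected properties: property name -> roles allowed to read it.
const protectedProps = { SSN: ["hr_role"] };

// Return a copy of the document containing only the properties the user may
// see. Unprotected properties are always kept.
function visibleView(doc, userRoles) {
  const held = new Set(userRoles);
  const view = {};
  for (const [key, value] of Object.entries(doc)) {
    const allowed = protectedProps[key];
    if (allowed === undefined || allowed.some((r) => held.has(r))) {
      view[key] = value;
    }
  }
  return view;
}

const customer = { Customer_ID: 1001, Fname: "Paul", SSN: "123-45-6789" };
const callCenterView = visibleView(customer, ["call_center_role"]);
const hrView = visibleView(customer, ["hr_role"]);
```

Here `callCenterView` has no SSN property at all, while `hrView` keeps it, matching the "only a user from HR can see the SSN" rule above.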
Integrated document security model
ROLE SECURITY MODEL
Roles (hierarchical)
User universe: Users and Groups
Privileges: Actions
APPLIED TO DOCUMENTS
Permissions (role, capability): …<role1, read><role2, node-update>…
Capabilities: Read, update, insert, execute
APPLIED TO PROTECTED PATHS
Permissions (path, role, capability): …<//path, role1, read><//path2, role2, update>…
Capabilities: Read, update, insert, execute
VISIBILITY
Must first have permissions to see the document, then if you have permission to see “secret”, you see all.
{……………
"path": {"secret"
},
…}
Security Principle: Authentication
AUTHENTICATION
AUDITING
External or Local
External is via LDAP or Kerberos (with groups mapped to roles)
Certificate-based (X.509)
SAML 2.0
Secure credential storage
AUTHORIZATION
External Authentication (Recommended)
- Primary user database lives external to the MarkLogic Database
- Authentication type can be LDAP, Kerberos, SAML Auth, Certificate
- Seamless Integration with existing enterprise security model
- You DO always need your Roles created within MarkLogic
- Authorization can be External (user roles derived from the Group declared by the external system) or Internal (users and roles are known within MarkLogic but credentials are checked externally)
Who you are: Authentication types
Internal Authentication
- User database lives fully within MarkLogic
- Best to limit to the MarkLogic Bootstrapping Admin account (always need that local user)
- Remember: You DO always need your Roles created within MarkLogic
Who you are: Authentication types
Delegated Authorization – SAML
The middle tier can forward passwords. Make sure you secure the communication channel at every step. Beware: the middle tier and MarkLogic both become “password collectors”.
The middle tier can hit MarkLogic with one high-powered account and dictate which roles the current request should run under. This is common with databases. Beware: it gives full trust to the middle tier code, and there is no per-user auditing.
Better: Have the client present a secure “token” to the middle tier and have the middle tier pass it on. The token can carry both identity and capability. No passwords flying by, no undue trust given. This is SAML.
Security with a middle tier
Security Assertion Markup Language, a standard type of secure token that provides identity and capability
Generated by an “identity provider”. Picture an SSO server where a user logs in and gets the secure token in return. This token proves who they are and what they should be allowed to do
In a three-tier application, the middle tier manages the generation of the token and passes it to MarkLogic along with each request
MarkLogic (as of MarkLogic 9.0-9) reads SAML 2.0 tokens and securely recognizes the user without ever having the user credentials. SAML attributes can be mapped to MarkLogic roles
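The idea of mapping SAML attributes to roles can be sketched like this. It is a hedged illustration, not MarkLogic configuration: the attribute values, the `attributeToRoles` table, the shape of the `assertion` object, and `rolesFromAssertion` are all hypothetical, and parsing and signature verification of the token are assumed to have happened elsewhere.

```javascript
// Hypothetical mapping from an attribute value in the assertion to role names.
const attributeToRoles = {
  "hr-staff": ["hr_role"],
  "us-analyst": ["USA_role"],
};

// Assumes `assertion` is an already parsed and verified token of the form
// { subject: string, attributes: { [name]: string[] } }.
function rolesFromAssertion(assertion, attributeName) {
  const values = assertion.attributes[attributeName] || [];
  const roles = new Set();
  for (const value of values) {
    // Unknown attribute values map to no roles.
    for (const role of attributeToRoles[value] || []) roles.add(role);
  }
  return [...roles].sort();
}
```

The point of the sketch is that no password is involved: the roles a request runs under are derived entirely from attributes asserted by the identity provider.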
SAML
Safe Data Sharing
Share the right information
Can I get a data dump with PII removed?
How to give data to data scientists?
How to get realistic data on QA/UAT?
Remove sensitive information on export
Data exported for testing and analysis must not have any real PII or Sensitive Information
Need a way to find the client or financial information in the data
Need a way to tell what to do with the information depending on the target needs
Share data while preserving privacy
Mask or conceal sensitive information on export
Redact at the level of XML elements, JSON properties, or even free text patterns when exporting
Combine built-in or custom rules into policies to match different target needs
Use built-in functions that best fit each content:
- concealing, random, deterministic, dictionary, pattern, or custom
{"Customer_ID": 1001,"Fname": "Paul","Lname": "Jackson","Phone": "415-555-1212","SSN": "123-45-6789","Addr": "123 Avenue ","City": "Someville","State": "CA","Zip": 94111
}
Rule-based redaction on export
Share data while preserving privacy
Mask or conceal sensitive information
Use predefined functions
Out-of-the-box, in-database solution
Original document
{"Customer_ID": 1001,"Fname": "Paul","Lname": "Jackson","Phone": "415-555-1212","SSN": "343-45-6569","Addr": "456 Main St ","City": "NYC","State": "NY","Zip": 94111
}
QA/Dev Export
{"Customer_ID": 3456,"Fname": "John","Lname": "Jameson","Phone": "123-123-1233","SSN": "xxx-xx-6569","Addr": "23 Side St ","City": "San Francisco","State": "CA","Zip": 90051
}
BI export
{"Customer_ID": 34567,"Phone": "123-123-1233","SSN": "456-456-9876","City": "NYC","State": "NY","Zip": 94111
}
Use the rdt.redact function to create redacted in-memory copies of documents
Suitable for testing and debugging your rules or for redacting a small number of documents.
rdt.redact is not a security function
Use the mlcp command line tool to export data
Use rdt.ruleValidate to test the validity of your rules before calling rdt.redact
Redact function vs. mlcp
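const rdt = require('/MarkLogic/redaction');
// Check that the rules in the MY_RULES collection are valid before using them
rdt.ruleValidate(["MY_RULES"]);
// Return redacted in-memory copies of the documents in the "Redaction" collection
const docs = fn.collection("Redaction");
rdt.redact(docs, ["MY_RULES"]);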
mlcp.sh export -host localhost -port 8000 \
-username u -password p -mode local \
-output_file_path ./results \
-collection_filter people \
-redaction "MY_RULES"
const ruleClientInfo = {
  "rule": {
    "description": "Random #..",
    "path": "/policy/client/id",
    "method": { "function": "mask-random" },
    "options": { "length": 10 }
  }
};
• Rules are documents in a collection, e.g. MY_RULES
• Each rule defines what to do with the information by specifying a function
• Each rule uses XPath expressions to find information to conceal or mask
Redaction rules
declareUpdate(); // required before an update in Server-Side JavaScript
xdmp.documentInsert(
  "ruleClientInfo.json", ruleClientInfo, {
    "collections": [ "MY_RULES" ]
  });
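To show what a rule like the one above does conceptually, here is a minimal plain-JavaScript sketch that applies a rule to an in-memory object. It is not MarkLogic's redaction engine: it assumes a simple slash-separated path rather than full XPath, it uses `Math.random`, which is not cryptographically secure, and `randomString`, `applyRule`, and the sample document are hypothetical.

```javascript
// Same shape as the rule body on the slide.
const sampleRule = {
  path: "/policy/client/id",
  method: { function: "mask-random" },
  options: { length: 10 },
};

// Generate a random lowercase alphanumeric string of the given length.
function randomString(length) {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return out;
}

// Return a redacted copy of doc; the original is left untouched.
function applyRule(doc, rule) {
  const copy = JSON.parse(JSON.stringify(doc));
  const steps = rule.path.split("/").filter(Boolean);
  let node = copy;
  // Walk down to the parent of the target property.
  for (const step of steps.slice(0, -1)) {
    if (node === null || typeof node !== "object" || !(step in node)) return copy;
    node = node[step];
  }
  const last = steps[steps.length - 1];
  if (node !== null && typeof node === "object" && last in node) {
    node[last] = randomString(rule.options.length);
  }
  return copy;
}

const policyDoc = { policy: { client: { id: "C-1001", name: "Paul" } } };
const redactedDoc = applyRule(policyDoc, sampleRule);
```

Like rdt.redact, the sketch works on a copy, so the stored document is never modified; only the value at the rule's path is replaced.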
"method": { "function": "mask-deterministic" },
"options": { "length": 10, "salt": "a23sdas#4er", "extend-salt": "collection" }
Ships with out-of-the-box functions:
- Conceal, Random, Deterministic, Dictionary
- Highly secure design to prevent linkage attacks
- Patterns: SSN, US Phone, email, IPv4, Regex, Dates, Numbers
Users can write custom functions
Redaction functions
"method": { "function": "redact-us-ssn" },
"options": { "level": "partial", "character": "X" }
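The effect of the "level" and "character" options can be imitated in a few lines of plain JavaScript. This is a sketch of the behavior suggested by the export examples earlier (a partial mask keeps the last four digits), not the built-in function itself; `redactSsn` is a hypothetical name, and treating "full" as the alternative level is an assumption.

```javascript
// Mask US SSN patterns (ddd-dd-dddd) found anywhere in a string.
// level "partial" keeps the last four digits; level "full" masks every digit.
function redactSsn(text, { level = "partial", character = "X" } = {}) {
  const mask = (digits) => character.repeat(digits.length);
  return text.replace(/\b(\d{3})-(\d{2})-(\d{4})\b/g, (match, a, b, c) => {
    const tail = level === "full" ? mask(c) : c;
    return `${mask(a)}-${mask(b)}-${tail}`;
  });
}

const partiallyMasked = redactSsn("SSN: 123-45-6789");
const fullyMasked = redactSsn("SSN: 123-45-6789", { level: "full", character: "#" });
```

A partial mask keeps just enough of the value for a support agent to confirm identity while the exported copy no longer contains the full number.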
Security principles: Auditing
AUTHENTICATION
AUDITING
Audit document access and updates
Audit configuration changes, administrative actions, code execution, and changes to access control
AUTHORIZATION
Advanced Security
Safe data sharing by controlling exactly who sees what data
Prevent direct access to files
Is my data secure on disk?
Can the cloud sys admin see the data?
Can someone modify the data on disk?
Can you erase traces of wrongdoing?
Securing Data at Rest With Encryption
Advanced Encryption
Transparent encryption of data, configuration, and logs
Protection from insider threats
- Prevent sys admin access to information
- Reduce DBA authority
Better governance
- Prevent tampering of information on disk
- Reduce ability to hack a system
Easier compliance
- Match stringent security standards and mandates
Enable Data Integration without Security Compromises
Data confidentiality and integrity
Encryption Protects Data Confidentiality
Protect database files above the file system
Prevent unauthorized users from seeing file contents or using files in other systems
Enable safe deployment in the public cloud
Encryption Protects Data Integrity
Block modifications to audit logs – important to you, important to regulators
Prevent modifications to files on disk
[Diagram: Disk storage, File system, Protected, Decryption, Database; actors: DBA, Sys Admin, Security Admin]
Advanced Encryption – Internal KMS
Transparent encryption of data, configuration and logs
[Diagram: Cluster or laptop, DB backup, Local key store; actor: DBA]
Wallet and encryption keys held on local disk
Wallet is needed to read the data; delete the wallet and the data is unrecoverable
MarkLogic admin manages the wallet and encryption keys: can backup, restore, extract, rotate, and move them
High performance on Intel chip encryption
NIST-approved algorithm – AES-256
Transparent: No code modification!
Advanced Encryption – External KMS
Transparent encryption of data, configuration and logs
[Diagram: Cluster or laptop, DB backup; actors: DBA, Sec Admin, Sys Admin]
Wallet and encryption keys held in external key management system (KMS)
Integrates with any KMS supporting KMIP 1.2 standard or PKCS#11 HSM
MarkLogic admin has NO access to encryption keys
Without continued KMS cooperation the data is useless
Encryption keys are managed per database in a cluster, so you can pick which databases to encrypt
External Key Management
Backup confidentiality and integrity
Encryption Protects Data Backup and Restore
Backup can be encrypted with a Passphrase, the Cluster Level Encryption Key, or the External KMS Backup Key Encryption Key (BKEK)
Restore with Passphrase – fully self contained, for safe sharing to another system
Restore with Cluster Level Key – keys held in local wallet, for restoring a local cluster after data loss
Restore with External KMS BKEK – keys in KMS, for restoring a local cluster after data loss when KMS in use
[Diagram: Disk storage, File system, Protected, Decryption, Database; actors: DBA, Sys Admin, Security Admin]
External key management has been validated against:
SafeNet
Vormetric
Thales
Fornetix
nCipher
External KMS validation
Advanced Encryption – for HA/DR
The advanced security license option includes three features:
1. External key management
2. Redaction
3. Compartment security
Advanced Security
MarkLogic Data Hub Service architecture
There are three pre-defined roles managed in the portal cluster:
- Account admin – bootstrapper, full access, assigns people to other admin roles, billing
- Security admin – controls the network, LDAP, users; once created can prevent account admin from having this ability
- Service admin – controls the service, deploy code, load docs, run flows
Admin roles
Data Hub roles
- Flow developer: Can upload new/changed documents (e.g. flows) to the modules database
- Flow operator: Can load and modify data in the staging database and final database, and call the flow runner
- Endpoint developer: Has access to endpoints and the final documents, can add documents to the modules database
- Endpoint operator: Has access to endpoints
- ODBC: Has access to the analytics stack that has an ODBC server
- Service Admin: Can create and modify the service
- Account Admin: Can see billing, account details
- Security Admin: Can assign the DHF roles to other roles, and create vanilla roles
Data Hub Service wraps around the Data Hub
The Data Hub provisions roles as part of its deployment
Security admin can use these roles or create new roles (that inherit from one of these)
VPC – a virtual private cloud
- A virtual network dedicated to your AWS account
- VPC peering connects two accounts together
- MarkLogic sets up the VPC peering based on the configuration setup in the portal
Definition: VPC
[Deployment diagram. Components shown: Customer VPC, VPC Peering, Service VPC / Single Cluster (DHS Cluster), Portal VPC (Portal Cluster), Ops Director Cluster, LDAP, Customer LDAP, IAM, HSM, KMS. Actors shown: App developer, Flow developer, Endpoint Developer, Operator, Service Admin, Account Admin, Security Admin, ML Ops, Users w/ doc permissions.]
Data Hub Service – deployment
Some enterprises and gov’t agencies require certain security certifications
Certifications are based on assessments conducted by third parties
MarkLogic is pursuing SOC 2 Type 2 and NIST 800-53 certifications
NIST 800-53 covers all common controls required for FedRAMP, HIPAA, FDA 21 CFR, and others
Data Hub Service certifications
Get started with Data Hub Service
Cloud service that deploys in minutes with predictable, low cost
Key Takeaways
Resources
MarkLogic University: On-demand and instructor-led classes
- https://www.marklogic.com/learn/university/
- Hands-on security workshop tomorrow! (Space is limited)
Business/Security White Paper:
1. Top Concerns When Integrating Data
https://www.marklogic.com/resources/top-data-security-concerns-integrating-data/
Resources
Tech/Security White Papers:
1. Building Security Into MarkLogic
https://www.marklogic.com/resources/building-security-marklogic/
2. Developing Secure Applications on MarkLogic
http://www.marklogic.com/resources/developing-secure-apps/marklogic/resource_download/whitepapers/
3. Deploying MarkLogic Securely
http://www.marklogic.com/resources/deploying-MarkLogic-securely/resource_download/whitepapers/
Q&A
Thank you