Securing sensitive data
through APIs and AI pattern
recognition
Sesh Raj, President DSAPPS INC
• Introducing a new way to robustly secure sensitive enterprise data using APIs
and AI (artificial intelligence) pattern recognition methods.
• A data breach can currently cost an enterprise several million dollars and
in some cases can force the business to close.
• With new stringent regulations like GDPR in Europe and the California
Consumer Privacy Act, enterprises face new urgency in securing data.
• Given that nearly 75% of security incidents involve insider threats or human
error, enterprises require new models to ensure data security.
Outline of talk
• Data security urgency
Data breaches - a worldwide epidemic.
Look at the new data security and privacy regulations
• Examine problem / Current solutions
Understand what causes data breaches, review attack vectors
Current data security strategies
• Introducing the Touchfree API/AI data security model
• Review API security challenges and solutions
• Review AI technologies and applying these for data security
• Implementing Touchfree API/AI distributed cloud service
• Summary and Next steps
• Automate processes and workflows.
• Manage tasks, events, milestones, documents, issues, risks.
• Collaborate with key resource teams, vendors, partners, customers.
• Knowledge management templates.
• Risk Management Indicators
• Rapid digital enterprise transformation
Solution 1
Smart Enterprise Apps
dsapps.com – Rapid digital enterprise transformation
Claims and Disclosures
• Claims
1. Presentation is vendor neutral.
2. Primarily presented from a consultant’s viewpoint
• Disclosures
1. New potentially useful ideas presented and feedback sought,
ideas that may mature to become future startup products.
Source: Statista.com
Data Breaches affect Millions of Records
IBM Security: Cost of a Data Breach
Average total cost of a data breach USD 3.92 million
Most expensive country United States, USD 8.19 million
Most expensive industry Healthcare, USD 6.45 million
Average size of a data breach 25,575 records
Average cost per lost or stolen record $148
IBM security study:
Mega data breaches cost
$40 million to $350 million
Data breaches estimated to cost
over $2 Trillion Annually, by 2020
https://en.wikipedia.org/wiki/List_of_data_breaches
Data Breach/Data Privacy Penalties
• GDPR - European data protection regulation.
• Fines up to 10 million euros or 2% of annual turnover, whichever is higher; the most serious infringements carry up to 20 million euros or 4%.
• CCPA - California Consumer Privacy Act (enforced 2020)
• $100 to $750 in statutory damages per California resident per incident for a data breach.
• Applies to companies with over $25 million in annual revenue, or that possess personal information of 50K
or more consumers/devices, or that earn over half of their revenue from selling personal information
• Implement new personal information acquisition and deletion processes
https://en.wikipedia.org/wiki/California_Consumer_Privacy_Act
A business that collects a consumer's personal information must, at or
before the point of collection, inform the consumer as to the categories
of personal information to be collected and the purposes for which the
categories of personal information shall be used. A business must
disclose and deliver the personal information the business collected
about the consumer in response to a verifiable consumer request.
A business must delete the personal information the business
collected about a consumer and direct service providers to delete the
consumer's personal information in response to a verifiable consumer
request, subject to certain exceptions.
CCPA - new process for private information
In addition, after satisfying certain procedural requirements, a
consumer can bring a civil action in an amount not less than $100
and not greater than $750 per consumer per incident or actual
damages, whichever is greater, regarding their nonencrypted or
nonredacted personal information that is subject to an
unauthorized access and exfiltration, theft, or disclosure as a result
of the business's violation of the duty to implement and maintain
reasonable security procedures and practices appropriate to
the nature of the information to protect the personal information.
CCPA – penalties, require encryption,
security procedures
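As a rough illustration of the statutory range quoted above, the exposure from a single breach can be estimated from the number of affected consumers. This is a minimal sketch: the function name is hypothetical, and the figure of 25,575 consumers simply reuses the average breach size cited earlier as an illustrative input. Actual damages, where greater, are not modeled.

```python
def ccpa_statutory_range(consumers: int, incidents: int = 1) -> tuple:
    """Return (minimum, maximum) statutory damages in USD.

    Uses the $100 to $750 per consumer per incident range quoted above.
    Actual damages, if greater, are not modeled here.
    """
    if consumers < 0 or incidents < 0:
        raise ValueError("counts must be non-negative")
    return 100 * consumers * incidents, 750 * consumers * incidents


low, high = ccpa_statutory_range(25_575)
print(f"${low:,} to ${high:,}")  # prints "$2,557,500 to $19,181,250"
```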
What causes data breaches?
• Poor security processes that lead to attacks - such as storing passwords in the clear, storing private data without adequate encryption.
• Insecure data processing and storage infrastructure.
• Not following standards such as PCI, HIPAA etc.
• Lack of training.
• Insider threat - Malicious attacks, thefts, carelessness, mistakes etc. by employees and temporary staff with access to key data resources.
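To make the first cause concrete, the sketch below shows one common alternative to storing passwords in the clear: keeping only a salted, slow hash. It uses PBKDF2 from the Python standard library. The iteration count and function names are assumptions chosen for illustration, not recommendations from the talk.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a salted PBKDF2-HMAC-SHA256 hash; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and digest would be stored; the original password is never written to disk.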
Insider threats and human error
account for nearly 75% of security breach incidents
https://securityintelligence.com/news/insider-threats-account-for-nearly-75-percent-of-security-breach-incidents/
Current Data Security Strategies
Zero-Trust model. Adding security layers that verify users and devices before trusting them
Governance. Manage systems and people. Establish business rules, approved code and API libraries,
guidelines for controlling systems across business units, departments, and geographies.
Authentication methods. Strong passwords, 2FA, MFA, AMFA. Single sign-on and biometrics.
Encryption at rest and in motion.
Mobile Device Management (MDM) - Manage lost devices, control apps, commission/decommission
Backup, archiving, and storage.
AI and analytics to spot and address anomalies
Right IT tools and solutions.
Employee education and training.
Introducing Touchfree Data Security Model
• Minimal human touch - leverages robotic process automation for
security management and maintenance
• API-only access - to write and read sensitive data
• APIs secured via AI (Artificial Intelligence) - learns, detects and flags
abnormalities in data access (content, context, process, location etc.)
• Hosted in encrypted, distributed cloud - no direct data access
(patent pending)
(Attack vectors reduced to one)
API Security is key
According to Gartner, APIs will be the
most common attack vector by 2022
https://www.securityweek.com/next-big-cyber-attack-vector-apis
How do we ensure API security?
The most common attack vectors can be broken down into three categories:
• Parameters - Parameter attacks exploit the data sent into an API, including URL, query parameters, HTTP
headers, and/or post content.
• Identity - Identity attacks exploit flaws in authentication, authorization, and session tracking. In
particular, many of these are the result of migrating bad practices from the web world into
API development.
• Man-in-the-Middle - These attacks intercept legitimate transactions and exploit unsigned and/or unencrypted
data being sent between the client and the server. They can reveal confidential information
(such as personal data), alter a transaction in flight, or even replay legitimate transactions.
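One way to address the tampering and replay risks described under Man-in-the-Middle is to sign each request together with a timestamp. The sketch below uses HMAC-SHA256 from the Python standard library. The shared secret, the message layout, and the 300-second window are assumptions made for illustration.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 300  # assumed replay window


def sign(secret: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Sign the method, path, timestamp, and body with HMAC-SHA256."""
    message = b"\n".join([method.encode(), path.encode(), str(timestamp).encode(), body])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret, method, path, body, timestamp, signature, now=None) -> bool:
    """Reject stale timestamps (replay) and mismatched signatures (tampering)."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)
```

A timestamp alone still allows replay inside the window, so a per-request nonce would usually be added, and transport encryption is still needed to keep the payload confidential.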
Example - layered security model
https://www.forumsys.com/featured/four-pillars-of-api-security/
OWASP API SECURITY TOP 10 (2019)
Security Issue | Solution(s)
Broken Object Level Authorization | Verify user permissions/policies, don’t depend on IDs from clients
Broken Authentication | Strong passwords, keys, tokens, timestamps
Excessive Data Exposure | Don’t expose all data if not needed, to prevent traffic sniffing
Lack of Resources & Rate Limiting | Set resource limits and rules on clients
Broken Function Level Authorization | Check user authorization, endpoint access and user groups/roles
Mass Assignment | Avoid functions that bind a client’s input into code variables/objects
Security Misconfiguration | Fix unpatched flaws, harden environment, review settings
Injection | Validate, sanitize, filter client data. Parameterize interfaces. Limit records.
Improper Assets Management | Inventory all API hosts and document permissions
Insufficient Logging & Monitoring | Log all failed attempts, denied access, validation errors. Use SIEM (security information and event management) to aggregate/manage logs
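As an example of one row in the list above, "Lack of Resources & Rate Limiting", the following is a minimal sketch of a per-client sliding-window limiter. The class name and the choice of a sliding window are assumptions; a production API gateway would typically provide this as a built-in feature.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The caller passes the current time explicitly, which keeps the limiter easy to test; in a live service `now` would come from a monotonic clock.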
Mistakes or malicious intent by an
authorized insider who meets all the
requirements under OWASP could pose an
insider threat – this leads us to explore use
of AI technologies, to flag such threats
Using AI to spot and flag abnormality
Beyond API user identity we can track and analyze
• DATA - what is being requested (type, volume)
• TIME - when requested
• LOCATION - from where (network, IP address, geography)
• DEVICE - via which device
• APPLICATION - via what application
• FREQUENCY - how often
• HISTORY - since when
• CONTEXT - for what purpose, what relationships
• BEHAVIOR - with reference to user role, permissions and business processes
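The signals listed above could be captured per API call and turned into numbers for a model. The sketch below is one possible shape for such a record. The field names and the encoding choices (cyclic hour of day, log-scaled counts) are assumptions made for illustration and are not part of the original design.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class AccessEvent:
    user_role: str           # BEHAVIOR: role the caller holds
    data_type: str           # DATA: type of record requested
    records_requested: int   # DATA: volume
    hour_of_day: float       # TIME: 0.0 to 24.0
    ip_address: str          # LOCATION
    device_id: str           # DEVICE
    application: str         # APPLICATION
    requests_last_hour: int  # FREQUENCY


def to_features(event: AccessEvent) -> List[float]:
    """Encode the numeric parts of an event as a feature vector.

    Hour of day is mapped onto a circle so 23:00 and 01:00 end up close.
    """
    angle = 2 * math.pi * event.hour_of_day / 24
    return [
        math.sin(angle),
        math.cos(angle),
        math.log1p(event.records_requested),
        math.log1p(event.requests_last_hour),
    ]
```

Categorical fields such as role, device, and application are left unencoded here; they would need their own encoding (for example one-hot) before being fed to a clustering algorithm.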
AI 101
Labeling and flagging sensitive data
[Figure: data, labeled training set, flag private information (example: social security number)]
https://www.datasciencecentral.com by
Vincent Granville
Clustering is the problem of grouping
points by similarity using distance
metrics, which ideally reflect the
similarities you are looking for.
K-Means Clustering
Simple and elegant algorithm to partition
a dataset into K distinct, non-overlapping
clusters.
Healthcare Example –
protecting data access via k-means clustering
[Figure: clusters of patient data access. Axes: time of access (noon to midnight), location of access, volume of data, data type. Objective: flag outliers.]
Identify and optimize the position of the centroids in each cluster.
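The idea of fitting centroids to normal access patterns and flagging events far from every centroid can be sketched in a few lines of plain Python. This is an illustrative toy, not the presenter's implementation: the sample data (hour of access, records requested), the choice of k = 2, and the distance threshold are all made up, and real features would normally be scaled first.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) using the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def flag_outliers(points, centroids, threshold):
    """Return points whose distance to the nearest centroid exceeds the threshold."""
    return [p for p in points if min(math.dist(p, c) for c in centroids) > threshold]


# Hypothetical history of normal accesses: (hour of access, records requested).
history = [(9, 4), (20, 6), (10, 5), (11, 6), (10, 6), (21, 5), (19, 4), (20, 5)]
centroids = kmeans(history, k=2)

# New events are compared against the learned centroids.
new_events = [(10, 5), (20, 7), (3, 90)]
print(flag_outliers(new_events, centroids, threshold=5.0))  # prints [(3, 90)]
```

Here the 3 a.m. request for 90 records sits far from both the daytime and evening clusters, so it is the only event flagged.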
Other unsupervised
learning techniques
• KMeans + Autoencoder (a simple deep learning approach) - Autoencoders
are data compression algorithms that transform input data into
sparse, more efficient representations, which can speed up learning
and improve accuracy.
• Deep embedded clustering algorithm (advanced deep learning) –
learns feature representations and cluster assignments jointly using deep
neural networks, which can outperform and be more robust than clustering
on raw features.
Deep Neural networks
• Set of algorithms, loosely patterned after the
human brain, designed to cluster and
classify data by recognizing patterns,
typically after first learning from labeled data.
• Unsupervised deep learning needs little or no
labeled training data – it detects and
learns features from the data itself.
Towardsdatascience.com
Deep learning provides automatic discovery of
abnormal data access and patterns, without the need for
programming or customization, independent of data
structure, company, industry type etc.
Deployment
[Diagram labels]
• User/App - inside firewall
• API
• Firewall
• Touchfree.ai - API service managing private information; roles/authorization management, key management, token management
• Public/private cloud - encrypted, distributed containers (no direct data access)
touchfree.ai REST API
Function | Operation | Inputs | Outputs | Keys, tokens, timestamps
store PI | POST | PI (json) | Success flag | userauth, publickey for encryption, PI token
get PI | GET | - | PI (json) | userauth, privatekey for decryption, PI token
update PI | UPDATE | Updated PI (json) | Success flag | userauth, publickey for encryption, PI token
delete PI | DELETE | - | Success flag | userauth, publickey to writeover blank, PI token
inform PI | NOTIFY | Admin/CSO emails | PI abnormal status via AI check | -
admin PI | GET/POST/UPDATE | New admin settings | Current admin settings | userauth
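A client for the store, get, and delete functions in the table above might build its requests roughly as follows. This is a sketch under stated assumptions: the base URL, the header names, and the bearer-token scheme are hypothetical and do not come from the slides, and the public/private key encryption step is omitted. The update and inform functions are left out because the slides do not say how UPDATE and NOTIFY map onto HTTP methods.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.invalid/pi"  # hypothetical endpoint, not from the slides

METHODS = {"store": "POST", "get": "GET", "delete": "DELETE"}


def build_request(function: str, user_auth: str, pi_token: str,
                  pi: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one function in the table."""
    method = METHODS[function]
    body = json.dumps(pi).encode("utf-8") if pi is not None else None
    headers = {
        "Authorization": f"Bearer {user_auth}",  # assumed header scheme
        "X-PI-Token": pi_token,                  # assumed header name
    }
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL, data=body, headers=headers, method=method)
```

The function only constructs the request object; sending it with `urllib.request.urlopen` would require a real endpoint.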
Summary
• Currently enterprises are facing a growing epidemic of data breaches with severe
financial, business and regulatory costs.
• Enterprises now face a deadline of January 1st, 2020 to comply with the requirements of
California’s Consumer Privacy Act to safeguard private information.
• Insider threats from malware, mistakes, lack of training, as well as malicious attacks
account for about 75% of data breach attacks.
• TOUCHFREE.AI hosted in encrypted, distributed cloud containers offers a new data
security model that replaces most insider threat attack vectors with an API interface
secured by an AI abnormality monitoring layer and meeting regulatory processes that
require that consumers have complete control over their private information