© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Armando Leite, Principal Security Architect 03/29/17 Best Practices for Managing Security Operations in AWS

Best Practices for Managing Security Operations in AWS - March 2017 AWS Online Tech Talks



A practical approach to help achieve SecOps excellence

• How to leverage AWS services to implement it.
• Take-home toolkit, i.e. try it yourself.

Control Monitor Fix

What to expect from the session

In detail

1. Introduction
2. CMF: Control/Monitor/Fix
   - Control: Creating the guardrails. IAM, Code*, AWS Config
   - Monitor: Provide visibility. CloudTrail, Flowlogs, Syslog, CloudWatch
   - Fix: Dealing with exceptions. Lambda
3. In practice (aka demo)
4. Your take-home kit and actions

MSB – Minimum Security Baseline
Pro Level – What to aim for.

Cloud Adoption Framework: The Security Perspective

[Diagram: the Security Perspective components (Directive, Preventive, Detective, Responsive) set against Control, Monitor and Fix.]

- Control: Driving the right behavior.
- Monitor: Maintain and assure over time.
- Fix: Get back to known good.

Our guidelines (‘Directive’)

Operating principles:

1. Think pipelines/workflows, not isolated controls.
2. Use the data.
3. The SOP is Code.

Control Monitor Fix

Phase 1: Control

Goal:
• Drive towards secure outcomes, i.e. build guardrails.

Possible options:
• IAM
• CloudFormation
• Code*

Best practice:
• MSB: Individual users + least privilege + use of groups.
• Pro level: Centralized deployment of controls across N accounts.

AWS Identity and Access Management (IAM)
• Enables you to control who can do what in your AWS account.
• Splits into users, groups, roles, and permissions.
• Control: Centralized. Fine-grained: APIs, resources, and AWS Management Console.
• Security: Secure (deny) by default.

Policy evaluation flow:
1. Decision starts at Deny.
2. Evaluate all applicable policies.
3. Is there an explicit deny? Yes: Final decision = “deny” (explicit deny). No: continue.
4. Is there an Allow? Yes: Final decision = “allow”. No: continue.
5. Final decision = “deny” (default deny).

AWS retrieves all policies associated with the user and resource.

Only policies that match the action and conditions are evaluated.

If a policy statement has a deny, it trumps all other policy statements.

Access is granted if there is an explicit allow and no deny.

• By default, an implicit (default) deny is returned.
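The evaluation order on this slide can be sketched in a few lines of Python. This is a simplified illustration, not how IAM is implemented: the `Statement` class and `evaluate` function are hypothetical names, conditions are ignored, and wildcard matching is approximated with `fnmatchcase`.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable


@dataclass(frozen=True)
class Statement:
    """Simplified policy statement (hypothetical type, conditions omitted)."""
    effect: str    # "Allow" or "Deny"
    action: str    # e.g. "s3:GetObject" or "s3:*"
    resource: str  # e.g. "arn:aws:s3:::logs/*" or "*"


def evaluate(statements: Iterable[Statement], action: str, resource: str) -> str:
    """Return "allow" or "deny" for a request, following the slide's order."""
    # Steps 1 and 2: start at deny and keep only statements matching the request.
    applicable = [
        s for s in statements
        if fnmatchcase(action, s.action) and fnmatchcase(resource, s.resource)
    ]
    # Step 3: an explicit deny trumps all other statements.
    if any(s.effect == "Deny" for s in applicable):
        return "deny"
    # Step 4: an explicit allow with no deny grants access.
    if any(s.effect == "Allow" for s in applicable):
        return "allow"
    # Step 5: nothing matched, so the default deny stands.
    return "deny"


policies = [
    Statement("Allow", "s3:*", "*"),
    Statement("Deny", "s3:DeleteObject", "arn:aws:s3:::logs/*"),
]
print(evaluate(policies, "s3:DeleteObject", "arn:aws:s3:::logs/2017/03/29.gz"))
```

The last call hits both statements, and the explicit deny wins even though a broader allow also matches.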

Top 11 IAM best practices
1. Users – Create individual users.
2. Permissions – Grant least privilege.
3. Groups – Manage permissions with groups.
4. Conditions – Restrict privileged access further with conditions.
5. Auditing – Enable AWS CloudTrail to get logs of API calls.
6. Password – Configure a strong password policy.
7. Rotate – Rotate security credentials regularly.
8. MFA – Enable MFA for privileged users.
9. Sharing – Use IAM roles to share access.
10. Roles – Use IAM roles for Amazon EC2 instances.
11. Root – Reduce or remove use of root.
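Practices 2, 4 and 8 can be combined in a single policy document. The sketch below builds one in Python; the helper name `read_only_bucket_policy` and the bucket name are illustrative assumptions, while the policy grammar and the `aws:MultiFactorAuthPresent` condition key are standard IAM.

```python
import json


def read_only_bucket_policy(bucket: str, require_mfa: bool = True) -> dict:
    """Build a least-privilege, read-only policy for one S3 bucket.

    Hypothetical helper for illustration; attaching the policy to a group
    (practice 3) would be done through IAM and is not shown here.
    """
    statement = {
        "Effect": "Allow",
        # Least privilege: only the two read actions that are needed.
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
    }
    if require_mfa:
        # Condition: the allow only applies to MFA-authenticated sessions.
        statement["Condition"] = {"Bool": {"aws:MultiFactorAuthPresent": "true"}}
    return {"Version": "2012-10-17", "Statement": [statement]}


# "example-audit-logs" is a made-up bucket name.
print(json.dumps(read_only_bucket_policy("example-audit-logs"), indent=2))
```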

One AWS account vs. multiple AWS accounts?

Use a single AWS account when you:
• Want simpler control of who does what in your AWS environment.
• Have no need to isolate projects/products/teams.
• Have no need for breaking up the cost.

Use multiple AWS accounts when you:
• Need full isolation between projects/teams/environments.
• Want to isolate recovery data and/or auditing data (e.g., writing your CloudTrail logs to a different account).
• Need a single bill, but want to break out the cost and usage.

Segmented AWS Account Structure

[Diagram: segmented AWS account structure.]
• Account categories: Financial, Security/audit, Utility, Operational.
• Accounts: Billing account, Security / Audit account, User management account, Logging account, Backup / DR account, Key management account, Shared services account, Production accounts, Dev / Test accounts.
• Stakeholders: Procurement and Finance, SOC/Auditors, Application Owners, Domain Specific Admins.
• Annotations: Consolidated Billing, Billing Alerts; Read-only access for all accounts; Event and State Logging; Read-only access to logging data.

Introducing AWS Organizations

Policy-based management for multiple AWS accounts.
• Control AWS service use across accounts
• Consolidate billing
• Automate AWS account creation

Typical Use Cases

Control the use of AWS services to help comply with corporate security and compliance policies.

Automate the creation of AWS accounts for different resources.

• API response to trigger additional automation. (e.g. deploy CloudFormation template)
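As an illustration of the first use case, a service control policy is a JSON document in the same grammar as an IAM policy. The sketch below builds one that denies a set of actions; the helper name `deny_actions_scp` and the choice of actions are assumptions made for this example, and attaching the policy through AWS Organizations is not shown.

```python
import json
from typing import List


def deny_actions_scp(actions: List[str]) -> dict:
    """Build a policy document that denies the given actions everywhere.

    Hypothetical helper: it only produces the JSON document.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Deny", "Action": sorted(actions), "Resource": "*"}
        ],
    }


# Example choice: stop member accounts from switching off their audit trail.
scp = deny_actions_scp(["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"])
print(json.dumps(scp, indent=2))
```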

What is AWS CloudFormation?
• AWS CloudFormation allows you to model, provision, and update the full breadth of AWS resources.
• Manage anything from a single Amazon EC2 instance to a multi-tier application.
• Integrates with other development and management tools.

Continuous Integration / Continuous Deployment

Cloudformation Security

Elements of a Continuous Delivery Pipeline

Commit Phase: Source Control changes
• Static code analysis: Analyze the CFN templates against a set of security rules.

Acceptance Phase: Dev Environment
• Dynamic analysis: Run template in sandbox / acceptance test environment.

Capacity/Integration/Staging Phases: Pre-Prod Environment
• Load, performance, penetration and failover testing.

Production Phase: Prod Environment
• Deploy controls.
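The commit-phase static analysis can be as simple as walking the template and applying rules to each resource. The following is a minimal sketch with a single rule (SSH open to the world), assuming a JSON template already parsed into a dict; the function names and the rule itself are illustrative and not taken from the demo code.

```python
from typing import List


def _covers_port(rule: dict, port: int) -> bool:
    """True if an ingress rule's port range includes the given port."""
    if str(rule.get("IpProtocol")) == "-1":  # -1 means all protocols and ports
        return True
    try:
        return int(rule["FromPort"]) <= port <= int(rule["ToPort"])
    except (KeyError, TypeError, ValueError):
        # Missing ports or intrinsic functions cannot be judged statically.
        return False


def find_open_ssh(template: dict) -> List[str]:
    """Return one finding per security group that allows SSH from anywhere."""
    findings = []
    for name, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::EC2::SecurityGroup":
            continue
        rules = resource.get("Properties", {}).get("SecurityGroupIngress", [])
        for rule in rules:
            if rule.get("CidrIp") == "0.0.0.0/0" and _covers_port(rule, 22):
                findings.append(f"{name}: SSH open to 0.0.0.0/0")
    return findings


template = {
    "Resources": {
        "WebSG": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "example",
                "SecurityGroupIngress": [
                    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
                     "CidrIp": "0.0.0.0/0"},
                ],
            },
        }
    }
}
print(find_open_ssh(template))
```

In a pipeline, a non-empty findings list would fail the stage so the template never reaches the acceptance phase.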

Code* for Infrastructure code

[Diagram: DevOps pushes code to S3 and CodePipeline pulls it through three phases: Commit Phase, Acceptance Phase, Prod Phase.]
Stages shown: Code Push / Code Pull (S3), Static Code Analysis (Lambda), Create Stack (CloudFormation), Dynamic Security checks (Lambda), Manual Approval, Delete Stack (CloudFormation), Create ChangeSet (CloudFormation), Approve ChangeSet, Execute ChangeSet (CloudFormation).

Control Monitor Fix

Phase 2: Monitor

Goals:
• Ensure effective operation over time.
• Detect anomalies/change.

Options:
• CloudTrail, CloudWatch*, VPC Flowlogs, Config…

Best practice:
• MSB: Aggregate log data.
• Pro level: Analyze and act on log data as it arrives.

What is AWS CloudTrail?
A fully managed service that records API calls made on your AWS account.
• Customers are making API calls...
• On a growing set of services around the world…
• CloudTrail is continuously recording API calls…
• And delivering log files to customers

[Diagram: CloudTrail in Account 1, Account 2, ... Account N delivers to a CloudTrail aggregation bucket in the Security account, which feeds triage/classification rules and an alert indexer.]

1. Automated configuration to enable logging and aggregation destination.
2. Log files deposited in S3 bucket under Security Account.
3. SNS notifies Lambda of new events available for processing.
4. Each Lambda evaluates a specific compliance item or misuse case.
5. Rules engines help define action to take based on asset and environment.
6. If dictated by rules engine, event results in notification via email, i.e. critical events.
7. Alerts preserved in DynamoDB for reporting and indexing of raw data.
8. All processing in Security Account, i.e. no external dependencies to add new logic, log processing, etc.
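Step 4 of this flow, one small function per compliance item or misuse case, might look like the sketch below. It assumes the log file has already been fetched from S3 and decompressed into a JSON string; the rule functions and `evaluate_log_file` are hypothetical names, while `Records`, `userIdentity`, `eventSource` and `eventName` are standard CloudTrail record fields.

```python
import json
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]
Rule = Callable[[Record], Optional[str]]


def root_account_used(record: Record) -> Optional[str]:
    """Misuse case: any API call made with root credentials."""
    if record.get("userIdentity", {}).get("type") == "Root":
        return f"Root credentials used for {record.get('eventName')}"
    return None


def trail_tampering(record: Record) -> Optional[str]:
    """Compliance item: nobody stops or deletes a trail."""
    if (record.get("eventSource") == "cloudtrail.amazonaws.com"
            and record.get("eventName") in {"StopLogging", "DeleteTrail"}):
        return f"CloudTrail tampering: {record['eventName']}"
    return None


RULES: List[Rule] = [root_account_used, trail_tampering]


def evaluate_log_file(log_body: str, rules: List[Rule]) -> List[str]:
    """Run every rule over every record and collect the findings."""
    alerts = []
    for record in json.loads(log_body).get("Records", []):
        for rule in rules:
            finding = rule(record)
            if finding:
                alerts.append(finding)
    return alerts
```

Keeping each rule as its own function mirrors the one-Lambda-per-check layout in the diagram: new logic is added by appending to the rule list, without touching the delivery path.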

AWS Config & Config Rules

[Diagram: changing resources are recorded by AWS Config and evaluated by Config Rules. Outputs: History, Snapshot, Notifications, API Access, Normalized.]
• AWS Config: Inventory and compliance
• AWS Config Rules: Evaluate resource configuration
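The core of a custom Config rule is a function that maps a configuration item to a compliance verdict. The sketch below checks EBS volume encryption as an example; the function name is hypothetical, the configuration item layout (`resourceType`, `configurationItemStatus`, `configuration.encrypted`) is assumed from the format AWS Config records for volumes, and reporting the verdict back to AWS Config is omitted.

```python
def evaluate_volume_encryption(configuration_item: dict) -> str:
    """Return a compliance verdict for one configuration item.

    Hypothetical rule for illustration: EBS volumes must be encrypted.
    """
    if configuration_item.get("resourceType") != "AWS::EC2::Volume":
        return "NOT_APPLICABLE"
    if configuration_item.get("configurationItemStatus") == "ResourceDeleted":
        # Deleted resources no longer need to be evaluated.
        return "NOT_APPLICABLE"
    encrypted = configuration_item.get("configuration", {}).get("encrypted", False)
    return "COMPLIANT" if encrypted else "NON_COMPLIANT"


item = {
    "resourceType": "AWS::EC2::Volume",
    "configurationItemStatus": "OK",
    "configuration": {"encrypted": False},
}
print(evaluate_volume_encryption(item))
```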

[Diagram: CloudTrail and Config in Account 1, Account 2, ... Account N feed a CloudTrail aggregation bucket in the logging aggregation account. Other components shown: SQS, CWE, Account DB, Dashboard, Alert…, Ticketing….]

[Diagram: the same layout extended with Flowlogs. CloudTrail, Config and Flowlogs in Account 1, Account 2, ... Account N feed a CloudTrail aggregation bucket and a Flowlogs aggregation bucket in the logging aggregation account. Other components shown: SQS, CWE, Account DB, Dashboard, Alert…, Ticketing….]

Control Monitor Fix

Fix – Correcting anomalies

Goal:
• Return to ‘known good’.
• ‘Don’t throw the baby out with the bathwater’…

Options:
• Lambda shines, but the whole AWS platform plays a role.

Best practices:
• MSB: Automate alerting and integrate with ticketing systems.
• Pro level: Closed loop.

[Diagram: Fix workflow. Labels shown: Signal, Noise; Gather, Enrich, Measure, Remediate; Do Nothing, Stop, Alert, Correct.]

Spectrum of options: Awareness, Indirect, Direct action.

Fix using AWS services

• Trusted Advisor
• AWS Config Managed Rules
• AWS Config Custom Rules with remediation
• CloudWatch Events rules with Lambda
• Lambda code with various triggers

Ease of getting started vs. customization and control
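At the customizable end of that range, a Lambda function triggered by CloudWatch Events decides what to do with each event. The sketch below shows only the decision step; the `RESPONSES` table, its entries and `plan_response` are assumptions for illustration, while the `detail-type` value and the `detail` fields follow the shape CloudWatch Events uses for API calls recorded by CloudTrail.

```python
from typing import Dict, Tuple

# Hypothetical mapping from (eventSource, eventName) to a response,
# ranging from awareness ("alert") to direct action ("correct").
RESPONSES: Dict[Tuple[str, str], str] = {
    ("ec2.amazonaws.com", "AuthorizeSecurityGroupIngress"): "alert",
    ("cloudtrail.amazonaws.com", "StopLogging"): "correct",
}


def plan_response(event: dict) -> str:
    """Classify an incoming event as "do_nothing", "alert" or "correct"."""
    if event.get("detail-type") != "AWS API Call via CloudTrail":
        return "do_nothing"  # noise: not an API call we watch
    detail = event.get("detail", {})
    key = (detail.get("eventSource"), detail.get("eventName"))
    return RESPONSES.get(key, "do_nothing")


event = {
    "source": "aws.cloudtrail",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "cloudtrail.amazonaws.com",
               "eventName": "StopLogging"},
}
print(plan_response(event))
```

A real handler would follow a "correct" verdict with the corrective API call, for example turning logging back on, and then open a ticket so the loop is closed.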

Security Incident Response Simulations
Test and benchmark your security response to security events. Experts from the Security, Risk and Compliance (SRC) practice can help you assess your current state of incident response readiness, then prepare and execute an exercise to practice that response.

Objectives:
• Assess current incident response processes and procedures
• Provide recommendations for using AWS services for incident response
• Test the cloud incident response process via a simulated exercise

Typical effort: 15 Man Days

Control Monitor Fix

In practice…

Demo – event flow

Modes: 1 – Standard, 2 – Enhanced, 3 – Active

[Diagram: two Auto Scaling groups, one of web server EC2 instances and one of app server EC2 instances, each in a security group. Data sources: CloudTrail, Flowlogs, Syslog, CloudWatch. An SSH “Login!” event (CWE Custom) is checked: “Logon ok?” returns “Logon is OK!”.]

In standard operation, we are observant.
Control:
- Security agent loaded in instance.
- Logons tracked.
Monitoring:
- We gather data covering API activity (CloudTrail), network (Flowlogs) and also in-instance activity (Syslog).
Fix:
- We are good.

Demo – event flow

Modes: 1 – Standard, 2 – Enhanced, 3 – Active

[Diagram: the same two Auto Scaling groups and data sources (CloudTrail, Flowlogs, Syslog, CloudWatch). An SSH “Login!” event (CWE Custom) is checked: “Logon ok?” returns “Logon NOT ok.” The Enhance step then runs: Subscribe to Syslog (OS data analysis), Enable instance-level Flowlogs, Subscribe to instance Flowlogs (network data analysis).]

A logon event occurs. We go to Enhanced surveillance mode.
Control:
- Dynamically add Lambda subscriptions to log feeds.
Monitor:
- In-instance activity (privilege escalation).
- Initiation of forbidden flows.
Fix:
- Alert only. Watchful but passive.
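The network data analysis in Enhanced mode could start from something like the sketch below, which parses flow log records in the default 14-field format and flags traffic from the watched instance to destination ports outside an allow-list. The allow-list, the sample addresses and the function names are assumptions. Flow logs do not record which side initiated a connection, so the check `srcport >= 1024` is only a heuristic for "the instance acted as the client".

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Flow(NamedTuple):
    srcaddr: str
    dstaddr: str
    srcport: int
    dstport: int
    action: str


def parse_flow(line: str) -> Optional[Flow]:
    """Parse one default-format record; skip NODATA/SKIPDATA records."""
    parts = line.split()
    if len(parts) != 14 or parts[13] != "OK":
        return None
    return Flow(parts[3], parts[4], int(parts[5]), int(parts[6]), parts[12])


def forbidden_flows(lines: Iterable[str], instance_ip: str,
                    allowed_ports: Set[int]) -> List[Flow]:
    """Flows that look initiated by the instance towards disallowed ports."""
    flagged = []
    for line in lines:
        flow = parse_flow(line)
        if (flow and flow.srcaddr == instance_ip
                and flow.srcport >= 1024            # heuristic: ephemeral source port
                and flow.dstport not in allowed_ports):
            flagged.append(flow)
    return flagged


sample = [
    "2 123456789010 eni-abc123de 10.0.1.5 203.0.113.9 49152 6667 6 10 840 1490000000 1490000060 ACCEPT OK",
    "2 123456789010 eni-abc123de 10.0.1.5 10.0.2.8 49153 443 6 12 960 1490000000 1490000060 ACCEPT OK",
    "2 123456789010 eni-abc123de - - - - - - - 1490000000 1490000060 - NODATA",
]
for flow in forbidden_flows(sample, "10.0.1.5", {443}):
    print(flow.dstaddr, flow.dstport)
```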

Demo – event flow

Modes: 1 – Standard, 2 – Enhanced, 3 – Active

[Diagram, built up over five slides: Elastic Load Balancing in front of an Auto Scaling group of web app server EC2 instances, each in a security group, plus an app server security group.]
1. OS data analysis of Syslog data (CloudWatch) detects root access.
2. The affected EC2 instance is marked as an anomaly.
3. Isolate: Block all (security group).
4. Deregister: Dereg ASG/ELB.
5. Preserve: Amazon EBS snapshots, CloudWatch Logs.

Demo – event flow

Modes: 1 – Standard, 2 – Enhanced, 3 – Active

[Diagram: Elastic Load Balancing in front of an Auto Scaling group of three web app server EC2 instances, each in a security group, plus an app server security group. The anomalous EC2 instance sits in its own security group outside the Auto Scaling group.]

An escalation occurred and we switched to Active, i.e. intervene and get it fixed.
Control:
- SG to isolate anomalous instance.
- Preserve instance for both live and offline analysis.
- Deregister application from live use.
Monitoring:
- We continue to monitor all activity as per previous steps.
Fix:
- The control actions cause the ASG to be 1 instance short, and it will recover to the original fleet size from ‘last known good’.
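The three modes of the demo form a small state machine. The sketch below is one way to model it, not the demo's actual code: the event names and action names are hypothetical labels chosen to match the slides.

```python
from enum import Enum
from typing import Dict, Tuple


class Mode(Enum):
    STANDARD = 1
    ENHANCED = 2
    ACTIVE = 3


# Actions taken when a mode is entered (names are illustrative labels).
ENTRY_ACTIONS: Dict[Mode, Tuple[str, ...]] = {
    Mode.STANDARD: (),
    Mode.ENHANCED: ("subscribe_to_syslog", "enable_instance_flowlogs",
                    "subscribe_to_instance_flowlogs"),
    Mode.ACTIVE: ("isolate", "deregister", "preserve"),
}

# Hypothetical event names; unknown events leave the mode unchanged.
TRANSITIONS: Dict[Tuple[Mode, str], Mode] = {
    (Mode.STANDARD, "logon_not_ok"): Mode.ENHANCED,
    (Mode.ENHANCED, "privilege_escalation"): Mode.ACTIVE,
    (Mode.ENHANCED, "forbidden_flow"): Mode.ACTIVE,
    (Mode.ACTIVE, "fleet_recovered"): Mode.STANDARD,
}


def step(mode: Mode, event: str) -> Tuple[Mode, Tuple[str, ...]]:
    """Return the next mode and the actions to run on entering it."""
    new_mode = TRANSITIONS.get((mode, event), mode)
    actions = ENTRY_ACTIONS[new_mode] if new_mode is not mode else ()
    return new_mode, actions


mode = Mode.STANDARD
for event in ("logon_ok", "logon_not_ok", "privilege_escalation", "fleet_recovered"):
    mode, actions = step(mode, event)
    print(event, "->", mode.name, list(actions))
```

Keeping the transitions in a table makes the escalation policy reviewable as data, in line with the principle that the SOP is code.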

Demo – event flow

Modes: 1 – Standard, 2 – Enhanced, 3 – Active

[Diagram: two Auto Scaling groups, one of web server EC2 instances and one of app server EC2 instances, each in a security group. Data sources: CloudTrail, Flowlogs, Syslog, CloudWatch.]

In standard operation, we are observant.
Control:
- Security agent loaded in instance.
- Logons tracked to TT.
Monitoring:
- We gather data covering API activity (CloudTrail), network (Flowlogs) and also in-instance activity (Syslog).
Fix:
- We are BACK TO good.

Summary

Control:
• IAM is the foundation for everything else.
• Service catalogue as an option to standardize product distribution.
• Code*: Embed security throughout (‘Fail early’).

Monitor:
• CloudTrail, Config, Flowlogs, …: To get visibility, you need to see – enable logging.
• Data is good. Better if you use it. Great if used to drive automation.

Fix:
• Reduce ‘Detect-Report-Remediate’ cycles.
• Automate to gain speed + free human intellect for more added-value tasks.

Take home kit – your turn!

#1 Demo code is published
• https://github.com/awslabs/automating-governance-sample

#2 Implementing DevSecOps using AWS CodePipeline
• https://aws.amazon.com/blogs/devops/implementing-devsecops-using-aws-codepipeline

#3 “What should I Control/Monitor/Fix next?”
• https://aws.amazon.com/whitepapers/aws-security-best-practices/

#4 (Optional) Come Jam with us!
• San Francisco Summit 2017 – April 18 (am) and April 19 (pm)
• Washington DC, Public Sector Summit – June 12 (pm)
• More to come… Your company?

Thank you!

Armando Leite, Principal Security Architect