
Operational Visibility and Analytics - BU Seminar


Built-in Operational Visibility and Analytics Designed for Cloud

Canturk Isci

IBM Research, NY

@canturkisci

Boston University, Thu Apr 28, 11:00 AM

CloudSight Research

Vulnerability Advisor


Cloud Evolution: Greats and Needs

What is Great

• Density
• Scale
• Portability
• Repeatability
• Speed

What Needs Work

• Visibility
• Operational Insight

Utility, Cost, Scale, Automation, Agility, (u)Services, Operational Intelligence

- Modernization of IT infra and SW delivery

- Complex made simple

- Unprecedented efficiency and TTV

- Lots of shiny toys across IT lifecycle

- Visibility into our environments remains an issue

- Also lots of shiny toys for monitoring & analytics

BUT:

- Still based on traditional IT Principles!


Our Work: Built-in Op Visibility & Analytics Designed for Cloud

- Provide unmatched deep, seamless visibility into cloud instances
- Drive operational insights to solve real-world pain points



Built-in Operational Visibility & Analytics Designed for Cloud

- Provide unmatched deep, seamless and unified visibility into ALL cloud instances
- Drive operational insights to solve real-world pain points

Agentless System Crawler (ASC)


Traditional Monitoring vs. Crawlers

[Figure: two panels comparing bare-metal server (BMS), VM, and container stacks, each built from Host, OS, and Workload (Wkld). First panel, traditional monitoring: agents (A) sit inside every workload. Second panel, crawlers: the same stacks with no agents inside the workloads.]


Some Data Points

From an employee: "This is the BES client agent. I don't know what it does but it's always at 50%. I would be the first customer to remove this evil thing from my machines:"

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

3515 root 20 0 781m 21m 6272 R 53.8 0.3 51:28.92 BESClient

C. Colohan. The Scariest Outage Ever. CMU SDI Seminar Series, 2012. http://pdl.cmu.edu/SDI/2012/083012b.html

Amazon. Summary of Oct. 22 '12 AWS Service Event in US-East Region. http://aws.amazon.com/message/680342/


Seamless: Built-in Monitoring & Logging in Bluemix Containers

“Users do not have to do anything to get this visibility. It is already there by default”

[Figure: Container Cloud. Labels: Docker Hosts (each running several App containers), Metrics & Logs Bus, Multitenant Index, Logmet Svc, Provisioning, Tenancy Info, State Events.]

Current State

• Built-in in every compute node, all geos
• Enabled by default for all users in all prod
• O(10K) metrics/s & logs/s


Seamless: Built-in Monitoring & Logging in Bluemix Containers

[Figure: Container Cloud running App containers; the monitoring machinery is labeled "magic" and the user's reaction is "Cool!".]

Happy User: Effortless, painless visibility in user world


Why Agentless System Crawlers

Key Advantages

• Monitoring built into the platform, not in end-user systems
• No complexity to end user (they do nothing; all they see is the service)
• No agents/credentials/access (nothing built into userworld)
• Works out of the box
• Makes data consumable* (lower barrier to data collection and analytics)
• Better Security* for end user (no attack surface in userworld)
• Better Availability* of monitoring (from birth to death, inspect even defunct guest)
• Guest Agnostic (build for platform, not each user distro)
• Decoupled* from user context (no overhead/side-effect concerns)

Monitoring done right for the processes of the Cloud OS


Deep Visibility: What We Actually Collect (and Annotate)

From Container/VM:

- OS Info
- Processes
- Disk Info
- Metrics
- Network Info
- Packages
- Files
- Config Info

From Docker Runtime:

- Docker metadata (docker inspect)
- CPU metrics (/cgroup/cpuacct/)
- Memory metrics (/cgroup/memory)
- Docker history

Annotators:

- Config Annotator
- Vulnerability Annotator
- Compliance Annotator
- Password Annotator
- SW Annotator
- Licence Annotator
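
As an illustration of the Docker Runtime CPU and memory metrics listed above, here is a minimal Python sketch that reads two counters from a cgroup v1 hierarchy. The directory layout (a docker/<container id> subdirectory under each controller) and the function names are assumptions for illustration, not the crawler's actual code; cpuacct.usage and memory.usage_in_bytes are standard cgroup v1 files.

```python
from pathlib import Path


def read_cgroup_counter(path):
    """Return the integer stored in a single-value cgroup file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def container_metrics(cgroup_root, container_id):
    """Collect CPU and memory counters for one container.

    Assumed layout (cgroup v1, one docker/<id> directory per controller):
      <root>/cpuacct/docker/<id>/cpuacct.usage         cumulative CPU time in ns
      <root>/memory/docker/<id>/memory.usage_in_bytes  current memory use in bytes
    """
    root = Path(cgroup_root)
    return {
        "container_id": container_id,
        "cpu_usage_ns": read_cgroup_counter(
            root / "cpuacct" / "docker" / container_id / "cpuacct.usage"),
        "memory_usage_bytes": read_cgroup_counter(
            root / "memory" / "docker" / container_id / "memory.usage_in_bytes"),
    }
```

Because the files are read from the host side, nothing has to run inside the container; a missing file simply yields None for that metric.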


Deep Visibility → Operational Insights/Analytics → Solve Real Problems

[Figure: pipeline with stages labeled Data Bus, Annotators, Index (Data), and Analytics, fed by the Container/VM data, Docker Runtime data, and annotators listed under "Deep Visibility: What We Actually Collect (and Annotate)".]

Analytics services:

- Vuln. & Compl. Analysis
- Config Analytics (SecConfig)
- Cloud Time Machine (Audit/PD)
- Pipeline Service (DevOps)
- Remediation Service

* All analytics services work from the same data & pipeline!

Today’s Special: Vulnerability Advisor


Crawler: How it Works for VMs

• Leverage VM Introspection (VMI) techniques to access VM memory and disk state (we built a number of our own optimizations that make this very efficient and practical)

• Can even remote both (decouple all from VM and host)

• Almost no new dependencies on host

• Currently support 1000+ kernel distros

[Figure: VM crawler architecture. Labels: Hypervisor; VM (OS, MEM, Disk); MEM View and Disk View; Memory Crawl API and Disk Crawl API; Crawl Logic; KB; structured view of VM states (Frames); Cloud Analytics and Analytics Apps.]


Crawler: How it Works for Containers

• Leverage Docker APIs for base container information

• Exploit container abstractions (namespace mapping and cgroups) for deeper insight

• Provide deep state info at scale with no visible overheads to end user

1) Get visibility into container world by namespace mapping

2) Crawl the container (crawler dependencies still borrowed from host; no need to inject into container!)

3) Return to original namespace

4) Push data to backend index
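
The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the crawler's actual implementation: crawl and push are hypothetical callables supplied by the caller, proc_root is a parameter added only so the sketch can be exercised without privileges, and the default os.setns requires Python 3.12+ on Linux with sufficient privileges.

```python
import os


def crawl_in_namespace(pid, crawl, push, setns=None, ns="mnt", proc_root="/proc"):
    """Run crawl() inside a namespace of process `pid`, then push the result.

    `crawl` and `push` are hypothetical callables; `setns` defaults to
    os.setns (Python 3.12+, Linux only, needs privileges).
    """
    setns = setns or os.setns
    target = os.open(os.path.join(proc_root, str(pid), "ns", ns), os.O_RDONLY)
    original = os.open(os.path.join(proc_root, "self", "ns", ns), os.O_RDONLY)
    try:
        setns(target, 0)            # 1) enter the container's namespace
        try:
            data = crawl()          # 2) crawl with code already loaded from the host
        finally:
            setns(original, 0)      # 3) always return to the original namespace
    finally:
        os.close(target)
        os.close(original)
    push(data)                      # 4) push data to the backend index
    return data
```

Keeping a descriptor for the original namespace open before switching is what makes step 3 possible; the nested try/finally ensures the crawler process is never left inside the container's namespace if the crawl fails.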


Crawler: Typical Deployment

• Typical deployment, able to track diverse cloud runtimes with parity

• Need not be on the same host; most crawler functions can even be run remotely


Crawler: Design

• Same crawler across runtimes for unified operational visibility

• Multiple fanouts as use cases grow


Open Innovation <3

April 13

Open Container Introspection Toolkit for Security Analysis


DEMO TIME

This Session

• Agentless System Crawler
• Bluemix Test Drive (live – ldwave): https://developer.ibm.com/bluemix/2015/11/16/built-in-monitoring-and-logging-for-bluemix-containers/
• LogCrawler and JSON Parsing (live – CanoLibUK3)
• Vanilla LogCrawler (20150619_LogCrawlerDemo)
• Crawl even Non-responsive systems (oopsRconsole2)
• Out of Band SIEM (QRadarDemo)
• TopoLog for Topology Discovery (newTopo)
• RTop for Realtime Monitoring (RtopAnnotatedMOV)
• Crawling for Rootkits with RConsole (RConsoleAnnotatedMOV)

Sunday & Wednesday

• Vulnerability Advisor
• Coming soon…


Bluemix Test Drive

Just start a Bluemix Container (https://console.ng.bluemix.net/)

Go to Container Overview (metrics show up in a few minutes)


… Bluemix Test Drive

Go to Monitoring and Logs

>> Monitoring


… Bluemix Test Drive

Go to Monitoring and Logs

>> Logging


Back to: Deep Visibility → Operational Insights/Analytics → Solve Real Problems

[Figure: the Container/VM data, Docker Runtime data, and annotators listed under "Deep Visibility: What We Actually Collect (and Annotate)".]

How can I identify my vulnerable/non-compliant images before they go live?

How can I detect and block systems with password access configurations and weak passwords?
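
One small piece of the password-access question above can be sketched as a check over crawled SSH server configuration. This is an assumption-laden illustration, not the Password Annotator itself: the function name is hypothetical, only the global section (before any Match block) is examined, and it relies on the OpenSSH behavior that the first value found for a keyword wins and that PasswordAuthentication defaults to yes.

```python
def password_auth_enabled(sshd_config_text):
    """Return True if the given sshd_config text allows password logins."""
    for raw in sshd_config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                        # skip blanks and comment lines
        parts = line.replace("=", " ").split()
        keyword = parts[0].lower()          # sshd_config keywords are case-insensitive
        if keyword == "match":
            break                           # simplification: global section only
        if keyword == "passwordauthentication" and len(parts) > 1:
            return parts[1].lower() == "yes"  # first occurrence wins
    return True                             # OpenSSH default when keyword is absent
```

A system whose crawled configuration returns True here could then be flagged or blocked by policy; detecting weak passwords themselves is a separate problem that this sketch does not address.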


Deep Visibility → Operational Insights/Analytics → Solve Real Problems

[Figure: the Container/VM data, Docker Runtime data, and annotators listed under "Deep Visibility: What We Actually Collect (and Annotate)".]

How can I track, query and analyze my configurations in a simple and robust manner for drift/config analytics?

How can I do better resource management and allocation?
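
To make the drift question on this slide concrete, here is a minimal Python sketch that compares two crawled configuration snapshots. Representing a snapshot as a flat key/value dictionary and the function name config_drift are assumptions for illustration only.

```python
def config_drift(baseline, current):
    """Compare two flat config snapshots and report what drifted."""
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    changed = {
        k: (baseline[k], current[k])
        for k in baseline.keys() & current.keys()
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots of the same system at two points in time.
before = {"PermitRootLogin": "no", "Port": "22"}
after = {"PermitRootLogin": "yes", "Port": "22", "X11Forwarding": "yes"}
drift = config_drift(before, after)
```

Since every snapshot comes from the same crawl format, the same comparison works across instances as well as across time.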


DEMO TIME

This Session

• Vulnerability Advisor, Policy Mgr
• Go to Bluemix Catalog
• See VA Image Status (Safe, Caution, Blocked)
• Go to Create View
• Explore Status Details (Vulnerabilities, Policy Violations)
• Browse Policy Manager (Policy Settings, Deployment Impact)
• Change Org Policies
• Override Policies (Don’t do it)
• See Weak Password Discovery
• Update Image in Local Dev
• Fix Policy Violation

Previously

• Built-in Monitoring & Logging
• We just did that one…


Getting Started: Let’s Go to London

Log in to Bluemix London (https://console.eu-gb.bluemix.net/)


Deployment Status

Log in to Bluemix London (https://console.eu-gb.bluemix.net/)

Go to Catalog and look for Containers

Hover over containers to see VA verdict: Safe to Deploy | Deploy with Caution | Blocked


Create View


Click on Image to go to Create View

See Verdict Details and Explore Options


Vulnerability Advisor Report


View Vulnerability Advisor Report: Discovered Vulnerabilities | Policy Violations


Policy Manager and Deployment Impact


Policy Manager and Deployment Impact

Change Org Policy and Observe Impact


Policy Override


Create View > Click One-time Override

Name your risky container and deploy
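
The demo walk-through above involves three verdicts, organization policies, and a one-time override. How those could fit together is sketched below in Python; the policy fields and the decision rule are purely hypothetical illustrations and do not describe how Vulnerability Advisor actually decides.

```python
from dataclasses import dataclass


@dataclass
class OrgPolicy:
    """Hypothetical organization policy: which findings block deployment."""
    block_on_vulnerabilities: bool = True
    block_on_policy_violations: bool = False


def va_verdict(vulnerabilities, policy_violations, policy, one_time_override=False):
    """Map findings to one of the three verdicts named on the slides."""
    if not vulnerabilities and not policy_violations:
        return "Safe to Deploy"
    blocked = (
        (bool(vulnerabilities) and policy.block_on_vulnerabilities)
        or (bool(policy_violations) and policy.block_on_policy_violations)
    )
    if blocked and not one_time_override:
        return "Blocked"
    return "Deploy with Caution"
```

Under this assumed rule, changing the org policy changes which images are blocked, and a one-time override downgrades a block to a caution for a single deployment.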


Also: One-stop Shop “Michael View” for the Purists


Some Nostalgia: Big Vision = Systems as Data

Transform systems into documents/frames/data

Crawl the cloud like you crawl the web

Query & mine the cloud like you query/mine the web

Learn good & bad system/SW configurations automagically
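
Once systems are documents, "querying the cloud" reduces to filtering those documents. The Python sketch below assumes a hypothetical frame shape (a dict with an id and a list of packages); the real frame format and index are not shown in this deck.

```python
def systems_with_package(frames, name, version_matches=lambda version: True):
    """Return the ids of frames that list package `name` with a matching version."""
    return [
        frame["id"]
        for frame in frames
        if any(
            pkg["name"] == name and version_matches(pkg["version"])
            for pkg in frame.get("packages", [])
        )
    ]


# Hypothetical crawled frames from a VM and two containers.
frames = [
    {"id": "vm-1", "packages": [{"name": "bash", "version": "4.3"}]},
    {"id": "cont-7", "packages": [{"name": "bash", "version": "4.4"}]},
    {"id": "cont-9", "packages": [{"name": "nginx", "version": "1.9"}]},
]
old_bash = systems_with_package(frames, "bash", lambda v: v == "4.3")
```

The same query runs unchanged over VMs and containers because both are reduced to the same document shape.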


Operational Analytics Data Pipeline [Where We Started]

[Figure: data pipeline from the image registry to the indices]

- Source: Images (Registry)
- Kafka channels: Configuration, Compliance, Vulnerability
- Annotators: Vulnerability Annotator, Compliance Annotator
- Indexers
- Elastic indices: Configuration, Compliance, Vulnerability
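
The shape of this pipeline (one crawled frame, several channels, one index per channel) can be sketched with an in-memory stand-in. Nothing below uses Kafka or Elasticsearch; the class, the annotator rules, and the frame fields are all hypothetical and serve only to show the fan-out.

```python
from collections import defaultdict


class InMemoryPipeline:
    """Toy stand-in for the channels and indices named on the slide."""

    def __init__(self):
        self.annotators = {}               # channel name -> annotator callable
        self.indices = defaultdict(list)   # index name -> stored documents

    def register(self, channel, annotator):
        self.annotators[channel] = annotator

    def ingest(self, frame):
        """Send one crawled frame through every channel and index the results."""
        for channel, annotate in self.annotators.items():
            result = annotate(frame)
            if result is not None:
                self.indices[channel].append({"image": frame["image"], **result})


def vulnerability_annotator(frame, bad=frozenset({("openssl", "1.0.1f")})):
    # `bad` is a one-entry illustrative list, not a real vulnerability feed.
    hits = [p for p in frame["packages"] if (p["name"], p["version"]) in bad]
    return {"vulnerable_packages": hits}


def compliance_annotator(frame):
    # Hypothetical rule: password-based SSH login counts as a violation.
    ok = frame["config"].get("PasswordAuthentication", "yes") == "no"
    return {"compliant": ok}


pipeline = InMemoryPipeline()
pipeline.register("vulnerability", vulnerability_annotator)
pipeline.register("compliance", compliance_annotator)
pipeline.ingest({
    "image": "demo:latest",
    "packages": [{"name": "openssl", "version": "1.0.1f"}],
    "config": {"PasswordAuthentication": "no"},
})
```

Adding a new analytic in this model means registering one more annotator and index; the crawl itself does not change.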


Operational Analytics Data Pipeline [Where We Are]

[Figure: expanded data pipeline]

- Sources: Images (Registry), Instances (Compute), Security Notices
- Kafka channels: Notification, Configuration, Compliance, Vulnerability, Discovery, SecConfig, Rootkit, Licence
- Annotators and parsers: Vulnerability Annotator, Compliance Annotator, Password Annotator, Config Parser, SecConfig Annotator, SW Discovery, Rootkit Annotator, Licence Discovery, Notification Parser
- Indexers
- Elastic indices: Notification, Configuration, Compliance, Vulnerability, Discovery, SecConfig, Rootkit, Licence, USNs


Our Other Key Operational Analytics Directions

- Config Analytics
- SW and System Discovery by Examples
- Secure Config Advisor
- Cloud Time Machine
- Risk Analysis
- Licence Discovery (figure labels: Data Pipeline, Licence Db)


Summary & Open Problems

• Summary:
  - Challenges: Operational visibility into complex cloud applications; need for real operational intelligence
  - Opportunities: Transform systems to data; New line of ops data analytics; So many low-hanging pain points
  - Agentless System Crawler and Vulnerability Advisor as simple ground-floor examples

• Parting Thoughts:
  - Operational Visibility >> Metrics & Logs (although a good start, add state, config, interactions, dependencies, …)
  - Cloud lends itself to novel & elegant “monalytics” designed with cloud-native thinking
  - Everything analytics can be as-a-service when we decouple systems | observations | recommendations | actions

• Open Research Questions:
  - Truly Seamless OpVis: No performance impact (~/~) + Absolutely no side effects (+/-)
  - Extensibility and configurability: Deep visibility into system, application and infra
  - Scale out across runtimes and scale up to many instances; challenges & limits
  - How do you design DDOS-mitigation/admission-control/fair sharing in this model of built-in service?
  - Privacy and data sensitivity with Ops data analytics
  - Piecemeal analytics/security solutions → Cloud analytics/security roadmap
  - Rules/annotators → Actually smart analytics that learn good and bad configs for security, performance, availability, etc.
  - Cross-silo analytics across Time, Space, Dev/Ops [CloudSight Dream]


The More You Know

• Papers:
  - Operational Visibility: IC2E’14, Sigmetrics’14, VEE’15, HotCloud’15, ATC’16 (InterConnect’15)
  - Operational Analytics: BigData’14, IBM JRD’16:{SWDisc,NFM,DevOps} (InterConnect’16)

• Blogs:
  - Crawl the Cloud Like You Crawl the Web: https://developer.ibm.com/open/2015/07/18/crawl-cloud-like-crawl-web/
  - Monitoring and Logging for IBM Containers. No configuration needed: https://developer.ibm.com/bluemix/2015/07/06/monitoring-and-logging-for-containers-no-config-required/
  - Test Driving Built-in Monitoring and Logging in IBM Containers: https://developer.ibm.com/bluemix/2015/11/16/built-in-monitoring-and-logging-for-bluemix-containers/
  - Is your Docker container secure? Ask Vulnerability Advisor!: https://developer.ibm.com/bluemix/2015/07/02/vulnerability-advisor/

• Demos:
  - https://www.youtube.com/channel/UCf8Fn8dKQzBCJRgI1jOlGYg

• Open Source:
  - dwOpen Tech Talk: https://developer.ibm.com/open/events/dw-open-tech-talk-agentless-system-crawler/
  - dwOpen Page: https://developer.ibm.com/open/agentless-system-crawler/
  - Agentless System Crawler: http://github.com/cloudviz/agentless-system-crawler
  - PSVMI Introspection Library: https://github.com/cloudviz/psvmi

• Try It:
  - As-a-service today: http://www.bluemix.net
  - Run it yourself: http://github.com/cloudviz/agentless-system-crawler


Thank You

Seamless, Unified Operational Visibility and Analytics Designed for Cloud

[feat. Agentless System Crawler & Vulnerability Advisor]

IBM Research

Cloud Monitoring, Operational and DevOps Analytics

http://www.canturkisci.com/blog

@canturkisci