© Copyright 2014 Glassbeam Inc. Internet of Complex Things Analytics with Cassandra Mohammed Guller September 11 | #CassandraSummit

Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassandra


Page 1: Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassandra

© Copyright 2014 Glassbeam Inc.

Internet of Complex Things Analytics with Cassandra

Mohammed Guller

September 11 | #CassandraSummit


Who am I

Application Architect and Lead Developer at Glassbeam

Founder, TrustRecs and GoodOrGreatIdea

MBA from Berkeley



Audience

Cassandra

–Expert

–Beginner

IoT

–Working on it

–Have read about it

Role

–Technical

–Business



Data from IoT is exploding

20x more connected “things” than people by 2020

42+% of data will be from machines

[Chart: data generated by apps, people, and machines across the 1980s, 1990-2000s, and 2010 and beyond]

Source: Cisco, IDC, Wikibon report 2013


IoT data presents new challenges


Volume

Instrumented systems generating Terabytes of data

Variety

Structured, unstructured and multi-structured data

Velocity

Streams/files coming in at machine speed from multiple systems


IoT data also presents new opportunities


•Discover up-sell and cross-sell opportunities

•Understand usage patterns, and adoption curve

•Predict customer needs and trends

•Build better products

•Become proactive vs. reactive

•Lower Mean Time To Resolution (MTTR)

•Increase customer satisfaction and retention

•Reduce support costs

[Slide labels: Support, Engineering, Sales, Marketing]


Glassbeam enables analysis of unstructured and multi-structured operational IoCT data

Operational data to powerful insights


Multi-structured data


========================== SYSTEM INFORMATION ==========================
OS Release: 1.1.5
Serial Number: 12345678
========================== DISK INFORMATION ==========================
Disk   Size     Alloc    Avail     Raid Group
------ -------- -------- --------- ----------
1.1    1000     76       924       rdg001
2.3    1178     72       1106      rdg002
================== DISK STATS ==================
1220338860,DISKPERF,1.1,60,65,004
220349660,DISKPERF,1.1,75,65,004
================ EVENTS ================
Tue Sep 2 00:00:00 2008 [monitor] INFO System operating at CPU < 50%
Tue Sep 2 11:04:00 2008 [disk] ERROR Media Write Error 4523 on disk 2.4

Static

Config

Stats

Logs
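One way to picture the four kinds of content in the sample (static, config, stats, logs) is a small parser that splits the bundle on its banner lines and applies a different rule to each section. The Python sketch below is only an illustration: it is not SPL, and the function names, field names, and the pairing of banners with parsing rules are assumptions made for this example.

```python
import re
from typing import Dict, List

SAMPLE = """\
========================== SYSTEM INFORMATION ==========================
OS Release: 1.1.5
Serial Number: 12345678
========================== DISK INFORMATION ==========================
Disk   Size     Alloc    Avail     Raid Group
------ -------- -------- --------- ----------
1.1    1000     76       924       rdg001
2.3    1178     72       1106      rdg002
================== DISK STATS ==================
1220338860,DISKPERF,1.1,60,65,004
220349660,DISKPERF,1.1,75,65,004
================ EVENTS ================
Tue Sep 2 00:00:00 2008 [monitor] INFO System operating at CPU < 50%
Tue Sep 2 11:04:00 2008 [disk] ERROR Media Write Error 4523 on disk 2.4
"""

# A banner line looks like "===== SECTION NAME =====".
BANNER = re.compile(r"^=+\s*(?P<name>[A-Z ]+?)\s*=+$")
# An event line: timestamp, [source], LEVEL, free-text message.
EVENT = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} +\d+ [\d:]+ \d{4}) "
    r"\[(?P<source>\w+)\] (?P<level>[A-Z]+) (?P<message>.*)$"
)


def split_sections(text: str) -> Dict[str, List[str]]:
    """Group non-empty lines under the most recent banner heading."""
    sections: Dict[str, List[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = BANNER.match(line)
        if match:
            current = match.group("name")
            sections[current] = []
        elif current is not None and line:
            sections[current].append(line)
    return sections


def parse_key_values(lines: List[str]) -> Dict[str, str]:
    """Static section: 'Key: value' pairs."""
    pairs = (line.split(":", 1) for line in lines)
    return {key.strip(): value.strip() for key, value in pairs}


def parse_disk_table(lines: List[str]) -> List[Dict[str, str]]:
    """Config section: whitespace-separated table under a two-line header."""
    columns = ["disk", "size", "alloc", "avail", "raid_group"]  # assumed names
    return [dict(zip(columns, line.split())) for line in lines[2:]]


def parse_stats(lines: List[str]) -> List[List[str]]:
    """Stats section: comma-separated records, kept as raw fields."""
    return [line.split(",") for line in lines]


def parse_events(lines: List[str]) -> List[Dict[str, str]]:
    """Log section: one structured record per matching event line."""
    return [m.groupdict() for m in map(EVENT.match, lines) if m]
```

Calling `split_sections(SAMPLE)` yields one list of lines per banner, and each list can then be handed to the parser that suits its shape.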


Semiotic Parsing Language (SPL) – our core IP


========================== SYSTEM INFORMATION ==========================
OS Release: 1.1.5
Serial Number: 12345678
========================== DISK INFORMATION ==========================
Disk   Size     Alloc    Avail     Raid Group
------ -------- -------- --------- ----------
1.1    1000     76       924       rdg001
2.3    1178     72       1106      rdg002
================== DISK STATS ==================
1220338860,DISKPERF,1.1,60,65,004
220349660,DISKPERF,1.1,75,65,004
================ EVENTS ================
Tue Sep 2 00:00 2008 [monitor] INFO System operating at CPU < 50%
Tue Sep 2 11:04 2008 [disk] ERROR Media Write Error 4523 on disk 2.4

Parsing rules

Storage rules

Search facets

Analytics transformations

Metadata

Multi-structured data SPL


60,000-foot view

[Diagram. Components shown: Machine data, SPL, SCALAR, Apps]


First-generation architecture

[Architecture diagram with numbered steps. Components shown: Multi-structured data, SPL, SPLi, ETL1, SQLite, ETL2, Vertica, MariaDB, Apache, Apps]


Challenges we faced

Slow ingest speed

Difficult to make schema changes

Data reload painful

Costly to scale

Not a multi-tenant solution


Next-gen with scalable technologies

[Architecture diagram. Components shown: Stream/files, SPL Library, SCALAR, S3, Cassandra, SolrCloud, PostgreSQL, INFOSERVER, Cloud Orchestration, and the apps LogVault, Explorer, Workbench, Standard Apps, Rules & Alerts, Custom Apps, DirectAccess, GB Studio]


Why we chose Cassandra

Volume: incrementally scale out from gigabytes to terabytes of data (Linear Scalability)

Variety: easily store structured, unstructured and multi-structured data (Dynamic Schema)

Velocity: ingest high-speed new data and quickly reload old data (Fast Writes)


C* enabled us to build a multi-tenant solution

One cluster

One keyspace

One set of column families for all customers
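A common way to share one set of column families across customers is to lead every partition key with a tenant identifier, so that each read and write is scoped to a single customer. The sketch below is an assumption about how that could look, not Glassbeam's actual schema: the table name, column names, and the `SharedTable` class are hypothetical, and the class is an in-memory stand-in rather than a Cassandra client.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical CQL for a table shared by all tenants. The composite
# partition key (tenant_id, system_id) keeps each customer's rows in
# partitions of their own; event_time orders rows within a partition.
CREATE_EVENTS = """
CREATE TABLE IF NOT EXISTS events (
    tenant_id  text,
    system_id  text,
    event_time timestamp,
    message    text,
    PRIMARY KEY ((tenant_id, system_id), event_time)
)
"""

Row = Tuple[int, str]  # (event_time as epoch seconds, message)


class SharedTable:
    """In-memory stand-in for one column family shared by all tenants."""

    def __init__(self) -> None:
        self._partitions: Dict[Tuple[str, str], List[Row]] = defaultdict(list)

    def insert(self, tenant_id: str, system_id: str,
               event_time: int, message: str) -> None:
        # Every write names its tenant, so rows never mix across customers.
        self._partitions[(tenant_id, system_id)].append((event_time, message))

    def read(self, tenant_id: str, system_id: str) -> List[Row]:
        # Every read is scoped to exactly one (tenant, system) partition.
        return sorted(self._partitions.get((tenant_id, system_id), []))


table = SharedTable()
table.insert("acme", "12345678", 1220338860, "disk 1.1 healthy")
table.insert("globex", "12345678", 1220338900, "disk 2.4 write error")
acme_rows = table.read("acme", "12345678")
```

Both tenants use the same system ID here, yet `acme_rows` contains only the row written for `acme`.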


What we store in C*

Parsed machine data

Metadata

Application configuration

Application usage statistics

Journal


Words of wisdom

Data model is important

–queries drive CF design

Avoid queries returning large amounts of data

– slow

–may cause problems
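One widely used way to keep any single query small is to bucket time-series rows by day in the partition key and to consume results a page at a time. The Python sketch below illustrates that idea under stated assumptions: the day-sized bucket and the helper names `day_buckets` and `pages` are choices made for this example and do not come from the talk.

```python
from datetime import datetime, timedelta
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def day_buckets(start: datetime, end: datetime) -> List[str]:
    """Return the YYYY-MM-DD bucket labels that cover [start, end].

    Each label would be part of a partition key, so a range query
    becomes one small query per day instead of one huge scan.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    span = (end.date() - start.date()).days
    return [(start.date() + timedelta(days=i)).isoformat()
            for i in range(span + 1)]


def pages(rows: Iterable[T], page_size: int) -> Iterator[List[T]]:
    """Yield rows in fixed-size pages so callers never hold everything."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    page: List[T] = []
    for row in rows:
        page.append(row)
        if len(page) == page_size:
            yield page
            page = []
    if page:
        yield page


buckets = day_buckets(datetime(2008, 9, 1, 23, 0), datetime(2008, 9, 3, 1, 0))
```

A reader would issue one bounded query per entry in `buckets` and walk each result with `pages`, which keeps both the server-side read and the client-side memory footprint small.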


Lessons from the trenches

Ad-hoc queries are difficult

–data models need to be driven by query patterns

– limited support from traditional BI tools

Performance depends on the data model and data

–benchmark with your data


Questions?


Thank you


[email protected]


Appendix



Glassbeam Apps

GB Studio
• Auto generation of SPL on log data formats
• Apply machine learning to decipher new log patterns

Rules and Alerts
• Define complex rules on machine data
• Proactive action using rules on incoming data

DirectAccess
• REST API allows access to parsed data
• Use 3rd party reporting tools like Tableau

Custom Apps
• Custom apps built by Glassbeam PS
• Examples: Capacity forecasts, Perf analysis

Standard Apps
• Out-of-the-box analytics after initial setup
• Config View, Config Diff, Trends View

Explorer
• Full text & parametric search across all logs
• Correlated event and section viewer

Log Vault
• Centralized cloud repository of all raw machine data
• Quick filters to find & download specific logs

Workbench
• Sandbox to play with parsed structured data
• Self serve app for building charts & graphs


Glassbeam Log Vault

Glassbeam Log Vault is a centralized cloud repository for all the log files sent to Glassbeam that has search and download features

Features
• Centralized store for historical log data stored in its original format
• Quick filters to search for a log of interest
• Download one or more log files/bundles from the vault

Benefits
• One stop shop for all log files, making it easier for anyone to get to the file quicker
• Shortcuts to other apps help users get to the log file first and then link to other apps as required

[Screenshots: Centralized Log Store, Quick filters, Save filters, Download Logs]


Glassbeam Explorer

Glassbeam Explorer is a full-text and parametric search application, combined with a log viewer and configuration change explorer

Features
• Explore logs and other machine data
• “Save Search” and build knowledge base
• Correlate data from multiple sources like CRM, case history, bug database, knowledge base
• Search both time series and multi-structured data

Benefits
• Seamless data and insights sharing across support, engineering and field organization
• Reduce MTTR in solving complex escalations
• Reduce tribal knowledge by sharing search best practices

[Screenshots: Graphs & Charts, Full Text Search, Section Diff, Event Viewer]


Glassbeam Workbench

Glassbeam Workbench provides an intuitive, drag-and-drop way to perform ad-hoc analysis and create visual analytics of machine data

Features
• Intuitive drag-and-drop, ad-hoc, visual analysis that can also be easily shared across the enterprise
• Create new calculations on existing data, make one-click forecasts
• Run trend analyses, regressions, correlations, and more

Benefits
• Discovery of deep insights and hidden correlations in underlying machine data
• Smart and business impactful visualization of parsed machine data
• Fast time to deployment from raw data to presentable graphs for executive and business use cases

[Screenshots: Intuitive, Drag and Drop Interface; Powerful Dashboards]


Glassbeam Standard Apps

Glassbeam Standard Apps is a suite of productivity apps for support that reduces troubleshooting time and helps create best practices templates to reduce MTTR

Features
• Organize parsed section data in any order
• Filter rows or columns of parsed data
• Transpose views to better represent data
• Save views as best practice templates along with KB documents

Benefits
• Reduce MTTR by allowing support to create custom views of their log data specific to a given problem
• Save custom views as templates, which would help troubleshoot known problems
• Document best practices for others in the team to follow

[Screenshots: Select one or more sections, Custom section view, Save custom views, Filter rows/columns]


Glassbeam Custom Apps

Glassbeam Custom Apps is a professional services offering that helps create powerful custom apps to provide insights on parsed machine data

Features
• Create custom applications built to a customer's specifications
• Transform customer requirements into custom apps in days (not months)
• Provide template apps for capacity planning, performance, version control, install base summary etc.

Benefits
• Create insightful analytics using our powerful tools and domain expertise
• Understand machine performance to build better products and accelerate time to market
• Identify usage patterns and version adoption to upsell customers

[Screenshots: Performance Analysis, Capacity Planning, Version Analysis, System Reports]


Glassbeam Rules and Alerts

Glassbeam Rules and Alerts is a powerful rules engine that allows complex rules to be set up for proactive action

Features
• Rules evaluated in real time, with streams, in the path of parsing
• Powerful backend allows for complex rules to be set up
• Modern architecture and design that enables rules to be set up from data across multiple sources

Benefits
• Proactive action on incoming data
• Tremendous ROI with support automation through automatic case opening in CRM (e.g. SFDC)
• Reduced tribal knowledge with centralized repository of all operational data rules
• Increased support efficiency (cases/engineer) since workload shifts from L3 to L1/L2 resolution

[Screenshots: Create new rule, List of rules, Generate rule report, Test rule]
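Evaluating rules "in the path of parsing" can be pictured as checking every parsed event against a list of condition/action pairs as soon as it is produced. The Python sketch below shows that pattern in its simplest form. It is an illustration only: the `Rule` class, the event fields, and the case-opening action (which just appends to a list instead of calling a CRM) are all hypothetical and are not Glassbeam's engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]  # one parsed log record, e.g. level/source/message


@dataclass
class Rule:
    name: str
    condition: Callable[[Event], bool]  # decides whether the rule fires
    action: Callable[[Event], None]     # side effect, e.g. open a case


def evaluate(rules: List[Rule], events: Iterable[Event]) -> List[str]:
    """Apply every rule to each event as it arrives; return fired rule names."""
    fired: List[str] = []
    for event in events:
        for rule in rules:
            if rule.condition(event):
                rule.action(event)
                fired.append(rule.name)
    return fired


# Stand-in for a CRM integration: collect the events that would open a case.
opened_cases: List[Event] = []

rules = [
    Rule(
        name="media-write-error",
        condition=lambda e: e["level"] == "ERROR"
        and "Media Write Error" in e["message"],
        action=opened_cases.append,
    )
]

events = [
    {"source": "monitor", "level": "INFO",
     "message": "System operating at CPU < 50%"},
    {"source": "disk", "level": "ERROR",
     "message": "Media Write Error 4523 on disk 2.4"},
]

fired = evaluate(rules, events)
```

Because `evaluate` accepts any iterable, the same loop works on a finished list or on a generator that yields events while a stream is still being parsed.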


Glassbeam DirectAccess

Glassbeam DirectAccess allows RESTful API access to data stored in Glassbeam for application developers to build custom applications

Features
• RESTful interface with multiple APIs to access parsed content
• SQL extracts for specific parsed content
• Aggregation and filtering functions built into APIs
• Highly scalable, asynchronous and distributed processing

Benefits
• Extend Glassbeam’s value by creating custom apps using the exposed APIs
• Mix parsed data from Glassbeam with data from other in-house sources for comprehensive dashboards
• Use any tools of choice for development, as the APIs are implemented using RESTful standards


Glassbeam Studio

Glassbeam Studio provides an intelligent IDE that allows SPL development and also auto-generates SPL code by analyzing log files

Features
• Handle stream data or log files
• Auto detect log formats and generate SPL code
• Continuously learns new formats
• Powerful UI to filter what to parse, select sections of interest, teach system about new formats etc.

Benefits
• Reduce time to generate custom SPL
• Quicker deployment cycle
• User can create new SPLs without coding

[Screenshots: GB Studio layout, Auto detect log formats, Auto generate Regex, Auto generate SPL]