How to Eat an Elephant Qlik and Big Data Ecosystem David Freriks Technology Evangelist Office of Strategy Management Q1 2018

Source: go.qlik.com/rs/497-BMK-910/images/Qlik-on-Big-Data-Q12018.pdf

How to Eat an Elephant –

Qlik and Big Data Ecosystem

David Freriks

Technology Evangelist

Office of Strategy Management

Q1 2018

Challenge - Providing Big Data to everyone

• Most Big Data Users are not Data Scientists
─ Business users want simple, guided access
• Helping the user find relevant and contextual information
─ Instead of having to search through everything
• Ensuring the solution can accommodate today and tomorrow
─ Big Data landscape continues to rapidly evolve
• Able to use different methods for different data volumes and complexities
─ “One method does not fit all”

“A car may produce an exabyte of data a year (a billion gigabytes), but most is completely meaningless. Isolating the megabyte of data a month that’s really valuable, and then figuring out what you can do with it, that’s the challenge of Big Data.”

Scott McCormick, president of the Connected Vehicle Trade Association and industry adviser to the U.S. Secretary of Transportation, September 2013

The Qlik platform: for all users

Most Big Data Users are not Data Scientists

[Figure: user segments plotted by Breadth of Coverage (horizontal) and Depth of Coverage (vertical). Segments: Data Experts, Data Scientists, Data Explorers. Usage patterns: Deep drilling; Mostly drilling, some exploration; Mostly exploration, some drilling.]

Descriptive, diagnostic and predictive analytics (“What happened?”, “Why did it happen?” and “What is likely to happen?”)

Qlik Accelerates Big Data ROI

Many firms that are investing in Big Data still struggle to get the most from it.

Qlik’s platform drives higher ROI by delivering big data in context with other data to ensure that Big Data stays relevant.

• Make Big Data Accessible
• Deliver Big Data In Context
• Keep Big Data Relevant

Qlik within a Big Data Architecture

[Architecture diagram. Stages: Gather, Initial Processing, Refinement, Analyze. Components: DATA SOURCES (unstructured data, structured data); HADOOP; NON-HADOOP; ACCELERATORS; standards-based or application-specific connector; QIX Associative Engine.]

The Data Lake

A Data Lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. *

* Source: http://searchaws.techtarget.com/definition/data-lake

Technology Implementation: Hadoop, EDW, RDBMS

“You’ve been loading data into a data warehouse for as long as we can remember. But no one asked us if we needed any of that data.”

The New IT: How Technology Leaders Are Enabling Business Strategy in the Digital Age, Jill Dyché, 2015

Why Qlik for Big Data?

Qlik is a data lake accelerator!

Indexed, Flexible, and Agile Data Model

Sync / Explore

Syncs and indexes data, and makes it available for (1) search, (2) explore, (3) report.

Simple Analogy: Analytics off of Big Data

[Diagram labels: Data Lake, Water Tower, Users, Raw, Direct, Sync, Drink]

If Data Is The New Oil...

Shouldn't We Treat It That Way?

• Nobody Invests In Drilling At Random

• You can’t use raw Oil for anything…

• Refining is key!

Big Data is Only Half the Story

“The final part of the story is adding context and relevance and delivering it to people at the point of decision.”

Advanced Analytics Integration (AAI)

Leverage the power of advanced analytics calculations in Qlik Sense

• Direct integration with 3rd party advanced analytics engines through server-side extension APIs
• Allows data to be directly exchanged between the QIX engine and external tools during analysis
─ Leverages Qlik’s Associative Model to pass relevant data based on user context
• Full integration with Qlik Sense expressions and libraries
• Connectors can be built for any external engine
• Open source connectors to be made available by Qlik for R and Python

How AAI works

1. User interacts with app, making a selection or a search
2. Hypercube recalculated by QIX Engine to the new context
3. In-context data and script sent to external engine
4. External engine runs and sends results to QIX engine
5. QIX engine combines hypercube with new data
6. Combined hypercube is visualized for the user in the app
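
The six steps above can be illustrated with a small, self-contained Python simulation. This is only a sketch of the data flow, not Qlik's server-side extension API (which this deck does not show): `recalculate_hypercube`, `external_engine`, `analyze`, and the sample rows are all hypothetical stand-ins.

```python
from statistics import mean

# Hypothetical sample data standing in for rows loaded in an app.
ROWS = [
    {"region": "East", "month": 1, "sales": 100.0},
    {"region": "East", "month": 2, "sales": 120.0},
    {"region": "West", "month": 1, "sales": 80.0},
    {"region": "West", "month": 2, "sales": 60.0},
]


def recalculate_hypercube(rows, selection):
    """Steps 1-2: keep only the rows that match the user's selection."""
    return [r for r in rows if all(r[k] == v for k, v in selection.items())]


def external_engine(script, values):
    """Steps 3-4: stand-in for an external engine such as R or Python.

    `script` only names one of two built-in calculations here; a real
    engine would evaluate code sent along with the in-context data.
    """
    functions = {"mean": mean, "max": max}
    return functions[script](values)


def analyze(rows, selection, script, measure):
    cube = recalculate_hypercube(rows, selection)
    result = external_engine(script, [r[measure] for r in cube])
    # Step 5: combine the hypercube with the externally computed value.
    combined = [{**r, f"{script}_{measure}": result} for r in cube]
    # Step 6: an app would visualize this combined result.
    return combined


if __name__ == "__main__":
    for row in analyze(ROWS, {"region": "East"}, "mean", "sales"):
        print(row)
```

Only the selected rows ("in-context data") are passed to the external function, which is the point of step 3: the external engine never sees the full data set.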

Qlik + Cloudera: 12 Points of Integration

Categories: Fast & Flexible BI & Analytics; Go Beyond SQL; Enterprise Ready

• App on Demand w/ Impala (In-memory user-generated data slices)
• Direct Query w/ Impala (Data Stored in Parquet or Kudu)
• Complex Data Types w/ Impala (Maps, Arrays, and Structures)
• Writeback with Kudu (Interactive Analytics)
• IOT and Kafka integration (Event Driven / Streaming Analytics)
• Solr Integration (In-Memory Apps Built on Solr Data)
• Qlik Solr-API App on Demand (Search + QAP + D3js)
• Advanced Analytics (Integration with Spark/Python/R)
• Cloudera Metrics Dashboard (REST API based management console for CM)
• Security: New SSO Support (Kerberos Delegation / SSO Pass-through)
• Data Lake Browser (Beta) (Data Concierge for Cloudera)
• SAP Offload w/ Attunity (SAP S&D Module into HDFS/Impala)

• Let’s Eat…

Analyze data

Different data volumes and complexities need different Qlik solutions

Table columns: Method, Description, Qlik Sense®, QlikView®

• In-Memory: Highly compresses data into memory.
• On Demand App Generation: User selection generates purpose-built app
• Segmentation & Chaining: Multiple related apps that are linked together
• Other methods: APIs related to On Demand App Generation; Partner solutions

Variables

• Data Volume
─ Size (rows)
─ Dimensions (columns)
─ Cardinality (uniqueness)
• App Complexity
─ Computational complexity
─ Object density

Methods can be combined to meet different use cases
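
As a rough illustration of the three data-volume variables named on this slide (rows, columns, cardinality), the sketch below profiles a tiny in-memory data set. The function name `data_volume_profile` and the sample records are hypothetical; nothing here is a Qlik API.

```python
def data_volume_profile(rows):
    """Return size (rows), dimensions (columns), and per-column cardinality."""
    columns = sorted({key for row in rows for key in row})
    cardinality = {
        col: len({row.get(col) for row in rows}) for col in columns
    }
    return {
        "rows": len(rows),
        "columns": len(columns),
        "cardinality": cardinality,
    }


# Hypothetical sample records.
sample = [
    {"customer": "A", "country": "US", "amount": 10},
    {"customer": "B", "country": "US", "amount": 25},
    {"customer": "A", "country": "DE", "amount": 10},
]

profile = data_volume_profile(sample)
print(profile)
```

A column with cardinality close to the row count (for example a transaction ID) holds mostly unique values, while a low-cardinality column such as a country code repeats often.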

On-Demand App Generation

Make selections to segment Big Data and generate analysis apps on the fly

• A template app summarizes the entire big data environment
• Users can select subsets of data and dynamically generate new apps for analysis
• Analysis apps offer fully unrestricted search and exploration
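
The selection-to-app idea described above can be sketched as turning a user's selections into a filtered query that loads only the chosen slice. This is a minimal sketch under stated assumptions, not Qlik's implementation: `build_on_demand_query`, the `max_rows` guard, and the `flights` table are hypothetical, and table and field names are assumed to come from a trusted template rather than from user input.

```python
def build_on_demand_query(table, selections, row_estimate, max_rows=1_000_000):
    """Turn user selections into a parameterized SQL query for a data slice.

    Raises ValueError when nothing is selected or when the estimated slice
    exceeds `max_rows` (a hypothetical guard against oversized loads).
    """
    if not selections or not all(selections.values()):
        raise ValueError("select at least one value per field first")
    if row_estimate > max_rows:
        raise ValueError(f"selection too large: {row_estimate} > {max_rows} rows")

    clauses, params = [], []
    # Sort fields so the generated SQL is deterministic.
    for field, values in sorted(selections.items()):
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"{field} IN ({placeholders})")
        params.extend(values)  # values are bound, never interpolated

    sql = f"SELECT * FROM {table} WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = build_on_demand_query(
    "flights",
    {"year": [2017], "carrier": ["AA", "DL"]},
    row_estimate=250_000,
)
print(sql)
print(params)
```

The summary view only needs aggregated data to drive selections; the detailed rows are fetched once the selection is small enough to analyze in memory.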

On-Demand with Qlik Sense APIs: Qlik Solr / Data Concierge

[Architecture diagram]

Presentation: Qlik Sense HTML 5 Web Client; Custom HTML 5 Interface / Client

Application: Proxy; Scheduler; QIX Engine; Repository; Applications

Data sources: Impala; Hive; Solr; CM/CN

Callouts:
• Used to create / save apps
• 2nd proxy used to auto-login anonymous users into known users
• Used to read data from Cloudera metadata app
• Used to visualize data profiles
• Reads users from NTLM

Demos

• SAP offload to Cloudera
• ODAG
• QlikSolr
• Cloudera Data Explorer / Metadata Miner

Thank you
