Introduction to MapR

© 2016 MapR Technologies

Introduction to MapR - DAMAhouston.dama.org/wp-content/uploads/2017/02/2017_FEB_MAPR_HOUSTON_D… · • A MapR Converged Data Platform cluster installed in MapR’s private cloud


OUR GLOBAL REACH

• San Jose, California (HQ)
• United Kingdom
• Korea
• Netherlands
• Germany
• France
• India
• Singapore
• Australia
• Japan

MapR is Transforming Business with Data

WHAT WE DO

Bring together analytics and operations into next-generation Converged Applications for the business

WHY IT MATTERS

Empowers companies to grow revenue through innovation and cutting costs

HOW WE DO IT

Patented technology architecture with the world’s only complete Converged Data Platform

Leading companies around the world are transforming their business with the industry’s only Converged Data Platform

Enabling Transformation Through Converged Applications

• OPERATIONAL APPLICATIONS: Immediate
• ANALYTICAL APPLICATIONS: Historical

Converged Applications: Complete access to real-time and historical data in one platform

Our Customers Are Leading the Way

Industries: Financial Services, Telco & Media, Ad tech, Government, Retail

Themes: INNOVATION, COST REDUCTION

• Over 80 use cases including payment efficiency and accuracy of claims processing. $2M/month reduction in payment errors and fraud.
• Provides 95% of Fortune 500 CPG and retailers with data and analytics. Achieved $2.5M in annual savings from mainframe & DW offload.
• Ported credit scoring use case to MapR, resulting in 20X cost savings over DB2.
• Biometric identification system for more than 1.25 billion people in India. $1.3B yearly savings through fraud reduction.
• Developed a new self-service analytics platform to give their customers better market insights to help them operationalize their decisions.
• Protects $1 trillion in charge volume from fraud every year. Amex offers program has saved card members over $180M.

Powered by the World’s Only Converged Data Platform

A platform engineered to support next-generation applications

• Breakthrough Reliability: Operate globally at enterprise grade for mission critical apps
• Breakthrough Value: Radically cut costs of big data IT infrastructure
• Breakthrough Innovation: Enable continuous innovation with proprietary technology and open source access

Built with Breakthrough Technology

Innovative architecture delivers uncompromising scale, speed and availability

• Optimized for Speed: Supports parallel processing of large scale analytics and machine learning across data.
• Optimized for Availability: Provides advanced capabilities including self-healing and disaster recovery to support continuous data access.
• Optimized for Scale: Enables high scale processing by organizing underlying data into large distributed containers to scale to trillions of files.

A Crisis of Complexity vs. the Complete Data Platform

A Crisis of Complexity (BUSINESS MODEL: SUPPORT FREE SOFTWARE)

• Expensive to stitch together
• Fragile not agile
• “Connected” and “Federated” not converged
• Limited in scale, no global
• Many security models, points of failure

Separate systems shown: Hadoop & Spark cluster; Cassandra for event or content logging; classic data warehouse; message middleware; application server; document JSON DB; search server

The Complete Data Platform (BUSINESS MODEL: ENTERPRISE SOFTWARE LICENSES)

• Engineered as single platform
• Powers legacy and next-gen apps
• Enables continuous innovation
• Supports all big data technologies
• Multiple deployment environments

A Single Platform: On-Prem, In the Cloud, or InterCloud

• Flexible processing where change is the norm
• Distributed processing across clusters, data centers, public & private cloud environments
• Supports global apps that can scale arbitrarily


MapR Customer Use Cases

Common Use Cases:

ENTERPRISE DATA HUB
• Multi-structured data staging & archive
• ETL / DW optimization
• Mainframe optimization
• Data exploration

MARKETING OPTIMIZATION
• Recommendation engines & targeting
• Customer 360
• Click-stream analysis
• Social media analysis
• Ad optimization

RISK & SECURITY OPTIMIZATION
• Network security monitoring
• Security information & event management
• Fraudulent behavioral analysis

OPERATIONS INTELLIGENCE
• Supply chain & logistics
• System log analysis
• Manufacturing quality assurance
• Preventative maintenance
• Smart meter analysis
• Non-Productive Time Mitigation

Exploration and Production Optimization

Find new sources of revenue and maximize revenue from existing sources

OBJECTIVES
• Make better and faster decisions on pursuing future production projects
• Make more accurate measurements of yield/cost ratio on existing projects

CHALLENGES
• Optimal predictive analytics requires massive data volumes, compute power, and input speed, leading to costly infrastructure
• Existing data loads limit the ability to run additional analytics for identifying new revenue opportunities

SOLUTION
• Cost-effective, high performance and scalable computing platform for capturing data from many sources
• Ability to run complex analytics over massive volumes of data to identify patterns that lead to new revenue sources
• High performance analytics to keep up with massive volumes of high velocity, granular data
• Scalable platform for more cost-effective, parallel processing of predictive analytics

Business Impact
More precise predictions for new sources of revenue, lower costs associated with exploration, more efficient production use of existing revenue sources

Image credit: “Oilfields near Ramana” by Mark van Laere is licensed under CC BY-ND 2.0

NOV Avoids Oil Well Failure and Reduces Costly Downtime

Predictive analytics on oil well operations enables proactive repairs prior to failure

OBJECTIVES
• Reduce well failure rate by improving predictability of repair and replacement schedule of parts/equipment by analyzing higher resolution data
• Avoid costly downtime of revenue-generating operations

CHALLENGES
• High cost of managing huge volumes of high resolution data to predict failure
• Failures occur due to many different variables (usage patterns, usage conditions, etc.) so data on all factors must be captured and correlated

SOLUTION
• Efficiently collect/store huge volumes of sensor data (up to 1TB historical data per rig, PBs of total data), scale out as data grows
• Use predictive analytics and anomaly detection to analyze all data inputs, and based on historical patterns, alert when equipment is likely to fail
• High performance MapR lets them store data at a higher frequency, with fewer resources
• Low latency enables faster & advanced responsiveness for keeping assets running and productive

Business Impact
Customers save millions of dollars with predictive maintenance – gaining greater insights with higher resolution data provides a competitive advantage

Smart Meter Analysis

Make more accurate operations decisions from smart meter data

OBJECTIVES
• Better segmentation of consumer markets for optimized pricing
• Identify opportunities for value-added data services – alerts on anomalous usage, recommended power plans, allocation planning, etc.

CHALLENGES
• Cost-effectively managing high velocity data from millions of sources
• Scaling for growth expectations and higher resolution data

SOLUTION
• Fast ingestion/storage and large scale analytics on a cost-effective, distributed computing platform
• Clustering/segmentation, proactive alerting, usage recommendations, graph analysis, pattern matching, etc. for customer billing optimization, demand response optimization, increasing operational efficiency
• High performance analytics to get a better understanding of usage and behavior
• Integrated security and HA/DR capabilities to comply with regulations

Business Impact
Revenue opportunities around better resource allocation, special offers, customer analytics, etc.

Image credit: “Onzo Smart Energy Meter Kit Display” by Digitpedia Com is licensed under CC BY 2.0

World’s Largest Biometric Database

Indian government agency creates biometric identification system for all citizens

INDIAN GOVERNMENT AGENCY

OBJECTIVES
• Increase % of citizens who have bank accounts and can access benefits
• Reduce corruption and fraud in government aid programs

CHALLENGES
• Issues with data replication and loss across clusters in competing distribution
• Weak disaster recovery strategy in competitive distribution
• Complicated upgrade process and high availability issues

SOLUTION
• Complete data backup: Snapshots and mirroring
• Lower maintenance overhead: Rolling upgrades
• Fingerprints and retina scans with 200 millisecond response: MapR-DB

Business Impact
➢ Approximately 20% reduction in fraud and leakage of govt aid programs ($50B)
➢ Average citizen’s life is transformed as they can get access to various stipulated benefits
➢ 645 million citizens currently enrolled providing identity for approx. 60% of the population
➢ 10x throughput; 4-6x lower latency; 1/3 the hardware of previous Hadoop distribution


Next Steps

Put the Power of MapR to Work for Your Business

Wherever you are on your big data journey

• Reduce costs to improve efficiency
• Extend capabilities to grow revenues
• Innovate for disruptive advantage

We Make It Easy to Get Started

Take the MapR Big Data Maturity Model

1. Experimentation: Understand capabilities of big data platform
2. Implementation: Develop first use cases and put into production
3. Expansion: Expand to multiple use cases across key lines of business
4. Optimization: Integrate and expand data driven apps and analysis to all lines of business and more business functions


Sullexis


Cost Effective Data Archiving and Reporting with BigData Tools and the Cloud

DAMA (Houston) – February 14, 2017

Tim Morgan - Managing Director

About Sullexis

• Sullexis is a professional services firm that specializes in helping its clients create, manage, and enhance data to accelerate and improve decision making across the enterprise. We bring data and technology together to make our clients measurably more effective.
• With industry experience ranging from energy and manufacturing to finance and high tech, Sullexis brings the technology, processes, and strategies together to make you more effective in what you do.
• Founded in 2006, Sullexis is headquartered in Houston, TX and has a delivery center in Monterrey, MX.
• Our consultants have implemented solutions across the US, Caribbean, Europe and Latin America.

Client Background

• Our client is one of North America’s largest Oilfield Services companies, providing well construction, completion and operating services to exploration and production companies.
• A significant number of acquisitions over the last 10 years resulted in 18 different ERP applications running on 5 different platforms. To enable future, scalable growth, they embarked on an ERP standardization project. The goal was to put the entire company on one technology stack with a common process.
• Having decided to consolidate on a single ERP, the client still needed to determine how best to handle compliance, regulatory and operational needs associated with the legacy systems.
• Migrating transaction data to the new ERP would be cost prohibitive and risky, and market-ready data archiving solutions were costly and unable to meet the defined business needs.
• This left retaining the legacy systems themselves, which would be very costly, or finding a new approach that was cost effective, reliable and could meet the business needs.

Slide callout: 18 to 1

Key Requirements

Preserve and provide easy access to ALL data
• Preserve all structured and unstructured data (approx. 12 TB)
• Ability to run legacy reports to meet compliance, regulatory and ongoing business needs
• Easy for a business person to use, to minimize IT resource dependency
• Ability to provide consolidated views across disparate data sets

Be cost effective
• Flexible and scalable compute/data storage options (e.g. use of cold storage)
• Provide access through existing BI and reporting tools (e.g. Hyperion, MS Power BI, SAP Lumira) to eliminate new purchases and training
• Enable 100% decommissioning of legacy systems

Enable the future
• Establish processes and tools that support future company acquisitions
• Provide platform to enable new and innovative data applications and solutions

Solution Selection Process

Initial Analysis
• Market research
• Vendor presentations

Two week POC ‘bake-off’ to demonstrate:
• Rapid integration of different data sources, both structured and unstructured
• Connectivity to SAP ECC and Oracle EBS
• Reporting capabilities re-using SAP Lumira

Winning POC Solution
• A MapR Converged Data Platform cluster installed in MapR’s private cloud
• Predefined adapters for Oracle used to extract and load structured data to MapR (<100GB)
• Unstructured data (CSV, PDF and TXT files) loaded and made viewable through Elasticsearch
• Apache Drill and a local install of SAP Lumira connected to the MapR cluster to demonstrate reporting capabilities


Solution Architecture

Project Considerations

Technology Factors
• Reliability and speed of connection to cloud
• Count and category of machines in cloud (CPU, RAM, Storage)
• Volume of data (row size and count)
• Ongoing transaction use of source system
• Variable needs for data (frequency, response, volume)

Project Factors
• Timeliness of and accessibility to various parties
• Cataloging of all data
• Evaluation of transactional status of existing data sets, and how to address moving targets (blackout periods, iterative loads, journaling)
• Ability to validate data loads (row counts, samples)
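Validating a data load by row counts can be illustrated with a small comparison routine. This is a minimal sketch, not the tooling used on the project: the table names and counts are hypothetical, and in practice the two dictionaries would presumably be filled from `SELECT COUNT(*)` queries against the source system and the archive.

```python
from typing import Dict, List, Tuple


def find_row_count_mismatches(
    source_counts: Dict[str, int],
    target_counts: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (table, source_rows, target_rows) for every table whose counts differ.

    A table missing from the target is reported with a target count of -1.
    """
    mismatches = []
    for table, source_rows in sorted(source_counts.items()):
        target_rows = target_counts.get(table, -1)
        if source_rows != target_rows:
            mismatches.append((table, source_rows, target_rows))
    return mismatches


# Hypothetical counts, for illustration only.
source = {"gl_entries": 1_200_000, "invoices": 350_000, "vendors": 4_200}
target = {"gl_entries": 1_200_000, "invoices": 349_998}

for table, src, tgt in find_row_count_mismatches(source, target):
    print(f"{table}: source={src} target={tgt}")
```

An empty result means every source table arrived with the same number of rows. Row counts alone do not prove the contents match, which is why the slide pairs them with samples.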

Solution Architecture

[Architecture diagram. Recoverable labels:]
• Source data: PDF, CSV, XLS, Oracle, Navision, SysPro, MS Excel, Great Plains
• Stored files: PDF, TIFF, CSV
• Access: NFS
• Enterprise Grade Platform: Web-Scale Storage (MapR-FS), Database (MapR-DB), Event Streaming (MapR Streams)
• Platform services: Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace, High Availability

Why Azure

• Sullexis and client both experienced with Azure and MSFT
• MapR Quick Start on Azure made it easy and fast to get started
• MapR already running well on Azure (see blog)
• Client’s enterprise MSFT account made it simple to procure and administer
• Connectivity to Azure via ExpressRoute mitigated some of the reliability and latency concerns of the connection

Apache Drill - Flexible & Fast

Access to any data type, any data source
• Relational
• Nested data
• Schema-less

Rapid time to insights
• Query data in-situ
• No Schemas required
• Easy to get started

Integration with existing tools
• ANSI SQL
• BI tool integration

Scale in all dimensions
• TB-PB of scale
• 1000’s of users
• 1000’s of nodes

Granular security
• Authentication
• Row/column level controls
• De-centralized
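To make "query data in-situ" with "ANSI SQL" concrete, here is a minimal sketch that submits SQL to Drill's REST query endpoint from Python. The host name and the file path in the query are assumptions for illustration; the slides do not say how the project's reporting tools were wired to Drill beyond ODBC or JDBC.

```python
import json
import urllib.request

# Hypothetical Drillbit address; 8047 is Drill's default web/REST port.
DRILL_URL = "http://localhost:8047/query.json"


def build_drill_request(sql: str, url: str = DRILL_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(sql: str, url: str = DRILL_URL) -> list:
    """Send the query and return the result rows (requires a reachable Drillbit)."""
    with urllib.request.urlopen(build_drill_request(sql, url)) as response:
        return json.load(response)["rows"]


# Hypothetical path to archived Parquet files; no schema is declared up front.
EXAMPLE_SQL = "SELECT * FROM dfs.`/archive/erp/invoices` LIMIT 10"
request = build_drill_request(EXAMPLE_SQL)
```

Building the request separately from sending it keeps the sketch checkable without a cluster; calling `run_drill_query(EXAMPLE_SQL)` would only work against a running Drill instance that can see the path.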


Sqoop – Easy & Efficient

A Sullexis-developed direct connect extract tool based on Sqoop was seen as meeting all the technology and project factors:

• Addresses all source data
• Support for both Oracle and SQL Server
• Import direct to Parquet
• Supports type mapping
• Supports incremental imports and merges
• Enables validation via row count matches
• Provides for parallel imports for enhanced speed (but also allows for throttling)
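The Sullexis tool's own interface is not described in the slides, so the sketch below uses stock Apache Sqoop options only, to show what "import direct to Parquet" with a controllable degree of parallelism might look like. The JDBC URL, user, password file, table and target directory are all hypothetical.

```python
import shlex
from typing import List


def build_sqoop_import(
    jdbc_url: str,
    username: str,
    password_file: str,
    table: str,
    target_dir: str,
    num_mappers: int = 4,
) -> List[str]:
    """Build the argument list for a Sqoop import that writes Parquet files.

    num_mappers sets how many parallel map tasks read from the source
    database; lowering it is one simple way to throttle the load.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
        "--as-parquetfile",
        "--num-mappers", str(num_mappers),
    ]


# Hypothetical connection details, for illustration only.
command = build_sqoop_import(
    jdbc_url="jdbc:oracle:thin:@//erp-db.example.com:1521/LEGACY",
    username="archive_reader",
    password_file="/user/archive/.oracle.pw",
    table="AP_INVOICES",
    target_dir="/archive/erp/ap_invoices",
    num_mappers=2,
)
print(" ".join(shlex.quote(arg) for arg in command))
```

Returning a list rather than a single string means the command could be handed to `subprocess.run` without shell quoting problems.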


Elasticsearch – Simple & Transparent

[Architecture diagram. Recoverable labels: Reporting Client, Browser, Web UI, ODBC or JDBC, HTTP(S), edge node, node 0, node 1, node 2, POSIX Client, PDF/TIFF/CSV files on each node, MapR-FS]

Highlights

• Quick and easy startup

• Primary technical concerns around latency to the cloud can be successfully mitigated (e.g. client’s cluster enabled transfer rates of 100-140 million records per hour)

• While it is still early, the base business case will result in a payback within a few months, and business users have suggested that data access is easier now than it was in the legacy system

• This ERP legacy system decommissioning approach can be executed in as little as 2 months for a complete data archive, or up to 6 months with robust operational reporting

• Provides repeatable tools and processes for future system decommissioning needs

• The client is already experimenting with the platform for use as an IoT sensor data historian. So far the results have been encouraging
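The quoted transfer rate of 100-140 million records per hour can be turned into a rough load-time estimate. The total record count below is a made-up figure for illustration; the slides give the data volume (about 12 TB) but not the number of records.

```python
def estimate_load_hours(total_records: int, records_per_hour: int) -> float:
    """Return the hours needed to move total_records at a steady transfer rate."""
    if records_per_hour <= 0:
        raise ValueError("records_per_hour must be positive")
    return total_records / records_per_hour


# Hypothetical archive size; only the two rates come from the slide.
TOTAL_RECORDS = 2_000_000_000

slow = estimate_load_hours(TOTAL_RECORDS, 100_000_000)
fast = estimate_load_hours(TOTAL_RECORDS, 140_000_000)
print(f"Estimated load time: {fast:.1f} to {slow:.1f} hours")
```

Under that assumed record count, the load would take roughly 14 to 20 hours of sustained transfer, before allowing for blackout periods or iterative reloads.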


Demonstration