Big data advance topics - part 2.pptx

© 2016 Ness SES. All Rights Reserved

BIG DATA advanced topics

Cloudera vs Hortonworks

MOLDOVAN Radu Adrian
Timisoara, May 2016

Who am I? :)

❏ passionate about technology
❏ 20 years of programming using open source
❏ last 4 years in Big Data
❏ Big Data Architect @

Cloudera and Hortonworks: The Similarities

- built on top of Apache Hadoop
- both are mature offerings with security features
- provide paid consulting, training and services
- strong development communities
- master-slave architecture
- support MapReduce
- YARN as resource manager
- reduce deployment time

Cloudera and Hortonworks: The Differences

Cloudera:
- commercial license (free 60-day trial)
- repositioned as an “enterprise data hub”
- founded in 2008 by engineers from Facebook, Google, Oracle and Yahoo
- 400+ customers
- funding: $1.04B

Hortonworks:
- open source license, completely free
- positioned as a Hadoop distro
- has no proprietary software
- founded in 2011; Teradata, Yahoo & Microsoft
- funding: $248M

https://www.crunchbase.com

Security Solutions

http://www.forbes.com/sites/gilpress/2016/03/14/top-10-hot-big-data-technologies/#7cd07887f26a

Hortonworks: Apache Ranger, Apache Knox, Apache Falcon

Cloudera: Project Rhino, Project Sentry

Cluster ecosystem: COLLECT, PROCESS, STORE, VISUALIZE

(C) = Cloudera, (H) = Hortonworks, (C+H) = both

- HADOOP (HDFS) (C+H)
- Res. Manager: Yarn (C+H)
- Data Loader: Sqoop (C+H)
- Data Aggregation: Flume (C+H)
- Msg Brokers + Streams: Kafka (C+H)
- Data Streaming: Storm (H), Spark Streaming (C+H)
- MapReduce: PIG (C+H)
- In Memory: Spark (C+H), Tez (H)
- Machine Learning: Spark ML (C+H), Mahout (H)
- Warehouse DB: Presto (H)
- HIVE (C+H)
- Analytics, Columnar Store: Accumulo (C+H), Impala (C), HBase (C+H)
- Search Engines: SolrCloud (C+H)
- Data Governance: Atlas (H)
- Visualize: Tableau, Logi, Jasper Reports, D3, Pentaho* Interactive Reporting, Crystal Reports
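
The (C), (H) and (C+H) markers in the ecosystem diagram can be read as a small lookup table. The sketch below is only an illustration: it transcribes the markers from the slide, assumes that C stands for Cloudera and H for Hortonworks, and uses hypothetical helper names (`ECOSYSTEM`, `only_in`, `shared`) that are not part of any distribution's tooling.

```python
# Component -> marker, transcribed from the slide's ecosystem diagram.
# Assumption: "C" means Cloudera, "H" means Hortonworks, "C+H" means both.
ECOSYSTEM = {
    "HDFS": "C+H",
    "Yarn": "C+H",
    "Sqoop": "C+H",
    "Flume": "C+H",
    "Kafka": "C+H",
    "Storm": "H",
    "Spark Streaming": "C+H",
    "Pig": "C+H",
    "Spark": "C+H",
    "Tez": "H",
    "Spark ML": "C+H",
    "Mahout": "H",
    "Presto": "H",
    "Hive": "C+H",
    "Accumulo": "C+H",
    "Impala": "C",
    "HBase": "C+H",
    "SolrCloud": "C+H",
    "Atlas": "H",
}


def only_in(distro):
    """Return components the slide marks for exactly one distribution ("C" or "H")."""
    return sorted(name for name, marker in ECOSYSTEM.items() if marker == distro)


def shared():
    """Return components the slide marks as available in both distributions."""
    return sorted(name for name, marker in ECOSYSTEM.items() if marker == "C+H")


if __name__ == "__main__":
    print("Cloudera only:   ", only_in("C"))
    print("Hortonworks only:", only_in("H"))
    print("Shared count:    ", len(shared()))
```

Read this way, most of the stack is common to both distributions, and the differences concentrate in a handful of components such as Impala on one side and Storm, Tez, Presto, Mahout and Atlas on the other, as marked on the slide in 2016.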

Cloudera

Cloudera Management Service

Hortonworks

Trends - Forbes report Q1 2016

http://www.forbes.com/sites/gilpress/2016/03/14/top-10-hot-big-data-technologies/#7cd07887f26a

Big Data - Buzz words #TAGs

FAULT TOLERANCE

DATA LOCALITY

LAMBDA ARCHITECTURE

CRUD => CRUD

SHARDING

REPLICATION

RESILIENT SYSTEMS

DISRUPTIVE TECHNOLOGIES

Cloud Computing

Internet of Things

Data Analytics

Thank you!

Skype: r.moldovan
