Big Data applications on Cloudera Hadoop

Big Data consulting

Hadoop as an application framework

■ Hadoop makes a great solution for web scale data warehousing and batch data processing
■ But there's more to Hadoop than big analytics queries alone
■ Cloudera Hadoop offers a compelling foundation for building and deploying large scale distributed internet applications

Anatomy of a Hadoop-backed internet application

■ HBase database service
■ SolrCloud search service
■ Spark batch processing tier
■ Cache service
■ Web UI service
■ API service
■ HDFS filesystem
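One way the cache service and the HBase database service in this anatomy might cooperate is a cache-aside read path in the API service. The sketch below is illustrative only: `CacheAsideReader`, `fetch_from_db` and the sample row are hypothetical names, and plain Python dicts stand in for Memcached and HBase so the example runs without a cluster.

```python
from typing import Callable, Dict, Optional


class CacheAsideReader:
    """Serve reads from the cache tier, falling back to the database tier."""

    def __init__(
        self,
        cache: Dict[str, bytes],
        fetch_from_db: Callable[[str], Optional[bytes]],
    ) -> None:
        self.cache = cache                  # stand-in for the cache service
        self.fetch_from_db = fetch_from_db  # stand-in for an HBase row lookup
        self.db_reads = 0                   # counts how often the database is hit

    def get(self, row_key: str) -> Optional[bytes]:
        # 1. Cache hit: answer without touching the database tier.
        if row_key in self.cache:
            return self.cache[row_key]
        # 2. Cache miss: read from the database tier.
        self.db_reads += 1
        value = self.fetch_from_db(row_key)
        # 3. Populate the cache so the next read for this key is a hit.
        if value is not None:
            self.cache[row_key] = value
        return value


# Hypothetical data; a real deployment would call Memcached and HBase clients.
fake_hbase = {"user:42": b"alice"}
reader = CacheAsideReader(cache={}, fetch_from_db=fake_hbase.get)
first = reader.get("user:42")   # miss, then read from the database stand-in
second = reader.get("user:42")  # hit, served from the cache stand-in
```

Keeping the lookup function injectable means the same read path can be exercised in tests with dicts and wired to real clients in production.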

Hadoop as an operating system

■ Hadoop offers many of the foundation components you need to build web scale applications:
  ■ Message queues [Kafka]
  ■ Stream processing [Spark]
  ■ Batch processing [Spark, MapReduce]
  ■ Database [HBase]
  ■ Search [SolrCloud]
  ■ Storage [HDFS]

Cloudera Manager integration

■ You can use Cloudera Manager to deploy, operate, monitor and alert on your service's custom components
■ Stuff like Memcached, your APIs, and your web UI
■ https://github.com/cloudera/cm_ext/wiki
■ You package your custom service's components as parcels for distribution across the cluster
■ A robust framework for packaging, versioning and upgrades
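To make the parcel idea concrete, here is a rough sketch of how a custom component might be named and described. It assumes the `<name>-<version>-<distro>.parcel` file naming and the `meta/parcel.json` metadata file as I understand them from the cm_ext wiki; the metadata shown is deliberately partial, and the `MYAPI` component and both helper functions are hypothetical. Check the wiki for the full set of required fields before relying on this.

```python
import json


def parcel_filename(name: str, version: str, distro: str) -> str:
    """Build '<name>-<version>-<distro>.parcel' (assumed naming convention)."""
    return f"{name}-{version}-{distro}.parcel"


def parcel_metadata(name: str, version: str) -> str:
    """Return a partial meta/parcel.json body.

    Assumption: a real parcel.json needs more fields than these three;
    this only shows where the name and version are recorded.
    """
    return json.dumps(
        {"schema_version": 1, "name": name, "version": version},
        indent=2,
    )


# Hypothetical custom API service packaged for an "el6" distribution.
filename = parcel_filename("MYAPI", "1.2.0", "el6")
metadata = parcel_metadata("MYAPI", "1.2.0")
```

Because the version is part of both the file name and the metadata, several versions of the same component can sit side by side on the cluster, which is what makes upgrades and rollbacks manageable.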

Hadoop cluster: component view

[Diagram: Web 1 and Web 2 nodes; SolrCloud 1 to N; HBase 1 to N; Spark 1 to N; Memcached 1 to N; Hadoop Master / CM nodes; a Distribution Server; supporting Network Services and Firewall Services]

Deployment considerations

■ You will still need the support of foundation network services like DNS, NTP and firewall
■ You may still need to deploy HAProxy, which can run on nodes in the Hadoop cluster behind a floating IP
  ■ http://blog.cloudera.com/blog/2013/08/how-to-achieve-higher-availability-for-hue/
■ Use Linux Control Groups (CGroups) to guarantee resource shares, configured from CM
  ■ http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_mc_cgroups.html
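The CGroups point can be illustrated with a little arithmetic. CPU shares are relative weights: when the CPU is contended, each group receives roughly its shares divided by the total shares of all competing groups. The sketch below computes those fractions; the role names and share values are made-up examples, not recommended settings.

```python
from typing import Dict


def cpu_fractions(shares: Dict[str, int]) -> Dict[str, float]:
    """Return each group's guaranteed CPU fraction under full contention.

    Shares are relative weights, so only their ratios matter.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total CPU shares must be positive")
    return {name: value / total for name, value in shares.items()}


# Hypothetical share settings for roles co-located on one node.
fractions = cpu_fractions({"hbase": 2048, "spark": 1024, "api": 1024})
```

With these example values HBase is guaranteed half of the CPU when every role is busy, and the other two roles a quarter each. When the node is idle, any role can still use more than its guaranteed fraction.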

Wrap up

■ By extending Cloudera Manager, Hadoop can be used to build, deploy and operate complete, web-scale applications in a consistent and predictable way
■ Hadoop can offer much more than data warehousing alone
■ But there is still a little way to go until Hadoop becomes a fully fledged Data Centre scale OS