HiTune sharing


A brief overview of HiTune


Xiao Zhu, 1/29/2013


HiTune is...
– a Hadoop performance analyzer
– developed by Intel
– based on Chukwa
– https://github.com/intel-hadoop/HiTune
– Contact: jason.dai@intel.com, jie.huang@intel.com
– Has 3 parts:
  1) Tracker
  2) Aggregation Engine
  3) Analysis Engine

Example of HiTune Output

[Figures: examples of HiTune output, shown on three slides; images not reproduced in this text.]

Chukwa is...
– an open source data collection system for monitoring large distributed systems
– based on HDFS and the Map/Reduce framework
– http://incubator.apache.org/chukwa/
– Has many parts, including:
  1) Agent
  2) Collector
  3) DemuxManager
  4) Other processes for logging and archiving

HiTune is based on Chukwa

HiTune part          Relation             Chukwa part
Tracker              is partly based on   Agent
Aggregation Engine   is based on          Collector
Analysis Engine      is partly based on   Demux Manager

We tend to call those parts by their Chukwa names (Agent, Collector, Demux Manager), and when we refer to HiTune, we mean HiTune and Chukwa together.

Some of them are simply built upon Chukwa components, but others are implemented by modifying Chukwa or adding new components.

You will find Chukwa patches and a patched Chukwa binary in the HiTune release. So when you deploy HiTune, I do not suggest deploying Chukwa manually first (though you can), because HiTune already includes it.


The tracker includes a HiTune Java agent part and a Chukwa agent part. The analysis engine includes a HiTune script part and a Chukwa Demux part.

See the following data flow for explanations of those parts.


HiTune/Chukwa System Basic Structure

HiTune/Chukwa itself needs to be set up on a standalone Hadoop cluster. We call it the ‘Chukwa Cluster’, and the target cluster being analyzed is called the ‘Hadoop Cluster’.

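[Diagram: the Hadoop Cluster runs the Workload and the HiTune Agents on its own Map/Reduce and HDFS; the Chukwa Cluster runs the Collectors and Demux on its own Map/Reduce and HDFS; the results are viewed in Excel on the User’s Computer.]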

HiTune/Chukwa Process and Data Flow


1. HiTune agents (Java agent part) are invoked by the JVM when the workload starts on every node in the Hadoop cluster. This part collects system status and Hadoop logs and saves them to local storage.


2. The Agent (Chukwa agent part) process checks the Java agent output periodically and sends new data to (one of) the Collector(s).
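The "check periodically and send only new data" idea in step 2 can be sketched as follows. This is a minimal illustration, not Chukwa's actual agent code: the function name `read_new_data` and the byte-offset bookkeeping are assumptions made for the example.

```python
import os


def read_new_data(path, offset):
    """Return (new_bytes, new_offset) for data appended to `path` since `offset`.

    Hypothetical helper: the caller remembers the returned offset and passes
    it back on the next periodic check, so only new data is read each time.
    """
    size = os.path.getsize(path)
    if size < offset:
        # The file shrank (truncated or replaced), so start over from the top.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)
```

A caller would invoke this on a timer for each watched file and forward any non-empty result to a Collector.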


3. The Collector(s) put data into HDFS on the Chukwa Cluster. When a Collector has received 64MB of data or a given time interval has passed, it packs the received data into data packages (.done files).
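The "64MB or time interval, whichever comes first" rule in step 3 can be sketched like this. It is an illustration under stated assumptions, not the Collector's real implementation: the class `ChunkWriter`, its attributes, and the 300-second default are hypothetical (the slides only say "a given time interval"), and packages are kept in memory instead of being written to HDFS.

```python
import time


class ChunkWriter:
    """Buffers incoming data and closes a package when it is big or old enough."""

    def __init__(self, max_bytes=64 * 1024 * 1024, max_age_s=300.0,
                 clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s  # hypothetical default, not from the slides
        self.clock = clock
        self.buffer = []
        self.size = 0
        self.opened_at = clock()
        self.done_packages = []  # stand-in for .done files on HDFS

    def should_rotate(self):
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.opened_at >= self.max_age_s
        # Never close an empty package, even if the interval has passed.
        return self.size > 0 and (too_big or too_old)

    def rotate(self):
        self.done_packages.append(b"".join(self.buffer))
        self.buffer, self.size = [], 0
        self.opened_at = self.clock()

    def write(self, data):
        self.buffer.append(data)
        self.size += len(data)
        # In this sketch the age is only checked when data arrives;
        # a real service would also check it on a timer.
        if self.should_rotate():
            self.rotate()
```

Passing the clock in as a parameter keeps the sketch testable without waiting for real time to pass.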


4. The Demux Manager checks for data packages in the Collector output dir on HDFS every 20 seconds. If it finds .done files, it starts a Map/Reduce procedure to analyze them (this may take a long time to finish).
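The polling behaviour in step 4 can be sketched as a simple loop. This is a rough illustration only: it scans a local directory as a stand-in for HDFS, and the names `find_done_files`, `poll`, and the `process` callback are assumptions made for the example, not part of Chukwa.

```python
import os
import time


def find_done_files(directory):
    """Return sorted paths of files in `directory` whose names end in .done."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".done")
    )


def poll(directory, process, interval_s=20, max_rounds=None, sleep=time.sleep):
    """Every `interval_s` seconds, hand each .done file to `process`.

    `process` is expected to move or delete the file it receives;
    otherwise the same file is picked up again on the next round.
    `max_rounds=None` means poll forever.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for path in find_done_files(directory):
            process(path)
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            sleep(interval_s)
```

In the real system the `process` step would launch the Demux Map/Reduce job, which is why one round can take far longer than the polling interval.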


4. (Cont.) After Demux finishes, the user must run a HiTune script. This script runs Map/Reduce to produce the final output (.csv files). (This may take a long time to finish, but is faster than 3.)


5. The user gets the final output from hdfs://.JOBS/ manually, then applies the output (.csv files) to the HiTune Excel template to see the result. Graphics, summaries, etc. are computed by Excel.


• Yes, if you want, you can deploy Chukwa on the Hadoop cluster itself.

• Doing so makes management and maintenance harder, but it is theoretically feasible.

Why such structure?
• Using Hadoop for MapReduce processing of logs is somewhat troublesome.
• Logs are generated incrementally across many machines, but Hadoop MapReduce works best on a small number of large files.
• HDFS doesn't currently support appends, making it difficult to keep the distributed copy fresh.

• Chukwa is devoted to bridging that gap between logs and MapReduce.
• Chukwa is a scalable distributed monitoring and analysis system, particularly for logs from Hadoop and other large systems.
• Through the processing done by agents and collectors, large, incrementally appended, distributed logs are transformed into large data chunks, which are suitable for Map/Reduce.

• The overhead is mainly caused by agents, since only agents run on the Hadoop Cluster.
• According to the HiTune paper, the overhead is less than 2%.
• See those papers:
  • Dai, Jinquan, et al. "HiTune: Dataflow-based performance analysis for big data cloud." Proc. of the 2011 USENIX ATC (2011): 87-100. (Available on the HiTune GitHub: https://github.com/intel-hadoop/HiTune)
  • Boulon, Jerome, et al. "Chukwa, a large-scale monitoring system." Proceedings of CCA. Vol. 8. 2008.

Current HiTune version: 0.9
• Supports Hadoop 0.2 best
• Based on Chukwa 0.4
• Can support Hadoop 0.2+, but some options need to be changed, and some metrics will be missing. (Current IDH is using Hadoop 1.0+)
• Usually requires a long time to complete aggregating and analyzing. Better to deploy it on a fast cluster.

Questions?

Backup

HiTune troubleshooting
• Troubleshooting HiTune is usually painful.
• Need to check these logs: Hadoop cluster logs (task tracker logs, job tracker logs, namenode logs, datanode logs), (most important!) Chukwa logs (agent logs, collector logs, demux logs), HiTune logs (script outputs).
• If there is no error or warning in the logs, check the outputs on disk and HDFS.
• HiTuneStatusCheck.sh is not reliable. Check the logs yourself.
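Checking many logs by hand is tedious, so a small script can help with the first pass. The sketch below assumes the logs are plain text with log4j-style level names (ERROR, WARN, FATAL) somewhere in each line; the function names are made up for this example and are not part of HiTune or Chukwa.

```python
import re

# Matches the level as a whole word, so "WARNING" or "ERRORS" are not hits.
LEVEL_PATTERN = re.compile(r"\b(ERROR|WARN|FATAL)\b")


def find_problem_lines(lines):
    """Return (line_number, text) pairs for lines mentioning ERROR, WARN, or FATAL."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if LEVEL_PATTERN.search(line)
    ]


def scan_log(path):
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with open(path, errors="replace") as f:
        return find_problem_lines(f)
```

Running `scan_log` over each Hadoop, Chukwa, and HiTune log gives a short list of lines to read first; an empty result is the cue to check the outputs on disk and HDFS instead.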

HiTune/Chukwa Process and Data Flow


6. Later, Chukwa will group and archive the used data on the Chukwa Cluster HDFS to save space, but we will not discuss that here.
