Lee, Gil Jae, and José AB Fortes. "Hadoop Performance Self-Tuning Using a Fuzzy-Prediction Approach." Autonomic Computing (ICAC), 2016 IEEE International Conference on. IEEE, 2016. Summarized by: Cristopher Flagg



Hadoop
- Distributed processing of large data sets across clusters of computers
- Limited/standardized interactions with data
- Components:
  - Hadoop Distributed File System (HDFS)
  - MapReduce task processing
  - YARN resource management
- Good for parallel processing of data
- Bad for real-time processing



Hadoop: word-count example

Input: Bus Car Train Train Plane Car Bus Bus Plane

- Input splits: (Bus Car Train), (Train Plane Car), (Bus Bus Plane)
- Map output: (Bus 1, Car 1, Train 1), (Train 1, Plane 1, Car 1), (Bus 1, Bus 1, Plane 1)
- Sorted by key: Bus 1, Bus 1, Bus 1, Car 1, Car 1, Plane 1, Plane 1, Train 1, Train 1
- Reduce output: Bus 3, Car 2, Plane 2, Train 2

The input is divided into tasks, each task being handled by an identical Map process.

The output of the Map processes is written to shared file storage, sorted, and sent to Reduce.

Each Reduce process receives ALL of the matching keys and writes the results to output.
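The three stages described above can be sketched in a few lines of plain Python. This is only an in-memory illustration of the map, sort/group, and reduce steps on the slide's input, not Hadoop code; the function names `map_task`, `shuffle`, and `reduce_task` are made up for this sketch.

```python
from collections import defaultdict

def map_task(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]

def shuffle(pairs):
    """Group the emitted values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_task(key, values):
    """Sum all counts that arrived for a single key."""
    return key, sum(values)

# The three input splits from the slide.
splits = ["Bus Car Train", "Train Plane Car", "Bus Bus Plane"]

mapped = [pair for split in splits for pair in map_task(split)]
grouped = shuffle(mapped)
counts = dict(reduce_task(key, values) for key, values in grouped.items())

print(counts)  # {'Bus': 3, 'Car': 2, 'Plane': 2, 'Train': 2}
```

Each call to `map_task` is independent of the others, which is what lets Hadoop run the map tasks in parallel on different nodes.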

Hadoop + YARN

[Diagram: the word-count input and map output, with a ResourceManager assigning one task to one node and two tasks to another.]

The resource manager splits the input into tasks and sends tasks to nodes based on availability and ability to process the tasks.

Hadoop + YARN

[Diagram: the word-count input and map output, with a ResourceManager and two nodes running one and two tasks.]

The nodes write the results to shared storage, and the resource manager sorts the results and prepares them for Reduce.

Hadoop + YARN

[Diagram: the word-count map output, the sorted pairs, and the reduce output (Bus 3, Car 2, Plane 2, Train 2), with a ResourceManager and two nodes running one and two tasks.]

The resource manager passes the Reduce tasks to the nodes. The nodes write the results to shared storage.

Hadoop Parameters
- "Hadoop has over 200 tunable configuration parameters [11], some having complex inter-dependencies"
- Previous attempts require a model-training phase to tune parameters
  - Time-consuming initial model construction
  - Requires re-training for different styles of jobs
  - Cannot accurately predict dynamic workload from job/data analysis before run time
- "Resource usage optimization seeks to avoid over-/under-allocation of resources to MR jobs"
  - Does not require a modeling phase
  - Reacts better to different dynamic workloads

Hadoop Daemons
- Resource Manager (RM)
  - Scheduler: allocates 'containers' (CPU and memory) for tasks
    - CapacityScheduler (default) and FairScheduler for allocating containers
  - ApplicationsManager: accepts job submissions and launches the application-specific ApplicationMaster (AM)
- ApplicationMaster (AM)
  - Created per application; negotiates resource containers from the RM

https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/YARN.html

Hadoop CapacityScheduler
- Parameter AMRP
  - yarn.scheduler.capacity.maximum-am-resource-percent: maximum percent of resources in the cluster that can be used to run AMs.
  - AMRP = 0.6 means that up to 60% of resources in the cluster can be used to run AMs. The default value is 10%.
- High AMRP = small-sized MR jobs
- High AMRP = larger-sized MR jobs, may increase response time
- Low AMRP = more containers may be run at once, higher throughput (based on the parameter description; not explicitly stated in the paper)

https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/YARN.html
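To make the effect of AMRP concrete, the following is a rough back-of-the-envelope sketch of how the parameter caps the number of concurrently running AMs. It is a simplification that considers memory only and a single queue; the cluster size (64 GB) and AM container size (1536 MB) are assumed values for illustration, and the real CapacityScheduler calculation has more inputs.

```python
def max_am_resources(cluster_memory_mb, amrp):
    """Memory (in MB) that AMs may occupy for a given AMRP fraction."""
    if not 0.0 <= amrp <= 1.0:
        raise ValueError("AMRP must be a fraction between 0 and 1")
    return cluster_memory_mb * amrp

def max_concurrent_ams(cluster_memory_mb, amrp, am_container_mb):
    """Simplified upper bound on AMs (and so MR jobs) running at once."""
    return int(max_am_resources(cluster_memory_mb, amrp) // am_container_mb)

# Assumed cluster: 64 GB of memory, 1536 MB per AM container.
CLUSTER_MB = 64 * 1024
AM_MB = 1536

default_limit = max_concurrent_ams(CLUSTER_MB, 0.1, AM_MB)  # default AMRP
high_limit = max_concurrent_ams(CLUSTER_MB, 0.6, AM_MB)     # AMRP = 0.6

print(default_limit, high_limit)  # 4 25
```

Under these assumptions, raising AMRP from the default to 0.6 lets many more jobs start at once, but every extra AM occupies a container that could otherwise run a map or reduce task.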

Fuzzy Rules
- M = measured parameters at time T
- I = input vector at time T
  - Total CPU usage
  - Task activity (% based on total containers minus used containers and AM containers)
- O = output vector at time T
  - Predicted output

Figure 5

Rules are collected based on actual resource usage. Future parameters are derived from the closest matching rules.
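The idea of collecting rules and predicting from the closest match can be illustrated with a small sketch. Note the substitution: this uses plain nearest-neighbour lookup with Euclidean distance as a stand-in, not the fuzzy membership functions of the paper. The `RuleBase` class and the sample values are hypothetical.

```python
import math

class RuleBase:
    """Stores observed (input vector, output vector) pairs as rules."""

    def __init__(self):
        self.rules = []

    def add(self, inputs, outputs):
        """Record one rule from actual resource usage."""
        self.rules.append((tuple(inputs), tuple(outputs)))

    def predict(self, inputs):
        """Return the output of the rule whose input is closest."""
        if not self.rules:
            raise ValueError("no rules collected yet")
        _, outputs = min(
            self.rules, key=lambda rule: math.dist(rule[0], inputs)
        )
        return outputs

# Hypothetical rules: (CPU usage, task activity) -> values one step later.
rule_base = RuleBase()
rule_base.add((0.20, 0.30), (0.25, 0.35))
rule_base.add((0.80, 0.90), (0.85, 0.95))

prediction = rule_base.predict((0.75, 0.80))
print(prediction)  # (0.85, 0.95)
```

Because the rules come from measurements taken while jobs run, a scheme like this needs no separate model-training phase, which matches the motivation given earlier in the slides.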

Fuzzy Controller
- Seeks to improve performance by avoiding over-/under-allocation of MR jobs.
- Adjusts AMRP to optimize allocation of MR jobs.
- "An MR job is in “RAMPING” status if it is in “RUNNING” status and its AM task is launched, but no containers have been allocated to its other tasks."
- RAMPING AMs = performance degradation

Figure 7: AMRP 0.8 = high ramping
Figure 8: AMRP 0.2 = low ramping
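The quoted definition of RAMPING translates directly into a predicate. The sketch below assumes a hypothetical `JobState` record; the field names are invented for illustration and do not correspond to a YARN API.

```python
from dataclasses import dataclass

@dataclass
class JobState:
    """Hypothetical snapshot of one MR job."""
    status: str            # e.g. "ACCEPTED", "RUNNING", "FINISHED"
    am_launched: bool      # True once the job's AM task is running
    task_containers: int   # containers allocated to non-AM tasks

def is_ramping(job):
    """RUNNING, AM launched, but no containers for its other tasks."""
    return (
        job.status == "RUNNING"
        and job.am_launched
        and job.task_containers == 0
    )

jobs = [
    JobState("RUNNING", True, 0),   # ramping: AM holds a container, no tasks
    JobState("RUNNING", True, 3),   # making progress
    JobState("ACCEPTED", False, 0), # not yet running
]

ramping_count = sum(is_ramping(job) for job in jobs)
print(ramping_count)  # 1
```

A ramping job consumes a container for its AM without doing any map or reduce work, which is why a large number of them degrades performance.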

Fuzzy Controller
- When CPU usage and task activity are both expected to increase, the cluster still has room for new AMs, that is, it is under-allocated
  - Increase AMRP to process more tasks
- When task activity is expected to increase while CPU usage is expected to decrease, the cluster is over-allocated
  - Decrease AMRP to reduce ramping
- New AMRP values are sent to the Resource Manager
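The two rules above amount to a small decision function on the predicted changes in CPU usage and task activity. The sketch below is one possible reading of those rules; the step size of 0.1 and the bounds of 0.1 and 1.0 are assumptions made for illustration and are not taken from the paper.

```python
def adjust_amrp(amrp, d_cpu, d_task, step=0.1, low=0.1, high=1.0):
    """Nudge AMRP from predicted changes in CPU usage and task activity.

    d_cpu and d_task are predicted changes (positive means an increase).
    step, low and high are assumed values, not taken from the paper.
    """
    if d_task > 0 and d_cpu > 0:
        amrp += step   # under-allocated: room for more AMs
    elif d_task > 0 and d_cpu < 0:
        amrp -= step   # over-allocated: reduce ramping
    # Any other combination leaves AMRP unchanged in this sketch.
    return min(high, max(low, round(amrp, 2)))

increased = adjust_amrp(0.5, d_cpu=0.1, d_task=0.2)
decreased = adjust_amrp(0.5, d_cpu=-0.1, d_task=0.2)
unchanged = adjust_amrp(0.5, d_cpu=-0.1, d_task=-0.2)

print(increased, decreased, unchanged)  # 0.6 0.4 0.5
```

The result is clamped so that repeated increases or decreases cannot push the parameter outside a valid fraction; the value returned would then be what gets sent to the Resource Manager.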

Data Sets
- TeraGen: Hadoop benchmarking toolkit for generating/sorting data.
- RandomTextWriter: distributed job with no interaction between the tasks; each task writes a large unsorted random sequence of words.

TS = TeraSort, GR = Grep, WC = Word Count (Hello World)

Results - 1 GB Data

DynamicPrev = results from [17] B. Zhang, F. Krikava, R. Rouvoy, and L. Seinturier. Self-configuration of the number of concurrently running MapReduce jobs in a Hadoop cluster.


Results - 5 GB Data


Results - 10 GB Data


Results Comparison

Contributions
- CPU usage by an MR job is highly correlated with the number of concurrent tasks of the MR job, regardless of the pattern of resource usage.
- Observation that once an MR job has its status changed to “RUNNING,” it may stay in a so-called “RAMPING” status.
- Local resource-usage monitor that counts the number of tasks and AMs running in each slave node and monitors system-level metrics.
- Fuzzy-prediction controller that periodically forecasts CPU usage and the number of concurrent tasks.
- Design of a capacity scheduler controller that decides when and how to update the Hadoop parameter AMRP based on the predicted values.

Additional Analysis: FairScheduler

- "Fair scheduling is a method of assigning resources to applications such that all apps get, on average, an equal share of resources over time. Hadoop NextGen is capable of scheduling multiple resource types. By default, the Fair Scheduler bases scheduling fairness decisions only on memory. It can be configured to schedule with both memory and CPU, using the notion of Dominant Resource Fairness developed by Ghodsi et al."
- FairScheduler was discussed but not evaluated in the paper.

https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/FairScheduler.html

Additional Analysis: Fuzzy Prediction

- The paper uses a 'fuzzy' controller to predict future CPU and task utilization.
- The paper uses these results in a loop to decide whether to increase or decrease AMRP. It nudges the parameter using static rules.
- Optimal AMRP is not predicted.

Control Loop - cases not considered

- Ucpu < 0 and Atask < 0: decrease in % CPU and in tasks
- Ucpu > 0 and Atask < 0: increase in % CPU and decrease in tasks


Questions?