26
A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors: Aysan Rasooli Douglas G. Down McMaster University November 2012

A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

A Hybrid Scheduling Approach for

Scalable Heterogeneous Hadoop Systems

Authors:

Aysan Rasooli

Douglas G. Down

McMaster University

November 2012

Page 2: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Agenda

McMaster University 2

• Introducing the Hadoop System

• Heterogeneity and Scalability in Hadoop

• Performance Issues of Existing Hadoop Schedulers

• Proposed Hybrid Scheduling System

• Evaluation

• Conclusion

Page 3: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

MapReduce

McMaster University 3

Page 4: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Hadoop System

McMaster University 4

Page 5: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Heterogeneity and Scalability in Hadoop

McMaster University 5

• Cluster

• Workload

• User

Page 6: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Hadoop Schedulers

McMaster University 6

• FIFO

• Fair Sharing

• COSHH

Page 7: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Fair Sharing

McMaster University 7

(Zaharia et al., 2010)

• Group jobs into “pools”

• Assign each pool a guaranteed minimum share

• Divide excess capacity evenly between pools

J1

Job Queue

J2

R4

R3

R2

R1

Page 8: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Fair Sharing

McMaster University 8

• Goal: fast response times for small jobs, guaranteed service

levels for long jobs

• Considers Minimum Share satisfaction, Fairness

Drawbacks:

• Does not take into account locality

• Does not take into account heterogeneity

Page 9: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Classes, Suggested Classes for all

Resources

Execution Time of the New Job on

all Resources

Hadoop System

Task Scheduling Process

Queuing Process

Routing Process

Heartbeat New Job

Selected Jobs

Assigned Tasks

COSHH Scheduler

Hadoop System

Task Scheduling Process

Queuing Process

Routing Process

• Estimation of Job execution time across all resources

• Classify the jobs

• Calculate the best set of suggested job classes for each resource

• Select a job for the current free resource using suggested jobs of the Queuing Process

• Consider the fairness and the minimum share satisfaction in the system

Page 10: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

COSHH Scheduler

McMaster University 10

• Considering the heterogeneity in the Hadoop system

• Improves Mean Completion Time

• Considers:

• Minimum Share Satisfaction

• Fairness

• Locality

Page 11: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Performance Issues of Existing Schedulers

McMaster University 11

Problem I. Small Jobs Starvation

FIFO :

J1

Job Queue

J2

R4

R3

R2

R1

Page 12: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Performance Issues of Existing Schedulers

McMaster University 12

Problem II. Sticky Slots

Fair Sharing:

R4

R3

R2

R1

Job Fair

Share Running

Tasks

Job 1 2 2

Job 2 2 2

Job Fair

Share Running

Tasks

Job 1 2 1

Job 2 2 2

Page 13: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Performance Issues of Existing Schedulers

McMaster University 13

Problem III. Resource and Job Mismatch

Storage

Unit

slice 1

slice 2

slice 3

Computation

Unit

slot 1

slot 2

slot 3

slot 4

slot 5

R1

Storage

Unit

slice 1

slice 2

slice 3

slice 4

slice 5

slice 6

Computation

Unit

slot 1

slot 2

R2

File 1:

File 2:

J2

Job Queue

J1

Storage

Unit

slice 1

slice 2

slice 3

Computation

Unit

slot 1

slot 2

slot 3

slot 4

slot 5

Storage

Unit

slice 1

slice 2

slice 3

slice 4

slice 5

slice 6

Computation

Unit

slot 1

slot 2

Page 14: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Problem I. Small Jobs Starvation

McMaster University 14

FIFO :

Page 15: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Problem II. Sticky Slots

McMaster University 15

Fair Sharing :

Page 16: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Problem III. Resource and Job Mismatch

McMaster University 16

COSHH:

Page 17: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Performance Issues of Existing Schedulers

McMaster University 17

Page 18: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Experimental Environment

McMaster University 18

Page 19: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Real Hadoop Workloads

McMaster University 19

(Chen et al., 2011)

Page 20: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Scalability Analysis- Results

Job Number Scalability

McMaster University 20

Page 21: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Scalability Analysis- Results

Resource Number Scalability

McMaster University 21

Page 22: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Scalability Analysis- Hybrid Scheduler

McMaster University 22

Page 23: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Scalability Analysis- Hybrid Scheduler

Job Number Scalability

McMaster University 23

Page 24: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Scalability Analysis- Hybrid Scheduler

Resource Number Scalability

McMaster University 24

Page 25: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Conclusion

McMaster University 25

• Performance Issues of Hadoop Schedulers:

• Small Jobs Starvation

• Sticky Slots

• Resource and Jobs Mismatch

• Propose a Hybrid Hadoop Scheduler

Page 26: A Hybrid Scheduling Approach for Scalable Heterogeneous ...datasys.cs.iit.edu/events/MTAGS12/s04.pdfA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Authors:

Thanks

McMaster University