AVERAGES, DISTRIBUTIONS AND SCALABILITY OF MPI COMMUNICATION TIMES FOR ETHERNET AND MYRINET NETWORKS

Nor Asilah Wati Abdul Hamid and Paul Coddington

Presented by:
Ibrahim Saidu GS22854
Kumane Saed GS24433
Cheng Kian Yong GS24460
Luay GS21605

Lecturer: Dr. Nor Asilah Wati Abdul Hamid


INTRODUCTION

In the past few years, commodity clusters have become the dominant architecture for high performance computing.

Most parallel programs that run on clusters use the Message Passing Interface (MPI) to communicate data between the nodes of the cluster.

It is well known that Myrinet with GM has significant advantages over Fast Ethernet with TCP.

In the case of Ethernet with TCP, retransmit timeouts (RTOs) can also occur.

PROBLEM STATEMENT

• Most modern parallel computers are clusters using Myrinet or Ethernet communication networks.

• Several studies have compared the performance of these two networks for parallel computing. However, they focus on average performance and do not address the distributions of communication times, which can have long tails due to contention effects.

• In the case of Ethernet with TCP, retransmit timeouts (RTOs) can also occur.
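To illustrate why an average alone can hide a long tail, here is a minimal Python sketch. The sample values are hypothetical and chosen only for illustration; they are not measurements from the study, and the helper name `summarize` is an assumption.

```python
import statistics

def summarize(times_ms):
    """Return the mean, median and 99th percentile of a list of times (ms)."""
    ordered = sorted(times_ms)
    # Nearest-rank percentile: rank = ceil(0.99 * n), computed with integers.
    rank = max(1, -(-99 * len(ordered) // 100))
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
    }

# Hypothetical sample: 98 fast transfers at 1 ms and 2 delayed to 200 ms.
sample = [1.0] * 98 + [200.0] * 2
stats = summarize(sample)
print(stats)
```

In this made-up sample the mean is pulled well above the typical (median) time by just two slow transfers, while the 99th percentile exposes the tail directly.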

OBJECTIVES

To investigate the effect of retransmit timeouts (RTOs) on Ethernet performance and how much could be gained from reducing the effects of RTOs.

To analyze the distributions of communication times for standard MPI routines on Ethernet with TCP and Myrinet with GM communication networks on the same cluster.

To study the scalability of the distributions as the number of communicating processes increases.

RELATED WORK

• [4,5,6,7] measure only the average times for point-to-point (ping-pong) communications between two nodes.

• [3] studied the effects of TCP retransmit timeouts (RTOs) on MPI communications over Ethernet networks, including collective communications.

• [3,4,5,6] compare network performance using application benchmarks such as the NAS Parallel Benchmarks.

• [3,4] analyzed the effects of tuning Ethernet drivers or TCP configuration to improve MPI performance on Ethernet networks.

RELATED WORK

• [8] used MPIBench to compare the MPI performance (including distributions of communication times) of Ethernet and Myrinet networks, but these were not direct comparisons.

• [9] compared the performance of different Ethernet network topologies in commodity clusters and showed that there were significant problems with the performance of collective communications in MPICH version 1.2.0 on Fast Ethernet networks.

• [11] used a later version of MPICH for collective communication routines, which gives much better performance on Ethernet networks and perhaps reduces the number of RTOs.

METHODOLOGY

IBM eServer 1350 Linux Cluster

IBM eServer 1350 Linux Cluster: Fast Ethernet Architecture

METHODOLOGY: Benchmark

Measurements of MPI communication times were obtained using MPIBench [1,2,8]. All measurements were run with dedicated access to the cluster, so no other processes affected the results.
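The general idea of recording every repetition separately, so that a distribution rather than a single average can be reported, might be sketched as follows. This is not MPIBench's code or interface; the function names and the stand-in operation are assumptions, and a local Python call takes the place of a real MPI routine.

```python
import time
from collections import Counter

def time_repetitions(operation, repetitions):
    """Time each call to `operation` separately; return times in microseconds."""
    times_us = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        times_us.append((time.perf_counter() - start) * 1e6)
    return times_us

def histogram(times_us, bin_width_us):
    """Count how many times fall into each bin of width `bin_width_us`."""
    counts = Counter(int(t // bin_width_us) * bin_width_us for t in times_us)
    return dict(sorted(counts.items()))

def stand_in_operation():
    # Placeholder for a communication call; it only does local work.
    sum(range(1000))

times = time_repetitions(stand_in_operation, 200)
bins = histogram(times, bin_width_us=10)
print(bins)
```

Keeping the individual times, instead of dividing one total by the repetition count, is what makes it possible to see outliers and long tails afterwards.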


Results 1. Send/Receive

Send/Receive (Cont..)

Communication times on Fast Ethernet are about 10 times higher than on Myrinet.

For larger message sizes, the difference is primarily due to the difference in bandwidth between the two networks.

For Ethernet there is a jump between 64 and 128 CPUs (32 to 64 nodes), which occurs because communication is no longer between processors connected by a single switch.
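A common first-order way to reason about such differences is a latency plus size-over-bandwidth model. The sketch below uses that model with illustrative parameters; the latency and bandwidth values are assumptions for the example, not figures from the presentation.

```python
def transfer_time_us(message_bytes, latency_us, bandwidth_mbit_s):
    """Estimate one-way transfer time as latency plus size divided by bandwidth."""
    bits = message_bytes * 8
    # 1 Mbit/s is 1 bit per microsecond, so bits / (Mbit/s) is in microseconds.
    return latency_us + bits / bandwidth_mbit_s

# Illustrative parameters only (assumed, not measured).
ETHERNET = {"latency_us": 100.0, "bandwidth_mbit_s": 100.0}
MYRINET = {"latency_us": 10.0, "bandwidth_mbit_s": 2000.0}

for size in (64, 1024, 65536, 1048576):
    eth = transfer_time_us(size, **ETHERNET)
    myr = transfer_time_us(size, **MYRINET)
    print(f"{size:>8} bytes: Ethernet/Myrinet time ratio = {eth / myr:.1f}")
```

Under these assumed parameters the ratio is set by the latencies for small messages and drifts toward the bandwidth ratio as messages grow, which is the kind of behaviour the slide describes.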


Send/Receive (Cont..)

Send/Receive (Cont..)

According to the TCP specifications, the TCP retransmit timeout (RTO) should be given by RTO = SRTT + 4 * RTTVAR.

Here this is the average communication time without an RTO (SRTT = 25 ms) plus the 200 ms minimum value for 4 * RTTVAR set by the Linux kernel.

The longer delays are presumably caused by communications that suffer 2 or 3 RTOs before finally being completed.
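As a worked example of the formula, the following sketch computes the RTO with a lower bound on the 4 * RTTVAR term, as described on the slide. The function name, the parameter names, and the RTTVAR value of 1 ms are assumptions made for illustration.

```python
def rto_ms(srtt_ms, rttvar_ms, min_variance_term_ms=200.0):
    """RTO = SRTT + 4 * RTTVAR, with a lower bound on the 4 * RTTVAR term."""
    variance_term = max(4.0 * rttvar_ms, min_variance_term_ms)
    return srtt_ms + variance_term

# SRTT = 25 ms is from the slide; RTTVAR = 1 ms is a hypothetical small value,
# so the 200 ms minimum applies.
print(rto_ms(srtt_ms=25.0, rttvar_ms=1.0))  # 225.0
```

With a small RTTVAR the 200 ms floor dominates, so a single timeout adds roughly 225 ms to a communication that would otherwise take about 25 ms.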


2. Combined Send and Receive

Combined Send/Receive (Cont..)

Results are approximately a factor of 2 larger than for MPI_Send/MPI_Recv.

The results indicate that the duplex capability of these networks is not being utilized.
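The reasoning behind the factor of 2 can be expressed with an idealized model: if both directions of an exchange overlap, a swap should cost about one one-way transfer, and if they are serialized it costs about two. The sketch below encodes only that idealization; the function names and the 50-unit transfer time are hypothetical.

```python
def exchange_time(one_way_time, full_duplex):
    """Ideal time for two processes to swap equal-sized messages."""
    # Full duplex: both directions overlap. Otherwise the transfers are serialized.
    return one_way_time if full_duplex else 2 * one_way_time

def sendrecv_ratio(sendrecv_time, send_recv_time):
    """Ratio of a combined send/receive time to a plain send/receive time."""
    return sendrecv_time / send_recv_time

one_way = 50.0  # hypothetical one-way time, arbitrary units
print(sendrecv_ratio(exchange_time(one_way, full_duplex=True), one_way))   # 1.0
print(sendrecv_ratio(exchange_time(one_way, full_duplex=False), one_way))  # 2.0
```

A measured ratio close to 2 is therefore what one would expect when the two directions are not overlapping.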

3. Barrier

Barrier (Cont…)

The big jump in the Ethernet result is probably due to a different algorithm being used in the MPICH 1.2.6 code.

Ethernet is approximately 4-5 times slower than Myrinet.


Barrier (Cont…)

4. Broadcast

Broadcast (Cont…)

When communication goes through a single Ethernet switch, rather than between switches, there are no RTOs for broadcast.

Myrinet distributions have quite long tails, which are caused by a small number of repetitions of the benchmark.

5. Alltoall

Alltoall (Cont…)

The average completion time for Myrinet increases gradually with message size and number of processes.

Ethernet performance for more than 32 CPUs shows the effect of retransmit timeouts.

6. Conclusions

As expected, the Myrinet network performs significantly better than Fast Ethernet.

The TCP RTO on the Ethernet network does affect communications performance, but only for large message sizes and large numbers of processors, where the network becomes saturated.

The effects are much less serious than in previous measurements.


FACULTY OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY

Thank you