SECTION 5: PERFORMANCE
CHRIS ZINGRAF
OVERVIEW:
• This section measures the performance of MapReduce on two computations, Grep and Sort.
• These programs represent a large subset of real programs that MapReduce users have created.
5.1 CLUSTER CONFIGURATION
THE MACHINES:
• Cluster of ≈ 1800 machines.
• Two 2GHz Intel Xeon processors with Hyper-Threading.
• 4 GB of memory.
• Two 160 GB IDE (Integrated Drive Electronics) disks.
• Gigabit Ethernet link.
5.1 CLUSTER CONFIGURATION (CONTINUED)
• Arranged in a two-level, tree-shaped switched network.
• ≈ 100-200 Gbps aggregate bandwidth available at root.
• Every machine is located in the same hosting facility.
• The round-trip time between any pair of machines is less than a millisecond.
• Out of the 4GB of memory available, approximately 1-1.5GB was reserved by other tasks.
• Programs were run on a weekend afternoon, when the CPUs, disks, and network were mostly idle.
5.2 GREP
• Grep scans through 10^10 100-byte records.
• The program looks for a match to a rare 3-character pattern.
• This pattern occurs in 92,337 records.
• The input is split into ≈ 64 MB pieces.
• The output is stored in a single file.
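The grep computation described above maps naturally onto a tiny pair of map/reduce functions. The sketch below is a minimal single-machine illustration in Python, not the paper's actual code; the pattern, function names, and sample records are all invented for the example.

```python
import re

# Stand-in for the rare 3-character pattern (illustrative only).
PATTERN = re.compile(b"xyz")

def grep_map(offset, record):
    """Map: emit the record if it contains the pattern."""
    if PATTERN.search(record):
        yield (offset, record)

def grep_reduce(key, values):
    """Reduce: pass matching records through unchanged to the output file."""
    for v in values:
        yield v

# Run the map over a few sample records:
records = [b"aaaxyzbbb", b"no match here", b"prefix xyz suffix"]
matches = [rec for i, r in enumerate(records) for _, rec in grep_map(i, r)]
```

In the real system the map runs in parallel over the ≈ 64 MB input splits, and the output of every map task is funneled into the single output file.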
5.2 GREP (CONTINUED)
• The Y-axis shows the rate at which the input data is scanned.
• The rate ramps up as more machines are assigned to the computation.
• The graph peaks at over 30 GB/s, when 1764 workers have been assigned.
• The entire computation takes about 150 seconds.
• This time includes the minute it takes to start everything up.
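A quick sanity check on these numbers (not from the slides): 10^10 records of 100 bytes each is 1 TB of input, so finishing in about 150 seconds implies an average scan rate well below the 30+ GB/s peak, consistent with the slow ramp-up and the minute of startup.

```python
records = 10**10
record_size = 100                     # bytes per record
total_bytes = records * record_size   # 1e12 bytes = 1 TB of input
total_seconds = 150                   # approximate end-to-end time

# Average rate over the whole run, including the ~1 minute of startup.
avg_rate_gb_s = total_bytes / total_seconds / 1e9   # ~6.7 GB/s
```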
[Figure: input scan rate, Input (MB/s) vs. time (seconds), peaking above 30,000 MB/s]
5.3 SORT
• The sort program sorts 10^10 100-byte records.
• It is modeled after the TeraSort benchmark.
• The whole program is less than 50 lines of code.
HOW DOES THE PROGRAM SORT THE DATA?
• A three-line Map function extracts a 10-byte sorting key from a text line.
• It then emits the key and the original text line (this is the intermediate key/value pair).
• The built-in Identity function serves as the Reduce operation.
• It passes the intermediate key/value pair through unchanged as the output key/value pair.
• The final sorted output is written to a set of 2-way replicated Google File System (GFS) files.
• I.e., 2 terabytes are written as the output of the program.
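The Map and Identity-Reduce pair described above can be sketched in a few lines. This is an illustrative Python version, not the paper's sub-50-line implementation; the function names and sample lines are assumptions for the example.

```python
def sort_map(offset, line):
    """Map: extract the 10-byte sorting key and emit (key, original line)."""
    yield (line[:10], line)

def identity_reduce(key, values):
    """Reduce: pass each intermediate key/value pair through unchanged."""
    for v in values:
        yield (key, v)

# Sorting the intermediate pairs by key orders the records:
lines = ["bananas   record-1", "apples    record-2"]
pairs = sorted(p for l in lines for p in sort_map(None, l))
# pairs is now ordered by the 10-byte key ("apples    " before "bananas   ")
```

The actual sorting happens inside the MapReduce framework, which groups and orders intermediate pairs by key before handing them to the reduce function.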
5.3 SORT (CONTINUED)
• Like Grep, the input for the sort program is split into 64 MB pieces.
• The sorted output is partitioned into 4000 files.
• The partitioning function uses the initial bytes of the key to segregate the output into one of the 4000 pieces.
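Because the partitioning function uses the leading bytes of the key rather than a hash, every key routed to file N sorts before every key routed to file N+1, so the 4000 output files are themselves in order. A minimal sketch, assuming (purely for illustration) that the first two bytes of the key are used:

```python
NUM_PARTITIONS = 4000

def partition(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Route a key to an output file by its initial bytes, preserving order."""
    # Interpret the first two key bytes as a big-endian integer (0..65535)
    # and scale into the partition range. Using exactly two bytes is an
    # assumption of this sketch; the slides only say "initial bytes".
    prefix = int.from_bytes(key[:2], "big")
    return prefix * num_partitions // 65536
```

Since the mapping is monotone in the key prefix, concatenating the 4000 sorted files yields fully sorted output.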
5.3 SORT (CONTINUED)
• This figure shows the input read rate over time for a normal execution of the sort program.
• The rate peaks at about 13 GB/s and then falls off quickly, since all of the map tasks finish before the 200-second mark.
5.3 SORT (CONTINUED)
• This graph shows the rate at which data is sent over the network from the map tasks to the reduce tasks.
• Shuffling starts as soon as the first map task finishes.
• The first hump in the graph corresponds to the first batch of reduce tasks.
• This batch is approximately 1700 reduce tasks, since the entire job was assigned to about 1700 machines and each machine executes at most one reduce task at a time.
• Then, roughly 300 seconds into the computation, the first batch of reduce tasks finishes and shuffling begins for the remaining reduce tasks.
• All of the shuffling is done about 600 seconds in.
5.3 SORT (CONTINUED)
• This figure shows the rate at which sorted data is written to the final output files.
• There is a delay between the end of the first shuffling period and the start of writing because the machines are busy sorting the intermediate data.
• The writes proceed at a steadier rate than the input reading and shuffling, at about 2–4 GB/s.
• The writes finish around 850 seconds in.
• The entire computation takes a total of 891 seconds.
• (This time is comparable to the best reported TeraSort result of 1057 seconds.)
5.4 EFFECT OF BACKUP TASKS
• Figure 3(b) shows how long the sort program takes when run with backup tasks disabled.
• The execution is similar to Figure 3(a), except for a long tail during which there is hardly any write activity.
• After 960 seconds, all but 5 of the reduce tasks are completed.
• These last 5 stragglers do not finish until 300 seconds later.
• The whole run takes 1283 seconds to complete.
• This is a 44% increase in elapsed time.
5.5 MACHINE FAILURES
• Figure 3(c) shows an execution of the sort program in which 200 of the 1746 worker processes were intentionally killed several minutes into the computation.
• Since the machines themselves were still functioning, the underlying cluster scheduler immediately started new worker processes on them.
• The graph shows a negative input rate at the point where the workers were killed.
• The rate goes negative because previously completed map work was lost along with those workers and had to be redone.
• The whole computation finishes after 933 seconds.
• This is only a 5% increase in elapsed time over the normal execution.
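The percentage increases quoted in sections 5.4 and 5.5 follow directly from the elapsed times; a small arithmetic check:

```python
normal = 891          # seconds, normal execution (5.3)
no_backup = 1283      # seconds, backup tasks disabled (5.4)
with_failures = 933   # seconds, 200 workers killed (5.5)

pct_no_backup = round((no_backup / normal - 1) * 100)      # 44% slower
pct_failures = round((with_failures / normal - 1) * 100)   # 5% slower
```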