HEC FSIO 2009
Parallel File System Simulator
To test Parallel File System (PFS) scheduling algorithms in a lightweight way, we have developed the PFS simulator. We combine a discrete event simulation library with accurate disk modeling software to construct an agile parallel file system simulator. The simulator captures enough detail while keeping the simulation time acceptable.
Existing PFS Simulators
The IMPIOUS simulator by Molina-Estolano et al. [1] does not model the metadata server, and its scheduler modules are lacking, so scheduling algorithms are hard to model.
The simulator developed in the PVFS improvement paper by Carns et al. [2] simulates the network in too much detail, which extends the simulation time, and it uses the real PVFS in simulation, which introduces too much detail while remaining inflexible.
Simulator Details
We use the discrete event simulation library OMNeT++ 4.0 to simulate the network; it can model the network topology with bandwidth and delay.
We use Disksim to simulate the data server disks. Disksim accurately estimates the time for data transactions on the physical disks, and it allows users to extract disk characteristics from real disks.
Simulator Architecture
[Figure: clients, each driven by a trace file, connect to a simulated network built in OMNeT++; a metadata server implements the striping strategy; each data server runs a scheduling algorithm over a local FS with a disk queue, backed by Disksim instances attached through socket connections.]
Request flow:
1. New request arrives.
2. Query the file layout.
3. Send the request to a data server.
4. Dispatch the job.
5. Send the results back.
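The request flow above can be sketched as a toy discrete-event loop. Event names, latencies, and the file layout below are illustrative, not the simulator's actual API:

```python
import heapq

events = []          # min-heap of (time, seq, kind, payload)
seq = 0

def schedule(time, kind, payload):
    global seq
    heapq.heappush(events, (time, seq, kind, payload))
    seq += 1

LAYOUT = {"ckpt.dat": ["ds1", "ds2"]}   # metadata server's file layout (assumed)
NET_DELAY, DISK_TIME = 0.001, 0.004     # assumed latencies in seconds

log = []

def run():
    while events:
        t, _, kind, req = heapq.heappop(events)
        log.append((round(t, 4), kind, req))
        if kind == "arrive":               # 1. new request arrives
            servers = LAYOUT[req]          # 2. query the file layout
            for ds in servers:             # 3. send the request to a data server
                schedule(t + NET_DELAY, "dispatch", (req, ds))
        elif kind == "dispatch":           # 4. dispatch the job to the disk
            schedule(t + DISK_TIME, "reply", req[0])
        elif kind == "reply":              # 5. results sent back to the client
            pass

schedule(0.0, "arrive", "ckpt.dat")
run()
print(log)
```

In the real simulator the dispatch step goes through the scheduling algorithm and the Disksim disk model rather than a fixed service time.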
Implementation of Scheduling Algorithms
Besides SFQ, we have also implemented FIFO, Distributed SFQ [3], MinSFQ [4], and other algorithms.
/* SFQ algorithm, at each data server */
systime = 0
waitQ.initiate()
while (!simulation_end) {
    if reqArrive(), then:
        R = getReq()
        R.start_tag = max { R.getPrevReq().finish_tag, systime }
        R.finish_tag = R.start_tag + R.cost / R.getFlow().weight
        pushReq(R, waitQ)
    if diskHasSlot(), then:
        R = popReqWithMinStartTag(waitQ)
        systime = R.start_tag
        dispatch(R)
}
The request from a client arrives at the data server.
Disksim tells OMNeT++ that it still has a free slot.
OMNeT++ dispatches the request to Disksim.
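A minimal runnable sketch of the per-server SFQ scheduler, under two simplifications: all requests arrive before any dispatch, and flow weights and request costs are illustrative. SFQ assigns each request a start tag of max(previous finish tag of its flow, current virtual time):

```python
import heapq

class Flow:
    def __init__(self, weight):
        self.weight = weight
        self.last_finish = 0.0   # finish tag of the flow's previous request

def sfq_order(requests, flows):
    """requests: list of (flow_id, cost) in arrival order; returns dispatch order."""
    systime = 0.0                           # virtual system time
    waitq, seq, order = [], 0, []
    for flow_id, cost in requests:          # tag each arriving request
        f = flows[flow_id]
        start = max(f.last_finish, systime)
        f.last_finish = start + cost / f.weight
        heapq.heappush(waitq, (start, seq, flow_id))
        seq += 1
    while waitq:                            # dispatch in min-start-tag order
        start, _, flow_id = heapq.heappop(waitq)
        systime = start                     # virtual time tracks dispatched start tag
        order.append(flow_id)
    return order

flows = {"A": Flow(2), "B": Flow(1)}
print(sfq_order([("A", 1)] * 4 + [("B", 1)] * 4, flows))
```

With a 2:1 weight ratio, flow A's tags advance half as fast as B's, so A is dispatched roughly twice as often in the interleaved stretch of the schedule.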
Simulation 1: Weighted Scheduling
2 groups of clients, 16 clients each; 4 data servers and 1 metadata server. Parallel Virtual File System (v2) is modeled. All files are striped across all servers with a stripe size of 256 KB. The trace files are generated by IOR.
Each client performs a 100 MB checkpoint-write operation. SFQ with different weight assignments has been simulated; the simulated weight ratios are 1:1, 2:1, and 10:1.
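A quick back-of-the-envelope check of this workload (sizes from the slide; round-robin stripe placement is an assumption):

```python
STRIPE = 256 * 1024          # 256 KB stripe size
WRITE = 100 * 1024 * 1024    # 100 MB checkpoint write per client
CLIENTS = 32                 # 2 groups x 16 clients
SERVERS = 4

stripes_per_client = WRITE // STRIPE                       # stripes per checkpoint
per_server_per_client = stripes_per_client // SERVERS * STRIPE
total_per_server = per_server_per_client * CLIENTS
print(stripes_per_client, per_server_per_client, total_per_server)
```

Each checkpoint splits into 400 stripes, so every data server receives 25 MB per client and 800 MB in total across the 32 clients.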
Throughput Results
[Figure: per-group throughput (byte/s) vs. time (s) under FIFO, SFQ (1:1), SFQ (2:1), and SFQ (10:1); each panel plots Group 1 and Group 2.]
Simulation 2: Scheduling Fairness
Same as Simulation 1, except that the two groups do not have the same resources: Group 1's files are striped across all four servers [1, 2, 3, 4], while Group 2's files are striped across three servers [1, 2, 3].
We add the Distributed SFQ algorithm [3], in which all servers share scheduling information.
We show that when the workloads are not evenly distributed, Distributed SFQ achieves better fairness than FIFO and SFQ.
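A toy calculation illustrates why per-server SFQ alone is unfair here, assuming each server contributes one unit of bandwidth that SFQ splits evenly among the groups present on it:

```python
# Which groups have files striped onto each of the four servers.
servers = [{"g1", "g2"}, {"g1", "g2"}, {"g1", "g2"}, {"g1"}]

share = {"g1": 0.0, "g2": 0.0}
for groups in servers:
    for g in groups:
        share[g] += 1.0 / len(groups)   # per-server SFQ splits locally, 1:1
print(share)
```

Because Group 2 is absent from server 4, Group 1 collects that server's full bandwidth, ending with a 2.5 : 1.5 aggregate split despite equal weights. DSFQ shares scheduling information across servers so that total service, not per-server service, is proportioned.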
Throughput Results
[Figure: per-group throughput (byte/s) vs. time (s) under FIFO, SFQ (1:1), and DSFQ (1:1); each panel plots Group 1 and Group 2.]
References
[1] E. Molina-Estolano, C. Maltzahn, J. Bent, and S. A. Brandt, "Building a parallel file system simulator", Journal of Physics: Conference Series 180 (2009) 012050.
[2] P. H. Carns, W. B. Ligon, et al., "Using Server-to-Server Communication in Parallel File Systems to Simplify Consistency and Improve Performance", Proceedings of the 4th Annual Linux Showcase and Conference (Atlanta, GA), pp. 317-327.
[3] Y. Wang and A. Merchant, "Proportional Share Scheduling for Distributed Storage Systems", File and Storage Technologies (FAST '07), San Jose, CA, February 2007.
[4] W. Jin, J. S. Chase, and J. Kaur, "Interposed Proportional Sharing for a Storage Service Utility", SIGMETRICS, ACM, 2004, pp. 37-48.