Performance Analysis of Cluster File System on Linux
Yaodong CHENG, IHEP, CAS
CHEP'04 Sep 27 - Oct 1, 2004 Congress Zentrum Interlaken, Switzerland
Outline
Introduction
Review of cluster file system
Data access model
Performance analysis formula
Performance test
Some useful methods
Introduction
Cluster systems built from PCs are increasingly popular
Commodity hardware and software keep improving
  CPU, memory, hard disk, network
  Linux software technology
How can we use our existing hardware and software more efficiently?
Architecture of a cluster system
[Figure: architecture diagram. Labels: job, compute node 1 ... compute node N, high speed network, I/O node 1 ... I/O node N, disk, tape]
Cluster file system review
One of the most important methods for sharing information in a cluster system
General characteristics:
  Single-system image
  Transparency
  Good scalability
  High performance
Structure: C/S (client/server), shared-disk, virtual shared-disk
Data access model
[Figure: data access model. Labels: client 1 ... client N, network, manager node (metadata server), I/O servers (I/O node 1 ... I/O node N, each with a disk)]
Some assumptions
Data is processed only on the clients
Storage nodes only provide storage capacity and handle file operations
The traffic between clients and manager nodes is very small
The time spent handling client requests is far smaller than the time spent transferring data
Performance analysis formula
c: the CPU time to compute each byte
D: the total amount of data
I: network speed
M: the number of I/O nodes
N: the number of clients
P: the number of disks in parallel
R: disk speed
T: the minimum access time for the total data
S: the maximum aggregate bandwidth
Limitation: P/M >= 1
T = max (D*c/N, D/(N*I), D/(M*I), D/(P*R) )
S = D/T = min (N/c, N*I, M*I, P*R)
In the formula above, if c is very small (the CPU time per byte is negligible), the formula becomes:
T = max (D/(N*I), D/(M*I), D/(P*R) )
S = D/T = min (N*I, M*I, P*R)
This simplified formula is the basis of the performance analysis in this work.
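The model can be evaluated directly. The Python sketch below is one way to do so; the function name `access_time_and_bandwidth` and the example numbers are illustrative assumptions, not values from the talk. It treats I as the per-node network speed (as the terms N*I and M*I imply) and expects consistent units, here bytes and seconds. Leaving c at 0 gives the simplified formula.

```python
def access_time_and_bandwidth(D, N, M, P, I, R, c=0.0):
    """Return (T, S) for T = max(D*c/N, D/(N*I), D/(M*I), D/(P*R)), S = D/T.

    D: total data (bytes); N: clients; M: I/O nodes; P: disks in parallel;
    I: network speed per node (bytes/s); R: disk speed (bytes/s);
    c: CPU time per byte (s/byte). c = 0 gives the simplified formula.
    """
    if P < M:
        raise ValueError("the model requires P/M >= 1")
    T = max(D * c / N, D / (N * I), D / (M * I), D / (P * R))
    S = D / T  # computed as D/T so that c = 0 does not divide by zero
    return T, S


# Hypothetical numbers, not taken from the talk:
# 1 GB of data, 4 clients, 2 I/O nodes with one disk each,
# 12.5 MB/s network links, 40 MB/s disks, negligible CPU cost.
T, S = access_time_and_bandwidth(D=1e9, N=4, M=2, P=2, I=12.5e6, R=40e6)
print(f"T = {T:.1f} s, S = {S / 1e6:.1f} MB/s")  # T = 40.0 s, S = 25.0 MB/s
```

In this made-up configuration the I/O node links (M*I = 25 MB/s) are the smallest term, so adding clients or faster disks would not raise S.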
Some cases
N=1, M>=1 (or N>=1 and M=1), R>I: S depends on I
N=1, M>=1 (or N>=1 and M=1), R<I: S depends on I and P*R
N>1, M>1, R>I: S depends on the number of clients and I/O nodes
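These cases follow from asking which term of S = min(N*I, M*I, P*R) is smallest. The sketch below makes that explicit; the helper name `bottleneck` and the numeric values (in arbitrary but consistent bandwidth units) are hypothetical and chosen only to reproduce the three cases.

```python
def bottleneck(N, M, P, I, R):
    """Return (S, limiting) for the simplified model S = min(N*I, M*I, P*R).

    `limiting` lists the name of every term that equals the minimum.
    """
    terms = {
        "client network (N*I)": N * I,
        "I/O node network (M*I)": M * I,
        "disks (P*R)": P * R,
    }
    S = min(terms.values())
    limiting = [name for name, value in terms.items() if value == S]
    return S, limiting


# Case 1: one client, several I/O nodes, R > I. The single client link limits S.
case1 = bottleneck(N=1, M=3, P=3, I=10, R=40)
# Case 2: one client, one I/O node, R < I. S is the smaller of I and P*R.
case2 = bottleneck(N=1, M=1, P=1, I=10, R=4)
# Case 3: several clients and I/O nodes, R > I. The smaller of N*I and M*I limits S.
case3 = bottleneck(N=4, M=3, P=3, I=10, R=40)

for label, (S, limiting) in [("case 1", case1), ("case 2", case2), ("case 3", case3)]:
    print(label, S, limiting)
```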
Test environment
Twelve PCs
  I/O nodes, manager nodes and clients
  P4 2.8G / 512M / Disk WD 80G-8M-7200RPM
OS
  CERN Linux 7.3.3
  Kernel: 2.4.20-18.7.cernsmp
  Local file system: ext3
Network: 100M Ethernet
Cluster file systems
  OpenAFS 1.2.9, NFS v3, PVFS, CASTOR 1.6.1.2
Pre-test
Test tools
  Netperf 2.2pl3
  Iozone 3.217
Local area network bandwidth (I):
  100M Ethernet: about 94.11 Mbits/sec
Local file system measurement (R):
  ./iozone -Rab local.xls -g 2048M
Iozone was recompiled and linked with the CASTOR RFIO library
[Figure: write performance of local file system. Throughput (MB/s, 0 to 250) against record size (64k to 16M) and file size (64k to 2G); annotated regions: CPU cache effect, memory effect, physical disk I/O]
One client one server
Only one client accesses files
Only one I/O node in the server configuration
Write performance measurement
  File size: 512MB
  Record size: 64KB-16MB
  Output unit: KB/sec
Results (write throughput, KB/sec)
FS       Record size (KB)
         64     128    256    512    1024   2048   4096   8192   16384
NFS      11101  10803  11054  11125  11083  11042  11045  11109  11047
AFS      5173   5342   5239   5137   5148   5335   5212   5175   5353
PVFS     9953   10158  10103  10239  10759  10603  10662  10948  10976
CASTOR   10209  10335  10530  10622  10697  10722  10723  10705  10678
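With one client and one I/O node, the model predicts S is bounded by the network speed I when R > I, so these results can be read against the measured link bandwidth of about 94.11 Mbits/sec. The sketch below does that comparison. It assumes Netperf reports units of 10^6 bits/sec and Iozone reports KB as 1024 bytes; both unit conventions are assumptions here and should be checked against the tool versions used.

```python
# Measured link bandwidth from the pre-test (Netperf).
NETPERF_MBIT_PER_S = 94.11

# Assumption: 1 Mbit = 10**6 bits (Netperf), 1 KB = 1024 bytes (Iozone).
ceiling_kb_per_s = NETPERF_MBIT_PER_S * 1_000_000 / 8 / 1024

# Write throughput in KB/sec, copied from the results table (record sizes 64 KB to 16384 KB).
results = {
    "NFS": [11101, 10803, 11054, 11125, 11083, 11042, 11045, 11109, 11047],
    "AFS": [5173, 5342, 5239, 5137, 5148, 5335, 5212, 5175, 5353],
    "PVFS": [9953, 10158, 10103, 10239, 10759, 10603, 10662, 10948, 10976],
    "CASTOR": [10209, 10335, 10530, 10622, 10697, 10722, 10723, 10705, 10678],
}

# Mean throughput of each file system as a fraction of the network ceiling.
efficiency = {
    fs: (sum(values) / len(values)) / ceiling_kb_per_s
    for fs, values in results.items()
}

print(f"network ceiling: {ceiling_kb_per_s:.0f} KB/sec")
for fs, fraction in efficiency.items():
    print(f"{fs}: {fraction:.0%} of the network ceiling")
```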
Multi-process test
Only one client and one I/O node
Many processes access the one I/O node simultaneously
Write performance measurement
  File size: 100MB
  Record size: 512KB
  Process number: 1 to 10
  Output unit: KB/sec
Results (write throughput, KB/sec)
Number of processes  NFS    AFS    PVFS   CASTOR
1                    10372  7878   10806  10680
2                    10362  7889   10752  11255
3                    10323  10841  10751  11221
4                    10311  1020   10686  11450
5                    10257  9358   10707  11430
6                    10258  9142   10690  11441
7                    10255  8120   10696  11390
8                    10173  8545   10697  11440
9                    10240  8652   10696  11442
10                   10250  7305   10698  11430
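One way to summarize how stable each file system is as the process count grows is the coefficient of variation (standard deviation divided by mean) of its column. The sketch below computes it from the table values exactly as printed, including the AFS entry of 1020 at 4 processes, which is kept as it appears in the slides. The choice of this statistic is an assumption of this example, not something the talk reports.

```python
from statistics import mean, pstdev

# Write throughput in KB/sec for 1 to 10 processes, copied from the table.
throughput = {
    "NFS": [10372, 10362, 10323, 10311, 10257, 10258, 10255, 10173, 10240, 10250],
    "AFS": [7878, 7889, 10841, 1020, 9358, 9142, 8120, 8545, 8652, 7305],
    "PVFS": [10806, 10752, 10751, 10686, 10707, 10690, 10696, 10697, 10696, 10698],
    "CASTOR": [10680, 11255, 11221, 11450, 11430, 11441, 11390, 11440, 11442, 11430],
}

# Coefficient of variation: population standard deviation relative to the mean.
cv = {fs: pstdev(values) / mean(values) for fs, values in throughput.items()}

for fs, value in cv.items():
    print(f"{fs}: mean {mean(throughput[fs]):.0f} KB/sec, variation {value:.1%}")
```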
Multi-client to multi-server
Multiple clients read/write files
Multiple I/O nodes provide file storage
The output is the aggregate bandwidth
Only CASTOR and PVFS are measured
Write performance
  Size of each file: 200MB
  Record size: 2MB
  Output unit: MB/sec
Results
[Figure: aggregate write bandwidth (MB/sec, 0 to 70) against the number of clients and servers (1 to 6), for PVFS and CASTOR]
Some useful methods
In theory, a good cluster file system requires that
  the data is physically balanced among the I/O devices
  the data requirements are balanced among the application's tasks
  the network has enough aggregate bandwidth to pass the data between the two without saturating
In practice, the following methods are useful:
Use a high-speed network, for example Gigabit Ethernet or Myrinet
Use or develop a high-performance network file transfer protocol
Use multiple servers to improve the aggregate bandwidth
Improve the read/write speed of disks
File striping and parallel I/O
Good file system design
Improve the processing ability of manager nodes
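File striping, listed above, spreads consecutive blocks of one file across several I/O nodes so that a large transfer uses all of them in parallel. The sketch below shows a generic round-robin layout; the function name `locate` and the 64 KB stripe size are hypothetical, and this is not claimed to be the layout that PVFS or CASTOR actually uses.

```python
def locate(offset, stripe_size, num_nodes):
    """Map a byte offset in a file to (node index, offset within that node's data).

    Round-robin striping: stripe 0 goes to node 0, stripe 1 to node 1, and so on,
    wrapping around after `num_nodes` stripes.
    """
    stripe_index = offset // stripe_size        # which stripe holds this byte
    node = stripe_index % num_nodes             # node that stores the stripe
    local_stripe = stripe_index // num_nodes    # stripes already on that node
    local_offset = local_stripe * stripe_size + offset % stripe_size
    return node, local_offset


STRIPE = 64 * 1024  # hypothetical stripe size of 64 KB
NODES = 4           # hypothetical number of I/O nodes

# The first four stripes land on four different nodes.
placement = [locate(i * STRIPE, STRIPE, NODES) for i in range(5)]
print(placement)
```

With a layout like this, a sequential transfer of several stripes touches every I/O node, which is how striping raises the M*I and P*R terms available to a single file.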
Summary
Cluster file system review
Performance analysis formula
Performance test
Some methods to improve the performance
Thank you!!