SRB system at Belle/KEK
Yoshimi Iida
CHEP 04, Interlaken, 29 September 2004
CHEP 04 Yoshimi Iida, KEK 2
Outline
- The Belle experiment at KEK
- What is SRB?
- SRB activities at KEK
- Transfer rate measurements with SRB
- SRB test beds at Belle
- SRB for Belle data processing
- Summary
The Belle experiment
- As presented at the plenary session, Belle is an experiment at the KEK B-factory. Its goal is to study the origin of CP violation.
- Belle now accumulates more than 1 TB of raw data from the detector every day.
- The raw and processed data accumulated so far exceed a petabyte.
- This corresponds to 40 GB/day of compressed hadronic data for final physics analyses.
- Monte Carlo simulation data are generated at a rate of ~200 GB/day.
- The number of files is more than 10 million so far and grows every day (at a rate of ~10,000/day).
The Belle Collaboration
13 countries, 57 institutes, ~400 members

Aomori U.; BINP; Chiba U.; Chonnam Nat’l U.; Chuo U.; U. of Cincinnati; Ewha Womans U.; Frankfurt U.; Gyeongsang Nat’l U.; U. of Hawaii; Hiroshima Tech.; IHEP, Beijing; IHEP, Moscow; IHEP, Vienna; ITEP; Kanagawa U.; KEK; Korea U.; Krakow Inst. of Nucl. Phys.; Kyoto U.; Kyungpook Nat’l U.; U. of Lausanne; Jozef Stefan Inst.; U. of Melbourne; Nagoya U.; Nara Women’s U.; National Central U.; Nat’l Kaoshiung Normal U.; Nat’l Lien-Ho Inst. of Tech.; Nat’l Taiwan U.; Nihon Dental College; Niigata U.; Osaka U.; Osaka City U.; Panjab U.; Peking U.; Princeton U.; Riken-BNL; Saga U.; USTC; Seoul National U.; Shinshu U.; Sungkyunkwan U.; U. of Sydney; Tata Institute; Toho U.; Tohoku U.; Tohoku Gakuin U.; U. of Tokyo; Tokyo Inst. of Tech.; Tokyo Metropolitan U.; Tokyo U. of A and T.; Toyama Nat’l College; U. of Tsukuba; Utkal U.; VPI; Yonsei U.
PC farm of several generations
[Diagram: compute farms from several vendors — Dell 36 PCs (Pentium III ~0.5 GHz), Compaq 60 PCs (Pentium III 0.7 GHz), Fujitsu 127 PCs (Pentium III 1.26 GHz), Appro 113 PCs (Athlon 1.67 GHz ×2), NEC 84 PCs (Xeon 2.8 GHz ×2), Fujitsu 120 PCs (Xeon 3.2 GHz ×2); aggregate capacities of 168, 320, 450, 470 and 768 GHz]
- heterogeneous system from various vendors, chosen for cost effectiveness
- 3 types of CPU (Pentium III / Xeon / Athlon)
Belle data must be distributed
- Belle has more than 200 TB of real and Monte Carlo simulation data for final physics analyses.
- As shown, the Belle collaboration consists of more than 57 institutes in 13 countries.
- Collaborators want to share the data and analyze them at their own institutes.
- About half of the Monte Carlo data are generated at outside institutions (not at KEK).
- Belle wants to simplify the management of data and files among collaborators.
- The remote institutions want to exchange the data at their own pace and control their own resources.
What is SRB?
- “The SDSC Storage Resource Broker (SRB) is client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing unique or replicated data objects.”
- “SRB, in conjunction with the Metadata Catalog (MCAT), provides a way to access data sets and resources based on their logical names or attributes rather than their names and physical locations.”
- http://www.npaci.edu/DICE/SRB/index.html
SRB for distributed collaborators
- SRB provides access to data storage across local and wide-area networks.
- With a federated MCAT (or zoneSRB), each institution can share physical resources and logical collections, yet maintain more local control over those resources, data objects, and collections.
- SRB supports parallel I/O for larger files, and “containers” and/or “bulk load” for smaller files.
- SRB supports the Globus Grid Security Infrastructure (GSI) as an optional method of authentication.
SRB activities at KEK
- The Computing Research Center (CRC) of KEK started experimenting with SRB in collaboration with SLAC Computing Services.
- SLAC had already been using SRB to replicate files between SLAC and IN2P3 Lyon.
- The CRC has built several test beds and measured data-transfer performance.
- The Belle group — in particular the Australian Belle collaborators and KEK — started working with the CRC and built SRB test beds.
- Belle collaborators in Taiwan and Korea are now joining the effort.
The CRC SRB test system
[Diagram: behind the KEK firewall, a Giga switch connects the test system to the KEK network and the Internet. Zone A: an MCAT-enabled SRB server (dual CPUs, DB2) and an HPSS-enabled SRB server (single CPU) in front of 120 TB of HPSS. Zone B: an MCAT-enabled SRB server (dual CPUs, PostgreSQL) and an SRB server (dual CPUs) with an 800 GB FC RAID.]
Performance measurement
- Measure two cases:
  - Mixed files: 68 files, 928 MB in total; max file size 101 MB, min file size 4.7 kB
  - Larger file: 1 GB
- Compare SRB commands with ftp/pftp
  - Various transfer commands in SRB: “bulk load” and “container” for the mixed files, parallel I/O for the larger file
  - ftp for the Unix file system, pftp for HPSS
SRB transfer commands
- Sput: imports local files or directories into SRB space
- Sput -m: sets the I/O mode to parallel I/O
- Sput -c container: imports a local file into a container; a “container” is a way to put many files together into one larger file to improve performance
- Sbload: “bulk load”; it uses a single call to register up to several hundred files with MCAT, and uses separate threads for registration and data transfer
- Sbload -c container: “bulk load” into a container
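To make the parallel I/O mode concrete, here is a rough local-file analogue in Python. `parallel_put`, the chunk layout and every other detail are our own illustration, not SRB code — `Sput -m` performs the corresponding byte-range transfers over the network.

```python
import os
import threading

def parallel_put(src, dst, n_threads=4):
    """Copy src to dst with n_threads, each thread handling its own
    byte range — the same idea as SRB's parallel I/O mode, but local."""
    size = os.path.getsize(src)
    chunk = (size + n_threads - 1) // n_threads
    # Pre-size the destination so every thread can seek and write its range.
    with open(dst, "wb") as f:
        f.truncate(size)

    def copy_range(offset, length):
        # Each thread uses its own file handles; no shared state needed.
        with open(src, "rb") as fin, open(dst, "r+b") as fout:
            fin.seek(offset)
            fout.seek(offset)
            remaining = length
            while remaining > 0:
                buf = fin.read(min(1 << 20, remaining))
                if not buf:
                    break
                fout.write(buf)
                remaining -= len(buf)

    threads = []
    for i in range(n_threads):
        offset = i * chunk
        length = min(chunk, size - offset)
        if length <= 0:
            break
        t = threading.Thread(target=copy_range, args=(offset, length))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
```

On a real network link the benefit comes from keeping several TCP streams busy at once; on a single local disk the threads mostly illustrate the partitioning.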
Machine configuration
- A direct comparison between the zone A (HPSS) and zone B (Unix file system) cases cannot be made in the following measurements.
- The machine configurations differ: single CPU vs. dual CPUs; Pentium 4 vs. Xeon; DB2 vs. PostgreSQL for MCAT.
Transfer mixed files (preliminary)

Resource                  | Command   | Average rate (MB/s)
Unix file system (zone B) | Sput -c   | 14
Unix file system (zone B) | Sbload    | 12
Unix file system (zone B) | Sbload -c | 30
Unix file system (zone B) | ftp       | 36
HPSS* (zone A)            | Sput -c   | 14
HPSS* (zone A)            | Sbload    | 3
HPSS* (zone A)            | Sbload -c | 11
HPSS* (zone A)            | (pftp)    | (14)

- Among the SRB commands, “Sbload -c” is the fastest on the Unix file system.
- “Sput -c” gives the best result in the HPSS case, a characteristic of HPSS, which is designed for the storage of larger files.
* The HPSS-enabled server has a single CPU while the others have dual CPUs.
Transfer larger file (preliminary)

Resource                  | Command | Average rate (MB/s)
Unix file system (zone B) | Sput    | 23
Unix file system (zone B) | Sput -m | 29
Unix file system (zone B) | ftp     | 34
HPSS* (zone A)            | Sput    | 7
HPSS* (zone A)            | Sput -m | 19
HPSS* (zone A)            | (pftp)  | (17)

- “Sput -m” (parallel-thread mode) is faster than single-threaded transfer.
* The HPSS-enabled server has a single CPU while the others have dual CPUs.
SRB transfer performance
- Performance could be better.
- MCAT lookup time: SRB takes extra time for the database query; MCAT needs tuning (building indices).
- HPSS interface: still looks immature on Linux (it was originally developed on AIX); further improvements are desired.
- Measurements for the HPSS case were done on a congested KEK LAN.
The Belle SRB system
[Diagram: at KEK, an MCAT-enabled SRB server and an SRB server with SCSI RAID sit on the KEK network behind the KEK and Belle firewalls; GbE links (B-Inet, B-Tnet) connect them to the KEK-B system, an HSM server with NFS-mounted HSM disk and a tape library, and an SRB client behind a router. Across the Internet, the MCAT federates with an MCAT-enabled SRB server at ANU and an SRB server at Melbourne U.]
Belle software with SRB
- Belle uses a home-grown analysis framework called BASF.
- Belle has extended BASF to dynamically load I/O subsystems as C++ objects.
- It was quite simple to add SRB support as a new I/O class using the following SRB client APIs: srbConnect, srbObjOpen, srbObjCreate, srbObjStat, srbObjRead, srbObjWrite and srbObjClose.
- We then tested and compared I/O performance using SRB, Belle’s own TCP/IP protocol, and NFS, all within KEK.
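BASF and its I/O classes are C++, but the shape of such a pluggable I/O subsystem can be sketched in a few lines of Python. All class names below are hypothetical; the SRB-backed class is left as a stub because it needs a live SRB server, and only the API names in its comments come from the list above.

```python
from abc import ABC, abstractmethod

class IOSubsystem(ABC):
    """Sketch of a dynamically loadable I/O interface in the spirit of
    BASF's I/O subsystems (illustrative names, not BASF's real API)."""
    @abstractmethod
    def open(self, name): ...
    @abstractmethod
    def read(self, nbytes): ...
    @abstractmethod
    def close(self): ...

class LocalFileIO(IOSubsystem):
    """Plain Unix-file implementation, analogous to local reads."""
    def open(self, name):
        self._f = open(name, "rb")
    def read(self, nbytes):
        return self._f.read(nbytes)
    def close(self):
        self._f.close()

class SrbIO(IOSubsystem):
    """An SRB-backed subsystem would wrap the SRB client API:
    srbConnect at construction, srbObjOpen/srbObjCreate in open(),
    srbObjRead in read(), srbObjClose in close(). Stubbed here
    because it requires a live SRB server."""
    def open(self, name):
        raise NotImplementedError("requires srbConnect/srbObjOpen")
    def read(self, nbytes):
        raise NotImplementedError("requires srbObjRead")
    def close(self):
        raise NotImplementedError("requires srbObjClose")
```

The point of the pattern is that the analysis framework only sees the abstract interface, so swapping local files for SRB objects does not touch the framework itself.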
BASF test results (preliminary)

Protocol     | Resource        | Elapsed time | CPU utilization
SRB          | Local SCSI RAID | 10:22        | 53.3%
SRB          | Remote HSM (NFS)| 13:13        | 41.8%
Unix read    | Local SCSI RAID | 5:44         | 90.0%
Belle TCP/IP | Remote HSM (NFS)| 6:24         | 86.1%

- The data used for this test are 6 files, 2.8 GB in total.
- Although the elapsed time when using the SRB protocol is longer, the CPU time actually consumed (elapsed time × utilization) is almost the same.
SRB for Belle processing
- It works well, as claimed.
  - We have tested the mechanisms on a small-scale test bed.
- GSI, zones and federations look promising.
  - Each institution can manage its own resources.
- Reading and writing remote data from within the Belle software works.
  - SRB is about 40% slower than Belle’s own TCP/IP transfer interface.
  - More detailed tests are necessary.
More tests and plans
- Federation among Australia, Taiwan, Korea and KEK will be established soon.
- Quick (bulk) registration of many files.
- Scalability tests:
  - a single MCAT for zone synchronization
  - multiple access to resources and the MCAT (several hundred jobs running at a time, accessing files)
- File-replica consistency, and checks for broken files in case of disk or network failure.
Summary
- SRB is now working in the Belle experiment.
  - Zone federation between Australia and KEK has been established.
  - SRB has been implemented in BASF, the Belle analysis software framework.
  - Preliminary performance measurements have been done.
- G. Moloney gives a talk in a different session about the Australian experiences (ID: 486).
Acknowledgements
- SDSC (San Diego Supercomputer Center): S. Chen, G. Kremenek, A. Rajasekar and R. Moore
- SLAC (Stanford Linear Accelerator Center): A. Hasan and W. Kroeger
- University of Melbourne: G. Moloney
- ANU (Australian National University): S. McMahon and J. Smillie
- IHEP (Institute of High Energy Physics): Ma Mei
- Fujitsu: S. Honma, H. Kuraishi and T. Nakajima
- IBM: K. Ishikawa and S. Yamamoto
- KEK (High Energy Accelerator Research Organization): I. Adachi, N. Katayama, S. Kawabata, Ma Mei, A. Manabe, T. Sasaki, S. Y. Suzuki, S. Yashiro and Y. Watase
- SuperSINET, supported by the National Institute of Informatics
Backup slides
The CRC SRB machine specification

               | HPSS enabled              | MCAT enabled   | SRB server     | MCAT enabled
CPU            | Pentium 4 2.8GHz          | Xeon 2.8GHz ×2 | Xeon 2.8GHz ×2 | Xeon 2.8GHz ×2
Memory         | 512MB                     | 512MB          | 512MB          | 512MB
Disc           | 40GB                      | 36GB           | 36GB           | 36GB
OS             | RH Linux 7.3              | RH Linux 7.2   | RH Linux 7.2   | RH Linux AS v3
SRB            | v3.1.0                    | v3.1.0         | v3.1.0         | v3.1.0
Globus Toolkit | v2.2.4                    | v2.2.4         | v2.2.4         | v2.4.3
DB             | ×                         | DB2 v8.1       | ×              | PostgreSQL 7.4.2
SRB resource   | HPSS (client library v4.5): 2TB | ×        | ×              | FIBERNET RAID: 800GB
The Belle SRB machine specification

               | MCAT enabled            | SRB server              | SRB client              | SRB server
CPU            | 500MHz (SPARC64 GP) ×4  | Pentium III 1266MHz ×2  | 500MHz (SPARC64 GP) ×4  | —
Memory         | 2GB                     | 512MB                   | 2GB                     | —
OS             | Solaris 7               | RH Linux 7.2            | Solaris 7               | RH Linux 8
SRB            | v3.1                    | v3.1                    | v3.1 (client)           | v3.1
Globus Toolkit | v2.4.3                  | v2.4.3                  | v2.4.3                  | v2.4.3
DB             | PostgreSQL              | -                       | -                       | -
SRB resource   | HSM-DISK                | SCSI RAID               | ×                       | —
Bulk load (unload) and Container
- “Bulk load” (unload) is designed to greatly improve the efficiency of ingesting a large number of small files by:
  1. registering up to several hundred files with MCAT in a single call, instead of the normal mode of registering one file at a time
  2. using separate threads for registration and data transfer
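A minimal Python sketch of this pattern, with caller-supplied `transfer` and `register_batch` callables standing in for SRB's internals (all names here are hypothetical):

```python
import queue
import threading

def bulk_load(files, transfer, register_batch, batch_size=200):
    """Bulk-load pattern: data moves in its own thread while completed
    files are registered with the catalog in batches — one call per
    up-to-batch_size files instead of one call per file."""
    done = queue.Queue()

    def mover():
        for f in files:
            transfer(f)    # move the data for one file
            done.put(f)    # hand it off for catalog registration
        done.put(None)     # sentinel: no more files

    t = threading.Thread(target=mover)
    t.start()

    batch = []
    while True:
        f = done.get()
        if f is None:
            break
        batch.append(f)
        if len(batch) >= batch_size:
            register_batch(batch)  # single catalog call for many files
            batch = []
    if batch:
        register_batch(batch)      # register the final partial batch
    t.join()
```

The default `batch_size=200` mirrors the slide's “up to several hundred files” per MCAT call; the win is that per-file catalog latency is paid once per batch rather than once per file.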
- A “Container” is a way to put many files together into one large file to improve performance.
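The container idea can be illustrated with tar, keeping in mind that SRB containers use their own format and server-side bookkeeping; `make_container` and `read_from_container` are hypothetical helpers, not SRB calls:

```python
import os
import tarfile

def make_container(container_path, files):
    # Pack many small files into a single container file, so one large
    # object is stored and moved instead of thousands of tiny ones.
    with tarfile.open(container_path, "w") as tar:
        for f in files:
            tar.add(f, arcname=os.path.basename(f))

def read_from_container(container_path, member):
    # Pull one logical file back out of the container by name.
    with tarfile.open(container_path, "r") as tar:
        return tar.extractfile(member).read()
```

For a tape-backed store like HPSS this matters even more than for disk: the system is tuned for large files, which is consistent with the mixed-file measurements earlier in the talk.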
Federated (multiple) MCAT system
- SRB zone: an SRB zone (or zone for short) consists of one or more SRB servers along with one MCAT-enabled server.
- Federated MCAT: the federated MCAT implementation allows users to access resources and data across zones.
Logical file system

Single SRB system:
/
  - container/
  - home/
    - srbUserA.domain/
    - srbUserB.domain/
      - data.txt
  - styles/
  - trash/

Federated SRB system:
/
  - zoneA/
    - container/
    - home/
    - styles/
    - trash/
  - zoneB::
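The zone-qualified paths above can be resolved mechanically. The following toy resolver (purely illustrative — in SRB the resolution is done through the MCAT) splits a federated logical path into its zone and the path within that zone:

```python
def resolve(logical_path, known_zones):
    """Split a federated logical path such as /zoneA/home/user/data.txt
    into (zone, path-within-zone). A path whose first component is not
    a known zone is treated as belonging to the local zone."""
    parts = logical_path.strip("/").split("/")
    if parts and parts[0] in known_zones:
        return parts[0], "/" + "/".join(parts[1:])
    return "local", "/" + "/".join(parts)
```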
Belle plan
- Continue experiments among several remote institutions.
  - larger-scale tests involving thousands of files totaling tens of terabytes
- Integration with the hierarchical storage management system.
  - Belle uses SONY PetaSite and PetaServe.
- Disaster recovery scenarios.
  - Belle uses cheap IDE-based RAID systems.
- Ask users to analyze SRB data files.