CHEP 2000: 7-11 February, 2000
Data Handling in KLOE
I. Sfiligoi
INFN LNF, Frascati, Italy

The KLOE experiment
• at the DAΦNE φ-factory
• main goal:
  • CP violation study
• other interesting fields:
  • kaon form factors
  • kaon rare decays
  • radiative decays
KS → π+π−, KL → π+π− (CP not)
KS+- KL306

KLOE Requirements
• Data acquisition (at full DAΦNE luminosity)
  • 10^11 events per year acquired
  • 50 MB/s sustained throughput
• Computing power
  • ALL the events need to be reconstructed
• Storage requirements
  • one petabyte of raw and reconstructed events
  • hundreds of megabytes of related data (configurations, slow control data, calibration parameters, etc.)
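
As a rough cross-check of these figures, the sketch below derives two numbers that the slide does not state: the data volume written if the DAQ ran at 50 MB/s for a whole calendar year, and the average storage per event if the petabyte is taken to cover one year of data. The 100% duty cycle, the one-year reading of the petabyte, and the decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes) are assumptions made only for this illustration.

```python
# Back-of-the-envelope check of the KLOE requirements (illustrative only).
EVENTS_PER_YEAR = 10**11             # from the slide
THROUGHPUT_BYTES_PER_S = 50e6        # 50 MB/s, assuming 1 MB = 10^6 bytes
STORAGE_BYTES = 1e15                 # one petabyte, assuming 1 PB = 10^15 bytes
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: the DAQ never stops

# Upper bound on the yearly raw data volume at the sustained throughput.
max_raw_volume_pb = THROUGHPUT_BYTES_PER_S * SECONDS_PER_YEAR / 1e15

# Average space per event (raw plus reconstructed), assuming the
# petabyte corresponds to one year of data taking.
bytes_per_event = STORAGE_BYTES / EVENTS_PER_YEAR

print(f"max raw volume: {max_raw_volume_pb:.2f} PB/year")
print(f"average storage per event: {bytes_per_event / 1e3:.0f} kB")
```

Under these assumptions the throughput and storage figures are of the same order of magnitude, which is all the sketch is meant to show.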

KLOE computing environment
• Based on a set of medium-sized servers
• Connected using commercial switched networks (Fast Ethernet and Gigabit Ethernet)
• Heterogeneous environment, several platforms:
  • IBM AIX on PowerPC
  • Sun Solaris on SPARC
  • Compaq Tru64 UNIX on Alpha
  • HP-UX on PA-RISC

KLOE storage pool
• Different policies for different types of data:
  • raw and reconstructed events on tape libraries, with big disk pools for data caching
  • related data managed by a disk-based database system
  • analysis output on disk pools

Disk pools
• Four categories of disk pools are present:
  • each data acquisition node in the farm has its own small disk pool
  • computing nodes write their output to centralized, NFS-mounted disk pools
  • separate disk pools are used as a cache for the events on tape
  • analysis output is written to its own central, AFS-mounted disk pool

Tape library
• Several automated tape libraries supported (at the moment the 5500-slot tape library is partitioned between two tape servers)
• Accessed using commercial software
  • IBM ADSM with the current tape library

KLOE software
• Three distinct categories
  • DAQ (or online): ANSI C
  • reconstruction and analysis (or offline): FORTRAN inside A_C
  • Monte Carlo: FORTRAN
• The interface to the Data Handling System must be compatible with all of them

KLOE Data Handling System
• Composed of four elements:
  • Database System
  • Archiving System
  • Spy System
  • KLOE Integrated Dataflow (KID)

KLOE Data Handling System
• A mix of commercial and custom software
  • the dependency on commercial software is minimized by the layers of custom software
  • commercial software carries out all the vital functions
  • custom software mostly extends and coordinates the functionality of the commercial software

KLOE Data Handling System
• Based on a set of multi-threaded, non-privileged daemons and related libraries
• Distributed across several nodes
• Communication by means of TCP/IP sockets on high ports
  • bypasses TCP/IP filtering
  • flexible, programming language and operating system independent
  • no configuration needed on the client side
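
The daemon model described above can be sketched in a few lines of Python. This is only an illustration, not KLOE code: the `EchoHandler` protocol (one text line in, one line out), the `Daemon` class and the `ask` helper are hypothetical names. It shows a multi-threaded server bound to an unprivileged port and a client that needs nothing but a host name and a port number.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Hypothetical protocol: answer each request line with 'OK <line>'."""

    def handle(self):
        for raw in self.rfile:
            request = raw.decode().strip()
            self.wfile.write(f"OK {request}\n".encode())


class Daemon(socketserver.ThreadingTCPServer):
    """One thread per client connection."""
    allow_reuse_address = True
    daemon_threads = True  # worker threads do not block interpreter exit


def ask(port, line, host="127.0.0.1"):
    """Client side: only a host and a port number are needed."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall((line + "\n").encode())
        return conn.makefile().readline().strip()


# Port 0 asks the OS for a free ephemeral port, which is unprivileged,
# so the daemon can run without root rights.
server = Daemon(("127.0.0.1", 0), EchoHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

reply = ask(port, "ping")
print(port, reply)

server.shutdown()
server.server_close()
```

Because the protocol is plain text over a socket, a client in C or FORTRAN could talk to the same daemon, which is the language independence the slide refers to.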

Database System
• Two distinct database systems are used
  • offline database system
    • based on HepDB
    • data stored as ZEBRA banks
  • online database system
    • based on a Relational DBMS
    • data are structured in fields
    • extended for distributed environments

Online Database System
• data stored in a Relational DBMS
  • IBM DB2 Universal Database at the moment
• communication between the clients (user applications) and the RDBMS through a database daemon
[Diagram: several applications (app) connected to the database daemon (DD), which is connected to the RDBMS]

Database Daemon
• The database daemon is the only link between the applications and the RDBMS
  • if the RDBMS is changed in the future, only the database daemon will need to be changed
• Different kinds of commands are managed by the daemon
  • general SQL commands
  • KLOE specific commands

Database Daemon
• Different kinds of commands are managed by the daemon
  • general SQL commands
    • passed directly to the RDBMS
    • example: select run_nr from run_logger where status = 'OK'
  • KLOE specific commands
    • managed by the daemon itself
    • the RDBMS is used to retrieve and store data needed by the daemon itself
    • example: log that I am starting processing file relative to run 3
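
The two kinds of commands can be illustrated with a toy dispatcher. This is a minimal sketch under stated assumptions: `sqlite3` stands in for the real RDBMS, the `processing_log` table and the `log_processing_start` command are hypothetical, and only the `run_logger` query is taken from the slide.

```python
import sqlite3


class DatabaseDaemon:
    """Toy stand-in for the database daemon; sqlite3 replaces the RDBMS."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("create table run_logger (run_nr integer, status text)")
        self.db.execute("create table processing_log (run_nr integer, file text)")
        self.db.executemany("insert into run_logger values (?, ?)",
                            [(1, "OK"), (2, "BAD"), (3, "OK")])
        # Hypothetical registry of KLOE-specific commands.
        self.commands = {"log_processing_start": self.log_processing_start}

    def handle(self, command, *args):
        handler = self.commands.get(command)
        if handler is not None:
            # KLOE-specific command: managed by the daemon itself.
            return handler(*args)
        # Anything else is treated as general SQL and passed through.
        return self.db.execute(command, args).fetchall()

    def log_processing_start(self, run_nr, file_name):
        # Extra consistency check that plain SQL access would not enforce.
        known = self.db.execute(
            "select 1 from run_logger where run_nr = ?", (run_nr,)).fetchone()
        if known is None:
            raise ValueError(f"unknown run {run_nr}")
        self.db.execute("insert into processing_log values (?, ?)",
                        (run_nr, file_name))
        return "logged"


daemon = DatabaseDaemon()
ok_runs = daemon.handle("select run_nr from run_logger where status = 'OK'")
status = daemon.handle("log_processing_start", 3, "file_a")
print(ok_runs, status)
```

Since applications only ever call `handle`, replacing `sqlite3` with another back end would touch this class alone, which mirrors the isolation argument made on the previous slide.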

Database Daemon
• The use of KLOE specific commands has several advantages
  • additional checks and restrictions are possible
  • data consistency management is centralized
  • fast central caches can be implemented
    • for example, the DAQ configuration cache reduces the typical access time from 4 s to 0.1 s
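
A central cache of this kind could look like the read-through cache below. It is a sketch, not the KLOE implementation: `ConfigCache`, `load_daq_config` and the configuration content are hypothetical, and the slow RDBMS query is replaced by a plain function so that the example runs anywhere.

```python
class ConfigCache:
    """Read-through cache placed in front of a slow lookup function."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup
        self._entries = {}
        self.misses = 0

    def get(self, key):
        if key not in self._entries:
            # Only the first request for a key pays the full lookup cost.
            self.misses += 1
            self._entries[key] = self._slow_lookup(key)
        return self._entries[key]

    def invalidate(self, key):
        """To be called whenever the stored value is updated."""
        self._entries.pop(key, None)


def load_daq_config(name):
    # Placeholder for a multi-table RDBMS query (hypothetical content).
    return {"name": name, "crates": 10}


cache = ConfigCache(load_daq_config)
first = cache.get("physics")
second = cache.get("physics")   # served from memory
cache.invalidate("physics")
third = cache.get("physics")    # looked up again after invalidation
print(cache.misses)
```

A cache like this is safe only if every update is seen by whoever owns the cache; routing all access through a single daemon is one way to guarantee that.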

A light version
• The RDBMS is used to ensure flexibility, reliability and performance
• It is demanding in terms of computing resources and management effort
  • stand-alone environments often cannot afford it
• An RDBMS-independent version of the database daemon is under development

A light version
• An RDBMS-independent version of the database daemon is under development
  • limited to KLOE specific commands and the most frequently used SQL commands
  • based on the use of flat files containing a small portion of the data
  • not suitable for a production environment, but enough for home use

KLOE Archiving System
• Expected event data managed by KLOE
  • 1 PB
• Tape libraries needed
  • data storage and retrieval non-trivial
  • random access to data very inefficient
• Disk-based intermediate buffers used

KLOE Archiving System
• Two types of intermediate buffers
  • DAQ, offline and Monte Carlo output are structured as YBOS files and written to their disk output areas
  • event data needed by offline as input are read from the archiving system disk cache

KLOE Archiving System
• Data needs to be migrated
  • from output areas to the tape library
    • as soon as possible (also taking efficiency concerns into account)
  • from the tape library to the disk cache
    • when an application needs it (or even better, a bit earlier)
• Migration is totally automated and transparent to the applications

KLOE Archiving System
• The Archiving System is made of four components
  • storage managers
  • disk space managers
    • output areas
    • cache areas
  • archival director
  • cache manager
• Communication by means of TCP/IP sockets
• Coordinated by the online database
[Process names shown on the slide: archADSM, spacekeeper, filekeeper, archiver, retrieve]

Storage Managers
• One for each logical tape library
• Allows
  • queries about tape library content
  • file archival
  • file retrieval
• Transaction oriented (if the underlying tape library software supports it)

Storage Managers
• The only link between the tape library and the rest of the system
  • interface independent of the underlying archiving software
  • IBM ADSM is used with the current tape library
  • if other products are used in the future, only a specific storage manager will need to be developed
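
One way to picture such a product-independent interface is an abstract base class with one concrete class per archiving product. The sketch below is an assumption about the shape of the interface, not the actual KLOE API: `StorageManager`, its three methods and the in-memory back end are all hypothetical.

```python
from abc import ABC, abstractmethod


class StorageManager(ABC):
    """Hypothetical product-independent interface: query, archive, retrieve."""

    @abstractmethod
    def list_files(self):
        """Return the names of the files held by the tape library."""

    @abstractmethod
    def archive(self, name, data):
        """Store one file."""

    @abstractmethod
    def retrieve(self, name):
        """Return the content of one file."""


class InMemoryStorageManager(StorageManager):
    """Stand-in back end; a real one would wrap the tape library software."""

    def __init__(self):
        self._tape = {}

    def list_files(self):
        return sorted(self._tape)

    def archive(self, name, data):
        if name in self._tape:
            raise FileExistsError(name)
        self._tape[name] = bytes(data)

    def retrieve(self, name):
        return self._tape[name]


def archive_all(manager, files):
    """Caller code depends only on the interface, never on the back end."""
    for name, data in files.items():
        manager.archive(name, data)
    return manager.list_files()


manager = InMemoryStorageManager()
stored = archive_all(manager, {"run5000.raw": b"abc", "run5001.raw": b"def"})
print(stored)
```

Supporting a different archiving product would then mean writing one more subclass, leaving `archive_all` and every other caller untouched.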

Disk Space Managers
• One for each disk pool
• Create and delete files
  • unused files get deleted to make space for new ones
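
The slide does not say how the files to delete are chosen. The sketch below assumes a least-recently-released policy and models the pool purely in memory; the class, its methods and the notion of "in use" are hypothetical.

```python
from collections import OrderedDict


class DiskSpaceManager:
    """Toy model of one disk pool; sizes are in arbitrary units."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()   # name -> size, oldest candidates first
        self.in_use = set()

    def used(self):
        return sum(self.files.values())

    def create(self, name, size):
        # Delete unused files, oldest first, until the new file fits.
        for victim in list(self.files):
            if self.used() + size <= self.capacity:
                break
            if victim not in self.in_use:
                del self.files[victim]
        if self.used() + size > self.capacity:
            raise OSError("not enough reclaimable space")
        self.files[name] = size
        self.in_use.add(name)

    def release(self, name):
        """Mark a file as unused; it stays on disk until space is needed."""
        self.in_use.discard(name)
        self.files.move_to_end(name)


pool = DiskSpaceManager(capacity=100)
pool.create("a", 60)
pool.create("b", 30)
pool.release("a")        # "a" may now be deleted on demand
pool.create("c", 40)     # does not fit until "a" is removed
print(list(pool.files), pool.used())
```

Keeping released files on disk until the space is actually needed lets a cache area serve repeated requests without going back to tape.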

Archival Director
• Fully automated
• Works in polling mode
  • from time to time looks for files ready to be archived
  • starts archiving only when enough data is available
• Files are ordered and grouped to minimize the expected retrieve time
• Several groups of files can be archived in parallel
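
A single polling pass of such a director might look like the sketch below. The threshold, the choice of the run number as grouping key, and the `archive_group` placeholder are assumptions made for illustration; the slide does not specify how files are ordered or grouped.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def plan_archival(pending, min_bytes):
    """One polling pass over (run_nr, file_name, size) tuples.

    Returns [] when too little data is waiting, otherwise the file
    names ordered and grouped by run number (a hypothetical key).
    """
    if sum(size for _, _, size in pending) < min_bytes:
        return []
    ordered = sorted(pending)  # by run_nr, then by file name
    return [[name for _, name, _ in group]
            for _, group in groupby(ordered, key=lambda item: item[0])]


def archive_group(group):
    # Placeholder for handing one group of files to a storage manager.
    return len(group)


pending = [(12, "r12_b.raw", 40), (11, "r11_a.raw", 30), (12, "r12_a.raw", 50)]

too_early = plan_archival(pending, min_bytes=200)   # not enough data yet
groups = plan_archival(pending, min_bytes=100)

# Independent groups can be archived in parallel.
with ThreadPoolExecutor(max_workers=2) as workers:
    archived = list(workers.map(archive_group, groups))
print(too_early, groups, archived)
```

Writing files that are likely to be read together next to each other on tape is what keeps the later retrieve time low; grouping by run is just one plausible way to approximate that.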

Cache Manager
• User driven
  • when a file is needed, the application asks the cache manager where it is located
  • a retrieve is performed by the manager if needed
• Several requests can be issued at the same time
  • the manager reorders them internally to minimize the tape mounts
• Communication by means of TCP/IP sockets
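
The reordering idea can be shown with a small example: group the pending requests by tape, then read each tape in position order. The catalogue below (tape labels and positions) is invented for the illustration, and `order_requests` and `count_mounts` are hypothetical helpers rather than KLOE functions.

```python
from collections import defaultdict


def order_requests(requests, tape_of):
    """Group pending file requests by tape so each tape is mounted once.

    `tape_of` maps a file name to (tape label, position on tape).
    """
    by_tape = defaultdict(list)
    for name in requests:
        tape, position = tape_of[name]
        by_tape[tape].append((position, name))
    plan = []
    for tape in sorted(by_tape):
        plan.append((tape, [name for _, name in sorted(by_tape[tape])]))
    return plan


def count_mounts(requests, tape_of):
    """Mounts needed when requests are served strictly in arrival order."""
    mounts, current = 0, None
    for name in requests:
        tape = tape_of[name][0]
        if tape != current:
            mounts, current = mounts + 1, tape
    return mounts


catalogue = {"f1": ("T01", 7), "f2": ("T02", 1),
             "f3": ("T01", 2), "f4": ("T02", 5)}
requests = ["f1", "f2", "f3", "f4"]

naive_mounts = count_mounts(requests, catalogue)
plan = order_requests(requests, catalogue)
print(naive_mounts, plan)
```

In this example, serving the requests in arrival order alternates between the two tapes and needs four mounts, while the reordered plan needs two.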

KLOE Archival System
[Diagram. Components: archiver, retrieve, DB, archADSM (one per Tape Library), spacekeeper and filekeeper (one pair per group of Disk Pools), with multiplicities n, m and k. Connection types in the legend: NFS mount, local file system, TCP/IP socket.]

Spy System
• KLOE data acquisition software allows the event data to be read out before they get written to disk
• The mechanism that reads those data is called Spy
• Based on the use of shared memory buffers
  • DAQ processes are piped using this mechanism
  • the spy system reads data from the buffers without interfering with the DAQ
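
One way to read data "without interfering" is to let the writer overwrite a circular buffer freely while the reader only copies what is still there and accepts that it may miss events. The sketch below models that idea in a single process with ordinary Python objects; it does not use real shared memory or any synchronization, and `EventRing` and `Spy` are hypothetical names.

```python
class EventRing:
    """Fixed-size circular buffer; the writer never waits for readers."""

    def __init__(self, slots):
        self.slots = [None] * slots
        self.written = 0  # total number of events written so far

    def write(self, event):
        self.slots[self.written % len(self.slots)] = event
        self.written += 1


class Spy:
    """Non-destructive reader: copies events, skips those already overwritten."""

    def __init__(self, ring):
        self.ring = ring
        self.next_index = ring.written
        self.lost = 0

    def read_new(self):
        ring = self.ring
        size = len(ring.slots)
        oldest = max(0, ring.written - size)
        if self.next_index < oldest:
            # The writer lapped the spy: count the loss and move on.
            self.lost += oldest - self.next_index
            self.next_index = oldest
        events = [ring.slots[i % size]
                  for i in range(self.next_index, ring.written)]
        self.next_index = ring.written
        return events


ring = EventRing(slots=4)
spy = Spy(ring)
for n in range(2):
    ring.write(f"ev{n}")
first_batch = spy.read_new()
for n in range(2, 8):
    ring.write(f"ev{n}")       # the writer runs ahead of the spy
second_batch = spy.read_new()
print(first_batch, second_batch, spy.lost)
```

The writer's code path contains no reference to the spy at all, so a slow or crashed reader cannot slow down the data acquisition; the price is that the spy sees only a sample of the events.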

KLOE Integrated Dataflow (KID)
• Integration library
  • database accesses and retrieve operations hidden
• Offers a single point of access to all the services
  • URI-based selection
    • spy:/buffer
      open a spy channel and pass the events to the application
    • datarec:(run_nr=5000) and (stream='ksl')
      read the list from the DB, ask the cache manager for the files, pass the events from the files to the application
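
The URI-based selection can be sketched as a dispatcher that splits the string at the first colon and hands the rest to a protocol-specific opener. Only the two URIs come from the slide; `open_stream`, the handler functions and the fake events are hypothetical stand-ins for the spy system, the database daemon and the cache manager.

```python
def open_stream(uri, handlers):
    """Split a KID-style URI into protocol and selection, then dispatch."""
    protocol, sep, selection = uri.partition(":")
    if not sep or protocol not in handlers:
        raise ValueError(f"unsupported URI: {uri!r}")
    return handlers[protocol](selection)


def open_spy(selection):
    # Stand-in for opening a spy channel on the named buffer.
    return iter([f"event from spy buffer {selection}"])


def open_datarec(selection):
    # Stand-in for: read the file list from the DB, ask the cache
    # manager for each file, then stream the events it contains.
    files = ["file_1", "file_2"]
    return (f"event from {name} matching {selection}" for name in files)


HANDLERS = {"spy": open_spy, "datarec": open_datarec}

events = list(open_stream("datarec:(run_nr=5000) and (stream='ksl')", HANDLERS))
spy_events = list(open_stream("spy:/buffer", HANDLERS))
print(events, spy_events)
```

The application sees the same iterator of events in both cases, so switching from live data to archived data is a matter of changing one string.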

Management effort
• The entire system is managed by only a few people:
  • 3 people (2 full time) are engaged in KLOE computing system management (including storage)
  • 1 person is engaged in the development and management of the online database and the archiving system
  • 2 people spend a few percent of their time on the maintenance of the offline database