Vincenzo Innocente, CERN/EP
Software Strategy
A Uniform and Coherent Approach to Object Persistency
Vincenzo Innocente
HEP Data
[Diagram: Event Collection, Collection Data, Event, Electrons, Tracks, Tracker Alignment, Ecal calibration, User Tag (N-tuple)]
Environmental data: detector and accelerator status; calibrations, alignments
Event-Collection Data (luminosity, selection criteria, …)
…
Event Data, User Data
Navigation is essential for an effective physics analysis
Complexity requires coherent access mechanisms
Not in original design
Later selected DAQ
Later more filters to DVNs and N-tuple
CMS Experiment-Data Analysis
[Diagram: Persistent Object Store Manager / Object Database Management System, connected to: Event Filter / Object Formatter; Detector Control; Online Monitoring; Simulation (G3 or G4); Data Quality; Calibrations; Group Analysis; User Analysis (on demand); Quasi-online Reconstruction. Flow labels: "store", "Environmental data", "Request part of event", "Store rec-Obj", "Store rec-Obj and calibrations"]
Uniform approach
Coherent data access model: same mechanisms, same language, same transaction model
Save effort: a single team of experts, a single team of administrators
Leverage experience: developers can easily move from one application to another (from event-data to calibration-data applications)
Reuse design and code: basic requirements are often the same; we can use the same code to manage event data, calibrations, “n-tuple”
Main road in producing better and higher quality software
Reconstruction Sources
CMS Reconstruction Model
[Diagram: Detector Element, Raw Data, Sim Hits, Digis, Rec Hits, Conditions, Geometry, Event; several Algorithm boxes, each paired with Rec Objs]
Raw Event
[Diagram: RawEvent, RawData (each with a Vector of Digi), ReadOut, Index]
RawData are identified by the corresponding ReadOut.
RawData belonging to different “detectors” are clustered into different containers. The granularity will be adjusted to optimize I/O performance.
An index at RawEvent level is used to avoid accessing all containers in search of a given RawData.
A range index at RawData level could be used for fast random access in complex detectors. The index is implemented as an ordered vector of pairs.
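The "ordered vector of pairs" index can be illustrated with a minimal C++ sketch. The names RawDataIndex, ReadOutId and Slot are hypothetical and are not taken from CARF; the sketch only assumes that each ReadOut maps to one RawData position and that lookups use a binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical identifiers: the slides do not define these types.
using ReadOutId = std::uint32_t;  // key identifying a ReadOut
using Slot = std::size_t;         // position of the matching RawData

// Index stored as a vector of (key, slot) pairs kept sorted by key.
class RawDataIndex {
public:
    // Insert a new entry at the position that keeps the vector ordered.
    void insert(ReadOutId id, Slot slot) {
        auto pos = std::lower_bound(entries_.begin(), entries_.end(), id, keyLess);
        // Assumption of this sketch: one RawData per ReadOut.
        assert(pos == entries_.end() || pos->first != id);
        entries_.emplace(pos, id, slot);
    }

    // Binary search over the ordered vector. Returns nullptr when the
    // ReadOut is absent; the pointer is invalidated by a later insert.
    const Slot* find(ReadOutId id) const {
        auto pos = std::lower_bound(entries_.begin(), entries_.end(), id, keyLess);
        return (pos != entries_.end() && pos->first == id) ? &pos->second : nullptr;
    }

    std::size_t size() const { return entries_.size(); }

private:
    // Comparator for lower_bound: compares a stored entry with a bare key.
    static bool keyLess(const std::pair<ReadOutId, Slot>& entry, ReadOutId id) {
        return entry.first < id;
    }

    std::vector<std::pair<ReadOutId, Slot>> entries_;
};
```

A sorted vector keeps the index contiguous in memory and cheap to store, at the cost of linear-time insertion, which is acceptable when the index is written once and read many times.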
Reconstruction Object Model
All persistent objects are managed by CARF.
Physics Modules access them through standard C++ pointers
CMS Reconstructed Objects
[Diagram: RecEvent, S-Track Reconstructor, S Track objects, TrackSecInfo, TrackConstituents, Vector of RHits]
Reconstructed Objects produced by a given “algorithm” are managed by a Reconstructor.
A Reconstructed Object (Track) is split into several independent persistent objects to allow their clustering according to their access patterns (physics analysis, reconstruction, detailed detector studies, etc.).
The top level object acts as a proxy.
Intermediate reconstructed objects (RHits) are cached by value into the final objects.
[Diagram labels: “rec”, “esd”, “aod”]
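The split of a Track into independent parts behind a top-level proxy can be sketched in C++ as follows. This is an illustration only: the data members, the use of std::shared_ptr in place of database references, and the mapping of the parts to the “aod”, “esd” and “rec” labels are assumptions, not the CARF implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical hit type; the real RHit content is not given in the slides.
struct RHit { double x, y, z; };

// Detailed part (assumed "rec" level): hits are cached by value.
struct TrackConstituents { std::vector<RHit> hits; };

// Secondary information (assumed "esd" level).
struct TrackSecInfo { double chi2; int ndof; };

// Top-level object (assumed "aod" level): holds the compact summary and
// acts as a proxy for the other parts. In a real object store the two
// shared_ptr members would be persistent references, so each part could
// be clustered and read separately.
class Track {
public:
    Track(double pt,
          std::shared_ptr<TrackSecInfo> sec,
          std::shared_ptr<TrackConstituents> cons)
        : pt_(pt), sec_(std::move(sec)), cons_(std::move(cons)) {
        assert(sec_ && cons_);  // this sketch requires both parts
    }

    // Summary access touches only the top-level object.
    double pt() const { return pt_; }

    // These calls follow the links to the separately stored parts.
    const TrackSecInfo& secInfo() const { return *sec_; }
    std::size_t nHits() const { return cons_->hits.size(); }

private:
    double pt_;
    std::shared_ptr<TrackSecInfo> sec_;
    std::shared_ptr<TrackConstituents> cons_;
};
```

A physics analysis that only calls pt() never dereferences the links, which is the access pattern the clustering is meant to exploit.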
CARF2000 Event Structure
CMS Event Structure
[Diagram: Run, EventCollection, RawEvent, RecEvent]
In case of re-reconstruction the original structure is kept.
Event objects are cloned and new collections created
[Diagram legend: Persistent, Transient]
Physical clustering
CMS needs a real DBMS
An experiment lasting 20 years cannot rely just on ASCII files and file systems for its production bookkeeping, “condition” database, etc.
Even today at LEP, the management of all real and simulated data-sets (from raw-data to n-tuples) is a major enterprise
Multiple models used (DST, N-tuple, HEPDB, FATMEN, ASCII)
A DBMS is the modern answer to such a problem
An ODBMS provides a coherent and scalable solution for managing all kinds of data: seamless integration with OO languages, internal navigation capability
CMS Experience
CMS has used Objectivity/DB for the current prototype activity in close contact with IT in the context of the RD45 project
Database Developers (just OO and C++): designing and implementing persistent classes is not harder than for native C++ classes.
Physics Software Developers (do not see Objectivity): persistent objects are accessed using standard C++; the same code can access either persistent or transient objects.
Framework (easy to manage DB): flexible and transparent distinction between logical associations and physical clustering; fully transparent I/O with performance essentially limited by the disk speed (random access).
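One way to picture "the same code can access either persistent or transient objects" is a handle that loads its object on first use and then hands out a standard C++ pointer. The Ref template, the Cluster type and the calibrated function below are hypothetical and only sketch the idea; they are not the Objectivity/DB or CARF API.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical physics object.
struct Cluster { double energy; };

// Hypothetical handle standing in for a persistent reference: the object
// is fetched by the loader on first access and cached afterwards.
template <class T>
class Ref {
public:
    explicit Ref(std::function<std::unique_ptr<T>()> loader)
        : loader_(std::move(loader)) {}

    // Returns a standard pointer, loading the object if needed.
    const T* get() const {
        if (!obj_) obj_ = loader_();
        assert(obj_);  // this sketch requires the loader to succeed
        return obj_.get();
    }
    const T* operator->() const { return get(); }
    const T& operator*() const { return *get(); }

    bool loaded() const { return obj_ != nullptr; }

private:
    std::function<std::unique_ptr<T>()> loader_;
    mutable std::unique_ptr<T> obj_;
};

// Physics code sees only a standard C++ pointer and does not know
// whether the object is transient or came from the store.
inline double calibrated(const Cluster* c, double scale) {
    return c->energy * scale;
}
```

The function calibrated works unchanged on the address of a local Cluster and on the pointer returned by Ref<Cluster>::get(), which is the property the slide describes.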
CMS Experience
Administration (essentially file management): very flexible file-level management (localization, archival, replication) using AMS features; several tools available to monitor activities and performance; file size overhead (5% for realistic CMS object sizes) not larger than for other “products”.
Physicists (easy to use): personal databases are invaluable and in common use; analysis performance and flexibility are improved by shallow (link) and deep (data) local copies of a selected event sample; use the same type of event catalog as production; the framework and CMS tools hide all details.
All our tests show that Objectivity/DB can satisfy CMS requirements in terms of performance, scalability and flexibility for all kinds of data
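The difference between a shallow (link) and a deep (data) copy of a selected event sample can be sketched in C++ as follows. The Event fields, the EventCollection alias and both function names are hypothetical; std::shared_ptr stands in for a database link.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical event summary; the real event content is far richer.
struct Event { int id; double missingEt; };

using EventPtr = std::shared_ptr<Event>;
using EventCollection = std::vector<EventPtr>;

// Shallow (link) copy: the new collection refers to the same event
// objects as the source, so no event data is duplicated.
template <class Pred>
EventCollection shallowCopy(const EventCollection& src, Pred select) {
    EventCollection out;
    for (const auto& e : src) {
        assert(e);  // this sketch assumes no null entries
        if (select(*e)) out.push_back(e);
    }
    return out;
}

// Deep (data) copy: each selected event is duplicated, so the new
// collection is independent of the source.
template <class Pred>
EventCollection deepCopy(const EventCollection& src, Pred select) {
    EventCollection out;
    for (const auto& e : src) {
        assert(e);
        if (select(*e)) out.push_back(std::make_shared<Event>(*e));
    }
    return out;
}
```

A shallow copy is cheap and stays tied to the source data, while a deep copy costs storage but can live in a personal database detached from the production store.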
Alternatives: other ODBMS
Versant is a viable commercial alternative to Objectivity: do we have time to build an effective partnership (e.g. MSS interface)?
Espresso (by IT/DB) should be able to produce a fully fledged ODBMS in a couple of years once the proof-of-concept prototype is ready
Migrate CARF from Objectivity to another ODBMS: we expect that it would take about one year; it will not affect the basic principles of CMS software architecture and data model; it will involve only the core CARF development team; it will not disrupt production and physics analysis.
Alternatives: ORDBMS
ORDBMS (relational DB with OO interface) are appearing on the market. Up to now they looked targeted at those who already have a relational system and wish to make a transition to OO
A new ORACLE product has all the appearances of a fully fledged ODBMS
IT/DB is in the process of evaluating this new product as an event store. If it looks promising, CMS will join this evaluation next year.
We will consider the impact of ORDBMS on the CMS Data Model and on migration effort before the end of 2001
Fallback Solution: Hybrid Models
We believe that this solution could seriously compromise our ability to perform our physics program competitively
(R)DBMS for Event Catalog, Calibration, etc.
Object-Stream files for event data
Ad-hoc networked data-server and MSS interface
Less flexible: rigid split between DBMS and event data; one-way navigation from DBMS to event data
More complex: two different I/O systems; more effort to learn; more resources for developing and maintaining our application software
This approach will be used by several experiments at BNL and FermiLab (RDBMS not directly accessible from user applications)
CMS is closely following these experiences.
Conclusion
CMS has chosen to follow a uniform and coherent approach for the development of Experiment-Data Analysis Software
Today a Functional Prototype exists and includes:
A modular Object Oriented Framework
A Service and Utility Toolkit
A Persistent Object Service based on Objectivity/DB
Specialized applications for DAQ, Simulation, Reconstruction and Visualization
A set of plug-in modules for detector and physics simulation, reconstruction and analysis
CMS is currently reviewing the present architecture, the software design and the technical choices to prepare for the next software development cycle