Irina Sourikova Brookhaven National Laboratory for the PHENIX collaboration Migrating PHENIX databases from object to relational model

Irina Sourikova

Brookhaven National Laboratory for the

PHENIX collaboration

Migrating PHENIX databases from object to relational model

Sept 27 CHEP’04 Interlaken, CH Irina Sourikova 2

Introduction

PHENIX, one of the two large experiments at RHIC, produces hundreds of TB of data per year

In four years of running, PHENIX accumulated tens of GB of calibration (condition) data, which used to be archived in an Objectivity database

For a variety of reasons (licensing and compiler issues among them) the decision was made to change the underlying storage technology and use an open source RDB instead of a proprietary OODB

Main constraints: avoid any downtime for production, and provide backward compatibility by migrating the old Objectivity-based calibration data to the RDB of choice

Where do we store data if not in Objy?

One option is to store metadata in an RDB and data in flat files (STAR)

Another option is to store calibrations in BLOBs (Binary Large Objects). PHOBOS keeps its calibration data in BLOBs in Oracle

Data consistency, data replication and performance considerations led us to store the calibration data itself, not only the metadata, in the database

PostgreSQL was chosen as the RDBMS

What’s involved in the database transition

Design a relational schema that supports our data and queries on it.

Migrate large amounts of old Objectivity-based data to the new DB. That requires I/O from objects in memory to tables in the RDB

Preserve the existing API by providing a new implementation

Calibration data

Our calibrations differ widely in shape and size but have the same structure: they are arrays ("banks") of individual channels

Example: a lookup table for slewing corrections for one PMT in the ZDC can be a channel

A bank is a unit of information which is stored and retrieved based on validity ranges. For example, all PMTs in the ZDC form a bank

[Figure: PHENIX ZDC]
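The bank/channel structure described above can be sketched in a few lines of Python. This is only an illustration: the class names, the field names and the use of integer timestamps for validity ranges are assumptions made here, not the PHENIX calibration classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Channel:
    """One calibration channel, e.g. a slewing lookup table for one PMT."""
    values: List[float]


@dataclass
class Bank:
    """An array of channels that is valid for a range of times."""
    name: str
    valid_from: int    # start of validity (hypothetical integer timestamp)
    valid_until: int   # end of validity, exclusive
    channels: List[Channel] = field(default_factory=list)

    def is_valid_at(self, t: int) -> bool:
        return self.valid_from <= t < self.valid_until


def find_bank(banks: List[Bank], name: str, t: int) -> Optional[Bank]:
    """Return the first bank with this name that is valid at time t, or None."""
    for bank in banks:
        if bank.name == name and bank.is_valid_at(t):
            return bank
    return None
```

The bank, not the channel, is the unit that gets looked up: a query names a bank and a time, and the whole array of channels comes back.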

Relational schema

Most direct approach - map channel data members to columns in a table

Makes data transparent, suitable for Web display

Why this doesn’t work for us:

RDBs have a limit on the number of columns, and the large size of some channels makes this approach problematic

One possibility could be to use the PostgreSQL array type to store a bank, but the array implementation is not optimized for big array sizes. Moreover, other RDBs do not support the array type

I/O is still a problem

BLOBs

Another approach is to store calibration banks in BLOBs (Binary Large Objects) and calibration metadata as simple types

Solves the I/O problem - ROOT I/O can be used to serialize banks into BLOBs, and RDBC (ROOT DataBase Connectivity) to send BLOBs to the database

Makes rewriting of the calibration DB interface easy

Allows fast index-based calibration retrieval

The only thing we lose is "transparency", Web display
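To make the idea of "serialize a bank into a BLOB" concrete, here is a minimal round-trip sketch. In the talk the serialization is done by ROOT I/O; Python's struct module is used here purely as a stand-in, and the byte layout and the function names pack_bank and unpack_bank are assumptions of this example.

```python
import struct


def pack_bank(channels):
    """Serialize a list of channels (each a list of floats) into bytes.

    Layout (an assumption of this sketch): channel count, then for each
    channel its length followed by its values as little-endian doubles.
    """
    parts = [struct.pack("<I", len(channels))]
    for values in channels:
        parts.append(struct.pack("<I", len(values)))
        parts.append(struct.pack(f"<{len(values)}d", *values))
    return b"".join(parts)


def unpack_bank(blob):
    """Inverse of pack_bank: rebuild the list of channels from bytes."""
    offset = 0
    (n_channels,) = struct.unpack_from("<I", blob, offset)
    offset += 4
    channels = []
    for _ in range(n_channels):
        (n_values,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        channels.append(list(struct.unpack_from(f"<{n_values}d", blob, offset)))
        offset += 8 * n_values
    return channels
```

The database only ever sees an opaque byte string, which is why channels of very different shape and size fit the same column, and also why the content is no longer "transparent" to SQL or a Web display.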

Final relational schema

Decided to proceed with BLOBs

Each Objy db mapped to a relational db table

Each object in an Objy container mapped to a row in a table

Each calibration header data member mapped to a column in a table

All tables have the same schema: Metadata | BLOBptr
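A table of this shape, and the validity-based lookup it supports, might look like the sketch below. SQLite (from the Python standard library) stands in for PostgreSQL so that the example runs anywhere. The column names valid_from, valid_until and description and the table names are hypothetical: the slides say only that header data members become columns and that every table shares one schema.

```python
import sqlite3


def _check_name(table):
    # Table names cannot be bound as SQL parameters, so validate them here.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")


def create_calibration_table(conn, table):
    """Create one calibration table; every table gets the same schema."""
    _check_name(table)
    conn.execute(
        f"CREATE TABLE {table} ("
        " id INTEGER PRIMARY KEY,"
        " valid_from INTEGER NOT NULL,"   # hypothetical metadata column
        " valid_until INTEGER NOT NULL,"  # hypothetical metadata column
        " description TEXT,"              # hypothetical metadata column
        " bank BLOB NOT NULL)"            # the serialized bank
    )
    conn.execute(
        f"CREATE INDEX {table}_validity ON {table} (valid_from, valid_until)"
    )


def store_bank(conn, table, valid_from, valid_until, description, blob):
    """Insert one bank (one former Objy object) as one row."""
    _check_name(table)
    conn.execute(
        f"INSERT INTO {table} (valid_from, valid_until, description, bank)"
        " VALUES (?, ?, ?, ?)",
        (valid_from, valid_until, description, blob),
    )


def fetch_bank(conn, table, t):
    """Return the most recently stored bank valid at time t, or None."""
    _check_name(table)
    row = conn.execute(
        f"SELECT bank FROM {table}"
        " WHERE valid_from <= ? AND ? < valid_until"
        " ORDER BY id DESC LIMIT 1",
        (t, t),
    ).fetchone()
    return None if row is None else bytes(row[0])
```

Because only the metadata columns are indexed and queried, the lookup cost does not depend on how large or oddly shaped the bank inside the BLOB is.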

Software layers

A couple of months were spent installing and testing new software

After fixing a few bugs, we adopted the following:

RDBC - talks to RDBs from ROOT

libodbc++ - C++ library for accessing RDBs, runs on top of ODBC, simplifies the code

unixODBC - free ODBC interface

psqlodbc - official PostgreSQL ODBC driver

[Diagram: software stack, user side first: User application, PhenixDB API, RDBC, libodbc++, unixODBC, psqlodbc, DB]

New calibration API implementation

The top calibration abstract base class inherits from TObject in order to use the RDBC method SetObject(int, TObject *)

A ClassDef macro was added to calibration headers to equip calibration classes with streamers

The new calibration DB API was made ODBC-compliant to ease possible future technology changes

Data migration code was generated by a Perl script that takes the Objy db name and the calibration class name as arguments

One new method was introduced to benefit from the finer commit granularity available in RDBs
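The slide does not name the new method or describe its semantics, so the following is only an illustration of what finer commit granularity can buy: each bank is written in its own transaction, and a failure rolls back that one bank rather than everything written so far. SQLite stands in for PostgreSQL, and the table layout and function names are assumptions of this sketch.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a minimal, hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE calib (name TEXT NOT NULL UNIQUE, bank BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def store_banks_individually(conn, banks):
    """Insert each (name, blob) pair in its own transaction.

    Returns the names that were committed. A bank that violates a
    constraint is rolled back on its own; earlier banks stay committed.
    """
    stored = []
    for name, blob in banks:
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT INTO calib (name, bank) VALUES (?, ?)",
                    (name, blob),
                )
            stored.append(name)
        except sqlite3.IntegrityError:
            pass  # only this bank is lost
    return stored
```

Called with a list in which one entry has a missing BLOB, the function commits the valid entries and skips the bad one.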

Old data transfer

A clone of the Objy federation was made and its schema evolved to reflect a change in the inheritance hierarchy (all calibration classes got TObject as a parent)

A CVS branch was created for code development against the new replica Objy federation

About 13 GB of old data were transferred from Objy to Postgres, which took a few days

Validating new framework

Validating the new framework took a lot of time due to very active code development and Objectivity updates

Non-atomic CVS operations (tagging the code) added to the complexity of comparing reconstruction output in the old and new frameworks

After byte-by-byte comparisons, Postgres-based calibrations are now used in production
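A byte-by-byte comparison of two output files can be sketched as follows. The slides do not say which tool was used; the function name first_difference and the chunked reading are choices made for this example.

```python
def first_difference(path_a, path_b, chunk_size=1 << 16):
    """Return the offset of the first differing byte, or None if identical.

    If one file is a strict prefix of the other, the offset returned is
    the length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the common part: lengths differ.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended at the same point
            offset += len(chunk_a)
```

Reporting the offset rather than a plain yes/no makes it easier to tell a real calibration difference from, say, a timestamp embedded in a file header.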

Database replication

Due to PostgreSQL source code availability and ease of administration, it was not very hard to install local database servers at 6 off-site institutions and make them slave databases

This was possible without synchronizing compiler versions and paying license fees

PHENIX can run reconstruction and simulations at more sites than before

Summary

Objectivity/DB has not been used in PHENIX production since July 2004

The transition from Objectivity to Postgres was relatively transparent to the Collaboration and took about 1 year of 1 FTE

Newly adopted software saved code development time, but now we must pay a maintenance price

Web display with BLOBs requires more work, but is possible

Many thanks to Laurent Aphecetche, Saskia Mioduszewski, Chris Pinkenburg and Martin Purschke for their help