Acat 2000 18 October Rene Brun 1 Future of Analysis Environments Personal views Rene Brun CERN

Future of Analysis Environments Personal views

Page 1: Future of Analysis Environments Personal views

Acat 2000 18 October Rene Brun 1

Future of Analysis Environments
Personal views

Rene Brun
CERN

Page 2: Future of Analysis Environments Personal views

Data
  • Type of data? Any type? PAW-like ntuple?

Analysis
  • Restricted to histogramming & visualisation?

Packages
  • Structure? What is modularity? Abstract interfaces? Languages? Parallelism?

No restrictions

Coherent Framework of Cooperating systems

[Diagram labels: I/O + UI; I/O + UI; Object Bus]

Page 3: Future of Analysis Environments Personal views

Type of Data in the past

  • Event data managed by data structure (bank) managers (ZEBRA, BOS, ...)
      • a bank is like an object
  • Final physics data in ntuple format (PAW)
      • an ntuple is like a table in an RDBMS
  • Run/File catalog with ad hoc tools (FATMEN)
  • Calibrations, geometry, etc. with ad hoc tools (HEPDB)

Page 4: Future of Analysis Environments Personal views

Type of data: trends-1

  • Put everything in an object database like Objectivity
      • Choice of the RD45 project
      • Many experiments initially followed this line
      • Abandoned by most experiments recently
      • Interesting experience with Babar
      • Solution not suited for PAW-like analysis

Page 5: Future of Analysis Environments Personal views

Type of data: trends-2

  • Put write-once data in an object store like ROOT in Streamer mode
  • Use an RDBMS for:
      • Run/Event catalogs
      • Geometry, calibrations
      • e.g. with the ROOT <-> Oracle interface
        http://www.phenix.bnl.gov/WWW/publish/onuchin/RDBC/
      • or with the ROOT <-> Objectivity interface
        http://www.phenix.bnl.gov/WWW/publish/onuchin/rooObjy/
  • Use ROOT split/no-split mode for physics analysis
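
The practical difference between split and no-split storage can be illustrated without ROOT. In ROOT, split mode writes each data member to its own branch, while no-split mode writes the object as a whole. The sketch below is only a standard-library analogy, not ROOT I/O code: the `Event` layout and the names `SplitEvents`, `sumPxUnsplit` and `sumPxSplit` are hypothetical. It counts how many bytes have to be visited to sum one member in each layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event with three data members.
struct Event {
    double px;
    double py;
    double pz;
};

// "Split" layout: one contiguous column per data member.
struct SplitEvents {
    std::vector<double> px, py, pz;

    void push_back(const Event& e) {
        px.push_back(e.px);
        py.push_back(e.py);
        pz.push_back(e.pz);
    }
};

// "No-split" layout: whole objects are stored together, so reading one
// member means visiting every stored object in full.
double sumPxUnsplit(const std::vector<Event>& events, std::size_t& bytesTouched) {
    double sum = 0.0;
    for (const Event& e : events) sum += e.px;
    bytesTouched = events.size() * sizeof(Event);
    return sum;
}

// Split layout: only the px column is visited.
double sumPxSplit(const SplitEvents& events, std::size_t& bytesTouched) {
    assert(events.px.size() == events.py.size());
    double sum = 0.0;
    for (double v : events.px) sum += v;
    bytesTouched = events.px.size() * sizeof(double);
    return sum;
}
```

Both functions return the same sum, but the split version visits one third of the bytes for this three-member `Event`, which is why a column-wise layout suits analyses that look at a few variables over many events.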

Page 6: Future of Analysis Environments Personal views

Framework basic requirements

  • Dynamic linking AND unlinking of user shared libs
  • User can define new classes interactively
  • Interpreted code can call compiled code
      • This is the normal operation mode
  • Compiled code can call interpreted code
      • Interesting feature for GUIs & event displays
  • Scripts can be dynamically compiled/linked
      • Script Compiler: Root > .x file.C++
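
The two calling directions listed on this slide can be pictured with a toy dispatcher. This is not how the ROOT interpreter works internally; it only shows the pattern, and `CommandTable`, `registerCompiled`, `unregister`, `call` and `applyTwice` are hypothetical names. A string command (standing in for interpreted input) is resolved at run time to a compiled function, and a compiled routine calls back into a command that it knows only by name.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

class CommandTable {
public:
    using Function = std::function<double(double)>;

    // Make a compiled function reachable by name.
    void registerCompiled(const std::string& name, Function f) {
        assert(static_cast<bool>(f));  // an empty std::function cannot be called
        table_[name] = std::move(f);
    }

    // Remove a function again, loosely mirroring "unlinking".
    void unregister(const std::string& name) { table_.erase(name); }

    // "Interpreted" side: resolve the name at run time and call it.
    double call(const std::string& name, double x) const {
        auto it = table_.find(name);
        if (it == table_.end())
            throw std::runtime_error("unknown command: " + name);
        return it->second(x);
    }

private:
    std::map<std::string, Function> table_;
};

// Compiled side calling back into a command that is only known by name.
double applyTwice(const CommandTable& table, const std::string& name, double x) {
    return table.call(name, table.call(name, x));
}
```

Because lookups happen on every call, a function can be replaced or removed while the program runs, which is the property the first requirement (linking and unlinking) asks for.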

Page 7: Future of Analysis Environments Personal views

Fundamental features of an Object-Oriented Framework

[Diagram labels: Functions; Data; DDL; KUIP; CDF; Data; Functions; RTTI; Persistency services; User Interface; Procedural World; OO World; ROOT / C++; C++ / Java]

Page 8: Future of Analysis Environments Personal views

Page 9: Future of Analysis Environments Personal views

Page 10: Future of Analysis Environments Personal views

Automatic Code generation

  • Hand-written code: algorithms
  • Automatically generated code (40 per cent in ROOT): meta information
  • Used by I/O, GUI, inspectors, browsers, interpreter, html, etc.
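
To make the role of meta information concrete, here is a small stand-in for a generated dictionary. It is written by hand for the sketch, whereas the slide refers to code produced automatically; `MemberInfo`, `Track`, `trackDictionary` and `inspect` are hypothetical names, and only `double` members are handled. The point is that a generic service such as an inspector can walk an object using the description alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Description of one data member: the kind of record a generator would emit.
struct MemberInfo {
    std::string name;
    std::size_t offset;  // byte offset of the member inside the object
};

// Hypothetical user class (standard layout, so offsetof is well defined).
struct Track {
    double px;
    double py;
    double charge;
};

// Hand-written stand-in for an automatically generated dictionary.
std::vector<MemberInfo> trackDictionary() {
    return {
        {"px", offsetof(Track, px)},
        {"py", offsetof(Track, py)},
        {"charge", offsetof(Track, charge)},
    };
}

// Generic inspector: relies only on the meta information, never on Track.
std::string inspect(const void* object, const std::vector<MemberInfo>& dictionary) {
    assert(object != nullptr);
    std::ostringstream out;
    for (const MemberInfo& member : dictionary) {
        double value = 0.0;
        std::memcpy(&value,
                    static_cast<const unsigned char*>(object) + member.offset,
                    sizeof value);
        out << member.name << "=" << value << ";";
    }
    return out.str();
}
```

The same table could drive a writer, a browser or an HTML documentation generator, which is why one generated description can serve the many consumers listed on the slide.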

Page 11: Future of Analysis Environments Personal views

Java - ROOT interface(s)

  • Read ROOT files from a Java program
      • see Tony Johnson
      • will be simpler with the new ROOT 2.26 supporting automatic schema evolution
  • Call ROOT classes from a Java program
      • work by Subir Sarkar (hand-coded JNI interface)
      • could use JACO (see Tony Johnson)
      • or better, use a variant of rootcint (rootjava)
  • Generate ROOT-Java data classes
      • TTree::MakeJava like TTree::MakeClass

Page 12: Future of Analysis Environments Personal views

Java - ROOT interface(s)

import root.*;

TROOT troot = new TROOT("simple", "Simple Java to root interface");

TApplication app = new TApplication("ROOT Application");
System.out.println("TApplication .....");

TBenchmark bench = new TBenchmark();
bench.Start("Hsum");

TRandom random = new TRandom();

TH1F total = new TH1F("total", "total distribution", 100, -4.0F, 4.0F);
TH1F main = new TH1F("main", "Main contributor", 100, -4.0F, 4.0F);
TH1F s1 = new TH1F("s1", "first signal", 100, -4.0F, 4.0F);
TH1F s2 = new TH1F("s2", "second signal", 100, -4.0F, 4.0F);

total.Sumw2(); // this makes sure that the sum of squares of weights will be stored
total.SetMarkerStyle(21);
total.SetMarkerSize(0.7F);
main.SetFillColor(16);
s1.SetFillColor(42);
s2.SetFillColor(46);

TCanvas canvas = new TCanvas("c1", "The HSUM example", 200, 10, 600, 400);
canvas.SetGrid();

and so on.

Page 13: Future of Analysis Environments Personal views

Java - ROOT interface(s)

  • It is important to cooperate to facilitate the Java/C++ integration
  • Could be interesting for applications where performance is not an issue (event display)
  • However, I do not believe in a solution where the bulk of data is stored as C++ objects and analyzed with a Java-based system.
      • It must be fun, but very inefficient
      • What do you gain?

Page 14: Future of Analysis Environments Personal views

Languages for data analysis

  • Data analysis requires efficient access to objects (both data and functions).
  • It requires a powerful programming language:
      • in interpreted mode
      • in compiled mode
  • Transition from interpreted mode to compiled mode must be smooth and transparent.
  • A scripting language is not the solution
      • Python is not a solution

Page 15: Future of Analysis Environments Personal views

[Diagram labels: GUI; Commands; Interpreted scripts; Compiled scripts]

Page 16: Future of Analysis Environments Personal views

A role for commercial components?

  • Data bases: Oracle very likely, others NO
  • Graphics/UI: NO, but YES for interfaces to commercial systems
  • Special algorithms like fitting: strong doubts
  • I strongly believe in the advantages of Open Source systems
      • Large news/discussion groups

Page 17: Future of Analysis Environments Personal views

Our current work

  • Continuous consolidation of the system
  • Automatic schema evolution
  • Common GUI between Unix and Windows
  • Upgrade UI to new style GUI
  • Tree query processor reimplemented using the new TSelector facility.
  • PROOF (Parallel ROOT Facility) (see next)
  • Interface with other systems, e.g. G3, G4
  • Support thousands of users
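
The idea of a selector-driven query processor can be sketched as a generic event loop that calls user hooks. The interface below is a simplified, hypothetical `MiniSelector`, not ROOT's actual TSelector class; its method names and signatures are assumptions chosen for the sketch, as are `processTree` and `HighPtSelector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Event { double pt; };

// Hypothetical selector interface: the framework drives the loop,
// the user supplies the cut and the fill step.
class MiniSelector {
public:
    virtual ~MiniSelector() = default;
    virtual void begin() {}
    virtual bool processCut(const Event& event) = 0;
    virtual void processFill(const Event& event) = 0;
    virtual void terminate() {}
};

// Framework side: one generic loop serves every query.
void processTree(const std::vector<Event>& tree, MiniSelector& selector) {
    selector.begin();
    for (const Event& event : tree) {
        if (selector.processCut(event)) selector.processFill(event);
    }
    selector.terminate();
}

// User side: count events above a pt threshold and average their pt.
class HighPtSelector : public MiniSelector {
public:
    explicit HighPtSelector(double threshold) : threshold_(threshold) {
        assert(threshold >= 0.0);
    }
    void begin() override { count_ = 0; sum_ = 0.0; mean_ = 0.0; }
    bool processCut(const Event& event) override { return event.pt > threshold_; }
    void processFill(const Event& event) override { ++count_; sum_ += event.pt; }
    void terminate() override {
        if (count_ > 0) mean_ = sum_ / static_cast<double>(count_);
    }
    std::size_t count() const { return count_; }
    double mean() const { return mean_; }

private:
    double threshold_;
    std::size_t count_ = 0;
    double sum_ = 0.0;
    double mean_ = 0.0;
};
```

Because the loop lives in the framework and not in user code, the same selector object could in principle be handed to a different driver, for example one that processes several data partitions at once.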

Page 18: Future of Analysis Environments Personal views

The OODBMS dreams

[Diagram labels: Selection Parameters; CPU; Local; Remote; OODB; Federation; DB1, DB2, DB3, DB4, DB5, DB6]

Page 19: Future of Analysis Environments Personal views

ROOT/PROOF and GRIDs

[Diagram labels: Selection Parameters; Procedure; Proc.C (repeated five times); Local; Remote; PROOF; CPU (repeated six times); DB1, DB2, DB3, DB4, DB5, DB6; TagDB; RDB]
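
The labels of the ROOT/PROOF diagram (one procedure, Proc.C, appearing next to several CPUs and several DBs) suggest a partition, process and merge pattern. A rough standard-library sketch of that pattern follows. It is not PROOF code: `proc` merely stands in for Proc.C, and `Partition`, `Histogram`, `addInto` and `runParallel` are hypothetical names. The same procedure runs on every data partition and only the small partial histograms are merged.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <system_error>
#include <vector>

using Histogram = std::array<long, 4>;  // 4 unit-width bins covering [0, 4)
using Partition = std::vector<double>;  // stands in for the data of one DBn

// The procedure sent to every worker (the role played by Proc.C).
Histogram proc(const Partition& data) {
    Histogram h{};
    for (double x : data) {
        if (!(x >= 0.0 && x < 4.0)) continue;  // outside the histogram range
        const std::size_t bin = static_cast<std::size_t>(x);
        assert(bin < h.size());
        ++h[bin];
    }
    return h;
}

// Merge step: add a partial histogram into the running total.
void addInto(Histogram& total, const Histogram& partial) {
    for (std::size_t i = 0; i < total.size(); ++i) total[i] += partial[i];
}

// Run proc on every partition, on separate threads where possible.
Histogram runParallel(const std::vector<Partition>& partitions) {
    Histogram total{};
    std::vector<std::future<Histogram>> workers;
    for (const Partition& p : partitions) {
        try {
            workers.push_back(std::async(std::launch::async, proc, std::cref(p)));
        } catch (const std::system_error&) {
            // No thread could be started: process this partition in the caller.
            addInto(total, proc(p));
        }
    }
    for (std::future<Histogram>& w : workers) addInto(total, w.get());
    return total;
}
```

Only four counters per worker cross the merge boundary regardless of how large each partition is, which is the property that makes this pattern attractive when the data are remote.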

Page 20: Future of Analysis Environments Personal views

What is a modular system?

Modularity is a nice word. Everybody claims to be modular.

  • a system with many small and independent modules?
      • where is the object bus?
      • what is the cost of assembling all the pieces in a real application?
  • a hierarchical system with easily replaceable components?
      • but with many internal dependencies

Page 21: Future of Analysis Environments Personal views

What is a modular system?

  • a system with well defined interfaces?
      • where is the object bus?
      • passing data by reference or value?
      • Collections/Folders?
  • a system easy to understand (user view)?
      • end users like monolithic systems doing everything
  • a system easy to maintain (developer view)?
  • a system that can easily be integrated into other systems?
  • a theoretical system and no implementation?

Modularity is difficult to achieve in a growing system.

Page 22: Future of Analysis Environments Personal views

Modularity and Dependencies in ROOT

By dependency, we mean binary dependency, when one module (shared library) forces the loading of another library. In the past this was a weak point of the system. For example, if you wanted to produce some histograms in a batch program, you were required to link your application with all ROOT graphics libs up to X11, as with PAW.

This problem was rightly pointed out by many users as something to be fixed. We did this. In the current system only a small set of base libraries is needed when creating e.g. histograms in batch mode. Besides the decoupling of the graphics system, many more abstract layers were introduced to decouple other parts of the system: the histogram from its painter, the tree storage system from its query mechanism (treeplayer), fitting from minuit, etc. Following this reorganization, none of the lower level libraries depends any more on higher level libraries. Besides modularity, these changes also improved overall system performance.
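
The decoupling of a histogram from its painter can be pictured as follows. This is a schematic with hypothetical classes (`Hist1D`, `HistPainter`, `TextPainter`), not ROOT's actual class design: the histogram code refers only to an abstract painter interface, so a batch program that never draws needs no concrete graphics code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Hist1D;

// Abstract layer: the only graphics-related type the histogram knows about.
class HistPainter {
public:
    virtual ~HistPainter() = default;
    virtual std::string paint(const Hist1D& hist) const = 0;
};

// Lower-level module: filling works with no painter present.
class Hist1D {
public:
    Hist1D(std::size_t nbins, double low, double high)
        : bins_(nbins, 0), low_(low), high_(high) {
        assert(nbins > 0 && high > low);
    }

    void fill(double x) {
        if (!(x >= low_ && x < high_)) return;  // ignore out-of-range values
        const double width = (high_ - low_) / static_cast<double>(bins_.size());
        const std::size_t bin = static_cast<std::size_t>((x - low_) / width);
        ++bins_[std::min(bin, bins_.size() - 1)];  // guard against rounding
    }

    const std::vector<long>& bins() const { return bins_; }

    // Drawing is delegated; the caller decides which painter to supply.
    std::string draw(const HistPainter& painter) const { return painter.paint(*this); }

private:
    std::vector<long> bins_;
    double low_;
    double high_;
};

// Higher-level module: could live in a separate library loaded on demand.
class TextPainter : public HistPainter {
public:
    std::string paint(const Hist1D& hist) const override {
        std::string out;
        for (long n : hist.bins())
            out += std::string(static_cast<std::size_t>(n), '*') + "\n";
        return out;
    }
};
```

A batch job would use `Hist1D` alone. Only code that calls `draw` has to bring in a concrete painter, so the dependency points from the higher-level module to the lower-level one and never the other way round.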

Page 23: Future of Analysis Environments Personal views

Page 24: Future of Analysis Environments Personal views

Page 25: Future of Analysis Environments Personal views

ALICE 13/3/2000 Software Panel Computing Review 6

[Figure labels: Relative Importance; Typically 5 years between alpha release and mature product]

Page 26: Future of Analysis Environments Personal views

ROOT Quality assurance

Page 27: Future of Analysis Environments Personal views

A growing user base

Page 28: Future of Analysis Environments Personal views

Summary

  • We are implementing a powerful system designed for large scale data analysis with parallel architectures in a GRID context.
  • The ROOT system is a framework providing a coherent object bus in DAQs, simulation, reconstruction and analysis phases.
  • We have learnt a lot in the past 5 years, also following our 10 years of experience with PAW.
  • Developing the system and at the same time supporting a rapidly growing user base is a demanding but also rewarding job.