CERN 2001 Summer Student Lectures Computing at CERN Lecture 1 — Looking Around Tony Cass — [email protected]


CERN

2001 Summer Student Lectures

Computing at CERN

Lecture 1 — Looking Around

Tony Cass — [email protected]


Acknowledgements

The choice of material presented here is entirely my own. However, I could not have prepared these lectures without the help of
– Charles Granieri, Hans Grote, Mats Lindroos, Franck Di Maio, Olivier Martin, Pere Mato, Bernd Panzer-Steindel, Les Robertson, Stephan Russenschuck, Frank Schmidt, Archana Sharma and Chip Watson
who spent time discussing their work with me, generously provided material they had prepared, or both.

For their general advice, help, and reviews of the slides and lecture notes, I would also like to thank
– Marco Cattaneo, Mark Dönszelmann, Dirk Düllmann, Steve Hancock, Vincenzo Innocente, Alessandro Miotto, Les Robertson, Tim Smith and David Stickland.


Some Definitions

General
Computing Power
– CERN Unit
– MIPS
– SPECint92, SPECint95
Networks
– Ethernet
» Normal (10baseT, 10Mb/s)
» Fast (100baseT, 100Mb/s)
» Gigabit (1000Mb/s)
– FDDI
– HiPPI
bits and Bytes
– 1MB/s = 8Mb/s
Factors
– K=1024, K=1000

CERN
Interactive Systems
– Unix: WGS & PLUS
» CUTE, SUE, DIANE
– NICE
Batch Systems
– Unix: SHIFT, CSF
» CORE
– PCSF

Other
Data Storage, Data Access & Filesystems
– AFS, NFS, RFIO, HPSS, Objectivity[/DB]
CPUs
– Alpha, MIPS, PA-Risc, PowerPC, Sparc
– Pentium, Pentium II, Merced
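As a small illustration of the "bits and Bytes" and "Factors" definitions, the sketch below converts between MB/s and Mb/s and shows how far the two meanings of K drift apart as the prefix grows. The function names are hypothetical and exist only for this example.

```python
# Illustrative helpers; all names here are assumptions made for this sketch.

BITS_PER_BYTE = 8


def megabytes_to_megabits(rate_mbyte_per_s: float) -> float:
    """Convert a rate in MB/s (Bytes) to Mb/s (bits)."""
    return rate_mbyte_per_s * BITS_PER_BYTE


def megabits_to_megabytes(rate_mbit_per_s: float) -> float:
    """Convert a rate in Mb/s (bits) to MB/s (Bytes)."""
    return rate_mbit_per_s / BITS_PER_BYTE


def binary_over_decimal(power: int) -> float:
    """Ratio 1024**power / 1000**power (power=1 for K, 2 for M, ...)."""
    return 1024 ** power / 1000 ** power


if __name__ == "__main__":
    print(megabytes_to_megabits(1.0))    # 1MB/s expressed in Mb/s
    print(megabits_to_megabytes(100.0))  # Fast Ethernet line rate in MB/s
    for power, prefix in enumerate(["K", "M", "G", "T"], start=1):
        print(prefix, round(binary_over_decimal(power), 4))
```

The ratio printed in the loop is why it matters to say which K is meant: the gap is 2.4% for K but grows with each larger prefix.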


How to start?

Computing is everywhere at CERN!
– experiment computing facilities, administrative computing, central computing, many private clusters.

How should this lecture course be organised?
– From a rigorous academic standpoint?
– From a historical standpoint
– ...
– From a physics based viewpoint


Weekly use of Interactive Platforms, 1987-2001

[Figure: number of users each week (axis from 0 to 12000) plotted against week, for Windows 95, Windows NT, WGS and PLUS, CERNVM and VXCERN.]


Computing at CERN

Computing “purely for (experimental) physics” will be the focus of the remaining two lectures of this series. Leaving this area aside, other activities at CERN can be considered as falling into one of three areas:
– administration,
– technical and engineering activities, and
– theoretical physics.

We will take a brief look at some of the ways in which computing is used in these areas in the rest of this first lecture.


Technical and Engineering Computing

Engineers and physicists working at CERN must
– design,
– build, and
– operate
both
– accelerators and
– detectors
for experimental physicists to be able to collect the data that they need.

As in many other areas of engineering design, computer aided techniques are essential for the construction of today’s advanced accelerators and detectors.


Accelerator design issues

Oliver Brüning’s lectures will tell you more about accelerators. For the moment, all we need to know is that
– particles travelling in bunches around an accelerator are bent by dipole magnets and must be kept in orbit.
» Of course, they must be accelerated as well(!), but we don’t consider that here.

Important studies for LHC are
– magnet design
» how can we build the (superconducting) dipole magnets that are needed?
– transverse studies
» will any particles leave orbit? (and hit the magnets!)
– longitudinal studies
» how can we build the right particle bunches for LHC?
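To give a feel for why the dipole magnets need to be superconducting, the sketch below uses the standard relation between momentum, field and bending radius for a particle of unit charge, p [GeV/c] ≈ 0.2998 × B [T] × ρ [m]. This relation is not taken from the slides, and the example numbers (a 7000 GeV/c proton and a bending radius of about 2804 m) are approximate LHC design values assumed here purely for illustration.

```python
# Sketch only: the relation is textbook physics, the numbers are assumptions.

GEV_PER_TESLA_METRE = 0.299792458  # p [GeV/c] = 0.2998 * B [T] * rho [m], unit charge


def required_dipole_field(momentum_gev: float, bending_radius_m: float) -> float:
    """Field (tesla) needed to keep a unit-charge particle on the given radius."""
    if bending_radius_m <= 0:
        raise ValueError("bending radius must be positive")
    return momentum_gev / (GEV_PER_TESLA_METRE * bending_radius_m)


def bending_radius(momentum_gev: float, field_tesla: float) -> float:
    """Radius (metres) on which a unit-charge particle bends in the given field."""
    if field_tesla <= 0:
        raise ValueError("field must be positive")
    return momentum_gev / (GEV_PER_TESLA_METRE * field_tesla)


if __name__ == "__main__":
    # Assumed, approximate values: 7000 GeV/c protons, 2804 m bending radius.
    print(f"{required_dipole_field(7000.0, 2804.0):.2f} T")
```

With these assumed inputs the required field comes out at a little over 8 T, well beyond what conventional iron-dominated magnets deliver.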


LHC Magnet Design

[Figure: 2D field picture for LHC dipole coil.]

[Figure: 3D representation of dipole coil end with magnetic field vectors.]

Pictures generated with ROXIE.


Genetic Algorithms for Magnet Design

[Figure: Original coil design.]

[Figure: New coil design found using a genetic algorithm. This was further developed using deterministic methods and replaced the original design.]

[Figure: Genetic Algorithm convergence plot.]

The algorithm is designed to come up with a number of alternative solutions which can then be further investigated.
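The idea of a genetic algorithm that returns several candidate solutions can be sketched as follows. This is a toy, not the method used for the LHC coils: the cost function is a stand-in with no magnet physics in it, and every name and parameter value is an assumption made for the example.

```python
import random


def toy_cost(params):
    """Stand-in objective (NOT a magnet model): squared distance from 0.5."""
    return sum((p - 0.5) ** 2 for p in params)


def evolve(cost, n_params=4, pop_size=30, generations=60, keep=5, seed=1):
    """Minimise `cost`; return the `keep` best candidates and the convergence history."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=cost)
        history.append(cost(population[0]))   # best cost in this generation
        parents = population[: pop_size // 2]  # selection: the better half survives
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)   # one-point crossover
            child = mum[:cut] + dad[cut:]
            i = rng.randrange(n_params)        # mutate one parameter, clamped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.05)))
            children.append(child)
        population = parents + children
    population.sort(key=cost)
    return population[:keep], history


if __name__ == "__main__":
    candidates, history = evolve(toy_cost)
    print("best cost per generation (first 5):", [round(h, 5) for h in history[:5]])
    for candidate in candidates:
        print([round(p, 3) for p in candidate], round(toy_cost(candidate), 6))
```

Returning the best few candidates rather than a single winner mirrors the point made on the slide: the alternatives are starting points for further study, for instance with deterministic methods.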


Longitudinal Studies

Not all particles in a bunch have the same energy. Studies of energy distribution show aspects of bunch shape.
– The energy of a particle affects its arrival time at the accelerating cavity… which then in turn affects the energy.

Need to measure both energy and arrival time, but can’t measure energy directly. Measuring arrival times is easy
– but difficult to interpret successive slices.

Tomography techniques lead to a complete picture
– like putting together X-ray slices through a person.


Bunch splitting at the PS


CERN and the World Wide Web

The World Wide Web started as a project to make information more accessible, in particular, to help improve information dissemination within an experiment.
– These aspects of the Web are widely used at CERN today. All experiments have their own web pages and there are now web pages dedicated to explaining Particle Physics to the general public.
– In a wider sense, the web is being used to distribute graphical information on system, accelerator and detector status. The release of Java has given a big push to these uses.

Web browsers are also used to provide a common interface, e.g.
» currently to the administrative applications, and
» possibly in future as a batch job submission interface for PCs.


2000-2001: What has changed? I

Windows 2000 has arrived and Wireless Ethernet is arriving.
– Portable PCs replacing desktops.
– Integration of home directory, web files, working offline makes things easier, just like AFS and IMAP revolutionised my life 8 years ago.

I now have ADSL at home rather than ISDN.
– I am now outside the CERN firewall when connected from home but this doesn’t matter so much with all my files cached on my portable.
» I just need to bolt on a wireless home network so I can work in the garden!
– The number of people connecting from outside the firewall will grow
» CERN will probably have to support Virtual Private Networks for privileged access
» And users will have to worry about securing their home network against hackers…


Looking Around: Summary

Computing extends to all areas of work at CERN.

In terms of CERN’s “job”, producing particle physics results, computing is essential for
– the design, construction and operation of accelerators and detectors, and
– theoretical studies, as well as
– the data reconstruction and analysis phases.

The major computing facilities at CERN, though, are provided for particle physics work and these will be the subject of the next two lectures.


2001 Summer Student Lectures

Computing at CERN

Lecture 2 — Looking at Data

Tony Cass — [email protected]


Data and Computation for Physics Analysis

[Figure: data-flow diagram with the elements detector, event filter (selection & reconstruction), raw data, event simulation, event reconstruction, processed data, event summary data, batch physics analysis, analysis objects (extracted by physics topic) and interactive physics analysis.]


Central Data Recording

CDR marks the boundary between the experiment and the central computing facilities.

It is a loose boundary which depends on an experiment’s approach to data collection and analysis.

CDR developments are also affected by
– network developments, and
– event complexity.

[Figure: data-flow excerpt showing detector, event filter (selection & reconstruction) and raw data.]


Monte Carlo Simulation

From a physics standpoint, simulation is needed to study
– detector response
– signal vs. background
– sensitivity to physics parameter variations.

From a computing standpoint, simulation
– is CPU intensive, but
– has low I/O requirements.

Simulation farms are therefore good testbeds for new technology:
– CSF for Unix and now PCSF for PCs and Windows/NT.

[Figure: data-flow excerpt showing event simulation.]
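The "CPU intensive, low I/O" character of simulation shows up even in a toy example. The sketch below estimates the geometric acceptance of an imaginary detector by generating particles isotropically (uniform in cos θ) and counting those that miss the uninstrumented region near the beam axis. The detector, the cut value and all names are assumptions made for illustration; real detector simulation is vastly more detailed.

```python
import random


def accepted(cos_theta: float, max_abs_cos: float) -> bool:
    """Toy detector response: a particle is seen unless it is too close to the beam axis."""
    return abs(cos_theta) < max_abs_cos


def estimate_acceptance(n_events: int, max_abs_cos: float = 0.9, seed: int = 42) -> float:
    """Monte Carlo estimate of the fraction of isotropic particles that are seen."""
    rng = random.Random(seed)
    seen = 0
    for _ in range(n_events):
        cos_theta = rng.uniform(-1.0, 1.0)  # isotropic emission is flat in cos(theta)
        if accepted(cos_theta, max_abs_cos):
            seen += 1
    return seen / n_events


if __name__ == "__main__":
    # All the work is in the loop; the only output is one number.
    print(estimate_acceptance(100_000))
```

For this simple geometry the exact answer equals the cut value itself, which makes the toy easy to check; the loop does all the work and only a single number is written out.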


Data Reconstruction

The event reconstruction stage turns detector information into physics information about events. This involves
– complex processing
» i.e. lots of CPU capacity
– reading all raw data
» i.e. lots of input, possibly read from tape
– writing processed events
» i.e. lots of output which must be written to permanent storage.

[Figure: data-flow excerpt showing raw data, event reconstruction and event summary data.]


Batch Physics Analysis

Physics analysis teams scan over all events to find those that are interesting to them.
– Potentially enormous input
» at least data from current year.
– CPU requirements are high.
– Output is “small”
» O(10^2) MB
– but there are many different teams and the output must be stored for future studies
» large disk pools needed.

[Figure: data-flow excerpt showing event summary data, batch physics analysis and analysis objects (extracted by physics topic).]
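The scan-and-select pattern described on this slide can be sketched as a simple filter over a stream of events. The event fields, the selection cut and all names below are hypothetical and chosen only to show the shape of such a job: a large input is read once and a small selected sample is kept.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """Hypothetical event summary record; the fields are illustrative only."""
    run: int
    number: int
    n_tracks: int
    total_energy: float  # GeV


def scan(events: Iterable[Event], is_interesting: Callable[[Event], bool]) -> Iterator[Event]:
    """Read every event once and yield only those passing the team's selection."""
    for event in events:
        if is_interesting(event):
            yield event


def multi_track_high_energy(event: Event) -> bool:
    """Example selection (assumed cut values, not from any experiment)."""
    return event.n_tracks >= 4 and event.total_energy > 80.0


if __name__ == "__main__":
    sample = [
        Event(run=1, number=1, n_tracks=2, total_energy=91.0),
        Event(run=1, number=2, n_tracks=6, total_energy=88.5),
        Event(run=1, number=3, n_tracks=5, total_energy=40.2),
    ]
    for event in scan(sample, multi_track_high_energy):
        print(event)
```

Because `scan` is a generator, the input never has to fit in memory at once, which matters when the input is at least a year of data; each team would supply its own selection function.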


Symmetric MultiProcessor Model

[Figure: diagram showing the Experiment, Tape Storage and TeraBytes of disks.]


Scalable model: SP2/CS2

[Figure: diagram showing the Experiment, Tape Storage and TeraBytes of disks.]


Distributed Computing Model

[Figure: diagram showing the Experiment, Tape Storage, Disk Server, CPU Server and Switch.]


Today’s CORE Computing Systems

[Figure: CORE Physics Services and Central Data Services on the CERN Network. Labels recovered from the diagram:]
– Dedicated RISC clusters: 300 computers, 750 processors (DEC, HP, SGI, SUN)
– Shared Disk Servers: 5 TeraByte disk, 3 Sun servers, 6 PC based servers
– Shared Tape Servers: 10 tape robots, 100 tape drives (9940, Redwood, 9840, DLT, IBM 3590E, 3490, 3480, EXABYTE, DAT, Sony D1)
– Home directories & registry
– consoles & monitors
– Interactive Services (DXPLUS, HPPLUS, RSPLUS, LXPLUS, WGS): 120 systems (HP, SUN, IBM, DEC, Linux)
– NAP - accelerator simulation service: 10-CPU DEC 8400, 12 DEC workstations, 20 dual processor PCs
– PaRC Engineering Cluster: 13 DEC workstations, 5 dual processor PCs, 5 Sun workstations
– “Queue shared” Linux Batch Service: 350 dual processor PCs
– RISC Simulation Facility: Maintained for LEP only
– “Timeshared” Linux cluster: 200 dual processor PCs
– Dedicated Linux clusters: 250 dual processor PCs
– PC & EIDE based disk Servers: 40TB mirrored disk (80TB raw capacity), 25 PC servers


Hardware Evolution at CERN, 1989-2001

[Figure: timeline with a year axis from 89 to 01. Labels: Event Filter, Engineering, Disk Servers, Tape Servers, Interactive, Batch; Mainframes (IBM, Cray), RISC systems, Scalable systems (SP2, CS2), PCs.]


Interactive Physics Analysis

Interactive systems are needed to enable physicists to develop and test programs before running lengthy batch jobs.
– Physicists also
» visualise event data and histograms
» prepare papers, and
» send Email

Most physicists use workstations, either private systems or central systems accessed via an Xterminal or PC.

We need an environment that provides access to specialist physics facilities as well as to general interactive services.

[Figure: data-flow excerpt showing analysis objects (extracted by physics topic).]


Unix based Interactive Architecture

[Figure: architecture diagram. Labels: Backup & Archive; Reference Environments; CORE Services; Optimized Access; X Terminals, PCs, Private Workstations; WorkGroup Server Clusters; PLUS CLUSTERS; Central Services (mail, news, ccdb, etc.); ASIS: Replicated AFS Binary Servers; AFS Home Directory Services; General Staged Data Pool; X-terminal Support; CERN Internal Network.]


PC based Interactive Architecture


Event Displays

Event displays, such as this ALEPH display, help physicists to understand what is happening in a detector. A Web based event display, WIRED, was developed for DELPHI and is now used elsewhere.

Clever processing of events can also highlight certain features, such as in the V-plot views of ALEPH TPC data.

Standard X-Y view

V-plot view


Data Analysis Work

By selecting a dE/dx vs. p region on this scatter plot, a physicist can choose tracks created by a particular type of particle.

Most of the time, though, physicists will study event distributions rather than individual events.

RICH detectors provide better particle identification, however. This plot shows that the LHCb RICH detectors can distinguish pions from kaons efficiently over a wide momentum range.

Using RICH information greatly improves the signal/noise ratio in invariant mass plots.
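For readers who have not met invariant mass plots, the quantity being histogrammed is computed from the energies and momenta of the selected particles using the standard relation m² = (ΣE)² − |Σp|² (in units where c = 1). The sketch below is an illustration under stated assumptions: the four-vector layout (E, px, py, pz) and the function name are choices made here, and the relation itself is textbook kinematics, not something taken from the slides.

```python
import math
from typing import Iterable, Tuple

# Assumed layout for this sketch: (E, px, py, pz), all in the same units with c = 1.
FourVector = Tuple[float, float, float, float]


def invariant_mass(particles: Iterable[FourVector]) -> float:
    """Invariant mass of a set of particles: sqrt((sum E)^2 - |sum p|^2)."""
    e_sum = px_sum = py_sum = pz_sum = 0.0
    for energy, px, py, pz in particles:
        e_sum += energy
        px_sum += px
        py_sum += py
        pz_sum += pz
    mass_squared = e_sum ** 2 - (px_sum ** 2 + py_sum ** 2 + pz_sum ** 2)
    # Rounding can push a value that should be zero slightly negative.
    return math.sqrt(max(mass_squared, 0.0))


if __name__ == "__main__":
    # Two massless particles of unit energy flying back to back.
    print(invariant_mass([(1.0, 1.0, 0.0, 0.0), (1.0, -1.0, 0.0, 0.0)]))
```

Particle identification enters before this step: assigning the wrong particle type to a track gives it the wrong energy for its measured momentum, which smears the mass peak and is one reason better identification improves the signal/noise ratio.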


Looking at Data: Summary

Physics experiments generate data!
– and physicists need to simulate real data to model physics processes and to understand their detectors.

Physics data must be processed, stored and manipulated.

[Central] computing facilities for physicists must be designed to take into account the needs of the data processing stages
– from generation through reconstruction to analysis.

Physicists also need to
– communicate with outside laboratories and institutes, and to
– have access to general interactive services.