18
Clara Gaspar, May 2010 The LHCb Run Control System An Integrated and Homogeneous Control System

The LHCb Run Control System

  • Upload
    kasia

  • View
    53

  • Download
    2

Embed Size (px)

DESCRIPTION

The LHCb Run Control System. An Integrated and Homogeneous Control System. The Experiment Control System. Is in charge of the Control and Monitoring of all parts of the experiment. DCS Devices (HV, LV, GAS, Cooling, etc.). Detector Channels. L0. TFC. Front End Electronics. Readout Network. - PowerPoint PPT Presentation

Citation preview

Page 1: The LHCb Run Control System

Clara Gaspar, May 2010

The LHCb Run Control System

An Integrated and Homogeneous Control

System

Page 2: The LHCb Run Control System

Clara Gaspar, May 2010 2

The Experiment Control System

Detector Channels

Front End Electronics

Readout Network

HLT Farm

Storage

L0

Expe

rimen

t Con

trol S

yste

m

DAQ

DCS Devices (HV, LV, GAS, Cooling, etc.)

External Systems (LHC, Technical Services, Safety, etc)

TFC

Monitoring Farm

❚ Is in charge of the Control and Monitoring of all parts of the experiment

Page 3: The LHCb Run Control System

Clara Gaspar, May 2010 3

Some Requirements❚ Large number of devices/IO channels

➨ Need for Distributed Hierarchical Control❘ De-composition in Systems, sub-systems, … , Devices❘ Local decision capabilities in sub-systems

❚ Large number of independent teams and very different operation modes➨ Need for Partitioning Capabilities (concurrent usage)

❚ High Complexity & Non-expert Operators➨ Need for Full Automation of:

❘ Standard Procedures❘ Error Recovery Procedures

➨ And for Intuitive User Interfaces

Page 4: The LHCb Run Control System

Clara Gaspar, May 2010 4

Design Steps❚ In order to achieve an integrated System:

❙Promoted HW Standardization (so that common components could be re-used)

❘Ex.: Mainly two control interfaces to all LHCb electronics〡Credit Card sized PCs (CCPC) for non-radiation zones〡A serial protocol (SPECS) for electronics in radiation areas

❙Defined an Architecture❘That could fit all areas and all aspects of the monitoring and

control of the full experiment❙Provided a Framework

❘An integrated collection of guidelines, tools and components that allowed the development of each sub-system coherently in view of its integration in the complete system

Page 5: The LHCb Run Control System

Clara Gaspar, May 2010 5

Generic SW Architecture

LVDev1

LVDev2

LVDevN

DCS

SubDetNDCS

SubDet2DCS

SubDet1DCS

SubDet1LV

SubDet1TEMP

SubDet1GAS

Com

man

ds

DAQ

SubDetNDAQ

SubDet2DAQ

SubDet1DAQ

SubDet1FEE

SubDet1RO

FEEDev1

FEEDev2

FEEDevN

ControlUnit

DeviceUnit

Legend:

INFR. TFC LHC

ECS

HLT

Sta

tus

& A

larm

s

Page 6: The LHCb Run Control System

Clara Gaspar, May 2010 6

❚The JCOP* Framework is based on:❙SCADA System - PVSSII for:

❘Device Description (Run-time Database)❘Device Access (OPC, Profibus, drivers)❘Alarm Handling (Generation, Filtering, Masking, etc)❘Archiving, Logging, Scripting, Trending❘User Interface Builder❘Alarm Display, Access Control, etc.

❙SMI++ providing:❘Abstract behavior modeling (Finite State Machines)❘Automation & Error Recovery (Rule based system)

* – The Joint COntrols Project (between the 4 LHC exp. and the CERN Control Group)

The Control FrameworkDe

vice

Uni

ts

Cont

rol U

nits

Page 7: The LHCb Run Control System

Clara Gaspar, May 2010 7

Device Units❚Provide access to “real” devices:

❙The Framework provides (among others):❘“Plug and play” modules for commonly used

equipment. For example: 〡CAEN or Wiener power supplies (via OPC)〡LHCb CCPC and SPECS based electronics (via DIM)

❘A protocol (DIM) for interfacing “home made” devices. For example:

〡Hardware devices like a calibration source 〡Software devices like the Trigger processes

(based on LHCb’s offline framework – GAUDI)❘Each device is modeled as a Finite State Machine

DeviceUnit

Page 8: The LHCb Run Control System

Clara Gaspar, May 2010 8

Hierarchical control❚Each Control Unit:

❙Is defined as one or more Finite State Machines❙Can implement rules based on its children’s states❙In general it is able to:

❘Summarize information (for the above levels)❘“Expand” actions (to the lower levels)❘Implement specific behaviour

& Take local decisions〡Sequence & Automate operations〡Recover errors

❘Include/Exclude children (i.e. partitioning)〡Excluded nodes can run is stand-alone

❘User Interfacing〡Present information and receive commands

DCS

MuonDCS

TrackerDCS

MuonLV

MuonGAS

ControlUnit

Page 9: The LHCb Run Control System

Clara Gaspar, May 2010 9

Control Unit Run-Time❚Dynamically generated operation

panels(Uniform look and feel)

❚ Configurable

User Panelsand Logos

❚ “Embedded” standard partitioning rules:❙ Take❙ Include❙ Exclude❙ Etc.

Page 10: The LHCb Run Control System

Clara Gaspar, May 2010 10

Operation Domains❚Three Domains have been defined:

❙DCS❘For equipment which operation and stability is

normally related to a complete running periodExample: GAS, Cooling, Low Voltages, etc.

❙HV❘For equipment which operation is normally related

to the Machine state. Example: High Voltages❙DAQ

❘For equipment which operation is related to a RUNExample: Readout electronics, High Level Trigger processes, etc.

Page 11: The LHCb Run Control System

Clara Gaspar, May 2010 11

FSM Templates❚ DCS Domain ❚ HV Domain

❚ DAQ Domain READY

STANDBY1

OFF

ERRORRecover

STANDBY2

RAMPING_STANDBY1

RAMPING_STANDBY2

RAMPING_READY

NOT_READY

Go_STANDBY1

Go_STANDBY2

Go_READY

RUNNING

READY

NOT_READY

Start Stop

ERROR UNKNOWN

Configure

Reset

Recover

CONFIGURING

READY

OFF

ERROR NOT_READY

Switch_ON Switch_OFF

Recover Switch_OFF

❚ All Devices and Sub-Systems have been implemented using one of these templates

Page 12: The LHCb Run Control System

Clara Gaspar, May 2010 12

❚Size of the Control Tree:❙Distributed over ~150 PCs

❘~100 Linux(50 for the HLT)

❘~ 50 Windows❙>2000 Control Units❙>50000 Device Units

❚The Run Control can be seen as:❙The Root node of the tree➨ If the tree is partitioned there can be

several Run Controls.

ECS: Run Control

DCS

SubDetNDCS

SubDet1DCS

DAQ

SubDetNDAQ

SubDet1DAQ

HV TFC LHCHLT

SubDet1

ECS

XX

Page 13: The LHCb Run Control System

Clara Gaspar, May 2010 13

Partitioning❚ ECS Domain

RUNNING

READY

NOT_READY

StartRun StopRun

ERROR

Configure

Reset

Recover

CONFIGURING

NOT_ALLOCATED

Allocate

Deallocate

ALLOCATING

❚Creating a Partition❙Allocate = Get a “slice” of:

❘Timing & Fast Control (TFC)❘High Level Trigger Farm (HLT)❘Storage System❘Monitoring Farm

ACTIVE

StartTrigger StopTrigger

SWITCH

HLT farm

Detector

TFC System

SWITCHSWITCH SWITCH SWITCH SWITCH SWITCH

READOUT NETWORK

L0 triggerLHC clock

MEP Request

Event building

Front-End

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

Readout Board

Expe

rimen

t Con

trol S

yste

m (E

CS)

VELO ST OT RICH ECal HCal MuonL0

Trigger

Event dataTiming and Fast Control SignalsControl and Monitoring data

SWITCH

MON farm

CPU

CPU

CPU

CPU

Readout Board

Readout Board

Readout Board

Readout Board

Readout Board

Readout Board

FEElectronics

FEElectronics

FEElectronics

FEElectronics

FEElectronics

FEElectronics

FEElectronics

Page 14: The LHCb Run Control System

Clara Gaspar, May 2010 14

DomainX

Sub-Detector

Run Control❚Matrix

❚ActivityUsed for

Configuring all Sub-Systems

Page 15: The LHCb Run Control System

Clara Gaspar, May 2010 15

Sub-Detector Run Control

❚“Scan” Run

Page 16: The LHCb Run Control System

Clara Gaspar, May 2010 16

LHCb Operations❚ Two operators

on shift:❙ Data Manager❙ Shift Leader has 2

views of the System:

❘ Run Control❘ Big Brother

❚ Big Brother❙ Manages the LHC

<-> LHCbdependencies

❙ SubDetector Voltage x LHC State table

Page 17: The LHCb Run Control System

Clara Gaspar, May 2010

Automation❚Automation at several levels:

17

HLT AutopilotLHCb

❚Always done by the FSM(not by the panels)

BigBrother

Page 18: The LHCb Run Control System

Clara Gaspar, May 2010 18

Conclusions❚ LHCb has designed and implemented a coherent

and homogeneous control system❚ The Run Control allows to:

❙ Configure, Monitor and Operate the Full Experiment❙ Run any combination of sub-detectors in parallel in standalone❙ Can be completely automated (when we understand the

machine)❚ Some of its main features:

❙ Sequencing, Automation, Error recovery, Partitioning➨ Come from the usage of SMI++ (integrated with PVSS)

❚ It’s being used daily for Physics data taking and other global or sub-detector activities