Weaknesses of the LHC Machine Protection System

Page 1: Weaknesses of the LHC Machine Protection System

Weaknesses of the LHC Machine Protection System

Bernhard Holzer, CERN BE-ABP

... what an MPS should do: 2 major tasks

* protect the machine in case of hardware / software failure

* protect the machine in case of ... "the experts"

Personal definition: I consider even a quench as something that should be avoided

Page 2: Weaknesses of the LHC Machine Protection System

1.) Practical Applications of Murphy's Law:

... just some examples from RHIC: stories that you normally don't hear in EPAC reports

- wrong sextupole polarity in the yellow ring (systematically) due to cpu problem of the machine physicists

- BPM signals between the two rings interchanged ... again systematically

- wrong BPM polarity at single BPM's

- dead BPM ... still indicating an orbit offset of some mm --> the orbit correction algorithm tries to "compensate" and in the end there is indeed a real offset of 10 mm

- vacuum valves are indicated as open ... but in reality are closed ... a nice beam dump

- aluminum foil inside the vacuum chamber ... they just forgot to take it out

- horizontal orbit correctors significantly distort the beam momentum on the first turn

- luminosity with 110 bunches is not higher than with 55 bunches ... the injection kicker diluted the transverse beam emittance

Page 3: Weaknesses of the LHC Machine Protection System

2.) What are we talking about ???

Experience from HERA: the Machine Protection System has to handle a large number of "events" ... larger than I myself expected ... with reasons spread over all hardware components

[Figure: Statistics of BLM events, 1995 - 1997. Axes: week; events/week (0 to 21); beam current [mA] (0 to 100). Series: errors, quenches, 5 ms events, BLM alarms, beam current.]

Beam loss alarms, very fast beam loss alarms (< 5 ms) and quenches per week in the first HERA run years.

Page 4: Weaknesses of the LHC Machine Protection System

2.) Where do the problems come from

... just a number of most prominent examples:

- BPM's
- BLM's
- Power Converters
- RF (... can lead not only to dc current but to fast losses)
- Vacuum
- Experiments (!)
- Operators (!)

Weakness of the MPS: clear enough: we should not forget any component, but the MPS is only watertight if
- the hardware is perfect
- the logic of the software is ok
- the protection system is redundant !!

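[Pie chart: share per system (German labels translated in parentheses). Powersupply 8%, Magnete (magnets) 7%, Petra 9%, Quenchpr 2%, p-HF 3%, Bedienung (operation) 4%, Senderstrom (transmitter current) 1%, Linac2 1%, e-Dump 2%, Desy3 5%, Desy2 6%, Pia 2%, MVP 0%, Linac3 0%, MKS4 0%, Cryo 6%, Cryo-Kontr 0%, MIN 1%, Power 8%, Klima (air conditioning) 0%, Water 0%, sl-cav 1%, MSK 6%, MDI 6%, MST 2%, e-HF 7%, Diverses (miscellaneous) 5%, MPS 0%, MVA 1%, Strahlverlust (beam loss) 2%, Exp 2%.]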

Page 5: Weaknesses of the LHC Machine Protection System

Simple Example: the BPM's ( ... sorry Rhodry)

Example RHIC / HERA

local offset in a BPM during Lumi-Run: Δx ≈ 16 mm

leading to several quenches at injection and flat top

BPM's are the backbone of the machine diagnostics system, but they can be dead, show the wrong polarity, or develop an offset, and this can change spontaneously

- MPS might recognise a dangerous orbit ---> trigger the dump
- Orbit correction loop might compensate via local steering ---> BLM alarm / quench
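
The second path (the correction loop "compensating" a false reading) can be illustrated with a toy model. This is a minimal sketch and not the HERA or RHIC orbit feedback: the single-BPM, single-corrector loop, the gain, and all function names are assumptions chosen for illustration.

```python
def run_orbit_feedback(stuck_reading_mm, gain=0.5, steps=10,
                       bpm_dead=True, initial_orbit_mm=0.0):
    """Toy one-BPM / one-corrector orbit feedback loop (hypothetical model).

    A healthy BPM reports the true beam position; a dead BPM always
    reports stuck_reading_mm. Returns a list of (reading, real_orbit)
    pairs, one per feedback step.
    """
    real_orbit_mm = initial_orbit_mm
    history = []
    for _ in range(steps):
        reading = stuck_reading_mm if bpm_dead else real_orbit_mm
        # The loop steers against whatever the BPM reports.
        real_orbit_mm -= gain * reading
        history.append((reading, real_orbit_mm))
    return history


def bpm_looks_dead(readings_mm, tolerance_mm=1e-3):
    """Flag a BPM whose reading never moves although the loop is steering."""
    return (len(readings_mm) > 1
            and max(readings_mm) - min(readings_mm) < tolerance_mm)
```

With a dead BPM stuck at 2 mm and a gain of 0.5, each step pushes the real orbit a further 1 mm away, so ten steps build up a real 10 mm offset while the reading never changes. A reading that does not respond to steering is one simple hint that the monitor, not the beam, is at fault.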

Page 6: Weaknesses of the LHC Machine Protection System

Succinct Example: the BPM's

Special situation: 90° lattice

Quote from the logbook, 180 deg-bump check:
  WR 579 CX  -5A
  WR 626 CX  +13A
  WR 673 CX  -5A

[Figure: two consecutive sections labelled 90°]

Page 7: Weaknesses of the LHC Machine Protection System

3.) What can go wrong ?

rough statistics of 20 years of HERA

Injection:
- too early (during magnet cycle)
- too late (during acceleration)
- into a filled bucket (timing problem)
- with kicker/septum off
- with a transfer line magnet off
- after wrongly applied injection correction ... why ???
- with closed collimators
- with closed vacuum valve
- with wrong magnet polarity (after maintenance day)

Acceleration:
- failure of persistent current compensation
- errors in ramp correction tables
- tune jump during polarity switch of a quadrupole
- collimators too close to the beam
- head tail problems (chromaticity correction)
- magnet failures

Luminosity:
- aperture limitations due to RF fingers
- beam quality issues: beam beam spoils the emittance (up to beam losses at the aperture limit)
- orbit correction loop: coil at limit or off
- dedicated beam orbit steering
- coasting beam (rf problems)
- failure at dump kicker
- failure of dump timing system
- collimator control defect (radiation problem)
- error in BLM / BPM signal processing (server)
- vacuum valve closes during luminosity run

Nota bene: each of these errors leads to a beam loss alarm or a quench
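
Several of the injection failures in the list are conditions that could in principle be checked before beam is requested. A minimal sketch of such a pre-injection check follows; the state record, its field names, and the idea of collecting the conditions in one snapshot are assumptions for illustration, not the HERA or LHC interlock logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MachineState:
    """Hypothetical snapshot of conditions named in the injection list."""
    magnets_at_injection_level: bool   # not still cycling, not yet ramping
    target_bucket_empty: bool
    kicker_and_septum_on: bool
    transfer_line_magnets_on: bool
    collimators_open: bool
    vacuum_valves_open: bool
    magnet_polarities_verified: bool   # e.g. re-checked after maintenance


def injection_veto_reasons(state: MachineState) -> List[str]:
    """Return the names of all failed checks; an empty list means no veto."""
    return [name for name, ok in vars(state).items() if not ok]
```

Returning every failed check, rather than stopping at the first, gives the operator the full picture in one pass.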

Page 8: Weaknesses of the LHC Machine Protection System

4.) Nice example, because it was unexpected:

strong development of dc current (coasting beam) due to rf noise

- sudden jump of the rf timing by 18°: dc current develops after a while
- sudden jump of the rf timing by several bunch positions: dump kicker gap filled

DC beam contribution: broken connection between rf pre-amplifier & main driver in the tunnel

... "excellent" noise amplification

... driving DC contribution

... spoiling several luminosity runs

accumulating up to 20 % DC contribution ... scraping ... did not solve the problem ... problem for the dump gap

Page 9: Weaknesses of the LHC Machine Protection System

5.) Detection of Beam Losses

Example HERA-p

loss pattern around the storage ring

beam losses seen by a single BLM: failure of a standard magnet (dipole / quadrupole)

beam losses seen by a single BLM: failure of a critical power converter --> very fast losses --> quench cannot be avoided --> and eventually damage of components

Problem: the MPS was not redundant. In special cases a single system (e.g. the BLMs) is not sufficient for machine protection.

Solution ... in special cases: FMCM, a direct & fast link between power converter & dump system
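
The idea behind an FMCM (fast magnet current change monitor) is to react to the failing power converter itself instead of waiting for the beam losses it causes. The sketch below shows only that principle as a threshold on the rate of change of a sampled magnet current. The sampling scheme, the threshold value, and the function name are assumptions, not the real FMCM electronics.

```python
def first_fast_change(samples_a, sample_period_s, max_rate_a_per_s):
    """Return the index of the first sample at which the magnet current
    changes faster than max_rate_a_per_s, or None if it never does.

    samples_a: magnet current samples in ampere, equally spaced in time.
    """
    for i in range(1, len(samples_a)):
        rate = abs(samples_a[i] - samples_a[i - 1]) / sample_period_s
        if rate > max_rate_a_per_s:
            return i  # a dump request would be issued at this sample
    return None
```

For example, with 1 ms sampling and a 500 A/s limit, the sequence 100.0, 100.0, 100.0, 99.0, 97.0 A triggers at the fourth sample, while a slow drift of 0.1 A per sample does not.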

Page 10: Weaknesses of the LHC Machine Protection System

6.) Possible Weakness of the LHC Machine Protection System ... ?

Analysis of fast beam losses (A. Gómez)

Phase space deformation in case of failure of RQ4.LR7

Short summary of the studies:

- quench in sc. arc dipoles: τ_loss = 20 - 30 ms; BLM system reacts in time, QPS is not fast enough

- quench in sc. arc quadrupoles: τ_loss = 200 ms; BLM & QPS react in time

- failure of nc. quadrupoles: τ_det = 6 ms, τ_damage = 6.4 ms

- failure of nc. dipole: τ_damage = 2 ms

→ FMCM installed
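
To get a feeling for these time constants it helps to express them in machine turns. The conversion below is a small sketch. The revolution period is not given on the slide; it is computed here from the LHC circumference of about 26 659 m, assuming a beam moving at essentially the speed of light.

```python
LHC_CIRCUMFERENCE_M = 26658.883      # assumption: not stated on the slide
SPEED_OF_LIGHT_M_PER_S = 299792458.0

# Revolution period of an ultra-relativistic beam, roughly 89 microseconds.
REVOLUTION_PERIOD_S = LHC_CIRCUMFERENCE_M / SPEED_OF_LIGHT_M_PER_S


def ms_to_turns(time_ms):
    """Convert a time in milliseconds into a number of LHC turns."""
    return time_ms * 1e-3 / REVOLUTION_PERIOD_S


def turns_to_ms(turns):
    """Convert a number of LHC turns into milliseconds."""
    return turns * REVOLUTION_PERIOD_S * 1e3
```

With this period, the 6 ms detection time and the 6.4 ms damage time of the nc. quadrupoles correspond to roughly 67 and 72 turns, a margin of only about 4.5 turns, and the 2 ms damage time of the nc. dipole is about 22 turns.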

Page 11: Weaknesses of the LHC Machine Protection System

Possible Weakness of the LHC MPS: Analysis of fast beam losses (A. Gómez)

worst case: nc. dipole magnets RD1.LR1 / LR5

simulation of beam losses due to failure of RD1:
- damage level reached after 25 turns
- τ_BLM react. ≈ τ_damage

FMCM installed ... but redundancy does not really exist

... does it make sense to contemplate a fast AC beam current monitor in the LHC ???

experience is excellent: combination of fast FMCM and AC-BM installed at HERA in 2003/2004
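
The redundancy question can be made concrete by listing, per failure case, which systems react in time and flagging every case covered by fewer than two. The table below encodes only the qualitative statements of the study summary as read from the slides; it is not simulation output, and the dictionary layout and function name are assumptions.

```python
# Systems that react in time, per failure case, as read from the summary.
# The BLMs are not counted for RD1 because their reaction time is quoted
# as roughly equal to the damage time, i.e. without margin.
IN_TIME = {
    "quench in sc. arc dipole": {"BLM"},
    "quench in sc. arc quadrupole": {"BLM", "QPS"},
    "failure of nc. dipole RD1": {"FMCM"},
}


def not_redundant(coverage, minimum=2):
    """Return the failure cases protected by fewer than `minimum` systems."""
    return sorted(case for case, systems in coverage.items()
                  if len(systems) < minimum)
```

Adding a second, independent channel for RD1 (for instance the fast AC beam current monitor asked about above) is what would take that case off the list.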

Page 12: Weaknesses of the LHC Machine Protection System

7.) Possible Weakness of ANY Machine Protection System ... ?

... the human beings

HERA run year 2007: number of beam dumps and the reasons for them

[Bar chart: "Anzahl Ausfälle 2007" (number of failures 2007), axis 0 to 100, one bar per system: MVA, Exp, Powersupply, Petra, Cryo-Kontr, Magnete, Power, MSK, Desy3, e-HF, Strahlverlust, Bedienung, Senderstrom, MDI, Desy2, Cryo, MIN, MST, Quenchpr, Diverses, p-HF, Pia, Linac3, e-Dump, Linac2, sl-cav, Water, MPS, MVP, MKS4, Klima.]

Highlighted in the chart:
- rf (4 systems)
- water cooling
- power converters
- cryo systems
- ... and the operators

Page 13: Weaknesses of the LHC Machine Protection System

7.) Possible Weakness of ANY Machine Protection System ... ? ... the human beings

especially problematic: the Monday morning effect, in other words: the experts

* are dangerous actions really inhibited by the MPS ???

* is it possible to trigger actions from outside the CCC ???
  e.g. wire scanner / collimator from the office
  e.g. power converter actions on site at the local controller

Examples:
- correct bump but wrong IP --> BLM alarm
- local change of magnet currents --> BLM alarm
- wrong files in the sequencer --> spoils the machine run for a day !!! (still today I could kill the person)
- firing the wires for demonstration from the office
- only the expert can retract the collimators without warning ... and he did

* is it possible to stop actions of the control system ? Can the operator or the MPS stop / inhibit orbit corrections / bumps / sequences in case of trouble ?

Does / Should the MPS communicate with the control system ??
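
These questions amount to asking for a gate between whoever requests an action and the hardware. A minimal sketch of such a gate is given below, assuming a hypothetical command record with an origin and a danger flag, and a hypothetical inhibit flag that the MPS or the operator can set. None of these names come from the LHC control system.

```python
from dataclasses import dataclass


@dataclass
class Command:
    action: str                # e.g. "fire wire scanner", "move collimator"
    origin: str                # where the request comes from, e.g. "CCC"
    dangerous_with_beam: bool


def is_allowed(cmd, beam_present, inhibit_active, trusted_origins=("CCC",)):
    """Decide whether a command may be executed (illustrative policy).

    inhibit_active is set by the MPS or the operator and blocks everything.
    Actions that are dangerous with beam are accepted only from trusted
    origins while beam is present.
    """
    if inhibit_active:
        return False
    if beam_present and cmd.dangerous_with_beam:
        return cmd.origin in trusted_origins
    return True
```

Under this policy, firing the wires from the office with beam in the machine is refused, while the same request from the control room, or from anywhere without beam, is accepted.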

Page 14: Weaknesses of the LHC Machine Protection System

8.) An Evident Weakness of the Machine Protection System ... ? ... its complexity

[Diagram: the LHC Beam Interlock System, its inputs, and the Beam Dumping System. Blocks and labels shown: UPS; Timing; Software Interlocks; Operators; Powering Interlock System; essential circuits; auxiliary circuits; Quench Protection; Power Converters; Discharge Switches; AUG; Cryogenics; BLMs aperture; BLMs arc; BPMs for Beam Dump; BPMs for dx/dt + dy/dt; LHC Experiments; Collimators / Absorbers; NC Magnet Interlocks; dI/dt magnet current; Vacuum System; Screens; RF + Damper; Beam Energy Tracking; DCCT Dipole Current 1; DCCT Dipole Current 2; RF turn clock; Access Safety System; Beam Current Monitors; dI/dt beam current; Safe LHC Parameters; Energy; Current; Safe Beam Flag; SPS Extraction Interlocks; Injection Kickers; TL collimators; Beam Dump Trigger.]

Page 15: Weaknesses of the LHC Machine Protection System

8.) An Evident Weakness of the Machine Protection System ... ? ... its complexity

140 "user systems" can trigger the alarm, each sometimes containing 1000 single devices

infinite number of possible alarms:
- needs systematic checks
- establish procedures for testing, masking, book keeping ... "issue tracking"
- needs a lot of self-discipline

some bad HERA examples:
- 1994: one mega quench (beam induced); the head of the machine had disabled the BLM system because of too many false alarms
- 2005: ... too many false alarms ... will be ignored
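
One way to keep masking from turning into permanent ignoring is to record every mask with an owner, a reason and a validity period, and to list the overdue ones regularly. The following is a minimal sketch of such book keeping; the record fields and function names are assumptions, not an existing HERA or LHC tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Mask:
    """Hypothetical book keeping entry for one masked alarm channel."""
    channel: str
    owner: str
    reason: str
    set_at: datetime
    valid_for: timedelta

    def expired(self, now):
        """True once the mask has outlived its agreed validity period."""
        return now >= self.set_at + self.valid_for


def overdue_masks(masks, now):
    """Masks that need to be reviewed, renewed with a reason, or removed."""
    return [m for m in masks if m.expired(now)]
```

A mask without an expiry date is exactly the kind that ends up being forgotten, which is why the validity period is a mandatory field here.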

Page 16: Weaknesses of the LHC Machine Protection System

8.) An Evident Weakness of the HERA Machine Protection System ... ? ... its complexity

stable luminosity run in 2005

nice background situation, good lifetime, everybody is happy

... and the alarm system would like to dump the beam.

the real problem: alarms are masked and ignored

Page 17: Weaknesses of the LHC Machine Protection System

9.) For the Fun of it: Weakness of the HERA Machine Protection System ... ? ... its experiments

provocative statement: there are a number of secondary collimator jaws

and as primary collimators there are the FPS stations

[Diagram excerpt: Beam Interlock System, LHC Experiments, Software Interlocks, Operators]

Does the MPS control / inhibit the experiments ??

Page 18: Weaknesses of the LHC Machine Protection System

10.) Résumé: What we should avoid ... but what happened in other machines

[Histogram: "Number of Quenches" (0 to 13); axis labels: location of magnet, frequency]

32 different locations, 75 quenches (1994-2005)

most at 300 GeV/c in 1996; since 1997, 4 BLMs at 198m locations

Page 19: Weaknesses of the LHC Machine Protection System

10.) Résumé: The LHC MPS, just some keywords for the coffee break

t.q.m.m.

* complexity: establish procedures for testing, masking, book keeping; needs a lot of self-discipline

* redundancy: how many independent alarms do you get in case of failure ( AC Monitor ? )

* don't forget the human beings: they need information & training

* avoid fake alarms

What we should avoid ... but what happened in other machines: 5 mm groove in the HERA proton collimator, detected in 2003 after many years of operation.