An Examination of Factors Affecting Human Reliability

Embed Size (px)

Citation preview

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    1/13

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    2/13

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    3/13

    IChemE SYMP OSIUM SERIES No 115

    SRlH tfgSTCreuxis

    Fid.I SheMahc of Meckam ccLl Bn tkin* Suste\

    As he began to take this action he noticed sparks under the spring nest andheard a bang . He at once moved the speed control lever towards 'o ff and thebrak e lever to 'on'. The leve r, however, felt to be disconnected and failedto have any effect. Hastily he pressed the emergency stop button, aperfectly natural reaction at such a moment but one which, because of theassociated tripping of the electrical power to the winding motor, alsodisabled the regenerative braking action. Thus, the last of the remainingbraking was removed and the rotation of the winding drum continued unabated.His last and desperate act was to trip the hydraulic pump which powered theungabbing g ear - but the brak ing syste m, with its main force path severed,was by then beyond recovery and the final tragic outcome already sealed.Within 30 seconds of the engineman discovering his useless brak e lever , thedescending c age , then travelling a t an estimated 27 mph, reached the pitbottom. Eighteen of the twenty nine men were killed and none of theremaining eleven escaped without serious injury.

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    4/13

    IChemE SY MP OS IU M SERIESNo.115

    The accident enquiry concluded that if the regenerative braking action hadremained operational, the final speed of the cages would have beensignificantly reduced. However, it was considered that the action of thewinding engineman was above critic ism, given the situation with which he hadbeen faced, and the means available to him for dealing with itAnalysisIt might be considered that the engineman, by his inadvertent disengagementof the regenerative braking, was guilty of a category of human error known asmental mode l perception failure'. Howe ver, his state of mind in those firstfew seconds must have seen a rapid escalation from sudden anxiety tonightmarish shock . With his brake lever lik e a broken fence post and theunthinkable possibility of a man-winding runaway he had failed to appreciatethat the 'stop but ton 1, which until then had always mea nt 'stop*, nowtraitorously meant 'go'.When manifes tly a fearful thing is 'on', the natural human reaction is toturn it ' o f f . This survival response is deeply engrained from earlychildhood and is virtually irresistib le. It is a form of 'pre-experiencetake-over' and is essentially a reflex error of the non-culpable kind. Thetrue fa iling, of course , lay not with the engineman but rather with the humanteam which placed him in that terrible situation - which the accident enquiryproperly recognised.The investigation into the cause of failure of the spring nest rod revealedan interesting design weakness. The rod served to transmit the spring nestforce to the "main" lever via a crosshead trunnion, the trunnion axle bearingdirectly onto pads set into the leve r. The bearing pressu res thus createdwere such that any lubricant present quickly became ejected from between themating sur face s; and since there was no engineered provision for replenishingthe lubricant, short of forcible separation of the part s, thevirtually permanent condition of the bearings was one of maximum frictionalintera ction . In conseque nce a bending force was applied to the trunnion endof the spring nest rod whenever motion was transmitted to the bra k es.

    It was demonstrated thatthe rod had an adequatemargin of strength to copewith the steady tensilestress induced by thespring nest but thedesigner had not cateredfor the fatigue burdensuperimposed by theadditional bending cycles.The safety margin here infact was found to benegative, which meant thateventual failure hadalways been inevitable.The moment came in thatearly morning of the 30July 1973 - 21 years afterthe rod had first beenbrought into service.

    flxJe

    CVassAeaJ

    lever

    Sfr'tna *esred

    pad.

    PtQ.2. ZHu skmhon of Scntit'nj Action350

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    5/13

    IChemE SY MP OS IU M SERIESNo.115

    Undeniably this was a manifestation of designer error althoug h whetherthrough over-sight, mis-calculation or wrongful judgement we cannot know.The lack of an engineered means of lubrication, which would have been anexceptional over-sight had one been intended, might suggest as more probablethat the designer had consciously decided against th e need for lubricati on inthe instance. Whatever his presence or absence of motive s, however, there isno escaping the fact that the design was flawed.Such are the interactions, iterations and convolutions of the design processthat mistak es, particularly at the more detailed l evel, are inevitable; and aproportion of such mistakes will survive even the most rigorous of checkingregimes. This we must acce pt. There is, however, an essential andpotentially effective last line of defence, namely, the systematic andrepetitive testing of 'function' and 'fitness-for-purpose' which must formpart of the ongoing life of any operational facility.

    Effective proving of this kind requires in every instance that a practicable,dependable and searching method of testing or examination be available. Thedesigner, given that he recognises testability as part of the designrequir ement, is in the strongest position to ensure that this aspec t isadequate ly provi ded. In the case of the spring nest rod the designer hadmade no provisions of any kind.On 14 January 1961, twelve years before the Mark ham rod failure, a similarrod brok e at Ollerton Colliery. Routine examination of all operationalspring nest rods was stipulated from that time but it seems that the methodadopted was one of visual inspection, and it was to be later proven that thisstood little cha nce of giving reliable pre-failure warni ng. More l atterly,as was demonstrated during the accident enqui ry, ultrasonic inspe ction couldhave been used entirely effectively for the spring nest rod - but it wasnot.As a broad generalisation, the design of heavy engineering componentstraditiona lly adhered to the principle of 'designing beyond life '. That isto say, components were designed with a sufficient margin of strength toguarantee survival during the expected operating existence of the facility.The margin would be such that any defect capable of removing it would need tobe so large tha t its det ection , even by visual mean s, could hardly fail - apersuasive enough argument in principleIn the cas e of the spring nest rod the safety fac tor was 6, but , as hasalready been noted, this was related to the linear loading expected from thespring nest and did not allow for the additio nal bending fa tigue which therod also was to experience. Had the designer appreciated the existence ofthis extra component we must presume that he would have strengthened the rodaccordingly, we can be confident, however, that his attitude to in-serviceproving would have remained unaltered.That the designer chose a nest of springs rather than a single spring mightbe construed as one example of wher e it was felt that 'design beyond life 'could not b e guara nteed . By the arrangement adopt ed, failure of a singlespring would not disable the braking action and would be visually detectablebefor e other failures followe d.

    351

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    6/13

    IChemE SY MP OS IU M SERIESNo.115

    A designer can only make such selective decisions by reference to his lifetime learning and experience. Failures of springs would be known andunderstood; a reality with which every engineer /designer had to cope .Howe ver, failur es of simple components in simple duties would be much rarerand generally, when they did arise, could be traced to some special reason,perceived as avoida ble in the future. Therefore even these 'rare' faultswould tend t o be seen as 'one-offs', if anything adding t o, rather thandetracting from, the designer's confidence in future performance.Confi dence in a component' s reliability is a complex amalga m, firstly ofbelief in the design and manufacturing processes from which the component wasborn and, secondly, of subjective performance expectation, sublimated fromeducation and direct life-time experience. Such a system of conviction musthave prevailed to allow that an essential system component in a windingengine braking system received only superficial examination throughout the 21years of its operational duty - until the system itself finally tested thecomponent to destruction. We are not looking at slack practice or at lack ofability or mean s. We are looking simply at a cultural approach to theattainment of reliabilityToday, the technique of probabilistic risk analysis is available and this cantell us th at a life-time's 'no failure s' observ ation, say over 50 to 100years, even though embuing the observer with the highest subjectiveconfidence, nevertheless in statistical terms represents a correspondingreliability which is surprisingly poor. For example, a period of 100 yearswith no observed failures enables us to believe (based upon the Poissondistribut ion) with 99 % confidence that the equivalent failure rate is notworse tha n one failure per 20 year s. This is a dramatic reduction inexpectation and , of cour se, there is still a 1% chanc e that the situationcould be even worse. Given this kind of objectivity, would the brakingsystem design er still have placed all his faith in the 'design beyond life'principle?Proper application of today's risk assessment methodology, within which thephilosophy of full testability is one essential basti on, would havesystematical ly questioned the routine testing/ex amination of all criticalsystem components, including static members such as the spring nest rod. Wecannot claim that the subtle mechanism of the rod's demise would have beenuncovered by the analytic process but the efficacy of the rod's testingwould . Moreov er, an appropriate frequency of proving , related to specifiedrisk, could have been deduced.Additi onally , it may be fairly claimed that the risk assess ment proc ess wouldhave found the dependence on so may single components (there were more in aprimary safety role than just the spring nest ro d) to be totallyunreconc ilable with the attainment of an acceptably row risk for the man -winding operation. Most evidently this was not the view held by the originaldesign ers. But it was the view reached by the accident enquiry which calledfor the systematic elimination wherever pos sible of all "single-line "components in this and all similar winding installations.Firstly we see a winding engineman's error, but one concluded to be of thenon-culp able k ind - the fundamental failing lay not with the driv er, so tospeak in the heat of the rac e, but rather with the designer who permitted the'stop' button actually to remove one element of the stopping action.

    352

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    7/13

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    8/13

    IChemE SY MP OS IU M SERIESNo.115

    Thus, two seemingl y harmless changes in operational prac tice had evolved, thefirst creating real opportunity for the tunnel not to remain fully floodedand the second enabling this situation to persist undetected for a protractedperiod of time.In due course the enabling elements of human error were provided - firstly,an erroneous setting of the washout valve which left it slightly too faropen, natural chance then exacerbating the position by providing a period ofparticularly low rain-fall - and secondly, the employment of an operator who,although routinely checking the outfall to the river, had not been instructedto interpret in plant terms , or be concerned by, what he saw; in this casethe pers isting dry states of the discharge ports from the weirs . His lastroutine visit to the valve-house had concluded uneventfully only hours beforethe accide nt occurre d. By this ti me, as it is now believed, the tunnel hadbeen variously in a state of voidage for up to 17 days.

    Unbeknown to the operators and designers alik e, a menace from the mostancient of orig ins , later to be identified as fossil meth ane, was awaitingthis very opportunity. Although k nown to have low solubility in water undernormal conditions, the elevated pressures of the surrounding geologicalenvironment altered this situation to the extent that the in-leaking water(more than 1000 m 3/day) to the tunnel carried sufficient of this dissolvedmethane to cre ate, over the nominal 17 days period, an explosiveconcentration of gases within the tunnel void. Subsequent pumping of waterthrough the tunnel acted to displace this lethal atmosphere to the Abbeysteadend where, by an unfortunate arrangement of ventilation, the discharge wasdirected into the valve-house.The stage was thus set for tragedy and on the 23rd May 1984 it becameconsum mated . On this fine May evening a party of 44 visit ors and WaterAuthority staff were assembled at Abbeystead and by 7.30 pm most werecollected inside the valve-house, waiting while the distant pumps worked tobring water from the Lune. The exhaling tunne l required only t o find asource of ignition - and amongst the company were smok ers In theensuing holoc aust, of those in the valve-house 16 died and no survivor escapeinjury.AnalysisTo the designers and constructors of underground workings the potentialhazard of explosive atmospheres is as much a part of their awareness as is hehazard of electrocution to the electrical power industry. However , whereasthe hazard of electrocution is always p resent, the explosive atmospherehazard is mor e quixoticThe geology of the area through which the tunnel was to pass was particularlywell k nown but Binnie & Partne rs, the pumping scheme designers, although theyinvestigated the geology at the tunnel ends, did not extend their bore-holeexploration s t o inc lude the deep line of the tunn el. Based upon pastexperience, they had concluded that the extra information gained would nothave justified the effort and cost of its procurement. It seems that wateringress, rather than gas ingress , was the main practical difficultyanticipated.

    354

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    9/13

    IChemE SY MP OS IUM SERIES No 115

    F/a.-3. Scht^gt/c Amnjei*.t*tcf Lune/tyreJJdfo TntHsfe r

    A. Lors/e

    6ian feed x/e n i Q , Nof diScUrjC 6m*.'^

    Ouf/a/(Out-fill

    F ta IltuSfratian 0C AM ejSJ d tyscAarp Syste-*. ** ri**-.355

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    10/13

    IChemE SY MP OS IU M SERIESNo.115

    During the tunnel construction the standard precautions against flammableatmosphere formation wer e tak en but at no time was methane detected to anysignificant ext ent . The fact that the design positive ly catered for a fullyflooded tinnel might suggest at some point an element of quiet cautionarythought, out, if this had been s o, we must logically suppose that thedesigners would have ventilated the tunnel directly to atmosphe re. In fact,not by omission but seemingly by positive provision, the tunnel wasventilatec directly into the Abbeystead va lve-h ouse ; more ove r, if any seriouspossibility of methane had been anticipated, intrinsically safe electricalequipmen t would have been installed in this below-ground b uilding andpossibly also a methane monitor provided. The facts provide their ownstatement; such features were not incorporated.

    The evidence of the exploratory surveys, the general experience of this typeof geology, the detailed design, construction and subsequent operation of thewater transfer scheme all point relentlessly to the one conclusion, namelythat the gas explosion hazard, although passingly investigated, was not afactor carried forward into the project. The designers had anticipated wateringress but they seemingly had not recognised that such water, emerging froma higher pressure environment, could contain significant amounts of gas notappreciably soluble under normal conditions. Such knowledge was not commonin the industry and the continuously ventilated state of the tunnel virtuallythrougho ut the period of its construc tion would have masked any mani festat ionof the effect.

    Clearly there were operational irregularities, notably the too easily madechanges in the washout practice and in the level of surveillance applied tothe valve-house operating conditio ns. Seemingly these changes were made forreasons of local convenience without any formal reference to the possiblewider impli catio ns. This was wrong and revealed a flaw in at least oneaspect of the North West Water Authority's management of safety. However,even had the implications of the changes been formally questioned, right backto the desig ners , it is very doubtf ul whether th e metha ne hazard would havebeen a guiding factor in any r espon se. Simply in this case methane was not arecognised danger.Must we therefore conclude that the Abbeystead disaster was unstoppable?After all, the root cause in the whole chain of events was the unawareness,by e xpe rts , of methane as a potential hazard in this particular ca se. Fromthe outset me thane had been judged as an unlikely candida te for concern andthe tunnel construction phase had appeared to verify this expectation.But here perhaps w e touch on the achilles heel of the whole enterprise. Thisfinal verification, the test measurements, as reported, were not trulyrigorous. Of ten known measurements made only one was conducted without theventilation system being in operation and, even in this case, the period fromshutdown is not stated. Furthermore, the instrument used, a Draeger gasdetector, in the circumstances of its use would have yielded only the crudestindications of specific methane pre sence. As well, it is almost certain thatwhen the measurements were made the safety concerns of the constructors wouldhave been centered on the construction phase itself, not on the tunnel'soperation. Anyway, it would be known that the intention was to operate thetunnel continuously flooded. Fire and water do not cohabit. Methane absentor prese nt, and it was believed to be abs ent, sounded no alarms for thefuture - once the tunnel was built

    356

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    11/13

    IChemE SY MP OS IU M SERIESNo.115

    So, these crude measurements, while serving their immediate purposes, wereincapable of providing the longer term assurances which we now know were justas important . The constructors were work ing to their normal agenda - and thevital item under 'Any Other Busine ss' had not been included..-.. After all ,the tunnel was to be kept flooded.... and if flammable gases were around theconstructors would let us know.... But "no news" is not always "good news".Seemingly t he matter proceeded on a fabric of negatives while what w smissing was the single 'positive' - "Demonstrate that an explosive atm ospherecannot form". And to pose such a chall enge , by the very strength of itsniaivity, may require a certain independenceThe cause of the Abbeystead disaster was there fore rooted in an errone ousdesign expectation and a failure to employ effective means for verifying thisexpectation. It is fair to claim that the use of independent riskassessment, systematically applied, would have stood a high chance at leastof questioning this issue ; and may well have applied the required 'positive'by emphasising the need for specific test verification in relation toflammable gas es. Effectiv e quality con trol , exercised within a structured'safety management' regime, would have examined the appropriateness of thetest verification method and perhaps these challeng es, acting toge ther , wouldhave blocked the pathway to tragedy which ended in the terrible events ofthat fine May evening in 1984 .

    CONCLUSIONSThe two case-studies presented involve different industries and quitedifferent technologies. The natures and forms of the disasters which theyinvolved also were different. The broad equivalences which can be drawn fromtheir causal patterns of human error are therefore all the more remark able.Firstly, in both cases the basic cause was designer error, but of a kindwhose potential can never be eliminated. At Markham, a number of designweaknesses perhaps the primary one of which was the dependence on "single-line" components. At Abbeystead, the premature discounting of the potentialflammable atmosphere hazard.Secondly, the failure of the principle safety net, that is the failure torecog nise, and provide for , the positive proving of a safety relatedcomponent, function or assumption - again attributable as design error butbearing influences of the culture of the industry. At Markham, the lack ofan effect ive i nspection method for the spring nest rod. At Abbeystead, thefailure to apply an effective test to confirm a design assumption.Thirdly, operator/maintainer error but of the kind which naturally emanatesfrom the pressures and circumstances which the design errors have predetermined. At Mark ham, the winding engine-man's removal of the regenerativebraking action. At Abbeystead, the informal modification of laid-downoperating instructions and the failure combination which produced, andallowed t o persist, the partially drained state of the tunne l.

    It is considered that the formal application of techniques such as Hazard &Operability Studies and Probabilistic Risk Analysis at the design stage ofboth undertak ings, by their structured and systematic approach es, would havechallenge d mos t of the factors which under-pinned the acci dent s. They wouldhave called for justification of the initial assumption at Abbeystead andwould ha ve identified and analysed the 'single-line' depende nce at Mark ham,and offered a quantitative assessment of system risk.

    357

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    12/13

    IChemE SY MP OS IU M SERIESNo.115

    The power of such techniques derives from the independent and systematicexamination which challenges the designers not only to create systems ofcalculated reliability bu t also that they should formally demonstrate thatthey have done so. It requires also that safety cases be documented and thattheir assumptions and defining parameters be carried forward into the systemoperation as operating rules/instructions.Furth ermor e, there needs to be a readily traceable and documentedrelationship between each operating rule/instruction and the hazard to whichit refers. Managerially there must exist a formal means whereby any proposedalteration to an identified safety-related instruction /rule is assessed inrelation to the safety case before it is adopted. Such a system was notoperative at Abbeystead.Finally, it is again stressed that the responsibility for effective systemproving, whether routine or otherwise, starts with the designer. In the sameway as he designs for operability and maintai nability so must h e design alsofor testability.By far t he greatest human error contribution in both of the accidentsanalysed was designer error. Operators will continue to make errors day byday. This is a fact of life. But it is the designer who sets the scene , whomust seek to include the defences needed to mak e the operat ion tolerant ofsuch erro rs. Where his efforts ar e defici ent, the operator 'in the frontline' sooner or later will be beatenIf the seeds of destruction can survive elimination in a tradition al collierywinding engine and a simple water-work s, how must we regard the possibilitiesfor a large computer controlled chemic al plant? The 'programmable logiccontroller" in its many forms is now the typical nerve center of such plantsand commonly will be found operating not singly but in sophisticated networksof distributed intelligence.The important physical change introduced by this technology is thereplacement of hardware by software for the implementation of controlfunctions and information man ageme nt. It is , howeve r, the resulting step-increase in scope for the inclusion of extra dimensions of automa tion whichhas been both dramatic in effect and irresistibly impres sive. But thismeans that from the point of view of the safety analyst he sees a controlregime of vastly increased complexity with considerable scope for in-built,or subsequently introduced, err ors. Moreover the embodimen t of theassociate d logic is relatively covert. The analyst 's task of demonstratingsafety, where automation of this type and complexity is involved, becomeshighly demanding and almost inevitably destined to produce less th nconvincing results.

    Neverth eless th e principles highlighted by the two accidents studied shouldnot be ignored, namely (i) the need for the designer to provid e for, andensure, the positive proving of significant safety-dependent assumptions,(ii) the need for the designer to make the necessary provisions for thethorough routine proving/maintenance of safety-related systems and componentsduring normal operational life , and (iii) the need, where significant hazardsare inv olved, for the mak ing by independent specialists of a documentedsafety demonstration. A fourth might be added: (iv) the need for designersto confine their designs to forms which allow effective safety demonstration.

    358

  • 8/13/2019 An Examination of Factors Affecting Human Reliability

    13/13

    IChemE SYMPOSIUM SERIESNo.115

    The latter need presents a particular challenge, for where major hazards are(or might be) involved it is no longer enough simply to design for safety andallow that history will prove the case - today, such safety must bedemonstrated in advance, and the associated designs, irrespective ofsophistication, must remain amenable to this requirement.

    REFERENCESCALDER J W, 6 March 1974 , Report on the cause of, and circumstancesattending, the overwind which occurred at Markham Colliery, Duckmanton,Derbyshire, on oo aiy 1072. HKSO, HM 6602 Dd 252561 K32 3/74.Health & Safety Executive, 1985, The Abbeystead Explosion, HMSO, ISBN 0 11883795 8.

    V/NII.D1./D7/02.89/SAH

    359