Organisational Failures in Accident Reports

Michéle Jeffcott & Chris Johnson
Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ.
Tel: +44 141 330 0917, Fax: +44 141 330 4913
Email: {shellyj, johnson}@dcs.gla.ac.uk

Accident investigation aims to identify the root causes that lead to major incidents. These factors include problems in the design and operation of human computer interfaces. They also include the organisational factors that are increasingly considered important in accident causation and in our understanding of human computer interaction in complex working environments. However, the organisational “causes” of major accidents are still poorly investigated in comparison to technical failures. This results in disproportionate feedback, and a lack of improvement in organisational functioning. It also hides many of the contextual factors that jeopardise successful human computer interaction. This paper, therefore, shows how a design rationale notation that was developed within HCI can also be applied to represent and reason about these wider systemic causes of failure. This semi-formal notation is useful because it can be difficult for readers to trace the complex arguments about human ‘error’ and system ‘failure’ that are scattered throughout the body of lengthy text-based documents. In particular, we show how Conclusion-Analysis-Evidence (CAE) diagrams provide a graphical overview of evidence and lines of argument about human and organisational failure. A Marine Incident Investigation Unit (MIIU) report into a fire on the Aurora Australis is used as a case study to illustrate the argument in this paper.

Keywords: Organisational failure, Design Rationale, Human ‘Error’.

1. Introduction

According to Reason, until the 1930s the investigation of British railroad accidents focused mainly on technical failure, which was typical of many industries. This led, in the
following decades, to attempts to make technical systems error proof by developing elaborate in-built defences. But it proved increasingly difficult to reduce accidents by technical safeguards, and attention turned to the role of the human operator as a causal factor in accidents. As well as human fallibility, engineers' lack of consideration of the needs, capabilities and restrictions of humans when designing technical systems was also blamed. But the onus lay on the operator, who, by the early 1980s, was reported to be responsible for 80 to 100% of accident causes (Wagenaar, 1983). It seemed that the basic operators and maintenance personnel were becoming the scapegoats.

More recent accidents, such as Chernobyl and King’s Cross, have shifted the emphasis once again, this time to those who make the decisions at a management and organisational level. This mirrors the increasing interest in organisational and contextual factors within human computer interaction (Beyer and Holtzblatt, 1998). Reason (1998) concludes that the importance of organisational and management factors as a precursor to human “error” is now widely accepted. But it remains to be seen whether organisational failure is actually properly investigated and subsequently represented in accident reports.

A series of organisational failures makes up an organisational accident, which Reason (1998) defines as:

"An organisational accident has multiple causes involving people operating at different levels of their respective companies. They are events in which no one person's failure was a sufficient cause and trace back into many parts of the organisation, from the operator to the manufacturer, and - by implication - the regulator."

Reason adds that understanding and limiting the occurrence of organisational accidents are the major challenges to be faced in the new millennium.

This paper examines the study of organisational failure in accident reports and suggests improvements to it, and so aids the first of Reason's challenges. The relevance of this work for HCI can be explained in two ways. Firstly, HCI is playing an increasingly important role in many accidents (Norman, 1990). Accident reports therefore provide important information for future interface development. Secondly, it is critical that we identify and report on the organisational and managerial aspects that lead to failure in HCI if we are to understand the true context in which accidents occur.

1.1 Marine Incident Investigation Unit (MIIU) Case Study

A report prepared by the Marine Incident Investigation Unit (MIIU) is used to illustrate the argument in this paper. It is a lengthy 60-page document, with an additional 15-page 'Investigation-In-Confidence' following the main body of the report. The supplementary investigation records a seconded investigator's special study on the fuel system, whose failure caused the diesel oil leak. The general report presents the crucial time scale of the events leading up to and during the incident, the actions of the crew and the equipment used to fight the fire, and an analysis of the effectiveness of the fire fighting
effort and the causes of the fire itself. The MIIU summarises the incident development as follows:

"At about 0230 on 22 July 1998 a fire broke out in the engine room of the Antarctic research and supply vessel Aurora Australis. The ship was about 1300 miles south of Tasmania with 54 special purpose personnel (or expeditioners), 24 crew and an ice pilot on board. About 25 minutes before the outbreak of the fire, the duty engineer had been woken by an alarm, visited the machinery control room and inspected the engine room. He cancelled the alarm and returned to his cabin at 0213. At that time, everything in the engine room appeared to be normal. At 0225 the duty engineer was roused by a second alarm and, returning to the engine room, he discovered a fire at the forward end of the port main engine, around the turbocharger. The engine was stopped and the fire alarms sounded. The fire at the turbochargers was attacked by engineers using portable extinguishers and apparently extinguished. A few moments later, however, at about 0236, a fireball erupted and the engineers were forced to evacuate the engine room." (MIIU, 1999).

This accident report provides an appropriate case study because it typifies the complex combinations of human “error” and systems “failure” that characterise many major accidents. It also reflects the problems in team working and in accessing appropriate information sources that exacerbate the initial causes of an incident. Finally, it is an appropriate case study because its format typifies the presentation problems that affect the “usability” of these technical documents. These problems can prevent readers from gaining an accurate understanding of the implications that such failure has for the future design and operation of interactive systems. The Marine Incident Investigation Unit (MIIU) presents each report as a long text document, occasionally broken up with photographs and diagrams.
The report is presented as a single document, but can be roughly divided into two main sections. The first describes the Incident Development, detailing the time of events, their location, those involved and the actions they took. It is very much a precise narrative account of the event. The second section consists of the Comment and Analysis and ends with a list of conclusions, or root causes, about the incident.

Although the conclusions are set out explicitly at the end of the report, the reader has no way of tracing the steps the investigator(s) took to reach them. The conclusions are based on the analysis and evidence, which are scattered throughout the body of the report. CAE diagrams allow the reader to follow which points of analysis are being considered, and where the supporting or negating evidence comes from, for each relevant conclusion. A potential problem with this is that the reader cannot interpret and use the evidence free of the investigator's argument, which means that any unwanted bias in the report cannot be avoided. The more explicitly the argument is presented, the more the reader is left to rely simply on the analysis of the investigator, and is reduced to an essentially passive role. However, relying on an