Step 1. Confirm Hazard and Risk Analysis (H&RA) assumptions
The Hazard and Risk Analysis establishes the preliminary risk management strategy. The supporting documentation contains assumptions related to automation reliability and the capability of on-site staff to sustain the system. Examples of common assumptions include:
- Well-trained operations staff who reliably follow written procedures
- Facility only operates under the conditions evaluated in the H&RA
- Facility does not intentionally violate the safe operating limits (throughput, process conditions, etc.)
- Adequate staffing levels of operations and maintenance technicians to address faults and discrepancies in a timely manner
- An effective alarm management program is in place
- Instruments are fit-for-use for the process applications
- Low demand rate on the instrumented safeguards
- Independence between causes and safeguards exists
Functional safety practices dictate that the site verify the H&RA assumptions. IEC 61511:2016 requires the site to determine that design, operation, maintenance, and testing are sustaining the safety integrity required by the site risk management strategy. ANSI/ISA-18.2 requires that alarm events be monitored to ensure the effectiveness of the alarm system. Gaps between the assumptions and actual plant performance can have both direct and indirect impacts on actual site risk.
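One of the listed assumptions, a low demand rate on the instrumented safeguards, lends itself to a simple check against plant records. The following is a minimal sketch, not a prescribed method: the function names, the demand timestamps, and the assumed rate of 1.0 demands per year are all hypothetical, and the real assumed rate should be taken from the site's H&RA documentation.

```python
from datetime import datetime

def demands_per_year(demand_times, period_start, period_end):
    """Return the observed demand rate (demands per year) over the review period."""
    span_days = (period_end - period_start).total_seconds() / 86400.0
    if span_days <= 0:
        raise ValueError("period_end must be after period_start")
    # Only count demands that fall inside the review period.
    in_period = [t for t in demand_times if period_start <= t <= period_end]
    return len(in_period) * 365.25 / span_days

def assumption_holds(observed_rate, assumed_rate):
    """True when the observed rate does not exceed the rate assumed in the H&RA."""
    return observed_rate <= assumed_rate

# Hypothetical trip log for one safeguard (illustrative data only).
demands = [datetime(2023, 2, 3), datetime(2023, 6, 17), datetime(2023, 11, 2)]
rate = demands_per_year(demands, datetime(2023, 1, 1), datetime(2024, 1, 1))

# assumed_rate=1.0 is an illustrative value, not taken from any specific H&RA.
print(f"Observed demand rate: {rate:.2f} per year")
print("Assumption holds:", assumption_holds(rate, assumed_rate=1.0))
```

A result that exceeds the assumed rate does not by itself define the corrective action; it indicates that the H&RA assumption should be revisited.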
Case Study Example:
Natural gas processing; Longford, Australia; September 25, 1998.
Impact: Explosion and fire; 2 fatalities; 8 injuries; Plant 1 destroyed; Plants 2 and 3 shut down; 5% loss of supply; 250,000 workers sent home.
Control instrumentation for absorber bottoms condensate
Summary
LPG Plant 1 separated methane from LPG in a pair of absorber towers using lean oil. During the night before the accident, the level increased in the knockout section of Absorber B. Since the disposal route to Plant 2 was not available, an alternate route to a Condensate Flash Tank was used. The normal procedure of increasing the absorber bottom temperature was not followed. As a result, the flash tank protected itself from excessively cold temperatures by decreasing incoming flow, which in turn caused the absorber condensate level to continue to increase.
Eventually, condensate mixed with rich stripping oil. This mixture flashed across the level control valve and lowered the temperature in the Rich Oil Flash Tank. Temperatures throughout the plant dropped as rich oil flowed through the process. A low temperature trip of the lean oil pumps resulted, and the trip was not communicated to the plant supervisor for over an hour. A hand switch was actuated to decrease flow through exchanger GP905 in an attempt to restart the pumps. The heat exchanger ruptured due to low-temperature embrittlement, releasing a vapor cloud of gas and oil. The cloud traveled 170 meters to fired heaters before ignition occurred.
Instrumentation and Controls Gaps
- Hundreds to thousands of alarms occurred per day, many of them regarded as nuisance alarms
- Critical alarms were not prioritized
- Operators were desensitized and the alarm system was ineffective
- Operators and supervisors did not understand the consequences of their manual actions, and experienced engineers had been moved off-site
Key automation learning points
Alarm management reduces the number of alarms to only those requiring operator action. The risk reduction that normally results from a robustly managed safety alarm program is fully dependent on timely and correct operator response to the alarms. Chronically high process alarm rates in a facility, or a significant number of alarms that do not require action, promote the development of ineffective alarm response habits. [ANSI/ISA-18.2]
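Chronically high alarm rates can be made visible with basic counting over the alarm journal. The sketch below is one possible illustration, assuming the journal can be exported as (timestamp, tag) pairs. The tag names, the sample events, and the target of 3 alarms per day are hypothetical and chosen only to keep the example small; a site would substitute the performance targets adopted in its own alarm philosophy.

```python
from collections import Counter
from datetime import datetime

def alarms_per_day(alarm_times):
    """Count annunciated alarms per calendar day."""
    return Counter(t.date() for t in alarm_times)

def days_over_target(daily_counts, target_per_day):
    """Return the days whose alarm count exceeds the site-chosen target."""
    return sorted(day for day, n in daily_counts.items() if n > target_per_day)

def top_contributors(alarm_tags, n=3):
    """Return the n most frequent alarm tags, candidates for nuisance review."""
    return Counter(alarm_tags).most_common(n)

# Hypothetical alarm journal export: (timestamp, tag) pairs.
events = [
    (datetime(2024, 3, 1, 0, 5), "LAH-101"),
    (datetime(2024, 3, 1, 0, 9), "LAH-101"),
    (datetime(2024, 3, 1, 1, 30), "TAL-205"),
    (datetime(2024, 3, 1, 2, 10), "LAH-101"),
    (datetime(2024, 3, 2, 14, 0), "PAH-310"),
]

daily = alarms_per_day(t for t, _ in events)
over = days_over_target(daily, target_per_day=3)  # illustrative target only
worst = top_contributors([tag for _, tag in events], n=1)

print("Days over target:", over)
print("Most frequent alarm:", worst)
```

Listing the most frequent tags alongside the daily counts points the review at the alarms most likely to be nuisance alarms, which is where rationalization effort usually pays off first.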
As in most large events, there were many contributing factors. Among them were two significant deviations from common H&RA automation assumptions:
- The facility was operating in an alternate mode in which operators and front-line supervisors, in the absence of experienced engineers, did not correctly understand how the process conditions would respond to the manual actions they took.
- The alarm management system was ineffective, with chronically high levels of unprioritized alarms, fostering desensitization to and basic distrust of alarms by the operators.
With these deviations from standard H&RA assumptions, any Operator Response to Alarm protection layer would be very unlikely to succeed.
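The effect of an operator response layer that cannot succeed can be illustrated with the usual layer-of-protection arithmetic, in which the mitigated event frequency is the initiating event frequency multiplied by the probability of failure on demand (PFD) of each independent layer. The numbers below are hypothetical and are not taken from the Longford investigation. A PFD of 1.0 represents a layer that provides no risk reduction.

```python
from math import isclose, prod

def mitigated_frequency(initiating_frequency, layer_pfds):
    """Event frequency per year after applying independent protection layers."""
    pfds = list(layer_pfds)
    for pfd in pfds:
        if not 0.0 < pfd <= 1.0:
            raise ValueError("each PFD must be in the interval (0, 1]")
    return initiating_frequency * prod(pfds)

# All values are illustrative assumptions.
initiating = 0.1  # initiating events per year
credited = {
    "operator_response_to_alarm": 0.1,     # PFD credited in the analysis
    "safety_instrumented_function": 0.01,
}
# Same layers, but the alarm layer gives no risk reduction in practice.
actual = dict(credited, operator_response_to_alarm=1.0)

f_credited = mitigated_frequency(initiating, credited.values())
f_actual = mitigated_frequency(initiating, actual.values())

print(f"Credited frequency: {f_credited:.1e} per year")
print(f"Actual frequency:   {f_actual:.1e} per year")
print(f"Risk understated by a factor of {f_actual / f_credited:.0f}")
```

Under these assumed values, losing the alarm layer raises the event frequency by the full amount of risk reduction that was credited to it, which is why the underlying assumptions need to be verified rather than presumed.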
Sustainability action
Audit to confirm that the assumptions made during hazard and risk analysis are actually reflected in plant operation and that these assumptions remain valid over time.
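One way to keep such an audit repeatable is to hold the assumptions in a register with the date and evidence of the last verification. The following is a minimal sketch under stated assumptions: the `Assumption` record, the `needs_review` rule, the 365-day review interval, and the sample entries are all hypothetical and would be replaced by the site's own audit procedure.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Assumption:
    description: str
    last_verified: Optional[date]  # None means never verified
    evidence: str = ""             # reference to the supporting record

def needs_review(item, today, max_age_days):
    """Flag assumptions that lack evidence or whose verification is stale."""
    if item.last_verified is None or not item.evidence:
        return True
    return (today - item.last_verified) > timedelta(days=max_age_days)

# Hypothetical register entries (illustrative data only).
register = [
    Assumption("Effective alarm management program is in place",
               date(2024, 1, 15), "alarm performance report"),
    Assumption("Low demand rate on the instrumented safeguards",
               date(2022, 11, 1), "trip log review"),
    Assumption("Independence between causes and safeguards", None),
]

# The 365-day interval is an assumed value, not a requirement from a standard.
overdue = [a.description for a in register
           if needs_review(a, today=date(2024, 6, 1), max_age_days=365)]

for description in overdue:
    print("Review needed:", description)
```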
References:
- Hopkins, A. 2000. Lessons from Longford: The Esso Gas Plant Explosion. CCH Australia Limited.
- CCPS. 2017. Guidelines for Safe Automation of Chemical Processes, 2nd ed. New York: AIChE.
- Summers, Angela E., E. Roche, H. Jin, and M. Carter. 2015. “Incidents That Define Safe Automation.” Presented at the 61st International Instrumentation Symposium, Huntsville, Alabama, May.