WECC Human Performance Work Group Event Analysis
Norm Szczepanski, SMUD
Shawn Halverson, BPA
Western Electricity Coordinating Council
2 Agenda
• WECC HPWG
• Event
• Perspective from operations
• Perspective from the field
• Corrective actions
• Lessons learned
3 Purpose: The Human Performance Work Group (HPWG) provides common vocabularies, tools, techniques, and training materials to assist Bulk Power System (BPS) Operations and Field personnel and to promote the sustainability of Human Performance Improvement practices.
4 Process: The Human Performance Work Group attends monthly review sessions to analyze operating events from a Human Performance perspective and share any Human Performance lessons learned.
5 Goal: By sharing Human Performance lessons learned from these operating events, the group passes along information that will help others avoid the same or similar situations.
6 How much of the time do we err?
“To Err Is 90 Percent Human” (Why We Make Mistakes, Joseph T. Hallinan)
7 How much of the time do we err?
8 Lesson Learned: Loss of communication to multiple SCADA RTUs at a Switching Center
9 Why is this event a good Human Performance Lesson Learned?
This example shows how human error created a system condition that lay undetected until specific circumstances arose. The resulting event had wide-reaching impacts on both Control Center Operations and Field personnel.
10 Event Description
Grid Operations lost communications with multiple substation Remote Terminal Units (RTUs) that were routed through a Switching Center Energy Management System (EMS) platform.
11 Event Description
A total of 87 RTUs were impacted, including 34 Bulk Electric System (BES) RTUs. The affected substations have operating voltages ranging from 4 kV to 500 kV. This resulted in the loss of Supervisory Control and Data Acquisition (SCADA) functionality. The total event duration was 78 minutes.
12 Context behind this Event
• The Switching Center that contained the EMS platform for the RTU communication had been relocated to a newly constructed control room approximately 2 months before this event.
• The power for this EMS platform was routed through an uninterruptible power supply (UPS) that was a new and different model from those at other Switching Center facilities.
13 Context behind this Event
• Transformer #3 was taken out of service. Its tertiary winding supplied the primary station service power source.
• This triggered a “UPS General Alarm” that was acknowledged by both the Switching Center System Operator and the Transmission Dispatcher.
14 Context behind this Event
• The Switching Center System Operator determined that a Substation Operator needed to be called out to investigate the alarm.
• However, due to other switching taking place, the Dispatch request was never issued and the cause of the UPS General Alarm was not investigated.
15 Context behind this Event
• About 5½ hours later, communications with the 87 RTUs were lost: the UPS system had been operating on its backup battery and finally ran out of power to run the communications equipment.
16 Context behind this Event
• A Technician was sent to determine the cause of the power failure and found the main circuit breaker to the UPS tripped.
• The Technician reset and closed this main circuit breaker, restoring power to the RTU communications equipment.
17 EMS UPS Cabinet
18 EMS UPS Main CB (inside UPS cabinet)
19 Upon Further Review
• An inspection determined that the system Auto/Manual Restart switch was set to the “Manual” position (a factory default setting that the vendor was unaware of and did not correct while placing the new system in service).
• In this configuration, the UPS system is designed to trip the main CB on a momentary loss of AC power, such as that experienced during an automatic transfer of station service.
20 Upon Further Review
• The vendor moved the switch to the “Auto” position, which ensures that the main circuit breaker remains closed and the UPS transfers to the alternate AC power supply.
• Subsequent local testing verified that the UPS system automatic transfer switch was functional and set appropriately.
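The restart-switch behavior described above can be sketched as a small model. This is an illustrative sketch only, not the vendor's logic: the names `RestartMode`, `UpsState`, and `on_momentary_ac_loss` are hypothetical, and the model assumes that a tripped main CB leaves the load on battery until someone resets the breaker locally, as happened in this event.

```python
from dataclasses import dataclass
from enum import Enum


class RestartMode(Enum):
    MANUAL = "manual"  # factory default found during the inspection
    AUTO = "auto"      # position selected by the vendor after the event


@dataclass
class UpsState:
    main_cb_closed: bool
    power_source: str  # "alternate_ac" or "battery" in this sketch


def on_momentary_ac_loss(mode: RestartMode) -> UpsState:
    """Hypothetical model of the UPS response to a momentary loss of AC power."""
    if mode is RestartMode.MANUAL:
        # Main CB trips; the load stays on battery until the CB is reset locally.
        return UpsState(main_cb_closed=False, power_source="battery")
    # Main CB stays closed and the UPS transfers to the alternate AC supply.
    return UpsState(main_cb_closed=True, power_source="alternate_ac")


if __name__ == "__main__":
    for mode in RestartMode:
        print(mode.value, on_momentary_ac_loss(mode))
```

In this model the only difference between the two outcomes is the switch position, which is why verifying the “As Left” setting matters.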
21 HP Perspective: “Testing and Energization of new equipment”
• The period during which new equipment is being installed provides an opportunity for local Technicians and Substation Operators to work with vendors and receive training on the new equipment.
22 HP Perspective: “Testing and Energization of new equipment”
• The “As Left” condition of newly installed equipment must be understood, and it must be verified that the equipment will operate properly when called upon to perform its required function.
23 HP Perspective: “Equipment has become more advanced and complicated”
• Physical control switches have often been replaced with logic buried in menus on a display screen. As a result, unwanted factory default settings can be overlooked, and equipment may not operate as expected.
24 HP Perspective: “Equipment has become more advanced and complicated”
• Control / Selector switches may be located in areas that are not normally inspected on a routine basis. This can lead to a condition that is not readily visible to the Technician or Substation Operator.
25 HP Perspective: “Understanding the meaning of Alarms”
• In several circumstances, alarms are ganged together to produce one alarm point, which requires in-depth local troubleshooting to determine the actual problem.
• Alarm nomenclature can be misleading as to the actual problem or the severity of the condition. In this case, a “UPS General Alarm” came in after the loss of primary AC power to the UPS, but there was no apparent power system trouble.
26 HP Perspective: “Understanding the meaning of Alarms”
• It was not clear that the communication equipment was running on backup battery power until the RTU communication failed.
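The effect of ganging alarms into one point can be illustrated with a short sketch. The sub-condition names in `GANGED_CONDITIONS` are hypothetical and chosen only for illustration; the actual inputs to the “UPS General Alarm” in this event are not listed in the source.

```python
from typing import Dict, List

# Hypothetical sub-conditions ganged into a single "UPS General Alarm" point.
GANGED_CONDITIONS = [
    "loss_of_primary_ac",
    "on_battery",
    "main_cb_tripped",
    "low_battery",
]


def general_alarm(conditions: Dict[str, bool]) -> bool:
    """What the remote operator sees: one point, true if any input is active."""
    return any(conditions.get(name, False) for name in GANGED_CONDITIONS)


def active_conditions(conditions: Dict[str, bool]) -> List[str]:
    """What local troubleshooting reveals: which inputs are actually active."""
    return [name for name in GANGED_CONDITIONS if conditions.get(name, False)]


if __name__ == "__main__":
    transfer_only = {"loss_of_primary_ac": True}
    running_down = {"on_battery": True, "main_cb_tripped": True}
    # Both cases raise the same single alarm point remotely.
    print(general_alarm(transfer_only), general_alarm(running_down))
    # Only the local view distinguishes them.
    print(active_conditions(transfer_only))
    print(active_conditions(running_down))
```

Under these assumptions, a routine station service transfer and a UPS running down its battery look identical at the control center, which matches the difficulty described above.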