Data Center Reason for Outage Summary
Jul 16, 2013

Power Outage Event June 29th – Sequence of Events and Actions – Data Center Reason for Outage (RFO)

JaguarPC has now been provided with additional details about the loss of power that began on 06/29/13 and continued into the first days of July. We hope that the following series of events helps our clients better understand the situation in its entirety. Thank you again for your continued loyalty during this unfortunate series of mishaps.

RFO (Reason for Outage) Regarding power loss that occurred on June 29, 2013.

The following information details the power outage that occurred on June 29, 2013. During this event the data center experienced several power outages as well as generator failures that resulted in a loss of power to the data center and to all customers located at this facility.

6/29/2013:
Main utility power failure.
Supervisor on shift escalates the situation as load transitions to generator.
On generator approximately 20 minutes. Utility power not yet restored.
Generator shuts down in low coolant state even though temperature is within normal operating specifications.
Engineers investigate all breakers on main utility power and switch gear while on UPS power.
UPS power system exhausted.
Confirmation that generator contractors are on call and standing by.
Determination made that a 600A ground fault tripped within the facility basement.
Generator technicians arrive on site and begin work.
Generator technician determines the problem to be a faulty sensor/sending unit for the generator coolant level, and states that this is a problem common to Kohler generators.
Sensor disconnected by generator technician.
Generator restarted; stable running condition without load achieved.
Load transferred to generator.
Generator enters overheat state.
Data center and generator technicians supplement cooling with water on generator radiator coils.
Generator below 180℉ per digital displays.
Generator shuts down with overheat condition.
Attempt made to transfer back to utility power.
600A breaker on utility feed fails.
Technicians begin looking for any ground faults.
600A breaker encounters another fault.
Source of ground fault located in A/C unit compressor on CRAC #3.
A/C unit isolated.
Fan circulation initiated to compensate for loss of CRAC #3.
Temperature stabilized at normal levels.
Utility power restored.
Generator cooling issue evaluated.
Cooling system hooked into city water to maintain temperature control.
Replacement components sourced.

6/30/2013:
600A utility main breaker trips.
Generator activates, maintaining power.
Generator shuts down with “overheat” warning.
Generator begins forcibly ejecting water mixed with motor oil.
Generator and power contractors called in.
Temporary, external generator ordered as a precautionary measure.
Contractors arrive.
Electrical contractor determines that the 600A breaker is defective, as it was extremely hot under only 70% load.
Temporary, external generator arrives and is connected.
Temporary, external generator powered up to restore power.
UPS batteries begin charging.
Utility power restored.
Generator contractor determines that the existing generator will need a full rebuild due to serious internal motor damage.
A/C contractor arrives with new compressor to rebuild CRAC #3.
Generator contractor arrives to dismantle and begin repairs on generator.
It is determined that a thermostat inside the generator's cooling system failed. This failure stopped coolant circulation and caused the subsequent overheat condition.
Examination determines that it is not economically viable to repair the generator.
Temporary, external generator refueled.
Electrical contractor arrives with replacement 600A breaker for primary power panel.
Temporary, external generator switched on to power facility while 600A breaker is replaced.
Utility power restored.
Utility power and breaker operating with proper stability.
A/C contractor completes replacement of compressor on CRAC #3.
A/C returned to 100% capacity.

7/3/2013:
Replacement 500kW generator is located, purchased, and shipping is arranged.
7/8/2013:
Empower Engineering & JE Dunn are brought in to assist with planning and coordinating engineering and construction work for the installation of the replacement 500kW generator.
Schedule arranged with subcontractors for demolition of the existing generator installation, set to begin on July 10, 2013.
Completion of new generator installation set for Thursday, July 18, 2013.

Root Causes:
The root cause of this outage was a bad A/C compressor in CRAC #3. This exacerbated a fault in a 600A utility power breaker in the main power entry room at the building. These two items are ultimately what caused line power to fail, resulting in the subsequent startup of the facility generator system. Failure of a coolant thermostat within the generator is ultimately what caused the failure and destruction of the generator.

Corrective Actions:
Engineers have replaced our existing 350kW diesel generator with a new 500kW Kohler diesel generator. As part of the replacement, the new generator will be commissioned and tested by Kohler factory personnel. After the critical fault in our existing generator system, it was determined that replacing the generator is the best course of action. As part of this replacement, generator capacity increases from 350kW to 500kW, which brings future plans ahead of schedule. Also of note, the 600A utility breaker that failed will be replaced with a higher-capacity 800A breaker to better match the capabilities of the new generator. In addition to these measures, the data center has been in the process of rolling out 2N power availability to the facility per expansion plans announced in June 2013.

Reason For Outage Regarding power loss that occurred on July 12, 2013.

The following document details the power outage that occurred on July 12, 2013. Previously, the temporary, external generator in use during this event had been load tested by engineers from the generator contractor, Prime Power.
In that test, the generator sustained the full load of the entire data center for over two hours without problem. During the July 12 event the data center was powered by this same temporary, external generator. Unfortunately, it proved unable to sustain the required load and failed, which resulted in another power failure for our services. The following timeline includes what happened, when it happened, and what steps were taken to remedy the situation.

7/12/2013 Load transfer from utility power to temporary, external generator.
7/12/2013 Electricians begin planned maintenance to replace electrical panels in the facility electrical room, upgrading the main utility sizing in anticipation of the installation of the new, larger generator.
7/12/2013 Facility running on temporary, external generator approximately 45 minutes.
7/12/2013 Temporary, external generator shuts down and data center goes to UPS. UPS batteries are exhausted and modification to primary power is not yet complete.
7/12/2013 Generator contractor (Prime Power) summoned to investigate.
7/12/2013 Temporary, external generator found to be incapable of sustaining power levels required.
7/12/2013 Electricians complete primary power feed upgrade from 600A to 800A and data center restored to main utility power.
7/12/2013 New temporary, roll-up generator ordered.
7/12/2013 1.5MW temporary, external generator arrives.
7/12/2013 Generator contractor (Prime Power) replaces previous temporary, external generator with new 1.5MW roll-up generator. Data center continues to operate on upgraded primary power.
7/12/2013 Generator power restored.

Root Causes:
This outage occurred during replacement and upgrade of utility power breakers and subsequent demolition of the old generator system and installation. The generator contractor had put in place a temporary, external generator.
This outage was caused by our generator contractor employing an external generator system that was unable to sustain the load placed upon it by our data center operation during replacement of a main utility breaker.

Corrective Actions:
To rectify this situation, the decision was made to employ a 1.5MW generator for temporary, as-needed power at the facility. This roll-up generator will be on-site and wired into the data center power feed until all generator work is complete. Further, the new 500kW generator provides greater capacity than the unit it replaces.