Customer impact: Servers hard down for customers in Rack D08
Outage Detail: Early Monday morning we experienced a power failure in Rack D08, which took all servers in the rack offline. The failure was caused by a faulty Automatic Transfer Switch (ATS), the device that provides power redundancy within the rack. The root cause of the ATS failure is still under investigation.
We reset the ATS and restarted the affected machines.
The ATS consists of two separate 16 A power banks. Consider moving half of the machines to the other power bank, so that a failure of one bank is less likely to take down the whole rack and affects fewer servers if it does occur.
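
As a rough illustration of the proposed split, the sketch below assigns machines to the two banks so that the load on each stays roughly equal and under the 16 A rating. The server names and per-server current draws are hypothetical placeholders, not measurements from Rack D08, and the greedy heaviest-first assignment is just one simple way to balance the banks.

```python
BANK_CAPACITY_AMPS = 16.0  # per-bank rating stated in the report


def split_across_banks(draw_amps):
    """Greedily assign machines to bank A or B, heaviest machine first.

    draw_amps maps a machine name to its current draw in amps
    (hypothetical values; real figures would come from measurement).
    """
    banks = {"A": [], "B": []}
    load = {"A": 0.0, "B": 0.0}
    ordered = sorted(draw_amps.items(), key=lambda kv: kv[1], reverse=True)
    for name, amps in ordered:
        target = min(load, key=load.get)  # bank with the lower load so far
        banks[target].append(name)
        load[target] += amps
    for bank, total in load.items():
        if total > BANK_CAPACITY_AMPS:
            raise ValueError(f"bank {bank} would draw {total} A, above its rating")
    return banks, load


# Hypothetical per-server draw in amps.
example_draw = {
    "srv01": 3.0, "srv02": 2.5, "srv03": 2.5,
    "srv04": 2.0, "srv05": 1.5, "srv06": 1.5,
}
banks, load = split_across_banks(example_draw)
print(banks)
print(load)
```

With the placeholder figures above, each bank ends up carrying half of the total draw. Whether an even split is actually achievable for Rack D08 depends on the real per-server draw, which this report does not state.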