Setonix Compute Node partial outage
Resolved
The booted nodes have been returned to service. HPE are continuing to investigate the cause to ensure it doesn't happen again.
Posted Nov 13, 2023 - 10:59 AWST
Investigating
Over the weekend, two of the cabinets of CPU compute nodes in Setonix performed an emergency power off, so we have been running at reduced capacity. Onsite HPE engineers have been investigating but have so far been unable to determine the cause. They are currently powering up the cabinets and testing them. Once they have finished, they will hand the cabinets back to Pawsey, and we will test them and return them to service. The nodes should be available within a few hours. HPE will continue to investigate the cause.
Posted Nov 13, 2023 - 09:16 AWST
This incident affected: Setonix (Setonix work partition).