Possible blade failure in Galaxy
Incident Report for Pawsey Supercomputing Centre
Resolved
The affected hardware was returned to service some time ago, once the vendor supply chain issues were resolved.
Posted May 31, 2021 - 12:36 AWST
Monitoring
We are still waiting for replacement parts to fix this issue. Our vendor's supply chain has been affected by COVID-19, but we expect the parts soon.
Posted Dec 07, 2020 - 15:31 AWST
Investigating
Automated alerting has just indicated there may be a high-speed network issue in Galaxy. It looks like one of the blades (containing nodes nid00500 to nid00503) went offline around 6pm Perth time.

Staff will investigate in the morning, but jobs that were running at that time may be impacted.
Posted Oct 21, 2020 - 18:10 AWST
This incident affected: Galaxy (Galaxy Compute nodes).