Resolved -
This incident has been resolved.
Jun 5, 08:08 AWST
Monitoring -
The reservation on the system has been lifted. We (Pawsey) will monitor the system overnight.
Jun 3, 18:40 AWST
Update -
The onsite HPE team (together with HPE L3 support) are satisfied that Setonix is now fully back up and stable. HPE will continue to monitor the system and will check it again later this evening and tomorrow morning.
Jun 3, 17:31 AWST
Update -
HPE is still working on getting the six leader nodes and the /gluster filesystem synced and healthy.
HPE L3 are assisting local engineers to resolve the problem.
Once it is resolved, HPE will reboot all the compute/GPU nodes to return them to a production-ready state.
Jun 3, 15:31 AWST
Update -
HPE have provided no further updates.
Jun 3, 15:13 AWST
Identified -
HPE were able to get the leader nodes back up. HPE are going to reboot them to make sure there are no further problems.
Jun 3, 14:17 AWST
Update -
A critical issue has been raised with the vendor. We are waiting for a response.
Jun 3, 13:04 AWST
Investigating -
Because of the work performed by HPE yesterday, there appears to be a name resolution issue on Setonix.
Pawsey staff are attempting a workaround.
Jun 3, 12:44 AWST