HPE handed Phase 1 of Setonix back to Pawsey this morning. Pawsey has completed its testing of the system and has returned it to service.
When HPE updated the Slurm configuration to add the Phase 2 CPU and GPU nodes, they wiped all the jobs that were sitting in the queue before Setonix was put into maintenance. These jobs will have to be resubmitted to the queue.
We apologise for the inconvenience and have put measures in place to prevent this from recurring.
Please note that the /scratch purge policy is scheduled to commence on 15 April 2023 (i.e. any file that has not been accessed since 15 March 2023 will be subject to removal).
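For users who want to check which of their files could fall under the purge policy noted above, the following is a minimal Python sketch that lists regular files whose last access time is earlier than 15 March 2023. It is not a Pawsey-provided tool: the function name `files_not_accessed_since`, the choice of midnight AWST as the cutoff, and the directory passed on the command line are all assumptions made for illustration, and the actual purge may apply different criteria.

```python
import os
import stat
import sys
from datetime import datetime, timedelta, timezone

# Cutoff date taken from the notice: 15 March 2023.
# Assumption: interpreted as midnight AWST (UTC+8); the notice gives only the date.
AWST = timezone(timedelta(hours=8))
CUTOFF = datetime(2023, 3, 15, tzinfo=AWST).timestamp()


def files_not_accessed_since(root, cutoff_ts):
    """Yield (path, atime) for regular files under root last accessed before cutoff_ts."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(info.st_mode) and info.st_atime < cutoff_ts:
                yield path, info.st_atime


if __name__ == "__main__" and len(sys.argv) > 1:
    # The directory to scan is supplied by the user (hypothetical example:
    # their own project directory on /scratch).
    for path, atime in files_not_accessed_since(sys.argv[1], CUTOFF):
        print(datetime.fromtimestamp(atime, AWST).date(), path)
```

Run against a directory of your choice, the script prints the last access date and path of each matching file. Walking a very large directory tree this way can be slow, so it may be better to scan individual subdirectories.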
HPE have handed over Phase 1 to Pawsey (8:30 AM AWST). We are working as quickly as possible to verify hardware and ensure the configuration changes haven't affected the Pawsey running environment.
Please note that, when adding the Phase 2 nodes to the Slurm configuration, HPE wiped the jobs that were sitting in the queue prior to maintenance. We were unaware this was a risk, and HPE have been unable to restore the queue state.
Posted Mar 13, 2023 - 09:17 AWST
Scheduled maintenance is still in progress. We will provide updates as necessary.
Posted Mar 10, 2023 - 10:03 AWST
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Posted Mar 07, 2023 - 09:40 AWST
An extended outage is required to allow both an update to the NEO software on /scratch and /software and the final Slingshot integration tasks. This outage is planned to last 7 days and, if the planned works are completed successfully, it will be the final outage required to integrate Setonix Phases 1 and 2.
Posted Mar 07, 2023 - 09:39 AWST
This scheduled maintenance affected: Setonix (Login nodes, Data-mover nodes, Slurm scheduler, Setonix work partition, Setonix debug partition, Setonix long partition, Setonix copy partition, Setonix askaprt partition, Setonix highmem partition).