Pawsey Scheduled Maintenance (June)
Scheduled Maintenance Report for Pawsey Supercomputing Research Centre
Completed
• Banksia has been returned to service
• ASKAP Ingest has been returned to service
• Garrawarla has been returned to service
• Setonix has been returned to service
• Core services have been patched

As announced earlier, the Cray Programming Environment (CPE) on Setonix has been updated to version 23.09, and the new software stack, version 2024.05, is now available by default. For more information, please see: https://pawsey.atlassian.net/wiki/spaces/US/pages/190087173/June+2024+Software+Update+-+Important+Information
Posted Jun 04, 2024 - 18:11 AWST
In progress
• Banksia has been returned to service
• ASKAP Ingest has been returned to service
• Garrawarla has been returned to service
• Core services have been patched
• Setonix is currently in the last throes of testing the new software stack
Posted Jun 04, 2024 - 16:10 AWST
Scheduled
Maintenance will be carried out on Pawsey systems on Tuesday the 4th of June to apply required patches and updates that improve system stability, security, and performance. This maintenance window will also be used for other tasks that require downtime.

Planned work for this window includes:
• Update the BIOS on the askapbuffer Lustre servers
• Update the controller firmware on the askapbuffer storage arrays
• Update the LNet Monitoring resource agents on askapbuffer
• Update the HCA firmware on the Setonix LNet routers
• Update the BIOS on the Garrawarla data mover nodes
• Rectify cabling on the DDN arrays for Banksia
• Patch core Pawsey services

During the upcoming June maintenance, the Cray Programming Environment (CPE) on Setonix will be updated to version 23.09. This will include updated MPI libraries and a newer Cray compiler version (16.0.1); the GCC compiler version will remain the same. In preparation for this, the Pawsey team has built the new software stack.

The new software stack, version 2024.05, will be available by default and will sit alongside version 2023.08, which is currently available on Setonix. You can still choose to use the older 2023.08 deployment by unloading the compiler module, swapping the pawseyenv module, and then reloading the compiler module. More detail is available on our documentation page: June 2024 Software Update - Important Information.
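The three-step switch described above can be sketched as a short helper that prints the module commands in order. This is an illustration only, not the documented procedure: the helper name `stack_switch_commands`, the compiler module name `gcc`, and the `pawseyenv/<version>` naming are assumptions, so please confirm the exact module names on the documentation page before running anything on Setonix.

```python
def stack_switch_commands(compiler_module, current_stack, target_stack):
    """Return, in order, the module commands for switching software stacks.

    Hypothetical helper: the compiler module name and the
    'pawseyenv/<version>' naming are assumptions, not confirmed names.
    """
    return [
        # 1. Unload the compiler module first.
        f"module unload {compiler_module}",
        # 2. Swap the pawseyenv module from the current stack to the target.
        f"module swap pawseyenv/{current_stack} pawseyenv/{target_stack}",
        # 3. Reload the compiler module so it resolves against the target stack.
        f"module load {compiler_module}",
    ]


# Example: move from the default 2024.05 stack back to 2023.08.
for command in stack_switch_commands("gcc", "2024.05", "2023.08"):
    print(command)
```

The order matters in this sketch because the compiler module is unloaded before the stack is swapped and only reloaded afterwards, mirroring the sequence given in the announcement.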

Based on our testing, researchers should be able to use the existing local software installations without recompilation, but there may be exceptions. In such cases, feel free to reach out to the helpdesk for support.

Please note that the 2022.11 software stack is currently planned to be deprecated during the September 2024 maintenance, with the 2023.08 software stack planned to be deprecated in January 2025. We recommend that all researchers migrate to the 2024.05 software stack as soon as possible.

Highlights of the new software stack:
• Software has been installed using a newer Spack major release, v0.21.0.
• Several packages are now also compiled using the Cray compiler.
• Version updates of most packages in both the GNU and Cray programming environments.
• The cpe/23.09 module has been used to compile the 2024.05 stack. The cpe/23.03 module will also be available as a non-default environment.
• Several newer versions of ROCm up to 5.7.3 will be available.
• ROCm 5.7.3 is recommended and has been used to build GPU packages.

We will publish the changes in more detail in our technical newsletter. Please monitor the documentation page above for further details as the maintenance approaches.

Please note that there won’t be any changes to the Cray Programming Environment (CPE) on the Setonix Remote Visualisation Nodes. We are aiming to update the CPE and the software stack on those nodes during the July maintenance.
Posted May 28, 2024 - 09:03 AWST
This scheduled maintenance affected: ASKAP (ASKAP ingest nodes, ASKAP service nodes), Central Services (Authentication and Authorization, Service Desk, License Server, Application Portal, Origin, Documentation), Garrawarla (Garrawarla workq partition, Garrawarla gpuq partition, Garrawarla asvoq partition, Garrawarla copyq partition, Garrawarla login node), Setonix (Login nodes, Data-mover nodes, Slurm scheduler, Setonix work partition, Setonix debug partition, Setonix long partition, Setonix copy partition, Setonix askaprt partition, Setonix highmem partition, Setonix gpu partition, Setonix gpu high mem partition, Setonix gpu debug partition), Storage Systems (Banksia, Data Portal Systems, MWA Nodes, CASDA Nodes, MWA ASVO), Lustre filesystems (/askapbuffer filesystem), and Visualisation Services (CARTA - Stable, CARTA - Test).