Pawsey Supercomputing Research Centre

All Systems Operational

Setonix Operational
Login nodes Operational
Data-mover nodes Operational
Slurm scheduler Operational
Setonix work partition Operational
Setonix debug partition Operational
Setonix long partition Operational
Setonix copy partition Operational
Setonix askaprt partition Operational
Setonix highmem partition Operational
Setonix gpu partition Operational
Setonix gpu high mem partition Operational
Setonix gpu debug partition Operational
Lustre filesystems Operational
/scratch filesystem Operational
/software filesystem Operational
/askapbuffer filesystem Operational
/askapingest filesystem Operational
Storage Systems Operational
Acacia Ingest Operational
Acacia MWA Operational
Acacia Projects Operational
Banksia Operational
Data Portal Systems Operational
CASDA Nodes Operational
MWA Nodes Operational
MWA ASVO Operational
ASKAP Operational
ASKAP ingest nodes Operational
ASKAP service nodes Operational
Central Services Operational
Authentication and Authorization Operational
Service Desk Operational
License Server Operational
Application Portal Operational
Origin Operational
/home filesystem Operational
/pawsey filesystem Operational
Central Slurm Database Operational
Documentation Operational
Visualisation Services Operational
Remote Vis Operational
Vis scheduler Operational
Setonix vis nodes Operational
Nebula vis nodes Operational
Visualisation Lab Operational
Reservation Operational
CARTA - Stable Operational
CARTA - Test Operational
Pawsey Remote VR Operational
The Australian Biocommons Operational
Fgenesh++ Operational
Nimbus - Legacy Operational
Ceph storage Operational
Nimbus instances Operational
Nimbus dashboard Operational
Nimbus APIs Operational
Aug 13, 2025

No incidents reported today.

Aug 12, 2025

No incidents reported.

Aug 11, 2025

No incidents reported.

Aug 10, 2025

No incidents reported.

Aug 9, 2025

No incidents reported.

Aug 8, 2025

No incidents reported.

Aug 7, 2025

No incidents reported.

Aug 6, 2025

No incidents reported.

Aug 5, 2025

Completed - All services have been returned to service. As always, some hardware played nicely and some less so, so the Pawsey team will continue to work in the background to remediate any hardware issues identified during maintenance.

Please note that the maximum NVMe allocation for non-exclusive jobs on Setonix GPU nodes is now 2604 GiB. Exclusive jobs can use up to 3500 GiB.
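
For anyone sizing job requests against these limits, a minimal Python sketch of the check follows. The function name `nvme_request_fits` and the constant names are hypothetical and for illustration only; they are not part of Slurm or any Pawsey tool, and the scheduler remains the authority on what is accepted.

```python
# Limits quoted in this maintenance update for Setonix GPU nodes (GiB).
NON_EXCLUSIVE_MAX_GIB = 2604
EXCLUSIVE_MAX_GIB = 3500


def nvme_request_fits(requested_gib: int, exclusive: bool = False) -> bool:
    """Return True if a per-node NVMe request is within the quoted limit.

    Hypothetical helper: it only compares a number against the limits
    stated above and does not talk to the scheduler.
    """
    limit = EXCLUSIVE_MAX_GIB if exclusive else NON_EXCLUSIVE_MAX_GIB
    return 0 < requested_gib <= limit


if __name__ == "__main__":
    for size, exclusive in [(2604, False), (3000, False), (3000, True)]:
        verdict = "fits" if nvme_request_fits(size, exclusive) else "exceeds the limit"
        mode = "exclusive" if exclusive else "non-exclusive"
        print(f"{size} GiB ({mode}): {verdict}")
```

Under these assumptions, a 3000 GiB request only fits when the job is exclusive.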

As announced earlier, the Cray Programming Environment (CPE) on Setonix has been updated to version 25.03 and the new version of the software stack 2025.08 is now available by default. For more information, please see: https://pawsey.atlassian.net/wiki/x/AQBhN

We are done! Huzzah (clink, clink).

Big shout out to the entire Pawsey team who helped with maintenance (nothing should be read into omissions): Ilkhom and Craig for shepherding the new software stack and making sure it worked; Ahmed for patching all the vis things like there is no tomorrow; Kevin for lending a hand (and some wisdom) to help the Services Team; Mahitha and Jeffrey for looking after Banksia while Chris is having fun in Trump land; and Audrey, Greg and Michael for making sure Acacia pleases the security gods.

And Sumitha and Boney, the superstars and unsung heroes, for patching all the critical services that every other Pawsey service relies on. Your fellow yellow minions salute you.

And Stacy, for asking "are we there yet?"

As always (and despite the accusations to the contrary), Pawsey is here to help. If you run into any difficulties (no matter the size), reach out to help@pawsey.org.au and one of our friendly yellow minions will try to help. Please and thank you.

Aug 5, 17:16 AWST

Update - Acacia (Ingest, Projects and MWA) have been returned to service.
Aug 5, 15:57 AWST

Update - Setonix has been returned to service.

As announced earlier, the Cray Programming Environment (CPE) on Setonix has been updated to version 25.03 and the new version of the software stack 2025.08 is now available by default. For more information, please see:
https://pawsey.atlassian.net/wiki/x/AQBhN

Aug 5, 14:28 AWST

In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Aug 5, 09:00 AWST

Scheduled - Maintenance will be carried out on Pawsey systems on Tuesday 5 August to apply required patches and updates that improve system stability, security, and performance. This maintenance window will also be used to undertake other tasks that require downtime.

Planned work for this window includes:
• Setonix will be upgraded to SUSE Linux Enterprise Server 15 SP6 which provides bug fixes and security patches.
• Setonix GPU nodes NVMe configuration update.
• Slurm Controller (ASKAP) and Slurm Database Controller will have operating system upgrades.
• ASKAP Buffer will have iDRAC updates applied.
• Banksia will have a ScoutAM upgrade.
• Upgrade of one of the DDN 7990 arrays connected to Banksia.
• Nebula nodes will have iDRAC and BIOS updates applied.
• Patching of visualisation services will be undertaken.
• Patching of core Pawsey services will be undertaken.

CPE will be upgraded (including upgrades to the MPI libraries, GCC compiler and Cray compiler), and a new software stack will be deployed on Setonix and become the default. The Setonix Visualisation nodes will continue to use the same software stack. More details are available on our documentation page (August 2025 Software Update - Important Information).

We expect to be able to bring all services back by the end of the day. If you have any questions, please contact help@pawsey.org.au.

Jul 29, 12:04 AWST

Aug 4, 2025

No incidents reported.

Aug 3, 2025

No incidents reported.

Aug 2, 2025

No incidents reported.

Aug 1, 2025

No incidents reported.

Jul 31, 2025

No incidents reported.

Jul 30, 2025

No incidents reported.