Pawsey Supercomputing Research Centre
Update - setonix-04 reported a C_EC_CRIT error yesterday. It is not in the login pool, but HPE are stumped as to why this is happening.
Nov 07, 2025 - 13:40 AWST
Monitoring - HPE rebooted a number of Slingshot switches during maintenance.

We haven't observed any Slingshot errors on the login, data mover or visualisation nodes for 48 hours.

We will continue to monitor.

Nov 06, 2025 - 12:29 AWST
Update - HPE have provided no new information.
Oct 31, 2025 - 08:11 AWST
Update - HPE have provided no new information.
Oct 27, 2025 - 10:59 AWST
Update - HPE have provided no new information.
Oct 24, 2025 - 21:01 AWST
Update - HPE have provided no new information.

setonix-08 has Slingshot issues. Pawsey is rebooting it.

Oct 20, 2025 - 13:25 AWST
Update - setonix-02 and setonix-03 have been added back to the RR DNS.
Oct 16, 2025 - 14:09 AWST
Investigating - There appears to be an issue with the Slingshot interfaces on the Setonix login nodes. We appear to be down to one login node in the normal login pool.

We have had a case open with HPE for weeks, but they appear to be no closer to providing any kind of solution.

Please, please, please, please don't run any computationally intensive operations on the login nodes. We have lovely compute nodes for that.

Please be aware that you can log into setonix-workflow.pawsey.org.au and get access to additional "workflow" nodes.

Oct 16, 2025 - 12:02 AWST
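
The updates above mention login nodes being added back to the round-robin (RR) DNS. As a minimal sketch, assuming a login hostname that resolves to one address per pooled node, the following Python helper lists the addresses currently published for a name. The function name `resolve_pool` and its `resolver` parameter are hypothetical and used only for illustration; they are not Pawsey tooling.

```python
import socket


def resolve_pool(hostname, port=22, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses a hostname resolves to."""
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # sockaddr[0] is the address string for both IPv4 and IPv6 results.
    infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_pool("setonix-workflow.pawsey.org.au")`, or the login address you normally use, would show which addresses DNS is handing out at that moment. The number of addresses is only a rough indication of pool size: it says nothing about whether each node is healthy.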
Setonix Operational
Login nodes Operational
Data-mover nodes Operational
Slurm scheduler Operational
Setonix work partition Operational
Setonix debug partition Operational
Setonix long partition Operational
Setonix copy partition Operational
Setonix askaprt partition Operational
Setonix highmem partition Operational
Setonix gpu partition Operational
Setonix gpu high mem partition Operational
Setonix gpu debug partition Operational
Lustre filesystems Operational
/scratch filesystem Operational
/software filesystem Operational
/askapbuffer filesystem Operational
/askapingest filesystem Operational
Storage Systems Operational
Acacia Ingest Operational
Acacia MWA Operational
Acacia Projects Operational
Banksia Operational
Data Portal Systems Operational
CASDA Nodes Operational
MWA Nodes Operational
MWA ASVO Operational
ASKAP Operational
ASKAP ingest nodes Operational
ASKAP service nodes Operational
Central Services Operational
Authentication and Authorization Operational
Service Desk Operational
License Server Operational
Application Portal Operational
Origin Operational
/home filesystem Operational
/pawsey filesystem Operational
Central Slurm Database Operational
Documentation Operational
Visualisation Services Operational
Remote Vis Operational
Vis scheduler Operational
Setonix vis nodes Operational
Nebula vis nodes Operational
Visualisation Lab Operational
Reservation Operational
CARTA - Stable Operational
CARTA - Test Operational
Pawsey Remote VR Operational
The Australian Biocommons Operational
Fgenesh++ Operational
Nimbus - Legacy Operational
Ceph storage Operational
Nimbus instances Operational
Nimbus dashboard Operational
Nimbus APIs Operational

Scheduled Maintenance

Pawsey Scheduled Maintenance (December) Dec 2, 2025 09:00-17:00 AWST

Maintenance will be carried out on Pawsey systems on Tuesday 2 December to apply required patches and updates that improve the systems' stability, security, and performance. This maintenance window will also be used for other tasks that require downtime.

Planned work for this window includes:
• The firewall will have security profiles applied to increase visibility and threat prevention coverage.
• Firewall security policy clean up.
• Preemptive high availability election on firewalls will be enabled.
• HPE will re-cable the temperature sensor in rack x1001 of Setonix.
• HPE will perform coolant sampling of Setonix.
• Update to the cli-filter on Setonix to support future GPU power control operations.
• Install CPU and GPU versions of NAMD 3.0.2 on Setonix.
• /home on ASKAP Ingest and Ella will be replaced with a NetApp (as well as /software on ASKAP Ingest).
• Block port 5000 externally on Acacia Projects.
• Acacia Ingest will continue migration work off Puppet infrastructure.
• Banksia will be moving to a new VLAN.
• Change over to new Kafka Production server for event notifications on Banksia.
• Patching of visualisation services will be undertaken.
• Patching of core Pawsey services will be undertaken.

If you have any questions, please contact help@pawsey.org.au.

Posted on Nov 25, 2025 - 09:49 AWST
Nov 25, 2025
Resolved - Setonix has been stable overnight.
Nov 25, 06:47 AWST
Monitoring - HPE have returned all CDUs to operation. We will monitor overnight.
Nov 24, 17:33 AWST
Update - HPE have handed Setonix back to Pawsey. We are performing final checks.
Nov 24, 14:49 AWST
Identified - CBIS have returned cooling to service. HPE are currently checking the Cooling Distribution Units in Setonix and will be rebooting nodes shortly.
Nov 24, 13:50 AWST
Investigating - Pawsey has had a partial loss in cooling. It appears that the Setonix CPU and GPU nodes have been powered off.

We will provide updates when we have more information.

Nov 24, 12:52 AWST
Nov 24, 2025
Nov 23, 2025

No incidents reported.

Nov 22, 2025

No incidents reported.

Nov 21, 2025

No incidents reported.

Nov 20, 2025

No incidents reported.

Nov 19, 2025

No incidents reported.

Nov 18, 2025

No incidents reported.

Nov 17, 2025
Resolved - This incident has been resolved.
Nov 17, 12:44 AWST
Monitoring - Transfer speeds have improved after Tuesday's maintenance. Pawsey continues to monitor the performance of this service.
Nov 6, 11:06 AWST
Identified - Pawsey is working closely with the Mediaflux vendor to resolve the issue.
Oct 30, 14:07 AWST
Investigating - We are currently investigating an issue with the Pawsey data portal (storage.pawsey.org.au) and are working with the product vendor on it. Until we settle on a fix and implement it, please be advised that portal performance is suboptimal and there may be timeouts when staging data from Banksia tape to online via the data portal web GUI or with pshell.
Oct 27, 10:17 AWST
Nov 16, 2025

No incidents reported.

Nov 15, 2025

No incidents reported.

Nov 14, 2025

No incidents reported.

Nov 13, 2025

No incidents reported.

Nov 12, 2025

No incidents reported.

Nov 11, 2025
Completed - The scheduled maintenance has been completed.
Nov 11, 15:00 AWST
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Nov 11, 14:00 AWST
Scheduled - We plan to address a high-priority security bug in the authentication subsystem of Acacia Ingest. The service will remain available throughout, but there is a possibility that some S3 access keys will no longer work. All research groups who are potentially affected have been contacted.

If you encounter any problems, please contact help@pawsey.org.au.

Nov 11, 09:40 AWST