Pawsey Supercomputing Research Centre

Partially Degraded Service

Setonix Degraded Performance
Login nodes Operational
Data-mover nodes Operational
Slurm scheduler Operational
Setonix work partition Operational
Setonix debug partition Operational
Setonix long partition Operational
Setonix copy partition Operational
Setonix askaprt partition Operational
Setonix highmem partition Operational
Setonix gpu partition Operational
Setonix gpu high mem partition Operational
Setonix gpu debug partition Degraded Performance
Lustre filesystems Operational
/scratch filesystem Operational
/software filesystem Operational
/askapbuffer filesystem Operational
/askapingest filesystem Operational
Storage Systems Operational
Acacia Ingest Operational
Acacia MWA Operational
Acacia Projects Operational
Banksia Operational
Data Portal Systems Operational
MWA ASVO Operational
ASKAP Operational
ASKAP ingest nodes Operational
ASKAP service nodes Operational
Central Services Operational
Authentication and Authorization Operational
Service Desk Operational
License Server Operational
Application Portal Operational
Origin Operational
/home filesystem Operational
/pawsey filesystem Operational
Central Slurm Database Operational
Documentation Operational
Visualisation Services Operational
Remote Vis Operational
Vis scheduler Operational
Setonix vis nodes Operational
Nebula vis nodes Operational
Visualisation Lab Operational
Reservation Operational
CARTA - Stable Operational
CARTA - Test Operational
Pawsey Remote VR Operational
The Australian Biocommons Operational
Fgenesh++ Operational
Mar 28, 2026

No incidents reported today.

Mar 27, 2026

No incidents reported.

Mar 26, 2026
Resolved - This has been resolved
* The temperature rise on the SANs matches the read/write pattern on a storage volume (OST) hosted on front-end storage server oss04, which corresponds to back-end askapbuffer storage array 07|08
* This has been established as normal behaviour for this workload pattern

Mar 26, 12:00 AWST
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 26, 11:19 AWST
Identified - We are investigating an issue with Lustre Filesystem "/askapbuffer"
* There is an artificially high load and temperature on one of the physical back-end SAN storage units, Array8
* It looks like there is a high-load Lustre thread on one of the front-end Lustre nodes, OSS4, which is attached to this unit
* There will be a high-availability failover of OSS4 to its matching pair, OSS3
* The original storage LUNs will then be restored from OSS3 to OSS4
* Failover has been completed

Mar 26, 10:36 AWST
Mar 25, 2026
Resolved - This item has been resolved. Both tape libraries are back in full operation for Banksia.
Mar 25, 12:17 AWST
Identified - We have identified the issue with our external engineering company, and we plan to return both tape libraries to full use this week.
Mar 23, 12:34 AWST
Update - We are continuing to work on this issue with our external engineering company.
Mar 20, 12:54 AWST
Update - We are continuing to work on this issue with our external engineering company. This work may continue until at least Friday.
Mar 19, 11:12 AWST
Update - We are continuing to work on this issue with our external engineering company.
Mar 18, 14:58 AWST
Update - We are continuing to work on this issue with our external engineering company.
Mar 16, 13:17 AWST
Investigating - The Banksia service is currently in a degraded, “at risk” state as it is operating with only one tape library instead of the standard two. As a result, the alternate copy of files on tape will be unavailable for staging or archiving until Library 1 is restored to service. The secondary copy is still available for all data, so this should not impact the Banksia offline files service. If you experience any issues accessing data, please let us know at help@pawsey.org.au.

The service maintenance provider has required some work to be undertaken on Library 1 to resolve a minor fault affecting a limited number of tapes. This work will take some time. The engineer indicates that in the worst case it will take until Tuesday, but they hope to have it resolved no later than COB Monday.

Mar 16, 13:16 AWST
Resolved - The issue has been identified and resolved.
Mar 25, 10:37 AWST
Investigating - We are currently investigating an issue with Origin communicating with Acacia Projects.

Information about Acacia Projects when queried via Origin may be incorrect.

Acacia Projects is fully operational.

Mar 25, 09:59 AWST
Mar 24, 2026

No incidents reported.

Mar 23, 2026
Mar 22, 2026

No incidents reported.

Mar 21, 2026

No incidents reported.

Mar 20, 2026
Mar 19, 2026
Mar 18, 2026
Mar 17, 2026

No incidents reported.

Mar 16, 2026
Mar 15, 2026

No incidents reported.

Mar 14, 2026

No incidents reported.