Pawsey Supercomputing Research Centre
All Systems Operational
Setonix Operational
100.0 % uptime over the past 90 days
Login nodes Operational
Data-mover nodes Operational
Slurm scheduler Operational
Setonix work partition Operational
Setonix debug partition Operational
Setonix long partition Operational
Setonix copy partition Operational
Setonix askaprt partition Operational
Setonix highmem partition Operational
Setonix gpu partition Operational
100.0 % uptime over the past 90 days
Setonix gpu high mem partition Operational
100.0 % uptime over the past 90 days
Setonix gpu debug partition Operational
100.0 % uptime over the past 90 days
Lustre filesystems Operational
100.0 % uptime over the past 90 days
/scratch filesystem (new) Operational
100.0 % uptime over the past 90 days
/software filesystem Operational
100.0 % uptime over the past 90 days
/askapbuffer filesystem Operational
100.0 % uptime over the past 90 days
/askapingest filesystem Operational
100.0 % uptime over the past 90 days
Storage Systems Operational
99.98 % uptime over the past 90 days
Acacia - Projects Operational
Banksia Operational
Data Portal Systems Operational
MWA Nodes Operational
CASDA Nodes Operational
Acacia - Ingest Operational
MWA ASVO Operational
99.98 % uptime over the past 90 days
ASKAP Operational
ASKAP ingest nodes Operational
ASKAP service nodes Operational
Garrawarla Operational
Garrawarla workq partition Operational
Garrawarla gpuq partition Operational
Garrawarla asvoq partition Operational
Garrawarla copyq partition Operational
Garrawarla login node Operational
Slurm Controller (Garrawarla) Operational
Nimbus Operational
Ceph storage Operational
Nimbus instances Operational
Nimbus dashboard Operational
Nimbus APIs Operational
Central Services Operational
100.0 % uptime over the past 90 days
Authentication and Authorization Operational
Service Desk Operational
License Server Operational
Application Portal Operational
Origin Operational
/home filesystem Operational
/pawsey filesystem Operational
Central Slurm Database Operational
100.0 % uptime over the past 90 days
Documentation Operational
100.0 % uptime over the past 90 days
Visualisation Services Operational
Remote Vis Operational
Vis scheduler Operational
Setonix vis nodes Operational
Nebula vis nodes Operational
Visualisation Lab Operational
Reservation Operational
CARTA - Stable Operational
CARTA - Test Operational
Pawsey Remote VR Operational
The Australian Biocommons Operational
Fgenesh++ Operational
Scheduled Maintenance
Pawsey Scheduled Maintenance (August) Aug 9, 2024 08:00 - Aug 14, 2024 16:00 AWST
Our regular first-Tuesday-of-the-month maintenance will not proceed in August. Instead, an extended maintenance period for Setonix is scheduled for August 9th - 14th.

During the shutdown, HPE will replace Setonix's management system, bringing several benefits:
• Going forward, system updates and patches will be applied during regular maintenance, minimising disruption.
• HPE reports improved system stability with the upgraded management system.

What does this mean?
• Setonix (including the Setonix visualisation nodes) and Garrawarla will be unavailable from August 9th to 14th.
• All Pawsey systems will be subject to regular maintenance on the 13th of August, including disruptive testing on both Acacia clusters, an upgrade to the Nimbus control plane, and patching of the Banksia system.

The replacement of Setonix's management system has been successfully implemented on Pawsey's test and development system. When Setonix is returned to service, neither the version of the Cray Operating System nor the software stack provided by Pawsey will have changed. Only security fixes are being applied.

The login and visualisation nodes are being moved to the 100 gigabit network, so anyone providing external software access to Setonix should allow access from the 146.118.74.0/22 network.
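
For anyone updating an allow-list, the following is a minimal sketch, using Python's standard `ipaddress` module, of how to test whether a connecting address falls inside the announced range. The names `ANNOUNCED_RANGE`, `setonix_network` and `is_from_setonix` are hypothetical and are not part of any Pawsey tooling. Note that 146.118.74.0/22 as written has host bits set, so strict parsers reject it. The sketch assumes the enclosing /22 block is intended and parses with `strict=False`, which normalises the range to 146.118.72.0/22 (146.118.72.0 to 146.118.75.255). Confirm the intended range with help@pawsey.org.au if your firewall requires an exact network address.

```python
import ipaddress

# Range exactly as published in the maintenance notice.
ANNOUNCED_RANGE = "146.118.74.0/22"


def setonix_network(cidr: str = ANNOUNCED_RANGE) -> ipaddress.IPv4Network:
    """Parse the announced range, tolerating host bits in the address part.

    strict=False masks the host bits, so "146.118.74.0/22" becomes the
    enclosing block 146.118.72.0/22. This is an assumption about intent.
    """
    return ipaddress.IPv4Network(cidr, strict=False)


def is_from_setonix(addr: str, cidr: str = ANNOUNCED_RANGE) -> bool:
    """Return True if addr lies inside the (normalised) announced range."""
    return ipaddress.IPv4Address(addr) in setonix_network(cidr)


if __name__ == "__main__":
    print(setonix_network())                 # normalised network
    print(is_from_setonix("146.118.74.10"))  # address inside the block
    print(is_from_setonix("146.118.76.1"))   # address just outside the block
```

The same membership test can be applied to each client address in a licence server or firewall log to check that the rule covers the new login and visualisation node addresses.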

Further updates will be provided on status.pawsey.org.au, and any questions should be directed to help@pawsey.org.au.

Posted on Jul 25, 2024 - 11:15 AWST
Past Incidents
Jul 27, 2024

No incidents reported today.

Jul 26, 2024

No incidents reported.

Jul 25, 2024

No incidents reported.

Jul 24, 2024

No incidents reported.

Jul 23, 2024
Completed - The scheduled maintenance has been completed.
Jul 23, 17:12 AWST
Update - We will be undergoing scheduled maintenance during this time.
Jul 19, 08:49 AWST
Scheduled - During thermographic and ultrasonic testing in the SC Cell underfloor, CBIS contractors identified a hot spot on an isolator.

The isolator and cabling terminations need urgent isolation and replacement, so the Setonix cabinet attached to the isolator is currently being drained. As a result, the highmem partition is not available and the work partition has reduced capacity.

Jul 19, 08:49 AWST
Jul 22, 2024

No incidents reported.

Jul 21, 2024

No incidents reported.

Jul 20, 2024

No incidents reported.

Jul 19, 2024

No incidents reported.

Jul 18, 2024

No incidents reported.

Jul 17, 2024
Resolved - The issue has been resolved. Thank you for your patience.
Jul 17, 15:39 AWST
Identified - The issue has been identified and services have been restarted. One tape library is running jobs again, and work continues on the second tape library.
Jul 17, 08:44 AWST
Investigating - There appears to be a problem with the scoutam service this morning that is affecting staging. We are working on the issue.
Jul 17, 08:00 AWST
Jul 16, 2024

No incidents reported.

Jul 15, 2024

No incidents reported.

Jul 14, 2024

No incidents reported.

Jul 13, 2024

No incidents reported.