Betzy: Corrected GPU node config

The queue system configuration of the GPU nodes on Betzy had an error: the number of CPUs was set to 128 instead of 64. Most jobs were probably not affected by this, but some jobs may have received sub-optimal CPU pinning.

This has now been fixed, and the documentation has been updated. Users do not need to change their job scripts (unless they asked for more than 64 CPUs per node).
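As a rough illustration of the case in parentheses, the following Python sketch checks whether a per-node request fits within the corrected limit of 64 CPUs. The function name and its parameters (`tasks_per_node`, `cpus_per_task`) are hypothetical and only mirror the usual tasks-times-CPUs arithmetic; this is not part of the queue system itself.

```python
# Corrected CPU count per Betzy GPU node, as stated in the announcement.
GPU_NODE_CPUS = 64


def fits_on_gpu_node(tasks_per_node: int, cpus_per_task: int = 1) -> bool:
    """Return True if the per-node CPU request fits on one GPU node.

    Hypothetical helper: assumes the per-node request equals
    tasks per node multiplied by CPUs per task.
    """
    if tasks_per_node < 1 or cpus_per_task < 1:
        raise ValueError("task and CPU counts must be positive")
    return tasks_per_node * cpus_per_task <= GPU_NODE_CPUS


if __name__ == "__main__":
    print(fits_on_gpu_node(4, 16))  # requests 64 CPUs per node
    print(fits_on_gpu_node(8, 16))  # requests 128 CPUs per node
```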

[Resolved] NIRD mount unavailable on Saga and Betzy

We have identified that the NIRD mount is unavailable on Saga and Betzy. We are working on finding the cause and putting a fix in place.

28-03-2022-13:20 – Mounts should be back now. The problem was caused by Friday’s maintenance on network gear …

We hope that the above has not caused too much frustration, and we wish everyone a very nice day!

NRIS HPC staff

Downtime on Saga and Betzy, Thursday February 3.

There will be a short maintenance stop of Saga and Betzy on Thursday, February 3 at 15:00 CET, due to work on the cooling system in the data hall. The downtime is planned to last for three hours.

During the downtime, no jobs will run, but the login nodes and the /cluster file system will be up. Jobs that cannot finish before 15:00 on February 3 will be left pending in the queue until after the stop.
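The rule in the last sentence comes down to comparing a job's latest possible end time with the start of the stop. Below is a minimal Python sketch of that comparison. It assumes the year is 2022 (the announcement does not state it), treats CET as a fixed UTC+1 offset, and uses a hypothetical helper name. The batch system's actual logic may differ, in particular for a job ending exactly at 15:00.

```python
from datetime import datetime, timedelta, timezone

# CET modelled as a fixed UTC+1 offset (assumption for this sketch).
CET = timezone(timedelta(hours=1), "CET")

# Start of the maintenance stop. The year 2022 is an assumption;
# the announcement only gives "Thursday, February 3 at 15:00 CET".
MAINTENANCE_START = datetime(2022, 2, 3, 15, 0, tzinfo=CET)


def can_finish_before_stop(start: datetime, time_limit: timedelta) -> bool:
    """Return True if a job starting at `start` ends by the maintenance start.

    Hypothetical helper. `start` must be timezone-aware; a job ending
    exactly at 15:00 is treated as fitting, which is an assumption.
    """
    return start + time_limit <= MAINTENANCE_START


if __name__ == "__main__":
    start = datetime(2022, 2, 3, 9, 0, tzinfo=CET)
    print(can_finish_before_stop(start, timedelta(hours=4)))  # would end at 13:00
    print(can_finish_before_stop(start, timedelta(hours=8)))  # would end at 17:00
```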

Downtime Betzy, 6th December 08:00 until 9th December 20:00

There will be a scheduled downtime for Betzy lasting three days, starting on Monday 6th December at 08:00. The downtime will last until Thursday 9th December at 20:00.

During the downtime we will conduct:

  • Full upgrade of the Lustre filesystem (both servers and clients)
  • Full upgrade of the InfiniBand firmware
  • Full upgrade of the Mellanox InfiniBand drivers
  • Minor updates to other parts of the system (Slurm, configs, etc.)

Please be aware that this also affects the storage services recently moved from NIRD to Betzy.

We apologize for the inconvenience.

Update 08.12.2021 18:00: The Betzy downtime is over, and the system is open for users. All planned updates have been performed.

[Updated] Batch system issue on Betzy

There is currently an issue with the batch system on Betzy that results in jobs not completing and new jobs not being started.

We are investigating the issue and will post an update once we know what caused it and how it can be resolved.

[Update 14:22]: Job submission is working again. The affected users were unfortunately hit by a batch system restart that happened at the same time as their jobs were submitted.