Maintenance Stops on Saga, Fram and Betzy

[Update, 2022-04-30 11:10] The Fram and Saga maintenance is now over, and jobs are running again.

[Update, 2022-04-29 08:00] The Fram and Saga maintenance has now started.

[Update, 2022-04-28 12:56] The Betzy maintenance is now over, and jobs are starting again.

[Update, 2022-04-28 08:00] The Betzy maintenance has now started.

There will unfortunately be maintenance stops on all NRIS clusters next week, for an important security update. The maintenance stops will be:

  • Betzy: Thursday, April 28, at 08:00
  • Fram and Saga: Friday, April 29, at 08:00

We expect the stops will last a couple of hours. We have set up maintenance reservations on all nodes on the clusters, so jobs that would have run into the reservation will be left pending in the job queue until after the maintenance stop.
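As a rough illustration of why some jobs stay pending, the check below sketches the rule described above: a job can only start if its requested time limit ends before the maintenance stop begins. This is a simplified model, not the scheduler's actual implementation. The function name and the example submission time are hypothetical; the stop time is the Betzy stop announced above.

```python
from datetime import datetime, timedelta


def can_start_before_stop(now: datetime, time_limit: timedelta,
                          stop_start: datetime) -> bool:
    """Return True if a job started at `now` would reach its
    requested time limit no later than the start of the stop.

    Hypothetical helper: a simplified model of the scheduling rule,
    not the scheduler's own logic.
    """
    return now + time_limit <= stop_start


# Betzy stop from the announcement: Thursday, April 28, 2022 at 08:00.
betzy_stop = datetime(2022, 4, 28, 8, 0)

# Assumed example: a job that could start 12 hours before the stop.
now = datetime(2022, 4, 27, 20, 0)

print(can_start_before_stop(now, timedelta(hours=6), betzy_stop))   # True
print(can_start_before_stop(now, timedelta(hours=24), betzy_stop))  # False
```

In this model, a job asking for 6 hours fits before the stop, while a job asking for 24 hours would be left pending until the stop is over. Requesting a shorter time limit is therefore one way a job may still get to run before the maintenance.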

We are sorry for the inconvenience this creates. We had hoped to be able to apply the security update with jobs running, but that turned out not to be possible.

Fram downtime 23rd – 24th February

[Update, 2022-02-24 22:30]: The maintenance is over and Fram is in production again. Thank you for your patience!

[Update, 2022-02-24 20:30]: The maintenance is taking a little longer than planned. We plan to get back into production at 22:00.

[Update, 2022-02-23 12:00]: The maintenance stop has now begun.

The Fram supercomputer will be unavailable due to maintenance on the cooling system from February 23rd at 12:00 until February 24th at 20:00.

If time allows, we will also upgrade all or parts of the storage system, including the file system clients (compute nodes).

Downtime on Saga and Betzy, Thursday February 3.

There will be a short maintenance stop of Saga and Betzy on Thursday, February 3, at 15:00 CET, due to work on the cooling system in the data hall. The downtime is planned to last for three hours.

During the downtime, no jobs will run, but the login nodes and the /cluster file system will be up. Jobs that cannot finish before 15:00 on February 3 will be left pending in the queue until after the stop.

NIRD Toolkit: maintenance 03.02

Dear NIRD Toolkit users and principal investigators,

The authentication to NIRD Toolkit has been improved to become more flexible. The new solution will be put in production on the 3rd of February, 2022. The maintenance will be done in the evening, starting at 20:00.

For NIRD Toolkit users the change will be minimal, but it will require that you re-authenticate your session (log out and log in). 

Thank you for your understanding!

Maintenance postponed due to technical issues. More info to come.

NIRD Toolkit: maintenance 27.01

Dear NIRD Toolkit users and principal investigators,

Feide has planned maintenance on the 27th of January, 2022.

This might impact logging in to services running on the NIRD Service Platform using Dataporten/Feide, such as NIRD Toolkit.

However, we do not expect stoppage or cancellation for any of the already running services and applications.

Downtime Betzy, 6th December 08:00 until 9th December 20:00

There will be a scheduled downtime for Betzy lasting three days, starting on Monday 6th December at 08:00. The downtime will last until Thursday 9th December at 20:00.

During the downtime we will conduct:

  • Full upgrade of the Lustre filesystem (both servers and clients)
  • Full upgrade of the InfiniBand firmware
  • Full upgrade of the Mellanox InfiniBand drivers
  • Minor updates to other parts of the system (Slurm, configs, etc.)

Please be aware that this also affects the storage services recently moved from NIRD to Betzy.

We apologize for the inconvenience.

Update 08.12.2021 18:00: The Betzy downtime is over, and the system is open for users. All planned updates have been performed.

[FINISHED] Saga downtime 17th November 12:00 – 19th November 15:00

[Update, 2021-11-19 09:50] The maintenance work is now done, and Saga is back in full production and running jobs as normal.

[Update, 2021-11-18 12:40] The login nodes are ready for users, who can now access their data and work with it. The compute nodes are still under maintenance, so running jobs is not yet possible.

[Update, 2021-11-17 12:00]: The maintenance has now started.

We will conduct firmware updates and maintenance on all of Saga next week, starting on Wednesday 17th at 12:00.

The downtime will last until 15:00 on Friday 19th, but we will bring back access to the login nodes and the file system as soon as the upgrade is done on the vital parts of the system. Compute nodes will be brought back sequentially as they are updated.