[Resolved] UiB MATLAB License server is down

Update 2021-05-10: The UiB MATLAB license server is now up and running again.

Dear users,
We have a problem with the UiB MATLAB license server: it is unstable and crashes from time to time. Users running MATLAB on the different clusters may have problems contacting the UiB MATLAB license server.

We are working on this issue and will keep you updated.

We apologise for any inconvenience caused.

Best Regards

[Finished] NIRD Service Platform Maintenance, 22-23 September

Update 2021-09-23: The maintenance is now finished on both sites. Services should be back in production.

Dear users,

We’ll have scheduled maintenance on the NIRD Service platform on 22 and 23 September in order to perform upgrades on the clusters.

In addition to project deployments running on the service platform, the following services are affected during the maintenance:

  • NIRD Toolkit
  • NIRD Archive
  • EasyDMP

The service platform consists of two sites, one in Tromsø and the other in Trondheim. This maintenance will be performed on one site at a time, planned as follows:

22 September: Tromsø
Services running on TOS-SP will be offline. NIRD will be accessible from login-trd.nird.sigma2.no.

23 September: Trondheim

Services running on TRD-SP will be offline. NIRD will be accessible from login-tos.nird.sigma2.no.

To check which site your project is running on, log in to the NIRD login nodes (ssh login.nird.sigma2.no) and run the following command:

readlink /projects/<project number>

Make sure to write the project number in all uppercase.
This will then output the full path to the volume, starting with either “trd” for Trondheim or “tos” for Tromsø.

Example:

[user@login0-nird-trd ~]$ readlink /projects/NS9999K
/tos-project3/NS9999K

The output indicates that this project has its primary site in Tromsø (tos-project).
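
The check above can also be scripted. The following is only a minimal sketch: the helper name site_from_path is hypothetical, and it assumes the resolved path begins with "/tos" or "/trd" as in the example output.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NIRD tooling): map a resolved
# project path to its primary site, based on the path prefix.
# Assumption: the path starts with "/tos" (Tromsø) or "/trd" (Trondheim).
site_from_path() {
    case "$1" in
        /tos*) echo "Tromsø" ;;
        /trd*) echo "Trondheim" ;;
        *)     echo "unknown" ;;
    esac
}

# Uses the sample output from the example; on a login node one could
# pass "$(readlink /projects/NS9999K)" instead.
site_from_path "/tos-project3/NS9999K"
```

For the sample path this prints Tromsø; any path with an unrecognised prefix is reported as unknown rather than guessed.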

If you have any questions, please do not hesitate to contact us.

[DONE] Fram Maintenance, October 6-8

Update, 2021-10-11 08:15: The maintenance is now finished, and the compute nodes are in production again. (Some nodes are still down; they will be fixed and returned to production. The VNC service is also not up yet; we are looking into it.)

Update, 2021-10-08 15:40: We have now opened the login nodes for users again. The work on the cooling system is taking longer than we hoped, so the compute nodes will not be available until Monday morning.

Update: The maintenance stop has now started.

UPDATE OCTOBER 4TH:

Login and file system services will be available on Friday or earlier, but running jobs will not be possible until Monday morning.

There will be a maintenance stop on Fram starting Wednesday October 6 at 12:00 and ending Friday October 8 in the afternoon. All of Fram will be down and unavailable during that time. Jobs that would not finish before the maintenance starts will be left pending until after the maintenance.

The main reason for the maintenance is the replacement of some parts of the cooling system. During the stop, the OS of the compute and login nodes will be updated from CentOS 7.7 to 7.9, and Slurm will be upgraded to 20.11.8 (the same version as on Saga).

[Resolved] login-1.fram crashed – VNC unavailable

One of the login nodes on Fram crashed unexpectedly this morning, causing some users to be disconnected from their sessions.

This also affects the VNC service on Fram. Any attempt to use this service will fail while we’re working on restoring the node.

Updates will be provided once we have more information to share.

Update 13:00 – The node, NIRD exports and VNC service are now back up and running and have been put back into production. Please let us know if you experience any issues.

We’re very sorry for any inconvenience this may cause.

[Resolved] NIRD Archive unavailable

We regret to inform you that the NIRD Archive is currently unavailable due to issues with the deployment of the service. A possible cause has been identified and we’re working on resolving it to restore the service.

Updates will be provided as we have new information.

Update 19:47 – The archive is now back up and running.
Update 18:06 – The fix is now properly in place. We’re redeploying the archive web service next.
Update 16:33 – Unfortunately it takes longer than expected to apply the fix. Thank you for your patience!
Update 15:53 – We’re applying a fix right now and expect the archive to be available again shortly.