Dear Fram users. Unfortunately, there has been a short power outage in Tromsø causing a shutdown of compute nodes on Fram. We are working on bringing them back to production as soon as possible.
Sorry for the inconvenience this has caused.
[2022-05-03 – 10:45] – Fram is back in production.
[2022-05-04 – 13:20] – As a result of the power outage, the Fram file system is experiencing problems (slowness and lag). We are working on a fix and are sorry for the inconvenience this is causing.
[2022-05-06 – 13:25] – The Fram file system is still periodically slow for some users. We assure you that we are continuously working to resolve this issue, but it is hard to debug because the problem occurs inconsistently.
We need to conduct some work on the file system controllers for NIRD-TOS. Unfortunately, this will result in a short period of downtime.
All services connected to or using the TOS (Tromsø) part of NIRD will be affected. Exported NFS services mounted on Fram will also NOT be available.
The maintenance is scheduled for Thursday 07.04.22 from 09:00 to 11:00.
We are sorry for any inconvenience this may cause. The opslog will be updated as soon as the system is back in production.
UPDATE 07-04-2022 – 11:25 … we are still working on the issue and are starting to bring the file system up. We hope to be back in production soon.
UPDATE 07-04-2022 – 12:25 … we are still struggling with the file system and are doing our best. We are very sorry for the trouble this issue is causing you.
UPDATE 07-04-2022 – 15:35 … the file system is back up and running
Update 2021-09-23: The maintenance is now finished on both sites. Services should be back in production.
We’ll have scheduled maintenance on the NIRD Service platform on 22 and 23 September in order to perform upgrades on the clusters.
In addition to project deployments running on the service platform, the following services are affected during the maintenance:
- NIRD Toolkit
- NIRD Archive
The service platform consists of two sites, one in Tromsø and the other in Trondheim. This maintenance will be performed on one site at a time, planned as follows:
22 September: Tromsø
Services running on TOS-SP will be offline. NIRD will be accessible from login-trd.nird.sigma2.no.
23 September: Trondheim
Services running on TRD-SP will be offline. NIRD will be accessible from login-tos.nird.sigma2.no.
To check which site your project is running on, log in to a NIRD login node (ssh login.nird.sigma2.no) and run the following command:
readlink /projects/<project number>
Make sure to write the project number in all uppercase.
This will then output the full path to the volume, starting with either “trd” for Trondheim or “tos” for Tromsø.
[user@login0-nird-trd ~]$ readlink /projects/NS9999K
The output for this example indicates that the project has its primary site in Tromsø (tos-project).
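The check above can also be scripted. The following is a minimal sketch, not an official tool: the function name nird_site_from_path is hypothetical, and it assumes only what is stated above, namely that the resolved path starts with "tos" for Tromsø or "trd" for Trondheim (optionally preceded by a slash).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a resolved project path to its NIRD site code.
# Assumption: the path starts with "tos" (Tromsø) or "trd" (Trondheim),
# optionally preceded by a single leading slash.
nird_site_from_path() {
    local target="${1#/}"    # drop one leading slash, if present
    case "$target" in
        tos*) echo "TOS" ;;      # Tromsø
        trd*) echo "TRD" ;;      # Trondheim
        *)    echo "unknown" ;;  # prefix not recognised
    esac
}

# Usage on a NIRD login node (NS9999K is the example project number):
#   nird_site_from_path "$(readlink /projects/NS9999K)"
```

A result of TOS means that services for the project are offline during the Tromsø maintenance window, and TRD means the same for the Trondheim window.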
If you have any questions, please do not hesitate to contact us.
[UPDATE, 2021-06-08 08:00] Betzy is now up and in production again.
[UPDATE] Unfortunately, the downtime is taking longer than anticipated, and will not be finished tonight. We plan on getting Betzy up again at around 08:00 tomorrow morning.
Campusservice at NTNU will conduct maintenance on the high-voltage circuits for non-redundant power on 7th June 2021, between 15:00 and 20:00. All compute nodes and login nodes will be shut down, and no jobs will run during this period. Submitted jobs estimated to run into the downtime reservation will be held in the queue.
Update: The file system servers have now been fixed, and we are back online again. Thank you for your patience.
We have an ongoing performance issue with the Fram file system. We need to shut down the file servers to fix it, and therefore need three hours of downtime:
Wednesday 20th January between 12:00 and 15:00, Fram will be unavailable.
As previously announced, Saga will be down in the coming week, from 7th December 08:00 until 11th December 16:00.
The downtime is allocated for expanding the storage. When we come back, we will have approximately 4 petabytes in addition to the existing 1 petabyte.
Update: Saga is back online and running jobs again. The new storage is not online yet, but all the hardware has been mounted.
We are going to expand the storage on Saga. This will happen during week 50, between 7th and 11th December. Hopefully this will give us a few petabytes extra and enough storage for the lifetime of the system.
On 1st of December 2020 between 07:45 and 16:00 there will be a power outage on Fram compute nodes due to scheduled maintenance on UPS and backup power equipment.
Dear NIRD and NIRD Toolkit users: NIRD-TOS is currently down and will remain unavailable until Wednesday 12:00. We are replacing all cables during the next couple of days.
Note that NIRD-home is also not available during that time.
All remote mounts on Fram, Saga and Betzy using NIRD-TOS will be unavailable until the downtime is over.
We will have downtime the following week to try again to replace all internal cables in NIRD-TOS and Fram storage systems.
NIRD-TOS (including the Toolkit) will be down from Monday 2nd November 08:00 until Wednesday 4th 12:00.
Fram will be down from Wednesday 4th 08:00 until Friday 6th 12:00.
There is still a chance that the downtime will not happen, but proper notification will be given in the opslog. Unfortunately the current situation with Covid-19 makes it difficult to make detailed plans.
We apologize for any inconvenience.