Dear Saga User,
We are pleased to announce that all technical requirements are now in place and the NIRD project file systems have been mounted on the Saga login nodes.
You may find your projects under the newly mounted NIRD project area on the login nodes.
Please note that transferring a large number of files is sluggish and has a big impact on I/O performance. It is always better to transfer one large file than many small ones.
As an example, transferring a folder with 70k entries totalling about 872MB took 18 minutes, while the same files archived into a single 904MB file took 3 seconds.
You can read more about the tar archiving command in its manual pages. Type

    man tar

in your Saga terminal.
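For example, a minimal sketch (the folder and archive names below are placeholders, not from the original announcement):

    # Bundle a folder with many small files into one compressed archive
    tar -czf my_dataset.tar.gz my_dataset/
    # Transfer my_dataset.tar.gz instead of the individual files, then unpack it
    tar -xzf my_dataset.tar.gz

Transferring the single archive avoids the per-file overhead that makes many small transfers slow.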
The /cluster file system on Saga has crashed and we are working on it. Users should expect that their Slurm jobs may crash.
Update, 09:45: The file system is back online now. Only parts of /cluster were unavailable, but we recommend that you check your jobs; some of them will probably have crashed.
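As a quick sketch for checking your jobs with standard Slurm commands (the date filter is an assumption; adjust it as needed):

    # List your currently queued and running jobs
    squeue -u $USER
    # List your jobs that have failed since midnight
    sacct -u $USER -S today --state=FAILED

Jobs missing from squeue that show up as FAILED in sacct will likely need to be resubmitted.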
Quite a few users lost access to their project(s) on NIRD and all clusters during the weekend. This was due to a bug in the user administration software. The bug has been identified, and we are working on rolling back the changes.
We will update this page when access has been restored.
Update 12:30: Problem resolved. Project access has now been restored. If you still have problems, please contact support at email@example.com
Update: This applies to all systems, not only Fram and Saga.
Dear Saga Cluster Users,
We have discovered a /cluster file system issue on Saga which can lead to possible data corruption. To examine the problem, we have decided to suspend all running jobs on Saga and reserve the entire cluster. No new jobs will be accepted until the problem is resolved.
Users can still log in to the Saga login nodes.
We are sorry for any inconvenience this may have caused.
We will keep you updated as we progress.
Update: We are trying to repair the file system without killing all jobs. This might not work, at least not for all jobs. In the meantime, we have closed access to the login nodes to avoid further damage to the file system.
Update 14:15: Problem resolved, Saga is open again. Please check whether you have running jobs; some jobs may have crashed.
The source of the problem is related to the underlying file system (XFS) and the kernel we are currently running. We scanned the underlying file system on our OSS servers to eliminate possible data corruption on the /cluster file system, and we also updated the kernel on the OSSes.
Please don’t hesitate to contact us if you have any questions.
- 2020-01-13 14:54: Problems have been sorted out now and network is functional again.
- 2020-01-13 14:40: Problems are unfortunately back again. Uninett’s network specialists are working on solving the problem as soon as possible.
- 2020-01-13 14:22: Network is functional again. Apologies for the inconvenience it has caused.
We are currently experiencing a network outage on Saga and in some parts of NIRD. The problem is under investigation.
Please check back here for an update on this matter.
To improve the performance of the /cluster file system, we will reboot the Saga login nodes this evening. We apologize for the short notice, but we expect the increased performance to make up for any inconvenience.
Jobs in the queue system will not be affected.
Dear Fram and Saga User,
As you may remember, quotas were enforced on the $HOME areas during October last year. This was carried out only for users having less than 20GiB in their Fram or Saga home folders.
Because of repeated space-related issues on the NIRD home folders, caused in part by the backups taken from Fram and Saga, we had to change our backup policy and exclude from backup any user with more than 20GiB in their Fram or Saga $HOME area.
If you clean up your $HOME on Fram and/or Saga and decrease your usage below 20GiB, we can then enforce the quota and re-enable the backup.
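A minimal sketch for checking your usage with standard tools (note that du over top-level entries skips hidden files):

    # Total size of your home directory
    du -sh $HOME
    # Size of each top-level entry, sorted smallest to largest
    du -sh $HOME/* | sort -h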
Thank you for your understanding!
One of the file servers had a problem which caused some of the folders under /cluster to be unavailable.
The problem is resolved now.
Dear Fram and Saga user,
As you may know, we have a standard 20GB block quota on $HOME on the Fram and Saga HPC resources. Until now it has not been enforced, but due to frequent overuse and backup limitations we are compelled to enforce it, effective 04.11.2019.
Any project-related data should be moved to the /cluster/projects area, and unneeded data should be removed.
We have also implemented a new backup policy: any files placed under $HOME/tmp will be excluded from backup.
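As an illustrative sketch (the project ID nnXXXXk and the file names below are placeholders, not real paths):

    # Move project-related data to your project area
    mv $HOME/large_dataset /cluster/projects/nnXXXXk/
    # Files that do not need backup can live under $HOME/tmp
    mkdir -p $HOME/tmp
    mv $HOME/scratch_results $HOME/tmp/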
For more information, please check the documentation pages at https://documentation.sigma2.no/.
Thank you for your understanding!
We are pleased to announce that the Saga HPC cluster is now open for production use for existing pilot users.
Candidate projects for migration from the Abel cluster will be contacted directly.
The Saga documentation is available at https://documentation.sigma2.no/.