Dear Fram and Saga user,
As you may know, $HOME on the Fram and Saga HPC resources has a standard 20GB block quota. The quota has not been enforced until now, but due to frequent overuse and backup limitations we are compelled to enforce it. Enforcement takes effect on 04.11.2019.
Any project-related data shall be moved to the /cluster/projects area, and unneeded data shall be removed.
We have also implemented a new backup policy: any files placed under $HOME/tmp will be excluded from backups.
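To find out which directories to move or clean up, it can help to see what takes up the most space in $HOME. The following is a minimal Python sketch, not an official tool. The function names are hypothetical, "20GB" is assumed to mean 20 GiB, and the script sums apparent file sizes, which only approximates what a block quota actually counts.

```python
import os

# Assumption: the 20GB quota is read as 20 GiB; the real limit may be defined differently.
QUOTA_BYTES = 20 * 1024**3


def tree_size(root):
    """Sum the apparent sizes of all files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total


def usage_by_entry(home):
    """Return a dict mapping each top-level entry in home to its size in bytes."""
    usage = {}
    with os.scandir(home) as entries:
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    usage[entry.name] = tree_size(entry.path)
                else:
                    usage[entry.name] = entry.stat(follow_symlinks=False).st_size
            except OSError:
                continue
    return usage


def over_quota(total_bytes, quota_bytes=QUOTA_BYTES):
    """True if the measured usage exceeds the assumed quota."""
    return total_bytes > quota_bytes


def report(home, top=10):
    """Return report lines: the largest top-level entries, then the total."""
    usage = usage_by_entry(home)
    total = sum(usage.values())
    largest = sorted(usage.items(), key=lambda kv: kv[1], reverse=True)[:top]
    lines = [f"{size / 1024**3:8.2f} GiB  {name}" for name, size in largest]
    lines.append(f"total: {total / 1024**3:.2f} GiB, over quota: {over_quota(total)}")
    return lines
```

Calling print("\n".join(report(os.path.expanduser("~")))) on a login node lists the largest top-level entries. Large project directories found this way are candidates for /cluster/projects.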
For more information, please check the documentation pages at https://documentation.sigma2.no/.
Thank you for your understanding!
We are currently experiencing problems with the /cluster file system on Saga. This prevents users from logging in.
We are investigating, and will update here when we know more.
Update: 11:30 We have identified and solved the problem, and the /cluster file system is back online.
Dear Fram cluster users:
login-1-2 will be reinstalled, and will be removed from DNS temporarily. It will be added back to DNS when reinstallation is over.
Update: 15:12 login-1-2 has been reinstalled and added back to the DNS configuration.
- 2019-10-18 14:36 We are ready with the reinstallation, configuration checks, QA and tests. Access to the machine has been reopened and queued jobs are running again.
- 2019-10-18 06:12 Reinstallation of compute nodes is much slower than anticipated, so re-opening of the machine is delayed. We are doing our best to finish the maintenance as soon as possible. In parallel we are conducting tests and benchmarks.
We will keep you updated.
- 2019-10-17 08:25 File system servers and infrastructure switches were patched yesterday.
We are proceeding now with the upgrade of the service and the login nodes.
- 2019-10-16 08:07 Maintenance has started.
Dear Fram User,
We will have a planned two-day downtime starting at 08:00 on the 16th of October for maintenance on the storage and the file system.
During this time we will, together with the vendor, upgrade the storage firmware, upgrade the software on the /cluster file system servers and upgrade the operating system on Fram.
This upgrade is necessary to fix the frequent issues with the metadata servers and enhance stability and security of the system.
Fram jobs that cannot finish before the maintenance starts on the 16th of October will remain queued and will not start until the maintenance is finished.
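The rule above comes down to simple date arithmetic, sketched here in Python. This is an illustration only: the function name is hypothetical, and it assumes the decision is based on the job's requested wall-time limit, which the announcement does not state explicitly.

```python
from datetime import datetime, timedelta

# Maintenance start as announced: 16 October 2019, 08:00.
MAINTENANCE_START = datetime(2019, 10, 16, 8, 0)


def can_start(now, walltime, maintenance_start=MAINTENANCE_START):
    """True if a job started at `now` with the given wall-time limit ends before the maintenance begins."""
    return now + walltime <= maintenance_start


# A 24-hour job submitted the day before at 08:00 just fits; a 48-hour job does not.
print(can_start(datetime(2019, 10, 15, 8, 0), timedelta(hours=24)))
print(can_start(datetime(2019, 10, 15, 8, 0), timedelta(hours=48)))
```

Requesting a shorter wall-time limit, where the job allows it, is one way to let a job run before the downtime instead of waiting in the queue.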
Thank you for your consideration!
The node mentioned above had to be rebooted because it was unresponsive. We are sorry for any inconvenience.
The login-1-1 node hung and had to be rebooted. It is up and running again now. Have a nice weekend!
Fram login-1-2 was rebooted around 15:10 today due to a Lustre file system glitch.
We are pleased to announce that the Saga HPC cluster is now open for production for existing pilot users.
Candidate projects for migration from the Abel cluster will be contacted directly.
Link to the Saga documentation page.
Two nodes short of the full 1404-node capacity, we are now back in full production. A few jobs were lost because the Lustre file system was missing on a few nodes, which in turn was caused by a faulty interconnect (InfiniBand) cable.
Thank you for your patience.
We are currently experiencing a network error on Vilje, causing around 100 nodes to be unavailable until further notice. Some jobs may be lost.
We apologize for the inconvenience.