The queueing system on Vilje has crashed. We are working on a fix.
Tag: Vilje
Vilje experienced network outage
The Vilje queueing system was unavailable from Sunday 5th 15:30 until Monday 6th 08:30 due to a faulty InfiniBand cable.
We apologize for the inconvenience.
Vilje filesystem is back
The Vilje filesystem has been fixed with good help from DDN, and we are now open for business.
Please be aware that some files may have been lost.
Always back up your files.
Vilje partially down
We are currently experiencing a network error on Vilje, causing around 100 nodes to be unavailable until further notice. Some jobs may be lost.
We apologize for the inconvenience.
Vilje is online
The InfiniBand error was due to a controller module with a bad connection. This has been corrected.
The queueing system is back online. In addition, 19 nodes have been recovered.
Three jobs were lost. We apologize for the inconvenience.
Vilje InfiniBand problems
We are currently experiencing InfiniBand problems on Vilje. The queueing system is unavailable until further notice.
Some jobs may have been lost.
Vilje is back online
Vilje is online.
The outage was caused by a loss of InfiniBand connectivity (two InfiniBand switches were lost).
36 nodes will remain out of production.
There may still be DNS issues affecting connectivity from inside the cluster to the outside (e.g. licence server lookups). Please report any issues to support@metacenter.no
Vilje down at the moment.
2018-11-14, 09:54 Vilje is experiencing a failure with the Lustre filesystem. The system will be down until further notice.