FRAM – controller maintenance

Good morning,

We are going to perform some routine maintenance on one of the file system controllers of FRAM. This should have no significant impact on production, but users might experience slightly degraded Lustre (file system) performance.

The operation is scheduled for today – 11 a.m. …

Update 08.07: There were also performance issues with the login nodes. Both this and the controller maintenance are now finished.

FRAM – Unexpected shutdown

We are experiencing some trouble with the FRAM machine. Yesterday morning (Sunday 04.07.2021), many compute nodes went down unexpectedly. We are investigating the issue.

Update 05.07.2021 – 10:54: The shutdown was caused by a power outage in the data center. We are bringing all nodes back up and monitoring their behavior.

Apologies for the inconvenience this may have caused! 

Stallo – file system problem

Dear Stallo Users,

UPDATE – 27.11/16:20 – We have reopened the machine, but there might be some instability on the global file system, as we have also lost one object storage server. The issue is being investigated and we are waiting for spare parts.

We have major problems with the Lustre file system at the moment. One of the main storage coolers is down. We are now disconnecting all users and hope to get the machine back to an operational state as soon as possible.

Thank you for your patience.

HPC staff (UiT)

Fram off-line: File system issues

Dear Fram Users,

The ongoing problems on FRAM, reported July 1st, cause the error message “No space left on device” for various file operations.

The problems are being investigated, and we will keep you updated on the progress.

UPDATE 2020-07-08 14:50: hugemem on Fram is now operating as normal.

UPDATE 2020-07-08 10:35: The file system issues have been resolved and we are operating as normal with the exception of hugemem, which is still unavailable. Please let us know if you’re still experiencing problems. Again we apologize for the inconvenience.

UPDATE 2020-07-08 09:00: Our vendor has corrected the filesystem bug and we should be operating as normal soon. At the moment we’re running some tests, which will slow down jobs currently running on Fram.

UPDATE 2020-07-07 15:35: The problem on Fram is caused by a bug in the Lustre filesystem. Our vendor is taking over the case to fix the issue. Thank you for your patience, we apologize for the inconvenience.

UPDATE 2020-07-07 09:50: We are still experiencing file system errors on FRAM, and are working to resolve the issue as soon as possible. Watch this space for updates.

UPDATE 2020-07-06 12:30: FRAM has been opened again.

UPDATE 2020-07-06 09:50: The file system is up and running; it seems to be stable, and this has also been verified by the vendor. It should be possible to use FRAM within a couple of hours.

UPDATE 2020-07-03 17:10: The file system is up and running, but we have decided to keep the machine closed during the weekend to make sure everything works as it should on Monday. Many of the recent FRAM downtimes have been caused by storage hardware faults. We are investigating the issue together with the storage vendor.

UPDATE 2020-07-02 13:20: FRAM is off-line; we are investigating the issues. The machine will probably stay off-line until tomorrow.

UPDATE 2020-07-02 12:10: The whole file system is still very unstable. We will most likely have to take FRAM down; a Slurm reservation has been created, and all users might be logged out soon.

UPDATE 2020-07-02 11:15: The whole file system is still very unstable, and we are trying to fix the problem.

Metacenter Operations

Reminder: Auto cleanup of Stallo

Dear Stallo users,

From today (25.05.2020), we will enforce the auto cleanup of /global/work. All files with an access date older than 21 days will first be set to read-only and at a later point moved to a trash folder.

Please move all files you want to keep to your home folder or to other storage solutions like NIRD.
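
If you are unsure which of your files in /global/work fall under the 21-day rule, the small Python sketch below lists candidates by checking each file's last access time. It is only an illustration of the policy described above, not an official Metacenter tool; the path /global/work/$USER and the 21-day threshold are taken from this announcement, and you may need to adjust them.

    #!/usr/bin/env python3
    # Illustration only: list files under /global/work/<user> whose last
    # access time is older than 21 days (the auto-cleanup threshold above).
    import os
    import time

    WORK_DIR = os.path.join("/global/work", os.environ.get("USER", ""))
    THRESHOLD = 21 * 24 * 3600  # 21 days, in seconds
    now = time.time()

    for root, _dirs, files in os.walk(WORK_DIR):
        for name in files:
            path = os.path.join(root, name)
            try:
                atime = os.stat(path).st_atime  # last access time
            except OSError:
                continue  # skip files that vanished or are unreadable
            if now - atime > THRESHOLD:
                print(path)

Files reported by such a check are the ones to move to your home folder or to NIRD before the cleanup takes effect.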


See also https://hpc-uit.readthedocs.io/en/latest/storage/storage.html#work-scratch-areas


If you have questions or need help, please contact us at migration@metacenter.no

Thank you for your understanding.

Metacenter Operations