[TEMP FIXED] Betzy: Data corruption on login-2

Update 2021-04-28: Settings on login-1 and login-2 were adjusted to prevent data corruption. In our tests with regular I/O operations (cp, scp, rsync), the new settings have not resulted in data corruption. Nevertheless, data corruption may still happen, so users are cautioned to verify that data is intact after it has been copied or transferred. If you detect data corruption, please do not hesitate to inform support, even if you have incomplete information about the incident.
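One possible way to verify a copy is to compare file sizes and checksums of the source and the copy. The following is a minimal sketch in Python, not an official tool: the function names `sha256sum` and `copies_match` and the choice of SHA-256 are assumptions made for illustration. Note that a freshly written copy may be read back from the operating system's cache rather than from the filesystem, so a matching checksum is a good indication but not a guarantee.

```python
import hashlib
from pathlib import Path


def sha256sum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source, copy):
    """Return True if both files have the same size and SHA-256 digest."""
    source, copy = Path(source), Path(copy)
    # Cheap check first: differing sizes already prove a mismatch.
    if source.stat().st_size != copy.stat().st_size:
        return False
    return sha256sum(source) == sha256sum(copy)
```

For example, `copies_match("data.bin", "data_copy.bin")` (hypothetical file names) returns `False` if the copy differs from the source. For transfers between machines, the digest can be computed separately on each side and the two hex strings compared.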

The login node login-2 has been reopened for use.

We are still evaluating whether adjusting settings on compute nodes is necessary, because so far we have not been able to reproduce data corruption there.

We labelled this as a temporary fix because additional measures may be necessary.

Original post: Since this morning, there have again been cases of data corruption on login-2 on Betzy. Even simply copying a file within the parallel filesystem (/cluster/…) resulted in data corruption in about 5-10% of attempts, each of which copied a 10 GiB file. We are investigating the issue and are in contact with the vendor. Until we find a permanent solution or a workable temporary fix, users are kindly asked not to use login-2.

We are sorry for the inconvenience.