NIRD Toolkit Upgrade

Dear NIRD Toolkit and NIRD Service Platform user,

We are pleased to announce the rollout of a new authentication system for an enhanced user experience, as well as an upgraded, clean-looking front end for the NIRD Toolkit, this upcoming Friday.

The upgrade process will start at 09:00 on the 26th of August. Please note that the upgrade process might cause service interruptions.

Thank you for your understanding!

NIRD Toolkit: maintenance 03.02

Dear NIRD Toolkit users and principal investigators,

Authentication to the NIRD Toolkit has been improved to become more flexible. The new solution will be put into production on the 3rd of February, 2022. The maintenance will be done in the evening, starting at 20:00.

For NIRD Toolkit users the change will be minimal, but it will require that you re-authenticate your session (log out and log in). 

Thank you for your understanding!

Maintenance postponed due to technical issues. More info to come.

NIRD Toolkit: maintenance 27.01

Dear NIRD Toolkit users and principal investigators,

Feide has planned maintenance on the 27th of January, 2022.

This might impact logging in to services running on the NIRD Service Platform using Dataporten/Feide, such as NIRD Toolkit.

However, we do not expect any of the already running services and applications to be stopped or cancelled.

Upload to the archive stopped

Due to a shortage of storage capacity, uploads to the NIRD Research Data Archive have been temporarily closed.

We plan to increase the capacity of the NIRD Research Data Archive during spring/summer 2022. In the meantime, if you need a DOI attached to datasets for publication, please contact us at archive.manager@nris.no. The DOI will be created immediately, while the dataset will be uploaded at a later stage.

For any questions, please contact us at archive.manager@nris.no.

We apologize for any inconvenience.

[RESOLVED] NIRD Toolkit packages are unavailable

Update 10:49: Issue is now resolved and NIRD Toolkit packages are again available for deployment.

Dear NIRD Toolkit Users,

Due to ongoing configuration changes, the NIRD Toolkit packages are currently unavailable. However, the problem does not affect currently running services.

We are working on completing a repository migration.

Apologies for the inconvenience. We will keep you updated.

Betzy pre-production

Dear HPC User,

We are pleased to announce that Betzy opens for pre-production on Friday, 20 November.

Because this is close to the weekend, Betzy is being opened stepwise: first to prior pilot projects, and then for general access on Tuesday, 24 November.

It has been a long journey, but we are happy to see good performance and stability on the system.

Please note that during the coming days changes will be made to the queue system setup, which could require cancelling running jobs.

Finally, support will be offered only from 24 November.

Thank you for your patience and we wish you happy computing!

Best regards,

Lorand Szentannai, on behalf of the preparations team

Updated information about Betzy production

Dear HPC User,

As mentioned last week, the validation benchmarks had been stable, and we were ready to run and evaluate the site acceptance test. Unfortunately, the interconnect stability issues recurred.

We and the vendor have been running extensive tests since. The R&D department of the interconnect vendor released new firmware yesterday afternoon, which was applied yesterday evening, and stress tests started immediately. To be sure that the problem is resolved, several days of testing are needed.

Therefore, we have to postpone production yet again, by one week. The current production estimate is the end of week 47.

We can assure you that we are very eager to have the system 100% stabilized and in production, and everybody involved in the project (be it from Sigma2, the Metacenter, or the vendor) is working intensively on this.

Thank you for your understanding!

Best regards,

Lorand Szentannai, on behalf of the preparations team

Information regarding Betzy production

Dear HPC User,

Our previous estimate of production on Betzy has proved to be somewhat optimistic.

With the help of the vendor, we believe we have identified and fixed the cause of the interconnect stability problem on Betzy. The most recent validation benchmarks have been stable, and we will begin the site acceptance test (SAT) by Friday, 6 November. If the machine passes the SAT, it will be handed over to operations and opened for production.

The final preparations usually take 1-3 days. We therefore estimate that production will begin on Betzy next week, week 46.

Best regards,

Lorand Szentannai, on behalf of the preparations team

Estimated production date for Betzy

Dear HPC user,

Our newest supercomputer, Betzy, is unfortunately delayed in entering production due to circumstances outside our control.

We have had significant delays in getting all the components in place because of logistics slowdowns caused by the Covid pandemic. However, approximately 94% of the system capacity is now installed and configured. Work is ongoing to prepare the remaining system capacity in the upcoming weeks.

Benchmarks and pilot testing on Betzy have revealed an intermittent stability problem with the node interconnect. The vendor has been investigating the issue for the past two weeks to identify its source. Our new best estimate is that Betzy will go into production in week 45.

This has consequences for the decommissioning of Vilje and Stallo, because we rely on Betzy to take over computational load from the other machines. Thus, the new decommissioning date for Vilje and Stallo is 1 December. We would like the machines to be fully utilized until they are decommissioned, and therefore encourage you to continue using Vilje and Stallo if you still have the opportunity.

Thank you for your understanding!

Best regards,
Lorand Szentannai, on behalf of the preparations team

Betzy access closed, preparing for production

UPDATE:

  • 08.10.2020: After extensive testing, the vendor found that stability issues are unfortunately still present. The problem has been escalated and is under investigation. We will get back to you with more information as soon as we get an update from the vendor.
  • 30.09.2020: The vendor will carry out firmware updates on Betzy today, and as a consequence we need to stop running jobs and run tests to make sure the system is stable.
    Access to the machine will be reopened as soon as the tests are complete. Please follow the progress here, on OpsLog.
  • 25.09.2020: We are temporarily reopening access over the weekend in order to allow further testing on the machine.
    Further work is expected to be done by the vendor sometime next week, and as a consequence jobs will be terminated again and access closed while maintenance is ongoing.

Dear Betzy pilots,

We are pleased to announce that, despite logistics challenges caused by Covid-19, most of the outstanding issues have been sorted out. This unusual situation required a more dynamic approach from everyone involved, while putting pressure on communication due to uncertainties and quickly changing circumstances. Because of this, setting and advertising a production date proved to be difficult.

We can now aim to put Betzy into production at the beginning of October. Before we can conclude and proceed with the preparations, we need to re-run several comprehensive tests.

Therefore, we will have to stop all jobs and close access to Betzy starting tomorrow, 17 September 2020, at 10:00. Access to Betzy will be re-established as soon as all the tests are completed. Please be prepared for more extensive maintenance this time, which might require up to two and a half weeks.

The file system on Betzy is not going to be reformatted. That is, your data will not be removed intentionally. However, we cannot guarantee data integrity until backups are taken and the machine is placed into production. Therefore, we strongly advise you to take a backup of your important data as a precaution.

Apologies for the short notice and for the inconvenience this causes you.

Best regards,

Lorand Szentannai, on behalf of the preparations team