Lost connection with NIRD

Major incident · High Performance Computing · Betzy · Fram · Saga · Storage Services · NIRD Data Peak · NIRD Backup · NIRD Datalake · NIRD Service Platform · Research Data Archive · easyDMP
2024-09-25 20:50 CEST · 5 days, 12 hours, 6 minutes

Updates

Resolved

Situation resolved. The Data Lake (DL) and Archive are now back in production.

We are awaiting spare parts and an IBM technician to replace some hardware that did not respond well to the incident.
This will cause a short downtime (a few hours) next week.
We will come back with more information as soon as we have all the specifications from IBM.

Thank you for your patience, and sorry for the inconvenience this may cause.

-Infra Team

October 1, 2024 · 08:52 CEST
Monitoring

Dear NIRD user,

Current NIRD Datalake update:

We have re-established redundancy on the faulty hardware.

We are still running file checks in offline mode to ensure that no data loss has occurred.

The next update is expected Tuesday morning around 09:00.

September 30, 2024 · 20:18 CEST
Update

NIRD Tier storage is back in production.

We are still working on bringing back the Data Lake and the Service Platform.

-Infra Team

September 26, 2024 · 14:18 CEST
Update

We have established a connection and are working on getting NIRD back.

We will post an update when the systems are back in production.

-Infra Team

September 26, 2024 · 09:59 CEST
Investigating

We are actively working with the data center to obtain a complete overview of the incident. Further details will be shared later today.

September 26, 2024 · 07:30 CEST
Issue

We have lost connection with NIRD. This means that all HPC systems have also lost their connection to NIRD.

We are currently investigating what happened.

Sorry for the inconvenience this is causing.

-Infra Team

September 25, 2024 · 20:50 CEST
