This is our current understanding of the Bus errors:
Some jobs tend to allocate more memory on one particular rank, e.g., rank 0, or on one rank per compute node, often the rank that runs on CPU 0. This sometimes exhausts the memory of the first NUMA node. Once the memory on one of the NUMA nodes is exhausted, calling MPI communication routines ultimately results in a Bus error. Why this happens is still unclear; it is most likely related to some kernel-space drivers being unable to allocate memory. We are diagnosing the issue and have submitted a report to the vendor. In our experience, however, a fix from that side will take a very long time, so you are better off finding a workable solution at the application level. That means checking the memory allocation profile of your application and making it more even across the NUMA nodes. You can check the free memory on each NUMA node by running (e.g., on the login nodes)
clush -w <nodes> numactl -H | grep free
If you see a large imbalance, with memory exhausted on one of the NUMA nodes, you can expect to get the Bus error.
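The check above can be made less manual with a small filter. The following is only a sketch: the function name `numa_low_free` and the 1024 MB default threshold are our own illustrative choices, not tested limits, and it assumes the usual `node <N> free: <X> MB` lines printed by `numactl -H`, optionally prefixed with `host: ` as clush does.

```shell
#!/bin/sh
# Sketch: flag NUMA nodes whose free memory is below a threshold (in MB).
# Reads "numactl -H" output on stdin, with or without a leading "host: " prefix.
# The function name and the 1024 MB default are hypothetical, for illustration.
numa_low_free() {
    threshold_mb="${1:-1024}"
    awk -v limit="$threshold_mb" '
        {
            for (i = 3; i < NF; i++) {
                # Look for the pattern: node <N> free: <X> MB
                if ($i == "free:" && $(i - 2) == "node") {
                    host = (i > 3) ? $1 " " : ""
                    flag = ($(i + 1) + 0 < limit) ? "LOW" : "ok"
                    printf "%snode %s free %d MB %s\n", host, $(i - 1), $(i + 1), flag
                }
            }
        }
    '
}
```

On a compute node it could be used as `numactl -H | numa_low_free 2048`; lines marked LOW point to the NUMA nodes that are close to exhaustion.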
Update 15:00 21.01.2022: We are working on a new distro for the Betzy compute nodes. It looks promising and should eliminate the Bus error. The distro will be tested during the weekend and, if all goes well, put into production next week.
Update 15:30 24.01.2022: The Bus error has been eliminated after we updated the Lustre and MOFED versions in our distro. Please contact us if you still encounter Bus errors.