Accelerating Seismic Redatuming Using Tile Low-Rank Approximations on NEC SX-Aurora TSUBASA

by Yuxi Hong, Hatem Ltaief, Matteo Ravasi, Laurent Gatineau, David Keyes
Paper Year: 2021

Abstract

With the aim of imaging subsurface discontinuities, seismic data recorded at the surface of the Earth must be numerically re-positioned at locations in the subsurface where reflections have originated, a process generally referred to as redatuming by the geophysical community. Historically, this process has been carried out by numerically time-reversing the data recorded along an open boundary of surface receivers into the subsurface. Despite its simplicity, such an approach is only able to handle seismic energy from primary arrivals (i.e., waves that interact only once with the medium discontinuities), failing to explain multi-scattering in the subsurface. As a result, seismic images are contaminated by artificial reflectors if data are not pre-processed prior to imaging such that multiples are removed from the data. In the last decade, a novel family of methods has emerged under the name of Marchenko redatuming; such methods allow for accurate redatuming of the full-wavefield recorded seismic data including multiple arrivals. This is achieved by solving an inverse problem, whose adjoint modeling can be shown to be equivalent to the standard single-scattering redatuming method for primary-only data. A downside of this application is that the so-called multi-dimensional convolution operator must be repeatedly evaluated as part of the inversion. Such an operator requires the application of multiple dense matrix-vector multiplications (MVM), which represent the most time-consuming operations in the forward and adjoint processes. We identify and leverage the data sparsity structure for each of the frequency matrices during the MVM operation, and propose to accelerate the MVM step using tile low-rank (TLR) matrix approximations. We study the TLR impact on time-to-solution for the MVM using different accuracy thresholds whilst at the same time assessing the quality of the resulting subsurface seismic wavefields and show that TLR leads to a minimal degradation in terms of signal-to-noise ratio on a 3D synthetic dataset. We mitigate the load imbalance overhead and provide performance evaluation on two distributed-memory systems. Our MPI+OpenMP TLR-MVM implementation reaches up to 3X performance speedup against the dense MVM counterpart from NEC scientific library on 128 NEC SX-Aurora TSUBASA cards. Thanks to the second generation of high bandwidth memory technology, it further attains up to 67X performance speedup (i.e., 110 TB/s) compared to the dense MVM from Intel MKL when running on 128 dual-socket 20-core Intel Cascade Lake nodes with DDR4 memory, without deteriorating the quality of the reconstructed seismic wavefields.

Keywords

HPC Seismic