Adaptation of Range-Doppler Algorithm for Efficient Beamforming of Monostatic and Multistatic Ultrasound Signals

Marko Jakovljevic, Roger Michaelides, Ettore Biondi, Dongwoon Hyun, Howard A. Zebker, Jeremy J. Dahl
DOI: https://doi.org/10.1109/TUFFC.2022.3205923
Abstract: Algorithmic changes that increase beamforming speed have become increasingly relevant to processing synthetic aperture (SA) ultrasound data. In particular, beamforming SA data in a spatio-temporal frequency domain using F-k (Stolt) migration has been shown to reduce the beamforming time by up to two orders of magnitude compared with conventional delay-and-sum (DAS) beamforming. It has been used in applications where large amounts of raw data make real-time frame rates difficult to attain, such as multistatic SA imaging and plane-wave Doppler imaging with large ensemble lengths. However, beamforming signals in a spatio-temporal Fourier space can require loading large blocks of data at once, making it memory-intensive and less suited to parallel (i.e., multithreaded) processing. As an alternative, we propose beamforming in a range-Doppler (RD) frequency domain using the range-Doppler algorithm (RDA), which was originally developed for SA radar (SAR) imaging. Through simulation and phantom experiments, we show that RDA achieves lateral resolution and contrast similar to those of DAS and F-k migration. At the same time, the higher axial sidelobes in RDA images can be reduced via (temporal) frequency binning. Like F-k migration, RDA significantly reduces the overall number of computations relative to DAS, and it achieves a ten times lower processing time on a single CPU. Because RDA uses only a spatial Fourier transform (FT), it requires half the memory of F-k migration to process the simulated multistatic data and can be executed on as many as a thousand parallel threads (compared with eight parallel threads for F-k migration), making it more suitable for implementation on modern graphics processing units (GPUs). While RDA is not as parallelizable as DAS, it is expected to hold a significant speed advantage on devices with moderate parallel processing capabilities (up to several thousand cores), such as point-of-care and low-cost ultrasound devices.
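The point about RDA needing only a spatial FT can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and covers only the data-access pattern, not the remaining RDA steps. The array shape, the chunk size, and the function names `spatial_fft_full` and `spatial_fft_chunked` are assumptions made for illustration. The sketch shows that a Fourier transform taken only along the element (lateral) axis can be computed on independent blocks of time samples, which is what allows the work to be split across many threads without loading the whole data block at once.

```python
import numpy as np


def spatial_fft_full(rf):
    """FFT along the element axis of the whole block; rf is (n_samples, n_elements)."""
    return np.fft.fft(rf, axis=1)


def spatial_fft_chunked(rf, chunk=64):
    """Same transform, computed on independent blocks of time samples.

    Each block touches only its own rows, so blocks could be handed to
    separate threads (hypothetical scheduling, not shown here).
    """
    out = np.empty(rf.shape, dtype=complex)
    for start in range(0, rf.shape[0], chunk):
        stop = start + chunk
        out[start:stop] = np.fft.fft(rf[start:stop], axis=1)
    return out


# Synthetic stand-in for channel data: 256 time samples, 32 elements (assumed sizes).
rng = np.random.default_rng(0)
rf = rng.standard_normal((256, 32))

full = spatial_fft_full(rf)
chunked = spatial_fft_chunked(rf, chunk=64)
max_diff = float(np.max(np.abs(full - chunked)))
```

By contrast, a two-dimensional transform over both time and space, as used in spatio-temporal frequency-domain methods, needs all time samples of a block before the temporal transform can complete, so it cannot be partitioned along the time axis in this way.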