A High-Throughput and Low-Storage Stereo Vision Accelerator with Dependency-Resolving Strided Aggregation for 8-Path Semi-Global Matching

Yitong Rong, Xuyang Duan, Jun Han
DOI: https://doi.org/10.1016/j.mejo.2024.106156
IF: 1.992
2024-01-01
Microelectronics Journal
Abstract: Semi-global matching (SGM) is a well-known algorithm that generates depth maps from two images. However, its high computation and memory requirements, together with its inherent data dependency problem, make real-time implementation of SGM challenging. In this paper, we propose dependency-resolving strided cost aggregation (SCA) to resolve the data dependency problem. We also propose a cost distillation scheme that reduces the on-chip memory requirement for aggregated costs by 98%. Furthermore, to resolve the inter-tile census transform (CT) reuse problem caused by the overlapped tile-based processing scheme, we propose a CT data recomputing technique that reduces the on-chip memory requirement for CT data by 83.4%. The proposed architecture is evaluated on Virtex-7 and in 28 nm CMOS technology. For a resolution of 1920 × 1080 with 128 disparity levels, the Virtex-7 and 28 nm implementations achieve maximum frame rates of 81 and 122 frames per second (fps), respectively. The Virtex-7 implementation is 1.25× faster than the state-of-the-art 8-path implementation, with a 44.1% improvement in energy efficiency, while maintaining comparable accuracy.
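To illustrate the data dependency the abstract refers to, the following is a minimal sketch of the textbook SGM cost-aggregation recurrence along a single left-to-right path. It is not the paper's strided cost aggregation; the function name `aggregate_left_to_right` and the penalty values `p1` and `p2` are assumptions chosen for illustration only.

```python
def aggregate_left_to_right(cost, p1=10, p2=120):
    """Aggregate matching costs along one horizontal path (standard SGM).

    cost: list of rows, one per pixel along the path, each holding the
          matching cost for every disparity level.
    p1, p2: small and large smoothness penalties (illustrative values).
    """
    agg = [list(cost[0])]  # the first pixel on the path has no predecessor
    for x in range(1, len(cost)):
        prev = agg[-1]          # aggregated costs of the previous pixel
        prev_min = min(prev)    # needed by every disparity of this pixel
        row = []
        for d, c in enumerate(cost[x]):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < len(prev):
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(c + min(candidates) - prev_min)
        agg.append(row)
    return agg
```

Each pixel's aggregated costs depend on the fully aggregated costs of the preceding pixel on the path, so pixels along one path cannot be processed independently. This is the kind of serial dependency that a hardware implementation has to work around.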