Frequency-aware Feature Fusion for Dense Image Prediction

Linwei Chen, Ying Fu, Lin Gu, Chenggang Yan, Tatsuya Harada, Gao Huang
2024-08-23
Abstract: Dense image prediction tasks demand features with strong category information and precise spatial boundary details at high resolution. To achieve this, modern hierarchical models often utilize feature fusion, directly adding upsampled coarse features from deep layers and high-resolution features from lower levels. In this paper, we observe rapid variations in fused feature values within objects, resulting in intra-category inconsistency due to disturbed high-frequency features. Additionally, blurred boundaries in fused features lack accurate high frequency, leading to boundary displacement. Building upon these observations, we propose Frequency-Aware Feature Fusion (FreqFusion), integrating an Adaptive Low-Pass Filter (ALPF) generator, an offset generator, and an Adaptive High-Pass Filter (AHPF) generator. The ALPF generator predicts spatially-variant low-pass filters to attenuate high-frequency components within objects, reducing intra-class inconsistency during upsampling. The offset generator refines large inconsistent features and thin boundaries by replacing inconsistent features with more consistent ones through resampling, while the AHPF generator enhances high-frequency detailed boundary information lost during downsampling. Comprehensive visualization and quantitative analysis demonstrate that FreqFusion effectively improves feature consistency and sharpens object boundaries. Extensive experiments across various dense prediction tasks confirm its effectiveness. The code is made publicly available at https://github.com/Linwei-Chen/FreqFusion.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper aims to address two major issues in dense image prediction tasks: **intra-category inconsistency** and **boundary displacement**. Specifically:

1. **Intra-category Inconsistency**: During feature fusion, different parts of the same object can have very different feature values, which reduces feature consistency within a category. For example, the wheels of a car may appear dark and textured, while the windows appear smooth and bright. Standard feature fusion methods cannot effectively handle these inconsistent features, and simple bilinear upsampling may even exacerbate the problem, because a single inconsistent feature is spread over multiple pixels in the upsampled map.
2. **Boundary Displacement**: During feature fusion, high-frequency information at object boundaries is often lost, which blurs the boundaries and consequently displaces them. Previous studies have shown that simple interpolation methods tend to over-smooth features, resulting in the loss of boundary information.

To address these issues, the authors propose the **Frequency-Aware Feature Fusion (FreqFusion)** method. This method enhances the feature fusion process through three key components:

- **Adaptive Low-Pass Filter (ALPF) Generator**: Predicts spatially varying low-pass filters that attenuate high-frequency components within objects, thereby reducing intra-category inconsistency during upsampling.
- **Offset Generator**: Refines large inconsistent regions and thin boundaries by resampling, replacing inconsistent features with more consistent ones.
- **Adaptive High-Pass Filter (AHPF) Generator**: Extracts from low-level features the high-frequency details lost during downsampling, thereby enhancing boundary information.

Through the collaborative work of these components, FreqFusion can restore high-quality fused features with consistent category information and clear boundaries.
Experimental results show that FreqFusion significantly improves performance in various dense prediction tasks, including semantic segmentation, object detection, instance segmentation, and panoptic segmentation.
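
To make the frequency intuition behind the low-pass and high-pass components concrete, here is a minimal one-dimensional sketch in plain Python. It is not the authors' implementation: FreqFusion predicts spatially varying filters from the features, whereas this toy uses a fixed box filter, nearest-neighbour upsampling, and omits the offset generator entirely. The function names (`box_low_pass`, `high_pass`, `upsample_nearest`, `fuse`) and the exact way the terms are summed are assumptions made for illustration only.

```python
def box_low_pass(signal, radius=1):
    """Fixed moving-average low-pass filter with edge replication.

    Stand-in for a predicted adaptive low-pass filter (assumption: the
    real method predicts a different kernel at every location).
    """
    n = len(signal)
    out = []
    for i in range(n):
        # Clamp indices so the window replicates the border values.
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def high_pass(signal, radius=1):
    """High-pass residual: the signal minus its low-pass version."""
    low = box_low_pass(signal, radius)
    return [s - l for s, l in zip(signal, low)]


def upsample_nearest(signal, factor=2):
    """Nearest-neighbour upsampling: repeat every value `factor` times."""
    return [v for v in signal for _ in range(factor)]


def fuse(coarse, fine, radius=1):
    """Toy fusion of a coarse signal with a fine one of twice its length.

    Sums the smoothed upsampled coarse signal, the fine signal, and the
    fine signal's high-frequency residual (hypothetical combination).
    """
    up = box_low_pass(upsample_nearest(coarse, 2), radius)
    detail = high_pass(fine, radius)
    return [u + f + d for u, f, d in zip(up, fine, detail)]
```

Calling `high_pass` on a step such as `[0.0, 0.0, 0.0, 1.0, 1.0, 1.0]` gives a response that is zero in the flat regions and non-zero only next to the step, which is why a high-pass branch carries boundary detail. Conversely, `box_low_pass` leaves flat regions untouched while damping rapid variation, which is the role the summary attributes to the low-pass branch inside objects.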