RMAFF-PSN: A Residual Multi-Scale Attention Feature Fusion Photometric Stereo Network

Kai Luo, Yakun Ju, Lin Qi, Kaixuan Wang, Junyu Dong

DOI: https://doi.org/10.3390/photonics10050548

2024-04-14

Abstract: Predicting accurate normal maps of objects from two-dimensional images is challenging for photometric stereo methods in regions of complex structure and spatially varying materials, because variations in object geometry and surface material change the surface reflectance properties. To address this issue, we propose a photometric stereo network called RMAFF-PSN that uses residual multiscale attentional feature fusion to handle the "difficult" regions of the object. Unlike previous approaches that only use stacked convolutional layers to extract deep features from the input image, our method integrates feature information from different resolution stages and scales of the image. This approach preserves more physical information, such as the texture and geometry of the object in complex regions, through shallow-deep stage feature extraction, double branching enhancement, and attention optimization. To test the network structure under real-world conditions, we propose a new real dataset called Simple PS data, which contains multiple objects with varying structures and materials. Experimental results on a publicly available benchmark dataset demonstrate that our method outperforms most existing calibrated photometric stereo methods for the same number of input images, especially in the case of highly non-convex object structures. Our method also obtains good results under sparse lighting conditions.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the problem of predicting accurate normal maps of objects from 2D images with photometric stereo in regions of complex structure and spatially varying materials. Specifically, traditional methods struggle in these "difficult" areas because changes in object geometry and surface material alter the surface reflectance properties. To tackle this issue, the authors propose RMAFF-PSN (Residual Multi-Scale Attention Feature Fusion Photometric Stereo Network). Unlike previous methods that rely solely on stacked convolutional layers to extract deep features from the input images, RMAFF-PSN combines feature information from different resolution stages and scales, thereby retaining more physical information, such as texture and geometric structure, in complex regions. It does so through shallow-deep feature extraction, dual-branch enhancement, and attention optimization. Experiments on a publicly available benchmark dataset show that the method outperforms most existing calibrated photometric stereo methods given the same number of input images, especially for highly non-convex object structures. The method also achieves good results under sparse lighting conditions.
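
For orientation, the task of calibrated photometric stereo can be sketched with the classical Lambertian least-squares baseline. This is not RMAFF-PSN and is not taken from the paper; it only illustrates what "predicting a normal map from images under known lighting" means. The function names, array shapes, and the mean angular error metric below are assumptions chosen for illustration. The sketch assumes a purely Lambertian surface with no shadows, which is exactly the assumption that breaks down in the "difficult" regions the paper targets.

```python
import numpy as np


def lambertian_photometric_stereo(images, light_dirs):
    """Classical least-squares photometric stereo (illustrative baseline).

    images:     (m, h, w) array of intensities, one image per light.
    light_dirs: (m, 3) array of known unit light directions (calibrated setup).
    Returns unit normals of shape (h, w, 3) and albedo of shape (h, w).
    Assumes Lambertian reflectance: intensity = albedo * dot(normal, light).
    """
    m, h, w = images.shape
    intensities = images.reshape(m, -1)  # (m, h*w)
    # Solve light_dirs @ g = intensities for g = albedo * normal at every pixel.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-8)  # avoid division by zero
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)


def mean_angular_error_deg(pred, gt):
    """Mean angle in degrees between predicted and reference unit normals."""
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())
```

On real objects with specular highlights, cast shadows, or spatially varying materials, the Lambertian model no longer holds, so a baseline of this kind typically degrades; learning-based methods such as the one summarized above replace the fixed reflectance model with learned features.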

Learning Inter- and Intra-frame Representations for Non-Lambertian Photometric Stereo

Lightweight Multi-Attention Fusion Network for Image Super-Resolution

NormAttention-PSN: A High-frequency Region Enhanced Photometric Stereo Network with Normalized Attention

Estimating High-resolution Surface Normals via Low-resolution Photometric Stereo Images

Multi-scale and attention training of uncalibrated photometric stereo networks

Learning Conditional Photometric Stereo with High-Resolution Features

Enhanced Multi-Scale Feature Adaptive Fusion Sparse Convolutional Network for Large-Scale Scenes Semantic Segmentation

Fast Multi-Scale Residual Fusion Network for Stereo Matching

MS-PS: A Multi-Scale Network for Photometric Stereo With a New Comprehensive Training Dataset

Multi-scale Parallax Attention for Stereo Image Super-Resolution

Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-Resolution

Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo

Stereo Matching Method for Remote Sensing Images Based on Attention and Scale Fusion

GR-PSN: Learning to Estimate Surface Normal and Reconstruct Photometric Stereo Images

A multidimensional fusion image stereo matching algorithm

FA-MSVNet: multi-scale and multi-view feature aggregation methods for stereo 3D reconstruction

Self-adaptive Multi-scale Aggregation Network for Stereo Matching

Event Fusion Photometric Stereo Network

Scalable, Detailed and Mask-Free Universal Photometric Stereo

Parallax Attention for Unsupervised Stereo Correspondence Learning